gedcom2scope

The GEDCOM format is odd. It's ancient and was designed to work with Hollerith-style IBM cards. However, nothing suitable has come around to replace it and it remains essential to the genealogy community.

This converts a GEDCOM file to "scoped" data blocks, making it a bit easier to deal with. Of all of GEDCOM's oddities, I've dealt with only one here: the "0" record. There is no attempt to deal with badly badly-formatted or illogical data. The script reads from STDIN and prints to STDOUT.

#!/usr/bin/perl ## GEDCOM to 'scope' conversion. ## Michael Cooley ## Version 1 (12 June 2009) ## Make $pretty=0 to get rid of the needless human-readable formatting. undef my $newline; my $pretty=3; if ($pretty) {$newline="\n"} while (<>) { chomp; # Strip out nasty Window$ carriage returns. s/\r//g; ($level,$tag,@data)= split / /; $data=join " ",@data; if ($level < $lastlevel) { ## We may have multiple blocks to close. for($i = $lastlevel; $i > $level; $i--) { if ($pretty) { printf "%*s",(($i-1)*($pretty)); } ## Close the block ## print "}$newline"; } } ## This is the primary record identifier. The "0" record is intertpreted ## differenthy then the other records: "$tag" is the record ID and "$data" ## is the record type. I'm going to make "$data" a subordinate record of ## the level "1" type. In so doing, I have to give it the field name TYPE. ## All other records are of the key/data kind of pairing. if ($level==0) { print "$tag"; if ($pretty) { if ($tag ne ("HEAD" || "TRLR")) {print "$newline"} } ## Work around for the odd placement of TYPE. if ($data) { ## I can't make "$pretty" work here. printf "{$newline TYPE$newline {$newline %s$newline }$newline",$data; ## Tricking the program to suppress a "{". $level++; } } else { ## Open the block. ## if ($level > $lastlevel) { printf "$newline%*s{",($level-1)*$pretty; } if ($pretty) { if ($level >= $lastlevel) {print "$newline";} printf "%*s",$level*$pretty; } ## The real data. ## print "$tag"; if ($pretty) { print " "; } if ($data) { printf "$newline%*s{",$level*$pretty; printf "$newline%*s$data",($level+1)*$pretty; printf "$newline%*s}$newline",($level)*$pretty; } } $lastlevel=$level; }