gedcom2scope
The GEDCOM format is odd. It's ancient and was designed to work with
Hollerith-style IBM cards. However, nothing suitable has come around to
replace it and it remains essential to the genealogy community.
This converts a GEDCOM file to "scoped" data blocks, making it a bit
easier to deal with. Of all of GEDCOM's oddities, I've dealt with only one
here: the "0" record. There is no attempt to deal with badly badly-formatted
or illogical data. The script reads from STDIN and prints to STDOUT.
#!/usr/bin/perl
## GEDCOM to 'scope' conversion.
## Michael Cooley
## Version 1 (12 June 2009)
## Make $pretty=0 to get rid of the needless human-readable formatting.
undef my $newline;
my $pretty=3;
if ($pretty) {$newline="\n"}
while (<>)
{
chomp;
# Strip out nasty Window$ carriage returns.
s/\r//g;
($level,$tag,@data)= split / /;
$data=join " ",@data;
if ($level < $lastlevel)
{
## We may have multiple blocks to close.
for($i = $lastlevel; $i > $level; $i--)
{
if ($pretty)
{
printf "%*s",(($i-1)*($pretty));
}
## Close the block ##
print "}$newline";
}
}
## This is the primary record identifier. The "0" record is intertpreted
## differenthy then the other records: "$tag" is the record ID and "$data"
## is the record type. I'm going to make "$data" a subordinate record of
## the level "1" type. In so doing, I have to give it the field name TYPE.
## All other records are of the key/data kind of pairing.
if ($level==0)
{
print "$tag";
if ($pretty)
{
if ($tag ne ("HEAD" || "TRLR")) {print "$newline"}
}
## Work around for the odd placement of TYPE.
if ($data)
{
## I can't make "$pretty" work here.
printf "{$newline TYPE$newline {$newline %s$newline }$newline",$data;
## Tricking the program to suppress a "{".
$level++;
}
}
else
{
## Open the block. ##
if ($level > $lastlevel)
{
printf "$newline%*s{",($level-1)*$pretty;
}
if ($pretty)
{
if ($level >= $lastlevel) {print "$newline";}
printf "%*s",$level*$pretty;
}
## The real data. ##
print "$tag";
if ($pretty)
{
print " ";
}
if ($data)
{
printf "$newline%*s{",$level*$pretty;
printf "$newline%*s$data",($level+1)*$pretty;
printf "$newline%*s}$newline",($level)*$pretty;
}
}
$lastlevel=$level;
}