Nothing you have said so far explains the need for YAML/Data::Dump etc. serialisation to me. Your data is already human readable. Storing the original and processed data in the same file is as simple as adding a separator. You don't need any fancy modules to do it.
while (<DATA>) {
    if (m/<ORIG DATA ABOVE MUNGE BELOW>/) {
        $munge .= $_ while <DATA>;    # slurp everything after the separator
    }
    else {
        $orig .= $_;
    }
}
As an added benefit of keeping it simple, you can leverage diff for the data comparison in your eq_or_diff() routine if a simple eq test fails. I really think you are overcomplicating the task by adding a middleware serialisation layer. You are not actually using it to reconstitute a data structure, nor is there any real need to, as all you want to do is reformat the old data into the new format so you can process it. Why add useless middleware that only offers the opportunity to introduce bugs for no real gain?
As I see it you need a base class that has the functions:
my ($orig, $munge) = load_file($file);    # $munge may be undef
my $data       = parse($orig);            # process current format data only
my $cur_format = serialise($data);        # output current format
write_file($orig, $cur_format);           # write to file with separator
my $invalid = eq_or_diff($munge, $cur_format);
print "$file\n$invalid\n" if $invalid;    # diff output, empty if OK
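To make the separator idea concrete, here is a minimal sketch of what load_file() and write_file() might look like. The separator string is the one from the loop above; passing $file to write_file() and assuming the original text ends with a newline are my own choices for the sketch, not requirements of the approach.

```perl
use strict;
use warnings;

# Assumed separator line; any string that cannot occur in the data will do.
my $SEPARATOR = "<ORIG DATA ABOVE MUNGE BELOW>\n";

# Return the original text and the munged text.
# $munge is undef when the file has no separator yet.
sub load_file {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot read $file: $!";
    my ($orig, $munge) = ('', undef);
    while (my $line = <$fh>) {
        if ($line eq $SEPARATOR) {
            local $/;                          # slurp the rest in one read
            $munge = <$fh>;
            $munge = '' unless defined $munge; # separator was the last line
            last;
        }
        $orig .= $line;
    }
    close $fh;
    return ($orig, $munge);
}

# Write the original text, the separator, then the current-format text.
# Assumes $orig ends with a newline so the separator sits on its own line.
sub write_file {
    my ($file, $orig, $cur_format) = @_;
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} $orig, $SEPARATOR, $cur_format;
    close $fh or die "Cannot close $file: $!";
    return;
}
```

Because load_file() stops collecting the original at the separator, running write_file() over a file that already has a munged section simply replaces that section.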
Each filter class only requires a parse() method to generate whatever data structure you want to work with in your ultimate program.
You probably already have parse code to work with current data. The serialise method simply writes this data structure back into a string that you can save. For current data the output may or may not be identical to the current data format, but the process is valid if a base class parse of the original data and of the munged data serialise to the same result, as the data is then round-tripping.
Essentially what I am saying is don't use serialisation middleware. Write your own code that takes your data structure (which you need) and serialises it *into the current format* (which you need, mostly for validation). The filters become simply a parse method to generate your standard internal data structure. Note that if your internal data structure uses hashes, ensure you apply a sort or a list ordering to the keys during serialisation. If you don't, it will probably bite you. It has bitten me before, as key return order is not guaranteed and differs between versions of perl and between OSs for exactly the same data.
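As a rough illustration of the key-ordering point, the sketch below assumes a made-up current format of one "key=value" line per field and a flat hash as the internal structure. Your real format will differ; the only part that matters is the sort in serialise().

```perl
use strict;
use warnings;

# Hypothetical current format: one "key=value" line per field.
sub parse {
    my ($text) = @_;
    my %data;
    for my $line (split /\n/, $text) {
        # Lines without an "=" are skipped in this sketch.
        $data{$1} = $2 if $line =~ /^([^=]+)=(.*)$/;
    }
    return \%data;
}

# Sorting the keys makes the output independent of hash ordering,
# so the same data always serialises to the same string.
sub serialise {
    my ($data) = @_;
    return join '', map { "$_=$data->{$_}\n" } sort keys %$data;
}
```

With this, serialise(parse("b=2\na=1\n")) gives "a=1\nb=2\n" on any perl, and serialising a second time changes nothing, which is the round-trip property you want to test for.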
Doing it this way gives you:
- Human readable output
- The old data in new data format in the same file
- A format that can easily be munged by diff to show the exact differences, probably in the most intuitively understandable format.
- No useless middleware bugs to deal with. You will personally own all bugs :-)
- A simple one method filter that does the absolute minimal task required - convert old data into a standardised internal representation ready to either work with or write back to file.
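A sketch of how small a filter can be under this scheme. The package names and both formats ("key=value" for current data, "key: value" for the old data) are invented for illustration; the point is only that the filter overrides parse() and inherits everything else.

```perl
use strict;
use warnings;

package Filter::Base;    # hypothetical name

sub new { my ($class) = @_; return bless {}, $class }

# Parse the (assumed) current format: "key=value" lines.
sub parse {
    my ($self, $text) = @_;
    my %data;
    for my $line (split /\n/, $text) {
        $data{$1} = $2 if $line =~ /^([^=]+)=(.*)$/;
    }
    return \%data;
}

# Always write the current format, with keys in sorted order.
sub serialise {
    my ($self, $data) = @_;
    return join '', map { "$_=$data->{$_}\n" } sort keys %$data;
}

package Filter::OldColon;    # hypothetical old format: "key: value" lines
our @ISA = ('Filter::Base');

# The only method a filter has to supply.
sub parse {
    my ($self, $text) = @_;
    my %data;
    for my $line (split /\n/, $text) {
        $data{$1} = $2 if $line =~ /^([^:]+):\s*(.*)$/;
    }
    return \%data;
}

package main;

my $filter     = Filter::OldColon->new;
my $cur_format = $filter->serialise($filter->parse("name: fred\nage: 42\n"));
# $cur_format is now "age=42\nname=fred\n"
```

Validation then amounts to checking that Filter::Base->new->parse() of the munged text serialises back to the same string.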