PerlMonks
Re: Processing Two XML Files in Parallel
by ambrus (Abbot) on Jul 24, 2011 at 20:39 UTC ( [id://916445] )
I agree with the previous replies that running the two XML parsers, each in its own Coro thread, seems a good way to do this. However, just for the challenge of it, I'd like to show a solution that does not use Coro.

This solution uses the incremental (push) parsing capability of XML::Parser. (The documentation of XML::Twig states that you probably should not use this style of parsing with XML::Twig, and that it is untested.) We read the input XML files in small chunks (20 bytes here for demonstration; it should be much more than that in a real application). In each loop iteration we read from the file that is behind the other, that is, the one from which we have parsed fewer items so far. This way the files stay in sync even if the lengths of the items differ. Once the XML parser has found an item in both files, we pair them and print a single item with the two texts concatenated.

The warnings I have commented out show that the files are indeed read in parallel. I also hope that chunks of the files we have already processed do not remain in memory, and that there are no other bugs; but you should of course verify this yourself if you want to use this code in production.
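A minimal sketch of the approach described above, using XML::Parser's parse_start/parse_more interface. The input shape is an assumption (each file looking like <list><item>...</item>...</list>, with file names taken from @ARGV); the original post's actual element names and output format may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

my (@queue, @fh, @nb);    # per-file: completed item texts, filehandle, push parser

for my $i (0, 1) {
    $queue[$i] = [];
    my $buf;              # text of the <item> currently open, if any
    my $p = XML::Parser->new(Handlers => {
        Start => sub { $buf = '' if $_[1] eq 'item' },
        Char  => sub { $buf .= $_[1] if defined $buf },
        End   => sub {
            if ($_[1] eq 'item') { push @{ $queue[$i] }, $buf; undef $buf }
        },
    });
    open $fh[$i], '<', $ARGV[$i] or die "open $ARGV[$i]: $!";
    $nb[$i] = $p->parse_start;    # XML::Parser::ExpatNB object for incremental feeding
}

my @done = (0, 0);
until ($done[0] && $done[1]) {
    # feed the parser that is behind (fewer completed items so far)
    my $i = @{ $queue[0] } <= @{ $queue[1] } ? 0 : 1;
    $i = 1 - $i if $done[$i];               # skip a file we have finished
    my $n = read $fh[$i], my $chunk, 20;    # tiny chunks, for demonstration only
    if ($n) { $nb[$i]->parse_more($chunk) }
    else    { $nb[$i]->parse_done; $done[$i] = 1 }
    # warn "read chunk from file $i\n";     # shows the files are read in parallel
    # emit a pair as soon as both sides have a completed item
    while (@{ $queue[0] } && @{ $queue[1] }) {
        my ($a, $b) = (shift @{ $queue[0] }, shift @{ $queue[1] });
        print "<item>$a$b</item>\n";
    }
}
```

The key point is that parse_start returns an ExpatNB object, so each parser keeps its state between parse_more calls; the main loop is then free to decide, chunk by chunk, which file to advance.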
Update 2013-04-23: RFC: Simulating Ruby's "yield" and "blocks" in Perl may be related.
In Section: Seekers of Perl Wisdom