PerlMonks  

Re: Processing Two XML Files in Parallel

by Logicus (Initiate)
on Jul 21, 2011 at 21:29 UTC


in reply to Processing Two XML Files in Parallel

Does each element have a line to itself, or is the data multiline? That is, if we read line 123 from file A, will line 123 in file B be the correct line to process it with?

If that is the case, then you could just read both files a line at a time and use a simple regex to get the value out of the <elem> wrapper:

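my ($line_a, $line_b);   # avoid $a/$b, which sort() treats specially
# Stop as soon as either file runs out of lines.
while (defined($line_a = <A>) && defined($line_b = <B>)) {
    # m{...} delimiters, so the "/" in </elem> does not end the pattern
    my ($value_a) = $line_a =~ m{<elem>(.*?)</elem>};
    my ($value_b) = $line_b =~ m{<elem>(.*?)</elem>};
    # Skip line pairs without an <elem> on both sides (e.g. root tags)
    next unless defined $value_a && defined $value_b;
    print data_transform($value_a, $value_b);
}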

I'm sure better Perl adepts than me could write it better/faster, but I think that would work if the files correspond line for line.
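
For anyone who wants to try the idea without real files, here is a minimal, self-contained sketch of the lock-step read. The sample XML strings, the in-memory filehandles, and the body of data_transform are all assumptions made up for illustration; only the <elem> wrapper and the line-for-line assumption come from the post above.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two real files.
my $xml_a = "<root>\n<elem>1</elem>\n<elem>2</elem>\n</root>\n";
my $xml_b = "<root>\n<elem>10</elem>\n<elem>20</elem>\n</root>\n";

# Placeholder transform; the real processing is not given in the thread.
sub data_transform {
    my ($value_a, $value_b) = @_;
    return "$value_a+$value_b\n";
}

# In-memory filehandles; with real files, open the paths instead.
open my $fh_a, '<', \$xml_a or die "Cannot open A: $!";
open my $fh_b, '<', \$xml_b or die "Cannot open B: $!";

my @results;
my ($line_a, $line_b);
# Read one line from each file per iteration; stop when either ends.
while (defined($line_a = <$fh_a>) && defined($line_b = <$fh_b>)) {
    # List-context match: the variable is undef when the line has no <elem>.
    my ($value_a) = $line_a =~ m{<elem>(.*?)</elem>};
    my ($value_b) = $line_b =~ m{<elem>(.*?)</elem>};
    # Skip line pairs such as <root> that carry no element value.
    next unless defined $value_a && defined $value_b;
    push @results, data_transform($value_a, $value_b);
}
print @results;
```

This only holds up while both files keep exactly one <elem> per line in the same order; as soon as that stops being true, the pairs silently drift apart.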

Replies are listed 'Best First'.
Re^2: Processing Two XML Files in Parallel
by tinita (Parson) on Jul 23, 2011 at 12:15 UTC

    So you like catch phrases, uh?
    Let me tell you something:
    About 97% of the time, parsing XML with regexes is the root of all evil. The remaining 3% is left for one-time, quick & dirty scripts and maybe some special cases (where you can be sure the XML will stay exactly like that).
    Let me tell you why:
    The creator of the XML you parse might change it. All elements might end up on one line. Maybe there will be some empty lines between the tags. Maybe the elem tags will get attributes in the future. In all of these cases your script will suddenly stop working, although the actual content you want didn't change. And somebody has to fix it quickly. In the end it's more work than just doing it right from the beginning, and you have potentially annoyed a customer and your boss.

    That's how experienced programmers think. Because they know that things like that happen.
    You not only posted a quick & dirty solution, you even bashed someone for posting a clean and correct solution. A quick & dirty solution is OK (although it would be nice to note that it depends on the exact XML format), and you actually got some ++ for it, but then bashing someone else's correct solution is just infantile.
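
The failure modes listed in this reply are easy to reproduce. The sketch below runs the line-oriented pattern from the parent post over a few hypothetical serializations of an <elem> (the variant names and sample strings are invented for illustration) and counts how many values it recovers from each.

```perl
use strict;
use warnings;

# The line-oriented pattern from the parent post, with m{}-style delimiters.
my $pattern = qr{<elem>(.*?)</elem>};

# Hypothetical variations an XML producer could introduce.
my %variants = (
    original   => "<elem>42</elem>\n",
    attribute  => qq{<elem id="1">42</elem>\n},
    split_line => "<elem>\n42\n</elem>\n",
    one_line   => "<elem>42</elem><elem>43</elem>\n",
);

my %matched;
for my $name (sort keys %variants) {
    # Process line by line, one match attempt per line, as the loop does.
    my @values = map { /$pattern/ ? $1 : () } split /\n/, $variants{$name};
    $matched{$name} = scalar @values;
    printf "%-10s %s\n", $name, @values ? "found: @values" : "no match";
}
```

Only the original layout yields its value intact. The attribute and multi-line variants match nothing, and the one-line variant returns just the first of its two elements. A real XML parser would treat all four as ordinary input.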
