Note that the XML chunk you posted is not well-formed XML, as it lacks a root node. I wrapped it in <root>
...
</root>
and used XML::LibXML to get the desired output:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

use XML::LibXML;

# The two XML files to compare, given on the command line.
my @files = @ARGV[0, 1];

# $extracted{original}{unit id}{xml file} = title
my %extracted;
for my $xml_file (@files) {
    my $dom = 'XML::LibXML'->load_xml(location => $xml_file);
    for my $file ($dom->findnodes('/root/file')) {
        my $original = $file->{original};
        for my $unit ($file->findnodes('body/unit')) {
            my $id    = $unit->{id};
            my $title = $unit->findvalue('title');
            $extracted{$original}{$id}{$xml_file} = $title;
        }
    }
}

# One tab-separated row per unit: original, id, then the title from each file.
for my $file (keys %extracted) {
    for my $id (keys %{ $extracted{$file} }) {
        say join "\t", $file, $id, @{ $extracted{$file}{$id} }{@files};
    }
}
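
For reference, here is a minimal sketch of the input shape the script expects, inferred only from the XPath expressions and attribute names it uses (/root/file, the original attribute, body/unit, the id attribute, and title). The attribute values and title text are hypothetical placeholders, not taken from your data:

```xml
<!-- Hypothetical sample: the structure follows the script's XPath expressions, the values are made up. -->
<root>
  <file original="example.txt">
    <body>
      <unit id="1">
        <title>First title</title>
      </unit>
      <unit id="2">
        <title>Second title</title>
      </unit>
    </body>
  </file>
</root>
```

Given two such files on the command line, each output line holds the original attribute, the unit id, and the title from the first and the second file, separated by tabs. The order of the rows is not fixed, because the script iterates over hash keys.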