 
PerlMonks  


Hi All,

This is probably a misunderstanding on my part, so your advice is very much appreciated. I made an XML file to show the problem; the real files are much more complex.

<?xml version="1.0" encoding="utf-8"?>
<root>
  <departments>
    <department>
      <name>A</name>
      <articles>
        <article>
          <production>
            <country>France</country>
            <year>1989</year>
          </production>
          <key>001</key>
        </article>
        <article>
          <production>
            <country>Italy</country>
            <year>1991</year>
          </production>
          <key>002</key>
        </article>
        <article>
          <extra>
            <production>
              <country>Germany</country>
              <year>1995</year>
            </production>
            <extrakey>003</extrakey>
          </extra>
        </article>
      </articles>
    </department>
    <department>
      <name>B</name>
      <articles>
        <article>
          <key>004</key>
        </article>
        <article>
          <key>005</key>
        </article>
      </articles>
    </department>
    <department>
      <name>C</name>
    </department>
  </departments>
</root>

I parse the file with XML::Rabbit using the following library file (testlib.pm, also written just for this example).

package House {
    use XML::Rabbit::Root;
    has_xpath_object_list depts => '/root/departments/department' => 'House::Depts';
    finalize_class();
}

package House::Depts {
    use XML::Rabbit;
    has_xpath_object_list articles => './articles/article[key]|./articles/article/extra' => 'House::Article';
    has_xpath_value name => './name';
    finalize_class();
}

package House::Article {
    use XML::Rabbit;
    has_xpath_value key => './key|./extrakey';
    has_xpath_object prod => './production' => 'Article::Production';
    finalize_class();
}

package Article::Production {
    use XML::Rabbit;
    has_xpath_value country => './country';
    has_xpath_value year => './year';
    finalize_class();
}

1;

I then use this library file in the following script.

#!/perl
use strict;
use warnings FATAL => qw(all);
use Encode;
use FindBin qw($Bin);
use lib $Bin;
use testlib;
use Try::Tiny;

my $path = shift or die "No source!\n";
my $string = do { local $/ = undef; open my $in, "<", "$path" or die "$!"; <$in>; };
my $xml = decode('utf8', $string);
my $h = House->new(xml => $xml);

for my $d ( @{$h->depts} ) {
    for my $art ( @{$d->articles} ) {
        my $prod = try {$art->prod} catch {0}; # $art->prod // 0;
        my $country = $prod ? $prod->country : '';
        print join("\t", $d->name, $art->key, $country), $/;
    }
}

As you can see, not all of the articles have a "production" child. The commented-out variant, my $prod = $art->prod // 0;, produces an error: "Attribute (prod) does not pass the type constraint because: Validation failed for 'Article::Production' with value undef at reader House::Article::prod (unknown origin)..." If I use try {...} catch {...} instead, I get the expected output.

A	001	France
A	002	Italy
A	003	Germany
B	004	
B	005	

Now two questions.

  • First, is the use of Try::Tiny here "decent", or is it just a hack, and should the output be achieved without it by changing the code (probably in the library file)?
  • Second, the script complains about a missing xpath_object ("production") but does not seem to mind when an xpath_object_list matches nothing: the loop over articles does not produce an error although there are no articles in department C. Why does it make a difference?
Thank you very much in advance.
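
One possible way to avoid try/catch, given here only as a sketch and not as the recommended XML::Rabbit idiom: since the post shows that has_xpath_object_list copes with zero matches (department C) while has_xpath_object dies on a missing node, the optional "production" child could be declared as a list and its first element taken. The attribute name prods, the helper article_rows and the shortened inline XML are assumptions made up for this illustration; everything else follows the library file above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same class layout as in the post, except that 'production' is declared
# as a (possibly empty) object list. 'prods' is a hypothetical name.
package House {
    use XML::Rabbit::Root;
    has_xpath_object_list depts => '/root/departments/department' => 'House::Depts';
    finalize_class();
}

package House::Depts {
    use XML::Rabbit;
    has_xpath_object_list articles => './articles/article[key]|./articles/article/extra' => 'House::Article';
    has_xpath_value name => './name';
    finalize_class();
}

package House::Article {
    use XML::Rabbit;
    has_xpath_value key => './key|./extrakey';
    # Changed line: a list instead of a single object.
    has_xpath_object_list prods => './production' => 'Article::Production';
    finalize_class();
}

package Article::Production {
    use XML::Rabbit;
    has_xpath_value country => './country';
    has_xpath_value year => './year';
    finalize_class();
}

package main;

# Hypothetical helper: returns one tab-separated row per article.
sub article_rows {
    my ($xml) = @_;
    my $h = House->new( xml => $xml );
    my @rows;
    for my $d ( @{ $h->depts } ) {
        for my $art ( @{ $d->articles } ) {
            # First production object, or undef if the list is empty.
            my ($prod) = @{ $art->prods };
            my $country = $prod ? $prod->country : '';
            push @rows, join( "\t", $d->name, $art->key, $country );
        }
    }
    return @rows;
}

# Shortened version of the example file: one article with a production
# child, one without, and one department with no articles at all.
my $xml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<root><departments>
<department><name>A</name><articles>
<article><production><country>France</country><year>1989</year></production><key>001</key></article>
</articles></department>
<department><name>B</name><articles>
<article><key>004</key></article>
</articles></department>
<department><name>C</name></department>
</departments></root>
XML

print "$_\n" for article_rows($xml);
```

The trade-off is that the accessor no longer says "at most one production" in its name or type, so the calling code has to decide what to do if several production elements ever show up.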


In reply to Parsing XML with XML::Rabbit - two questions by vagabonding electron
