PerlMonks  

need a xml parser module

by Anonymous Monk
on Dec 12, 2007 at 03:34 UTC ( [id://656542]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,
I need to parse an XML file and load the parsed contents into a database.
Is there a module I can use that does this?
Example:
<InteractionComponentList>
<InteractionComponent role_type="input"
molecule_idref="200568"> </InteractionComponent>
I need the output to look like this:
role_type="input"
molecule_idref="200568"

Replies are listed 'Best First'.
Re: need a xml parser module
by mirod (Canon) on Dec 12, 2007 at 06:11 UTC

      As it seems that mirod still did not have time to update those pages and to make the comparison more complete ... here's the solution using XML::Rules:

      #!/usr/bin/perl -w
      use strict;
      use XML::Rules;
      use FindBin qw($Bin);
      use lib $Bin;
      # Local helper module, not shown in this post. It presumably supplies
      # init_db(), the check_*() helpers, rejected(), store_all() and %dir.
      use wtr2_base;

      my $DEBUG = 0;

      init_db();

      my $parser = XML::Rules->new(
          start_rules => [
              Finvoice => sub {
                  $_[1]->{errors} = [];    # initialization
                  reset_default_row_id();
                  return 1;    # necessary so that the tag is actually processed!
              },
              'PaymentTermsDetails,VatSpecificationDetails,EpiDetails,SellerAccountDetails,BuyerPostalAddressDetails,SellerInformationDetails' => 'skip',
                  # I don't care about those at all
          ],
          rules => [
              _default => 'content',
                  # unless I say otherwise I'm only interested in the content
              'PaymentStatusDetails,BuyerCommunicationDetails,SellerPostalAddressDetails,SellerCommunicationDetails' => 'pass no content',
                  # I want to dissolve those into their parent tag. I'm not
                  # interested in the whitespace around the child nodes
              'DeliveredQuantity,OrderedQuantity' => 'as is',
                  # I need both the content and the attributes of those two
              BuyerPartyDetails => sub {
                  delete $_[1]->{_content};
                  check_buyer( $_[1]->{BuyerPartyIdentifier}, $_[1]->{BuyerOrganisationName}, $_[3]->[0]{errors} );
                  return $_[0] => $_[1];
              },
              OrderIdentifier => sub {
                  check_po( $_[1]->{_content}, $_[3]->[0]{errors} );
                  return OrderIdentifier => $_[1]->{_content};
              },
              InvoiceRow => sub {
                  delete $_[1]->{_content};
                  $_[1]->{RowIdentifier} = default_row_id() unless $_[1]->{RowIdentifier};
                  print "checking row $_[1]->{RowIdentifier}\n" if $DEBUG;
                  check_qtty(
                      $_[1]->{RowIdentifier},
                      $_[1]->{DeliveredQuantity}->{_content},
                      $_[1]->{DeliveredQuantity}->{QuantityUnitCode},
                      $_[1]->{OrderedQuantity}->{_content},
                      $_[1]->{OrderedQuantity}->{QuantityUnitCode},
                      $_[3]->[0]{errors}
                  );
                  return '@invoicerow' => {
                      row_id        => $_[1]->{RowIdentifier},
                      sku           => $_[1]->{ArticleIdentifier},
                      name          => $_[1]->{ArticleName},
                      qty           => $_[1]->{DeliveredQuantity}->{_content},
                      qty_unit      => $_[1]->{DeliveredQuantity}->{QuantityUnitCode},
                      unit_price    => $_[1]->{UnitPriceAmount},
                      amount_no_tax => $_[1]->{RowVatExcludedAmount},
                      tax           => $_[1]->{RowVatAmount},
                      amount        => $_[1]->{RowAmount},
                  }
              },
              InvoiceDetails => sub {
                  return invoice => {
                      number        => $_[1]->{InvoiceNumber},
                      date          => $_[1]->{InvoiceDate},
                      po            => $_[1]->{OrderIdentifier},
                      amount_no_tax => $_[1]->{InvoiceTotalVatExcludedAmount},
                      tax           => $_[1]->{InvoiceTotalVatAmount},
                      amount        => $_[1]->{InvoiceTotalVatIncludedAmount},
                  },
              },
              SellerPartyDetails => sub {
                  return
                      seller => {
                          identifier => $_[1]->{SellerPartyIdentifier},
                          name       => $_[1]->{SellerOrganisationName},
                          tax_code   => $_[1]->{SellerOrganisationTaxCode},
                      },
                      address => {
                          street       => $_[1]->{SellerStreetName},
                          town         => $_[1]->{SellerTownName},
                          zip          => $_[1]->{SellerPostCodeIdentifier},
                          country_code => $_[1]->{CountryCode},
                          po_box       => $_[1]->{SellerPostOfficeBoxIdentifier},
                      }
              },
              Finvoice => sub {
                  delete $_[1]->{_content};
                  $_[1]->{invoice}{payment_status} = delete $_[1]->{PaymentStatusCode};
                  return
                      invoice    => $_[1]->{invoice},
                      seller     => $_[1]->{seller},
                      address    => $_[1]->{address},
                      invoicerow => $_[1]->{invoicerow},
                      errors     => $_[1]->{errors},
                      contact    => {
                          name  => $_[1]->{SellerContactPersonName},
                          phone => $_[1]->{SellerPhoneNumberIdentifier},
                          email => $_[1]->{SellerEmailaddressIdentifier},
                      };
              },
          ],
      );

      my $filter = XML::Rules->new(
          style => 'filter',
          rules => [
              _default => 'raw',
              Finvoice => sub {
                  push @{$_[1]->{_content}}, " ", [errors => {error => $_[4]->{parameters}}], "\n";
                  return ($_[0] => $_[1]);
              }
          ]
      );

      # "@ARGV || (...)" would evaluate @ARGV in scalar context and yield its
      # element count, so test explicitly and fall back to the glob.
      my @files = @ARGV ? @ARGV : glob("$dir{invoices}/*.xml");

      foreach my $file (@files) {
          print "Processing $file\n" if $DEBUG;
          my $data = $parser->parsefile($file);
          if (@{$data->{errors}}) {
              my $errors = $data->{errors};
              print "ERROR in $file\n ", join( "\n ", @$errors), "\n";
              my $rejected_file = rejected( $file);
              print "adding errors in $rejected_file\n" if( $DEBUG);
              open my $OUT, '>', $rejected_file or die "Can't open '$rejected_file' for writing: $^E\n";
              $filter->filterfile( $file, $OUT, $errors);
              close $OUT;
          }
          else {
              print "storing invoice $data->{invoice}->{number}\n";
              store_all( $data);
          }
      }
Re: need a xml parser module
by Gangabass (Vicar) on Dec 12, 2007 at 03:44 UTC
    use XML::Simple; use DBI;

    Also, use <code> tags when posting code.
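
    A minimal sketch of how those two modules might fit together for the XML in the question. It assumes the fragment is closed so that it is well-formed, and it uses an in-memory SQLite database through DBD::SQLite purely for illustration; the table name interaction_component and its columns are hypothetical, not something the original poster specified.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple qw(:strict);
use DBI;

# Sample input from the question, closed so that it parses as well-formed XML.
my $xml = <<'XML';
<InteractionComponentList>
  <InteractionComponent role_type="input" molecule_idref="200568"></InteractionComponent>
</InteractionComponentList>
XML

# ForceArray keeps InteractionComponent an array ref even when there is only
# one element; KeyAttr => [] switches off folding on id/name/key attributes.
my $data = XMLin($xml, ForceArray => ['InteractionComponent'], KeyAttr => []);

# Hypothetical target database: in-memory SQLite (requires DBD::SQLite).
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });
$dbh->do('CREATE TABLE interaction_component (role_type TEXT, molecule_idref TEXT)');

# Placeholders let DBI handle quoting of the attribute values.
my $sth = $dbh->prepare(
    'INSERT INTO interaction_component (role_type, molecule_idref) VALUES (?, ?)');

for my $c (@{ $data->{InteractionComponent} }) {
    $sth->execute($c->{role_type}, $c->{molecule_idref});
    print qq{role_type="$c->{role_type}"\n};
    print qq{molecule_idref="$c->{molecule_idref}"\n};
}

# Read the row count back to confirm the insert.
my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM interaction_component');
```

    For a real database, only the DSN, user name and password passed to DBI->connect should need to change.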

Re: need a xml parser module
by Erez (Priest) on Dec 12, 2007 at 08:40 UTC
    Check the Perl-XML FAQ for assistance regarding choosing the right XML parser.
    Any XML-to-Database module usage depends on your exact needs; if you only need to store a couple of values in a DB, using a module for automating the parsing-extracting-storage process might be overkill.

    Software speaks in tongues of man; I debug, therefore I code.

Re: need a xml parser module
by gube (Parson) on Dec 12, 2007 at 10:35 UTC

    Hi, use the code below:

    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use XML::Simple qw/:strict/;
    use Data::Dumper;

    my $xml = qq{
    <InteractionComponentList>
    <InteractionComponent role_type="input" molecule_idref="200568"></InteractionComponent></InteractionComponentList>};

    # KeyAttr => [] disables attribute folding (the original KeyAttr => 1
    # would have named an attribute called "1" as the key).
    my $xml_simple = XML::Simple->new(KeyAttr => [], KeepRoot => 1, ForceArray => 1);
    my $output = $xml_simple->XMLin($xml);

    # With KeepRoot and ForceArray, every element is wrapped in an array ref.
    my %hash = %{$output->{InteractionComponentList}[0]{InteractionComponent}[0]};
    print Dumper(\%hash);
