PerlMonks  

Parsing XML

by calebcall (Sexton)
on Mar 08, 2014 at 06:20 UTC

calebcall has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to parse some XML. I've been searching for some tutorials, reading perldocs, trying to find examples but can't find anything that's quite what I need. I'm trying to query namecheap.com's api to get a list of my domains and the date they expire. An example of how it outputs is:

<?xml version="1.0" encoding="utf-8"?>
<ApiResponse Status="OK" xmlns="http://api.namecheap.com/xml.response">
  <Errors />
  <Warnings />
  <RequestedCommand>namecheap.domains.getList</RequestedCommand>
  <CommandResponse Type="namecheap.domains.getList">
    <DomainGetListResult>
      <Domain ID="8888888" Name="Domain1.com" Expires="03/31/2015"/>
      <Domain ID="8888889" Name="Domain2.com" Expires="02/25/2015"/>
      <Domain ID="8888899" Name="Domain3.com" Expires="04/01/2015"/>
      <Domain ID="8888999" Name="Domain4.com" Expires="05/20/2015"/>
    </DomainGetListResult>
    <Paging>
      <TotalItems>4</TotalItems>
      <CurrentPage>1</CurrentPage>
      <PageSize>50</PageSize>
    </Paging>
  </CommandResponse>
  <Server>API02</Server>
  <GMTTimeDifference>--5:00</GMTTimeDifference>
  <ExecutionTime>0.008</ExecutionTime>
</ApiResponse>

I've tried many things, but I think I'm close with the following:

#!/usr/bin/env perl
use XML::Simple;
use Data::Dumper;

$xml = new XML::Simple;
$result = $xml->XMLin("myouput.xml");

foreach my $domain (@{$result->{DomainGetListResult}}) {
    print $domain->{DomainGetListResult}->{Domain}->{Name} . "\n";
}

However, this returns nothing. I can use Data Dumper and dump out my $result and it gives me the info posted above (but in a hash). So I know it's getting the data. Just how do I specify the nested values? Do I start at the farthest out? (in this case ApiResponse?)

I'm also not opposed to using something other than XML::Simple if there is a module that is easier to use or better suited to what I am trying to do.

Any help would be appreciated. Thanks

Replies are listed 'Best First'.
Re: Parsing XML
by McA (Priest) on Mar 08, 2014 at 06:40 UTC

    Hi,

    Look at this snippet:

    foreach my $domain (@{$result->{CommandResponse}->{DomainGetListResult}->{Domain}}) {
        print $domain->{Name} . "\n";
        print $domain->{ID} . "\n";
        print $domain->{Expires} . "\n";
    }

    Best regards
    McA

      Perfect! Thanks. So simple now that I see what you did, it works great.

        You're welcome. But be aware that XML::Simple doesn't always play nice. Change your XML file to the following and see what happens:

        <?xml version="1.0" encoding="utf-8"?>
        <ApiResponse Status="OK" xmlns="http://api.namecheap.com/xml.response">
          <Errors />
          <Warnings />
          <RequestedCommand>namecheap.domains.getList</RequestedCommand>
          <CommandResponse Type="namecheap.domains.getList">
            <DomainGetListResult>
              <Domain ID="8888999" Name="Domain4.com" Expires="05/20/2015"/>
            </DomainGetListResult>
            <Paging>
              <TotalItems>4</TotalItems>
              <CurrentPage>1</CurrentPage>
              <PageSize>50</PageSize>
            </Paging>
          </CommandResponse>
          <Server>API02</Server>
          <GMTTimeDifference>--5:00</GMTTimeDifference>
          <ExecutionTime>0.008</ExecutionTime>
        </ApiResponse>

        You'll see your script dies.

        XML::Simple creates an array ref when it sees more than one element, but a simple hash ref when there is one.

        Call XMLin this way to force the Domain element to always be handled as an array, even when there is only one:

        my $result = $xml->XMLin("myouput.xml", ForceArray => ['Domain']);
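To see the difference ForceArray makes, here is a minimal sketch. It assumes XML::Simple is installed and uses a trimmed-down copy of the single-domain response as an inline string; the variable names $plain and $forced are just for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Trimmed-down response containing a single <Domain> element.
my $xml = <<'EOX';
<?xml version="1.0" encoding="utf-8"?>
<ApiResponse Status="OK" xmlns="http://api.namecheap.com/xml.response">
  <CommandResponse Type="namecheap.domains.getList">
    <DomainGetListResult>
      <Domain ID="8888999" Name="Domain4.com" Expires="05/20/2015"/>
    </DomainGetListResult>
  </CommandResponse>
</ApiResponse>
EOX

# Without ForceArray a lone <Domain> becomes a plain hash ref.
my $plain = XMLin($xml);

# With ForceArray it is wrapped in an array ref, so @{...} is safe.
my $forced = XMLin($xml, ForceArray => ['Domain']);

print ref($plain->{CommandResponse}{DomainGetListResult}{Domain}),  "\n";  # HASH
print ref($forced->{CommandResponse}{DomainGetListResult}{Domain}), "\n";  # ARRAY

foreach my $domain (@{ $forced->{CommandResponse}{DomainGetListResult}{Domain} }) {
    print "$domain->{Name} expires $domain->{Expires}\n";
}
```

The same foreach loop then works unchanged whether the API returns one domain or many.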

        When you start learning XML handling modules, have a look at XML::Twig.

        McA

      So this works great if multiple items (domains in this case) are returned. I've got cases where only a single domain is returned, and I get the following error:

      Not an ARRAY reference at ./namecheap.pl line 99

      Line 99 is the line my foreach loop begins on. In my uneducated opinion, it is because only a single result is returned and I'm dereferencing it with @. I've taken a few other stabs at handling both single and multiple results but can't come up with anything. My next guess is to check how many results I have (i.e. more than one) and handle the two cases separately... but that seems like a lot of redundant work, writing the exact same thing twice for different results.
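One way to handle both shapes without duplicating the loop is to normalize the value before iterating. The following is only a sketch: domain_list is a hypothetical helper (not part of XML::Simple), and the two data structures are hand-written stand-ins for what XMLin returns for several domains and for a single domain.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return a list of domain hashes whether the node is
# an array ref (several <Domain> elements), a hash ref (exactly one),
# or undefined (none).
sub domain_list {
    my ($node) = @_;
    return ()     unless defined $node;
    return @$node if ref $node eq 'ARRAY';
    return ($node);
}

# Hand-written stand-ins for the two shapes described in this thread.
my $many = { CommandResponse => { DomainGetListResult => { Domain => [
    { ID => 8888888, Name => 'Domain1.com', Expires => '03/31/2015' },
    { ID => 8888889, Name => 'Domain2.com', Expires => '02/25/2015' },
] } } };
my $one = { CommandResponse => { DomainGetListResult => { Domain =>
    { ID => 8888999, Name => 'Domain4.com', Expires => '05/20/2015' },
} } };

# The same loop body serves both cases.
for my $result ($many, $one) {
    my $node = $result->{CommandResponse}{DomainGetListResult}{Domain};
    for my $domain (domain_list($node)) {
        print "$domain->{Name} expires $domain->{Expires}\n";
    }
}
```

This prints one line per domain for both inputs. It is an alternative to ForceArray, not a replacement for it; ForceArray fixes the shape at parse time, while this approach copes with whatever shape arrives.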

        Did you use ForceArray as McA suggested?
        That is the problem with XML::Simple :) Sure, there are options, but it's easier to use XML::Rules; it is XML::Simple on steroids.
Re: Parsing XML
by kcott (Archbishop) on Mar 08, 2014 at 07:01 UTC

    G'day calebcall,

    Using Data::Dump to show the structure of $result:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use XML::Simple;

    my $xml = XML::Simple->new();
    my $result = $xml->XMLin(<<'EOX');
    <?xml version="1.0" encoding="utf-8"?>
    <ApiResponse Status="OK" xmlns="http://api.namecheap.com/xml.response">
      <Errors />
      <Warnings />
      <RequestedCommand>namecheap.domains.getList</RequestedCommand>
      <CommandResponse Type="namecheap.domains.getList">
        <DomainGetListResult>
          <Domain ID="8888888" Name="Domain1.com" Expires="03/31/2015"/>
          <Domain ID="8888889" Name="Domain2.com" Expires="02/25/2015"/>
          <Domain ID="8888899" Name="Domain3.com" Expires="04/01/2015"/>
          <Domain ID="8888999" Name="Domain4.com" Expires="05/20/2015"/>
        </DomainGetListResult>
        <Paging>
          <TotalItems>4</TotalItems>
          <CurrentPage>1</CurrentPage>
          <PageSize>50</PageSize>
        </Paging>
      </CommandResponse>
      <Server>API02</Server>
      <GMTTimeDifference>--5:00</GMTTimeDifference>
      <ExecutionTime>0.008</ExecutionTime>
    </ApiResponse>
    EOX

    use Data::Dump;
    dd $result;

    I get:

    {
      CommandResponse => {
        DomainGetListResult => {
          Domain => [
            { Expires => "03/31/2015", ID => 8888888, Name => "Domain1.com" },
            { Expires => "02/25/2015", ID => 8888889, Name => "Domain2.com" },
            { Expires => "04/01/2015", ID => 8888899, Name => "Domain3.com" },
            { Expires => "05/20/2015", ID => 8888999, Name => "Domain4.com" },
          ],
        },
        Paging => { CurrentPage => 1, PageSize => 50, TotalItems => 4 },
        Type => "namecheap.domains.getList",
      },
      Errors => {},
      ExecutionTime => 0.008,
      GMTTimeDifference => "--5:00",
      RequestedCommand => "namecheap.domains.getList",
      Server => "API02",
      Status => "OK",
      Warnings => {},
      xmlns => "http://api.namecheap.com/xml.response",
    }

    So, changing your loop to this:

    foreach my $domain (@{$result->{CommandResponse}{DomainGetListResult}{Domain}}) {
        print $domain->{Name} . "\n";
    }

    I get:

    Domain1.com
    Domain2.com
    Domain3.com
    Domain4.com

    -- Ken

Re: Parsing XML
by choroba (Cardinal) on Mar 11, 2014 at 09:02 UTC
    Instead of XML::Simple, I prefer XML::XSH2, a wrapper around XML::LibXML.

    open myoutput.xml ;
    register-namespace nc http://api.namecheap.com/xml.response ;
    for //nc:Domain echo @Name 'expires on' @Expires ;
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Parsing XML
by Discipulus (Canon) on Mar 11, 2014 at 08:58 UTC
    Hello, sorry if I missed this...
    May I suggest switching to XML::Twig? I struggled to choose the right tool, and now I'm very happy with this one.

    Here is the proposed solution using XML::Twig:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    my $t = XML::Twig->new(
        pretty_print  => 'indented',
        twig_handlers => {
            #'Domain'=>sub{$_[1]->print,"\n";}, ##print the raw xml
            'Domain' => sub { print $_[1]->att('Name'), ' expires ', $_[1]->att('Expires'), "\n"; },
        },
    );
    $/ = '';
    $t->parse(<DATA>);
    __DATA__
    <?xml version="1.0" encoding="utf-8"?>
    <ApiResponse Status="OK" xmlns="http://api.namecheap.com/xml.response">
      <Errors />
      <Warnings />
      <RequestedCommand>namecheap.domains.getList</RequestedCommand>
      <CommandResponse Type="namecheap.domains.getList">
        <DomainGetListResult>
          <Domain ID="8888888" Name="Domain1.com" Expires="03/31/2015"/>
          <Domain ID="8888889" Name="Domain2.com" Expires="02/25/2015"/>
          <Domain ID="8888899" Name="Domain3.com" Expires="04/01/2015"/>
          <Domain ID="8888999" Name="Domain4.com" Expires="05/20/2015"/>
        </DomainGetListResult>
        <Paging>
          <TotalItems>4</TotalItems>
          <CurrentPage>1</CurrentPage>
          <PageSize>50</PageSize>
        </Paging>
      </CommandResponse>
      <Server>API02</Server>
      <GMTTimeDifference>--5:00</GMTTimeDifference>
      <ExecutionTime>0.008</ExecutionTime>
    </ApiResponse>

    Output:

    Domain1.com expires 03/31/2015
    Domain2.com expires 02/25/2015
    Domain3.com expires 04/01/2015
    Domain4.com expires 05/20/2015

    HtH
    L*
    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
