PerlMonks  

Want to fetch inner most child element first

by dharan (Initiate)
on Sep 23, 2015 at 13:05 UTC

dharan has asked for the wisdom of the Perl Monks concerning the following question:

Hi All, I'm new to Perl. Using the XML::XPath module, I want to fetch the innermost child elements first, then their parent nodes. Below is the XML content.
<mml:math>
  <mml:mi>Goa</mml:mi>
  <mml:mo>abu</mml:mo>
  <mml:msub>
    <mml:mrow>
      <mml:mi>China</mml:mi>
    </mml:mrow>
    <mml:mrow>
      <mml:msub>
        <mml:mrow>
          <mml:mi>poland</mml:mi>
        </mml:mrow>
        <mml:mrow>
          <mml:msub>swift</mml:msub>
          <mml:mi>a</mml:mi>
        </mml:mrow>
      </mml:msub>
    </mml:mrow>
  </mml:msub>
  <mml:mo>miot</mml:mo>
  <mml:msub>
    <mml:mrow>
      <mml:mi>Canada</mml:mi>
    </mml:mrow>
    <mml:mrow>
      <mml:msup>
        <mml:mrow>
          <mml:mi>police</mml:mi>
        </mml:mrow>
        <mml:mrow>
          <mml:mi>bangalore</mml:mi>
        </mml:mrow>
      </mml:msup>
    </mml:mrow>
  </mml:msub>
</mml:math>
Below is the order in which the elements should be accessed (innermost first, then each parent in turn):

1.
<mml:msub><A>swift</A></mml:msub>

2.
<mml:msub>
  <mml:mrow>
    <mml:mi>poland</mml:mi>
  </mml:mrow>
  <mml:mrow>
    <mml:msub><A>swift</A></mml:msub>
    <mml:mi>a</mml:mi>
  </mml:mrow>
</mml:msub>

3.
<mml:msub>
  <mml:mrow>
    <mml:mi>China</mml:mi>
  </mml:mrow>
  <mml:mrow>
    <mml:msub>
      <mml:mrow>
        <mml:mi>poland</mml:mi>
      </mml:mrow>
      <mml:mrow>
        <mml:msub><A>swift</A></mml:msub>
        <mml:mi>a</mml:mi>
      </mml:mrow>
    </mml:msub>
  </mml:mrow>
</mml:msub>

.....

Thanks in advance.

Replies are listed 'Best First'.
Re: Want to fetch inner most child element first
by choroba (Cardinal) on Sep 23, 2015 at 14:05 UTC
    XML::XPath is old and unmaintained. I prefer XML::LibXML, which has a similar API. In XML::XSH2, a wrapper around it, you can easily do
    for &{ sort :d :k count(ancestor::*) //mml:msub } ls . ;

    Output:

    Directly in XML::LibXML:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    use XML::LibXML;

    my $xml = 'XML::LibXML'->load_xml(location => '/path/to/file.xml');

    # Pair each msub with its ancestor count, then print the deepest first.
    say for map  $_->[0],
            sort { $b->[1] <=> $a->[1] }
            map  [ $_, $_->findvalue('count(ancestor::*)') ],
            $xml->findnodes('//mml:msub');
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
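    The depth-sort idea above can be tried without an external file. What follows is only a sketch under assumptions that are not in the thread: the sample is inlined as a string, the root element is given an xmlns:mml declaration (the posted snippet has none, so the MathML namespace URI is assumed), and the prefix is registered explicitly through XML::LibXML::XPathContext. The index used as a tie-breaker and the short labels are illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Assumption: the real file declares the mml prefix on its root element.
my $MML = 'http://www.w3.org/1998/Math/MathML';
my $xml = <<"XML";
<mml:math xmlns:mml="$MML">
  <mml:mi>Goa</mml:mi><mml:mo>abu</mml:mo>
  <mml:msub>
    <mml:mrow><mml:mi>China</mml:mi></mml:mrow>
    <mml:mrow>
      <mml:msub>
        <mml:mrow><mml:mi>poland</mml:mi></mml:mrow>
        <mml:mrow><mml:msub>swift</mml:msub><mml:mi>a</mml:mi></mml:mrow>
      </mml:msub>
    </mml:mrow>
  </mml:msub>
  <mml:mo>miot</mml:mo>
  <mml:msub>
    <mml:mrow><mml:mi>Canada</mml:mi></mml:mrow>
    <mml:mrow>
      <mml:msup>
        <mml:mrow><mml:mi>police</mml:mi></mml:mrow>
        <mml:mrow><mml:mi>bangalore</mml:mi></mml:mrow>
      </mml:msup>
    </mml:mrow>
  </mml:msub>
</mml:math>
XML

my $doc = XML::LibXML->load_xml(string => $xml);
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(mml => $MML);

# Pair each msub with its depth (number of ancestor elements) and its
# position in document order, then sort deepest first; equal depths keep
# document order.
my $i = 0;
my @by_depth =
    map  { $_->[0] }
    sort { $b->[1] <=> $a->[1] or $a->[2] <=> $b->[2] }
    map  { [ $_, $xpc->findvalue('count(ancestor::*)', $_), $i++ ] }
    $xpc->findnodes('//mml:msub');

# Short label for display: the first mml:mi inside the msub, or the
# msub's own text when it contains no mml:mi.
my @labels = map {
    $xpc->findvalue('descendant::mml:mi[1]', $_) || $_->textContent
} @by_depth;

print join(' -> ', @labels), "\n";
```

    For this sample the msub elements come out in the order swift, poland, China, Canada. Replace the label step with $_->toString to get the full markup of each element.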
Re: Want to fetch inner most child element first
by flexvault (Monsignor) on Sep 23, 2015 at 13:21 UTC

    Welcome dharan,

      "...using XML::XPath..."

    And the code that you are having a problem with is where?

    Regards...Ed

    "Well done is better than well said." - Benjamin Franklin

      Thanks, flex, for the swift reply. Below is the code; I'm not sure how to proceed further.

      use XML::XPath;

      my $xp      = XML::XPath->new(filename => 'mathml.txt');
      my $nodeset = $xp->find('/mml:math/*');

      foreach my $node ($nodeset->get_nodelist) {
          # Serialize the matched node back to an XML string
          my $l = XML::XPath::XMLParser::as_string($node);
      }

        The XPath expression that you have written only matches the direct children of the outermost mml:math element. What you might want to use here are axes.

        Try something like this:   (extemporaneous answer, check it yourself)

        //mml:msub[count(descendant::mml:msub) = 0] to retrieve all msub elements that have no descendants of the same type. These must be the leaves of the structure.

        Now, for each of these, perform a separate query for ancestor::mml:msub, which will return a list of all msub ancestors back to the top of the tree. Check whether your module returns that list nearest-first or in document order (outermost first) before relying on it.
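        The two steps above could look roughly like the sketch below. It is written against XML::LibXML rather than XML::XPath, and it assumes the root element declares xmlns:mml with the MathML namespace URI (the posted snippet has no declaration). Instead of trusting the order in which an ancestor:: query comes back, it climbs with parentNode, which is nearest-first by construction. The %seen hash keyed on nodePath and the short labels are illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Assumption: the real file declares the mml prefix on its root element.
my $MML = 'http://www.w3.org/1998/Math/MathML';
my $xml = <<"XML";
<mml:math xmlns:mml="$MML">
  <mml:mi>Goa</mml:mi><mml:mo>abu</mml:mo>
  <mml:msub>
    <mml:mrow><mml:mi>China</mml:mi></mml:mrow>
    <mml:mrow>
      <mml:msub>
        <mml:mrow><mml:mi>poland</mml:mi></mml:mrow>
        <mml:mrow><mml:msub>swift</mml:msub><mml:mi>a</mml:mi></mml:mrow>
      </mml:msub>
    </mml:mrow>
  </mml:msub>
  <mml:mo>miot</mml:mo>
  <mml:msub>
    <mml:mrow><mml:mi>Canada</mml:mi></mml:mrow>
    <mml:mrow>
      <mml:msup>
        <mml:mrow><mml:mi>police</mml:mi></mml:mrow>
        <mml:mrow><mml:mi>bangalore</mml:mi></mml:mrow>
      </mml:msup>
    </mml:mrow>
  </mml:msub>
</mml:math>
XML

my $doc = XML::LibXML->load_xml(string => $xml);
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(mml => $MML);

my (%seen, @order);

# Step 1: leaves, i.e. msub elements with no msub descendants.
for my $leaf ($xpc->findnodes('//mml:msub[count(descendant::mml:msub) = 0]')) {
    # Step 2: climb from the leaf towards the root, keeping msub elements
    # only. The loop stops at the document node, which is not an Element.
    for (my $n = $leaf; $n->isa('XML::LibXML::Element'); $n = $n->parentNode) {
        next unless $n->localname eq 'msub'
                and ($n->namespaceURI // '') eq $MML;
        # An ancestor shared by several leaves is reported only once.
        push @order, $n unless $seen{ $n->nodePath }++;
    }
}

# Short label for display: first mml:mi inside the msub, else its own text.
my @labels = map {
    $xpc->findvalue('descendant::mml:mi[1]', $_) || $_->textContent
} @order;

print join(' -> ', @labels), "\n";
```

        With the sample data this visits the swift leaf, then its poland and China ancestors, and finally the Canada msub, which is a leaf in its own right.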

Re: Want to fetch inner most child element first
by Anonymous Monk on Sep 24, 2015 at 04:17 UTC
    #!/usr/bin/perl
    # http://perlmonks.org/?node_id=1142802
    # want inner child first

    use strict;
    use warnings;

    $_ = join '', <DATA>;
    my @elements;

    s/<mml:msub>\K(\w+)(?=<\/mml:msub>)/<A>$1<\/A>/g;

    push @elements, $1
        while s/(<mml:msub>(?:(?!<mml:msub>).)*?<\/mml:msub>)/ '<' . @elements . '>'/se;

    for my $n (1..@elements) {
        local $_ = $elements[$n - 1];
        1 while s/<(\d+)>/$elements[$1]/;
        print "$n: $_\n\n";
    }

    __DATA__
    <mml:math>
      <mml:mi>Goa</mml:mi>
      <mml:mo>abu</mml:mo>
      <mml:msub>
        <mml:mrow>
          <mml:mi>China</mml:mi>
        </mml:mrow>
        <mml:mrow>
          <mml:msub>
            <mml:mrow>
              <mml:mi>poland</mml:mi>
            </mml:mrow>
            <mml:mrow>
              <mml:msub>swift</mml:msub>
              <mml:mi>a</mml:mi>
            </mml:mrow>
          </mml:msub>
        </mml:mrow>
      </mml:msub>
      <mml:mo>miot</mml:mo>
      <mml:msub>
        <mml:mrow>
          <mml:mi>Canada</mml:mi>
        </mml:mrow>
        <mml:mrow>
          <mml:msup>
            <mml:mrow>
              <mml:mi>police</mml:mi>
            </mml:mrow>
            <mml:mrow>
              <mml:mi>bangalore</mml:mi>
            </mml:mrow>
          </mml:msup>
        </mml:mrow>
      </mml:msub>
    </mml:math>

    hehehe
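    The regex above peels off the innermost msub first and works outwards. For comparison, the same inside-out order can be had from a parsed tree with a post-order walk, where children are visited before their parent. This is only a sketch, not what the poster wrote: it swaps the regex for XML::LibXML, assumes the root element declares xmlns:mml with the MathML namespace URI, and the sub name msub_postorder is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Assumption: the real file declares the mml prefix on its root element.
my $MML = 'http://www.w3.org/1998/Math/MathML';
my $xml = <<"XML";
<mml:math xmlns:mml="$MML">
  <mml:mi>Goa</mml:mi><mml:mo>abu</mml:mo>
  <mml:msub>
    <mml:mrow><mml:mi>China</mml:mi></mml:mrow>
    <mml:mrow>
      <mml:msub>
        <mml:mrow><mml:mi>poland</mml:mi></mml:mrow>
        <mml:mrow><mml:msub>swift</mml:msub><mml:mi>a</mml:mi></mml:mrow>
      </mml:msub>
    </mml:mrow>
  </mml:msub>
  <mml:mo>miot</mml:mo>
  <mml:msub>
    <mml:mrow><mml:mi>Canada</mml:mi></mml:mrow>
    <mml:mrow>
      <mml:msup>
        <mml:mrow><mml:mi>police</mml:mi></mml:mrow>
        <mml:mrow><mml:mi>bangalore</mml:mi></mml:mrow>
      </mml:msup>
    </mml:mrow>
  </mml:msub>
</mml:math>
XML

# Hypothetical helper: return every msub at or below $node, children first.
sub msub_postorder {
    my ($node) = @_;
    my @found;

    # Recurse into element children before looking at the node itself.
    for my $child (grep { $_->isa('XML::LibXML::Element') } $node->childNodes) {
        push @found, msub_postorder($child);
    }

    push @found, $node
        if $node->localname eq 'msub'
       and ($node->namespaceURI // '') eq $MML;

    return @found;
}

my $doc   = XML::LibXML->load_xml(string => $xml);
my @order = msub_postorder($doc->documentElement);

# Short label for display: first mml:mi inside the msub, else its own text.
my @labels = map {
    my ($mi) = $_->getElementsByTagNameNS($MML, 'mi');
    $mi ? $mi->textContent : $_->textContent;
} @order;

print join(' -> ', @labels), "\n";
```

    Unlike the regex version, this does not depend on how the markup is laid out, but it also does not add the <A> wrapper around swift; that would be a separate step.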
