PerlMonks  

XML::Twig 'cut' and 'paste' Question

by prasadbabu (Prior)
on Jul 28, 2006 at 14:10 UTC ( [id://564360]=perlquestion )

prasadbabu has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am working on a big project that mainly uses XML::Twig. Today, during testing, I came across a strange problem. I tried several approaches but could not find out why the problem occurs, so I need some help from you.

<?xml version="1.0"?>
<article>
<label>here label for testing</label>
<fig id="fig001" position="float"><label>Fig. 1.</label><caption><p>first</p></caption></fig>
<fig id="fig002" position="float"><label>Fig. 2.</label><caption><p>second</p></caption></fig>
<fig id="fig003" position="float"><label>Fig. 3.</label><caption><p>third</p></caption></fig>
<fig id="fig004" position="float"><label>Fig. 4.</label><caption><p>fourth</p></caption></fig>
<fig id="fig005" position="float"><label>Fig. 5.</label><caption><p>fifth</p></caption></fig>
<fig id="fig006" position="float"><label>Fig. 6.</label><caption><p>sixth</p></caption></fig>
</article>

The above is the sample XML file. I need to cut the 'label' element inside each 'fig' and paste it as the first child of the 'caption'. So I tried the code shown below, which is a sample taken from my original code.

use strict;
use XML::Twig;

my $twig = new XML::Twig( );
$twig->parsefile('1.xml');

my $count = $twig->get_xpath('//fig//label');

for my $c (0..($count-1))
{
    my $t  = $twig->get_xpath('//fig//label', $c);    # //fig/label - here problem comes
    my $pl = $twig->get_xpath('//fig//caption', $c);
    #my $cc = $t->print;     # print for testing
    #my $aa = $twig->print;  # print for testing
    #print "$cc\n\n$c\n**$aa\n";
    my $cut = $t->cut;
    $cut->paste('first_child', $pl);
}
$twig->print;

I got exactly the expected output, shown below, by running the above code.

Correct Output: XPath Expression - "//fig//label"
---------------
<?xml version="1.0"?>
<article>
<label>here label for testing</label>
<fig id="fig001" position="float"><caption><label>Fig. 1.</label><p>first</p></caption></fig>
<fig id="fig002" position="float"><caption><label>Fig. 2.</label><p>second</p></caption></fig>
<fig id="fig003" position="float"><caption><label>Fig. 3.</label><p>third</p></caption></fig>
<fig id="fig004" position="float"><caption><label>Fig. 4.</label><p>fourth</p></caption></fig>
<fig id="fig005" position="float"><caption><label>Fig. 5.</label><p>fifth</p></caption></fig>
<fig id="fig006" position="float"><caption><label>Fig. 6.</label><p>sixth</p></caption></fig>
</article>

My question is this: I get the expected output when I give the XPath expression as '//fig//label'. But if I change the XPath expression to '//fig/label', I get an error. To identify the problem I printed inside the loop on every iteration, and the last output I got is shown below. The XPath expression is correct, because 'label' is the very first child of 'fig', but I get an error and not the expected output. Where am I going wrong?

Wrong output: XPath Expression - "//fig/label"
-------------
<article>
<label>here label for testing</label>
<fig id="fig001" position="float"><caption><label>Fig. 1.</label><p>first</p></caption></fig>
<fig id="fig002" position="float"><label>Fig. 2.</label><caption><label>Fig. 3.</label><p>second</p></caption></fig>
<fig id="fig003" position="float"><caption><p>third</p></caption></fig>
<fig id="fig004" position="float"><label>Fig. 4.</label><caption><p>fourth</p></caption></fig>
<fig id="fig005" position="float"><label>Fig. 5.</label><caption><p>fifth</p></caption></fig>
<fig id="fig006" position="float"><label>Fig. 6.</label><caption><p>sixth</p></caption></fig>
</article>

Regards,
Prasad

Replies are listed 'Best First'.
Re: XML::Twig 'cut' and 'paste' Question
by Tanktalus (Canon) on Jul 28, 2006 at 19:19 UTC

    I'm wondering if you're complicating things a bit. Rather than count, and then loop, try simply looping. I have a suspicion that your cuts and pastes are changing the number of matches that //fig/label sees - once you've changed the first fig/label, it's no longer a fig/label. Try this:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    my $twig = new XML::Twig( pretty_print => 'record_c' );
    $twig->parsefile('1.xml');

    for my $fig ( $twig->get_xpath('//fig') )
    {
        # Look up the label and caption relative to this fig only.
        my $label   = $fig->get_xpath('./label', 0);
        my $caption = $fig->get_xpath('./caption', 0);
        next unless $label and $caption;

        $label->cut();
        $label->paste('first_child', $caption);
    }
    $twig->print;

    That seems to do the trick. Also note the better variable names ;-)

Re: XML::Twig 'cut' and 'paste' Question
by Ieronim (Friar) on Jul 28, 2006 at 19:30 UTC
    You modify the document with every cut and paste: the <label> elements are direct children of their <fig> elements at first, but after the first iteration the first label becomes a grandchild of <fig>! Of course, you get the error Can't call method "cut" on an undefined value when you ask for the $c'th element of a list that no longer has that many labels.
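To make the shrinking list visible, here is a minimal sketch. It assumes a cut-down, three-figure copy of the sample document, and the helper name direct_labels is made up for this illustration. It lists what '//fig/label' matches before and after a single move.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Cut-down copy of the sample document (three figures only; an assumption
# made to keep the illustration short).
my $xml = <<'XML';
<article>
<label>here label for testing</label>
<fig id="fig001"><label>Fig. 1.</label><caption><p>first</p></caption></fig>
<fig id="fig002"><label>Fig. 2.</label><caption><p>second</p></caption></fig>
<fig id="fig003"><label>Fig. 3.</label><caption><p>third</p></caption></fig>
</article>
XML

my $twig = XML::Twig->new;
$twig->parse($xml);

# Hypothetical helper: texts of the labels that '//fig/label' matches now.
sub direct_labels {
    return map { $_->text } $twig->get_xpath('//fig/label');
}

my @before = direct_labels();

# Do what the first loop iteration does: move match 0 into caption 0.
my $label   = $twig->get_xpath('//fig/label',   0);
my $caption = $twig->get_xpath('//fig/caption', 0);
$label->move( first_child => $caption );

# Query again: the moved label is no longer a direct child of a fig.
my @after = direct_labels();

print "before: @before\n";
print "after:  @after\n";
```

After the move, the second list should contain only "Fig. 2." and "Fig. 3.", so index 1 now refers to "Fig. 3." rather than "Fig. 2.". That matches the wrong output in the question, where the label of fig003 ended up inside the caption of fig002.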

    This error appeared because you chose an extremely inefficient approach: you call get_xpath again on every pass through the loop instead of fetching all the elements once.

    The corrected variant of your code:

    A more efficient and cleaner way that produces the same output:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use XML::Twig;

    my $twig = new XML::Twig(
        pretty_print => 'indented',
    );
    $twig->parse(<<'XML');
    <?xml version="1.0"?>
    <article>
    <label>here label for testing</label>
    <fig id="fig001" position="float"><label>Fig. 1.</label><caption><p>first</p></caption></fig>
    <fig id="fig002" position="float"><label>Fig. 2.</label><caption><p>second</p></caption></fig>
    <fig id="fig003" position="float"><label>Fig. 3.</label><caption><p>third</p></caption></fig>
    <fig id="fig004" position="float"><label>Fig. 4.</label><caption><p>fourth</p></caption></fig>
    <fig id="fig005" position="float"><label>Fig. 5.</label><caption><p>fifth</p></caption></fig>
    <fig id="fig006" position="float"><label>Fig. 6.</label><caption><p>sixth</p></caption></fig>
    </article>
    XML

    # Fetch both lists once, before anything in the tree is moved.
    my @labels   = $twig->get_xpath('/article/fig/label');
    my @captions = $twig->get_xpath('/article/fig/caption');
    for my $i (0..$#labels) {
        $labels[$i]->move('first_child', $captions[$i]);
    }
    $twig->print;

    But I myself would prefer to let twig_handlers do the job:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use XML::Twig;

    my $twig = new XML::Twig(
        twig_handlers => {
            # Called once for each completely parsed /article/fig element.
            '/article/fig' => sub {
                my $label   = $_->first_child('label')   or return;
                my $caption = $_->first_child('caption') or return;
                $label->move('first_child', $caption);
                $_->flush;
            }
        },
        pretty_print => 'indented',
    );
    $twig->parse(<<'XML');
    <?xml version="1.0"?>
    <article>
    <label>here label for testing</label>
    <fig id="fig001" position="float"><label>Fig. 1.</label><caption><p>first</p></caption></fig>
    <fig id="fig002" position="float"><label>Fig. 2.</label><caption><p>second</p></caption></fig>
    <fig id="fig003" position="float"><label>Fig. 3.</label><caption><p>third</p></caption></fig>
    <fig id="fig004" position="float"><label>Fig. 4.</label><caption><p>fourth</p></caption></fig>
    <fig id="fig005" position="float"><label>Fig. 5.</label><caption><p>fifth</p></caption></fig>
    <fig id="fig006" position="float"><label>Fig. 6.</label><caption><p>sixth</p></caption></fig>
    </article>
    XML
    $twig->print;

         s;;Just-me-not-h-Ni-m-P-Ni-lm-I-ar-O-Ni;;tr?IerONim-?HAcker ?d;print
Re: XML::Twig 'cut' and 'paste' Question
by GrandFather (Saint) on Jul 28, 2006 at 19:43 UTC

    In XPath syntax / specifies an absolute path whereas // specifies a relative path. /article/fig/label would work.

    See http://www.w3.org/TR/xpath for the full XPATH spec.


    DWIM is Perl's answer to Gödel
      You are not fully right. Directly from the document you cite:
      //para selects all the para descendants of the document root and thus selects all para elements in the same document as the context node
      So //fig/label selects all 'label' children of all 'fig' elements in the document.
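A small sketch can show the difference between the expressions discussed in this thread. It assumes a two-figure document made up for this illustration, in which the second label has already been moved into its caption, and counts the matches for each expression.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Illustrative document (an assumption, not the original file): the label
# of fig002 already sits inside its caption.
my $twig = XML::Twig->new;
$twig->parse(<<'XML');
<article>
<label>here label for testing</label>
<fig id="fig001"><label>Fig. 1.</label><caption><p>first</p></caption></fig>
<fig id="fig002"><caption><label>Fig. 2.</label><p>second</p></caption></fig>
</article>
XML

my %count;
for my $xpath ('//label', '//fig/label', '//fig//label', '/article/fig/label') {
    # In list context get_xpath returns every matching element.
    my @hits = $twig->get_xpath($xpath);
    $count{$xpath} = @hits;
    printf "%-20s matches %d element(s)\n", $xpath, scalar @hits;
}
```

Here '//label' should match all three labels, '//fig//label' the two that sit anywhere below a fig, and both '//fig/label' and '/article/fig/label' only the one label that is still a direct child of its fig. The moved label drops out of the two child-step expressions, whichever of them is used.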

           s;;Just-me-not-h-Ni-m-P-Ni-lm-I-ar-O-Ni;;tr?IerONim-?HAcker ?d;print

Node Type: perlquestion [id://564360]
Approved by Hue-Bond