This worked for me, if I understand you correctly:
#!/usr/bin/perl -w
use strict;
use XML::Twig;

my $twig = XML::Twig->new();
my $file = "message.xml";
$twig->parsefile($file);
my $root = $twig->root;
my @all_text = gather_text($root);
print join("\n---", @all_text), "\n";

sub gather_text {
    my $node = shift;
    my (@children) = $node->children();
    if (not @children) {
        # this tag has no children. grab its text data, with no
        # surrounding tag, and return it.
        return $node->sprint('NOTAGS');
    }
    else {
        # recurse into each child
        my (@text);
        foreach (@children) {
            push @text, (gather_text($_));
        }
        return @text;
    }
}
When I test it on the XML of your original post, I get the following results:
[jeremy@serpent pm-test]$ ./term-xml.pl
perlquestion
---
Isanchez
---
Hi,
I have to recursively go over xml files (that look very
differently from each other) in a folder and collect every
content for every tag. I have tried with TWig code but it
doesn't work because it grabs all tags including parent tags
and then prints first the contents that belong to the
parents i.e. all and then the contents again but this time
for each doughter node. Can any wise monk give some idea of
what to do ?
thanks,
---
4
[jeremy@serpent pm-test]$
That prints each piece of text only once, instead of once as part of its parent tag and then again for each child.
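If you want to try the idea without a file on disk, here is a minimal sketch of the same recursion run against an inline string. The sample XML and its element names are made up for illustration (they are not the XML from your post), and I am assuming that `text` is an acceptable stand-in for `sprint('NOTAGS')` on leaf nodes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document, not the XML from the original post.
my $xml = '<node><type>perlquestion</type><author>Isanchez</author><doc>Hi</doc></node>';

my $twig = XML::Twig->new();
$twig->parse($xml);

my @all_text = gather_text( $twig->root );
print join( "\n---\n", @all_text ), "\n";

sub gather_text {
    my $node     = shift;
    my @children = $node->children();

    # A node with no children is a leaf (typically a text node in
    # XML::Twig), so its text is returned as-is.
    return $node->text unless @children;

    # A node with children contributes nothing itself; only what its
    # children return is collected, so no text is reported twice.
    return map { gather_text($_) } @children;
}
```

Because a parent only passes along what its children return, the parent's combined text never shows up as a separate entry. With the sample above this should print the three leaf strings, separated by `---` lines.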
Does that help?
(Nice to see you here, Isanchez!)
Update: corrected link to XML.