PerlMonks  

prune xml of empty elements

by metaperl (Curate)
on Jul 14, 2011 at 16:51 UTC ( [id://914390]=perlquestion )

metaperl has asked for the wisdom of the Perl Monks concerning the following question:

I would like to use a CPAN module to clean my XML of any empty tags. I.e., given:

    <note>
      <to>
        <person>Satan</person>
      </to>
      <from via="postcard" russia="with love">moneypenny</from>
      <heading></heading>
      <body></body>
    </note>

I would like to see the body and heading elements stripped out, leaving XML with only content-laden elements in it.



The mantra of every experienced web application developer is the same: thou shalt separate business logic from display. Ironically, almost all template engines allow violation of this separation principle, which is the very impetus for HTML template engine development.

-- Terence Parr, "Enforcing Strict Model View Separation in Template Engines"

Replies are listed 'Best First'.
Re: prune xml of empty elements
by toolic (Bishop) on Jul 14, 2011 at 17:11 UTC
    XML::Twig can prune out all elements which have no text content:
    use warnings;
    use strict;
    use XML::Twig;

    my $str = <<EOF;
    <note>
      <to>
        <person>Satan</person>
      </to>
      <from via="postcard" russia="with love">moneypenny</from>
      <heading></heading>
      <body></body>
    </note>
    EOF

    my $t = XML::Twig->new(
        pretty_print  => 'indented',
        twig_handlers => { _all_ => sub { $_->delete() unless $_->text() } }
    );
    $t->parse($str);
    $t->print();

    __END__

    <note>
      <to>
        <person>Satan</person>
      </to>
      <from russia="with love" via="postcard">moneypenny</from>
    </note>
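One caveat with the handler above: `unless $_->text()` is a truthiness test, so an element whose only text is the string "0" would be deleted as well. The following is a minimal sketch, not part of the original reply, that compares the truthiness test with a length test. The `prune` helper, its predicate argument, and the `<count>` element are hypothetical names introduced only for illustration.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical input: <count> holds the text "0", which is false in Perl,
# and <body> is genuinely empty.
my $xml = '<note><to>Satan</to><count>0</count><body></body></note>';

# Hypothetical helper: parse $xml, delete every element for which the
# $is_empty callback returns true, and return the result as a string.
sub prune {
    my ($input, $is_empty) = @_;
    my $t = XML::Twig->new(
        twig_handlers => {
            # Inside a handler, $_ is aliased to the current element.
            _all_ => sub { $_->delete if $is_empty->($_) },
        },
    );
    $t->parse($input);
    return $t->sprint;
}

# Truthiness test, as in the reply above: "0" counts as empty.
my $by_truth  = prune($xml, sub { !$_[0]->text });

# Length test: only a zero-length text counts as empty.
my $by_length = prune($xml, sub { !length $_[0]->text });

print "truthiness: $by_truth\n";
print "length:     $by_length\n";
```

If elements containing "0" matter in your data, the length test is the safer of the two. The structural test in the follow-up below (`has_child` / `has_atts`) should avoid the issue as well, since it does not look at the text value.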
      Updated based on this clarification
      #!/usr/bin/perl --
      use warnings;
      use strict;
      use XML::Twig;

      my $str = <<'EOF';
      <note>
        <to>
          <person>Satan</person>
        </to>
        <from via="postcard" russia="with love">moneypenny</from>
        <heading></heading>
        <body></body>
        <fudge a="body"></fudge>
        <fudge><a></a></fudge>
        <fudge><a body="good"></a></fudge>
      </note>
      EOF

      my $t = XML::Twig->new(
          pretty_print  => 'indented',
          twig_handlers => {
              _all_ => sub { $_->delete() unless $_->has_child or $_->has_atts },
          },
      );
      $t->parse($str);
      $t->print();

      __END__

      <note>
        <to>
          <person>Satan</person>
        </to>
        <from russia="with love" via="postcard">moneypenny</from>
        <fudge a="body"></fudge>
        <fudge>
          <a body="good"></a>
        </fudge>
      </note>
Re: prune xml of empty elements
by MidLifeXis (Monsignor) on Jul 14, 2011 at 17:48 UTC

    Is a tag with only attributes considered empty?

    --MidLifeXis

Re: prune xml of empty elements
by Logicus (Initiate) on Jul 14, 2011 at 20:58 UTC
    The easiest way to do this is:

    use Modern::Perl;

    my $str = qq@
    <note>
      <to>
        <person>Satan</person>
      </to>
      <from via="postcard" russia="with love">moneypenny</from>
      <heading></heading>
      <body></body>
    </note>@;

    $str =~ s@<(.*?)></\1>@@g;

    say $str;

    Output:

    <note>
      <to>
        <person>Satan</person>
      </to>
      <from via="postcard" russia="with love">moneypenny</from>
    </note>
    No modules or parsing needed, just a simple one line regex.
      No modules or parsing needed, just a simple one line regex.

      The only problem is that it doesn't prune every valid empty XML tag. If getting the correct result is important, use an XML parser.
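To make the objection concrete, here is a small sketch using core Perl only. It runs the one-line substitution from the parent reply against a few hypothetical inputs (not taken from the thread) that still contain a content-free element afterwards. The `prune_simple` helper name is made up for this illustration.

```perl
use strict;
use warnings;

# The one-line substitution from the parent reply, wrapped in a helper.
sub prune_simple {
    my ($xml) = @_;
    $xml =~ s@<(.*?)></\1>@@g;
    return $xml;
}

# Hypothetical inputs, each holding an element with no real content.
my @cases = (
    [ 'plain pair'      => '<note><body></body></note>' ],           # handled
    [ 'self-closing'    => '<note><body/></note>' ],                 # missed
    [ 'whitespace only' => '<note><body> </body></note>' ],          # missed
    [ 'nested empties'  => '<note><fudge><a></a></fudge></note>' ],  # half done
);

for my $case (@cases) {
    my ($name, $xml) = @$case;
    printf "%-16s %s\n%-16s %s\n", $name, $xml, '', prune_simple($xml);
}
```

Only the first case comes out fully pruned. The self-closing and whitespace-only elements are left untouched, and the nested case loses `<a></a>` but keeps the now-empty `<fudge></fudge>`, because a single `s///g` pass does not rescan text it has already moved past.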

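      # Drop self-closing elements; [^<>]* keeps the match inside a single tag.
      $str =~ s@<[^<>]*/>@@g;
      # Repeatedly drop open/close pairs that hold nothing but whitespace.
      1 while $str =~ s@<(.*?)>\s*</\1>@@g;
      # Remove the whitespace-only lines left behind.
      $str =~ s@(?<=>)\s+(?=\n)@@g;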
      s@<(.*?)></\1>@@g;

      I know one guy who writes regexes like that.
