Search and substitute into data structures

by rbi (Monk)
on Apr 30, 2007 at 18:45 UTC ( [id://612830]=perlquestion )

rbi has asked for the wisdom of the Perl Monks concerning the following question:

Hello,
I wanted to remove some common markup (CDATA wrappers) from the text contained in some data structures, and came up with the code below. Is there anything simpler, maybe a one-line search-and-substitute?
Thank you.
use strict;
use warnings;
use Data::Dumper;

my $var = '<Country>&lt;![CDATA[US]]&gt;</Country>';
$var = unwrap_cdata($var);
print $var."\n";

my $hash = {
    america => '<Country>&lt;![CDATA[US]]&gt;</Country>',
    europe  => ['<Country>&lt;![CDATA[IT]]&gt;</Country>','<Country>&lt;![CDATA[UK]]&gt;</Country>'],
};
print Dumper $hash;
$hash = unwrap_cdata($hash);
print Dumper $hash;

sub unwrap_cdata {
    my $var = shift();
    if (ref($var)) {
        $var = unwrap_hash($var) if (ref($var) eq 'HASH');
        $var = unwrap_array($var) if (ref($var) eq 'ARRAY');
    } else {
        $var = unwrap_scalar($var);
    }
    return $var;
}

sub unwrap_hash {
    my $href = shift();
    foreach my $key (keys %{$href}) {
        my $hk = $href->{$key};
        $hk = unwrap_hash($hk) if (ref($hk) eq 'HASH');
        $hk = unwrap_array($hk) if (ref($hk) eq 'ARRAY');
        $hk = unwrap_scalar($hk) if (not ref($hk));
        $href->{$key} = $hk;
    }
    return $href;
}

sub unwrap_array {
    my $aref = shift();
    my @array;
    my $i = -1;
    foreach my $av (@{$aref}) {
        $av = unwrap_hash($av) if (ref($av) eq 'HASH');
        $av = unwrap_array($av) if (ref($av) eq 'ARRAY');
        $av = unwrap_scalar($av) if (not ref($av));
        push @array,$av;
    }
    return \@array;
}

sub unwrap_scalar {
    my $var = shift();
    $var =~ s/<!\[CDATA\[//;
    $var =~ s/&lt;!\[CDATA\[//;
    $var =~ s/]]>//;
    $var =~ s/]]&gt;//;
    return $var;
}

Replies are listed 'Best First'.
Re: Search and substitute into data structures
by Sidhekin (Priest) on Apr 30, 2007 at 19:05 UTC

    Substitution is conceptually a destructive operation, particularly on structures, and that is how you use it, so I would rewrite your unwrap_cdata to be explicitly destructive (modifying its arguments in place) and call it like this:

    use strict;
    use warnings;
    use Data::Dumper;

    my $var = '<Country>&lt;![CDATA[US]]&gt;</Country>';
    unwrap_cdata($var);
    print $var."\n";

    my $hash = {
        america => '<Country>&lt;![CDATA[US]]&gt;</Country>',
        europe  => ['<Country>&lt;![CDATA[IT]]&gt;</Country>','<Country>&lt;![CDATA[UK]]&gt;</Country>'],
    };
    print Dumper $hash;
    unwrap_cdata($hash);
    print Dumper $hash;

    If you ever really want to make copies, pass the copies to unwrap_cdata instead, and make sure you make deep copies (like with Storable::dclone).
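To make the deep-copy point concrete, here is a minimal sketch. It assumes an in-place unwrapper; `strip_cdata_in_place` is a hypothetical stand-in written only for this example (it dispatches on `ref` and uses `/g`, which the original code does not), and the sample data is made up.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical in-place unwrapper, for illustration only: walks arrays
# and hashes and edits every plain string where it sits.
sub strip_cdata_in_place {
    for my $node (@_) {
        if (ref($node) eq 'ARRAY') {
            strip_cdata_in_place(@$node);
        }
        elsif (ref($node) eq 'HASH') {
            strip_cdata_in_place($_) for values %$node;
        }
        elsif (!ref($node)) {
            $node =~ s/(?:<|&lt;)!\[CDATA\[//g;
            $node =~ s/\]\](?:>|&gt;)//g;
        }
    }
    return;
}

my $original = {
    america => '<Country><![CDATA[US]]></Country>',
    europe  => [ '<Country>&lt;![CDATA[IT]]&gt;</Country>' ],
};

# dclone duplicates the nested arrays and hashes too, so the copy shares
# nothing with the original.
my $copy = dclone($original);
strip_cdata_in_place($copy);

print "$copy->{america}\n";       # cleaned copy
print "$original->{america}\n";   # original keeps its CDATA wrapper
```

A shallow copy such as `my %copy = %$original` would not be enough here: the `europe` array reference would still point at the original array.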

    And now for my trick:

    sub unwrap_cdata {
        for (@_) {
            eval { unwrap_cdata(@$_); 1 } and next;
            eval { unwrap_cdata(values %$_); 1 } and next;
            s/<!\[CDATA\[//;
            s/&lt;!\[CDATA\[//;
            s/]]>//;
            s/]]&gt;//;
        }
    }

    Recursion is fun. Block eval is great fun. :-)
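For anyone puzzled by the block evals above: under `use strict`, dereferencing something that is not a reference of the right kind dies, and `eval { ...; 1 }` turns that into a false value instead of a crash. A small sketch of just that mechanism follows; `kind_of` is a hypothetical helper invented for this illustration and is not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how a value can be dereferenced by simply
# trying it. A failed dereference dies inside the eval, so the eval
# yields false and the next probe runs.
sub kind_of {
    my ($value) = @_;
    return 'array' if eval { my @probe = @$value; 1 };
    return 'hash'  if eval { my %probe = %$value; 1 };
    return 'plain';
}

print kind_of([ 1, 2 ]), "\n";
print kind_of({ a => 1 }), "\n";
print kind_of('text'), "\n";
```

Without `use strict 'refs'` the string case would be treated as a symbolic reference instead of dying, so the trick only works with strictures on.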

    print "Just another Perl ${\(trickster and hacker)},"
    The Sidhekin proves Sidhe did it!

      Thanks! And what about a wrap_cdata, to wrap the content in <![CDATA ]> using the approach of your trick?

        Thanks! And what about a wrap_cdata, to wrap the content in [I assume] <![CDATA[ ]]> using the approach of your trick?

        Same thing, as long as you know how to do it with a single string. Perhaps something like this:

        sub wrap_cdata {
            for (@_) {
                eval { wrap_cdata(@$_); 1 } and next;
                eval { wrap_cdata(values %$_); 1 } and next;
                s/(?<![^>])([^<]+)/<![CDATA[$1]]>/g;
            }
        }

        Or not. I don't know why your original unwrap did not use the /g modifier. So please, season to taste.
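To show what the /g modifier changes, here is a minimal sketch on a made-up string that contains two CDATA sections. The sample input is an assumption for illustration, not data from the original post.

```perl
use strict;
use warnings;

# Made-up sample with two CDATA openers in one string.
my $text = '<a><![CDATA[one]]></a><b><![CDATA[two]]></b>';

# Without /g only the first match is removed.
(my $first_only = $text) =~ s/<!\[CDATA\[//;

# With /g every match is removed.
(my $all = $text) =~ s/<!\[CDATA\[//g;

print "$first_only\n";
print "$all\n";
```

If each string only ever holds one wrapper, as in the sample data of the question, both forms give the same result.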


Re: Search and substitute into data structures
by merlyn (Sage) on Apr 30, 2007 at 18:52 UTC
    You're missing a close curly brace just before sub unwrap_hash. Or at least, I hope you are, because otherwise, you're trying to create nested subroutines, which definitely don't work in Perl (at least not how mortals expect them to work). You have oddly indented the next subroutines, making me wonder if you really do have those "inside" the other subroutine.

    Have you seen Data::Visitor? I think it would abstract away most of what you're writing and rewriting, and let you concentrate on your transformations.


    update: Yeah, you do have a broken subroutine-inside-a-subroutine. Please pull those out. Better yet, look at Data::Visitor.
      Sorry, I did not paste the last lines of the code. I have fixed it now.
Re: Search and substitute into data structures
by jwkrahn (Abbot) on Apr 30, 2007 at 19:59 UTC
    You could use recursion to simplify that a bit:
    sub unwrap_cdata {
        my @array;
        for ( @_ ) {
            if ( ref ) {
                push @array, ref eq 'ARRAY'  ? [ unwrap_cdata( @$_ ) ]
                           : ref eq 'HASH'   ? { unwrap_cdata( %$_ ) }
                           : ref eq 'SCALAR' ? \unwrap_cdata( $$_ )
                           :                   ();
            }
            else {
                ( my $var = $_ ) =~ s/<!\[CDATA\[//;
                $var =~ s/&lt;!\[CDATA\[//;
                $var =~ s/]]>//;
                $var =~ s/]]&gt;//;
                push @array, $var;
            }
        }
        return wantarray ? @array : $array[ 0 ];
    }
Re: Search and substitute into data structures
by ferreira (Chaplain) on Apr 30, 2007 at 19:04 UTC

    You may try out Data::SearchReplace, a module whose 1.02 version was recently released. From the docs, it looks like it should be quite easy to do what you want using the module API. I would guess the solution you're looking for to be something like:

    use Data::SearchReplace qw(sr);

    sr({ SEARCH => '<![CDATA[', REPLACE => '' }, $hash);
    sr({ SEARCH => '&lt;![CDATA[', REPLACE => '' }, $hash);
    sr({ SEARCH => ']]>', REPLACE => '' }, $hash);
    sr({ SEARCH => ']]&gt;', REPLACE => '' }, $hash);
