PerlMonks  

regex parsing brainteaser

by perrin (Chancellor)
on Apr 01, 2008 at 20:26 UTC ( [id://677826]=perlquestion )

perrin has asked for the wisdom of the Perl Monks concerning the following question:

I have a simple solution for this. Just wondering if there's a cool regex trick to do it in one shot.

Given data like this:

<%def .errors>
missing_name: You must provide your name.
missing_email: You must provide your email address.
</%def>
I want to end up with a hash containing the keys and values implied by this format. This was my first try:
my %errors = ( $data =~ m|<\%def \.errors>.*?(?:(^\w+):(.*?)$)+.*?</\%def>|sgm );
But of course that only gets the first one. Is there a way to get them all without first trimming the text?
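To see why the one-shot attempt stops at the first pair, here is a minimal sketch that runs the same pattern against the sample data. The heredoc layout (one entry per line, no leading whitespace) is an assumption about the real input. The quantified group cannot repeat: after `$` matches just before a newline, the next iteration's `^` is not at a line start. In any case, a match without /g returns each capture group only once.

```perl
use strict;
use warnings;

# Sample data; the exact line layout is an assumption.
my $data = <<'END';
<%def .errors>
missing_name: You must provide your name.
missing_email: You must provide your email address.
</%def>
END

# The pattern from the question, unchanged. Without /g, a list-context
# match returns one value per capture group, so at most one pair.
my %errors = ( $data =~ m|<\%def \.errors>.*?(?:(^\w+):(.*?)$)+.*?</\%def>|sgm );

# Show which keys actually made it into the hash.
print join( ', ', sort keys %errors ), "\n";
```

Note that the captured value also keeps the space after the colon, since `(.*?)` starts right after `:`.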

Replies are listed 'Best First'.
Re: regex parsing brainteaser
by Roy Johnson (Monsignor) on Apr 01, 2008 at 20:35 UTC
    my %errors = ( $data =~ m#(?:<\%def \.errors>|\G).*?(?:(^\w+):(.*?)$)(?=.*?</\%def>)#sgm );
    It's not immune to matching before or after the tags, though. That gets uglier:
    my %errors = ( $data =~ m#(?:<\%def \.errors>|(?<=.)\G)(?:(?!</\%def>).)*?(?:(^\w+):([^\n]*))(?=.*?</\%def>)#sgm );

    Caution: Contents may have been coded under pressure.
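The role of \G in the patterns above may be easier to follow when the work is split into steps. What follows is a sketch of the same anchoring idea, not Roy Johnson's code: it uses scalar-context matches with /gc, so each match resumes at pos() where the previous one stopped, and a failed match does not reset that position. The data layout, and the stricter requirement that entries sit directly between the tags, are assumptions.

```perl
use strict;
use warnings;

# Sample data; the exact line layout is an assumption.
my $data = <<'END';
<%def .errors>
missing_name: You must provide your name.
missing_email: You must provide your email address.
</%def>
END

my %errors;

# Step 1: find the opening tag. /g in scalar context records pos().
if ( $data =~ m{<%def \.errors>\n}gc ) {

    # Step 2: \G pins each match to the end of the previous one, so
    # only "key: value" lines directly after the tag are consumed.
    while ( $data =~ m{\G(\w+):[ \t]*(.*)\n}gc ) {
        $errors{$1} = $2;
    }

    # Step 3: /c kept pos() after the failed loop match, so the
    # closing tag can be required at exactly that spot.
    $data =~ m{\G</%def>}gc or die "missing closing tag\n";
}

print "$_ => $errors{$_}\n" for sort keys %errors;
```

This trades the single expression for three short ones, but nothing before the opening tag or after the closing tag can be picked up by accident.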
Re: regex parsing brainteaser
by ikegami (Patriarch) on Apr 01, 2008 at 22:32 UTC
    You really need a parser. Using extended regexp features, you can write a parser in a regexp.
    use strict;
    use warnings;

    my $text = <<'__EOI__';
    <%def .errors>
    missing_name: You must provide your name.
    missing_email: You must provide your email address.
    </%def>
    __EOI__

    # Perl code in regexps close over lexicals when
    # the regexp is compiled. It's best to avoid
    # using lexical variables declared outside the
    # regexp in Perl code in regexps.
    local our %errors;

    $text =~ /
       (?{ +{} })
       <%def\ \.errors> \n
       (?:
          (\w+): \s* (.*) \n
          (?{ +{ %{$^R}, $1 => $2 } })
       )*
       <\/%def> \n
       (?{ %errors = %{$^R} })
    /x or die("Bad text\n");

    for (keys %errors) {
       print("$_ => $errors{$_}\n");
    }

    5.10 has features that make this even easier, but I haven't taken the time to look at them yet.

Re: regex parsing brainteaser
by FunkyMonk (Chancellor) on Apr 01, 2008 at 22:58 UTC
    I'm all for simplicity, so I'd split this into two steps:
    use Data::Dumper;

    my $data = '<%def .errors>
    missing_name: You must provide your name.
    missing_email: You must provide your email address.
    </%def>';

    # Step 1: isolate the text between the tags.
    my ( $errors ) = $data =~ m[\Q<%def .errors>\E\n (.*\n) \Q</%def>\E ]sx;

    # Step 2: pull out every key/value pair from that text.
    my %errors = $errors =~ m{(\w+):\s*(.*?)\n}g;

    print Dumper \%errors;

    Output:

    $VAR1 = {
              'missing_email' => 'You must provide your email address.',
              'missing_name' => 'You must provide your name.'
            };

Re: regex parsing brainteaser
by poolpi (Hermit) on Apr 02, 2008 at 13:32 UTC


    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    $_ = q{<%def .errors>
    missing_name: You must provide your name.
    missing_email: You must provide your email address.
    </%def>};

    my %errors;
    my $start = q{<%def .errors>};
    my $end   = q{</%def>};

    /$start/ .. /$end/
        and %errors = ( $_ =~ / (.+) [:] (.+) [.] /gx );

    print Dumper \%errors;

    Output:

    $VAR1 = {
              'missing_email' => ' You must provide your email address',
              'missing_name' => ' You must provide your name'
            };

    hth,
    PooLpi

    'Ebry haffa hoe hab im tik a bush'. Jamaican proverb

Node Type: perlquestion [id://677826]
Approved by Skeeve