
Pattern Match n00b

by USP45 (Novice)
on Mar 14, 2008 at 20:30 UTC ( [id://674287]=perlquestion )

USP45 has asked for the wisdom of the Perl Monks concerning the following question:

This seems so simple but I must be missing something:
my $pet_list = 'dog:boston.terrier;cat:orange.tabby';
To extract boston.terrier out why wouldn't this work?
@{$pets} = $1 if $pet_list =~ /$some_variable\:(.+);??/;
Doesn't the ;?? mean match semi-colon 0 or 1 times and be non-greedy?

Replies are listed 'Best First'.
Re: Pattern Match n00b
by pc88mxer (Vicar) on Mar 14, 2008 at 20:43 UTC
    Doesn't the ;?? mean match semi-colon 0 or 1 times and be non-greedy?

    Yes, it does, but that's not what you want. You need the .+ to be non-greedy or else disallow semi-colons in your values:

    @{$pets} = $1 if $pet_list =~ /$some_variable\:(.*?)(;|\z)/;
    # or not allow ';' in values:
    @{$pets} = $1 if $pet_list =~ /$some_variable\:([^;]*)(;|\z)/;
    To make the matching more robust, you'll want to do something like this:
    @{$pets} = $1 if $pet_list =~ /(?:\A|;)\Q$some_variable\E\:([^;]*)(;|\z)/;
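A minimal sketch of the anchored pattern above, wrapped in a hypothetical helper named pet_for (the name is an assumption, not part of the original post). It illustrates what the leading (?:\A|;) buys you: a key that is merely the tail of another key, such as 'at' inside 'cat', no longer matches.

```perl
use strict;
use warnings;

# pet_for is a hypothetical helper name used only for this illustration.
sub pet_for {
    my ($list, $key) = @_;
    # (?:\A|;)  anchors the key at the start of a record
    # \Q...\E   quotes any regex metacharacters in the key
    # [^;]*     captures up to, but not including, the next ';'
    return $list =~ /(?:\A|;)\Q$key\E:([^;]*)(?:;|\z)/ ? $1 : undef;
}

my $pet_list = 'dog:boston.terrier;cat:orange.tabby';

print pet_for($pet_list, 'dog'), "\n";    # boston.terrier
print pet_for($pet_list, 'cat'), "\n";    # orange.tabby

my $partial = pet_for($pet_list, 'at');   # 'at' is only the tail of 'cat'
print defined($partial) ? "match\n" : "no match\n";    # no match
```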
Re: Pattern Match n00b
by BrowserUk (Patriarch) on Mar 14, 2008 at 21:02 UTC
    Doesn't the ;?? mean match semi-colon 0 or 1 times and be non-greedy?

    It does, but it only makes that part of the match non-greedy. The (.+) remains greedy.

    But if all you do is make that part non-greedy, you are asking to capture as little as possible following the ':', optionally followed by a ';'. That means it will capture just a single character.

    A couple of ways to approach the problem:

    1. Make the capture non-greedy and give it a definite terminating condition: /$some_variable\:(.+?)(?:;|$)/

      Where (?:;|$) requires a semicolon or the end-of-string (preferring the former) to terminate the capture.

    2. Or, as you know the bit you want to capture cannot contain a ';', reduce the scope of the capture using [^;]+ in place of '.+' and omit the terminator.
      /$some_variable\:([^;]+)/

    Either works:

    $pet_list = 'dog:boston.terrier;cat:orange.tabby';;

    $pets=[]; $pet_list =~ /$_\:(.+?)(?:;|$)/ and push @{$pets},$1 for qw[cat dog]; print for @{$pets};;
    orange.tabby
    boston.terrier

    $pets=[]; $pet_list =~ /$_\:([^;]+)/ and push @{$pets},$1 for qw[cat dog]; print for @{$pets};;
    orange.tabby
    boston.terrier

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Pattern Match n00b
by skirnir (Monk) on Mar 14, 2008 at 20:59 UTC
    No, to make the .+ part non-greedy you'd have to place the '?' directly after the "+":

    @{$pets} = $1 if $pet_list =~ /$some_variable\:(.+?);?/;

    But as the ';' is optional, the non-greedy match would stop matching after the first non-newline character. In your example you would get "b" in $1. It's best not to use "." if you don't really mean "any non-newline character". In your case I think [^;] would be the right thing to use, given that I understood properly what you want to do.
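The three behaviours discussed in this thread can be compared side by side. This is only a sketch: $key stands in for the poster's $some_variable and is assumed to hold 'dog'.

```perl
use strict;
use warnings;

my $pet_list = 'dog:boston.terrier;cat:orange.tabby';
my $key      = 'dog';    # assumed value of $some_variable

# Greedy .+ runs to the end of the string; the lazy ';??' then matches nothing.
my ($greedy) = $pet_list =~ /$key:(.+);??/;

# Non-greedy .+? followed by an optional ';' is satisfied by one character.
my ($lazy) = $pet_list =~ /$key:(.+?);?/;

# A negated character class stops at the first ';' with no terminator needed.
my ($class) = $pet_list =~ /$key:([^;]+)/;

print "$greedy\n";    # boston.terrier;cat:orange.tabby
print "$lazy\n";      # b
print "$class\n";     # boston.terrier
```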
Re: Pattern Match n00b
by jeroenes (Priest) on Mar 14, 2008 at 21:27 UTC
    You could go splitting instead. Nested splits on ; and : could work. Or you could do it all in one statement with SuperSplit:

    $array=supersplit(':',';',$mystring);

    provided that the (semi)colons are true delimiters.
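For comparison, the nested-split idea can be sketched with Perl's built-in split alone. This is not SuperSplit's interface, just a plain-Perl illustration, and it assumes every record has the form key:value with no ';' inside a value.

```perl
use strict;
use warnings;

my $pet_list = 'dog:boston.terrier;cat:orange.tabby';

# The outer split on ';' yields "key:value" records; the inner split
# on ':' (limit 2) separates each key from its value.
my %pets = map {
    my ($k, $v) = split /:/, $_, 2;
    ($k => $v);
} split /;/, $pet_list;

print "$_ => $pets{$_}\n" for sort keys %pets;
# cat => orange.tabby
# dog => boston.terrier
```

The limit of 2 on the inner split keeps any further ':' characters inside the value rather than discarding them.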

    Cheers,
    Jeroen

    Couldn't resist the temptation after all those years ;-)
