PerlMonks  

Regex Matching Query

by packetstormer (Monk)
on Mar 25, 2013 at 20:58 UTC ( [id://1025383]=perlquestion )

packetstormer has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks

Not sure how I have never come across this before but I can't figure out how to get the rest of a string after it has been matched!? E.g:
    my $file = "TEST SHOW S01E01";
    if(($file =~ m/(.*?)(\d+)/i) {
        my $show = $1;
        print "Show name: $show\n";
    }

Output:

    Show name: TEST SHOW S
This code places my matched string in $1. However, I need to keep going to perform a regex on the remainder of the string. I could split the original string on the $1 value but am I missing a simple trick!?

Replies are listed 'Best First'.
Re: Regex Matching Query
by davido (Cardinal) on Mar 25, 2013 at 21:38 UTC

    If you're using Perl 5.10 or later, add the /p modifier to your regular expression, and then use the special variable ${^POSTMATCH} to see what came after the most recent successful match.

    This approach is usually preferable to using the older $' (or $POSTMATCH) variable, because it doesn't cause a performance penalty to ripple through every single use of regular expressions throughout the rest of the program. See perlvar, and perlre for details.
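
A minimal sketch of the /p approach described above, applied to the string from the question. The helper name show_and_rest is hypothetical and exists only for illustration; the pattern is the one from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first capture and the text after the match.
sub show_and_rest {
    my ($file) = @_;

    # /p asks Perl (5.10 or later) to keep the post-match text for this match.
    if ( $file =~ m/(.*?)(\d+)/p ) {
        return ( $1, ${^POSTMATCH} );
    }
    return;    # empty list when nothing matched
}

my ( $show, $rest ) = show_and_rest("TEST SHOW S01E01");
print "Show name: $show\n";    # Show name: TEST SHOW S
print "Remainder: $rest\n";    # Remainder: E01
```

The remainder is "E01" rather than "S01E01" because the match itself consumes "TEST SHOW S01"; ${^POSTMATCH} holds only what follows the matched text.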


    Dave

Re: Regex Matching Query
by ww (Archbishop) on Mar 25, 2013 at 21:31 UTC
    missing some simple trick?

    Yes.

    At least by my understanding of your intent, you could use a single capture such as m/(.*?\d+)/i. Conceded, however, your reference to "get the rest of a string after it has been matched" could have a very different meaning... as a reference to the post-match capabilities of Perl's Regex Engine. So you should probably read the various regular expression documents available on your machine... perldoc perlretut and friends.

    Update: NOTA BENE, the version of the regex in the para above is so generic... anything followed by one or more digits... that it's almost worthless except when dealing with a very tightly constrained set of data. It would fail for TEST SHOW S01-E01, for example, but would match a string that could do something ugly, like (windows) del /F/Q/s/a *.* 12345 I fooled you. or a nix-ish rm -rf... which implies you'll have to be careful with what you capture.


    If you didn't program your executable by toggling in binary, it wasn't really programming!

Re: Regex Matching Query
by m0skit0 (Initiate) on Mar 25, 2013 at 21:15 UTC

    Your snippet won't compile :)

    Anyway, $' will give you the rest of the string after the match. Is this what you're looking for?

    my $file = "TEST SHOW S01E01";
    while($file =~ m/(.*?)(\d+)/i) {
        my $show = $1;
        print "Show name: $show\n";
        $file = $';
    }

    Output:

    Show name: TEST SHOW S
    Show name: E
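
The loop above walks through the string by reassigning $file to $' after each match. As a possible alternative, a sketch only, the /g modifier in scalar context makes each match resume where the previous one ended, so the string does not need to be modified and $' is not needed. The helper name text_before_numbers is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: collects the text preceding each run of digits.
sub text_before_numbers {
    my ($file) = @_;
    my @chunks;

    # In scalar context, /g continues from the end of the previous match.
    while ( $file =~ m/(.*?)(\d+)/g ) {
        push @chunks, $1;
    }
    return @chunks;
}

print "Show name: $_\n" for text_before_numbers("TEST SHOW S01E01");
# Show name: TEST SHOW S
# Show name: E
```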
Re: Regex Matching Query
by 2teez (Vicar) on Mar 26, 2013 at 07:36 UTC

    Hi packetstormer,
    ".. I can't figure out how to get the rest of a string after it has been matched!.."
    To start with, I suggest you pay close attention to the wisdom given by ww and davido.

    You can also rewrite your script like so to get what I suppose you want:

    use warnings;
    use strict;

    my $file = "TEST SHOW S01E01";
    if ( $file =~ m/(.*?)(\d.+?)$/ ) {    # modified line
        my $show = $1;
        print "Show name: ", $show, $/, "others: ", $2, $/;
    }
    Using your "$2", you can capture the rest of the string for usage.
    Hope this helps.

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
Re: Regex Matching Query
by nvivek (Vicar) on Mar 26, 2013 at 05:20 UTC

    After every successful regular expression match, the pre-match string, the matched string, and the post-match string can be accessed through the $`, $& and $' variables. For more details, check man perlre.
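
A short sketch of those three variables in action, using the string from the question and a plain \d+ pattern chosen here for illustration. The helper name match_parts is hypothetical. Note the performance caveat about these variables raised in an earlier reply.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (pre-match, match, post-match)
# for the first run of digits in the given text.
sub match_parts {
    my ($text) = @_;
    return unless $text =~ /\d+/;
    return ( $`, $&, $' );
}

my ( $pre, $match, $post ) = match_parts("TEST SHOW S01E01");
printf "pre=%s match=%s post=%s\n", $pre, $match, $post;
# pre=TEST SHOW S match=01 post=E01
```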

Re: Regex Matching Query
by hdb (Monsignor) on Mar 26, 2013 at 07:45 UTC

    What about this?

    my $file = "TEST SHOW S01E01";
    if(my ($show, $file) = $file =~ /(.*?)(\d+.*)/) {
        print "Show name: $show\n";
        print "Show rest: $file\n";
    }
    A posting a day keeps Python away.

      Unfortunately, the code in the direct parent (and similar offerings above) does not produce what I understand to be OP's intended result. Execution of the code in Re: Regex Matching Query goes like this:

      C:\> 1025454.pl
      Show name: TEST SHOW S
      Show rest: 01E01
      It's my understanding that all after the final space -- i.e., S01E01 -- is the desired output.

      So why is the regex a few degrees off plumb? As written, the technical greediness of the death star in the first capture is mitigated by the trailing question mark... but the second capture looks for a digit as its starting point, relegating the "S" to the first capture.

      Consider instead the elevated particularity of:

      if(my ($show, $file) = $file =~ /(.*?)\s([A-Z]\d+.*)/) {
          print "Show name: $show\n";
          print "Show rest: $file\n";
      }
      which outputs:
      Show name: TEST SHOW
      Show rest: S01E01

      Regexen are entirely capable of biting you with BOTH an excess and an insufficiency of precision in their crafting.
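
To see how that tighter regex behaves on more than one input, here is a small sketch that wraps it in a hypothetical helper, split_show, and relies on list-context assignment of the captures. The extra test strings other than the original filename are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the regex from the reply above.
sub split_show {
    my ($file) = @_;

    # In list context a successful match returns its captures;
    # a failed match returns an empty list, leaving both variables undef.
    my ( $show, $rest ) = $file =~ /(.*?)\s([A-Z]\d+.*)/;
    return ( $show, $rest );
}

for my $name ( "TEST SHOW S01E01", "TEST SHOW S01-E01", "no episode here" ) {
    my ( $show, $rest ) = split_show($name);
    if ( defined $show ) {
        print "Show name: $show\n";
        print "Show rest: $rest\n";
    }
    else {
        print "No match for: $name\n";
    }
}
```

The first two strings split at the space before "S01"; the third has no uppercase letter followed by a digit, so nothing matches and the else branch runs.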

      If you didn't program your executable by toggling in binary, it wasn't really programming!
