
Re: Grouped characters inside character class.

by Enlil (Parson)
on Jun 02, 2006 at 01:39 UTC ( [id://553198] )


in reply to Grouped characters inside character class.

This works:
use strict;
use warnings;

my $source = 'Posted by mad max beyond eggdome on September 04, 2003';
if ( $source =~ /^Posted by (.*?) on /i ) {
    print qq("$1") . "\n";
}
which matches:
C:\>perl -MYAPE::Regex::Explain -e "print YAPE::Regex::Explain->new(qr/^Posted by (.*?) on /)->explain()"
The regular expression:

(?-imsx:^Posted by (.*?) on )

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  Posted by                'Posted by '
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
   on                      ' on '
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

-enlil

Re^2: Grouped characters inside character class.
by m.att (Pilgrim) on Jun 02, 2006 at 01:49 UTC
    The only issue with this regex (and the poster's original idea as well) is that it will not properly capture the username if it contains ' on '. For example:

    my $source = 'Posted by getting on your nerves on September 04, 2003';

    It's probably a good idea to anchor on more than just the ' on ' part, like:

    my $source = 'Posted by getting on your nerves on September 04, 2003';
    if ($source =~ /Posted by (.+?) on \w+ \d{2}, \d{4}$/) {
        ...
    }

    Regards

    m.att
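
To see the difference m.att describes, here is a minimal sketch that runs both patterns from the thread against the troublesome username. The helper names lazy_name and anchored_name are hypothetical, made up for this illustration; only the two regexes come from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helpers: each wraps one of the regexes quoted in the thread
# and returns the captured username, or undef when there is no match.
sub lazy_name {
    my ($source) = @_;
    # Non-greedy capture: stops at the FIRST ' on '.
    return $source =~ /^Posted by (.*?) on /i ? $1 : undef;
}

sub anchored_name {
    my ($source) = @_;
    # Also requires a "Month DD, YYYY" date at the end of the string,
    # so an ' on ' inside the username cannot end the capture early.
    return $source =~ /Posted by (.+?) on \w+ \d{2}, \d{4}$/ ? $1 : undef;
}

my $source = 'Posted by getting on your nerves on September 04, 2003';
print 'lazy:     "', lazy_name($source),     qq("\n);   # "getting"
print 'anchored: "', anchored_name($source), qq("\n);   # "getting on your nerves"
```

The anchored version assumes the date always has a two-digit day and a four-digit year, exactly as in m.att's pattern; a differently formatted date would make it fail to match.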

      The only issue with this regex (and the poster's original idea as well) is it will not properly capture the username if it contains ' on '

      Noted. So we capture up until the last ' on '.

      use strict;
      use warnings;

      my $source = 'Posted by getting on my nerves on September 04, 2003';
      if ( $source =~ /^Posted by (.*?) on (?!.* on )/i ) {
          print qq("$1") . "\n";
      }

      blokhead is right.. and I will go lick my wounds now.

        That's the same as just being greedy:
        /^Posted by (.*) on /i

        blokhead
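
A quick way to check blokhead's point on the sample strings from this thread is to run both patterns side by side. This is only a sketch: the helper names lookahead_name and greedy_name are hypothetical, and it compares the two regexes on these particular single-line inputs rather than proving they agree on every possible string.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two regexes from the posts above.
sub lookahead_name {
    my ($source) = @_;
    # Lazy capture plus a negative lookahead: accept an ' on ' only if
    # no further ' on ' follows it.
    return $source =~ /^Posted by (.*?) on (?!.* on )/i ? $1 : undef;
}

sub greedy_name {
    my ($source) = @_;
    # Greedy capture: .* grabs as much as it can, then backtracks to
    # the LAST ' on '.
    return $source =~ /^Posted by (.*) on /i ? $1 : undef;
}

for my $source (
    'Posted by mad max beyond eggdome on September 04, 2003',
    'Posted by getting on my nerves on September 04, 2003',
) {
    my $look   = lookahead_name($source);
    my $greedy = greedy_name($source);
    printf qq(%-5s "%s"\n), ( $look eq $greedy ? 'same:' : 'diff:' ), $greedy;
}
```

On both sample strings the two captures come out identical, which is why the shorter greedy form is the simpler choice here.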

Re^2: Grouped characters inside character class.
by the_0ne (Pilgrim) on Jun 02, 2006 at 01:45 UTC
    Thanks enlil for the response. That does work. My only problem is that I've been warned on PerlMonks several times against using .*. I guess in this case it would be fine though, because I do want everything grabbed up until the (space)on(space). Maybe I was just trying to be too fancy. :)
