PerlMonks  

perl regex referencing

by ocs (Monk)
on Sep 18, 2007 at 08:10 UTC ( [id://639573]=perlquestion )

ocs has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I have a small problem concerning backreferences in Perl's regex flavor.
For example, I have a string like this:

$lala = "Hello you therre.";

And I want to match all doubled characters (a-z), i.e. ll and rr in the given string, and contract each pair to a single character. (Please do not take this seriously, it's just an example.)

So I went like this:

$lala =~ s/([a-z])$1/$1/g;

It does not work. So I played around:

$lala =~ s/([a-z])\1/$1/g;

This works. But what I had in mind was this part of perlre: Warning on \1 vs $1

So I don't understand why $1 in the matching part (not the substitution part) does not work in the first version, while \1 does work in the second. I thought \1 was obsolete, just a relic of sed-style referencing.

This is nothing big, but ... did I miss something?

Thanks in advance,

ocs.

tennis players have fuzzy balls.

Replies are listed 'Best First'.
Re: perl regex referencing
by zshzn (Hermit) on Sep 18, 2007 at 08:26 UTC
    perlre explains this as
    The bracketing construct ( ... ) creates capture buffers. To refer to the digit'th buffer use \<digit> within the match. Outside the match use "$" instead of "\".
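
    As a minimal sketch of that rule, using the string from the question: \1 refers back to the first group while the match is still in progress, and $1 reads the same capture once the match has succeeded. The variable name $fixed is only illustrative.

```perl
use strict;
use warnings;

my $lala = "Hello you therre.";

# Inside the pattern, \1 means "whatever group 1 captured in THIS match".
# In the replacement (outside the match), $1 holds that same capture.
(my $fixed = $lala) =~ s/([a-z])\1/$1/g;

print "$fixed\n";   # Helo you there.
```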
Re: perl regex referencing
by Prof Vince (Friar) on Sep 18, 2007 at 08:22 UTC
    It's not a problem to use \1 and friends in the LHS of the substitution, because the LHS doesn't behave like a quoted string: the <backslash-digit> token has a well-defined meaning there. Moreover, $1 may already hold a capture from a previous successful regexp, so it shouldn't be reset before the end of the matching part of the substitution.
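
    A small sketch of what that means in practice, assuming the string from the question: a $1 written in the pattern is interpolated before matching starts, so it carries whatever an earlier match left behind (or nothing at all). The names $unchanged and $surprise are only illustrative.

```perl
use strict;
use warnings;

my $lala = "Hello you therre.";

# Case 1: no earlier match, so $1 is undefined and interpolates as an
# empty string. The pattern is effectively /([a-z])/, and each letter is
# replaced by itself, which leaves the string unchanged.
my $unchanged = $lala;
{
    no warnings 'uninitialized';   # silence the warning about undefined $1
    $unchanged =~ s/([a-z])$1/$1/g;
}

# Case 2: an earlier successful match leaves "o" in $1.
"o" =~ /(o)/ or die "setup match failed";

# Now the pattern is effectively /([a-z])o/, while the $1 in the
# replacement is the fresh capture from each new match.
(my $surprise = $lala) =~ s/([a-z])$1/$1/g;

print "$unchanged\n";   # Hello you therre.
print "$surprise\n";    # Hell yu therre.
```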
Re: perl regex referencing
by bruceb3 (Pilgrim) on Sep 18, 2007 at 08:18 UTC
    The backslash is used inside the match and the dollar sign is used outside of the match. In this case "inside the match" refers to ([a-z])\1, because the text is being matched against this regex. The $1 is not being matched against; it's part of the substitution.
Re: perl regex referencing
by ikegami (Patriarch) on Sep 18, 2007 at 14:38 UTC

    Some alternatives:

    ≥ 5.10

    $lala =~ s/([a-z])\K\1//g;

    See the perlre for 5.10 (or the one for 5.9.5).
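
    A quick way to check that the \K form gives the same result as the \1/$1 form from the question, assuming Perl 5.10 or later. \K keeps everything matched to its left out of the text that gets replaced, so only the second character of each pair is deleted. The names $with_k and $with_ref are only illustrative.

```perl
use strict;
use warnings;
use 5.010;   # \K is available from Perl 5.10 onward

my $lala = "Hello you therre.";

# With \K: the captured letter is kept, only the repeated letter is removed.
(my $with_k = $lala) =~ s/([a-z])\K\1//g;

# Without \K: the whole pair is replaced by the captured letter.
(my $with_ref = $lala) =~ s/([a-z])\1/$1/g;

say $with_k;     # Helo you there.
say $with_ref;   # Helo you there.
```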

    < 5.10

    use Regexp::Keep;
    $lala =~ s/([a-z])\K\1//g;

    See Regexp::Keep.

    Update: Added links to documentation as privately requested.

Re: perl regex referencing
by eff_i_g (Curate) on Sep 18, 2007 at 14:16 UTC
