Using multiple regex's

by leons (Pilgrim)
on Jan 22, 2001 at 20:16 UTC ( [id://53510] : perlquestion )

leons has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

There is something I am not too sure about, and I hope you can give me some advice on it.
Right now I am writing something in which I use statements similar to the following:

my ($value1,$value2)=("something","whatever");
my ($var1,$var2) = ($1,$2) if ($value1=~/something/ && $value2=~/(what)(ever)/);

I have the feeling that it is not entirely safe to use two regexes while relying on the
$1, $2, ... variables of only one of them. In this case $1 and $2 come from the second regex,
but are there similar situations in which this can cause trouble? Or could this be something
that might change in future Perl releases? In other words, I was wondering whether
this is a safe thing to do.

Any help, advice, commentary or whatever is very welcome and highly appreciated
... et cetera .... Thanks ! ;-)

Romani Ite Domum

Replies are listed 'Best First'.
Re: Using multiple regex's
by chipmunk (Parson) on Jan 22, 2001 at 20:35 UTC
    Your use of $1 and $2 should be fine, because it only occurs if both regexes match, and the second one (with the capturing parens) will be matched last.
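
    A small sketch of why the order matters, reusing the test strings from the question. The names $in_order and $reversed are made up for illustration. Perl's $1 and $2 refer to the last successful match, so if the regex without capturing parens runs last, they come back undefined:

```perl
use strict;
use warnings;

my ($value1, $value2) = ("something", "whatever");

# Capturing regex last: $1 and $2 come from /(what)(ever)/.
my $in_order = '';
if ($value1 =~ /something/ && $value2 =~ /(what)(ever)/) {
    $in_order = "$1,$2";
}

# Capturing regex first: the later successful match on /something/
# has no capture groups, so $1 and $2 are undefined afterwards.
my $reversed = 'defined';
if ($value2 =~ /(what)(ever)/ && $value1 =~ /something/) {
    $reversed = 'undefined' unless defined $1;
}

print "in order: $in_order\n";          # in order: what,ever
print "reversed: \$1 is $reversed\n";   # reversed: $1 is undefined
```

    So the construct in the question is fine only as long as the regex with the capturing parens is the last one to match.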

    More problematic is the use of my() with the if modifier. The behavior of my in such a construct is actually a little-known bug/feature of Perl.

    my() is partly executed at compile time and partly executed at runtime. At compile time, some space is allocated for the lexical variable. At runtime, the variable is reset each time execution leaves the scope of the my declaration.

    Using my() with the if modifier means that the space is allocated at compile time as usual, but the variable is reset only if the conditional is true. If the conditional is false, the value of the variable is preserved for the next execution. Here's an example of this odd behavior:

    #!perl -l
    sub blah {
        my $x if $foo;
        print $x++;
    }
    for (0..3) { blah() }
    $foo = 1;
    for (0..3) { blah() }
    __END__
    0
    1
    2
    3
    4
    0
    0
    0

    This behavior, which I believe was originally unintentional, has been left in because people are using it to get static variables, which are private variables that keep their values between executions of a subroutine.
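
    If a static variable is what you are after, one way to get it without relying on the my/if quirk is a closure over a lexical declared in an enclosing block. This is only a sketch; the names $count and next_count are hypothetical:

```perl
use strict;
use warnings;

{
    # $count is visible only inside this block, but the sub keeps a
    # reference to it, so its value survives between calls.
    my $count = 0;
    sub next_count { return $count++ }
}

my @seen = map { next_count() } 1 .. 4;
print "@seen\n";    # 0 1 2 3
```

    Perl 5.10 and later also provide state variables (enabled with use feature 'state') for the same purpose.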

    You can avoid this behavior in your script by separating the my and the assignment:

    my ($var1,$var2);
    ($var1,$var2) = ($1,$2) if ($value1=~/something/ && $value2=~/(what)(ever)/);
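
    Another option, sketched here under the assumption that wrapping the tests in a helper is acceptable, is to skip $1 and $2 entirely: a match in list context returns its captures. The helper name extract_parts is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the two captures, or an empty list
# if either test fails.
sub extract_parts {
    my ($value1, $value2) = @_;
    return unless $value1 =~ /something/;
    return $value2 =~ /(what)(ever)/;    # list context: returns the captures
}

my ($var1, $var2) = extract_parts("something", "whatever");
print "$var1 $var2\n";        # what ever

my @none = extract_parts("nothing", "whatever");
print scalar(@none), "\n";    # 0
```

    Here the declaration is an ordinary unconditional my, so the variables are simply undefined when nothing matches.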
Re: Using multiple regex's
by Fastolfe (Vicar) on Jan 22, 2001 at 20:21 UTC
    Because the && operator short-circuits and evaluates left to right, the first regex is guaranteed to be done first, and the second is done only if the first succeeds. It's safe to assume $1 and $2 will be set, assuming the second regex succeeds. There are other Perl constructs, though (such as lists), where each expression isn't necessarily guaranteed to be evaluated in the obvious order.
Re: Using multiple regex's
by jeroenes (Priest) on Jan 22, 2001 at 20:24 UTC
    I don't see the harm here. $value2's regex is done later than that of $value1, so the latest $1,$2 are from $value2. If $value1's regex doesn't return true, nothing is assigned. Same for $value2's regex. Seems OK to me.

    "We are not alone"(FZ)