Re: difference in regex

You will find the answer to your question in "Regexp Quote-Like Operators" in perlop - basically, different regex operations have different return values in different contexts. See also perlretut for a tutorial.

Operation	Context	`()` Capturing Groups	Return Value on Match (and notes on behavior)	Return Value on Failure	Example
`m//`	scalar	-	true	false	`my $x = "foobar"=~/[aeiou]/; # => $x is true` `my $y = "foobar"=~/[xyz]/; # => $y is false`
`m//g`	scalar	-	true (each execution of `m//g` finds the next match, see "Global matching" in perlretut)	false if there is no further match	`my $str = "foobar";` `my $x = $str=~/[aeiou]/g; # matches first "o" => $x is true, pos($str) is 2` `$x = $str=~/[aeiou]/g; # matches second "o" => $x is true, pos($str) is 3` `$x = $str=~/[aeiou]/g; # matches "a" => $x is true, pos($str) is 5` `$x = $str=~/[aeiou]/g; # no more matches => $x is false, pos($str) is undef`
`m//`	list	no	the list `(1)`	the empty list `()`	`my ($x) = "foobar"=~/[aeiou]/; # => $x is 1`
`m//g`	list	no	a list of all the matched strings, as if there were parentheses around the whole pattern	the empty list `()`	`my ($x,$y,$z) = "foobar"=~/[aeiou]/g; # => $x is "o", $y is "o", $z is "a"`
`m//`	list	yes	a list consisting of the subexpressions matched by the parentheses in the pattern, that is, (`$1`, `$2`, `$3`...)	the empty list `()`	`my ($x,$y) = "foobar"=~/([aeiou])(.)/; # => $x is "o", $y is "o"`
`m//g`	list	yes	a list of the substrings matched by any capturing parentheses in the regular expression, that is, (`$1`, `$2`...) repeated for each match	the empty list `()`	`my ($w,$x,$y,$z) = "foobar"=~/([aeiou])(.)/g; # => $w is "o", $x is "o", $y is "a", $z is "r"`
`s///`	-	-	the number of substitutions made	false	`my $x = "foobar"; my $y = $x=~s/[aeiou]/x/g; # => $y is 3`
`s///r`	-	-	a copy of the original string with substitution(s) applied (available since Perl 5.14)	the original string	`my $x = "foobar"=~s/[aeiou]/x/gr; # => $x is "fxxbxr"`
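
As a small sketch of how the two `m//g` rows above are typically used (the variable names are only for illustration): in scalar context the match usually drives a `while` loop, and in list context the `() =` idiom counts the matches.

```perl
use strict;
use warnings;

my $str = "foobar";

# Scalar context: each iteration finds the next match and advances pos().
my ( @vowels, @ends );
while ( $str =~ /([aeiou])/g ) {
    push @vowels, $1;
    push @ends,   pos($str);
}
print "vowels: @vowels\n";    # vowels: o o a
print "ends:   @ends\n";      # ends:   2 3 5

# List context: assigning to an empty list, then to a scalar, yields the count.
my $count = () = $str =~ /[aeiou]/g;
print "count:  $count\n";     # count:  3
```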

In this table, "true" and "false" refer to Perl's notion of Truth and Falsehood. Remember not to rely on any of the capture variables like $1, $2, etc. unless the match succeeds!
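
To illustrate why the capture variables can't be trusted after a failed match, here is a minimal sketch (the strings and variable names are made up for the example). A failed match leaves `$1` as it was, so it can still hold the value from an earlier successful match:

```perl
use strict;
use warnings;

my $hit   = "abc,5" =~ /,(\d+)$/;    # succeeds and sets $1
my $first = $1;                       # "5"

my $miss  = "abc" =~ /,(\d+)$/;      # fails, $1 is left untouched
my $stale = $1;                       # still "5", left over from the earlier match

# Safer: take the captures from the list-context return value instead.
my ($safe) = "abc" =~ /,(\d+)$/;     # undef, because the match failed

print "first: $first\n";                                  # first: 5
print "stale: $stale\n";                                  # stale: 5
print "safe:  ", ( defined $safe ? $safe : "undef" ), "\n";   # safe:  undef
```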

In `my $foo = "bar"=~/a/;`, the right-hand side of the assignment (`"bar"=~/a/`) is in scalar context. In `my ($foo) = "bar"=~/a/;` or `my @foo = "bar"=~/a/;`, the right-hand side is in list context. That's why, in your example, you need those parens in `($value)`: because you want the matching operation to return the contents of the capture group.
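
A short sketch of the same match in the three assignment forms just described (the sample `$row` is an assumption for illustration):

```perl
use strict;
use warnings;

my $row = "a,b,c,15";

my $scalar   = $row =~ /,([^,]*)$/;    # scalar context: just true/false
my ($list)   = $row =~ /,([^,]*)$/;    # list context: the capture group
my @captures = $row =~ /,([^,]*)$/;    # list context as well

print "scalar: $scalar\n";      # scalar: 1
print "list:   $list\n";        # list:   15
print "array:  @captures\n";    # array:  15
```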

Note that your expressions can be slightly simplified, since not all the parens you showed are needed:

my ($value) = $row =~ /.*,(.*)/;
# and
$row =~ s/,[^,]*$//;

A few additional comments on your code:

`($row =~ s/,[^,]*$//); # gets substring before the last comma` - this comment isn't quite right, or is at least potentially misleading, since the substitution deletes the last comma and everything after it, modifying `$row` in place.
`/.*,(.*)/` is unanchored and can match a comma anywhere in the string. For simple input strings it may behave correctly, but I'd strongly recommend coding more defensively and writing it like your second expression: `my ($value) = $row=~/,([^,]*)$/;`. The `$` anchor makes sure that the regex only matches the last comma and what follows it (unless you use the `/m` modifier, since it changes the meaning of `$`).
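
As a rough illustration of the difference, here is a sketch comparing the two patterns. The second input, with an embedded newline, is a hypothetical case chosen only because `.` does not match a newline by default:

```perl
use strict;
use warnings;

my $simple = "a,b,c,15";
my $multi  = "a,b,c\nd,e";    # hypothetical input with an embedded newline

# On simple input both patterns agree.
my ($loose_simple)    = $simple =~ /.*,(.*)/;
my ($anchored_simple) = $simple =~ /,([^,]*)$/;
print "simple: loose=<$loose_simple> anchored=<$anchored_simple>\n";
# simple: loose=<15> anchored=<15>

# With the newline, the unanchored pattern stops on the first line.
my ($loose_multi)    = $multi =~ /.*,(.*)/;
my ($anchored_multi) = $multi =~ /,([^,]*)$/;
print "multi:  loose=<$loose_multi> anchored=<$anchored_multi>\n";
# multi:  loose=<c> anchored=<e>
```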
While the use of Scalar::Util's looks_like_number is often a good idea, note that if you don't mind being a little more restrictive, Regexp::Common (or a hand-written regex) would allow you to combine the two regular expressions:
```
use Regexp::Common qw/number/;

my $row = "a,b,c,d,15";

if ( $row=~s/,($RE{num}{real})$// ) {
    print "matched <$1>\n";
}
print "row is now <$row>\n";

__END__

matched <15>
row is now <a,b,c,d>
```
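
For the hand-written alternative, a minimal sketch might look like the following. The `$number` pattern is my own assumption and deliberately restrictive (optional sign, digits, optional fractional part); it is not equivalent to `$RE{num}{real}`:

```perl
use strict;
use warnings;

# Hypothetical, deliberately restrictive number pattern:
# optional sign, then digits with an optional fractional part.
my $number = qr/[-+]?\d+(?:\.\d+)?/;

my $row = "a,b,c,d,15";
my $value;

# Remove the trailing ",<number>" and keep the number in one step.
if ( $row =~ s/,($number)$// ) {
    $value = $1;
    print "matched <$value>\n";    # matched <15>
}
print "row is now <$row>\n";       # row is now <a,b,c,d>
```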
If this is a CSV file, consider using Text::CSV (also install Text::CSV_XS for speed).

Update: Added s///r to the table and added a few more doc links. A few other edits and updates. 2019-02-16: Added "Return Value on Failure" column to table, and a few other small updates. 2019-08-17: Updated the link to "Truth and Falsehood".

Re^2: difference in regex by ovedpo15 (Pilgrim) on May 29, 2018 at 14:11 UTC
Thank you for the reply! As I mentioned in one of the posts in this thread, I would like to split it somehow into two scalars. I can use `my ($a,$b) = ($row=~ /(.*),(.*)/);`, but if $row doesn't have commas it won't work. How do I make it always put a string into $path? For example: if "abc", it will be $path = "abc" and $value is undefined; if "abc,5", it will be $path = "abc" and $value = 5; if "a,b,c,5", it will be $path = "a,b,c" and $value = 5.
Re^3: difference in regex by haukex (Archbishop) on May 29, 2018 at 14:30 UTC
Although personally I'd still use a conditional, of course it's possible to do it all in one regex. One way is to make the comma optional by putting a `?` on a group. In this case I'm using a non-capturing `(?:...)` group, and I had to make the first part of the regex non-greedy so that it doesn't swallow an existing comma: `use warnings; use strict; use Test::More; my $regex = qr/ ^ (.*?) (?: , ([^,]*) )? $ /x; ok "abc"=~$regex; is $1, "abc"; is $2, undef; ok "abc,5"=~$regex; is $1, "abc"; is $2, 5; ok "a,b,c,5"=~$regex; is $1, "a,b,c"; is $2, 5; done_testing;`

Update: An alternative that says a little more explicitly: either match a string with no commas in it, or, if there are commas, match the thing after the last one: `/^ (?| ([^,]*) | (.*) , ([^,]*) ) $/x`

Update 2: And it turns out this regex is much faster than the above! (try using it in this benchmark)
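
For comparison, the conditional approach mentioned at the start of this reply could be sketched roughly as follows. `split_row` is a hypothetical helper name, not something from the thread, and the test strings are the ones from the question:

```perl
use strict;
use warnings;

# split_row is a hypothetical helper name, not from the thread.
sub split_row {
    my ($row) = @_;

    # If there is at least one comma, split at the last one.
    if ( $row =~ /^(.*),([^,]*)$/ ) {
        return ( $1, $2 );
    }

    # No comma: everything is the path, and there is no value.
    return ( $row, undef );
}

for my $row ( "abc", "abc,5", "a,b,c,5" ) {
    my ( $path, $value ) = split_row($row);
    printf "%-8s => path=<%s> value=<%s>\n",
        $row, $path, defined $value ? $value : 'undef';
}
```

The capture variables are only read inside the `if`, so they are never used after a failed match.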