Re: Regexes Are Not For String Comparison
by frag (Hermit) on Apr 24, 2001 at 19:56 UTC
I've done this sort of thing lots of times myself (using \Q\E or quotemeta(), at least). Reading your post, I've been trying to figure out why, and I've reached this conclusion: when your most impressive tools are regexes, everything looks like a pattern.
In the big cognitive precedence tables stored in most Perl users'(*) wetware, m// ranks above lc(), so writing this kind of code is understandable. Plus, lc() is usually presented (or gets pigeonholed once learned) as a tool for changing a string, so someone attempting this sort of comparison might think, "I need to compare two things, not change the identity of one." Using lc() with eq might never even come up for consideration.
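For concreteness, here is a minimal sketch of the two forms under discussion. The sub names and sample strings are invented for illustration and are not taken from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper names, for illustration only.

# The regex habit: anchor both ends and quote $bar with \Q...\E
# so that its characters are matched literally.
sub same_by_regex {
    my ($foo, $bar) = @_;
    return $foo =~ /^\Q$bar\E$/i ? 1 : 0;
}

# The plain comparison: fold case on both sides, then use eq.
sub same_by_lc {
    my ($foo, $bar) = @_;
    return lc($foo) eq lc($bar) ? 1 : 0;
}

for my $pair ( [ 'Perl', 'PERL' ], [ 'perl', 'python' ], [ "perl\n", 'perl' ] ) {
    my ($foo, $bar) = @$pair;
    (my $shown = $foo) =~ s/\n/\\n/g;    # make the newline visible in the output
    printf "%-8s vs %-8s regex=%d lc/eq=%d\n",
        $shown, $bar, same_by_regex($foo, $bar), same_by_lc($foo, $bar);
}
```

The two agree on the first two pairs but not on the third: without /m, $ also matches just before a trailing newline, so the regex form accepts "perl\n" while lc() with eq does not. Anchoring with \z instead of $ would make the regex equally strict.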
I'm just meditating on the reasons why we pick certain coding forms. We all know TMTOWTDI, but how do people come to choose their ways?
-- Frag.
(*)IMHO, this isn't just me.
(dws)Re: Regexes Are Not For String Comparison
by dws (Chancellor) on Apr 24, 2001 at 21:00 UTC
<counteropinion>
Unless performance is a concern, go with what's most readable.
For a lone string test, use lc() and eq.
But if you've already used a couple of regexes to test $foo, it may be more readable to throw in another one, rather than switching gears to lc() and a string compare.
Consider adding a test below this fragment:
next if $foo =~ /^#/;
if ( $foo =~ /^(?:this|that|or|something|else)$/ ) { ... }
Now which reads better:
if ( $foo =~ /^$bar$/i ) { ... }
or
if ( lc($foo) eq lc($bar) ) { ... }
I'll argue that it's at best a toss-up.
As a reader of fragments like this, I see the pattern of a variable being tested against a sequence of regexes, and my brain goes into regex scanning mode. Mixing in a string comparison might be technically correct, but it breaks up the reading flow, at least in examples like this.
</counteropinion>
Re: Regexes Are Not For String Comparison
by buckaduck (Chaplain) on Apr 25, 2001 at 02:04 UTC
I will timidly confess to a preference for the regex
solution when comparing to $_, such as:
foreach (@names) {
next if /^buckaduck$/i;
...
}
Because I have a real distaste for writing $_ explicitly
whenever it's not absolutely necessary.
This is partly for readability -- some of the Perl
novices at my work are still afraid of $_ ... (And yet
they don't mind the implicit use of $_. Strange.) And
it's also just plain easier to type than the more proper alternative:
foreach (@names) {
next if ( lc($_) eq lc('buckaduck') );
...
}
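For what it's worth, lc() with no argument operates on $_, so the comparison can be written without spelling $_ out. A minimal sketch, assuming made-up sample names and a hypothetical sub name (neither is from the post above):

```perl
use strict;
use warnings;

# Hypothetical helper: counts the names that are not "buckaduck",
# ignoring case, without ever writing $_ in the comparison.
sub count_others {
    my $count = 0;
    foreach (@_) {
        # lc() with an empty argument list defaults to $_.
        next if lc() eq 'buckaduck';
        $count++;
    }
    return $count;
}

# Sample data, invented for this sketch.
my @names = ( 'Buckaduck', 'japhy', 'BUCKADUCK', 'dws' );
print count_others(@names), " other name(s)\n";
```

Whether this reads better than /^buckaduck$/i is still a matter of taste, but it shows that avoiding an explicit $_ does not by itself require the regex.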
buckaduck
Re: Regexes Are Not For String Comparison
by princepawn (Parson) on Apr 24, 2001 at 23:45 UTC
Uhm, japhy, unless I am mistaken, this node is a rehash of your node Code Smarter, which is one of the perlmonks best nodes.
But the more times people see your wisdom re-distilled and re-stated, the more likely they are to actually allow it to seep into their skulls, which are only 1/10th as thick as mine.
|
Yes, it is a repetition of a main point in that node, but I felt I had to bring it to light again, since there were some nodes about such misuse (in my opinion) of regexes.
japhy --
Perl and Regex Hacker
Re: Regexes Are Not For String Comparison
by DeusVult (Scribe) on Apr 25, 2001 at 23:51 UTC
Regexes are for patterns -- =~ is the "pattern-match binding operator".
I have to mildly disagree. I agree that /^$bar$/ is a dumb regex, but often string literal regexes without either the ^ or the $ are a perfectly reasonable idea.
Now it may be the sort of scripts I've been writing recently, but I almost never find an occasion to use the "eq" operator. And it isn't just that I want case insensitive matches. It's most often whitespace or added junk characters. If I'm trying to match the string "foobar" I don't really care if it is actually "foobar ". Or if I'm looking for the string "no such file or directory" I don't care if I hit "no such file or directory at line 26 of foobar.pl". So I'll use /^no such file or directory/i without a second thought.
So I'm not really sure how strict you were being in your definition of patterns (personally, when I read the word "pattern" I think of masses of '\'ed characters), but I often find regexes a nice fit for matching plain old strings with a bit of leeway. I don't know if I'm interpreting you as being more draconian than you intended, but there is a place for regexes in certain types of string comparison.
But /^$bar$/ really is dumb. Although my personal favorite for stupid regexes is /.*$bar.*/
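A rough sketch of both points, with hypothetical sub names and \Q...\E added on the assumption that $bar is meant literally: a prefix match with leeway, and a check that wrapping a pattern in .* changes nothing about whether it matches.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for this sketch only.

# Prefix test with leeway: case-insensitive, ignores whatever follows.
sub starts_with_ci {
    my ($text, $prefix) = @_;
    return $text =~ /^\Q$prefix\E/i ? 1 : 0;
}

# Substring test three ways. The first is the ".*" form, the second
# drops the redundant ".*", and the third avoids the regex engine.
sub contains_dotstar {
    my ($foo, $bar) = @_;
    return $foo =~ /.*\Q$bar\E.*/ ? 1 : 0;
}

sub contains_plain {
    my ($foo, $bar) = @_;
    return $foo =~ /\Q$bar\E/ ? 1 : 0;
}

sub contains_index {
    my ($foo, $bar) = @_;
    return index($foo, $bar) >= 0 ? 1 : 0;
}

my $err = 'No such file or directory at line 26 of foobar.pl';
print "prefix matched\n" if starts_with_ci($err, 'no such file or directory');

for my $needle ( 'foobar', 'line 27' ) {
    printf "%-8s dotstar=%d plain=%d index=%d\n", $needle,
        contains_dotstar($err, $needle),
        contains_plain($err, $needle),
        contains_index($err, $needle);
}
```

All three substring tests give the same yes/no answer; the .* on either side only adds work for the regex engine, since an unanchored pattern already matches anywhere in the string.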
If you have any trouble sounding condescending, find a Unix user to show you how it's done.
- Scott Adams