Re: Regex with variables
by pme (Monsignor) on Jun 29, 2015 at 13:55 UTC
Just swap $test and $targets in the 'if' statement.
Re: Regex with variables
by stevieb (Canon) on Jun 29, 2015 at 13:58 UTC
The "Binding Operators" section of perlop states: "The right argument is a search pattern, substitution, or transliteration. The left argument is what is supposed to be searched, substituted, or transliterated...".
So you're very close, but you need to put the search pattern, $test, on the right side of the =~ and the string to be searched, $targets, on the left: $targets =~ /$test/.
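A minimal sketch of the difference, assuming data shaped like the question's. The sample values and the contains() helper are made up for illustration:

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data; the variable names follow the question.
my $targets = '1689 9679 6978 2792 2514';
my $test    = '2792';

# Hypothetical helper: true if $needle occurs anywhere in $haystack.
# The string being searched goes on the left of =~, the pattern on the right.
sub contains {
    my ($haystack, $needle) = @_;
    return $haystack =~ /$needle/ ? 1 : 0;
}

# Correct order: search the list for the single value.
print contains($targets, $test) ? "found\n" : "not found\n";

# Swapped order: searches '2792' for the whole list, which cannot match.
print contains($test, $targets) ? "found\n" : "not found\n";

With these values the first call reports a match and the second does not.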
-stevieb
Re: Regex with variables
by 1nickt (Canon) on Jun 29, 2015 at 14:04 UTC
Yep, switch the variables in your match test.
But if your data are really like that then you should have them in a hash and check them there rather than stringifying and testing with a regexp.
#!/usr/bin/env perl
use strict;
use warnings;

my %hash;
for (qw/1689 9679 6978 2792 2514 5472 1520 9342 5544 1268 0165 1979 7314 2101 7221 9539 3882 1812/) {
    $hash{$_}++;
}

my $tested = '2101';
if ( exists $hash{$tested} ) {
    print "OK";
}
else {
    print "Not there";
}
| [reply] [d/l] |
|
If there are numerous lookups, I agree, but if it's just a couple of quick lookups, that's not necessarily true. Extracting the string into a hash seems to take significantly longer than a straight regex lookup:
#!/usr/bin/perl
use warnings;
use strict;
use Benchmark qw(:all);

cmpthese ( 5000000, {
    'string' => 'string()',
    'hash'   => 'hash()',
});

sub string {
    my $str = "1689 9679 2792 2514 5472 1520 9342 5544 1268 0165 1979 7314 2101 7221 9539 3882 1812";
    my $test = 2101;
    if ($str =~ $test){
        # do stuff
    }
}

sub hash {
    my $str = "1689 9679 2792 2514 5472 1520 9342 5544 1268 0165 1979 7314 2101 7221 9539 3882 1812";
    my $test = 2101;
    my %hash;
    for (split /\s+/, $str){
        $hash{$_}++;
    }
    if (exists $hash{$test}){
        # do stuff
    }
}
__END__
            Rate   hash string
hash    267237/s     --   -92%
string 3521127/s  1218%     --
| [reply] [d/l] |
|
Hi Stevieb,
I wasn't considering performance but rather the ease of maintenance of a list of items. The OP has $targets (plural) as the name of his string, so I assumed that at some point he had a list, and then put it into a string.
I agree that if he originally had a string like that (e.g. from a log file) then a regexp would be the way to find a substring of it.
But if he has a list then putting it into a string and looking for a match is headed in the wrong direction, IMO.
Update: I wanted to know, so now I do. In this case join()ing the list into a string and then using a regexp is still much faster than populating a hash:
use Benchmark qw(:all);

cmpthese ( 5000000, {
    'string'      => 'string()',
    'hash'        => 'hash()',
    'list2string' => 'list2string()',
});

# other subs as before

sub list2string {
    my $str = join( ' ', qw/1689 9679 2792 2514 5472 1520 9342 5544 1268 0165 1979 7314 2101 7221 9539 3882 1812/ );
    my $test = 2101;
    if ($str =~ $test){
        # do stuff
    }
}
__END__
                 Rate   hash list2string string
hash          62406/s     --        -88%   -95%
list2string  530223/s   750%          --   -61%
string      1355014/s  2071%        156%     --
But I still wouldn't do it myself as combining discrete values into a string just feels wrong.
| [reply] [d/l] [select] |
|
|
Doh! I told you it was a stupid question. Thanks for the replies.
Re: Regex with variables
by Monk::Thomas (Friar) on Jun 29, 2015 at 21:36 UTC
Some additional thoughts:
Where does the data format come from? Is it guaranteed that you will always be matching a 4-digit number against other 4-digit numbers?
I guess the safe bet would be to go for the hash-based solution that was already recommended. That way you can also match values of different sizes and need not fear accidentally matching 210 against 2102. (Bugfix: modify the regexp to include whitespace. Bugfix to the bugfix: properly handle the first and last value in $targets...)
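A rough sketch of that pitfall and one possible regexp fix, assuming whitespace-separated values. The sample data and the has_item() helper are hypothetical, and the lookarounds are just one way to handle the first and last value without special cases:

use strict;
use warnings;

# Hypothetical sample data.
my $targets = '1689 2102 9539';

# Hypothetical helper: true only if $value is a complete
# whitespace-separated item of $list.
#   (?<!\S)  not preceded by a non-space character
#   (?!\S)   not followed by a non-space character
# Both also succeed at the start and end of the string.
sub has_item {
    my ($list, $value) = @_;
    return $list =~ /(?<!\S)\Q$value\E(?!\S)/ ? 1 : 0;
}

# The plain substring match finds 210 inside 2102.
print "plain 210:     ", ($targets =~ /210/ ? "match" : "no match"), "\n";

# The anchored match only accepts whole items.
print "anchored 210:  ", (has_item($targets, '210')  ? "match" : "no match"), "\n";
print "anchored 9539: ", (has_item($targets, '9539') ? "match" : "no match"), "\n";

The hash lookup avoids the problem entirely, since exists compares whole keys.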
But it's also possible the regexp-solution is actually better suited to your need. Consider your options and pick wisely. ;)
Re: Regex with variables
by marinersk (Priest) on Jun 30, 2015 at 00:46 UTC
All the above are good; just a note for safety's sake: if the variable you use as the regular expression is ever at risk of containing metacharacters, you may need to use quotemeta to escape them so that the string is matched literally:
Results:
| [reply] [d/l] [select] |
|
c:\@Work\Perl\monks>perl -wMstrict -le
"my @inputData = qw(test.dat test.exe testadat.exe testaexe.dat);
;;
my @search_strings = qw(test.dat test.exe);
;;
print '----- With quotemeta ------------------';
foreach my $search (@search_strings) {
print qq{Using search string '\Q$search\E':};
foreach my $inputLine (@inputData) {
if ($inputLine =~ m{ \Q$search\E }xms) {
print qq{ Matched: '$inputLine'};
}
}
}
"
----- With quotemeta ------------------
Using search string 'test\.dat':
Matched: 'test.dat'
Using search string 'test\.exe':
Matched: 'test.exe'
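For contrast, a small sketch of what happens without the escaping, reusing the file names above. The matches() helper is hypothetical and only there to put the two cases side by side:

use strict;
use warnings;

my @input_data = qw(test.dat test.exe testadat.exe testaexe.dat);

# Hypothetical helper: returns the items that match $search.
# When $literal is true, \Q...\E escapes metacharacters first.
sub matches {
    my ($search, $literal, $items) = @_;
    my $re = $literal ? qr/\Q$search\E/ : qr/$search/;
    return grep { $_ =~ $re } @$items;
}

# Unescaped: the '.' matches any character, so 'testadat.exe' is also found.
print "raw:     @{[ matches('test.dat', 0, \@input_data) ]}\n";

# Escaped: only the literal string 'test.dat' is found.
print "escaped: @{[ matches('test.dat', 1, \@input_data) ]}\n";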
Give a man a fish: <%-(-(-(-<