Regex with variables

by cormanaz (Deacon)
on Jun 29, 2015 at 13:51 UTC

cormanaz has asked for the wisdom of the Perl Monks concerning the following question:

Howdy bros. This is bound to wind up being a stupid question, but why doesn't the following regex work? $test is a substring of $targets, so...
#!/usr/bin/perl -w
use strict;

my $targets = '1689 9679 6978 2792 2514 5472 1520 9342 5544 1268 0165 1979 7314 2101 7221 9539 3882 1812';
my $test = "2101";

if ($test =~ /$targets/) {
    print "OK";
}
else {
    print "Not there";
}

Replies are listed 'Best First'.
Re: Regex with variables
by pme (Monsignor) on Jun 29, 2015 at 13:55 UTC
    Just swap $test and $targets in the 'if' statement.
Re: Regex with variables
by stevieb (Canon) on Jun 29, 2015 at 13:58 UTC

    In perlop Binding Operators it states "The right argument is a search pattern, substitution, or transliteration. The left argument is what is supposed to be searched, substituted, or transliterated...".

    So, you're very close, but you need to put the search pattern, $test, on the right side of the =~, and the string to be searched, $targets, on the left: $targets =~ /$test/.

    -stevieb
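
    A minimal sketch of the corrected test, with the string being searched on the left of =~ and the pattern on the right. The helper name contains_target is hypothetical and used only for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if $test occurs anywhere in $targets.
sub contains_target {
    my ($targets, $test) = @_;
    # Left of =~ is the string to search; right is the pattern.
    return $targets =~ /$test/ ? 1 : 0;
}

my $targets = '1689 9679 6978 2792 2514 5472 1520 9342 5544 1268 0165 1979 7314 2101 7221 9539 3882 1812';
my $test    = '2101';

print contains_target($targets, $test) ? "OK\n" : "Not there\n";
```

    With the operands the other way round, Perl tries to find the whole of $targets inside the four-character $test, which can never succeed.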

Re: Regex with variables
by 1nickt (Canon) on Jun 29, 2015 at 14:04 UTC

    Yep, switch the variables in your match test.

    But if your data are really like that then you should have them in a hash and check them there rather than stringifying and testing with a regexp.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    my %hash;
    for (qw/1689 9679 6978 2792 2514 5472 1520 9342 5544 1268 0165 1979 7314 2101 7221 9539 3882 1812/) {
        $hash{$_}++;
    }

    my $tested = '2101';

    if ( exists $hash{$tested} ) {
        print "OK";
    }
    else {
        print "Not there";
    }

      If there are numerous lookups, I agree, but if it's just a couple of quick lookups, that's not necessarily true. To extract the string into a hash seems to take significantly longer than just a straight regex lookup:

      #!/usr/bin/perl
      use warnings;
      use strict;

      use Benchmark qw(:all);

      cmpthese ( 5000000, {
          'string' => 'string()',
          'hash'   => 'hash()',
      });

      sub string {
          my $str = "1689 9679 2792 2514 5472 1520 9342 5544 1268 0165 1979 7314 2101 7221 9539 3882 1812";
          my $test = 2101;
          if ($str =~ $test){
              # do stuff
          }
      }

      sub hash {
          my $str = "1689 9679 2792 2514 5472 1520 9342 5544 1268 0165 1979 7314 2101 7221 9539 3882 1812";
          my $test = 2101;
          my %hash;
          for (split /\s+/, $str){
              $hash{$_}++;
          }
          if (exists $hash{$test}){
              # do stuff
          }
      }

      __END__
                  Rate   hash string
      hash    267237/s     --   -92%
      string 3521127/s  1218%     --

        Hi Stevieb,

        I wasn't considering performance but rather the ease of maintenance of a list of items. The OP has $targets (plural) as the name of his string, so I assumed that at some point he had a list, and then put it into a string.

        I agree that if he originally had a string like that (e.g. from a log file) then a regexp would be the way to find a substring of it.

        But if he has a list then putting it into a string and looking for a match is headed in the wrong direction, IMO.

        Update: I wanted to know, so now I do. In this case join()ing the list into a string and then using a regexp is still much faster than populating a hash:

        use Benchmark qw(:all);

        cmpthese ( 5000000, {
            'string'      => 'string()',
            'hash'        => 'hash()',
            'list2string' => 'list2string()',
        });

        # other subs as before

        sub list2string {
            my $str = join( ' ', qw/1689 9679 2792 2514 5472 1520 9342 5544 1268 0165 1979 7314 2101 7221 9539 3882 1812/ );
            my $test = 2101;
            if ($str =~ $test){
                # do stuff
            }
        }

        __END__
                         Rate        hash list2string      string
        hash          62406/s          --        -88%        -95%
        list2string  530223/s        750%          --        -61%
        string      1355014/s       2071%        156%          --

        But I still wouldn't do it myself as combining discrete values into a string just feels wrong.
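
        The benchmarks above rebuild the hash on every call, so they time hash construction plus a single lookup. For the many-lookups case, a minimal sketch that builds the hash once and reuses it might look like this. The name make_lookup is hypothetical, and no timings are claimed for it:

```perl
use strict;
use warnings;

# Hypothetical helper: build the hash once and return a closure
# that performs lookups against it.
sub make_lookup {
    my ($str) = @_;
    # split ' ' splits on runs of whitespace and skips leading whitespace.
    my %seen = map { ($_ => 1) } split ' ', $str;
    return sub { exists $seen{ $_[0] } ? 1 : 0 };
}

my $is_target = make_lookup('1689 9679 2792 2514 5472 1520 9342 5544 1268 0165 1979 7314 2101 7221 9539 3882 1812');

for my $test (qw(2101 9999)) {
    print "$test: ", ( $is_target->($test) ? "OK" : "Not there" ), "\n";
}
```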

      Doh! I told you it was a stupid question. Thanks for the replies.
Re: Regex with variables
by Monk::Thomas (Friar) on Jun 29, 2015 at 21:36 UTC

    Some additional thoughts:

    Where does the data format come from? Is it guaranteed you will always be matching a 4-digit-number against other 4-digit-numbers?

    I guess the safe bet would be to go for the hash-based solution that was already recommended. That way you can also match different sized values and do not need to fear accidentally matching 210 against 2102. (Bugfix: Modify regexp to include whitespace. Bugfix to the bugfix: properly handle first and last value in $targets...)

    But it's also possible the regexp-solution is actually better suited to your need. Consider your options and pick wisely. ;)
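
    If the regexp route is kept, one way to cover both bugfixes mentioned above is to require that the match is a complete whitespace-separated value. This is a minimal sketch, assuming the values in $targets are separated by whitespace; has_token is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: true only if $test is a complete
# whitespace-separated value in $targets.
sub has_token {
    my ($targets, $test) = @_;
    # (?<!\S) and (?!\S) assert that no non-space character sits
    # directly before or after the match, which also covers the
    # first and last values in the string.
    return $targets =~ /(?<!\S)\Q$test\E(?!\S)/ ? 1 : 0;
}

my $targets = '1689 9679 2102 1812';
print has_token($targets, '210')  ? "210: OK\n"  : "210: Not there\n";
print has_token($targets, '1812') ? "1812: OK\n" : "1812: Not there\n";
```

    Here 210 is reported as "Not there" even though it is a substring of 2102, while the last value, 1812, still matches.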

Re: Regex with variables
by marinersk (Priest) on Jun 30, 2015 at 00:46 UTC

    All the above are good; just a note for safety's sake: if the variable you use as the regular expression ever risks containing metacharacters, you may need to use quotemeta to escape them so that the string is matched literally:

    Results:

      ... use quotemeta ...

      ... or \Q$string\E, its interpolative equivalent:

      c:\@Work\Perl\monks>perl -wMstrict -le
      "my @inputData = qw(test.dat test.exe testadat.exe testaexe.dat);
       ;;
       my @search_strings = qw(test.dat test.exe);
       ;;
       print '----- With quotemeta ------------------';
       foreach my $search (@search_strings) {
           print qq{Using search string '\Q$search\E':};
           foreach my $inputLine (@inputData) {
               if ($inputLine =~ m{ \Q$search\E }xms) {
                   print qq{ Matched: '$inputLine'};
               }
           }
       }
      "
      ----- With quotemeta ------------------
      Using search string 'test\.dat':
       Matched: 'test.dat'
      Using search string 'test\.exe':
       Matched: 'test.exe'
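
      For contrast, a minimal sketch of the same data matched with and without escaping. The helper name matches and its $literal flag are hypothetical, chosen only for this illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the elements of @$inputs that contain
# $search, treating $search either as a pattern or as a literal string.
sub matches {
    my ($search, $inputs, $literal) = @_;
    my $re = $literal ? qr/\Q$search\E/ : qr/$search/;
    return grep { $_ =~ $re } @$inputs;
}

my @input_data = qw(test.dat test.exe testadat.exe testaexe.dat);

# Unescaped, '.' matches any character, so 'testadat.exe' also matches.
print join(', ', matches('test.dat', \@input_data, 0)), "\n";

# Escaped with \Q...\E, only the literal 'test.dat' matches.
print join(', ', matches('test.dat', \@input_data, 1)), "\n";
```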


      Give a man a fish:  <%-(-(-(-<
