http://qs321.pair.com?node_id=11113236


in reply to strange behavior of regex

WORKSFORME: output of adjusted code on Perl Banjo with perl 5.30 is ...

g29205.t1
g29176.t1
#!/usr/bin/perl
use warnings;
use strict;

while (my $record = <DATA>){
    $record =~ s/\R//g;
    if ($record =~ /^.*transcript_id "([^"]*).*class_code "([^"]*)/){
        my $trans = $1;
        my $class = $2;
        #if($class eq 's' | $class eq 'x' | $class eq 'u'){
        if( 'sux' =~ /$class/ ){
            print "$trans\n";
        }
    }
}
__DATA__
. transcript_id "g29202.t1"; gene_id "g29202"; gene_name "G42051"; xloc "XLOC_053322"; cmp_ref "G42051.1"; class_code "c"; tss_id "TSS54758";
. transcript_id "g29205.t1"; gene_id "g29205"; xloc "XLOC_053323"; class_code "u"; tss_id "TSS54760";
. transcript_id "g29176.t1"; gene_id "g29176"; xloc "XLOC_053324"; class_code "u"; tss_id "TSS54761";
. transcript_id "g29178.t1"; gene_id "g29178"; gene_name "G42030"; xloc "XLOC_053326"; cmp_ref "G42030.1"; class_code "o"; tss_id "TSS54763";

NEVERMIND: I missed the /g flag when matching 'sux' against /$class/: I had typed the test instead of copying it from the OP. I did not think the flag was needed, since the single captured letter matches the string without it. Yes, the OP's problem persists if /g is kept.

After a session with perl -Mre=debug ..., this is how I understand the behaviour: when the /g flag is used in a boolean test ('sux' =~ /$class/g), the position of the last successful match in "sux" is remembered, and the next match attempt starts after that position. So if "u" matched once, the next attempt starts at "x". That attempt fails if the class code on the next line is also "u".
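A minimal sketch of that effect, using the four class codes from the sample data. The variables $g_target and $plain_target are my own stand-ins for the 'sux' literal (named variables are an assumption made so the remembered position is easy to reason about); the behaviour described above for the literal is the same.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Class codes in the order they appear in the sample records.
my @classes = ('c', 'u', 'u', 'o');

# Stand-ins for the 'sux' literal; declared outside the loop so the
# match position set by /g survives from one iteration to the next.
my $g_target     = 'sux';
my $plain_target = 'sux';

my (@with_g, @without_g);

for my $class (@classes) {
    # Scalar-context /g: a successful match leaves pos() after the match,
    # and the next attempt starts there. A failed attempt resets pos().
    push @with_g, ($g_target =~ /$class/g ? 1 : 0);

    # Without /g, every attempt starts at the beginning of the string.
    push @without_g, ($plain_target =~ /$class/ ? 1 : 0);
}

print "with /g:    @with_g\n";     # with /g:    0 1 0 0
print "without /g: @without_g\n";  # without /g: 0 1 1 0
```

The second "u" is missed only with /g: after the first "u" matches, the search resumes at "x", finds nothing, and resets for the following record.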

The correct test would be: $class =~ /[sux]/  # /g is not needed, though it does no harm here.
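As a rough illustration of that test on the same four class codes: is_wanted is a hypothetical helper name, and the \A and \z anchors are my addition (an assumption that class codes are always a single character), not part of the test as written above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true when the class code is exactly s, u, or x.
# The anchors reject longer strings that merely contain one of the letters.
sub is_wanted {
    my ($class) = @_;
    return $class =~ /\A[sux]\z/ ? 1 : 0;
}

# Class codes in the order they appear in the sample records.
my @results = map { is_wanted($_) } ('c', 'u', 'u', 'o');

print "@results\n";   # 0 1 1 0
```

Because the character class is matched against $class rather than the other way round, no match position is carried over between records.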