http://qs321.pair.com?node_id=11118554


in reply to Case insensitive string comparison

Your code seems ok to me:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my @strings = (
       'SMS,SMS1,20190811', 'SMS,SMSh,20190811', 'SMS,SMSH,20190811',
       'SMS,SMSx,20190811', 'SMS,SMSi,20190811', 'SMS,SMSX,20190811',
       'SMS,SMSI,20190811',
       );
     ;;
     for my $s (@strings) {
       my $ref_s = \$s;
       print qq{'$$ref_s' matches}
         if $$ref_s =~ /SMSi/i ||
            $$ref_s =~ /SMSI/i ||
            $$ref_s =~ /SMSh/i ||
            $$ref_s =~ /SMSH/i ||
            $$ref_s =~ /SMS1/
            ;
       }
    "
    'SMS,SMS1,20190811' matches
    'SMS,SMSh,20190811' matches
    'SMS,SMSH,20190811' matches
    'SMS,SMSi,20190811' matches
    'SMS,SMSI,20190811' matches

What am I doing differently from what you're doing?

Update 1: Here's a variation showing assignment via reference (and aliasing). Again, I think it works the way I think you think it should work. (Maybe take a look at Short, Self-Contained, Correct Example.)

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
    "my @strings = (
       'SMS,SMS1,20190811', 'SMS,SMSh,20190811', 'SMS,SMSH,20190811',
       'SMS,SMSx,20190811', 'SMS,SMSi,20190811', 'SMS,SMSX,20190811',
       'SMS,SMSI,20190811',
       );
     ;;
     for my $s (@strings) {
       my $ref_s = \$s;
       $$ref_s = 'SMSblk'
         if $$ref_s =~ /SMSi/i ||
            $$ref_s =~ /SMSI/i ||
            $$ref_s =~ /SMSh/i ||
            $$ref_s =~ /SMSH/i ||
            $$ref_s =~ /SMS1/
            ;
       }
     ;;
     dd \@strings;
    "
    [
      "SMSblk",
      "SMSblk",
      "SMSblk",
      "SMS,SMSx,20190811",
      "SMSblk",
      "SMS,SMSX,20190811",
      "SMSblk",
    ]

Update 2: BTW: I'd tend to write something like this a bit differently:

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
    "my @strings = (
       'SMS,SMS1,20190811', 'SMS,SMSh,20190811', 'SMS,SMSH,20190811',
       'SMS,SMSx,20190811', 'SMS,SMSi,20190811', 'SMS,SMSX,20190811',
       'SMS,SMSI,20190811',
       );
     ;;
     for my $s (@strings) {
       my $ref_s = \$s;
       $$ref_s .= ' is SMSblk' if $$ref_s =~ /SMS[iIhH1]/;
       }
     ;;
     dd \@strings;
    "
    [
      "SMS,SMS1,20190811 is SMSblk",
      "SMS,SMSh,20190811 is SMSblk",
      "SMS,SMSH,20190811 is SMSblk",
      "SMS,SMSx,20190811",
      "SMS,SMSi,20190811 is SMSblk",
      "SMS,SMSX,20190811",
      "SMS,SMSI,20190811 is SMSblk",
    ]

And maybe also throw in some kind of boundary assertion like \b so the final regex might look like
    / \b SMS[iIhH1] \b /x
to prevent a string like 'SMS,xSMSHx,20190811' from matching.
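
To make the effect of the boundary assertion concrete, here is a minimal sketch that runs both patterns against the same two strings. The variable names $loose and $strict are invented for illustration; only the patterns and the 'SMS,xSMSHx,20190811' string come from the discussion above.

```perl
use strict;
use warnings;

# Pattern without and with word-boundary assertions.
my $loose  = qr/SMS[iIhH1]/;
my $strict = qr/ \b SMS[iIhH1] \b /x;

for my $s ('SMS,SMSH,20190811', 'SMS,xSMSHx,20190811') {
    # Report whether each pattern matches the current string.
    printf "%-22s loose: %-3s strict: %s\n",
        "'$s'",
        ($s =~ $loose  ? 'yes' : 'no'),
        ($s =~ $strict ? 'yes' : 'no');
}
```

The first string should match both patterns. The second should match only the loose one, because the x characters on either side of SMSH are word characters and so no \b boundary exists there.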


Give a man a fish:  <%-{-{-{-<

Re^2: Case insensitive string comparison (updated x2)
by Marshall (Canon) on Jun 28, 2020 at 01:40 UTC
    I was also confused about what this $$blk_ref was about. And I just punted that issue in my direct post. It would be helpful if the OP showed more of his application. Dereferencing a ref to a single scalar is a relatively rare thing in Perl. That is because Perl array iterator operations are very good at hiding this nastiness.

    For fun, I used your example data and coded the loop a couple of different ways. Neither uses an explicit dereferencing operation.

        use strict;
        use warnings;
        use Data::Dump qw(dd);

        my @strings = (
           'SMS,SMS1,20190811', 'SMS,SMSh,20190811', 'SMS,SMSH,20190811',
           'SMS,SMSx,20190811', 'SMS,SMSi,20190811', 'SMS,SMSX,20190811',
           'SMS,SMSI,20190811',
           );

        # map{} is a logical thought for an array transformation
        #
        # Could assign back to @strings or can make
        # a new array, @strings2
        # Could use an "if" and concatenate a message if true
        # and return $_ in any event. Ternary operator here
        # gives a place to put a single token that is
        # not "SMSblk"

        my @strings2 = map{/SMS[1HI]/i ? "$_ is SMSblk":$_}@strings;
        dd \@strings2;

        =prints:
        [
          "SMS,SMS1,20190811 is SMSblk",
          "SMS,SMSh,20190811 is SMSblk",
          "SMS,SMSH,20190811 is SMSblk",
          "SMS,SMSx,20190811",
          "SMS,SMSi,20190811 is SMSblk",
          "SMS,SMSX,20190811",
          "SMS,SMSI,20190811 is SMSblk",
        ]
        =cut

        # Or with a for loop instead of map{}
        # to modify original array:
        # Foreach creates an alias and modifying that
        # alias modifies the original array
        # No tricky dereferencing is needed.

        foreach (@strings)
        {
           $_ .= " is SMSblk" if /SMS[1HI]/i;
        }
        dd \@strings;

        =prints:
        [
          "SMS,SMS1,20190811 is SMSblk",
          "SMS,SMSh,20190811 is SMSblk",
          "SMS,SMSH,20190811 is SMSblk",
          "SMS,SMSx,20190811",
          "SMS,SMSi,20190811 is SMSblk",
          "SMS,SMSX,20190811",
          "SMS,SMSI,20190811 is SMSblk",
        ]
        =cut

      ... a ref to a single scalar ...

      My guess about a possible rationale for this is that the strings being handled are in reality very long and DAN0207 wants to avoid making a bunch of copies of these long strings (e.g., to pass to subroutines). In a case like this, I'd agree that taking a reference to a scalar (string) and passing it around could be quite advantageous in the right circumstances. But this is all just guesswork.
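
      A minimal sketch of that guess, assuming a hypothetical sub named mark_blocked() that receives a scalar reference. The sub name and the sample string are invented for illustration and are not taken from the OP's code.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: it receives a reference, so the (possibly very
      # long) string is not copied into the sub's own lexical variable.
      sub mark_blocked {
          my ($str_ref) = @_;
          return 0 unless $$str_ref =~ /\bSMS[iIhH1]\b/;
          $$str_ref = 'SMSblk';    # modifies the caller's string in place
          return 1;
      }

      my $line = 'SMS,SMSh,20190811';
      mark_blocked(\$line);
      print "$line\n";
      ```

      Note that @_ already aliases the caller's arguments, so a copy is only made by an assignment such as my ($s) = @_; passing a reference mainly makes the no-copy intent explicit and lets the reference be stored or handed on.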



        My guess is that the OP has an array of similar lines, not just a ref to a single line. If he wants to send that array info to a sub, then he should pass a reference to that array. The OP provided nothing in terms of workflow and we are left to guess. With the info given, passing a ref to a single line doesn't appear to make any sense. Passing a ref to the entire array of strings would make sense.
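
        Passing the whole array by reference could look roughly like the sketch below. The sub name tag_blocked() and the two sample lines are assumptions made up for illustration; nothing here is known about the OP's actual workflow.

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper: takes a reference to the whole array and
        # edits the elements in place through foreach aliasing.
        sub tag_blocked {
            my ($lines) = @_;
            for my $line (@$lines) {
                $line .= ' is SMSblk' if $line =~ /SMS[1HI]/i;
            }
            return;
        }

        my @strings = ('SMS,SMSh,20190811', 'SMS,SMSx,20190811');
        tag_blocked(\@strings);
        print "$_\n" for @strings;
        ```

        Only the array reference is copied into the sub; the loop variable aliases each element, so no per-line dereference is needed.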

        I also don't really understand why some CSV lines should be transformed into a single token and others left as they are. This transformation is presumably being done for some future processing. Ok, what processing is that? And why is condensing a subset of lines to a particular single token necessary?

        Maybe what is of interest are the lines that don't match the pattern? Maybe the returned array should be limited to the lines that don't match? I have no idea because this is application specific.
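
        If the non-matching lines really are the ones of interest, a grep with a negated match would select them. This is only a sketch under that assumption; the array name @unmatched is invented and the data is a subset of the example strings used earlier in the thread.

        ```perl
        use strict;
        use warnings;

        my @strings = (
            'SMS,SMS1,20190811', 'SMS,SMSh,20190811', 'SMS,SMSx,20190811',
            'SMS,SMSX,20190811', 'SMS,SMSI,20190811',
        );

        # Keep only the lines that do NOT match the pattern.
        my @unmatched = grep { !/SMS[1HI]/i } @strings;
        print "$_\n" for @unmatched;
        ```

        With this data, only the SMSx and SMSX lines should remain in @unmatched, and @strings itself is left untouched.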