List comparison problem

perlmonkster has asked for the wisdom of the Perl Monks concerning the following question:

Hello

I somehow lost a final version of a simple Perl script to compare 2 lists, and cannot seem to figure out what is causing a problem with the version I have.

The simple .pl script takes ListL.txt, compares it to ListH.txt, and then "Flags" any entries from ListH.txt that are on ListL.txt (plus gives two separate Counts at the bottom of the output). Using two short sample lists, the Counts are both correct, but for some reason one of the ListL.txt items that *should* show a ListH.txt "Flag" in the output does not. I've tried switching around Count statements, etc., but am completely baffled. Any insight as to how to fix things would be greatly appreciated.

Here's the short code


use strict;  
use warnings;  
my %H_list;  

open my $H_list, '<', 'listH.txt' or die "Cannot open listH.txt: $!"; 
while (my $line = <$H_list>) { 
    chomp $line; 
    $line =~ s/\r//g; # removes windows CR characters 
    $line =~ s/\s+$//; # removes trailing white spaces 
    $H_list{$line} = 1 
}  
close $H_list;  

my ($L_count, $H_count); 
open my $L_list, '<', 'listL.txt' or die "Cannot open listL.txt: $!"; 
while (<$L_list>) {  
    chomp;  
    s/\r//; 
    s/\s+$//; 
    $L_count ++; 
    print;  
    $H_count ++ and print ' On List H' if exists $H_list{$_};
    print "\n";  
} 

print "List L UNIQUES: $L_count; FLAGGED From List H: $H_count \n";

Here are the two short Test Lists and Test output:

(ListL.txt)
ABC123
DEF456
GHI789

(ListH.txt)
ABC123
GHI789

(Test Output)
ABC123
DEF456
GHI789 On List H
List L UNIQUES: 3; FLAGGED From List H: 2

As you can see, ABC123 should also be "Flagged" as "On List H", and it is driving me NUTS as to why it is not.

Thanks very much.

-perlmonkster


Replies are listed 'Best First'.
Re: List comparison problem by swl (Parson) on Aug 16, 2019 at 02:46 UTC
The problem is in the postfix increment operation: `$H_count ++ and print ' On List H' if exists $H_list{$_};`

Use an if block (or similar) instead, so that the print does not depend on the value returned by incrementing $H_count. Others will be able to explain the reasons in detail.

use strict;
use warnings;
my %H_list;

open my $H_list, '<', 'listH.txt' or die "Cannot open listH.txt: $!";
while (my $line = <$H_list>) {
    chomp $line;
    $line =~ s/\r//g; # removes windows CR characters
    $line =~ s/\s+$//; # removes trailing white spaces
    $H_list{$line} = 1
}
close $H_list;

my ($L_count, $H_count);
open my $L_list, '<', 'listL.txt' or die "Cannot open listL.txt: $!";
while (<$L_list>) {
    chomp;
    s/\r//;
    s/\s+$//;
    $L_count ++;
    print;
    # Count and flag in a block, so the flag never depends on the increment's return value
    if (exists $H_list{$_}) {
        $H_count ++;
        print ' On List H';
    }
    print "\n";
}

print "List L UNIQUES: $L_count; FLAGGED From List H: $H_count \n";

UPDATE: See for example node 776720.
Re: List comparison problem by perlmonkster (Initiate) on Aug 16, 2019 at 03:35 UTC
swl: I was working on this some more with no success, and then just read your solution. Now I can sleep tonight. THANK YOU VERY MUCH !!! -perlmonkster
Re^2: List comparison problem by Laurent_R (Canon) on Aug 16, 2019 at 09:04 UTC
The reason for the problem is that when you run `$H_count ++ and ...` for the first time, the post-increment operator sets `$H_count` to 1 but returns the old value, 0, i.e. a false value. Therefore, the statement following the `and` operator is not executed. The next time you run the same post-increment statement, it returns 1 (and subsequently other true values), so it works fine, as shown in the following test under the Perl debugger:

DB<1> $h++ and print "foo";
DB<2> $h++ and print "foo";
foo

This would work fine with the pre-increment operator:

DB<3> ++$i and print "foo";
foo

The solution suggested by swl is probably better because there is no hidden surprise in it.

As a side comment, please note that in these two code lines:

$line =~ s/\r//g; # removes windows CR characters
$line =~ s/\s+$//; # removes trailing white spaces

the first line isn't useful, because the second code line will remove all trailing white spaces, including the `\r` Windows CR character.
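For anyone who wants to reproduce this outside the debugger, here is a minimal sketch that runs both idioms over the sample lists from the question. The sub names `flag_post_increment` and `flag_if_block` are made up for illustration and are not part of the original script; the counter is also initialised to 0 here, whereas the original leaves it undefined (a post-increment of an undefined value returns 0 as well, so the behaviour is the same).

```perl
use strict;
use warnings;

# Hypothetical helper: mimics the original "$count++ and ..." idiom.
sub flag_post_increment {
    my ($lookup, @items) = @_;
    my $count = 0;
    my @flagged;
    for my $item (@items) {
        next unless exists $lookup->{$item};
        # $count++ returns the OLD value: 0 (false) on the first hit,
        # so the right-hand side of "and" is skipped exactly once.
        $count++ and push @flagged, $item;
    }
    return ($count, @flagged);
}

# Hypothetical helper: the if-block form, where flagging does not
# depend on what the increment returns.
sub flag_if_block {
    my ($lookup, @items) = @_;
    my $count = 0;
    my @flagged;
    for my $item (@items) {
        if (exists $lookup->{$item}) {
            $count++;
            push @flagged, $item;
        }
    }
    return ($count, @flagged);
}

# Sample data taken from the question.
my %h_list = map { $_ => 1 } qw(ABC123 GHI789);
my @l_list = qw(ABC123 DEF456 GHI789);

my ($n_post, @f_post) = flag_post_increment(\%h_list, @l_list);
my ($n_if,   @f_if)   = flag_if_block(\%h_list, @l_list);

print "post-increment: counted $n_post, flagged @f_post\n";
print "if block:       counted $n_if, flagged @f_if\n";
```

Both versions count 2 matches, but only the if-block version flags ABC123, which matches the symptom in the original post: correct counts, one missing flag.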
Re^3: List comparison problem by hippo (Bishop) on Aug 16, 2019 at 09:34 UTC
"the first line isn't useful, because the second code line will remove all trailing white spaces, including the `\r` Windows CR character."

While I take your point, they are not entirely equivalent. The difference is that the first line removes all the `\r` characters wherever they appear in the line. The second does not do that.

use strict;
use warnings;
use Test::More tests => 2;

my $have = "foo\rbar\rbaz\r\n";
my $want = "foobarbaz";

$have =~ s/\s+$//;
isnt $have, $want, 'Not all carriage returns removed';
$have =~ s/\r//g;
is $have, $want, 'All carriage returns removed';

I've spent far too much time over the years fighting poorly-formed, non-compliant, randomly-encoded data originating from Windows to assume anything about the quality of such data. YMMV.
Re^4: List comparison problem by Laurent_R (Canon) on Aug 16, 2019 at 21:14 UTC
Re^4: List comparison problem by perlmonkster (Initiate) on Aug 17, 2019 at 00:52 UTC

