http://qs321.pair.com?node_id=939555


in reply to comparing file contents

The problems are numerous. A line-by-line walkthrough is needed to cover them. Since you have persevered, and desire to improve your Perl skills, here is that walkthrough:

Note that you have not preserved *any* data from the ICD file, so no ICD lookups can be done below this point! We will discuss how to fix this later, by copying the needed info to a hash or array.

(OK, before I added the my before @columns, you did technically preserve *some* ICD data, but since it was only the data from the last line in the file, it hardly counts.)
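
To see why only the last line survived, here is a minimal sketch. The sample lines are made up and stand in for the ICD file, and the sub names `last_line_only` and `keep_all` are hypothetical. An array assigned inside the loop is overwritten on every pass, while a hash keyed on the code keeps every line:

```perl
use strict;
use warnings;

# Hypothetical tab-separated sample data standing in for the ICD file.
my @icd_lines = ( "A00\tA00 Cholera", "A01\tA01 Typhoid", "A02\tA02 Salmonella" );

sub last_line_only {
    my @lines = @_;
    my @columns;                          # declared outside the loop
    for my $line (@lines) {
        @columns = split /\t/, $line;     # overwritten on each pass
    }
    return @columns;                      # only the final line's fields remain
}

sub keep_all {
    my @lines = @_;
    my %lookup;
    for my $line (@lines) {
        my ( $code, $text ) = split /\t/, $line;
        $lookup{$code} = $text;           # one entry per line
    }
    return %lookup;
}

my @survivor = last_line_only(@icd_lines);
my %lookup   = keep_all(@icd_lines);
print "Array approach kept: $survivor[0]\n";                        # A02
print "Hash approach kept: ", join( ', ', sort keys %lookup ), "\n"; # A00, A01, A02
```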

The problem with your *approach* is just the failure to preserve the ICD data from the first half of the program so that the second half can use it. This can be fixed by declaring %lookup_icd above the first while() loop and then, within that loop, using:
    $lookup_icd{$icd_code} = $icd_code_and_text;

With a few more additions, it would look like this:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper; $Data::Dumper::Useqq = 1;

    my $icd_path = 'icd-10-codes.txt';
    my $in_path  = 'female1.txt';
    my $out_path = 'female1_corrected.txt';

    open my $icd_fh, '<', $icd_path
        or die "Cannot open '$icd_path': $!";

    #print "Reading '$icd_path'\n";
    my %lookup_icd;
    while ( my $line = <$icd_fh> ) {
        chomp $line;
        my ( $lookup_code, $icd_code_and_text ) = split /\t/, $line;
    #   print Dumper $lookup_code, $icd_code_and_text;

        if ( exists $lookup_icd{$lookup_code} ) {
            warn "Replacing $lookup_code;"
               . " was '$lookup_icd{$lookup_code}',"
               . " now '$icd_code_and_text'\n";
        }

        $lookup_icd{$lookup_code} = $icd_code_and_text;
    }
    close $icd_fh;
    #print Dumper \%lookup_icd;

    #print "Reading '$in_path'\n";
    #print "Writing '$out_path'\n";
    open my $in_fh, '<', $in_path
        or die "Cannot open '$in_path': $!";
    open my $out_fh, '>', $out_path
        or die "Cannot open '$out_path': $!";

    while ( my $line = <$in_fh> ) {
        chomp $line;
        my @cols = split / /, $line;

        for my $possible_icd (@cols) {
            my $replacement_icd = $lookup_icd{$possible_icd};
            if ($replacement_icd) {
                $possible_icd = $replacement_icd;
            }
        }

        print {$out_fh} join( ' ', @cols ), "\n";
    }
    close $in_fh;
    close $out_fh
        or warn "Cannot close '$out_path': $!\nSome data may not have been written.";
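
The inner loop in the program above works because a Perl foreach loop variable is an alias for each array element, so assigning to $possible_icd changes @cols itself. Here is a minimal sketch of just that step. The two-entry lookup table, the sample input line, and the sub name `replace_codes` are all made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical lookup table; real entries would come from the ICD file.
my %lookup_icd = ( A00 => 'A00 Cholera', B01 => 'B01 Varicella' );

sub replace_codes {
    my ( $line, $lookup ) = @_;
    my @cols = split / /, $line;
    for my $col (@cols) {
        # $col aliases the array element, so this assignment edits @cols in place.
        $col = $lookup->{$col} if exists $lookup->{$col};
    }
    return join ' ', @cols;
}

print replace_codes( 'patient7 A00 1999 B01', \%lookup_icd ), "\n";
# prints: patient7 A00 Cholera 1999 B01 Varicella
```

The sketch tests with exists rather than truth, so a replacement value of '0' or '' would still be applied. That is a choice made for this sketch; the program above uses a plain truth test.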

I strongly recommend spending some time with either of the books I mentioned above. You have ventured a little past "baby Perl", and so will now be better served by a tutorial than by blind exploration.