deadlift has asked for the wisdom of the Perl Monks concerning the following question:
Hello, I need help with a small problem in my code that has turned into a bigger issue. I am still relatively new to Perl. My data is two text files similar to the two examples below. My script currently opens the files, splits each line on commas, stores the fields in separate arrays (@names, @jobs, ...), and exports certain differences to another text file. What I need is to print each change of job title from file one to file two, together with the name of the person whose title changed.
For every changed title, the output should look like this: Jim => Prev: MD Now: Doctor
I know my code stops matching at the first new name. I tried different things, such as another for loop or adding 1 to $i, and I even looked at changing it to a hash. I can't figure out a way to skip over the new name in the new array while staying at the same position in the old one.
File1
Name, Job, City, State
Jim, MD, Pinole, CA
Tara, Nurse, San Pablo, CA
Julie, MD, San Pablo, CA
Sherry, Nurse, Pinole, CA
George, MD, Pinole, CA
Tim, Nurse, Pinole, CA
Bob, Nurse, Pinole, CA
Uma, MD, San Pablo, CA
Kate, Nurse, Oakland, CA
Pete, MD, San Pablo, CA
File2
Name, Job, City, State
Jim, Doctor, Pinole, CA
Tara, Nurse, San Pablo, CA
Julie, Doctor, San Pablo, CA
Sherry, Nurse, Pinole, CA
Jan, Doctor, San Pablo, CA
George, Doctor, Pinole, CA
Tim, Nurse, Richmond, CA
Bob, Nurse, Pinole, CA
Uma, Doctor, San Pablo, CA
Kate, Nurse, Oakland, CA
Paul, Doctor, Oakland, CA
Ruth, Nurse, Richmond, CA
Joe, Nurse, Oakland, CA
Nick, Nurse, Pinole, CA
Pete, Doctor, San Pablo, CA
$file = "1.txt";
(@namesOld = (), @namesNew = (), @jobOld= (), @jobNew= ());
open (FILE, '<', $file) || die;
while (<FILE>)
{
@hospWorkersOld = split(/,/, $_);
push (@namesOld, @hospWorkersOld[0]);
push (@jobOld, @hospWorkersOld[1]);
}
close FILE;
$file2 = "2.txt";
open (FILE, '<', $file2) || die;
while (<FILE>)
{
@hospWorkersNew = split(/,/, $_);
push (@namesNew, @hospWorkersNew[0]);
push (@jobNew, @hospWorkersNew[1]);
}
close FILE;
@oldJobs=();
@newJobs=();
@newNames=();
for ($i = 0; $i < scalar(@jobNew); $i++ )
{
if ($namesOld[$i] eq $namesNew[$i] && $jobOld[$i] ne $jobNew[$i])
{
#print "$namesNew[$i]--$jobNew[$i]\n";
push (@newJobs, $jobNew[$i]);
push (@oldJobs, $jobOld[$i]);
push (@newNames, $namesNew[$i]);
}
}
Re: Array issues
by Cristoforo (Curate) on Mar 25, 2019 at 21:24 UTC
Hello deadlift
You were on the right road to consider using a hash.
Here is a possible solution using a hash (for the first shorter file).
#!/usr/bin/perl
use strict;
use warnings;
my $File1 = <<EOF;
Name, Job, City, State
Jim, MD, Pinole, CA
Tara, Nurse, San Pablo, CA
Julie, MD, San Pablo, CA
Sherry, Nurse, Pinole, CA
George, MD, Pinole, CA
Tim, Nurse, Pinole, CA
Bob, Nurse, Pinole, CA
Uma, MD, San Pablo, CA
Kate, Nurse, Oakland, CA
Pete, MD, San Pablo, CA
EOF
my $File2 = <<EOF;
Name, Job, City, State
Jim, Doctor, Pinole, CA
Tara, Nurse, San Pablo, CA
Julie, Doctor, San Pablo, CA
Sherry, Nurse, Pinole, CA
Jan, Doctor, San Pablo, CA
George, Doctor, Pinole, CA
Tim, Nurse, Richmond, CA
Bob, Nurse, Pinole, CA
Uma, Doctor, San Pablo, CA
Kate, Nurse, Oakland, CA
Paul, Doctor, Oakland, CA
Ruth, Nurse, Richmond, CA
Joe, Nurse, Oakland, CA
Nick, Nurse, Pinole, CA
Pete, Doctor, San Pablo, CA
EOF
my %occupation;
open my $fh1, '<', \$File1 or die $!;
<$fh1>; # throw away header line
while (<$fh1>) {
my ($name, $title) = split /, /;
$occupation{$name} = $title;
}
close $fh1 or die $!;
open my $fh2, '<', \$File2 or die $!;
<$fh2>; # throw away header line
while (<$fh2>) {
my ($name, $title) = split /, /;
if (exists $occupation{$name} and $occupation{$name} ne $title) {
print $name, " => Prev: $occupation{$name}, Now: $title\n";
}
}
close $fh2 or die $!;
As you can see, I used strict and warnings, which you should always use: they catch many errors and questionable constructs in code.
strict requires declaring variables, typically with my.
It is also good practice to check that opening (and closing) a file succeeded; that is what the or die $! code on the open and close calls does.
Although I didn't use Text::CSV here, it really should be used for more complex CSV files.
Output:
C:\Old_Data\perlp>perl test2.pl
Jim => Prev: MD, Now: Doctor
Julie => Prev: MD, Now: Doctor
George => Prev: MD, Now: Doctor
Uma => Prev: MD, Now: Doctor
Pete => Prev: MD, Now: Doctor
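For files where fields may be quoted or contain commas, the Text::CSV suggestion above could look roughly like the sketch below. It is only an illustration, not part of the script above: it assumes Text::CSV is installed from CPAN, the helper names read_rows and job_changes are made up for this example, and the in-memory data is a small subset of the rows in the question. The allow_whitespace option takes care of the space after each comma.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, assumed to be installed

# Hypothetical helper: parse all data rows from an open filehandle,
# skipping the header line. Returns a list of array references.
sub read_rows {
    my ($fh) = @_;
    my $csv = Text::CSV->new({
        binary           => 1,    # accept any bytes inside fields
        auto_diag        => 1,    # report parse errors automatically
        allow_whitespace => 1,    # strip whitespace around separators
    });
    $csv->getline($fh);           # throw away header line
    my @rows;
    while (my $row = $csv->getline($fh)) {
        push @rows, $row;
    }
    return @rows;
}

# Hypothetical helper: compare job titles by name and return one
# line of text per person whose title differs between the files.
sub job_changes {
    my ($old_fh, $new_fh) = @_;
    my %occupation;
    for my $row (read_rows($old_fh)) {
        $occupation{ $row->[0] } = $row->[1];
    }
    my @changes;
    for my $row (read_rows($new_fh)) {
        my ($name, $title) = @$row;
        next unless exists $occupation{$name};    # new person, nothing to compare
        next if $occupation{$name} eq $title;     # title unchanged
        push @changes, "$name => Prev: $occupation{$name} Now: $title";
    }
    return @changes;
}

# Sample data: a subset of the rows shown in the question.
my $old_data = <<'EOF';
Name, Job, City, State
Jim, MD, Pinole, CA
Tara, Nurse, San Pablo, CA
EOF
my $new_data = <<'EOF';
Name, Job, City, State
Jim, Doctor, Pinole, CA
Tara, Nurse, San Pablo, CA
Jan, Doctor, San Pablo, CA
EOF

open my $old_fh, '<', \$old_data or die $!;
open my $new_fh, '<', \$new_data or die $!;
print "$_\n" for job_changes($old_fh, $new_fh);
```

With real files, the two in-memory filehandles would simply be replaced by filehandles opened on the file names.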
Re: Array issues
by choroba (Cardinal) on Mar 25, 2019 at 21:26 UTC
Using a hash sounds like a good idea. The following prints the changes for all the fields:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my @files = @ARGV;
my %people;
for my $file (@files) {
open my $fh, '<', $file or die $!;
<$fh>; # Skip the header line.
while (<$fh>) {
chomp;
my ($name, $job, $city, $state) = split /,\s*/;
@{ $people{$name}{$file} }{qw{ job city state }}
= ($job, $city, $state);
}
}
for my $person (keys %people) {
my $facts = $people{$person};
my @changes;
for my $fact (qw( job city state )) {
push @changes, $fact
if ($facts->{ $files[0] }{$fact} // "")
ne ($facts->{ $files[1] }{$fact} // "");
}
say "$person =>",
map {; " $_ Prev: ", $facts->{ $files[0] }{$_} // '-',
' Now: ', $facts->{ $files[1] }{$_} // '-'
} @changes if @changes;
}
It's a bit tricky as it's possible to have people in one file that don't exist in the other.
Loading the input could be done via Text::CSV_XS if it's a real CSV, i.e. if the fields can get quoted or contain quoted or escaped commas etc.
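The point about people who appear in only one of the files can be illustrated separately. The following is a minimal sketch, not taken from the script above: the function name membership_changes and the sample hashes are made up for this example, and it assumes each file has already been loaded into a hash keyed by name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given two hash references keyed by name,
# return two array references: the names found only in the second
# hash (added) and the names found only in the first (removed).
sub membership_changes {
    my ($old, $new) = @_;
    my @added   = sort grep { !exists $old->{$_} } keys %$new;
    my @removed = sort grep { !exists $new->{$_} } keys %$old;
    return (\@added, \@removed);
}

# Made-up sample data: Jan exists only in the second set,
# Zed only in the first.
my %file1 = (Jim => 'MD',     Tara => 'Nurse', Zed => 'Nurse');
my %file2 = (Jim => 'Doctor', Tara => 'Nurse', Jan => 'Doctor');

my ($added, $removed) = membership_changes(\%file1, \%file2);
print "Added: @$added\n";
print "Removed: @$removed\n";
```

Checking with exists rather than comparing values keeps "person is missing" distinct from "person has an empty field", which the // "" defaults in the script above deliberately treat as the same thing.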
map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Array issues
by tybalt89 (Monsignor) on Mar 25, 2019 at 23:19 UTC
#!/usr/bin/perl
# https://perlmonks.org/?node_id=1231669
use strict;
use warnings;
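# Slurp both files into a single string (File1 first, then File2).
$_ = do { local $/; local @ARGV = qw(File1 File2); <> . <> };
# For each "name, job" line, look ahead for a later line with the same
# name but a different job, and report the change.
print "$1 => Prev: $2 Now: $3\n"
while /^(\w+), (\w+),.*\n(?=(?:.*\n)*\1, (?!\2,)(\w+).)/gm;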
Re: Array issues
by Marshall (Canon) on Mar 26, 2019 at 05:29 UTC
I am not quite sure what you intend. An example output would have been helpful.
One approach is to construct a hash of FILE1 based upon the NAME.
Read FILE2 and see if any records update JOB for a NAME in FILE1.
I am not sure whether this is the intent.
Changing all occurrences of MD -> Doctor without reading FILE2 would be easier, but then what is the point of FILE2?
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Data::Dump qw /pp/;
$|=1; #turn off buffering to stdout
use Inline::Files;
use constant {JOB =>0, CITY=>1, STATE=>2};
my %db;
<FILE1>; #skip first line of file
while (<FILE1>)
{
next if /^\s*$/; #skip blank lines in file
chomp;
my ($Name, $Job, $City, $State) = split /\s*,\s*/,$_;
$db{$Name}=[$Job, $City, $State];
}
<FILE2>; #skip first line of file
while (<FILE2>)
{
next if /^\s*$/; #skip blank lines in file
chomp;
my ($Name, $Job, $City, $State) = split /\s*,\s*/,$_;
if (exists $db{$Name})
{
my $current_job = $db{$Name}[JOB];
if ($current_job ne $Job)
{
print "$Name\'s job changed $current_job->$Job\n";
$db{$Name}[JOB] = $Job;
}
}
}
pp \%db;
=Prints
Jim's job changed MD->Doctor
Julie's job changed MD->Doctor
George's job changed MD->Doctor
Uma's job changed MD->Doctor
Pete's job changed MD->Doctor
{
Bob => ["Nurse", "Pinole", "CA"],
George => ["Doctor", "Pinole", "CA"],
Jim => ["Doctor", "Pinole", "CA"],
Julie => ["Doctor", "San Pablo", "CA"],
Kate => ["Nurse", "Oakland", "CA"],
Pete => ["Doctor", "San Pablo", "CA"],
Sherry => ["Nurse", "Pinole", "CA"],
Tara => ["Nurse", "San Pablo", "CA"],
Tim => ["Nurse", "Pinole", "CA"],
Uma => ["Doctor", "San Pablo", "CA"],
}
=cut
__FILE1__
Name, Job, City, State
Jim, MD, Pinole, CA
Tara, Nurse, San Pablo, CA
Julie, MD, San Pablo, CA
Sherry, Nurse, Pinole, CA
George, MD, Pinole, CA
Tim, Nurse, Pinole, CA
Bob, Nurse, Pinole, CA
Uma, MD, San Pablo, CA
Kate, Nurse, Oakland, CA
Pete, MD, San Pablo, CA
__FILE2__
Name, Job, City, State
Jim, Doctor, Pinole, CA
Tara, Nurse, San Pablo, CA
Julie, Doctor, San Pablo, CA
Sherry, Nurse, Pinole, CA
Jan, Doctor, San Pablo, CA
George, Doctor, Pinole, CA
Tim, Nurse, Richmond, CA
Bob, Nurse, Pinole, CA
Uma, Doctor, San Pablo, CA
Kate, Nurse, Oakland, CA
Paul, Doctor, Oakland, CA
Ruth, Nurse, Richmond, CA
Joe, Nurse, Oakland, CA
Nick, Nurse, Pinole, CA
Pete, Doctor, San Pablo, CA