PerlMonks  

Matching Values in an Array

by Digs27 (Initiate)
on Nov 21, 2014 at 18:15 UTC ( [id://1108037]=perlquestion )

Digs27 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!

I am a new programmer and need help with a project I am working on.

I have two data files formatted as follows:

Station.CSV

Station,Code
1,10
2,11
3,12
4,13
5,14

and..

Parameters.CSV

Station,S_10,S_11,S_12,S_13,S_14,S_15,T_10,T_11,T_12,T_13,T_14,T_15
1,31,29,29,31,29,29,15,14,23,15,14,23
2,33,28,23,33,28,23,17,15,23,17,15,23
3,23,27,33,23,27,33,18,16,23,18,16,23
4,25,26,28,25,26,28,23,14,15,23,14,15
5,26,26,27,26,26,27,23,18,17,23,18,17
6,27,33,31,27,33,31,14,17,18,14,17,18
7,33,29,29,33,29,29,12,18,23,12,18,23

Both of the CSV files listed above are dummy datasets; the actual files I'm trying to use are 1,900 (Station) and 48,000 (Parameters) lines long.

My objective is to match station from station.csv to station in Parameters.csv and then use Code from station.csv to match the correct temperature and salinity column from Parameters.csv and return those values to station.csv. For example, in line 1 of station.csv I need it to match station 1 in Parameters.CSV and return values for S_10 and T_10 in that row. Does this make sense? Is this possible in Perl?

This is the script I have written up to this point. I am at a complete loss on how to continue.

my @Station = qw();
my @Code = qw();

open(TEXT,"<Station.csv") or die "\n\n\nFile Not Found!\n\n\n";
my @SCAL = <TEXT>;
close(TEXT);

foreach my $line (@SCAL)
{
    my @dataA = split(',', $line);
    push (@Code, $dataA[1]."\n");
    push (@Station, $dataA[0]."\n");
}

open(FILE,"<Parameters.csv") or die "\n\n\nFile Not Found!\n\n\n";
my @FVCO = <FILE>;
close (FILE);

foreach my $lineB(@FVCO)
{
    my @dataB = split(',', $lineB);
    if ( $dataB[0] == $Station[0 .. 5])
    {
        #match correct column for salinity and salinity based off the value for "code"
    }
}

Please help me if you can!

Replies are listed 'Best First'.
Re: Matching Values in an Array
by toolic (Bishop) on Nov 21, 2014 at 18:44 UTC
    Self-contained example (not reading from files). You could store your table in a hash-of-hashes structure to make access simpler.
    use warnings;
    use strict;

    my %param;
    while (<DATA>) {
        chomp;
        my @cols = split /,/;
        next if $cols[0] =~ /\D/;
        for my $i (0 .. 5) {
            $param{$cols[0]}{s}{$i+10} = $cols[$i+1];
            $param{$cols[0]}{t}{$i+10} = $cols[$i+7];
        }
    }

    my @stats = split /\n/, <<EOF;
    Station,Code
    1,10
    2,11
    3,12
    4,13
    5,14
    EOF

    for (@stats) {
        chomp;
        my ($stat, $code) = split /,/;
        next if $stat =~ /\D/;
        print "$stat $param{$stat}{s}{$code} $param{$stat}{t}{$code}\n";
    }

    __DATA__
    Station,S_10,S_11,S_12,S_13,S_14,S_15,T_10,T_11,T_12,T_13,T_14,T_15
    1,31,29,29,31,29,29,15,14,23,15,14,23
    2,33,28,23,33,28,23,17,15,23,17,15,23
    3,23,27,33,23,27,33,18,16,23,18,16,23
    4,25,26,28,25,26,28,23,14,15,23,14,15
    5,26,26,27,26,26,27,23,18,17,23,18,17
    6,27,33,31,27,33,31,14,17,18,14,17,18
    7,33,29,29,33,29,29,12,18,23,12,18,23

    Outputs:

    1 31 15
    2 28 15
    3 33 23
    4 25 23
    5 26 18
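
    With the real files, a station or code in Station.csv may well be absent from Parameters.csv, in which case the print above would emit "uninitialized value" warnings. The following is only a sketch of a guarded lookup on the same station/type/code layout; the `lookup` helper and the two-station sample hash are hypothetical and not part of the reply above.

```perl
use strict;
use warnings;

# Same layout as %param above: station => type (s or t) => code => value.
# Only a small, hand-typed subset of the sample data is used here.
my %param = (
    1 => { s => { 10 => 31, 11 => 29 }, t => { 10 => 15, 11 => 14 } },
    2 => { s => { 10 => 33, 11 => 28 }, t => { 10 => 17, 11 => 15 } },
);

# Hypothetical helper: returns (salinity, temperature) for a station and
# code, or an empty list when either value is missing.
sub lookup {
    my ($param, $stat, $code) = @_;
    return unless exists $param->{$stat};
    my $s = $param->{$stat}{s}{$code};
    my $t = $param->{$stat}{t}{$code};
    return unless defined $s && defined $t;
    return ($s, $t);
}

for my $pair ([1, 10], [2, 11], [9, 10], [1, 99]) {
    my ($stat, $code) = @$pair;
    if (my ($s, $t) = lookup(\%param, $stat, $code)) {
        print "$stat $s $t\n";
    }
    else {
        print "$stat: no data for code $code\n";
    }
}
```

    The list assignment inside the `if` is true only when `lookup` returned two values, so unknown stations (9) and unknown codes (99) fall through to the else branch instead of printing warnings.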
Re: Matching Values in an Array
by CountZero (Bishop) on Nov 21, 2014 at 21:17 UTC
    Of course, Perl is perfectly suited for this kind of task!

    use Modern::Perl '2014';

    my @data;
    <DATA>;    # discard header line
    while (<DATA>) {
        chomp;
        last if $_ eq 'END PARAMETERS';    # end of data marker, not necessary if reading from a file
        my ( $station, @fields ) = split /,/;
        $data[$station] = \@fields;
    }

    <DATA>;    # discard header line
    while (<DATA>) {
        chomp;
        my ( $station, $level ) = split /,/;
        say "Station $station: salinity $level is $data[$station][$level-10] and temperature $level is $data[$station][$level-4]";
    }

    __DATA__
    Station,S_10,S_11,S_12,S_13,S_14,S_15,T_10,T_11,T_12,T_13,T_14,T_15
    1,31,29,29,31,29,29,15,14,23,15,14,23
    2,33,28,23,33,28,23,17,15,23,17,15,23
    3,23,27,33,23,27,33,18,16,23,18,16,23
    4,25,26,28,25,26,28,23,14,15,23,14,15
    5,26,26,27,26,26,27,23,18,17,23,18,17
    6,27,33,31,27,33,31,14,17,18,14,17,18
    7,33,29,29,33,29,29,12,18,23,12,18,23
    END PARAMETERS
    Station,Code
    1,10
    2,11
    3,12
    4,13
    5,14
    Output:
    Station 1: salinity 10 is 31 and temperature 10 is 15
    Station 2: salinity 11 is 28 and temperature 11 is 15
    Station 3: salinity 12 is 33 and temperature 12 is 23
    Station 4: salinity 13 is 25 and temperature 13 is 23
    Station 5: salinity 14 is 26 and temperature 14 is 18

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: Matching Values in an Array
by 2teez (Vicar) on Nov 21, 2014 at 23:13 UTC

    Hi Digs27,
    You have already been given a lot of help, all of it great IMHO, but since you mentioned the CSV format, it is also worth looking at a module such as Text::CSV_XS, like this:

    use warnings;
    use strict;
    use Text::CSV_XS;
    use Inline::Files;    # Note: used to read multiple virtual files

    my %station_data = getData( \*DATA );
    my %data         = getData( \*DATA2 );

    for ( 0 .. $#{ $data{qw(Station)} } ) {
        my $key = $data{'Code'}[$_];
        print join( " " => $station_data{'Station'}[$_],
            $station_data{"S_$key"}[$_],
            $station_data{"T_$key"}[$_] ),
          $/;
    }

    sub getData {
        my $file = shift(@_);
        my %data;
        my @heading = split /[\s,]/ => <$file>;
        my $csv = Text::CSV_XS->new( { binary => 1 } )
          or die Text::CSV_XS->error_diag();
        while ( my $row = $csv->getline($file) ) {
            push @{ $data{$_} } => shift @$row for @heading;
        }
        return %data;
    }

    __DATA__
    Station,S_10,S_11,S_12,S_13,S_14,S_15,T_10,T_11,T_12,T_13,T_14,T_15
    1,31,29,29,31,29,29,15,14,23,15,14,23
    2,33,28,23,33,28,23,17,15,23,17,15,23
    3,23,27,33,23,27,33,18,16,23,18,16,23
    4,25,26,28,25,26,28,23,14,15,23,14,15
    5,26,26,27,26,26,27,23,18,17,23,18,17
    6,27,33,31,27,33,31,14,17,18,14,17,18
    7,33,29,29,33,29,29,12,18,23,12,18,23
    __DATA2__
    Station,Code
    1,10
    2,11
    3,12
    4,13
    5,14
    Output as:
    1 31 15
    2 28 15
    3 33 23
    4 25 23
    5 26 18
    Using the OP data. A bit limiting though.

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
Re: Matching Values in an Array
by Anonymous Monk on Nov 21, 2014 at 18:16 UTC
    See DBD::CSV. If you know SQL, this gets quite a bit easier.
Re: Matching Values in an Array
by Laurent_R (Canon) on Nov 21, 2014 at 19:54 UTC
    Your approach is going the wrong way around (although you could still make it work). Basically, in such a case, you need to read data from Station.CSV and look up data from Parameters.CSV. The best way to do that is to first load Parameters.CSV into memory (in a hash of hashes, I would say, but an array of hashes would also fit the bill) and, once you've done that, to read Station.CSV sequentially and do the necessary hash lookups. Your data is not very large; the second file will fit into memory without any problem.

    Possibly something like this (quick untested code) to populate the HoH:

    my @fieldnames = qw /S_10 S_11 S_12 S_13 S_14 S_15 T_10 T_11 T_12 T_13 T_14 T_15/;
    # Note you could also make it dynamic and generate it from the input
    my %params;
    while (<$Param>) {
        chomp;
        my ($index, @values) = split /,/, $_;
        # Pair each field name with the value in the same position
        $params{$index} = { map {$fieldnames[$_], $values[$_]} 0..$#fieldnames };
    }
    Retrieving the data is then quite simple in the %params HoH.

    No time now, but I'll try to give a complete solution in a couple of hours.

    Update: corrected a mistake in the code above. And below the full solution, a bit later than I originally planned:

    use strict;
    use warnings;

    my (undef, @fieldnames) = split /,/, <DATA>;
    chomp @fieldnames;
    my %params;
    while (<DATA>) {
        chomp;
        my ($index, @values) = split /,/, $_;
        $params{$index} = {map {$fieldnames[$_], $values[$_]} 0..$#fieldnames};
    }

    my $stat = "1,10\n2,11\n3,12\n4,13\n5,14\n";
    open my $STAT, "<", \$stat or die "cannot open $stat $!";
    while (<$STAT>) {
        chomp;
        next if /sta/i;
        my ($stat, $code) = split /,/;
        print "Station $stat : Salinity: ", $params{$stat}{"S_$code"}, "; Temperature: ", $params{$stat}{"T_$code"}, "\n";
    }

    __DATA__
    Station,S_10,S_11,S_12,S_13,S_14,S_15,T_10,T_11,T_12,T_13,T_14,T_15
    1,31,29,29,31,29,29,15,14,23,15,14,23
    2,33,28,23,33,28,23,17,15,23,17,15,23
    3,23,27,33,23,27,33,18,16,23,18,16,23
    4,25,26,28,25,26,28,23,14,15,23,14,15
    5,26,26,27,26,26,27,23,18,17,23,18,17
    6,27,33,31,27,33,31,14,17,18,14,17,18
    7,33,29,29,33,29,29,12,18,23,12,18,23
    Output:
    Station 1 : Salinity: 31; Temperature: 15
    Station 2 : Salinity: 28; Temperature: 15
    Station 3 : Salinity: 33; Temperature: 23
    Station 4 : Salinity: 25; Temperature: 23
    Station 5 : Salinity: 26; Temperature: 18
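
    Since the original question asks for the values to be returned to the station file, the same hash-of-hashes lookup can also write a new CSV instead of printing a report. What follows is a rough sketch under a few assumptions: the `add_parameters` helper, the output column names Salinity and Temperature, and the trimmed-down sample data are all made up for illustration, and in-memory filehandles stand in for the real files.

```perl
use strict;
use warnings;

# Hypothetical helper: reads the parameters and station tables from two
# open filehandles and writes the augmented station table to a third.
sub add_parameters {
    my ($param_fh, $station_fh, $out_fh) = @_;

    # Header of the parameters table, e.g. Station,S_10,...,T_15
    my $header = <$param_fh>;
    chomp $header;
    my (undef, @fieldnames) = split /,/, $header;

    # Build station => { S_10 => ..., T_10 => ..., ... }
    my %params;
    while (my $line = <$param_fh>) {
        chomp $line;
        next unless $line =~ /\S/;    # skip blank lines
        my ($station, @values) = split /,/, $line;
        @{ $params{$station} }{@fieldnames} = @values;
    }

    my $station_header = <$station_fh>;    # read and discard "Station,Code"
    print {$out_fh} "Station,Code,Salinity,Temperature\n";
    while (my $line = <$station_fh>) {
        chomp $line;
        next unless $line =~ /\S/;
        my ($station, $code) = split /,/, $line;
        my $row = $params{$station} || {};
        # Leave the field empty when there is no matching value
        my $sal  = $row->{"S_$code"} // '';
        my $temp = $row->{"T_$code"} // '';
        print {$out_fh} join(',', $station, $code, $sal, $temp), "\n";
    }
    return;
}

# Trimmed-down sample data (only codes 10 and 11, stations 1 and 2)
my $param_csv   = "Station,S_10,S_11,T_10,T_11\n1,31,29,15,14\n2,33,28,17,15\n";
my $station_csv = "Station,Code\n1,10\n2,11\n";

open my $p, '<', \$param_csv   or die "cannot open parameters: $!";
open my $s, '<', \$station_csv or die "cannot open stations: $!";
my $out = '';
open my $o, '>', \$out or die "cannot open output: $!";

add_parameters($p, $s, $o);
close $o;
print $out;
```

    With real data, the three `open` calls would presumably point at Parameters.csv, Station.csv and a new output file rather than at scalars. Writing to a separate file and renaming it afterwards is usually safer than overwriting Station.csv in place.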
Re: Matching Values in an Array
by poj (Abbot) on Nov 21, 2014 at 21:48 UTC
    the actual files i'm trying to use are 1,900(Station) and 48,000(Parameters) lines long.

    Does a station have only 1 record in Parameters ?

    poj
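
    One way to answer that question before choosing a data structure is to count how many rows each station has in the parameters file. The snippet below is only a sketch: `count_station_rows` is a hypothetical helper, and the three-row sample with a repeated station is invented to show the check, not taken from the real data.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref of station => number of data rows
sub count_station_rows {
    my ($fh) = @_;
    my %count;
    my $header = <$fh>;    # read and discard the header line
    while (my $line = <$fh>) {
        next unless $line =~ /\S/;    # skip blank lines
        my ($station) = split /,/, $line;
        $count{$station}++;
    }
    return \%count;
}

# Invented sample in which station 1 appears twice
my $sample = "Station,S_10,T_10\n1,31,15\n2,33,17\n1,30,16\n";
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my $count = count_station_rows($fh);

for my $station (sort { $a <=> $b } keys %$count) {
    print "Station $station appears $count->{$station} times\n"
        if $count->{$station} > 1;
}
```

    If any station turns up more than once, a plain station-keyed hash would silently keep only the last row, so the lookup key would need something extra (a date or depth column, for instance) to pick the right record.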
