Re: Help with parsing a file

There are a bunch of issues with your code that I'll mention in passing to help you toward Perlish style programming instead of C style. The first issue is an almost complete lack of optional white space, which I find hard to read. Use white space as you would for writing prose - that's probably what people read most of and what brains are trained to parse, so keep it simple for brains.

An immediate issue is that you don't show how you parse your input data, so we can't tell what is in $row. That means we don't know what is in @face_ac, and the line pushing @temp into it looks dubious to me. So let's throw all of that away to start with and build something new.

First, we want this to be a small, self-contained, correct example, so we start off with strictures and some baked-in data. There is a hint that you know this, but always use strictures (use strict; use warnings; - see The strictures, according to Seuss).

use strict;
use warnings;

my $fileStr = <<STR;
F001            1.2
F101            3.2
solvent1      0
solvent2     3

F001            2.2
F101            7.2
solvent1      5
solvent2     0
STR

open my $fIn, '<', \$fileStr or die "Couldn't open \$fileStr: $!\n";

This adds strictures, provides sample data as though it were in an external file and opens an input file handle to it. Now set up a loop to parse the input data. Perl allows us to tell it what constitutes an end of line character sequence so we take advantage of that to read the data one record at a time:

# Look for the empty line between records
local $/ = "\n\n";

while (defined (my $record = <$fIn>)) {
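As a stand-alone illustration of the record separator, here is a minimal sketch. The sample data and the sub name read_records are made up for this example and are not part of the code in this post. It reads blank-line separated records from an in-memory file and uses chomp to strip the separator:

```perl
use strict;
use warnings;

# Hypothetical sample data: two records separated by a blank line.
my $sample = "a 1\nb 2\n\nc 3\nd 4\n";

sub read_records {
    my ($text) = @_;

    # Open a read handle on the string itself (an in-memory file).
    open my $in, '<', \$text or die "Couldn't open in-memory file: $!\n";

    # local restores the old separator when the sub returns.
    local $/ = "\n\n";

    my @records;
    while (defined (my $record = <$in>)) {
        chomp $record;    # strips a trailing copy of $/, if there is one
        push @records, $record;
    }

    close $in;
    return @records;
}

my @records = read_records($sample);
print scalar (@records), " records\n";
```

Wrapping the read in a sub keeps the local change to $/ from leaking into other code that reads lines.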

Parse the lines. Note that %recordData is declared inside the loop because we don't need it outside the loop or before the loop. Always declare variables in the smallest scope and initialize them when they are declared if appropriate (arrays and hashes are empty by default so usually they don't need to be initialized). You are familiar with split already, but grep and map may be new. Pop off and skim their documentation. In this case we are using grep to remove empty lines and map to generate a key value pair for each line. Then we use grep to build a list of solvents and a list of Fs:

    my %recordData = map {split /\s+/, $_} grep {length $_} split "\n", $record;
    my @solvents = grep {/^solvent\d+/} keys %recordData;
    my @fractions = grep {/^F\d+/} keys %recordData;
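If the split, grep and map chain is hard to follow, a smaller sketch may help. The data and variable names here are invented for illustration only. Read the first statement from right to left: split makes lines, grep drops the empty ones, map turns each line into a key/value pair.

```perl
use strict;
use warnings;

# Hypothetical record with a stray blank line in the middle.
my $record = "apple 3\n\navocado 0\nbanana 5\n";

# split -> lines, grep -> non-empty lines, map -> (key, value) pairs.
my %data = map {split /\s+/, $_} grep {length $_} split "\n", $record;

# grep filters the keys; sort gives a repeatable order.
my @startsWithA = sort grep {/^a/} keys %data;

print "$_ => $data{$_}\n" for @startsWithA;
```

This should print apple => 3 followed by avocado => 0.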

Now we can find the solvent with the zero value. We assume there is one and only one. There could be error checking around this, but I'm skipping it for now. Note that grep operates on a list and generates a list, so $zeroSolvent needs to be in list context (hence the parentheses around it) so that the first element of the list generated by grep is assigned to it:

    my ($zeroSolvent) = grep {!$recordData{$_}} @solvents;
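To see why the parentheses matter, here is a minimal sketch with made-up data (the %amount hash is hypothetical). The same grep gives the first match in list context and a count in scalar context:

```perl
use strict;
use warnings;

# Hypothetical data: two entries have a zero value.
my %amount = (water => 0, ethanol => 3, acetone => 0);

# List context: the parentheses make $first take the first element
# of the list that grep returns.
my ($first) = grep {!$amount{$_}} sort keys %amount;

# Scalar context: without parentheses grep returns the number of matches.
my $count = grep {!$amount{$_}} sort keys %amount;

print "first = $first, count = $count\n";
```

This should print first = acetone, count = 2. The sort is only there to make the result repeatable.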

and now we can generate the report for the record:

    
    print "${zeroSolvent}_$_ => $recordData{$_}\n" for @fractions;
}

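That prints (the order of the F lines within each record may differ from run to run because hash keys come back in no particular order):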

solvent1_F101 => 3.2
solvent1_F001 => 1.2
solvent2_F101 => 7.2
solvent2_F001 => 2.2

The code above concatenated together is:

use strict;
use warnings;

my $fileStr = <<STR;
F001            1.2
F101            3.2
solvent1      0
solvent2     3

F001            2.2
F101            7.2
solvent1      5
solvent2     0
STR

open my $fIn, '<', \$fileStr or die "Couldn't open \$fileStr: $!\n";

# Look for the empty line between records
local $/ = "\n\n";

while (defined (my $record = <$fIn>)) {
    my %recordData = map {split /\s+/, $_} grep {length $_} split "\n", $record;
    my @solvents = grep {/^solvent\d+/} keys %recordData;
    my @fractions = grep {/^F\d+/} keys %recordData;
    my ($zeroSolvent) = grep {!$recordData{$_}} @solvents; 
    
    print "${zeroSolvent}_$_ => $recordData{$_}\n" for @fractions;
}
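The code above assumes exactly one solvent with a zero value per record. One possible way to check that assumption is sketched here. The sub name find_zero_solvent is made up for this example, and dying on bad data is only one of several reasonable choices:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the single solvent whose value is zero,
# or dies if the record has none or more than one.
sub find_zero_solvent {
    my (%recordData) = @_;

    my @zero = sort grep {/^solvent\d+/ && !$recordData{$_}} keys %recordData;

    # @zero in string concatenation is in scalar context, so it gives a count.
    die "Expected exactly one zero solvent, found " . @zero . "\n"
        if @zero != 1;

    return $zero[0];
}

print find_zero_solvent (F001 => 1.2, solvent1 => 0, solvent2 => 3), "\n";
```

Inside the loop this could replace the grep line with something like my $zeroSolvent = find_zero_solvent (%recordData);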

There may be follow up questions. :-D

This is not the solution that a person with experience in other programming languages might come up with first off, but it's worth exploring in detail because tools such as grep and map can clean up code something wonderful (they can also obscure code something dreadful).

Update: I should note that "${zeroSolvent}_$_ => $recordData{$_}\n" uses variable interpolation. Perl expands the contents of variables used inside double quoted strings. The ${zeroSolvent} bit lets us use the variable zeroSolvent with an underscore character following it in the string without Perl seeing zeroSolvent_ as the variable name instead.
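A minimal sketch of the brace trick, using made-up variable names ($prefix and $name are not from the code above):

```perl
use strict;
use warnings;

my $prefix = 'solvent1';
my $name   = 'F001';

# The braces end the variable name at "prefix", so the underscore is
# treated as a literal character in the string.
my $label = "${prefix}_$name";

# Without the braces Perl looks for a variable called $prefix_ and,
# under strict, refuses to compile:
# my $broken = "$prefix_$name";

print "$label\n";
```

This should print solvent1_F001.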

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond


