Re: Help with parsing a file

by Marshall (Canon)
on May 28, 2022 at 23:04 UTC

in reply to Help with parsing a file

It is rare for array indices to appear in Perl code for this sort of problem. Here is another way.

This may or may not be helpful to you, but here was my general thought process:
1. I started by writing "while(){" without filling in the condition yet.
2. I saw that you had blank-line-separated records.
So I coded a line to get one record and then coded the subroutine.
There are many ways to write this sub; I just picked an obvious one.
3. Then I applied your rules to get the solvent name from that record.
4. Then I wrote a loop to iterate over the F values.
5. Then I decided to end on eof and filled in the while condition with an eof check.

So, that is how I got to draft #1. Now I see that I could move getting the record into the while condition and stop looping when the returned hash is empty. All sorts of improvements could be made. I wanted to demo iterating over the keys of the record and getting a subset of matching keys with grep. This is not perfect code, but I hope it is easy for you to understand.

use strict;
use warnings;
use Data::Dumper;

my %results;   #pick a better name for this!!
my $eof_seen = 0;

while (!$eof_seen)
{
    my %record = get_record();
    my ($solvent) = grep {/^solvent/ and $record{$_}==0}keys %record;
    foreach my $F (grep {/^F/}keys %record)
    {
        $results{$solvent."_".$F}= $record{$F};
    }
    $eof_seen=1 if (eof(DATA));
}
print Dumper \%results;

sub get_record   #blank line separated records
{
    my %record;
    my $line;
    while (defined ($line = <DATA>) and $line !~ /^\s*$/)
    {
        my ($key, $value) = split ' ',$line;
        $record{$key} = $value;
    }
    return %record;
}

=Prints
$VAR1 = {
          'solvent1_F101' => '3.2',
          'solvent1_F001' => '1.2',
          'solvent2_F101' => '7.2',
          'solvent2_F001' => '2.2'
        };
=cut

__DATA__
F001 1.2
F101 3.2
solvent1 0
solvent2 3

F001 2.2
F101 7.2
solvent1 5
solvent2 0
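The improvement mentioned above (fetching the record in the while condition and stopping on an empty hash) might look roughly like the sketch below. This is only one possible "draft #2", not tested against your real file. To keep it self-contained, the sample data is held in a string and read through an in-memory filehandle instead of DATA; the `$input` and `$fh` names, the filehandle argument to `get_record`, and the skipping of extra blank lines are my additions.

```perl
use strict;
use warnings;

# Sample records copied from the post, held in a string (an assumption
# made so that the sketch runs on its own without a __DATA__ section).
my $input = <<'END';
F001 1.2
F101 3.2
solvent1 0
solvent2 3

F001 2.2
F101 7.2
solvent1 5
solvent2 0
END

open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my %results;

# get_record returns an empty list at end of input, so the list
# assignment in the condition evaluates to 0 and the loop stops.
while (my %record = get_record($fh)) {
    # The solvent whose value is 0 names this record.
    my ($solvent) = grep { /^solvent/ and $record{$_} == 0 } keys %record;
    next unless defined $solvent;    # no zero-valued solvent: skip record

    # Copy every F key into the results under "solvent_F" names.
    $results{"${solvent}_$_"} = $record{$_} for grep { /^F/ } keys %record;
}

print "$_ => $results{$_}\n" for sort keys %results;

# Read one blank-line-separated record from the given filehandle.
sub get_record {
    my ($in) = @_;
    my %record;
    while (defined(my $line = <$in>)) {
        if ($line =~ /^\s*$/) {
            next unless %record;     # ignore blank lines before any data
            last;                    # blank line after data ends the record
        }
        my ($key, $value) = split ' ', $line;
        $record{$key} = $value;
    }
    return %record;
}
```

Compared with draft #1, the separate `$eof_seen` flag goes away, and a run of several blank lines between records no longer produces an empty record.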

Replies are listed 'Best First'.
Re^2: Help with parsing a file
by Odar (Novice) on May 29, 2022 at 19:19 UTC

    Thank you for helping with this, Marshall; I am very grateful. Based on the feedback and solutions provided, I have realised that I left out a key piece of information when I tried to strip the problem down to its most basic form (I have updated the question). The data blocks are actually not separated by a single empty line but by three lines of text, with an empty line above and below them, and there are more than two blocks. Apologies for the confusion.
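One way to cope with that kind of separator is to stop relying on blank lines and instead accept only lines that look like data, treating everything else as a record boundary. The sketch below is hedged accordingly: the separator wording in the sample is invented, and it assumes that data keys always look like `F` or `solvent` followed by digits and that separator lines never do. The `flush_record` helper is a hypothetical name introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: the three separator lines are made up, since the
# real text was not shown in the thread.
my $input = <<'END';
F001 1.2
F101 3.2
solvent1 0
solvent2 3

--- separator text line 1 ---
--- separator text line 2 ---
--- separator text line 3 ---

F001 2.2
F101 7.2
solvent1 5
solvent2 0
END

open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my (%results, %record);

while (my $line = <$fh>) {
    # Assumption: a data line is "F<digits>" or "solvent<digits>" plus a value.
    if ($line =~ /^(F\d+|solvent\d+)\s+(\S+)\s*$/) {
        $record{$1} = $2;
        next;
    }
    # Anything else (blank line or separator text) closes the current record.
    flush_record(\%record, \%results);
}
flush_record(\%record, \%results);   # the last block has no trailing separator

print "$_ => $results{$_}\n" for sort keys %results;

# Store the F values of a finished record under the zero-valued solvent's
# name, then empty the record. Does nothing if the record is already empty.
sub flush_record {
    my ($rec, $res) = @_;
    return unless %$rec;
    my ($solvent) = grep { /^solvent/ and $rec->{$_} == 0 } keys %$rec;
    if (defined $solvent) {
        $res->{"${solvent}_$_"} = $rec->{$_} for grep { /^F/ } keys %$rec;
    }
    %$rec = ();
}
```

Because every non-data line only flushes the current record, the number of separator lines and the number of blocks do not matter. If the real separator text can start with something like "F12 ", the data-line pattern would need tightening.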
