PerlMonks  

Re: extract values from file if greater than value

by Cristoforo (Curate)
on Jun 15, 2016 at 18:18 UTC ( [id://1165767] )


in reply to extract values from file if greater than value

Not knowing whether the fields may repeat in either file, as Anonymous Monk asked, I can only guess at a solution. The best way to solve this is probably to use a lookup hash (as in my solution below), provided the first 2 fields in file 1 don't repeat.
#!/usr/bin/perl
use strict;
use warnings;

open my $fh1, '<', \<<EOF or die $!;
1 19002930 0.74
1 19002931 -0.12
EOF

my %data;
while (<$fh1>) {
    next if /CHR/;
    my ($chr, $bp, $min) = split;
    $data{$chr,$bp} = $min;
}

open my $fh2, '<', \<<EOF or die $!;
1 19002930 0.84 0.12 0.94
1 19002931 0 -.20 .12
EOF

while (<$fh2>) {
    my ($chr, $bp, @rest) = split;
    if (defined(my $min = $data{$chr,$bp})) {
        print join(" ", $chr, $bp, grep $_ >= $min, @rest), "\n";
    }
}
This only prints the values; I'm not sure how you want to store them in an array (as you mentioned).
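
If the values need to be kept rather than printed, one possible sketch is to push each filtered row onto an array of array references. The sub name filter_rows and the row layout (chr, bp, then the surviving values) are assumptions made up for this illustration, since the original question doesn't say what the array should look like.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): takes a reference to the
# lookup hash plus the lines of file 2, and returns one array ref per matching
# row, laid out as [ chr, bp, values >= min ... ].
sub filter_rows {
    my ($min_for, @lines) = @_;
    my @rows;
    for my $line (@lines) {
        my ($chr, $bp, @rest) = split ' ', $line;
        my $min = $min_for->{$chr, $bp};
        next unless defined $min;          # no entry for this chr/bp in file 1
        push @rows, [ $chr, $bp, grep { $_ >= $min } @rest ];
    }
    return @rows;
}

my %min_for;
$min_for{1, 19002930} = 0.74;
$min_for{1, 19002931} = -0.12;

my @rows = filter_rows(\%min_for,
    "1 19002930 0.84 0.12 0.94",
    "1 19002931 0 -.20 .12",
);

print "@$_\n" for @rows;
# 1 19002930 0.84 0.94
# 1 19002931 0 .12
```

Each element of @rows can then be used like any other array reference, for example $rows[0][2] for the first kept value of the first row.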

Update: Added the 'defined' operator to the if statement so a 'min' value of '0' will be accepted. Without testing for defined, a '0' value would cause the if statement to be wrongly false.
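
To make the difference concrete, here is a small sketch contrasting a plain truth test with a defined test on a hash lookup. The hash contents and the two helper names are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up lookup table: key 'a' has a legitimate minimum of 0.
my %min_for = (a => 0, b => 0.5);

# Hypothetical helpers contrasting the two tests.
sub found_by_truth   { my ($h, $k) = @_; return $h->{$k} ? 1 : 0 }
sub found_by_defined { my ($h, $k) = @_; return defined($h->{$k}) ? 1 : 0 }

for my $key (qw(a b missing)) {
    printf "%-8s truth=%d defined=%d\n",
        $key, found_by_truth(\%min_for, $key), found_by_defined(\%min_for, $key);
}
# a        truth=0 defined=1
# b        truth=1 defined=1
# missing  truth=0 defined=0
```

The truth test cannot tell a stored 0 apart from a key that isn't there; the defined test can.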

Also, as in Marshall's solution, the temporary files I created were just for this example. You would need to open your files in the normal way: open my $fh1, '<', 'yourfilename' or die $!;
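
As a rough sketch of what that might look like with a real file, the lookup hash could be built in a small sub. The name load_minimums is hypothetical, 'yourfilename' is only a placeholder, and the /CHR/ test assumes the file may start with a header line containing CHR, which the thread does not confirm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads "chr bp min" rows from $path into a lookup hash
# keyed with the $h{$chr,$bp} idiom, and returns a reference to it.
sub load_minimums {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    my %min_for;
    while (my $line = <$fh>) {
        next if $line =~ /CHR/;        # assumed header line; drop if not needed
        my ($chr, $bp, $min) = split ' ', $line;
        next unless defined $min;      # skip blank or short lines
        $min_for{$chr, $bp} = $min;
    }
    close $fh;
    return \%min_for;
}

# Usage, with a placeholder file name:
# my $min_for = load_minimums('yourfilename');
```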

Update 2: Noting Marshall's comment that $data{"$chr$bp"} is safer with a space between the values, $data{"$chr $bp"}, I used the seldom-used idiom $data{$chr,$bp}, in which a comma-separated list of terms used as a hash key is joined together with $; (the '$SUBSCRIPT_SEPARATOR').

Also, I'm wondering what the purpose of next if /CHR/; is in his code. It is hard to see without a better data sample for the file he is reading.
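
A short sketch of why the separator matters, using two made-up (chr, bp) pairs that happen to concatenate to the same string. The hash names are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %plain;   # keys built by plain concatenation, "$chr$bp"
my %multi;   # keys built with the $h{$chr,$bp} idiom

# Two invented pairs: (1, 19) and (11, 9) both concatenate to "119".
for my $pair ([1, 19], [11, 9]) {
    my ($chr, $bp) = @$pair;
    $plain{"$chr$bp"}++;
    $multi{$chr, $bp}++;
}

print scalar(keys %plain), "\n";   # 1: both pairs landed on the key "119"
print scalar(keys %multi), "\n";   # 2: the separator keeps them apart

# The comma form is equivalent to joining the terms with $; by hand.
my $same = exists $multi{ join($;, 11, 9) };
print $same ? "same key\n" : "different key\n";   # same key
```

Marshall's "$chr $bp" avoids the collision in the same way, as long as neither field can itself contain a space.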

Replies are listed 'Best First'.
Re^2: extract values from file if greater than value
by Marshall (Canon) on Jun 16, 2016 at 04:23 UTC
    I didn't see your post++ before posting my own revised solution according to the changing requirement specs! I encourage posters to be as clear as possible on the requirements - that makes a big difference! If the code doesn't work and the requirements don't either, then that is a mess!

    A very small nit about $data{"$chr$bp"} = $min;: I added a space between the values, "$chr $bp", to prevent possible collisions between these two things.

    Update: Cristoforo is right about this seldom-used idiom of commas in hash keys. I also wondered about /CHR/.
