PerlMonks  

searching an array member in file header and print a column

by shalgham (Initiate)
on May 27, 2014 at 12:51 UTC ( [id://1087532]=perlquestion )

shalgham has asked for the wisdom of the Perl Monks concerning the following question:

I have an array and a file. The file is a tab-delimited table with columns and rows. I want to look up each member of the array and, if it appears in the header of the file, print all the rows under that header. The members of the array are made of letters, not numbers.

foreach my $t (@array) { ## if $t is eq to one of the headers, print the lines under that header plus the header itself ## }
I do not know how to do it. I have an array of short letter strings such as @a = qw(nat pls kac), and a table whose header is "nat\t pls\t fof\t tri\t kac". I want to get all the rows of each column whose header matches a member of my array. For example, I need the rows under the nat, pls, and kac headers, but the fof and tri columns should be discarded.

Replies are listed 'Best First'.
Re: searching an array member in file header and print a column
by kcott (Archbishop) on May 27, 2014 at 13:09 UTC

    G'day shalgham,

    Welcome to the monastery.

    A representative example of both your array and file would help comprehension.

    It sounds like you'll want to create a hash from your array, e.g.

    my %hash = map { $_ => 1 } @array;

    Then process your file based on the hash, e.g.

    while (<$filehandle>) {
        my @row_elements = split /$pattern/;
        next unless $hash{$row_elements[$i]};
        # Process "wanted" row here
    }
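    To make the placeholders concrete, here is a minimal sketch of the two snippets working together. The helper name `filter_rows`, the tab delimiter, the key column `$i = 0`, and the sample data are all assumptions made for illustration; none of them come from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the rows whose field at index $i is a key in %$wanted.
sub filter_rows {
    my ($fh, $wanted, $i, $pattern) = @_;
    my @kept;
    while (my $line = <$fh>) {
        chomp $line;
        my @row_elements = split $pattern, $line;
        next unless defined $row_elements[$i] && $wanted->{ $row_elements[$i] };
        push @kept, \@row_elements;    # "wanted" row
    }
    return @kept;
}

my @array = qw(nat pls kac);
my %hash  = map { $_ => 1 } @array;

# Made-up sample data, read through an in-memory filehandle.
my $sample = "nat\t0.1\nfof\t0.23\nkac\t0.31\n";
open my $filehandle, '<', \$sample or die "Cannot open in-memory file: $!";

# Assumed values for the placeholders: $i = 0, $pattern = a single tab.
my @rows = filter_rows($filehandle, \%hash, 0, qr/\t/);
print join("\t", @$_), "\n" for @rows;
```

    Note that this selects rows by the value found in one column. Selecting whole columns by header name needs the same hash lookup applied to the fields of the header line instead.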

    With more information to work with, I probably could have provided a better answer. See the guidelines in "How do I post a question effectively?": a better question gets a better answer.

    -- Ken

      Thanks Ken, I updated my question. I will read up on how to ask a question soon. Thanks again.
Re: searching an array member in file header and print a column
by InfiniteSilence (Curate) on May 27, 2014 at 15:05 UTC

    The normal, correct answer to a question like this on PerlMonks is for you to go and spend some additional time learning Perl. However, I think your problem has less to do with Perl and more to do with a gap in understanding of data structures and logical flow, so I'll try to address these with an example that closely matches your problem:

    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    my @cols = qw(nat pls kac);
    my @bigStructure = ();
    my $realHeaders;
    my $lineCntr = 0;

    sub procHeader {
        my $hdrline = shift;
        my @instArray = ();
        my %hash = map {$_=>1} @cols; #thanks kcott
        my @contents = split /\s+/,$hdrline;
        for(@contents) {
            if($hash{$_}) {
                push @instArray,$_;
            } else {
                push @instArray, qw|skip|;
            }
        }
        return \@instArray;
    }

    while(<DATA>) {
        if(!m/^\d/){$realHeaders = procHeader($_)};
        my @row = split /\s+/;
        for (0..$#{$realHeaders}) {
            if($realHeaders->[$_] ne 'skip') {
                push @{$bigStructure[$lineCntr]}, $row[$_];
            }
        }
        $lineCntr++;
    }
    print Dumper \@bigStructure;
    1;
    __DATA__
    nat pls fof tri
    0.1 0.1 0.23 0.1
    2.3 1.8 3.2 4.4
    5.5 3.2 8.6. 7.9

    Yields (Edit: moved $lineCntr to the end; forgot arrays in Perl are zero-based):

    $VAR1 = [
      [ 'nat', 'pls' ],
      [ '0.1', '0.1' ],
      [ '2.3', '1.8' ],
      [ '5.5', '3.2' ]
    ];

    This entire program could probably be reduced to a single line or a few lines. It works by creating an intermediate structure, an array of so-called 'real' headers, with the 'skip' string wherever you do not want that column's value to be propagated. Then iterating over the row values is trivial.
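    As a rough sketch of what such a reduction might look like, the version below replaces the 'skip' placeholders with a list of wanted column indexes and pulls them out with a list slice. It is one possible shortening, not the program above: the names `%want`, `@keep`, and `$data` are introduced here, and the sample data is held in a string rather than in __DATA__.

```perl
use strict;
use warnings;
use Data::Dumper;

my @cols = qw(nat pls kac);
my %want = map { $_ => 1 } @cols;

# The same sample data as in the program above, held in a string.
my $data = <<'END';
nat pls fof tri
0.1 0.1 0.23 0.1
2.3 1.8 3.2 4.4
5.5 3.2 8.6. 7.9
END

my @lines  = split /\n/, $data;
my @header = split ' ', $lines[0];

# Indexes of the wanted header fields, kept in file order.
my @keep = grep { $want{ $header[$_] } } 0 .. $#header;

# One list slice per line (header included) builds the array of arrays.
my @bigStructure = map { [ (split ' ', $_)[@keep] ] } @lines;

print Dumper \@bigStructure;
```

    Because `@keep` is an array of indexes taken in file order, the selected columns come out in the same order as they appear in the file.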

    Looking at your pseudocode tells me that, like a lot of people learning Perl, you start thinking about a programming problem by first looking at the would-be operations. However, I recommend that you revise that thinking to start by considering the data structures instead. When I did this I recognized that I really should be using arrays rather than hashes for most of my program because I want to keep everything in the same order.

    Lastly, your example would actually only print two columns since nat and pls are both valid columns but fof and tri are not.

    Celebrate Intellectual Diversity

      Thanks a lot. You are right, I am new to both Perl and programming.
Re: searching an array member in file header and print a column
by 2teez (Vicar) on May 27, 2014 at 19:54 UTC

    Hi shalgham,
    Without taking anything away from the previous great posts, I think (that is, if I understand your question correctly) what you need to do is get the indexes of the columns you want and print those fields from every line, since all the values for a column sit at the same index in each row.
    Something like this, using a modified version of the data given by InfiniteSilence:

    #!/usr/bin/perl -l
    use warnings;
    use strict;

    # get the header into an array
    my @header = split /\s+/, <DATA>;

    # conjure a regex using the name of columns you wanted
    # which is the array you have
    my $to_find = join( '|' => qw(nat pls kac) );
    # anchored so that only whole header names match, not substrings
    my $reg_to_use = qr/^(?:$to_find)$/;

    # get the index of the header you need
    my @index = grep { $header[$_] =~ /$reg_to_use/ } 0 .. $#header;

    print join( "\t" => @header[@index] );
    while (<DATA>) {
        print join( "\t" => ( split /\s+/, $_ )[@index] );
    }
    __DATA__
    nat pls fof tri pls fof tri kac
    0.1 0.1 0.23 0.1 0.1 0.23 0.1 0.31
    2.3 1.8 3.2 4.4 1.8 3.2 4.4 3.21
    5.5 3.2 8.6. 7.9 3.2 8.6. 7.9 2.89
    Output:
    nat  pls  pls  kac
    0.1  0.1  0.1  0.31
    2.3  1.8  1.8  3.21
    5.5  3.2  3.2  2.89
    Note: I don't know what your real dataset looks like, and the above might not be the best fit for your situation, but I hope it gives you a head start.
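    Since the question describes a tab-delimited file, a variant along the following lines may be closer to the real data. It is only a sketch under assumptions: the helper name `select_columns`, the in-memory sample, and the choice to report requested names that are missing from the header are illustrative additions, not part of the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the selected columns of every line (header first)
# and the requested names that were not found in the header.
sub select_columns {
    my ($fh, @names) = @_;
    my %want = map { $_ => 1 } @names;

    my $first = <$fh>;
    return ([], [@names]) unless defined $first;
    chomp $first;
    my @header = split /\t/, $first, -1;    # -1 keeps trailing empty fields

    # Exact-match lookup of each header field in the hash of wanted names.
    my @index   = grep { $want{ $header[$_] } } 0 .. $#header;
    my %seen    = map { $_ => 1 } @header;
    my @missing = grep { !$seen{$_} } @names;

    my @out = ( join "\t", @header[@index] );
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /\t/, $line, -1;
        # Short rows yield undef for absent fields; print those as empty strings.
        push @out, join "\t", map { defined $_ ? $_ : '' } @fields[@index];
    }
    return (\@out, \@missing);
}

# Made-up tab-delimited sample, read through an in-memory filehandle.
my $sample = "nat\tpls\tfof\ttri\n"
           . "0.1\t0.1\t0.23\t0.1\n"
           . "2.3\t1.8\t3.2\t4.4\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

my ($lines, $missing) = select_columns($fh, qw(nat pls kac));
print "$_\n" for @$lines;
warn "not in header: @$missing\n" if @$missing;
```

    With a real file, the in-memory handle would be replaced by an ordinary `open` on the file's path.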

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me

Node Type: perlquestion [id://1087532]
Approved by Corion