http://qs321.pair.com?node_id=1087551


in reply to searching an array member in file header and print a column

The normal, correct answer to a question like this on PerlMonks is for you to go and spend some more time learning Perl. However, I think your problem has less to do with Perl and more to do with understanding data structures and logical flow, so I'll try to address these with an example that closely matches your problem:

#!/usr/bin/perl -w
use strict;
use Data::Dumper;

my @cols = qw(nat pls kac);   # column names we want to keep
my @bigStructure = ();        # one array ref per input line
my $realHeaders;              # header names, with 'skip' for unwanted columns
my $lineCntr = 0;

# Keep each header name that is in @cols; replace every other name with 'skip'.
sub procHeader {
    my $hdrline = shift;
    my @instArray = ();
    my %hash = map {$_=>1} @cols; #thanks kcott
    my @contents = split /\s+/,$hdrline;
    for(@contents) {
        if($hash{$_}) {
            push @instArray,$_;
        } else {
            push @instArray, qw|skip|;
        }
    }
    return \@instArray;
}

while(<DATA>) {
    # A line that does not start with a digit is treated as the header.
    if(!m/^\d/){$realHeaders = procHeader($_)};
    my @row = split /\s+/;
    # Copy only the values whose column is not marked 'skip'.
    for (0..$#{$realHeaders}) {
        if($realHeaders->[$_] ne 'skip') {
            push @{$bigStructure[$lineCntr]}, $row[$_];
        }
    }
    $lineCntr++;
}

print Dumper \@bigStructure;
1;
__DATA__
nat pls fof tri
0.1 0.1 0.23 0.1
2.3 1.8 3.2 4.4
5.5 3.2 8.6. 7.9

Yields (Edit: moved the $lineCntr increment to the end of the loop; I forgot that arrays in Perl are zero-based):

$VAR1 = [
          [
            'nat',
            'pls'
          ],
          [
            '0.1',
            '0.1'
          ],
          [
            '2.3',
            '1.8'
          ],
          [
            '5.5',
            '3.2'
          ]
        ];

This entire program could probably be reduced to one line or a few. It works by building an intermediate structure, an array of so-called 'real' headers, which holds the string 'skip' in every position whose value should not be propagated. Iterating over the row values is then trivial.
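As a rough idea of what a shorter version might look like, here is a minimal sketch that swaps the 'skip' markers for a list of wanted column positions and then takes a list slice of every row. The sub name select_columns and the inline @lines array (standing in for the DATA section) are my own inventions for illustration, so treat this as one possible variant rather than the definitive way to do it:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original program): returns every
# line of @$lines reduced to the columns whose header name is in @$wanted.
# The first element of @$lines is assumed to be the header line.
sub select_columns {
    my ($wanted, $lines) = @_;
    my %want  = map { $_ => 1 } @$wanted;
    my @names = split ' ', $lines->[0];

    # Positions of the wanted columns, in the order they appear in the file.
    my @idx = grep { $want{ $names[$_] } } 0 .. $#names;

    # A list slice picks the same positions out of each line.
    return [ map { [ (split ' ', $_)[@idx] ] } @$lines ];
}

# Stand-in for the DATA section of the original program.
my @lines = (
    'nat pls fof tri',
    '0.1 0.1 0.23 0.1',
    '2.3 1.8 3.2 4.4',
    '5.5 3.2 8.6. 7.9',
);

my $rows = select_columns([qw(nat pls kac)], \@lines);
print join("\t", @$_), "\n" for @$rows;
```

The design choice is the same as above: work out once, from the header, which positions matter, and then apply that decision to every row without looking at the names again.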

Looking at your pseudocode tells me that, like a lot of people learning Perl, you start thinking about a programming problem by first looking at the would-be operations. However, I recommend that you revise that thinking to start by considering the data structures instead. When I did this I recognized that I really should be using arrays rather than hashes for most of my program because I want to keep everything in the same order.
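To make the arrays-versus-hashes point concrete, here is a small sketch in which a hash is used only to answer "is this name wanted?" while an array carries the order. The variable names @header, %wanted, and @kept are assumptions chosen for this illustration:

```perl
use strict;
use warnings;

my @header = qw(nat pls fof tri);              # order as it appears in the file
my %wanted = map { $_ => 1 } qw(nat pls kac);  # membership lookup only

# grep walks @header from left to right, so @kept is always in file order.
my @kept = grep { $wanted{$_} } @header;
print "@kept\n";    # prints "nat pls"

# By contrast, keys %wanted carries no ordering guarantee, so a hash
# on its own cannot tell you which column came first.
```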

Lastly, your example would actually print only two columns, since nat and pls are in your list of wanted columns but fof and tri are not.

Celebrate Intellectual Diversity