http://qs321.pair.com?node_id=707243


in reply to greater efficiency required (ls, glob, or readdir?)

For efficiency you want something like this:

my $dir = '/Path/to/a/data/directory';

opendir my $DH, $dir or die "Cannot open '$dir' $!";

my %hash;
while ( my $file = readdir $DH ) {
    next if $file =~ /~$/;
    open my $FH, '<', "$dir/$file" or die "Cannot open '$dir/$file' $!";
    while ( my $line = <$FH> ) {
        next if /^#/ || !/\S/;
        my ( $key, @values ) = split /\t/;
        $hash{ $file }{ $key } = \@values;
    }
}

Update: I realised that the inner while loop is incorrect. It reads each line into $line, but the regex tests and the split operate on $_. It should be:

while ( my $line = <$FH> ) {
    next if $line =~ /^#/ || $line !~ /\S/;
    my ( $key, @values ) = split /\t/, $line;
    $hash{ $file }{ $key } = \@values;
}

Re^2: greater efficiency required (ls, glob, or readdir?)
by linuxer (Curate) on Aug 27, 2008 at 19:09 UTC

    What about skipping the directory entries?

    my $dir = '/Path/to/a/data/directory';

    opendir my $DH, $dir or die "Cannot open '$dir' $!";

    my %hash;
    while ( my $file = readdir $DH ) {
        next if $file =~ /~$/;
        next if -d "$dir/$file";   # should also skip '.' and '..' entries
        # read and process file
    }

    Update: fixed the path in the -d test to use "$dir/$file".

Re^2: greater efficiency required (ls, glob, or readdir?)
by jperlq (Acolyte) on Aug 27, 2008 at 20:09 UTC
    Thanks, I got your version to work with very few changes.
    my $dir = '/path/to/data/directory';
    my %hash;

    opendir my $DH, $dir or die "cannot open '$dir' $!";
    while (my $file = readdir $DH ) {
        next if $file =~ /~$/;
        next if -d $file;
        open my $FH, "<", "$dir/$file" or die "Cannot open '$dir/$file' $!";
        while ( my $line = <$FH> ) {
            next if /^#/ || !length($line);
            my ($key, @values ) = split(/\t/, $line);
            $hash{ $file }{ $key } = \@values;
        }
    }
    It even seems to work quite a bit faster than the ls/cat combo.

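      You need to change:
          next if -d $file;
      to:
          next if -d "$dir/$file";
      readdir returns bare names, so -d $file is tested relative to the current working directory rather than $dir.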

      You need to change:
          next if /^#/ || !length($line);
      to:
          next if $line =~ /^#/ || !length($line);
      A bare /^#/ matches against $_, but the line was read into $line.
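
Pulling the corrections from this thread together, a minimal sketch might look like the following. The sub name load_dir is hypothetical, and the chomp, the explicit defined() around readdir, and the close/closedir calls are additions that do not appear in the posts above. The blank-line test uses the $line !~ /\S/ form from the update rather than length.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read every plain file in $dir
# into a hash of hashes keyed by file name, then by the first tab-separated
# field of each line.
sub load_dir {
    my ($dir) = @_;

    opendir my $DH, $dir or die "Cannot open '$dir' $!";

    my %hash;
    # defined() keeps the loop going even for a file literally named "0"
    while ( defined( my $file = readdir $DH ) ) {
        next if $file =~ /~$/;       # skip editor backup files
        next if -d "$dir/$file";     # skip '.', '..' and subdirectories

        open my $FH, '<', "$dir/$file" or die "Cannot open '$dir/$file' $!";
        while ( my $line = <$FH> ) {
            chomp $line;             # assumption: the trailing newline is unwanted
            next if $line =~ /^#/ || $line !~ /\S/;   # comments and blank lines
            my ( $key, @values ) = split /\t/, $line;
            $hash{$file}{$key} = \@values;
        }
        close $FH;
    }
    closedir $DH;

    return \%hash;
}
```

Calling load_dir($dir) returns a hash reference, so the list of values for one key in one file is reached as @{ $data->{$file}{$key} }.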