PerlMonks  

Parsing files in a directory using a list.

by oxalate (Initiate)
on Apr 08, 2018 at 19:01 UTC ( [id://1212550]=perlquestion )

oxalate has asked for the wisdom of the Perl Monks concerning the following question:

Hi there, I'm relatively new to the world of coding, and I'm starting with Perl. Here is my situation: I have a directory of about 9000 files with the extension .pdbqt, and a list of about 300 file names. I want to open a file from the directory only if it is in the list, and extract specific information from it. I have figured out the code for extracting the information once a file is open, but I can't figure out how to match each name in my list against the files in my directory. So please help me.

Here's what I have done so far with my little knowledge:

    #!/usr/bin/perl -w
    use strict;
    use warnings;

    my $file = $ARGV[0];
    open (FILE, $file);
    my @array = <FILE>;
    close(FILE);
    chomp(@array);

    opendir(DIR, "path/to/folder/with/my/files");
    my @files = readdir(DIR);
    chomp(@files);

    foreach my $i (@array) {
        if(grep @array[$i], @files) {
            my $match = (grep @array[$i], @files);

Here's my code for when I finally manage to match the element in the list with the files.

    open(FILE, $i);
    my @files=<FILE>;
    chomp @files;
    for my $j (@files){
        my $id = $i;
        $id =~ s/out_//g;
        $id =~ s/\.pdbqt//g;
        if($j =~ "REMARK VINA RESULT:"){
            my $str = substr($j,25,5);
            print "$id\t$str\n";
        }
    }

Thank you so much for your help.

Replies are listed 'Best First'.
Re: Parsing files in a directory using a list.
by Lotus1 (Vicar) on Apr 08, 2018 at 20:24 UTC

    Your approach of testing all 9000 files against your list can be made to work, but there is a simpler alternative. You can use Perl's file tests to check, given the path, whether each of the 300 files in your list exists. If a file exists, process it. In this example I'm using -e to test for the existence of the file.

        use warnings;
        use strict;

        my $path = "path/to/folder/with/my/files";
        my @list = qw( a b c d );

        foreach my $file ( @list ){
            print "found $file\n" if -e "$path/$file";
        }

    A suggestion that will save you a lot of struggling: start with something small and get it working before adding the next step. First get the part working that reads your list of 300 files. Once you can print it back out to the console with one file name per array element, check for the existence of each file and print the name if it is found. Then keep adding small steps. You wouldn't try to build a whole building at once and then fix it, so why try that with a program?
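The step-by-step approach above could be sketched roughly as follows, assuming the list file holds one file name per line. The sub names existing_files and vina_scores and the command-line argument order are made up for illustration; the substr offsets and the "REMARK VINA RESULT:" match are taken from the original question.

```perl
use strict;
use warnings;

# Return the names from @$names that exist as files under $dir.
sub existing_files {
    my ($dir, $names) = @_;
    return grep { -e "$dir/$_" } @$names;
}

# Collect the 5 characters at offset 25 of every
# "REMARK VINA RESULT:" line, as in the original question.
sub vina_scores {
    my ($path) = @_;
    open my $fh, '<', $path or die "cannot read '$path': $!";
    my @scores;
    while (my $line = <$fh>) {
        chomp $line;
        push @scores, substr($line, 25, 5) if $line =~ /REMARK VINA RESULT:/;
    }
    close $fh;
    return @scores;
}

# Assumed usage: perl script.pl list.txt path/to/folder
if (@ARGV == 2) {
    my ($list_file, $dir) = @ARGV;
    open my $in, '<', $list_file or die "cannot read '$list_file': $!";
    chomp(my @list = <$in>);
    close $in;

    for my $name (existing_files($dir, \@list)) {
        (my $id = $name) =~ s/^out_//;   # strip the prefix ...
        $id =~ s/\.pdbqt$//;             # ... and the extension
        print "$id\t$_\n" for vina_scores("$dir/$name");
    }
}
```

Each sub can be tried on its own before the pieces are combined, which fits the advice to grow the program in small steps.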

Re: Parsing files in a directory using a list.
by jimpudar (Pilgrim) on Apr 09, 2018 at 02:20 UTC

    Hello oxalate,

    Here is a tip regarding this section of the code:

        foreach my $i (@array) {
            if(grep @array[$i], @files) {
                my $match = (grep @array[$i], @files);

    First of all, $i is being set to the actual filename, not your array index. That's why nothing seems to be working properly. Try this to see what I mean:

        $ perl -we '
          my @array = qw( file1 file2 file3 );
          foreach my $i (@array) { print "\$i = $i\n" }'
        $i = file1
        $i = file2
        $i = file3

    If you want $i to contain the array index, you could use $#array which will give you the number of array elements minus one (the highest index) along with the range operator:

        $ perl -we '
          my @array = qw( file1 file2 file3 );
          foreach my $i ( 0 .. $#array ) { print "\$i = $i\n" }'
        $i = 0
        $i = 1
        $i = 2

    Also, if you want to index into an array, you should write $array[0], not @array[0] as you have it. The way you have written it, you are taking a slice.
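As a small illustration of the element-versus-slice point, and of one possible way to do the name matching without indexes at all, here is a sketch with made-up file names. The grep block compares strings with eq; this is only one option, not necessarily what the original code was meant to do.

```perl
use strict;
use warnings;

my @array = qw( file1 file2 file3 );          # names from the list
my @files = qw( file0 file2 file3 file9 );    # names found in the directory

# $array[1] is a single element (a scalar); @array[0, 1] is a slice (a list).
my $second    = $array[1];
my @first_two = @array[0, 1];

# Loop over the names themselves and test each one against @files.
my @matches;
foreach my $name (@array) {
    push @matches, $name if grep { $_ eq $name } @files;
}

print "second element: $second\n";    # second element: file2
print "slice: @first_two\n";          # slice: file1 file2
print "matched: @matches\n";          # matched: file2 file3
```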

    This might get you a little bit closer to what you are trying to do...

    Best,

    Jim

Re: Parsing files in a directory using a list.
by Discipulus (Canon) on Apr 08, 2018 at 21:32 UTC
    hello oxalate and welcome to the monastery and to the wonderful world of Perl!

    You are already on the right path (strict and warnings are there), but you could use a few more idioms. First, use the better form of open, with a lexical filehandle and a check that all went well: open my $h, '<', $file or die "cannot read from $file!"

    Another common, idiomatic way of Perl thinking: whenever the uniqueness, quantity, or existence of something matters, reach for a hash. Read the list, populating the hash with each item, and later on you'll be able to check against it as simply as:  .. if exists $files{$current_element}
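A minimal sketch of that hash idiom, using hard-coded, hypothetical names in place of the real list file and directory listing (the hash name %in_list is also just for illustration):

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the ~300-line list and the
# ~9000-entry directory listing returned by readdir.
my @list      = qw( out_a.pdbqt out_c.pdbqt );
my @directory = qw( . .. out_a.pdbqt out_b.pdbqt out_c.pdbqt out_d.pdbqt );

# Build the lookup once: every listed name becomes a hash key.
my %in_list = map { $_ => 1 } @list;

# One pass over the directory; each check is a single hash lookup.
my @to_process = grep { exists $in_list{$_} } @directory;

print "$_\n" for @to_process;
```

With a hash, the cost of each membership check does not grow with the length of the list, whereas a nested grep compares every listed name with every directory entry.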

    But given your numbers (300 files is not a big deal nowadays..) you can also do all the work directly while reading the list file:

        ..
        my $list_file = '/path/to/list.txt';
        open my $read, '<', $list_file or die "cannot read from '$list_file'!";
        while (<$read>){ # this populates $_; it is the same as: while (defined($_ = <$read>))
            chomp;
            my $ret = process ($_) if -e "./$_";
            if ($ret){print "$_ successfully processed\n"}
            else{print "Warning: '$_' was not processed!\n"}
        }
        sub process{
            my $file = shift;
            open ... or return 0
            # read and do something
            close ..
            # implicitly return the last thing: if close succeeded it is 1
        }

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

      Hi Discipulus. I noticed you used a my declaration in a conditional statement. This is something I avoid since it can cause strange and subtle bugs. Refer to this node where Haukex pointed out to me the reason with this quote from perlsyn.

      NOTE: The behaviour of a my, state, or our modified with a statement modifier conditional or loop construct (for example, my $x if ...) is undefined. The value of the my variable may be undef, any previously assigned value, or possibly anything else. Don't rely on it. Future versions of perl might do something different from the version of perl you try it out on. Here be dragons.
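One common way to sidestep the issue is to declare the variable first and assign to it conditionally afterwards. The sketch below assumes a process sub shaped like the outline in the parent reply; its body here and the wrapper name try_process are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the process() sub outlined above:
# returns 1 when the file could be opened and closed, 0 otherwise.
sub process {
    my ($file) = @_;
    open my $fh, '<', $file or return 0;
    # ... read and do something with <$fh> here ...
    return close($fh) ? 1 : 0;
}

# Declare first, then assign conditionally: $ret always starts at 0,
# so its value is well defined whether or not the file exists.
sub try_process {
    my ($file) = @_;
    my $ret = 0;
    $ret = process($file) if -e $file;
    return $ret;
}

for my $file ($0, 'no/such/file.pdbqt') {
    my $ok = try_process($file);
    print $ok
        ? "$file successfully processed\n"
        : "Warning: '$file' was not processed!\n";
}
```

Here the statement modifier applies to a plain assignment rather than to the my declaration, so the perlsyn warning quoted above no longer applies.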
