Find files recursively and get their attributes

by sriram83.life (Acolyte)
on Apr 23, 2014 at 12:15 UTC ( [id://1083341]=perlquestion )

sriram83.life has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I have a requirement to find zip, tar, jar, tar.gz, etc. files recursively in a directory and its subdirectories and extract them. After extracting, I need to find a specific file and report its name and size in an XML file.

I am able to write the code to dump a file's size and name to an XML file, but I am not able to extract the archives recursively and pick up the specific files I need to report in the XML file.

If the file found is a .swtag file, I store its size and its contents in a hash reference for later use.

Here is my code using the File::Find module.

my $min_depth = 0;

find( { wanted => \&wanted, }, @dirs );

sub wanted {
    my $depth = $File::Find::dir =~ tr[/][];
    return if $depth < $min_depth;

    if ( $File::Find::name =~ m/.zip\z$|file.*?\z$|.bak\z$|.jar\z$|.war\z$/g ) {
        print "I am extracting zip file : $File::Find::name\n";
        system("unzip -o $File::Find::name");
    }
    if ( $File::Find::name =~ m/.tar.gz\z$/ ) {
        &execute_command("tar -xzvf $File::Find::name");
    }
    if ( $File::Find::name =~ m/.iso\z$/ ) {
        system("mkdir /tmp/mnt");
        system("mount -o loop $File::Find::name /tmp/mnt");
        my @dirs = ( '/tmp/mnt' );
        find ( { wanted => \&wanted, }, @dirs );
        system("unmount /tmp/mnt");
        system("rm -rf /tmp/mnt");
    }
    if ( $File::Find::name =~ m/.SYS$|.sys$|.sys2$|.SYS2$|.cmptag$|.swtag$|.swidtag$|.tag$|.fxtag$/ ) {
        print "$File::Find::name\n";
        my $fsize = stat($File::Find::name);
        $fileparamhash->{$File::Find::name}->{Size} = $fsize->size;
        if ( $File::Find::name =~ m/.swtag$/ ) {
            my $parser = XML::LibXML->new;
            my $doc = $parser->parse_file("$File::Find::name");
            my $FileDescription;
            my $versionname;
            my @filedesc = $doc->getElementsByTagName("ProductName");
            foreach ( @filedesc ) {
                $FileDescription = $_->textContent;
            }
            my @versions = $doc->getElementsByTagName("ProductVersion");
            foreach ( @versions ) {
                $versionname = $_->textContent;
            }
            $fileparamhash->{$File::Find::name}->{FileDescription} = $FileDescription;
            $fileparamhash->{$File::Find::name}->{FileVersion} = $versionname;
        }
    }
}
}

Can anyone look into my buggy code and help me? It is urgent.

Many Thanks, Sriram

Replies are listed 'Best First'.
Re: Find files recursively and get their attributes
by davido (Cardinal) on Apr 23, 2014 at 14:43 UTC

    All of your regular expressions are broken because they assume that '.' is treated as a literal character when, in fact, within a regular expression dot is a metacharacter that means "match anything except for newline" (in the default case, as you are using it). Also I'm not sure if your use of "*" in the regexes is intended to be a quantifier, or a wildcard. If the latter, that's another bug.

    At the least, you need to escape your '.' characters within the regular expressions using backslash: "\.".
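To illustrate the escaping point, here is a minimal sketch. The helper name classify and the grouping of extensions are made up for this example, and it covers only some of the extensions from the question. Each dot is escaped, \z anchors at the very end of the string, and /i replaces the separate upper- and lower-case alternatives (which also admits mixed case such as .Sys, an assumption the original code does not make).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): classify a path by
# its extension. "\." matches a literal dot, "\z" anchors at the very end
# of the string, and /i makes the match case-insensitive.
sub classify {
    my ($path) = @_;
    return 'tar.gz' if $path =~ /\.tar\.gz\z/i;
    return 'zip'    if $path =~ /\.(?:zip|jar|war)\z/i;
    return 'tag'    if $path =~ /\.(?:sys2?|cmptag|swtag|swidtag|fxtag|tag)\z/i;
    return 'other';
}

# The unescaped pattern /.zip\z/ also accepts a name like "myzip",
# because "." matches the "y". The escaped version rejects it.
for my $path ('lib/app.jar', 'dist/pkg.tar.gz', 'backup/myzip') {
    printf "%-16s %s\n", $path, classify($path);
}
```

With the unescaped patterns, a file merely named myzip would have been handed to unzip.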

    Your problem description is vague. I'd rather know what output you are getting, and what you would like to get.


    Dave

      Thanks, Dave, for your reply. I will change my regular expressions and try capturing the required files.

      The output I want is an XML file that lists all the files ending with .swtag, .sys, .sys2 and .cmptag, with their sizes and their full paths.

      The output I am actually getting is an empty XML file with only the beginning and ending tags. The files I need are compressed within zip and tar files. After uncompressing those archives from inside the File::Find callback, my desired files are not reported in the XML file.
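One possible reason the extracted files never show up: by default File::Find changes into each directory it visits, so a relative $File::Find::name handed to unzip or tar may no longer resolve. As far as I can tell, it also reads a directory's entries before calling wanted on them, so files unpacked into that same directory during the walk may never be visited. A rough two-pass sketch that avoids both problems follows. The helper names extract_cmd, files_matching and scan are invented for this example, it assumes unzip and tar are on the PATH, and it does not unpack archives nested inside other archives.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: choose the extraction command for an archive.
# Returns an empty list if the file is not a recognized archive.
sub extract_cmd {
    my ($archive, $dest) = @_;
    return ('tar', '-xzf', $archive, '-C', $dest)
        if $archive =~ /\.tar\.gz\z/i;
    return ('unzip', '-o', '-q', $archive, '-d', $dest)
        if $archive =~ /\.(?:zip|jar|war)\z/i;
    return;
}

# Hypothetical helper: all plain files under @dirs whose full path
# matches $pattern. no_chdir keeps $_ equal to $File::Find::name.
sub files_matching {
    my ($pattern, @dirs) = @_;
    my @hits;
    find({ no_chdir => 1,
           wanted   => sub { push @hits, $_ if -f $_ && $_ =~ $pattern } },
         @dirs);
    return sort @hits;
}

# Pass 1 unpacks every archive into its own scratch directory.
# Pass 2 walks the original tree plus the scratch tree and records sizes.
sub scan {
    my @dirs    = @_;
    my $scratch = tempdir(CLEANUP => 1);   # removed when the program exits
    my $n       = 0;
    for my $archive (files_matching(qr/\.(?:tar\.gz|zip|jar|war)\z/i, @dirs)) {
        my $dest = "$scratch/" . $n++;
        mkdir $dest or die "mkdir $dest: $!";
        # List-form system() bypasses the shell, so odd file names are safe.
        system(extract_cmd($archive, $dest)) == 0
            or warn "could not extract $archive\n";
    }
    my %size;
    $size{$_} = -s $_
        for files_matching(qr/\.(?:sys2?|cmptag|swtag)\z/i, @dirs, $scratch);
    return \%size;
}

# Usage: perl scan.pl DIR [DIR ...]
if (@ARGV) {
    my $size = scan(@ARGV);
    print "$_\t$size->{$_}\n" for sort keys %$size;
}
```

The returned hash reference maps each full path to its size in bytes, so it could feed the XML-writing code that already works. Paths under the scratch directory stop existing once the program exits, so anything that needs the file contents (such as parsing a .swtag file) has to happen before then.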
