PerlMonks  

Archive::Tar extract_file

by cajun (Chaplain)
on Jul 24, 2005 at 03:14 UTC ([id://477541])

cajun has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to use Archive::Tar to extract the contents of a large number of tarballs into a given directory. Super Search finds me lots of information on creating tarballs with this module, but I didn't really find anything on extracting the entire tarball. Archive::Tar module review was of some help.

foreach (@sorted) {
    my $tar   = Archive::Tar->new($_);
    my @files = $tar->get_files;
    foreach my $file (@files) {
        $tar->extract_file($file, "./EXTRACTED/$_");
    }
}

This snippet of code gives me: "No such file in archive: 'Archive::Tar::File=HASH(0xa1189a4)'". Which I believe means $file is a hash ref. But I'm not sure where to go from here.

Where am I going wrong here?

Thanks,
Mike

Update: Thanks tlm & itub, list_files was the issue!

Replies are listed 'Best First'.
Re: Archive::Tar extract_file
by tlm (Prior) on Jul 24, 2005 at 03:41 UTC

    Try list_files instead of get_files. That should give you a list of filenames instead of a list of Archive::Tar::File objects. Also, see the Archive::Tar docs for more details.

    the lowliest monk

Re: Archive::Tar extract_file
by itub (Priest) on Jul 24, 2005 at 03:42 UTC
    For that code to work the way you expect, you need to use list_files instead of get_files.
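The fix suggested in the two replies above can be sketched roughly as follows. The helper name extract_each and the destination layout are hypothetical, not from the thread. One assumption worth flagging: in current Archive::Tar documentation the optional second argument to extract_file is the full target path including the file name, so this sketch builds a per-entry target rather than passing a directory.

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Spec;

# Hypothetical helper: extract every entry of $tarball beneath $dest_root.
# Returns the list of target paths that were written.
sub extract_each {
    my ($tarball, $dest_root) = @_;

    # new() with an argument reads the archive and returns undef on failure.
    my $tar = Archive::Tar->new($tarball)
        or die "Cannot read $tarball: $Archive::Tar::error";

    my @extracted;
    # list_files returns plain entry names (strings);
    # get_files would return Archive::Tar::File objects instead.
    for my $name ($tar->list_files) {
        # Second argument is the full path to write to, file name included.
        my $target = File::Spec->catfile($dest_root, $name);
        $tar->extract_file($name, $target)
            or die "Cannot extract $name: " . $tar->error;
        push @extracted, $target;
    }
    return @extracted;
}
```

With @sorted from the original question, this might be called as extract_each($_, "./EXTRACTED/$_") for each tarball. Archives that contain explicit directory entries may need extra handling, which this sketch does not attempt.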
Re: Archive::Tar extract_file
by graff (Chancellor) on Jul 24, 2005 at 23:21 UTC
    I'm pleased you found the module review, and I'm sorry it wasn't more clear. (Let me know if you have any suggestions for changes/additions.)

    If you are always extracting the full content of each tar file to some specific place, then this might be a more efficient approach -- note that I'm assuming the tar file name ends in ".tar" or ".tar.gz" or ".tgz", and I'm stripping that off when naming the directory under EXTRACTED -- personally, I think that having directories named foo.tar.gz and so on is a bad idea:

    foreach ( @sorted ) {
        my $tar = Archive::Tar->new($_);
        # strip a trailing .tar, .tar.gz or .tgz from the directory name
        ( my $dest = "./EXTRACTED/$_" ) =~ s/\.t(?:ar(?:\.gz)?|gz)$//;
        mkdir $dest unless -d $dest;
        chdir $dest or die "mkdir/chdir failed on $dest: $!";
        $tar->extract();   # no params: extract full content to cwd
        chdir "../..";     # return to original cwd
    }

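A slightly more defensive variant of the whole-archive approach might look like the sketch below. The names dest_for and extract_whole are hypothetical. It assumes the tarball name ends in ".tar", ".tar.gz" or ".tgz" as described above, creates missing parent directories with File::Path, and remembers the starting directory instead of relying on "../..".

```perl
use strict;
use warnings;
use Archive::Tar;
use Cwd qw(getcwd);
use File::Basename qw(basename);
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper: directory name for a tarball, with the
# .tar / .tar.gz / .tgz suffix stripped, placed under $root.
sub dest_for {
    my ($tarball, $root) = @_;
    (my $name = basename($tarball)) =~ s/\.t(?:ar(?:\.gz)?|gz)\z//;
    return File::Spec->catdir($root, $name);
}

# Hypothetical helper: extract the full content of $tarball into
# its own directory under $root and return that directory.
sub extract_whole {
    my ($tarball, $root) = @_;

    # The archive is read here, before any chdir, so a relative
    # tarball path is still resolved against the starting directory.
    my $tar = Archive::Tar->new($tarball)
        or die "Cannot read $tarball: $Archive::Tar::error";

    my $dest = dest_for($tarball, $root);
    make_path($dest);                      # also creates $root if needed

    my $orig = getcwd();
    chdir $dest or die "chdir to $dest failed: $!";
    my @done = $tar->extract;              # no params: everything, into cwd
    chdir $orig or die "chdir back to $orig failed: $!";

    die "Nothing extracted from $tarball: " . $tar->error unless @done;
    return $dest;
}
```

Something like extract_whole($_, "./EXTRACTED") for @sorted would then replace the loop body.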
    If you really wanted to extract each file individually for some reason (e.g. to divvy them out to different extraction paths depending on some feature), then your loop over files would work better this way:

    for my $file ( $tar->get_files ) {
        my $dataref = $file->get_content_by_ref;
        # open a suitable output file and print $$dataref to it.
        # You can use $file->name to see the tarred path and
        # make subdirs as you see fit.
    }

    Of course, if the tar files are small and/or there are few files involved, you probably won't notice a difference relative to the "list_files() ... extract_file()" approach. (I just noticed it on tar files containing thousands of data files.)
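The per-file loop over get_files could be filled in along these lines. This is only a sketch: extract_routed and its $route callback are hypothetical names, and the routing rule is whatever the caller supplies. It uses full_path rather than name so that long paths stored with a tar prefix are kept whole, and it does not guard against entries whose paths contain "..".

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Basename qw(dirname);
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper: write each regular file in $tarball under the
# directory chosen by $route, a code ref that maps an entry's path
# to a destination root. Returns the list of paths written.
sub extract_routed {
    my ($tarball, $route) = @_;

    my $tar = Archive::Tar->new($tarball)
        or die "Cannot read $tarball: $Archive::Tar::error";

    my @written;
    for my $file ($tar->get_files) {
        next unless $file->is_file;        # skip directories, links, etc.

        my $path   = $file->full_path;     # path as stored in the archive
        my $target = File::Spec->catfile($route->($path), $path);
        make_path(dirname($target));       # create subdirs as needed

        open my $out, '>:raw', $target
            or die "Cannot write $target: $!";
        print {$out} ${ $file->get_content_by_ref };
        close $out or die "Cannot close $target: $!";

        push @written, $target;
    }
    return @written;
}
```

A caller might, for example, pass sub { $_[0] =~ /\.log\z/ ? "logs" : "other" } to send log files to one tree and everything else to another.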
Re: Archive::Tar extract_file
by saintmike (Vicar) on Jul 25, 2005 at 02:13 UTC
    Archive::Tar can be problematic with big tarfiles because it holds the archive in memory. Archive::Tar::Wrapper is a wrapper around the system's tar command and stores extracted files on disk instead. And (coincidence??) it comes with a sample script that seems to do exactly what you want :).
