PerlMonks  

connecting to ftp sites and download certain files based on file extensions

by Anonymous Monk
on Oct 06, 2006 at 13:45 UTC ( [id://576663]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks!
I am new to Perl and I don't know how to solve the following problem:
I want to connect to an FTP site (namely ftp://ftp.ncbi.nih.gov/genomes/Bacteria) and download all of its subfolders locally, but only the files with the extension .faa.
For example, I want to download the folder Bacteria (as I wrote previously) and then, recursively, all of its subfolders (like Acidobacteria_bacterium_Ellin345), keeping only the files with the extension .faa (like NC_008009.faa, which is inside the Acidobacteria_bacterium_Ellin345 folder). I thought of downloading the whole FTP folder with the Linux command wget -r, but the files are rather large and it would take some time. I believe I must store the names of all the folders in a list and then open each folder and apply wget only to the files with the .faa extension. I have no idea how to connect to FTP sites using Perl, though...
Any hints would be greatly appreciated...

Replies are listed 'Best First'.
Re: connecting to ftp sites and download certain files based on file extensions
by odha57 (Monk) on Oct 06, 2006 at 14:22 UTC
    Welcome to using Perl! Here is a basic example that uses the Net::FTP module to get a list of files (in the example, they are .xml files). The program logs in, changes to a directory called bulk_download, gets the list of files, and then fetches each one. At the end, it reports how many files were fetched. Hope this helps!
    #!/usr/bin/perl -w
    use Net::FTP;                       # use the ftp module
    use strict;

    my (@filelist, $file, $ftp, $ftp_count);
    my $host = 'your ip address';
    my $user = 'username';              # user name for login
    my $pass = 'password';              # password for login
    $ftp_count = 0;

    $ftp = Net::FTP->new($host, Debug => 0);   # start an FTP session
    $ftp->login($user,$pass);                  # login
    $ftp->cwd("bulk_download");                # go to the bulk_download directory
    $ftp->binary;                              # make sure we ftp the file as binary
    @filelist = $ftp->ls("*.xml");             # and get the list of .xml files
    foreach $file (@filelist){
        $ftp->get($file);                      # fetch it,
        ++$ftp_count;
    }
    $ftp->quit;
    print "For $host found $ftp_count files\n";
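For the original question (walking the subfolders and keeping only the .faa files), the same Net::FTP calls could be combined roughly as below. This is only a sketch and has not been run against the NCBI server. The sub names files_with_ext and fetch_tree, the command-line arguments, the anonymous login, and the trick of probing each entry with cwd to decide whether it is a directory are all assumptions for illustration, not part of the example above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Net::FTP;
use File::Path qw(make_path);

# Hypothetical helper: keep only the names ending in ".$ext".
sub files_with_ext {
    my ($ext, @names) = @_;
    return grep { /\.\Q$ext\E\z/ } @names;
}

# Hypothetical helper: walk $remote_dir recursively and copy every file
# with extension $ext into the same relative place under $local_dir.
# Returns the number of files fetched.
sub fetch_tree {
    my ($ftp, $remote_dir, $local_dir, $ext) = @_;
    my $count = 0;

    $ftp->cwd($remote_dir)
        or die "Cannot cwd to $remote_dir: ", $ftp->message;
    my @entries = grep { $_ ne '.' && $_ ne '..' } $ftp->ls;

    for my $name (@entries) {
        my $remote = "$remote_dir/$name";
        if ($ftp->cwd($remote)) {
            # Assumption: if cwd succeeds, the entry is a directory.
            $count += fetch_tree($ftp, $remote, "$local_dir/$name", $ext);
        }
        elsif (files_with_ext($ext, $name)) {
            make_path($local_dir);      # create the local folder on demand
            if ($ftp->get($remote, "$local_dir/$name")) {
                $count++;
            }
            else {
                warn "Could not fetch $remote: ", $ftp->message;
            }
        }
    }
    return $count;
}

# Connect only when arguments are given, e.g.:
#   perl fetch_faa.pl ftp.ncbi.nih.gov /genomes/Bacteria Bacteria
if (@ARGV) {
    my ($host, $remote_dir, $local_dir) = @ARGV;
    die "usage: $0 host remote_dir local_dir\n" unless defined $local_dir;

    my $ftp = Net::FTP->new($host, Debug => 0)
        or die "Cannot connect to $host: $@";
    $ftp->login('anonymous', 'anonymous@')   # assumes anonymous access
        or die "Cannot log in: ", $ftp->message;
    $ftp->binary;

    my $count = fetch_tree($ftp, $remote_dir, $local_dir, 'faa');
    $ftp->quit;
    print "Fetched $count .faa files from $host\n";
}
```

Probing every entry with cwd costs one extra round trip per file, so on a large tree it may be worth parsing the output of $ftp->dir instead; the cwd probe is used here only because it avoids depending on the server's listing format.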
Re: connecting to ftp sites and download certain files based on file extensions
by philcrow (Priest) on Oct 06, 2006 at 13:47 UTC
    Sounds like you should try Net::FTP. It can connect to sites and do all the things you need to do: ls, get, etc.

    Phil
