SFTP more than 50+ files

by msk_0984 (Friar)
on Jan 31, 2008 at 16:52 UTC

msk_0984 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Respected Monks,

I have been using Net::SSH2 and implemented this module in one of our critical projects: a web-based interface for distributing files to remote systems. Initially this project was implemented with the Net::FTP module; the client was very happy with the product but wanted to make it secure, so he asked us to use an SSH module, which was secure enough for him (as we have to stay with the SSH protocol anyway).

Initially we went for Net::SSH::Perl, but we faced a lot of dependencies, and even after successful installation of all the modules we could see that the application used very high CPU while running on the Solaris platform.

We then went for the Net::SSH2::SFTP module, which was a new one; it is handy and easy to use. So we went along and developed the application, which works very well at our end. But later the client, as usual, came back and said that he wants to transfer 50-60+ files to 30+ remote systems. We used Net::SSH2 and threads to connect to the remote systems in parallel. When he tried that, the application took around 15 minutes, with CPU utilization reaching maximum levels (90%+), and as expected the browser also timed out, which brought the product to a halt.

My main concern is: can we transfer 50-60+ files to the remote systems with less CPU utilization? We checked the available Perl modules and could not find a better module for this scenario. Is there any module that could be handy for this requirement?

This is the part of the code that transfers the files to the remote systems:

my $ssh2 = Net::SSH2->new();
die " can't connect " unless $ssh2->connect($h);
die " can't authenticate " unless $ssh2->auth(username => $u, password => $p);
$ftp_res = $ssh2->scp_put("$flname", "$full_path");   ### (local_path, remote_path)
Hope you can do the needful.
Thanks in advance,

Sushil Kumar

Replies are listed 'Best First'.
Re: SFTP more than 50+ files
by samtregar (Abbot) on Jan 31, 2008 at 17:10 UTC
    We used Net::SSH2 and threads to connect to the remote systems in parallel. And when he tried the same, the application was taking around 15 minutes with CPU utilization reaching maximum levels (90%+), and as expected the browser also timed out, which brought the product to a halt.

    This is exactly the kind of work you shouldn't do in the web server process while a browser is waiting for a response. Instead, queue the job for a background daemon to process. You can show your client a pretty progress screen with updates from the background job. If you don't already have some sort of job queue you might consider Gearman (which I've used) or TheSchwartz (which I have not).
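
    As a hedged illustration of handing the work to a queue, here is a sketch using Gearman::Client; the 'transfer_files' worker name, the argument structure and the server address are assumptions, not anything from the original post.

        use Gearman::Client;
        use Storable qw(freeze);

        my $client = Gearman::Client->new;
        $client->job_servers('127.0.0.1:4730');   # assumption: a gearmand running locally

        # serialize the host and file lists and queue the job in the background
        my $args   = freeze({ hosts => \@hosts, files => \@files });
        my $handle = $client->dispatch_background('transfer_files', \$args);

        # keep $handle somewhere the web app can poll to show a progress screen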

    Now 15 minutes might be too long even if it's done in the background. If that's the case your next step should be to use a profiler (like Devel::DProf or Devel::Profiler) to figure out what's taking all the time. You might find it's something unexpected like a slow NFS share or DNS lookup timeouts.

    -sam

Re: SFTP more than 50+ files
by ides (Deacon) on Jan 31, 2008 at 17:03 UTC

    If these files are just updates of already existing files, I would suggest running rsync via a cron job (via SSH), and you should see a drastic reduction in load. A minimal sketch of that idea follows below.

    If these are truly new files being pushed around then it obviously isn't appropriate.
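
    Here is a minimal sketch of the rsync-over-SSH idea driven from Perl, suitable for a cron job; the paths, user and host list are placeholders, not part of the original post.

        # push one directory to each host over ssh; -a preserves attributes, -z compresses
        for my $host (@hosts) {
            system('rsync', '-az', '-e', 'ssh',
                   '/local/dist/dir/',
                   "$user\@$host:/remote/dist/dir/") == 0
                or warn "rsync to $host failed: $?";
        }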

    Frank Wiles <frank@revsys.com>
    www.revsys.com

Re: SFTP more than 50+ files
by zentara (Archbishop) on Jan 31, 2008 at 17:25 UTC
    You haven't really shown what you are doing, but one simple enhancement would be to zip (tar) all the files going to a certain server, so you are only sending a single file to it (instead of 50+). Then log in again and unpack the file. That would reduce each server transaction to one file receive and an SSH2 logon to unpack it. It may be possible to do it all in a single SSH2 session with shell usage, but it may be easier to do it as a two-step design.
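
    A rough sketch of that idea with Net::SSH2, done here in a single session; the file list, paths and credentials are placeholders:

        use Net::SSH2;

        my @files  = ('rules.dat', 'lookup.dat', 'include.dat');
        my $bundle = '/tmp/dist_bundle.tar.gz';
        my ($host, $user, $pass) = ('remote.host', 'user', 'secret');

        # bundle everything locally so only one file crosses the wire per server
        system('tar', 'czf', $bundle, @files) == 0 or die "tar failed: $?";

        my $ssh2 = Net::SSH2->new();
        $ssh2->connect($host)                             or die "can't connect";
        $ssh2->auth(username => $user, password => $pass) or die "can't authenticate";

        # a single transfer instead of 50+
        $ssh2->scp_put($bundle, '/tmp/dist_bundle.tar.gz') or die "scp_put failed";

        # unpack over the same session instead of logging in again
        my $chan = $ssh2->channel();
        $chan->exec('cd /target/dir && tar xzf /tmp/dist_bundle.tar.gz');
        $chan->close;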

    I'm not really a human, but I play one on earth. Cogito ergo sum a bum
      Thank you, one and all, for your valuable suggestions and prompt replies, which have given me a lot to work with and to learn from.

      The application is designed with an interface so that the user can select files (includes / lookups / rules files) on the fly and distribute them to the remote systems (through threading). Currently we are planning to zip the selected files and unzip them on the remote systems. We are implementing this solution at our end and will have to check how it performs in production.

      Hope this fares well. Is there anything else that can be done to make it more effective, so that it does not take as much CPU utilization or time?

      Thank You Very Much Again.

      Sushil Kumar
Re: SFTP more than 50+ files
by sundialsvc4 (Abbot) on Jan 31, 2008 at 20:33 UTC

    Several suggestions:

    1. If you have a large number of secure file-transfers to do to remote locations, VPN is a great thing to use. A VPN-enabled router at home-office can connect to a VPN-enabled router in each remote site, using digital certificates, and when you do that, everything that you send down that pipe will automatically be very secure. Yet, this security is completely transparent to the clients. You might need a more-expensive router at home-office, but every router you get at any office-supply store these days is going to have VPN capability built-in.
    2. It's fine to use threads to parallelize file transfers, but you should never have “one thread per request.” Instead, visualize a thread as being just a worker-bee (a rough sketch follows after this list):
      • Each worker-bee removes a work-request from a queue, carries it out, and then loops to get another request from the queue. When no more requests can be found (after a reasonable timeout), the thread politely dies.
      • You should devise the system so that the number of threads can be easily set, and it will be based on the capacity of your I/O hardware (and the effective bandwidth of your network).
      • With experimentation, you'll find a good balance of number-of-threads. Generally speaking, you'll see that performance improves up to a point, then it starts to degrade in a much-worse-than-linear fashion; the so-called “hitting the brick wall effect.”
      • If a thread isn't able to complete a request, such as “nobody at the other end seems to be answering the phone,” the thread could respond to this problem by marking-up the request record in some way and throwing it back onto the queue. Then it grabs the next request from the same queue and keeps going. Requests that fail repeatedly would eventually have to be discarded.
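
    A rough sketch of that worker-bee model using threads and Thread::Queue; transfer_one() and @requests are hypothetical placeholders, and the worker count is something you would tune:

        use threads;
        use Thread::Queue;

        my $n_workers = 5;                  # tune to your I/O hardware and network bandwidth
        my $queue     = Thread::Queue->new();

        # each work item: [ $host, $local_file, $remote_path ]
        $queue->enqueue($_) for @requests;
        $queue->enqueue(undef) for 1 .. $n_workers;   # one "no more work" marker per worker

        my @workers = map {
            threads->create(sub {
                while (defined(my $job = $queue->dequeue())) {
                    my ($host, $file, $dest) = @$job;
                    eval { transfer_one($host, $file, $dest); 1 }
                        or warn "transfer to $host failed: $@";  # could mark and re-queue instead
                }
            });
        } 1 .. $n_workers;

        $_->join for @workers;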

    As usual, there is a lot of support for this sort of thing out there already in CPAN. Be creative in your search.

Re: SFTP more than 50+ files
by dwm042 (Priest) on Jan 31, 2008 at 21:54 UTC
    Work I've done has involved supporting Perl-based transmission code, including SFTP. We handled two orders of magnitude more files (per day) to two orders of magnitude more locations, but we launched single jobs for single files. We were using an old, obsolete SPARC server and did not have load issues of any kind unless:

    1. Someone tried to PGP encrypt too big a file.

    2. A transmission broke and someone tried to send all the queued files at once. Feeding them into the account a few at a time was far superior.

    You really want to stagger the connections, as opposed to doing them all at once.
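
    A tiny sketch of that staggering idea; start_transfer() is a hypothetical sub, and the batch size and delay are arbitrary placeholders:

        my $batch_size = 5;
        my @pending    = @hosts;

        while (my @batch = splice(@pending, 0, $batch_size)) {
            start_transfer($_) for @batch;   # launch a few transfers...
            sleep 30;                        # ...and let them drain before the next batch
        }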

Re: SFTP more than 50+ files
by salva (Canon) on Jan 31, 2008 at 21:25 UTC
    Are you using an x86 or a SPARC box?

    The SPARC architecture is not very good for encryption, and if you use gcc to compile perl and its modules it is even worse; using the compiler from Sun could significantly improve your code's performance.

    If you are transferring the same 60 files to 30 machines, it is like encrypting 1800 files. You can use GPG to encrypt the files just once and then send them over unencrypted FTP, HTTP or SMTP connections. That would probably reduce the CPU requirements by an order of magnitude.
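
    A hedged sketch of that encrypt-once idea, using gpg plus plain Net::FTP; the hosts, credentials, key name and file name are placeholders:

        use Net::FTP;

        my @hosts         = ('host1', 'host2');
        my ($user, $pass) = ('ftpuser', 'secret');
        my $file          = 'rules.dat';

        # encrypt once, up front, instead of once per connection
        system('gpg', '--batch', '--yes', '-r', 'dist-key', '-o', "$file.gpg", '-e', $file) == 0
            or die "gpg failed for $file";

        for my $host (@hosts) {
            my $ftp = Net::FTP->new($host) or die "can't connect to $host";
            $ftp->login($user, $pass)      or die "login failed on $host";
            $ftp->binary;
            $ftp->put("$file.gpg")         or die "put failed on $host";
            $ftp->quit;
        }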

    You can also try Net::SFTP::Foreign; it is usually faster than the alternatives, though in this particular case, where the CPU is the bottleneck, I am not sure...
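
    For reference, a minimal Net::SFTP::Foreign example (host, credentials and paths are placeholders):

        use Net::SFTP::Foreign;

        my $sftp = Net::SFTP::Foreign->new('remote.host', user => 'user', password => 'secret');
        $sftp->error and die "SSH connection failed: " . $sftp->error;

        $sftp->put('rules.dat', '/remote/path/rules.dat')
            or die "put failed: " . $sftp->error;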

    Finally, there is lftp, a sophisticated multiprotocol (ftp, sftp, http, etc.) file transfer client. It can handle a high number of transfers in parallel and is scriptable. You can write a script for it from perl and then invoke it as an external program.

Re: SFTP more than 50+ files
by Anonymous Monk on Jul 07, 2008 at 07:01 UTC
    Hi Sushil, I am also getting the same problem. Can you please let me know how you have done this? My mail id is vikas_poonia@yahoo.com. Can you please share your mail id? Please drop me a mail.
