PerlMonks  

Net::SFTP::Foreign frequent connection server stall

by Anonymous Monk
on Jun 19, 2012 at 17:38 UTC ( [id://977128]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have been having a connection stall issue when using Net::SFTP::Foreign to fetch an ~80 MB XML file daily over SFTP. I have searched for about two weeks now and do not have many leads. I've tried setting debug = -1 and running with verbose -vvv, and the latest thing I've tried is setting queue_size = 16. There's really no indication of why it stalls. It transfers about 30 MB and then stops, with no obvious messages in the debug output either. It doesn't stall every time, but I would say 70% of the time. (I'm really at my wit's end.)

These are the versions of things I have to work with. Most are out of my control, but maybe I can ask for another Perl version bump (though I don't know how to find out which is the latest version of Perl supported on Solaris 5.9).

OS Solaris 5.9
Perl 5.8.8
SFTP v 1.35 2005/10/05
Foreign 1.44

I need to solve this issue (obviously), but not sure how else I can proceed. I suppose I can always switch back to FTP but there will be future policies to restrict to SFTP only. Please help!

Replies are listed 'Best First'.
Re: Net::SFTP::Foreign frequent connection server stall
by syphilis (Archbishop) on Jun 19, 2012 at 23:57 UTC
    Hi,
    I use Net::SFTP::Foreign::Backend::Net_SSH2 which (as you may have guessed) uses Net::SSH2 for the sftp transfers.
    I find it very good, though I don't think I've ever had to transfer anything bigger than about 10 megabytes.

    I used to transfer using scp via Net::SSH2, but sometimes experienced incomplete transfers. However, I think that problem (in Net::SSH2) has since been fixed.

    Cheers,
    Rob
Re: Net::SFTP::Foreign frequent connection server stall
by zentara (Archbishop) on Jun 19, 2012 at 17:54 UTC
    You didn't mention trying Net::SSH2, it might be worth a try.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use Net::SSH2;

    my $ssh2 = Net::SSH2->new();
    $ssh2->connect('localhost') or die "Unable to connect Host $@ \n";
    $ssh2->auth_password('z','ztester') or die "Unable to login $@ \n";

    #get a large file like a 100Meg wav file
    my $dir = '/home/whoever';
    my $remote1 = $dir.'/1.wav';

    use IO::File;
    my $local1 = IO::File->new("> 2.wav"); #it needs a blessed reference
    $ssh2->scp_get($remote1, $local1);
    __END__

    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku ................... flash japh
      Hello Zentara,
      I have not tried using Net::SSH2, but I am interested in learning what considerations one should go through when deciding to use SSH2 vs. SFTP. (I am still very new to using Perl and open source, and I find it very hard to navigate.)
        Hi, Net::SSH2 was designed to replace the old SFTP and SSH modules. You can ask the SSH experts on the ssh-sftp-perl mailing list for all the details. The switch started about 10 years ago, and at the time the most important reason was to eliminate the numerous math library dependencies, like Math::Pari, which was an installation hassle and often resulted in sftp dropping back to very slow pure-Perl math processing. It also brought all of the various SSH utilities, like sftp, scp, and ssh, under one module. I've found it easy to install and use, as it only requires libssh2 to be installed with its development headers.

        I'm not really a human, but I play one on earth.
        Old Perl Programmer Haiku ................... flash japh
Re: Net::SFTP::Foreign frequent connection server stall
by salva (Canon) on Jun 20, 2012 at 05:46 UTC
    Hi, I am the author of Net::SFTP::Foreign. Set $Net::SFTP::Foreign::debug = ~(8|16|1024|2048); and send me the output by email. Also, run your script with truss and send me the last few hundred lines.

    BTW, which SSH client are you using, the one from Solaris or the OpenSSH one, and which version? I would try changing it.

    Net::SFTP::Foreign supports timeouts and resuming transfers. You can combine these features as a workaround until we find what is going wrong.
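A minimal sketch of how timeouts and resuming might be combined as a stopgap; this is an illustration, not code from the module author. It assumes an installed Net::SFTP::Foreign whose get method accepts resume => 1 (check the documentation for the version you have). The helper name get_with_resume and its arguments are hypothetical.

```perl
use strict;
use warnings;

# Retry a download, reconnecting and resuming after each failure.
# $connect is a code ref that returns a fresh Net::SFTP::Foreign object
# (hypothetical helper; the caller decides host, user and timeout).
sub get_with_resume {
    my ($connect, $remote, $local, $max_tries) = @_;
    for my $try (1 .. $max_tries) {
        my $sftp = $connect->();
        if ($sftp->error) {
            warn "connect failed (try $try): " . $sftp->error . "\n";
            next;
        }
        # On later tries, ask get() to continue from the bytes that
        # already reached the local file instead of starting over.
        return 1 if $sftp->get($remote, $local, resume => ($try > 1 ? 1 : 0));
        warn "get failed (try $try): " . $sftp->error . "\n";
    }
    return 0;
}
```

It could be called along the lines of get_with_resume(sub { Net::SFTP::Foreign->new(host => $host, user => $user, timeout => 60) }, $remote, $local, 5), where the timeout makes a stalled transfer fail instead of hanging so the loop can reconnect and resume.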

      Hi Salva,

      Getting a dump from a failed run will be tricky, as it seems to only fail the first run of the day. I will try to get it.

      "ssh -V" shows: Sun_SSH_1.1.1, SSH protocols 1.5/2.0, OpenSSL
      I'm not sure what you mean by changing it or that I have any power to(?)

      PS. Just want to say kudos because I can see you actively answer user questions.
        As it seems to only fail the first run of the day

        Then, you must investigate what happens at that time of the day that is different than when it runs properly.

        It may be a problem related to the network, such as a firewall being rebooted every day early in the morning, or whatever. If you can run tcpdump on the machine, you may be able to discover what is happening with the connection at the TCP level.

        I'm not sure what you mean by changing it or that I have any power to(?)

        If you can, install OpenSSH on the machine and tell Net::SFTP::Foreign to use it instead of the SSH client from Solaris, just to see if it makes a difference.
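A rough sketch of how a script could be pointed at a different SSH client, assuming the ssh_cmd constructor option of Net::SFTP::Foreign. The install paths, host name and user name below are placeholders, and first_executable is a hypothetical helper.

```perl
use strict;
use warnings;

# Return the first candidate path that is an executable file, or undef.
sub first_executable {
    my @candidates = @_;
    for my $path (@candidates) {
        return $path if -f $path && -x _;
    }
    return;
}

# Placeholder locations for an OpenSSH client; adjust for your site.
my $openssh = first_executable('/usr/local/bin/ssh', '/opt/csw/bin/ssh');

# Constructor arguments for Net::SFTP::Foreign->new. ssh_cmd is only
# added when an alternative client was found; otherwise the module
# falls back to the ssh it finds by default.
my %args = (host => 'sftp.example.com', user => 'someuser');
$args{ssh_cmd} = $openssh if defined $openssh;

# my $sftp = Net::SFTP::Foreign->new(%args);
```

Running the same transfer once with and once without ssh_cmd set is a cheap way to see whether the stall follows the SSH client.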

      Hello, not sure how to email you (can you PM me your email address?). Thanks!
        CPAN authors can usually be reached via CPANID@cpan.org. In my case, salva@cpan.org.
Re: Net::SFTP::Foreign frequent connection server stall
by Illuminatus (Curate) on Jun 19, 2012 at 21:12 UTC
    I would use scp over sftp as zentara suggests. If, however, 'future policies' means only sftp, then you'll have to figure out what the problem is:
    1. A code fragment showing what you're doing might be helpful
    2. You don't list the SSH command/version that you're using
    3. The latest version of Net::SFTP::Foreign is 1.74, and it looks like it should work with Perl 5.8.8
    4. Are both client and server Solaris 5.9?
    5. I'm not sure what you're referring to by 'SFTP v 1.35'. Your client is your perl program. Is this the server version? There are lots of different servers around. Most also create log files
    6. Can you use the sftp client on the command line to reliably transfer the file?
    7. tcpdump/wireshark of a hanging transfer can at least tell you what's crapping out at the tcp level

    fnord

      fnord / Illuminatus,

      2. "ssh -V" shows: Sun_SSH_1.1.1, SSH protocols 1.5/2.0, OpenSSL

      3. Good to know!!

      4. DataONTAP (NetApp OS)

      5. I initially looked at the version of the SFTP.pm, so I just put it in here. Prolly extra info...

      6. Yes, and one thing I forgot to mention is that if I re-run the job, the sftp will go through. It also worked completely fine when run daily on our test server, but our dev box shows the same symptom... (it almost seems like it's only the first connection of the day, or maybe within 24 hours of the last connection, but that sounds ridiculous and I have no proof...)

      7. I wouldn't know how to do that, or if I can. I'm in a corporation, and peons don't have access to anything, much less production. I'm just not sure. I mean, for us, installing a new Perl module is two layers removed, and we HOPE we get a knowledgeable SA...
      my $ssherr = File::Temp->new or die "File::Temp->new failed";
      open my $stderr_save, '>&STDERR' or die "unable to dump STDERR";
      open STDERR, '>&'.fileno($ssherr);
      my $sftp = Net::SFTP::Foreign->new(
          host => $properties{RemoteHostName},
          user => $rmtUser,
          password => $rmtPass,
          timeout => 10,
          more => '-v',
          queue_size => 16);
      open STDERR, '>&'.fileno($stderr_save);
      if ($sftp->error) {
          print "sftp error: ".$sftp->error."\n";
          seek($ssherr, 0, 0);
          while (<$ssherr>) {
              print "captured stderr: $_";
          }
          $endprogram = 1;
      }
      close $ssherr;
      if ($endprogram) {
          exit($endprogram);
      }
      if (!$sftp->setcwd($properties{RemoteDirPath})) {
          print "File Transfer - unable to change directory: " . $sftp->error;
          exit(1);
      }
      if (!$sftp->get($properties{RemoteFileName}, "$fileTo")) {
          print "File Transfer - get failed: " . $sftp->error;
          exit(1);
      }
Re: Net::SFTP::Foreign frequent connection server stall
by Anonymous Monk on Jun 20, 2012 at 16:54 UTC
    I just want to say thanks to all who've replied, for taking your time. Great sense of community!
Re: Net::SFTP::Foreign frequent connection server stall
by Anonymous Monk on Jun 20, 2012 at 18:05 UTC
    One update: This morning's run (using FTP) also failed.
    Error Msg: Timeout at /opt/perl-5.8.8/lib/Net/FTP.pm line 491

    which is... last unless $len = $data->read($buf, $blksize);

    Maybe it's a clue?
      Maybe it's a clue?

      Yes, investigate what happens in your network every day at that time!!!

        Good evening Salva, I am seeing the same problem. Every 50GB or so the SFTP connection drops, and retries are not successful. scp and sftp work correctly, even in parallel with this activity, so it looks like it is a Net::SFTP::Foreign issue. Environment:
        matt@aslan:185% cat /etc/redhat-release
        Fedora release 22 (Twenty Two)
        matt@aslan:186% perl -v
        This is perl 5, version 20, subversion 3 (v5.20.3) built for x86_64-linux-thread-multi
        (with 16 registered patches, see perl -V for more detail)
        ...
        matt@aslan:187% perl_module_version Net::SFTP::Foreign
        Net::SFTP::Foreign : 1.75
        matt@aslan:188%
        I see no relevant detail in the ssh logs during the failure:
        ...
        debug3: Ignored env DISPLAY
        debug3: Ignored env SGI_ABI
        debug1: Sending subsystem: sftp
        debug2: channel 0: request subsystem confirm 1
        debug2: callback done
        debug2: channel 0: open confirm rwindow 0 rmax 32768
        debug2: channel 0: rcvd adjust 2097152
        debug2: channel_input_status_confirm: type 99 id 0
        debug2: subsystem request accepted on channel 0
        # got it!, len:95, code:2, id:-, status: -
         Error: Connection lost (Connection to remote server is broken)
         Resuming transfer (retry 1)
        # queueing msg len: 71, code:17, id:1973 ... [1]
        # waiting for message... [1]
        # queueing msg len: 83, code:3, id:1974 ... [2]
        # waiting for message... [2]
        debug2: channel 0: read<=0 rfd 4 len 0
        debug2: channel 0: read failed
        debug2: channel 0: close_read
        debug2: channel 0: input open -> drain
        ...
        The error and resuming lines above were generated by this Perl; the rest came from -vvv:
        emit(" Error: " . $connect->status . ' (' . $connect->error . ')',
        emit() is just a wrapper around print STDERR; return undef; The whole block of code is:
        for (my $pass=0; $pass < $_max_retries; $pass++) {
            if ($connect->put($local, $remote, %$options)) {
                my $stat = $connect->stat($remote);
                $size = $stat->size;
                main::add2Log(" Finished $size of $lsize bytes", 'info')
                    if ($main::debug >= 1 and $pass);
                $main::progressBar = 100;
                return 1;
            }
            main::append2Log("\b"x$_fll, ' 'x$_fll, "\b"x$_fll)
                unless (main::is_graphical());
            emit(" Error: " . $connect->status . ' (' . $connect->error . ')',
                 " Resuming transfer (retry " . ($pass+1) . ')');
            my $connect = sshConnect($url, $user, $password, 'sftp', undef, 1, 1);
            return undef unless ($connect);
        }
        The whole run sans debug looks like:
        Uploading marilyn/photos/1108BlueRidge/Original/CRW_2030.tif
         Error: Connection lost (Connection to remote server is broken)
         Resuming transfer (retry 1)
         Error: Connection lost (Connection to remote server is broken)
         Resuming transfer (retry 2)
         Error: Connection lost (Connection to remote server is broken)
         Resuming transfer (retry 3)
         Error: Connection lost (Connection to remote server is broken)
         Resuming transfer (retry 4)
        Retries exceeded - failed to send '/Website/www/html/marilyn/photos/1108BlueRidge/Original/CRW_2030.tif'
        Uploading marilyn/photos/1108BlueRidge/Original/CRW_2031.crw
        What detail can I collect to help diagnose? Matt
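One thing that may be worth checking in the retry loop above, independent of any module issue: the reconnect is written as my $connect = sshConnect(...), which declares a new lexical that goes out of scope at the end of that pass, so the next pass would appear to call put on the original, already broken connection. The standalone sketch below shows the scoping effect with placeholder strings instead of real connections; it is an illustration of Perl scoping, not a confirmed diagnosis of the failure.

```perl
use strict;
use warnings;

# Count how many passes still see the stale handle, depending on
# whether the replacement is declared with "my" inside the loop.
sub passes_on_stale_handle {
    my ($reconnect_with_my) = @_;
    my $handle = 'stale';
    my $stale_uses = 0;
    for my $pass (1 .. 3) {
        $stale_uses++ if $handle eq 'stale';
        if ($reconnect_with_my) {
            my $handle = 'fresh';   # new lexical, gone when this block ends
        }
        else {
            $handle = 'fresh';      # assigns to the outer variable
        }
    }
    return $stale_uses;
}

print passes_on_stale_handle(1), "\n";   # prints 3: every pass used the stale handle
print passes_on_stale_handle(0), "\n";   # prints 1: only the first pass did
```

If that is what is happening, dropping the inner my so the assignment reaches the outer $connect would let later passes use the new connection.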
