PerlMonks  

scp cronjob

by Anonymous Monk
on Sep 11, 2002 at 11:13 UTC ( [id://196915] )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

hi perlmonks, I run this script as a cronjob every minute. It seems to work fine, but every so often files are copied across that already exist and shouldn't be copied. This is killing my bandwidth and I can't work out what's causing it. One thing I notice is that if I run the script from the command prompt and hit ctrl-c, files that are already there are sometimes copied across again. Any ideas?
use File::Find;
use File::Basename;

@backupFiles = ();
%chompedList = ();
$list = "/usr/scripts/blist.txt";
if (-e $list) { unlink($list); }

# Grab a listing of what is already on the backup server.
system("ssh user\@backupserver ls -c /usr/local/apache/htdocs >$list");
open(LIST, "$list") or die "$!";
@backupFiles = <LIST>;
foreach $file (@backupFiles) {
    chomp $file;
    $chompedList{$file} = $file;
}

# Copy across any local .htm file that is not already on the server.
find(\&dofile, </usr/myfiles/*.htm>);

sub dofile {
    ($name, $path, $suffix) = fileparse($File::Find::name);
    if (not exists $chompedList{"$name$suffix"}) {
        system("scp -C \"$File::Find::name\" user\@backupserver:/usr/local/apache/htdocs");
    }
} # end find

close(LIST);
unlink($list);

Re: scp cronjob
by ides (Deacon) on Sep 11, 2002 at 14:24 UTC

    Have you looked into using rsync? It only transmits the differences between files and greatly reduces the bandwidth used. I use it to back up hundreds of GBs of data each day. It would negate the need for your script, and it works over SSH.

    To back up and/or mirror from the current directory to your other server, the cron setup would go something like this:

    export RSYNC_RSH="/usr/bin/ssh -C"
    0 0 * * * /usr/bin/rsync -ar * user@backupserver:/usr/local/apache/htdocs
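    If keeping the export line alongside the crontab entry is awkward (cron entries don't pick up a separate shell export), a small Perl wrapper could set RSYNC_RSH itself and then run rsync. This is only a rough sketch, not from the thread; the /usr/myfiles source path is taken from the original question, and the wrapper's own path in the comment is hypothetical.

    #!/usr/bin/perl
    # Hypothetical wrapper, e.g. /usr/scripts/rsync_backup.pl, so the cron entry
    # can simply be:  0 0 * * * /usr/scripts/rsync_backup.pl
    use strict;
    use warnings;

    $ENV{RSYNC_RSH} = '/usr/bin/ssh -C';    # same effect as the export above

    # Trailing slash on the source: copy the directory's contents, not the directory itself.
    system('/usr/bin/rsync', '-ar', '/usr/myfiles/',
           'user@backupserver:/usr/local/apache/htdocs') == 0
        or die "rsync failed: exit status " . ($? >> 8) . "\n";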

    Hope this helps.

    -----------------------------------
    Frank Wiles <frank@wiles.org>
    http://frank.wiles.org
Re: scp cronjob
by mp (Deacon) on Sep 11, 2002 at 16:48 UTC
    You are currently making a separate system call to run scp for each file that needs to be copied. This is inefficient because of the overhead of starting up a new process and the overhead of initiating a new connection to the remote machine. If you are running this every minute, and if the script takes more than a minute to run, you will also have multiple instances of the script running simultaneously.
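    One way to guard against those overlapping runs (a sketch, not part of the original suggestion) is to take a non-blocking exclusive lock at the top of the script and exit if a previous cron invocation still holds it:

    use strict;
    use warnings;
    use Fcntl ':flock';

    # Hypothetical lock file; any writable path will do.
    my $lockfile = '/usr/scripts/backup.lock';
    open my $lock, '>', $lockfile or die "cannot open $lockfile: $!";
    unless (flock $lock, LOCK_EX | LOCK_NB) {
        # An earlier run is still copying files; bail out quietly.
        exit 0;
    }
    # ... the rest of the backup work runs here while the lock is held ...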

    Rsync, as mentioned above, would be a much better option. If you are primarily copying html files, you may want to also turn on compression (-z option).

    /usr/bin/rsync -az -e ssh source_directory/ \
        user@backupserver:destination_directory

    -e ssh tells rsync to use ssh for transport.
    -a is archive mode, which gives recursion and preserves permissions and such.
       -ar would be harmlessly redundant.
    -v will show filenames during the transfer.
    The trailing slash on the source directory is important. See man rsync for more info.
Re: scp cronjob
by kabel (Chaplain) on Sep 11, 2002 at 12:58 UTC
    I suggest putting some running information into a log file. That helped me a lot in the past ;)
    ###############################################################################
    sub do_log {
    ###############################################################################
        my $output_string = "";
        my $log_file      = "/some/where";

        if (not defined $_[0]) {
            $output_string = "\n";
        }
        else {
            $output_string = "$$: " . scalar (gmtime ()) . ": " . join ("", @_) . "\n";
        }

        print STDERR $output_string unless ($opts{n});

        unless ($opts{l}) {
            if (-e $log_file) {
                open (LOGFILE, ">> $log_file") or die "cannot write log [$log_file][$!]";
            }
            else {
                open (LOGFILE, "> $log_file") or die "cannot create log [$log_file][$!]";
            }
            print LOGFILE $output_string;
            close (LOGFILE);
        }
    }
    please let me know if the sub can be improved.
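    As a usage sketch (assuming %opts is filled in elsewhere, e.g. by Getopt::Std, as the sub above expects), the original script's scp call could be wrapped like this:

    our %opts;    # n: don't echo to STDERR, l: don't write the log file

    do_log("copying $File::Find::name");
    my $status = system("scp -C \"$File::Find::name\" user\@backupserver:/usr/local/apache/htdocs");
    do_log("scp failed with status $status") if $status != 0;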
Re: scp cronjob
by Util (Priest) on Sep 11, 2002 at 17:43 UTC

    My best guess is that the $list file is getting over-written by multiple copies of the Perl script, when your bandwidth limits cause one copy to hang long enough for another copy to start.
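    If the intermediate file is kept at all, giving each run its own temporary name would at least stop concurrent runs from clobbering one another. A sketch using File::Temp (the directory is just an assumption, matching the path in the question):

    use File::Temp qw(tempfile);

    # Each run gets a private listing file, removed automatically at exit.
    my ($fh, $list) = tempfile('blist-XXXXXX', DIR => '/usr/scripts', UNLINK => 1);
    system("ssh user\@backupserver ls /usr/local/apache/htdocs > $list");
    seek $fh, 0, 0;              # rewind and read what ssh wrote into the file
    my @backupFiles = <$fh>;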

    Here are some other thoughts:

  • You don't need the $list file. Use backticks instead:
    @backupFiles = `ssh user\@backupserver ls /usr/local/apache/htdocs`;
  • You are using File::Find and "globbing" (</path/*.htm>) together. This is redundant; use one or the other.
  • Do you really want to back up only the new files, but not any changed files? This confuses me.
  • ls -c is an odd command for this program; you are asking ls to sort by ctime, then throwing away the sort order by using a hash. Plain ls would be clearer.
  • Rather than use cron, you might just have the Perl script run as a daemon, never ending until you kill it. This would prevent the multi-copy problem, too.
    while (1) { do_something(); sleep 60; }
  • If bandwidth is a problem, keep a flagfile whose modtime is equal to the time of the last scp transfer. If no local file is newer than the flagfile, you can skip the ssh traffic.
  • You are calling scp once for each file; instead, you could build a list of files and call scp once (a combined sketch along these lines follows this list):
    my @files = map  { "'$_'" }
                grep { not $chompedList{ basename($_) } }
                </usr/myfiles/*.htm>;
    system "scp -C @files user\@backupserver:/usr/local/apache/htdocs";
  • You are using fileparse where basename would be clearer.
  • Finally, listen to ides. Rsync was designed for this kind of job. It has lots of options to fine-tune what gets synced and how; it can even limit its bandwidth use. Rsync should work great by itself as a cron job, but here is a (lightly tested) script to demonstrate my other points:
    #!/usr/bin/perl -W
    use warnings 'all';
    use strict;

    my $rmthost  = 'backupserver';
    my $rmtuser  = 'user';
    my $rmtpath  = '/usr/local/apache/htdocs';
    my $lclpath  = '/usr/myfiles';
    my $lclglob  = '*.htm';
    my $flagfile = '.last_backup';
    my $r_opts   = "-azq --blocking-io -e 'ssh -l $rmtuser'";
    my $lclfiles = "$lclpath/$lclglob";

    sub modtime {
        my $file  = shift;
        my @stats = stat $file or return 0;
        return $stats[9];
    }

    while (1) {
        my $timestamp = modtime("$lclpath/$flagfile");
        my $run = grep { modtime($_) > $timestamp } glob $lclfiles;
        if ($run) {
            system "rsync $r_opts $lclfiles $rmthost:$rmtpath";
            open  FLAG, ">$lclpath/$flagfile" or die;
            close FLAG                        or die;
        }
        sleep 60;
    }
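    Putting the non-rsync suggestions above together (backticks instead of the temporary file, a plain glob instead of File::Find, basename, and a single scp call for the whole batch), the core of the original script might shrink to something like this sketch:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Basename;

    # Filenames already present on the backup server.
    my %on_server = map { chomp; $_ => 1 }
                    `ssh user\@backupserver ls /usr/local/apache/htdocs`;

    # Every local .htm file that is not on the server yet.
    my @to_copy = grep { not $on_server{ basename($_) } } </usr/myfiles/*.htm>;

    if (@to_copy) {
        # One scp invocation for the whole batch.
        system('scp', '-C', @to_copy, 'user@backupserver:/usr/local/apache/htdocs') == 0
            or warn "scp failed: exit status " . ($? >> 8) . "\n";
    }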

Re: scp cronjob
by Anonymous Monk on Sep 14, 2002 at 22:44 UTC
    thanks guys, your comments have been much appreciated. I am going to rewrite the script (to help my Perl learning) and look into implementing rsync instead.
