Myth busted: Shell isn't always faster than Perl

Replies are listed 'Best First'.
Re: Myth busted: Shell isn't always faster than Perl by Roy Johnson (Monsignor) on Dec 30, 2005 at 18:54 UTC
The problem with the shell version is that it's spawning a new process for every `rm`. The usual practice is to use `xargs` in conjunction with find. `time find . -type f -print \| xargs rm` [download] I don't know how much that will affect your timings, though. Caution: Contents may have been coded under pressure.	[reply] [d/l] [select]
Re^2: Myth busted: Shell isn't always faster than Perl by jdhedden (Deacon) on Dec 30, 2005 at 20:36 UTC
I'd say it's a wash. Trying it several times, the best times I got were: `time perl -MFile::Find -e'finddepth sub { unlink if -f }, @ARGV' /tmp` [download] real 0m3.111s user 0m0.821s sys 0m2.233s `time find /tmp -type f \| xargs rm` [download] real 0m3.312s user 0m0.760s sys 0m2.511s And the varied widely - anywhere up to 5 seconds. Remember: There's always one more bug.	[reply] [d/l] [select]
Re^3: Myth busted: Shell isn't always faster than Perl by zentara (Archbishop) on Dec 30, 2005 at 21:17 UTC
I tested my original script with the `return if -d; unlink $_` [download] against the golfed `unlink $_ if -f;` [download] and the golfed -f test seems to be a bit slower. Maybe because unlink somehow gets called for each directory then stopped? Whereas the 'return if -d ' returns immediately. BUT the improved shell with null `find . -type f -print0 \| xargs -0 rm` [download] seems to win :-( `time -d-test Gtk3 real 0m0.412s user 0m0.074s sys 0m0.337s time -f-test Gtk3 real 0m0.478s user 0m0.076s sys 0m0.388s time find . -type f -print0 \| xargs -0 rm real 0m0.334s user 0m0.012s sys 0m0.321s` [download] I'm not really a human, but I play one on earth. flash japh	[reply] [d/l] [select]
Re^2: Myth busted: Shell isn't always faster than Perl by zentara (Archbishop) on Dec 30, 2005 at 20:16 UTC
Yeah, that brings the shell closer in speed, BUT it starts complaining AND skipping filenames with spaces in them, I believe that is why the original contruct was the way it was. I'm not really a human, but I play one on earth. flash japh	[reply]
Re^3: Myth busted: Shell isn't always faster than Perl by jdporter (Paladin) on Dec 30, 2005 at 20:36 UTC
it starts complaining AND skipping filenames with spaces in them ...And that is precisely why `find` has the `-print0` switch and `xargs` has the `-0` (or `--null`) switch. `find . -type f -print0 \| xargs -0 rm` [download] We're building the house of the future together.	[reply] [d/l] [select]
Re^4: Myth busted: Shell isn't always faster than Perl by polettix (Vicar) on Jan 05, 2006 at 16:41 UTC
Re^5: Myth busted: Shell isn't always faster than Perl by Perl Mouse (Chaplain) on Jan 05, 2006 at 17:03 UTC
Some notes below your chosen depth have not been shown here
Re: Myth busted: Shell isn't always faster than Perl by jdporter (Paladin) on Dec 30, 2005 at 17:50 UTC
I think you'd better do `sub { -f _ or return; unlink $_;` [download] You really only want to unlink "regular" files; and this makes the comparison apples-and-apples with the shell version. Also, explicitly testing `'.'` and `'..'` is superfluous, because they'd be caught by `-d`. We're building the house of the future together.	[reply] [d/l] [select]
Re^2: Myth busted: Shell isn't always faster than Perl by VSarkiss (Monsignor) on Dec 30, 2005 at 18:42 UTC
Fore! `sub { unlink if -f }` [download] :-) Do not rebuke them with harsh words ... but rather lead them gently - with URLs - so that they may learn wisdom.	[reply] [d/l]
Re^3: Myth busted: Shell isn't always faster than Perl by jdporter (Paladin) on Dec 31, 2005 at 17:32 UTC
Yes, my reply originally looked like that; but as the OP said, you may want to do additional things, such as reporting. We're building the house of the future together.	[reply]
Re: Myth busted: Shell isn't always faster than Perl by Perl Mouse (Chaplain) on Dec 31, 2005 at 00:40 UTC
I've never heard of the myth "Shell is always faster". Not that your myth busts anything - using the 'exec' option to delete one file at the time is a far from optimal solution. As pointed out, the '-print0' in combination with 'xargs' is much more efficient, as it saves spawning a gazillion processes. I'm a bit surprised however that no-one so far as piped in the "programmer time is more costly than running time" mantra. Surely, the 2 seconds running time difference are dwarved by all the extra typing you need in your Perl solution. Or are Perl programmers cheap, and shell programmers expensive? I would always go for the shell solution. I'll have deleted all the files even before you've finished typing your Perl program. `Perl --((8:>*`	[reply]
Re^2: Myth busted: Shell isn't always faster than Perl by zentara (Archbishop) on Dec 31, 2005 at 11:59 UTC
I never type out a script more than once, it goes into a /bin directory in my path. "Damn it Jim, I'm a Perl hacker, NOT a typist" :-) I'm not really a human, but I play one on earth. flash japh	[reply]
Re^3: Myth busted: Shell isn't always faster than Perl by Perl Mouse (Chaplain) on Jan 02, 2006 at 13:41 UTC
But if you haven't been on the system yet, you haven't had a chance to install your "delete files and leave the directory structure" program yet. One way of doing system administration is to write a little program for every minor task you want. A small change, a different program. And then, everyone has to carry disks with their personal libraries around. Granted, it's workable. I myself prefer the Unix/POSIX solution. Lots of small tools, that can be stacked like legos. Tools that are everywhere, like `find` and `xargs`. When I sit down at a Unix system, I can type `find . -type f -print0 \| xargs rm` [download] to delete files, and leave the directory structure as is. I don't have to remember whether I installed a program doing this for me on the box, and if I did, how it's called. And I don't need to write a new program if I want to delete all files older than a week - just add an extra option to `find`. (Sure, you could enhance your program that it takes all kinds of options, but if you have to type as many options to your program as to find, you might as well have used find in the first place). I'm not a monoculturist programmer. For anything complex, I write a Perl or a C program (preferably Perl, but that isn't always available - if all you have is a few Mb of RAM and a dozen or so Mb on disk, there's no Perl, but `busybox` stacks a lot of goodies in just a few kb). But I don't bother writing programs for tasks that I don't do that often and that only require a few simple commands. That's not efficient. `Perl --((8:>*`	[reply] [d/l]
Re^4: Myth busted: Shell isn't always faster than Perl by zentara (Archbishop) on Jan 02, 2006 at 16:50 UTC
Re^5: Myth busted: Shell isn't always faster than Perl by Perl Mouse (Chaplain) on Jan 02, 2006 at 17:28 UTC
Some notes below your chosen depth have not been shown here
Re^5: Myth busted: Shell isn't always faster than Perl by robharper (Pilgrim) on Jan 04, 2006 at 15:34 UTC
Re^2: Myth busted: Shell isn't always faster than Perl by demerphq (Chancellor) on Jan 02, 2006 at 09:29 UTC
I would always go for the shell solution. I'll have deleted all the files even before you've finished typing your Perl program. Well, since you are being snarky I'll respond in kind: I doubt it, i reckon youll still be fighting with the shell syntax, and doublechecking that the switches and utilities you got so used to in bash are actually present in the shell you need to run it on. And even then you still wont be 100% confident that it will all work as expected. Which to me is the reason that perl scripts beat shell scripts hands down pretty well every time. I can use the same perl script on every shell and OS I can find pretty much. Your shell script will only work on a small subset of them, and will require massive changes for some of them. Shell scripts are only worth thinking about if you are a monoculture programmer. Since I'm not I view them mostly with contempt. Who needs shell scripts when you have perl scripts instead? --- $world=~s/war/peace/g	[reply]
Re^3: Myth busted: Shell isn't always faster than Perl by Aristotle (Chancellor) on Jan 02, 2006 at 10:42 UTC
I reckon youll still be fighting with the shell syntax I can type `find \| xargs` pipes in my sleep. doublechecking that the switches and utilities you got so used to in bash are actually present in the shell you need to run it on Present in the shell? They�re external binaries; which shell you�re using is irrelevant. Maybe �present on the system,� except that if `find`, `xargs` and `rm` are not present, that is one very broken system. And the `-print0`/`-0` switches are available on these commands on all Unixoid systems where I cared to look. And all that is far more likely to be around than `perl`, in any case. If your portability argument concerns moving between Windows and Unix, well, I can see how someone working on Windows would prefer to always use Perl� `:-)` Makeshifts last the longest.	[reply]
Re^3: Myth busted: Shell isn't always faster than Perl by Perl Mouse (Chaplain) on Jan 02, 2006 at 11:03 UTC
Well, since you are being snarky I'll respond in kind: I doubt it, i reckon youll still be fighting with the shell syntax, and doublechecking that the switches and utilities you got so used to in bash are actually present in the shell you need to run it on. And even then you still wont be 100% confident that it will all work as expected. Bollocks. `find \| xargs` has worked on every Unix system I've used for the last 30 years. Out of the box. In any shell, as the only 'shell' thing here is the pipe, which is universal. It has worked long before Larry released perl1.0, and it will continue to work long after perl5 will be a distant memory. Which to me is the reason that perl scripts beat shell scripts hands down pretty well every time. I can use the same perl script on every shell and OS I can find pretty much. Your shell script will only work on a small subset of them, and will require massive changes for some of them. The shell solution will work on at least anything that's POSIX compliant. Will your Perl program work in perl6? How would you know - it may work on todays version of perl6, but maybe not on next weeks. As for Perl being present on the OS by default, for many OSses, it's only quite recent that their OS came with some version of perl5 installed. Shell scripts are only worth thinking about if you are a monoculture programmer. Since I'm not I view them mostly with contempt. Who needs shell scripts when you have perl scripts instead? So, you do everything with Perl scripts, so you're not a monoculture programmer? Interesting. What's your definition of monoculture then? But you're right. Once you have a truck, you have no need for a bicycle. It's much easier to start up the truck and find a parking spot, just to get a newspaper from the shop around the corner. It's cheaper as well. Bicyclists are monoculture traffic participants - none of them know how to drive a car. `Perl --((8:>*`	[reply]
Re^4: Myth busted: Shell isn't always faster than Perl by demerphq (Chancellor) on Jan 02, 2006 at 11:22 UTC
Re^5: Myth busted: Shell isn't always faster than Perl by Tanktalus (Canon) on Jan 03, 2006 at 04:13 UTC
Re: Myth busted: Shell isn't always faster than Perl by Tanktalus (Canon) on Dec 31, 2005 at 17:37 UTC
zentara, try this one. I wrote this many years ago to clean up 100's of MB of source code (meaning 100's of 1000's of files) and it seems pretty fast. Way faster than rm -rf, for example. However, my goal wasn't to remove just the files, but the whole tree. I'll comment out the part that removes directories just to make it do what yours does. Granted ... this is a bit more complex. But it can't easily be duplicated in shell. use strict; use warnings; $\|=1; foreach my $d (@ARGV) { remove_dir($d); rmdir $d; } print "\nDone.\n"; sub remove_dir { my $d = shift; if ( -f $d or -l $d ) { unlink $d; return; } # must be a directory? my (@sfiles, @sdirs); local *DIR; opendir(DIR, $d) \|\| do { print "Can't open $d: $!\n"; return }; foreach (readdir(DIR)) { next if ($_ eq '.'); next if ($_ eq '..'); my $sd = "$d/$_"; if ( -l $sd ) { push(@sfiles, $sd);} elsif ( -d $sd ) { push(@sdirs, $sd); } else { push(@sfiles, $sd); } } closedir(DIR); print "."; # process subdirectories via fork my $count; foreach my $sd (@sdirs) { my $pid; if ($pid = fork()) { # parent ++$count; } elsif (defined $pid) { # child remove_dir($sd); exit; } else { # failure - try again in a bit sleep 5; redo; } while ($count > 2) { wait(); $count--; } } while (wait() != -1) {} #foreach (@sdirs) { # rmdir $_ \|\| do { # warn "$0: Unable to remove directory $_: $!\n"; # }; #} my @cannot = grep {!unlink($_)} @sfiles; if (@cannot) { warn "$0: cannot unlink @cannot\n"; } } [download] I'll also add that the difference in speed between .4s and 3s is quite negligible when compared to the amount of time it takes to remember and write them. This example above is ludicrously expensive to write, but it is something I do enough that I call it "RD" (yes, upper-case - it's too dangerous to get a short lower-case name) and put it in /usr/local/bin on all machines, all platforms, that I have access to (primarily as a symlink to a shared NFS partition). We really do use it that much ;-)	[reply] [d/l]
Re: Myth busted: Shell isn't always faster than Perl by itub (Priest) on Dec 30, 2005 at 19:29 UTC
I had never heard that myth; actually, I tend to hear the opposite. The truth is, it depends. ;-)	[reply]
Re: Myth busted: Shell isn't always faster than Perl by runrig (Abbot) on Jan 02, 2006 at 19:31 UTC
It depends on what you're doing also. Once when I rewrote a third-party utility in perl, rewriting this bit caused it to go slower in perl: `grep "^function" .4gl \| sed "s/$.$:function $.$(./\2 \1 \/^function \2(/"` [download] But the above was wrong, so I rewrote a "correct" perl version : `/^\sfunction\s+(\w+)\s\(/i # and then use hashes to save data so there's no s///` [download] I rewrote the new perl version in shell (grep/sed) for kicks, and it was slower than the perl version (and much uglier).	[reply] [d/l] [select]
Re: Myth busted: Shell isn't always faster than Perl by Anonymous Monk on Dec 30, 2005 at 19:59 UTC
That is outstanding. We, at my company, had the same type of discussion. We had a failed bash script that we needed to fix but no one really knows bash, we are perl guys.	[reply]
Re: Myth busted: Shell isn't always faster than Perl by Anonymous Monk on Dec 30, 2005 at 23:51 UTC
Put your own inability to develop quality shell script as a defect of the shell, do it, clever plan.	[reply]
Re^2: Myth busted: Shell isn't always faster than Perl by zentara (Archbishop) on Dec 31, 2005 at 12:13 UTC
Well that speaks to the point I'm making. The people who suggested the original slow shell script, are well respected and talented shell programmers. And my "run-of-the-mill" Perl script, beat it. So when someone says, "why use Perl, I can do it faster with a shell script", you better think twice; because maybe the Perl is faster. Also the optimized shell script only beat the Perl version by a nose, Considering how much more flexible the Perl script is, in processing the files as they are found, run-of-the-mill Perl is likely to be faster, than a run-of-the-mill shell, doing some equivalent task. Shell, with it's constant spawing of awk and sed, etc.; is probably harder to do at optimized speed, compared to Perl. I'm not really a human, but I play one on earth. flash japh	[reply]


P is for Practical
	PerlMonks