PerlMonks  

Using Perl saves time....

by szabgab (Priest)
on Jul 18, 2005 at 09:04 UTC

I just had to count how many files are in a specific directory on an AIX box. Obviously, I ran

ls -1 | wc

I knew there are lots of files but after 2-3 minutes I decided this is not a joke any more so I wrote the following:

#!/usr/bin/perl -w
use strict;

my $dir = $ARGV[0] || ".";
opendir my $dh, $dir or die "Could not open '$dir'\n";
my $c = 0;
$c++ while readdir $dh;
print "$c\n";
It finished in under 1 second and printed 239982, so I could kill the other process....

Thank you.

Replies are listed 'Best First'.
Re: Using Perl saves time....
by hv (Prior) on Jul 18, 2005 at 12:23 UTC

    One thing ls does that the Perl code does not is sort the output. My Linux system supports ls -U to leave the list unsorted; if there are that many files, this may fix the speed issue.

    Hugo
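For what it's worth, the unsorted flag is easy to try side by side; -U is in GNU coreutils ls (and several BSDs), but it is not strictly portable:

```shell
# -U skips the sort pass; when writing to a pipe, ls is one name per line anyway
ls -U | wc -l
```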

      more than one *nix way to do it ;)
      nl *
      Update: removed redundant cat *

        nl, the line-numbering utility, takes one or more file names on its command line. So nl * cats the contents of all the files with line numbers.

        46$ ls | wc
             15      15     233
        47$ nl * | wc
            857    3540   29899
        48$ ls | nl
             1  Makefile
             2  RCS
             3  Tests
             4  ...
            15  ...

        But you just want the count, not the list, so you would need to | tail -1 | cut -d' ' -f1.

        Or to be precise: ls | nl | tr -s ' ' ' ' | tr "\t" ' ' | cut -d' ' -f2

        --
        TTTATCGGTCGTTATATAGATGTTTGCA

        Hm, I think with this glob the shell will first try to put all the filenames on your command line. If you have a lot of files, the maximum size for a command line will be exceeded (this maximum can be pretty large on *nix machines, but the OP intended to count lots of files...).
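One way around that limit, as a sketch: hand the directory to find instead of a shell glob, since the kernel's ARG_MAX only constrains argument lists, not pipe output (-maxdepth is a GNU/BSD extension, not POSIX):

```shell
# Count entries in the current directory without building a huge argv
find . -maxdepth 1 ! -name . | wc -l
```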
Re: Using Perl saves time....
by fergal (Chaplain) on Jul 18, 2005 at 10:40 UTC
    Why did you use -l for the ls? This certainly made it slower because it had to stat every file it found. Also, ls could be aliased in your shell to something funky (like one that uses different colours for different file types), so for a fair comparison do
    /bin/ls | wc -l
      In Using Perl saves time....:
      ls -1 | wc
      In Re: Using Perl saves time....:
      Why did you use -l for the ls?
      I think your computer is not showing you the difference between a digit one and a letter ell. The original post uses an (unnecessary) digit one. You are rightly complaining about the expense of a letter-ell long listing (if that had indeed been the case).

      It's unnecessary because ls has two behaviors, depending on whether the output is a terminal or not (something I count as being broken, but oh well). To a terminal, it columnizes, but to a pipe or file, it's automatically one element per line (classic mode). Thank the idiots at Bezerkley for this abomination. This leads people to believe that they need to add "-1" to get one column, when in fact that's usually not necessary.

      As an example, compare "ls" with "ls | cat".

      -- Randal L. Schwartz, Perl hacker
      Be sure to read my standard disclaimer if this is a reply.

        Oh, you are right, I did not have to use -1 (one), as ls | wc counts the same as ls -1 | wc

        You learn something every time...

        I didn't even think of the -1 (one) option; I thought he was using -l to make sure it was one line per file or something.

        Still it's odd that it's so slow, maybe AIX is just nasty.

        ... Thank the idiots at Bezerkley ...

        It's a joke, right?

        Dodge This!

      He used -1, which gives you one item per line, not -l, which stats the files.

      Of course, under most OSes, when you pipe ls into something, it implies ls -1, so it streams faster, as opposed to trying to determine how many columns to build. (I've not used AIX, so it's possible that it doesn't do this).

      I would, however, recommend using the -l flag to wc, so that it only needs to count the lines, and not the words and characters as well.

      And most ufs systems choke hard on ls when you have too many items in a directory. I saw a poorly set-up system that was rolling its log files each minute (it might've been one per transaction). The backups were failing because it took more than 24 hours to generate the listing of a directory with more than 2 million entries ... so the next backup would start running before the first one had finished.

      Why did you use -l for the ls?

      He didn't use -l; he used -1, so as to list just 1 file per line, to make the counting right. Though at least some versions of ls spot when their output is being piped and infer -1 anyway.

      Smylers

Re: Using Perl saves time....
by sh1tn (Priest) on Jul 19, 2005 at 08:08 UTC
    localhost:/usr/share/doc$ time perl -e 'print scalar(()=glob"*"),$/'
    1164

    real    0m0.045s
    user    0m0.040s
    sys     0m0.000s


      Your code is slow too; you need a suitably large directory to test properly:

      $ time perl -e 'print scalar(()=glob"*"),$/'
      136631

      real    0m4.810s
      user    0m3.340s
      sys     0m1.320s

      $ time perl -le 'opendir f, $ARGV[0] or die $!; ++$c while readdir f; print $c' .
      136633

      real    0m0.440s
      user    0m0.400s
      sys     0m0.040s

      --
      Murray Barton
      Do not seek to follow in the footsteps of the wise. Seek what they sought. -Basho

        You are right - glob is slower than readdir.
        My point was just the short one-liner. Otherwise:
        # time ls | wc -l
        99999

        real    0m1.850s
        user    0m1.590s
        sys     0m0.210s

        # time perl -e '@_ = glob"*";print$#_'
        99998

        real    0m1.462s
        user    0m0.880s
        sys     0m0.570s

        # the best alternative:
        # time perl -MIO::Dir -e '@_ = IO::Dir->new(".")->read;print$#_'
        100000

        real    0m0.680s
        user    0m0.570s
        sys     0m0.100s


Re: Using Perl saves time....
by greenFox (Vicar) on Jul 19, 2005 at 01:57 UTC

    I thought this should be a one-liner. My first thought was:

    perl -le "@l = <*>; print scalar @l"

    which of course is slooow, so I borrowed from your script:

    perl -le 'opendir f, $ARGV[0] or die $!;++$c while readdir f; print $c'

    I am sure the golfers could slim it down some more :)

    --
    Murray Barton
    Do not seek to follow in the footsteps of the wise. Seek what they sought. -Basho

      perl -e '++$c for glob "*";print $c'

      Seems to be pretty quick, too.

      <-radiant.matrix->
      Larry Wall is Yoda: there is no try{} (ok, except in Perl6; way to ruin a joke, Larry! ;P)
      The Code that can be seen is not the true Code
      "In any sufficiently large group of people, most are idiots" - Kaa's Law

Node Type: perlmeditation [id://475678]
Approved by Arunbear
Front-paged by ghenry