Re: Using Perl saves time....
by hv (Prior) on Jul 18, 2005 at 12:23 UTC
|
One thing that ls does and the Perl code does not is sort the output. My Linux system supports ls -U to leave the list unsorted; with that many files, this may fix the speed issue.
Hugo
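A minimal sketch of the unsorted count, assuming an ls that accepts -U (GNU ls does; other systems may differ). The helper name count_unsorted is hypothetical and not from the original post.

```shell
# Hypothetical helper: count the entries of a directory without sorting them.
# Assumes ls supports -U (directory order, no sort), as GNU ls does.
# Filenames that contain newlines would be counted more than once.
count_unsorted() {
    # tr strips the padding some wc implementations put before the number
    ls -U -- "${1:-.}" | wc -l | tr -d ' '
}
```

Calling count_unsorted with no argument counts the current directory. Wrapping each variant in time, as the thread does, shows whether sorting was the bottleneck.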
| [reply] [d/l] [select] |
|
more than one nix way to do it ;)
'nl *'
Update: removed redundant cat *
| [reply] [d/l] |
|
46$ ls | wc
15 15 233
47$ nl * | wc
857 3540 29899
48$ ls | nl
1 Makefile
2 RCS
3 Tests
4
...
15
But you just want the count, not the list, so you would need to append | tail -1 | cut -d' ' -f1.
Or, to be precise, since nl pads its numbers with leading spaces and follows them with a tab: ls | nl | tail -1 | tr -s ' ' ' ' | tr "\t" ' ' | cut -d' ' -f2
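The same idea can be sketched with awk doing the whitespace splitting in place of tr and cut. The helper name count_via_nl is hypothetical, and the result is only meaningful for a non-empty directory.

```shell
# Hypothetical helper: take the last line number that nl assigns to the listing.
# awk splits on runs of blanks and tabs, so the padding in nl's output is harmless.
# For an empty directory this prints nothing at all.
count_via_nl() {
    ls -- "${1:-.}" | nl | tail -n 1 | awk '{ print $1 }'
}
```

This is only a curiosity next to ls | wc -l, which gives the same number with one process fewer.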
--
TTTATCGGTCGTTATATAGATGTTTGCA
| [reply] [d/l] |
|
Hm, I think with this glob the shell will first try to put all the filenames on your command line. If you have a lot of files, the maximum size of a command line will be exceeded (this maximum can be pretty large on *nix machines, but the OP intended to count lots of files...).
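One way around the limit is to let a tool read the directory itself, so the shell never expands a glob. A minimal sketch, assuming a find that supports -mindepth and -maxdepth (GNU and BSD find do); the helper name count_with_find is hypothetical. On most systems, getconf ARG_MAX reports the limit in question.

```shell
# Hypothetical helper: count the non-hidden entries of a directory without
# expanding a glob, so no long argument list is ever built.
# Assumes find supports -mindepth/-maxdepth (GNU and BSD find do).
# Filenames that contain newlines would be counted more than once.
count_with_find() {
    find "${1:-.}" -mindepth 1 -maxdepth 1 ! -name '.*' | wc -l | tr -d ' '
}
```

The ! -name '.*' test mirrors what the glob * would have matched, since * skips dotfiles.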
| [reply] |
Re: Using Perl saves time....
by fergal (Chaplain) on Jul 18, 2005 at 10:40 UTC
|
Why did you use -l for the ls? This certainly made it slower because it had to stat every file it found. Also, ls could be aliased in your shell to something funky (like one that uses different colours for different file types), so for a fair comparison do
/bin/ls | wc -l
| [reply] [d/l] |
|
In Using Perl saves time....:
ls -1 | wc
In Re: Using Perl saves time....:
Why did you use -l for the ls?
I think your computer is not showing you the difference between a digit one and the letter ell. The original post is an (unnecessary) digit one. You're complaining rightly about the expense of a letter ell long listing (if that was indeed the case).
It's unnecessary because ls has two behaviors, depending on whether the output is a terminal or not (something I count as being broken, but oh well). To a terminal, it columnizes, but to a pipe or file, it's automatically one element per line (classic mode). Thank the idiots at Bezerkley for this abomination. This leads people to believe that they need to add "-1" to get one column, when in fact that's usually not necessary.
As an example, compare "ls" with "ls | cat".
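A small sketch of that comparison as a check, on the assumption that command substitution counts as a pipe rather than a terminal. The function name same_when_piped is hypothetical.

```shell
# Hypothetical check: with stdout connected to a pipe instead of a terminal,
# plain ls and ls -1 should print the same one-name-per-line listing.
same_when_piped() {
    [ "$(ls -- "${1:-.}")" = "$(ls -1 -- "${1:-.}")" ]
}

if same_when_piped .; then
    echo "ls and ls -1 agree when piped"
fi
```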
| [reply] [d/l] |
|
He used -1, which gives you one item per line, not -l, which stats the files.
Of course, under most OSes, when you pipe ls into something, it implies ls -1, so it streams faster, as opposed to trying to determine how many columns to build. (I've not used AIX, so it's possible that it doesn't do this).
I would, however, recommend using the -l flag to wc, so that it only needs to count the lines, and not the words and characters as well.
And most UFS systems choke hard on ls when you have too many items in a directory. I saw a poorly set up system that was rolling its log files each minute (it might've been one per transaction). The backups were failing because it took more than 24 hours to generate the listing of the directory, which had more than 2 million entries, so the next backup would start running before the first one had finished.
| [reply] [d/l] [select] |
Re: Using Perl saves time....
by sh1tn (Priest) on Jul 19, 2005 at 08:08 UTC
|
localhost:/usr/share/doc$ time perl -e 'print scalar(()=glob"*"),$/'
1164
real 0m0.045s
user 0m0.040s
sys 0m0.000s
| [reply] [d/l] |
|
$ time perl -e 'print scalar(()=glob"*"),$/'
136631
real 0m4.810s
user 0m3.340s
sys 0m1.320s
$ time perl -le 'opendir f, $ARGV[0] or die $!;++$c while readdir f; print $c' .
136633
real 0m0.440s
user 0m0.400s
sys 0m0.040s
-- Murray Barton Do not seek to follow in the footsteps of the wise. Seek what they sought. -Basho
| [reply] [d/l] |
|
You are right - glob is slower than readdir.
My point is the short one-liner. Otherwise:
# time ls | wc -l
99999
real 0m1.850s
user 0m1.590s
sys 0m0.210s
# time perl -e '@_ = glob"*";print$#_'
99998
real 0m1.462s
user 0m0.880s
sys 0m0.570s
# the best alternative:
# time perl -MIO::Dir -e '@_ = IO::Dir->new(".")->read;print$#_'
100000
real 0m0.680s
user 0m0.570s
sys 0m0.100s
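The small differences between the counts in these timings are not errors: glob "*" skips dotfiles, readdir returns every entry including . and .., and $#_ is the last index of the array, one less than the number of elements. A minimal sketch that makes the first two visible side by side, assuming perl is on the PATH; the function names count_glob and count_readdir are hypothetical.

```shell
# Hypothetical helpers wrapping the two Perl approaches from the thread.
# Both take a directory argument and default to the current directory.

# glob "*" expands like the shell does: sorted, and without dotfiles.
count_glob() {
    perl -le '
        chdir $ARGV[0] or die "chdir $ARGV[0]: $!";
        my @names = glob "*";
        print scalar @names;
    ' "${1:-.}"
}

# readdir returns raw directory entries: unsorted, with dotfiles,
# and with the "." and ".." entries included.
count_readdir() {
    perl -le '
        opendir(my $dh, $ARGV[0]) or die "opendir $ARGV[0]: $!";
        my @entries = readdir $dh;
        closedir $dh;
        print scalar @entries;
    ' "${1:-.}"
}
```

In a directory with no dotfiles, count_readdir reports exactly two more than count_glob, which matches the 136631 and 136633 shown earlier in the thread.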
| [reply] [d/l] |
Re: Using Perl saves time....
by greenFox (Vicar) on Jul 19, 2005 at 01:57 UTC
|
I thought this should be a one liner. My first thought was-
perl -le "@l = <*>; print scalar @l"
which of course is slooow so I borrowed from your script-
perl -le 'opendir f, $ARGV[0] or die $!;++$c while readdir f; print $c' .
I am sure the golfers could slim it down some more :)
-- Murray Barton Do not seek to follow in the footsteps of the wise. Seek what they sought. -Basho
| [reply] [d/l] [select] |
|
perl -e '++$c for glob "*";print $c'
Seems to be pretty quick, too.
<-radiant.matrix->
Larry Wall is Yoda: there is no try{} (ok, except in Perl6; way to ruin a joke, Larry! ;P)
The Code that can be seen is not the true Code
"In any sufficiently large group of people, most are idiots" - Kaa's Law
| [reply] [d/l] |