PerlMonks  

Re (tilly) 3: I'm falling asleep here

by tilly (Archbishop)
on Oct 21, 2001 at 06:34 UTC


in reply to Re: Re (tilly) 1: I'm falling asleep here
in thread -s takes too long on 15,000 files

The example code was running on Windows, not Unix.

However, your other points are valid. I admit that I am only guessing at why the Windows dir command runs so much faster than the simple Perl shown.

But one note. It may be that your filenames don't divide well based on the first few characters. (So one directory ends up with a ton of files while the rest stay nearly empty.) In that case the above scheme can be improved by first taking an MD5 hash of the filename, and then placing files into directory locations depending on the leading characters of the MD5 hash.

(-:At which point your on-disk data storage is starting to be a frozen form of efficient data structures you might learn about in an algorithms course...:-)
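The MD5 bucketing idea above might look roughly like the following. This is only a minimal sketch, written in Python for brevity rather than Perl; the function name `shard_path` and the `levels` and `width` parameters are made up for illustration and do not come from the thread.

```python
import hashlib
import posixpath


def shard_path(filename, levels=2, width=2):
    """Return a relative path that buckets `filename` by its MD5 hex digest.

    `levels` and `width` are hypothetical knobs: how many nested
    directories to use, and how many hex characters name each one.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # Slice the leading hex characters of the digest into directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    # posixpath keeps the separator as "/" regardless of platform.
    return posixpath.join(*parts, filename)


print(shard_path("abc"))  # → 90/01/abc
```

With two levels of two hex characters each there are 65,536 possible buckets, and because MD5 output is close to uniform, the files spread evenly no matter how similar their names are.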

Replies are listed 'Best First'.
Re: (tilly) I'm falling asleep here
by ishk0 (Acolyte) on Oct 21, 2001 at 07:36 UTC
    If only it was that simple... it just needs to be all in one directory for another program that parses them (that I didn't write).

    Like I said, ls on Linux takes about 10 seconds for all those files, so it still leaves me guessing.

    I'm also thinking MD5 is a bit of overkill for 15,000 rather short filenames; a simple rotating hash would be quicker and more efficient.

    =)
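For comparison, here is a minimal sketch of one common form of rotating hash. The reply does not say which variant was meant, so the 4-bit rotation, the 32-bit width, and the names `rotating_hash` and `buckets` are all assumptions made for illustration. As with any such scheme, it only helps when the files are allowed to live in more than one directory.

```python
def rotating_hash(name, buckets=256):
    """Map `name` to a bucket number in range(buckets).

    Assumed variant: rotate a 32-bit state left by 4 bits,
    then XOR in the next byte of the name.
    """
    h = 0
    for byte in name.encode("utf-8"):
        # Rotate left by 4 within 32 bits, then mix in the byte.
        h = ((h << 4) | (h >> 28)) & 0xFFFFFFFF
        h ^= byte
    return h % buckets


print(rotating_hash("ab"))  # → 114
```

This is cheaper per filename than MD5, but it mixes far less thoroughly, so names that differ only slightly may land in nearby or identical buckets.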
