PerlMonks  

Re^2: Netflix (or on handling large amounts of data efficiently in perl)

by Garp (Acolyte)
on Dec 24, 2008 at 22:50 UTC ( [id://732521] )


in reply to Re: Netflix (or on handling large amounts of data efficiently in perl)
in thread Netflix (or on handling large amounts of data efficiently in perl)

Thanks for your suggestions. I'm currently trying to push the data out into a BerkeleyDB, having spent a few hours this morning trying to get an understanding of bdb usage. I gave up trying to understand MLDBM with bdb for now; the documentation on CPAN just got a bit weird. I found great resources on using DB_File, but sadly ActivePerl hasn't managed to get that into its repository so far (I really miss having a *nix box around when it comes to this stuff!)

Vec? Urgh, more time wading through perldoc ahead. It's a great technical resource, but half of it can be a pain for anyone not from a comp-sci or C++ programming background!

Replies are listed 'Best First'.
Re^3: Netflix (or on handling large amounts of data efficiently in perl)
by tilly (Archbishop) on Dec 24, 2008 at 23:24 UTC
    Random tip. Try http://strawberryperl.com/ and see if it lessens the pain of Windows.

    A more technical tip. Try sorting your data and using a btree format for your data. With a hash you do a lot of seeking to disk, and seeks to disk are slow. 1/200th of a second per seek may not sound like a lot, but try doing 100 million of them and you will take the better part of a week. But a btree loaded and accessed in close to sorted order does lots of streaming to/from disk and that is quite fast. (And a merge sort streams data very well.)
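
    To put a number on "the better part of a week", here is a quick back-of-the-envelope sketch using only the two figures quoted above (200 seeks per second, 100 million lookups). The variable names are made up for illustration, and the seek rate is the rough figure from the post, not a measurement of any particular disk.

```perl
use strict;
use warnings;

# Figures taken from the post above; both are rough assumptions.
my $seeks_per_second = 200;            # i.e. 1/200th of a second per seek
my $lookups          = 100_000_000;    # one disk seek per random hash lookup

# Total time if every lookup costs one seek.
my $total_seconds = $lookups / $seeks_per_second;
my $total_days    = $total_seconds / 86_400;    # 86,400 seconds in a day

printf "%d lookups at %d seeks/s: %d seconds, about %.1f days\n",
    $lookups, $seeks_per_second, $total_seconds, $total_days;
```

    That comes to 500,000 seconds, a little under six days, spent on seeking alone. Sorting the keys first does not make any single seek faster; it just means neighbouring keys tend to land in the same btree page, so far fewer seeks are needed.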
