PerlMonks  

Re^2: Netflix (or on handling large amounts of data efficiently in perl)

by diotalevi (Canon)
on Dec 27, 2008 at 02:21 UTC ( [id://732737]=note )


in reply to Re: Netflix (or on handling large amounts of data efficiently in perl)
in thread Netflix (or on handling large amounts of data efficiently in perl)

I've recently posted a wrapper for Judy arrays (http://judy.sourceforge.net) to CPAN as Judy. It includes a sparse bit vector implementation, Judy1(3), which is available in Perl as Judy::1.

Updated: I've posted Compact and sparse bit vector, an example comparing a plain Perl bit vector with a Judy one.
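
For anyone who hasn't used the plain Perl side of that comparison, here is a minimal sketch of a bit vector built on the core vec function. The IDs are made up for illustration, and this is not the code from the linked node. The thing to notice is that the packed string has to grow to cover the highest index, however few bits are set, which is the cost a sparse structure like Judy1 is meant to avoid.

```perl
use strict;
use warnings;

# A plain-Perl bit vector: one bit per ID, stored in a packed string.
my $bits = '';

# Hypothetical IDs, for illustration only.
my @ids = (3, 17, 1_000_000);

# Set the bit for each ID; vec() extends the string as needed.
vec($bits, $_, 1) = 1 for @ids;

# Membership tests.
print "17 is set\n"     if vec($bits, 17, 1);
print "18 is not set\n" unless vec($bits, 18, 1);

# Population count: unpack's "%32b*" checksum counts the set bits.
my $count = unpack('%32b*', $bits);
print "bits set: $count\n";

# The string covers every index up to the highest one, set or not.
printf "bytes used: %d\n", length $bits;
```

Three set bits still cost a string long enough to reach bit 1,000,000, so the memory used tracks the largest ID rather than the number of IDs stored.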

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Replies are listed 'Best First'.
Re^3: Netflix (or on handling large amounts of data efficiently in perl)
by Garp (Acolyte) on Dec 29, 2008 at 10:31 UTC

    Let's see if I'm following your reasoning correctly.

    I'm essentially interested in three variables:
    $movieid
    $userid
    $rating

    Are you suggesting that I make a multi-dimensional Judy array of arrays? So for each movie create a Judy array using $userid as the index and $rating as the value, then put that into a Judy array as the value with $movieid as the index?

    Apologies if I'm stating the obvious, I wouldn't classify myself as a programmer.

    From a very, very rough test (I've not even gone back to confirm availability of data), this is looking very good indeed for memory consumption. Will do some further testing tomorrow.

      Sure, why not. tilly originally mentioned a bitmap, so I mentioned something cheaper in memory. You can build multidimensional Judy arrays. In particular, JudyHS is implemented as a nested set of JudyL arrays. I've posted a snippet at Dump JudyHS which demos dumping a JudyHS structure.

      For this to work, it's explicitly required that Judy arrays be nestable.
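
To make the shape of that nesting concrete, here is a rough sketch of the movie, then user, then rating layout described above. It uses ordinary Perl hashes as a stand-in for the nested Judy arrays, so it shows only the structure and none of the memory savings. The sample rows and the rating_for helper are hypothetical, and no Judy API is shown here.

```perl
use strict;
use warnings;

# Outer index: movie ID. Inner index: user ID. Value: rating.
# Plain hashes stand in for the nested Judy arrays here.
my %ratings;

# Hypothetical sample rows: (movieid, userid, rating).
my @rows = (
    [ 1, 1001, 4 ],
    [ 1, 1002, 5 ],
    [ 2, 1001, 3 ],
);

for my $row (@rows) {
    my ($movieid, $userid, $rating) = @$row;
    $ratings{$movieid}{$userid} = $rating;    # inner level autovivifies
}

# Look up one rating without autovivifying an entry for a missing movie.
sub rating_for {
    my ($movieid, $userid) = @_;
    my $by_user = $ratings{$movieid} or return;
    return $by_user->{$userid};
}

printf "movie 1, user 1002: %s\n", rating_for(1, 1002) // 'none';
printf "movie 2, user 1002: %s\n", rating_for(2, 1002) // 'none';
```

With Judy, each level would be a Judy array keyed by an integer ID instead of a hash, with the outer array's value referring to the inner array for that movie.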

      ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊
