PerlMonks
Re^2: Iterating through Two Arrays. Is there a better use of memory?

by Jeri (Scribe)
on Oct 13, 2011 at 17:10 UTC ( [id://931328] )


in reply to Re: Iterating through Two Arrays. Is there a better use of memory?
in thread Iterating through Two Arrays. Is there a better use of memory?

I'm wary of hashes because of some previous work I've done and some of the points made in this article:

HashMemoryBlog

Especially where it says "Perl's hashes are inefficient. You need A LOT of memory if you intend to hash tens of millions of keys."

This is a concern for me because the datasets I'm working with are really, really large.

And I meant to say, 1,000 elements in each array.
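
For concreteness, here is a minimal sketch of the hash-based lookup being discussed, assuming two arrays of about 1,000 strings each (the sample data below is made up):

    use strict;
    use warnings;

    # Hypothetical sample data: two arrays of roughly 1,000 strings each.
    my @first  = map { "item_$_" } 1 .. 1000;
    my @second = map { "item_$_" } 500 .. 1500;

    # Build a lookup hash from the first array (one pass, O(n) memory),
    # then test each element of the second array in O(1) instead of
    # rescanning the first array for every element (O(n*m)).
    my %in_first = map { $_ => 1 } @first;
    my @common   = grep { exists $in_first{$_} } @second;

    printf "%d elements appear in both arrays\n", scalar @common;

At 1,000 keys the lookup hash is tiny; as the replies below note, the article's memory warning only starts to bite at several orders of magnitude more keys.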


Re^3: Iterating through Two Arrays. Is there a better use of memory?
by davido (Cardinal) on Oct 13, 2011 at 17:30 UTC

    "Tens of millions of keys" sounds like a big problem, but it's not a problem that's unique to hashes. If there's the potential of having tens of millions of units of anything (stored in arrays, hashes, whatever), then there's the potential for even more. And suddenly we're talking about a solution that just won't scale well.

    It seems that you're getting deep into the database domain. The data set is growing (or will grow) to a point where holding it all in memory at once becomes a bad design decision.

    How did we get from 1,000 elements in the original post to tens of millions? Just because some article points out the obvious (that holding tens of millions of items in memory at once isn't a good solution) doesn't mean that holding thousands is a problem.
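
    If the data ever really does outgrow RAM, one classic Perl approach is to tie the hash to an on-disk file so lookups keep ordinary hash syntax. A minimal sketch using DB_File, which is bundled with many Perl builds and requires Berkeley DB (the filename here is hypothetical):

        use strict;
        use warnings;
        use Fcntl;     # for O_CREAT / O_RDWR
        use DB_File;   # ties a hash to a Berkeley DB file on disk

        # 'lookup.db' is a made-up filename for this example.
        tie my %lookup, 'DB_File', 'lookup.db', O_CREAT | O_RDWR, 0644, $DB_HASH
            or die "Cannot tie lookup.db: $!";

        # Reads and writes go through to disk, so the key set can be far
        # larger than available RAM, at the cost of slower access.
        $lookup{'some_key'} = 1;
        print "present\n" if exists $lookup{'some_key'};

        untie %lookup;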


    Dave

Re^3: Iterating through Two Arrays. Is there a better use of memory?
by DentArthurDent (Monk) on Oct 13, 2011 at 17:29 UTC
    Back in the day, I made a traveling salesman problem solver which used Perl hashes to reduce the search space. The script started using gigabytes of memory once you got above a few million hash keys. If you're dealing with thousands of items, I wouldn't worry unless you're in a low-memory environment.
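    If you want to check the footprint for your own key counts before committing, the CPAN module Devel::Size can report a structure's total size. A rough sketch:

        use strict;
        use warnings;
        use Devel::Size qw(total_size);   # CPAN module, not core

        # Build a hash with $n keys and report how much memory it holds.
        my $n = 100_000;
        my %h = map { "key_$_" => 1 } 1 .. $n;

        printf "%d keys: %.1f MB\n", $n, total_size(\%h) / (1024 * 1024);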
    My mission: To boldly split infinitives that have never been split before!
      Great! I will use a hash!
