PerlMonks  

Re: write hash to disk after memory limit

by FloydATC (Deacon)
on Mar 13, 2015 at 21:59 UTC


in reply to write hash to disk after memory limit

Swapping is the very act of writing to disk after the physical memory limit has been reached, is it not? When choosing what chunks of memory to swap out, the operating system will usually pick those that have not been in recent use. Unless you can write a significantly smarter algorithm, I'd expect the performance to be worse if you try to swap manually.
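As a rough illustration of the "not in recent use" policy described above, here is a toy least-recently-used pager, sketched in Python. The class name `LRUPager`, its `capacity` parameter and the page labels are made up for this example, and real kernels generally use cheaper approximations of LRU, so treat it as a model of the idea rather than of any particular operating system.

```python
from collections import OrderedDict


class LRUPager:
    """Toy model: at most `capacity` pages stay "in RAM"; the rest are "swapped"."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.ram = OrderedDict()  # page -> True, ordered oldest use first
        self.swap = set()         # pages that have been written out

    def access(self, page):
        if page in self.ram:
            # Recently used: move it to the "young" end of the queue.
            self.ram.move_to_end(page)
            return
        # Page fault: bring the page in (from swap, if it was there).
        self.swap.discard(page)
        self.ram[page] = True
        if len(self.ram) > self.capacity:
            # Evict whichever page has gone unused the longest.
            victim, _ = self.ram.popitem(last=False)
            self.swap.add(victim)


# Hypothetical access pattern with room for two pages.
pager = LRUPager(capacity=2)
for page in ["A", "B", "A", "C"]:
    pager.access(page)
```

Page "B" is the one swapped out here, because "A" was touched again before "C" arrived. A hand-rolled scheme only wins if it can predict future accesses better than this kind of recency guess.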

Only if you can make better guesses on what chunks of data you won't be needing any time soon will you be able to outperform the memory manager. But then, if you knew you wouldn't be needing parts of the data in memory, you probably wouldn't have bothered placing it there to begin with, right?

If the data set was 10 times larger, maybe I'd spend some time trying to come up with a completely different approach. Today, for a 17 GB data structure I'd seriously consider just buying more RAM so I could get back to work.

The sad truth is, one day wasted on writing, testing and debugging clever code costs far more than a 16 GB stick these days.

-- FloydATC

Time flies when you don't know what you're doing

Re^2: write hash to disk after memory limit
by BrowserUk (Patriarch) on Mar 13, 2015 at 22:34 UTC
    The sad truth is, one day wasted on writing, testing and debugging clever code costs far more than a 16 GB stick these days.

    True. But only if the hardware is capable of accommodating it.

    Now the choice is to upgrade the motherboard to one that can accommodate the "extra stick"; but that usually means also upgrading the CPU, because later motherboards that can handle more memory have different, later CPU sockets. So now we're looking at anything from 3 to 10 times the price.

    But, does the version of the OS we're using support that new hardware? Does it have drivers available for everything? Does the new hardware still support the legacy ports and drivers needed for the other processes that run on the same box?

    Is the, now required, OS upgrade covered by the current license? Is it approved by your company/organisation? What are the costs involved in that upgrade? How many other processes will need to be compatibility tested with it? How long will the integration/testing/approval process take and how much will it cost?

    What if this process is run concurrently on a cluster -- 16 to 32 machines -- or a farm -- 100s or 1000s of machines? How much does that "extra stick" cost now?

    So, sod the upgrade, farm it out to AWS. Fine, but what are the security and legal implications of doing so? Is the data in whole or in part identifiable as customer data? Can a European company legally transmit customer data to US (sited or controlled) servers? How much will the test case in the European Court of Human Rights cost?

    Or, maybe we could just do some bit-twiddling(TM) and compress the data representation some, and avoid the whole issue. At least until we get an ECoHR hearing date.
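
    One possible reading of "compress the data representation", sketched in Python. The thread does not say what the records look like, so this assumes, purely for illustration, that each record is a pair of unsigned integers that each fit in 16 bits; `pack_pair`, `unpack_pair` and the sample records are hypothetical.

```python
from array import array


def pack_pair(a, b):
    """Pack two unsigned 16-bit values into a single 32-bit integer."""
    if not (0 <= a < 1 << 16 and 0 <= b < 1 << 16):
        raise ValueError("both values must fit in 16 bits")
    return (a << 16) | b


def unpack_pair(packed):
    """Recover the two 16-bit values from a packed integer."""
    return packed >> 16, packed & 0xFFFF


# Hypothetical records; a list of tuples costs far more per record
# than one fixed-width machine integer.
records = [(1, 2), (300, 4000), (65535, 0)]

# array("L") stores raw unsigned integers (at least 4 bytes each)
# instead of one full Python object per value.
packed = array("L", (pack_pair(a, b) for a, b in records))

restored = [unpack_pair(p) for p in packed]
```

    In Perl the same idea is usually expressed with `pack`, `unpack` or `vec` on a plain string rather than a nested hash.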


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

      I agree, sometimes it's not as easy as simply throwing a stick of memory at the problem. I'm just saying it's usually worth considering before you start reinventing fundamental parts of the operating system.

      Sometimes, even a forklift upgrade of the entire data center can be the most sensible thing to do.

      -- FloydATC

      Time flies when you don't know what you're doing

Re^2: write hash to disk after memory limit
by LanX (Saint) on Mar 14, 2015 at 00:24 UTC
    > Today, for a 17 GB data structure I'd seriously consider just buying more RAM so I could get back to work.

    And when his laboratory gets expanded to output 170GB he's supposed to run and buy 10 times more RAM?

    Clever algorithms pay off by scaling silently without causing such troubles.

    Only counting the day you spend designing is a miscalculation...

    Look at the code he showed us and how just re-sorting the dimensions of his data structure will reduce any swapping dramatically.
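
    The code referred to is not reproduced in this thread, so the following Python sketch only illustrates the general idea under an assumption: a two-level structure keyed `[outer][inner]` whose processing loop actually walks it by the second key. `swap_dimensions` and the sample data are hypothetical.

```python
from collections import defaultdict


def swap_dimensions(nested):
    """Turn nested[outer][inner] into result[inner][outer]."""
    swapped = defaultdict(dict)
    for outer, inner_map in nested.items():
        for inner, value in inner_map.items():
            swapped[inner][outer] = value
    return dict(swapped)


# Hypothetical data keyed the "wrong" way round for a loop that
# processes one inner key ("x", then "y") at a time.
by_outer = {"a": {"x": 1, "y": 2}, "b": {"x": 3}}
by_inner = swap_dimensions(by_outer)

# Each pass now stays inside one small inner mapping instead of
# touching every outer entry, which is what keeps the working set small.
for inner, row in by_inner.items():
    total = sum(row.values())
```

    In practice the structure would be built in the better key order from the start rather than converted afterwards; the conversion is shown only to make the two layouts easy to compare.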

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)

    PS: Je suis Charlie!

      And when his laboratory gets expanded to output 170GB he's supposed to run and buy 10 times more RAM?

      Like I said, if the data set was 10 times bigger...

      I don't disagree with any of what you say, it's just that having a working data set of 17 GB that simply isn't suitable for anything other than keeping it all in RAM is not unheard of in this day and age.

      Assuming for a moment that this isn't a problem that needs to scale to an entire datacenter, and that we're not talking about reprogramming a deep space probe launched 20 years ago, can also help reduce the need to throw man-hours at the problem.

      If it turns out that in this particular case the data set wasn't really 17 GB after all but only expanded to this size as it was read into memory, that's great :-)

      I was merely trying to illustrate why replacing OS swapping with home baked swapping would probably not be worth the effort.

      -- FloydATC

      Time flies when you don't know what you're doing

        Well yes you are right - somehow...

        ... let me try to explain:

        For me it's obvious that this is an XY Problem.

        So it's like the advice to take a taxi if there's no more fuel in the tank - which is of course reasonable!

        But I have the feeling the driver keeps the motor running overnight, so taking a taxi wouldn't solve his real problem.

        Anyway we most likely will never fully know the real problem ... :)

        Cheers Rolf
        (addicted to the Perl Programming Language and ☆☆☆☆ :)

        PS: Je suis Charlie!
