PerlMonks  

When a machine starts using virtual memory, its performance drops substantially.

Either the machine has been using virtual memory since before it finished booting, or it is running in "real mode" and thus probably not running a modern, multi-user operating system. That may sound like nit-picking, since you probably meant something more like "using paging space", but even that doesn't make much sense to me (and the rest of what you said makes the distinction more than a nit-picking correction). "Making heavy use of paging space" would be more accurate, or just "paging heavily".

But, yes, running out of physical memory can make a system extremely inefficient: it has a hard time getting much of anything done and may even have a hard time recovering. I remember that our ancient VMS system had a kernel trap for this: if it was spending more time futzing around trying to figure out how to get the next thing done than it was spending actually doing things, it would just give up, flush buffers, and reboot.

Therefore if you have an expensive server, think about getting extra RAM and disabling swap. Or even just disabling swap.

I've never seen that option. I guess it makes sense for it to exist given that something like Linux is able to run on tiny systems lacking an appropriate resource to hold paging space.

By contrast if that machine had no virtual memory,

I think you mean "had no paging space" (a.k.a. "swap space", but I try not to say "swap" when I mean "paging", and "swap space" is mostly used for paging, not for swapping entire processes out of memory). A modern multi-user operating system, with its protections, is built around virtual memory, so I doubt you are running without virtual memory, just with virtual memory that is not allowed to grow larger than physical memory.

the failures are much more obvious. Plus there is a good chance that the offending memory hog will die fast, and the server is likely to be able to continue doing everything else it is supposed to do.

My experience is that running out of virtual memory means that there will likely be something somewhere other than the "one hog" that runs into a case of malloc() or realloc() failing. And my experience is that it is extremely rare for code to be written to deal well with malloc() or realloc() failing. So if we have a system that has run out of virtual memory, then we schedule it for a reboot ASAP. Often, the corruption caused by the virtual memory exhaustion isn't obvious in the short term so most often the system appears to continue on rather normally. But in the cases when the reboot wasn't done, eventually something about the system became flaky.

AIX had an interesting take on this problem. It would notice that it was coming close to running out of memory and so would pick a process to kill based on some heuristics that I don't recall ever seeing documented. (It also didn't care how much virtual memory a process allocated, just how much virtual memory the process used, thus malloc() would never fail.) My experience was that AIX's heuristics in this area almost always picked the wrong victim. I don't know if that is an indication of the problem being significantly harder than it might at first appear or if it is just IBM implementing something stupid (likely a combination).

Sad to say, but it is too easy to write C code that doesn't bother to check whether malloc() returned NULL. So having one or more processes have an internal failure at random, some of them silently, sounds much worse to me than AIX's idea of having one process die obviously and in a relatively controlled manner. And I recall AIX's solution not being well liked.
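For what it is worth, here is a minimal sketch of the kind of checking I mean. The int_buf type and its functions are made up for illustration and come from no real library. The point is only that the result of realloc() goes into a temporary first, so a NULL return neither gets dereferenced nor loses the old block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of ints, used only for illustration. */
struct int_buf {
    int   *data;
    size_t len;
    size_t cap;
};

/* Append one value. Returns 0 on success, -1 if memory could not be
 * obtained. On failure the buffer is left unchanged and still usable. */
int int_buf_push(struct int_buf *buf, int value)
{
    assert(buf != NULL);

    if (buf->len == buf->cap) {
        size_t new_cap = buf->cap ? buf->cap * 2 : 8;

        /* Refuse sizes that would overflow size_t. */
        if (new_cap < buf->cap || new_cap > SIZE_MAX / sizeof *buf->data)
            return -1;

        /* Keep the old pointer until realloc() has succeeded. Assigning
         * the result straight to buf->data would leak the old block and
         * leave a NULL pointer behind when realloc() fails. */
        int *grown = realloc(buf->data, new_cap * sizeof *buf->data);
        if (grown == NULL)
            return -1;

        buf->data = grown;
        buf->cap  = new_cap;
    }

    buf->data[buf->len++] = value;
    return 0;
}

/* Release the storage and reset the buffer to its empty state. */
void int_buf_free(struct int_buf *buf)
{
    assert(buf != NULL);
    free(buf->data);
    buf->data = NULL;
    buf->len  = 0;
    buf->cap  = 0;
}
```

A caller then has to decide what a -1 means for it (log and shed the request, exit cleanly, whatever fits), which is exactly the part that tends not to get written.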

So I advise caution to anyone planning on taking your advice.

Unfortunately, I have not seen a magic bullet for dealing with misbehaving memory hogs. And my experience says that this approach isn't a magic bullet, either. Our arsenal of weapons against this problem comprises such various and diverse elements as testing, load boxing, monitoring, post mortem analysis, isolation, ... The problem gets really hard when your "production systems" are the ones used by a bunch of users and programmers. The best single tool I've seen there (from the side-lines) is a daemon that suspends any process that makes a nuisance of itself and notifies the "watchers" so they can intervene.
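A rough sketch of how such a suspending daemon might be structured, and not a description of the tool I saw: everything here (the proc_sample struct, find_hogs, suspend_and_notify, the idea of a fixed resident-size limit) is an assumption made for illustration. Only kill() and SIGSTOP are real POSIX. How the samples get collected is platform-specific and left out.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical sample of one process, filled in by platform-specific
 * code that is not shown here. */
struct proc_sample {
    pid_t  pid;
    size_t resident_kb;
};

/* Policy: copy into `out` the pids whose resident size exceeds
 * limit_kb. `out` must have room for n entries. Returns the count. */
size_t find_hogs(const struct proc_sample *procs, size_t n,
                 size_t limit_kb, pid_t *out)
{
    assert(out != NULL);
    assert(procs != NULL || n == 0);

    size_t found = 0;
    for (size_t i = 0; i < n; i++) {
        if (procs[i].resident_kb > limit_kb)
            out[found++] = procs[i].pid;
    }
    return found;
}

/* Action: stop the process instead of killing it, then tell a human.
 * A stopped process keeps its state and can be resumed with SIGCONT
 * once somebody has had a look. Returns 0 on success, -1 on error. */
int suspend_and_notify(pid_t pid)
{
    if (kill(pid, SIGSTOP) != 0) {
        perror("kill(SIGSTOP)");
        return -1;
    }
    fprintf(stderr, "watchdog: suspended pid %ld; resume with SIGCONT\n",
            (long)pid);
    return 0;
}
```

Keeping the policy separate from the action means the "who is a nuisance" rule can be tested and tuned without signalling anything. A real rule would surely need more than one threshold.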

- tye        


In reply to Re^2: RFC: Abusing "virtual" memory (failure) by tye
in thread RFC: Abusing "virtual" memory by sundialsvc4
