http://qs321.pair.com?node_id=653115


in reply to Re: RFC: Abusing "virtual" memory
in thread RFC: Abusing "virtual" memory

Page faults are not caused when memory is allocated, but when memory is used. And allocated memory is not "commit"ted; it will be swapped out and back in as the OS sees fit, so re-allocating that space can certainly cause plenty of page faults, even if no "new requests to the OS" are made directly.

Excessive page faults causing a system to be very slow happens when there isn't enough physical memory for all of the running processes to have the things they need resident at the same time. Playing games with when memory is allocated is unlikely to make a big difference in such a scenario in my experience.

No transitions from user mode to kernel mode. One page fault.

That seems to nicely sum up your confusion. Page faults happen as interrupts, not as calls from user mode into kernel mode. You cannot prevent page faults by avoiding calling into the kernel. Just about any time you access any byte of memory, a page fault could happen. Even allocating a big chunk of memory with a single call into the OS will surely cause more than one page fault.
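
To put a rough number on that, here is a small sketch (mine, not from either post; Win32-only, linked against psapi.lib) that watches the process page-fault counter while a freshly allocated region is touched one page at a time:

#include <windows.h>
#include <psapi.h>
#include <stdio.h>

/* Page-fault count for this process (includes soft faults). */
static DWORD faults(void)
{
    PROCESS_MEMORY_COUNTERS pmc;
    pmc.cb = sizeof(pmc);
    GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc));
    return pmc.PageFaultCount;
}

int main(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    SIZE_T size = 256 * (SIZE_T)si.dwPageSize;

    DWORD before = faults();
    /* One call into the kernel reserves and commits the whole region ... */
    char *p = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (!p)
        return 1;
    printf("after VirtualAlloc:       +%lu faults\n",
           (unsigned long)(faults() - before));

    /* ... but the first touch of each page can still raise a (soft) fault. */
    before = faults();
    for (SIZE_T i = 0; i < size; i += si.dwPageSize)
        p[i] = 1;
    printf("after touching %lu pages: +%lu faults\n",
           (unsigned long)(size / si.dwPageSize),
           (unsigned long)(faults() - before));

    VirtualFree(p, 0, MEM_RELEASE);
    return 0;
}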

- tye        


Re^3: RFC: Abusing "virtual" memory (page faults vs malloc)
by BrowserUk (Patriarch) on Nov 28, 2007 at 01:15 UTC

    There are (under Win32) two situations in which a page fault can occur.

    1. The first is when an attempt is made to access a reserved and committed page that has been relegated to the system pagefile (or other backing file) due to overall system memory exhaustion.

      This is the scenario you have based your post upon.

    2. The second is when an attempt is made to access a page of memory that has been reserved, but not yet committed.

      Win32 memory management schemes frequently use VirtualAlloc( baseAddr, someLargeSize, MEM_RESERVE, PAGE_READWRITE|PAGE_WRITECOPY|PAGE_GUARD ) to reserve a large address space for the process, but without actually allocating physical memory or pagefile space.

      This memory can then be managed using the Heap* API functions, which will automatically commit previously reserved pages on demand.

      Once reserved memory has been committed, it is also possible to MEM_RESET that memory. This indicates to the OS that the pages in question are no longer being used, and so need not be written to the pagefile when their physical pages are to be reused for other processes; the virtual pages are not decommitted, because they will be reused at a later point. (A minimal sketch of this reserve-and-commit pattern appears after this list.)

      A quote from the documentation spells this out more clearly:

      The HeapCreate function creates a private heap object from which the calling process can allocate memory blocks by using the HeapAlloc function. HeapCreate specifies both an initial size and a maximum size for the heap. The initial size determines the number of committed, read/write pages initially allocated for the heap. The maximum size determines the total number of reserved pages. These pages create a contiguous block in the virtual address space of a process into which the heap can grow. Additional pages are automatically committed from this reserved space if requests by HeapAlloc exceed the current size of committed pages, assuming that the physical storage for it is available. Once the pages are committed, they are not decommitted until the process is terminated or until the heap is destroyed by calling the HeapDestroy function.

      So you see, by preallocating a large chunk of memory and then freeing it back to the heap, the pages committed by that large allocation remain committed (backed by physical memory and/or pagefile space) but are returned to the heap manager for reallocation. They are therefore already committed (to the process/physical memory/pagefile) yet free to satisfy subsequent calls to HeapAlloc. This means that new calls to HeapAlloc can be satisfied in user mode, with no transition to kernel mode needed to commit new pages to the process.
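
    For concreteness, a minimal sketch of that reserve-then-commit pattern (mine, not an extract from the Perl sources; Win32-only):

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        SYSTEM_INFO si;
        GetSystemInfo(&si);

        /* Reserve 64MB of address space: no physical memory or pagefile
         * space is charged yet, and the pages cannot be touched. */
        SIZE_T reserveSize = 64 * 1024 * 1024;
        char *base = VirtualAlloc(NULL, reserveSize, MEM_RESERVE, PAGE_NOACCESS);
        if (!base)
            return 1;

        /* Later, commit only the pages actually needed (here, 16 of them).
         * This is the call that charges commit and makes them usable. */
        SIZE_T commitSize = 16 * (SIZE_T)si.dwPageSize;
        char *p = VirtualAlloc(base, commitSize, MEM_COMMIT, PAGE_READWRITE);
        if (!p)
            return 1;

        p[0] = 42;   /* first touch is now legal; the rest of the reservation is not */
        printf("reserved %lu bytes, committed %lu bytes at %p\n",
               (unsigned long)reserveSize, (unsigned long)commitSize, (void *)base);

        VirtualFree(base, 0, MEM_RELEASE);   /* release the whole reservation */
        return 0;
    }

    The Heap* functions described in the quote above perform this commit step on the caller's behalf.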

    This extract from Win32.c is one example of the manipulations Perl does with MEM_RESERVE and MEM_COMMIT:

    static char *committed = NULL; /* XXX threadead */
    static char *base      = NULL; /* XXX threadead */
    static char *reserved  = NULL; /* XXX threadead */
    static char *brk       = NULL; /* XXX threadead */
    static DWORD pagesize  = 0;    /* XXX threadead */

    void *
    sbrk(ptrdiff_t need)
    {
        void *result;
        if (!pagesize) {
            SYSTEM_INFO info;
            GetSystemInfo(&info);
            /* Pretend page size is larger so we don't perpetually
             * call the OS to commit just one page ... */
            pagesize = info.dwPageSize << 3;
        }
        if (brk+need >= reserved) {
            DWORD size = brk+need-reserved;
            char *addr;
            char *prev_committed = NULL;
            if (committed && reserved && committed < reserved) {
                /* Commit last of previous chunk cannot span allocations */
                addr = (char *) VirtualAlloc(committed,reserved-committed,MEM_COMMIT,PAGE_READWRITE);
                if (addr) {
                    /* Remember where we committed from in case we want to decommit later */
                    prev_committed = committed;
                    committed = reserved;
                }
            }
            /* Reserve some (more) space
             * Contiguous blocks give us greater efficiency, so reserve big blocks -
             * this is only address space not memory...
             * Note this is a little sneaky, 1st call passes NULL as reserved
             * so lets system choose where we start, subsequent calls pass
             * the old end address so ask for a contiguous block */
        sbrk_reserve:
            if (size < 64*1024*1024)
                size = 64*1024*1024;
            size = ((size + pagesize - 1) / pagesize) * pagesize;
            addr = (char *) VirtualAlloc(reserved,size,MEM_RESERVE,PAGE_NOACCESS);
            if (addr) {
                reserved = addr+size;
                if (!base)
                    base = addr;
                if (!committed)
                    committed = base;
                if (!brk)
                    brk = committed;
            }
            else if (reserved) {
                /* The existing block could not be extended far enough, so decommit
                 * anything that was just committed above and start anew */
                if (prev_committed) {
                    if (!VirtualFree(prev_committed,reserved-prev_committed,MEM_DECOMMIT))
                        return (void *) -1;
                }
                reserved = base = committed = brk = NULL;
                size = need;
                goto sbrk_reserve;
            }
            else {
                return (void *) -1;
            }
        }
        result = brk;
        brk += need;
        if (brk > committed) {
            DWORD size = ((brk-committed + pagesize -1)/pagesize) * pagesize;
            char *addr;
            if (committed+size > reserved)
                size = reserved-committed;
            addr = (char *) VirtualAlloc(committed,size,MEM_COMMIT,PAGE_READWRITE);
            if (addr)
                committed += size;
            else
                return (void *) -1;
        }
        return result;
    }

    And here are some references to the Heap* calls from vmem.h:

    #define WALKHEAP() WalkHeap(0)
    #define WALKHEAPTRACE() WalkHeap(1)

     * HeapRec - a list of all non-contiguous heap areas
    const int maxHeaps = 32; /* 64 was overkill */

     * Use VirtualAlloc() for blocks bigger than nMaxHeapAllocSize since
    const int nMaxHeapAllocSize = (1024*512); /* don't allocate anything larger than this from the heap */

    int HeapAdd(void* ptr, size_t size
    BOOL bRet = (NULL != (m_hHeap = HeapCreate(HEAP_NO_SERIALIZE,
    ASSERT(HeapValidate(m_hHeap, HEAP_NO_SERIALIZE, NULL));
    BOOL bRet = HeapDestroy(m_hHeap);
    HeapFree(m_hHeap, HEAP_NO_SERIALIZE, m_heaps[index].base);
    ptr = HeapReAlloc(m_hHeap, HEAP_REALLOC_IN_PLACE_ONLY|HEAP_NO_SERIALIZE,
    HeapAdd(((char*)ptr) + m_heaps[m_nHeaps-1].len, size
    ptr = HeapAlloc(m_hHeap, HEAP_NO_SERIALIZE, size);
    if (HeapAdd(ptr, size)) {
    if (HeapAdd(ptr, size, bBigBlock)) {
    HeapAdd(ptr, size);
    int VMem::HeapAdd(void* p, size_t size
    void VMem::WalkHeap(int complete)
    MemoryUsageMessage("VMem heaps used %d. Total memory %08x\n", m_nHeaps, total, 0);
    ASSERT(HeapValidate(m_hHeap, HEAP_NO_SERIALIZE, ptr));

    So yes. Even accesses to committed memory can cause a page fault in low-memory situations. But when you know you are about to allocate a large number of small pieces of memory--as when filling a large hash or array for the first time--preallocating a large chunk in a single allocation and then freeing it means that the space is committed as the result of a single page fault and then returned to the heap, from which the many small allocations can be satisfied without further page faults.
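
    As a sketch of that trick (mine, not the test code used later in this thread; Win32-only, linked against psapi.lib; the memset that touches the big block up front is my own addition):

    #include <windows.h>
    #include <psapi.h>
    #include <stdio.h>
    #include <string.h>

    /* Page-fault count for this process (includes soft faults). */
    static DWORD faults(void)
    {
        PROCESS_MEMORY_COUNTERS pmc;
        pmc.cb = sizeof(pmc);
        GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc));
        return pmc.PageFaultCount;
    }

    int main(void)
    {
        HANDLE heap = HeapCreate(0, 0, 0);        /* growable private heap */
        if (!heap)
            return 1;

        DWORD before = faults();
        void *big = HeapAlloc(heap, 0, 1024 * 1024);
        if (!big)
            return 1;
        memset(big, 0, 1024 * 1024);              /* force the pages in up front */
        printf("big alloc + touch:  +%lu faults\n", (unsigned long)(faults() - before));

        /* Per the documentation quoted above, freeing the block does not
         * decommit its pages; they stay with the heap for reuse. */
        HeapFree(heap, 0, big);

        before = faults();
        for (int i = 0; i < 10000; ++i)           /* many small allocations */
            (void)HeapAlloc(heap, 0, 64);
        printf("10000 small allocs: +%lu faults\n", (unsigned long)(faults() - before));

        HeapDestroy(heap);
        return 0;
    }

    Whether that second count really stays near zero is exactly what the rest of this thread goes on to measure.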

    What say you now of my "confusion"?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      I won't try to interpret a load of documentation and decide how much of it does or doesn't apply to how Perl does things, etc. A trivial test shows your conclusion is wrong:

      #!/usr/bin/perl
      sleep 10;
      $#a= 100;
      sleep 10;
      $#b= 1000;
      sleep 10;
      $#c= 10000;
      sleep 10;
      $#d= 100000;
      sleep 10;
      $#e= 1000000;
      sleep 10;
      $#f= 10000000;
      sleep 10;

      Fire up Task Manager then launch the above script. Find "perl.exe" and look at the "Page Faults" column (add it if it isn't there). My results were:

       1981
       1984  +3
       1986  +2
       1996  +10
       2095  +99
       3077  +982
      12865  +9788

      So pre-allocating to 10-times the size requires 10-times as many page faults. Not 1.

      The documentation supports your use of the word "committed", but I stand by my claim that pages are not "committed" such that using them won't incur a page fault. It appears (from my experiment) that even the first use of a freshly "committed" page incurs a "page fault" in order to make the page available. I realized that this initial "page fault" might be different, such as between different pools, etc., but I didn't want to get into obscure details that only apply to one platform, so I stayed rather vague on that particular point to be safe. But testing shows that, for my Perl on my Win32, the number of page faults for extending an array is proportional to how much you extend it.

      So yes. Even accesses to committed memory can cause a page fault in low-memory situations.

      No, I was not in a "low memory situation" here, IMHO. My system was relatively idle. I re-ran the scenario with almost nothing else running and got similar results (643 646 647 657 755 1734 11519).

      - tye        

        This displays the page fault count before and after allocating a 40960-byte chunk of memory from a heap, then freeing it and allocating it again.

        #! perl -slw
        use strict;
        use Inline C => Config => LIBS => '-lpsapi.lib';
        use Inline C => 'DATA', NAME => 'heap', CLEAN_AFTER_BUILD => 0;

        my $heap = heapCreate( 0, 0, 1024 * 1024 );

        my $space = heapAlloc( $heap, 0, 4096 * 10 );
        heapFree( $heap, 0, $space );

        $space = heapAlloc( $heap, 0, 4096 * 10 );
        print heapSize( $heap, 0, $space );

        __DATA__
        __C__

        #include <windows.h>
        #include <psapi.h>

        U32 heapCreate( U32 flags, U32 initial, int max ) {
            return (U32) HeapCreate( flags, initial, max );
        }

        U32 heapAlloc( U32 hHeap, U32 flags, U32 size ) {
            U32 pMem;
            PROCESS_MEMORY_COUNTERS pmc;
            pmc.cb = sizeof( PROCESS_MEMORY_COUNTERS );

            GetProcessMemoryInfo(
                GetCurrentProcess(), &pmc, sizeof( PROCESS_MEMORY_COUNTERS )
            );
            printf( "pagefaults before alloc of %d bytes: %d\n",
                    size, pmc.PageFaultCount );

            pMem = (U32)HeapAlloc( (HANDLE)hHeap, flags, (SIZE_T)size );

            GetProcessMemoryInfo(
                GetCurrentProcess(), &pmc, sizeof( PROCESS_MEMORY_COUNTERS )
            );
            printf( "pagefaults after alloc of %d bytes: %d\n",
                    size, pmc.PageFaultCount );

            return pMem;
        }

        U32 heapSize( U32 hHeap, U32 flags, U32 mem ) {
            return (U32)HeapSize( (HANDLE)hHeap, flags, (LPVOID)mem );
        }

        U32 heapFree( U32 hHeap, U32 flags, U32 mem ) {
            return (U32)HeapFree( (HANDLE)hHeap, flags, (LPVOID)mem );
        }

        The output:

        c:\test>HeapMem.pl
        pagefaults before alloc of 40960 bytes: 952
        pagefaults after alloc of 40960 bytes: 964
        pagefaults before alloc of 40960 bytes: 964
        pagefaults after alloc of 40960 bytes: 964
        40960

        c:\test>HeapMem.pl
        pagefaults before alloc of 40960 bytes: 953
        pagefaults after alloc of 40960 bytes: 965
        pagefaults before alloc of 40960 bytes: 965
        pagefaults after alloc of 40960 bytes: 965
        40960

        c:\test>HeapMem.pl
        pagefaults before alloc of 40960 bytes: 953
        pagefaults after alloc of 40960 bytes: 965
        pagefaults before alloc of 40960 bytes: 965
        pagefaults after alloc of 40960 bytes: 965
        40960

        This shows that, the first time, allocating 10 pages of memory results in 12 page faults. The second time, none.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

        Side note for monks who, like me, are crap in a Win32 environment: You can add all sorts of different columns to the process list in task manager through the view->select columns menu.

        I didn't know that before this node and it took me a few minutes of right-clicking on things within the pane to figure it out.

      Great stuff!

      Now, how can we work together to hammer this article into a good reference/tutorial piece that will incorporate this valuable bit of information for the Win32-oriented Perl programmer?