PerlMonks  

Re^2: 4k read buffer is too small

by voeckler (Sexton)
on Jun 17, 2008 at 03:59 UTC ( [id://692430]=note )


in reply to Re: 4k read buffer is too small
in thread 4k read buffer is too small

I wanted to know the number of read(2) calls, so I used

strace -e read perl ...

Each of these reads hits the kernel's VFS, as they go from userland to kernelland. According to the admins, each read will incur an NFS request to the server. Too many simultaneous requests will topple the server. Fewer NFS requests, as generated by larger buffered reads, are friendlier to the server, even if they do not necessarily speed up my program.

Re^3: 4k read buffer is too small
by starbolin (Hermit) on Jun 17, 2008 at 16:44 UTC

    I think your admins are lying to you. The NFS block size is determined when you issue mount to tie the NFS driver into your file system. Just by coincidence, the default block size is also 4k. It is the NFS block size, not the application's IO block size, that determines when and how much data is requested from the server. See your system's mount manual page.

    After doing just a tiny bit of reading and a little bit of testing on my system I'm convinced that modifying perl's block size would be a wasted effort. It would not change the size of the NFS requests to the server.


    s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}
      I'm convinced that modifying perl's block size would be a wasted effort. It would not change the size of the NFS requests to the server.

      I disagree with your judgement. Please refer to the mtab string I posted further down. The rsize mount option is 32k, and was all along.

      read and rsize match:
      If the read buffer matches the NFS rsize mount option, then each read will incur 1 NFS request to the server with maximum possible efficiency.
      read less than rsize:
      A read with a buffer smaller than the rsize option cannot wait until the NFS buffer is filled - the NFS client will issue a request with however little was requested, unless the data is already in the read-ahead buffer. Again, each read will incur 1 NFS request, albeit less efficiently, because the header-to-data ratio increases (more overhead).
      read larger than rsize:
      A read buffer larger than the rsize option will be split by your friendly VFS layer into however many NFS requests it takes to fulfill using rsize-sized chunks.

      In my concrete example, this means that the 4k buffer employed by PerlIO will issue more NFS requests than a 32k buffer that matches my rsize mount option. Thus, increasing the buffer size in PerlIO is not wasted effort. Rather, any increase to 32k or beyond would result in a factor of 8 fewer NFS requests. As far as I understood my admins, it is the sheer number of small NFS requests that slows the server to a crawl.
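The arithmetic behind that factor of 8 can be sketched in a few lines of Perl. This is only a back-of-the-envelope model built on the three cases listed above: it assumes a purely sequential read and that neither read-ahead nor client-side caching merges any requests. The function name `nfs_requests` and the 1 MiB file size are made up for illustration.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Estimate the number of NFS READ requests for a sequential pass over a file.
# Model (an assumption, see above): every read(2) costs at least one request,
# and a read larger than rsize is split into rsize-sized chunks.
sub nfs_requests {
    my ($file_size, $read_size, $rsize) = @_;

    my $full_reads = int($file_size / $read_size);  # reads that fill the buffer
    my $remainder  = $file_size % $read_size;       # short read at end of file

    my $total = $full_reads * ceil($read_size / $rsize);
    $total += ceil($remainder / $rsize) if $remainder;
    return $total;
}

my $file_size = 1024 * 1024;   # hypothetical 1 MiB file
my $rsize     = 32 * 1024;     # rsize from the mount options in this thread

for my $read_size (4096, 32768, 65536) {
    printf "read buffer %6d bytes -> %4d NFS requests\n",
        $read_size, nfs_requests($file_size, $read_size, $rsize);
}
```

Under these assumptions the 4k buffer needs 256 requests for the file, while 32k and 64k buffers both need 32, which is where the factor of 8 comes from. Read-ahead on a real client can change the picture, so the numbers are an upper bound for the small-buffer case, not a measurement.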

      Careful testing will reveal the ideal buffer size. But both the admins and I expect it to be at 32k or larger, not at 4k.

        From your quoted text:

        "...unless the data is already in the read-ahead buffer."
        So what's happening here is that your perl script is faster than the NFS fill (not surprising) and so is emptying out the buffer on each read. Subsequent reads are issued before the read-ahead has time to fill the buffer. At least in a perfect world this is what 'may' be happening, unless your NFS client is broken somehow. Try putting a sleep in your script and look at iostat to see if the NFS read sizes increase. I had actually thought earlier of putting a sleep in the code to ease the burden on the network, but dismissed it as not addressing the NFS driver buffering issue. From the text provided, though, it seems that your driver implementation over-responds to multiple requests and so might respond to a slowing of the request rate.
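A minimal sketch of that sleep experiment, in case it helps. The sub name `read_throttled`, its parameters, and the callback are all hypothetical; the pause length is something you would have to tune while watching iostat. It uses the ordinary buffered `read`, so PerlIO still decides the size of the underlying read(2) calls. Only the pacing changes.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);   # fractional-second sleep

# Read $path sequentially in $chunk-byte pieces, pausing $pause seconds
# after each piece so the NFS client's read-ahead gets a chance to fill.
# $on_chunk is an optional callback that receives each piece.
# Returns the total number of bytes read.
sub read_throttled {
    my ($path, $chunk, $pause, $on_chunk) = @_;

    open(my $fh, '<', $path) or die "open $path: $!";
    binmode($fh);

    my $total = 0;
    while (1) {
        my $n = read($fh, my $buf, $chunk);
        die "read $path: $!" unless defined $n;
        last if $n == 0;                      # end of file
        $total += $n;
        $on_chunk->($buf) if $on_chunk;
        sleep($pause) if $pause > 0;
    }
    close($fh);
    return $total;
}
```

A call such as `read_throttled($file, 4096, 0.01)` would read in 4k pieces with a 10 ms pause between them. Whether that actually raises the NFS read sizes is exactly what the experiment is meant to find out.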

        I don't recall if you've stated anywhere whether your script reads the file sequentially. If so, have you tried something like:

        cat "filename" | script
        In order to buffer the reads. ( You may need an enhanced version of cat to avoid memory thrashing with this method.)

        Should you still decide to pursue modification of perl, it would seem that this is the kind of thing for which IO layers were implemented. One would make a copy of the appropriate layer, change the name, then modify the buffer size. Switching to a larger read buffer would then simply require a use statement in your code.
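A custom layer is beyond a short example, so here is a different and simpler technique for comparison, not the layer approach itself: `sysread` bypasses PerlIO's buffer, and each call asks the kernel for up to the requested number of bytes. The sub name `slurp_in_chunks` and the `CHUNK` constant are invented for this sketch, and 32k is simply assumed to match the rsize quoted in this thread.

```perl
use strict;
use warnings;

use constant CHUNK => 32 * 1024;   # assumption: matches rsize=32768

# Read a whole file with unbuffered sysread calls of $chunk bytes each.
# Returns the file contents and the number of non-empty sysread calls made.
sub slurp_in_chunks {
    my ($path, $chunk) = @_;
    $chunk ||= CHUNK;

    open(my $fh, '<:raw', $path) or die "open $path: $!";

    my ($data, $calls) = ('', 0);
    while (1) {
        my $n = sysread($fh, my $buf, $chunk);
        die "sysread $path: $!" unless defined $n;
        last if $n == 0;           # end of file
        $calls++;
        $data .= $buf;
    }
    close($fh);
    return ($data, $calls);
}
```

Do not mix `sysread` with buffered calls such as `read`, `readline`, or `eof` on the same handle. Running the script under `strace -e read` should show whether the read(2) sizes now match the chunk size.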


        s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}
