Re^2: 4k read buffer is too small

in reply to Re: 4k read buffer is too small
in thread 4k read buffer is too small

If you're only reading the file from beginning to end, another useful trick is to write a small program to read files in whatever blocksize you need (for example with sysread) and write them to standard output; then you can run that program and pipe its output to your actual program, which can read from the pipe in 4KB blocks without affecting how the NFS server is accessed. If you need to seek around this won't work, but sometimes it can be helpful.

Yes, I strongly agree with this trick. My office neighbor suggested the same workaround: our nodes have between 2 and 8 CPUs each, but the actual computation usually occupies only one of them, so a spare CPU can do the reading. CPU cycles are cheap!
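The trick quoted above can be sketched roughly as follows. This is a minimal sketch in Python rather than Perl, where `os.read` plays the role of `sysread` (one unbuffered read per call). The 1 MiB block size, the function name `copy_in_blocks`, and the script name in the usage comment are assumptions for illustration, not anything from the thread.

```python
import os
import sys

BLOCK_SIZE = 1024 * 1024  # assumed value: 1 MiB per read; tune to taste


def copy_in_blocks(src_fd, dst_fd, block_size=BLOCK_SIZE):
    """Copy src_fd to dst_fd using unbuffered reads of up to block_size bytes.

    Returns the total number of bytes copied.
    """
    total = 0
    while True:
        # One read(2) call per iteration, analogous to Perl's sysread.
        chunk = os.read(src_fd, block_size)
        if not chunk:
            break  # end of file
        view = memoryview(chunk)
        while len(view) > 0:
            # os.write may write fewer bytes than requested (e.g. on a pipe).
            written = os.write(dst_fd, view)
            view = view[written:]
        total += len(chunk)
    return total


if __name__ == "__main__":
    # Hypothetical usage: python bigread.py /auto/XXX-01/somefile | ./actual_program
    for path in sys.argv[1:]:
        fd = os.open(path, os.O_RDONLY)
        try:
            copy_in_blocks(fd, sys.stdout.fileno())
        finally:
            os.close(fd)
```

The downstream program can then read from the pipe in whatever block size it likes; only this helper touches the NFS mount. As noted above, this only helps for strictly sequential reads.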

As for the NFS client tuning, I will convey the message, but I suspect that the admins already did quite a bit of tuning. After all, our directory requests are served from a different physical machine than the data blocks. Myself, I don't have god privileges on any of the machines.

XXX:/export/samfs-XXX01 /auto/XXX-01 nfs rw,nosuid,noatime,rsize=32768,wsize=32768,timeo=15,retrans=7,tcp,intr,noquota,rsize=32768,wsize=32768,addr=10.125.0.8 0 0
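As an aside, the option values in the entry above can be checked mechanically. Here is a minimal Python sketch that splits the line and reports `rsize`; the function name `parse_mount_entry` is made up for illustration, and the entry is the one quoted above with the wrapped lines joined. It only reads the configured string and says nothing about what the client actually negotiates with the server.

```python
def parse_mount_entry(line):
    """Split an fstab/mtab-style line into its fields and an options dict."""
    device, mount_point, fs_type, options, *rest = line.split()
    parsed = {}
    for opt in options.split(","):
        key, _, value = opt.partition("=")
        # Flags such as "tcp" have no value; a repeated key overwrites the earlier one.
        parsed[key] = value if value else True
    return {
        "device": device,
        "mount_point": mount_point,
        "fs_type": fs_type,
        "options": parsed,
    }


entry = parse_mount_entry(
    "XXX:/export/samfs-XXX01 /auto/XXX-01 nfs rw,nosuid,noatime,rsize=32768,"
    "wsize=32768,timeo=15,retrans=7,tcp,intr,noquota,rsize=32768,wsize=32768,"
    "addr=10.125.0.8 0 0"
)
rsize = int(entry["options"]["rsize"])
print(f"rsize = {rsize} bytes ({rsize // 1024} KB)")  # prints "rsize = 32768 bytes (32 KB)"
```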

The readahead sounds intriguing. How would it work, if 200 clients tried to read the same file, though slightly offset in start time? Wouldn't read-ahead aggravate the server load in this case?


Replies are listed 'Best First'.
Re^3: 4k read buffer is too small
by sgifford (Prior) on Jun 17, 2008 at 04:55 UTC

XXX:/export/samfs-XXX01 /auto/XXX-01 nfs rw,nosuid,noatime,rsize=32768,wsize=32768,timeo=15,retrans=7,tcp,intr,noquota,rsize=32768,wsize=32768,addr=10.125.0.8 0 0

Interesting, that should be reading in 32KB blocks. You would still see 4K blocks with `strace`, though, which might be throwing off your analysis. Try seeing if the output of `nfsstat` or `tcpdump` matches what you'd expect from `strace`. If you find that it actually is reading in larger blocks, your sysadmins can try increasing `rsize` further.

Also, I seem to recall that you need NFSv3 to read blocks larger than 16K, so if you're not getting the full 32K you are asking for, you might want to look at that.

The readahead sounds intriguing. How would it work, if 200 clients tried to read the same file, though slightly offset in start time? Wouldn't read-ahead aggravate the server load in this case?

I'm not familiar with the internals of the Linux NFS code, but generally readahead will write into the buffer cache, and then client requests will be read from there. As long as it doesn't run out of memory it should do the right thing in the scenario you describe.

--
sgifford's Web page
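To make the buffer-cache argument concrete, here is a toy model in Python. It is only a sketch under loud assumptions: all readers share a single cache, each reader consumes one block per time step, and eviction is simple oldest-first. The function name `simulate` and all parameters are hypothetical, and none of this reflects how the Linux NFS code actually works.

```python
from collections import OrderedDict


def simulate(num_readers, num_blocks, readahead, stagger=1, cache_blocks=None):
    """Count how many blocks the backing store has to serve in a toy model.

    Reader i starts i * stagger steps late and reads blocks 0..num_blocks-1
    in order. On a cache miss, the block and the next `readahead` blocks are
    fetched. cache_blocks=None means the shared cache never evicts anything.
    """
    cache = OrderedDict()  # block index -> None, oldest entry first
    fetches = 0
    total_steps = num_blocks + (num_readers - 1) * stagger
    for step in range(total_steps):
        for reader in range(num_readers):
            block = step - reader * stagger
            if not 0 <= block < num_blocks:
                continue  # this reader has not started yet, or has finished
            if block in cache:
                continue  # cache hit: no load on the backing store
            for b in range(block, min(block + 1 + readahead, num_blocks)):
                if b not in cache:
                    cache[b] = None
                    fetches += 1
                    if cache_blocks is not None and len(cache) > cache_blocks:
                        cache.popitem(last=False)  # evict the oldest block
    return fetches


print(simulate(1, 1000, readahead=8))                     # one reader, unbounded cache
print(simulate(200, 1000, readahead=8))                   # 200 staggered readers, unbounded cache
print(simulate(200, 1000, readahead=8, cache_blocks=16))  # same, but a tiny cache
```

In this model the two unbounded runs fetch each of the 1000 blocks exactly once, no matter how many readers there are, while the tiny cache forces refetches. That matches the "as long as it doesn't run out of memory" caveat, at least in this simplified picture.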


In Section Seekers of Perl Wisdom