Re: 4k read buffer is too small

by almut (Canon)
on Jun 16, 2008 at 21:52 UTC


in reply to 4k read buffer is too small

AFAIK, stdio buffering, as configurable via setvbuf, is incompatible with PerlIO's buffering, which is why it's disabled when you configure Perl to use PerlIO. OTOH, you most probably do want PerlIO, so configuring/rebuilding Perl not to use it isn't really an option.
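For comparison, this is roughly what the stdio-side knob looks like in plain C. It is only a sketch of setvbuf on a FILE stream and has nothing to do with PerlIO's internals; the function name count_bytes_buffered is made up for illustration. The buffer is passed in explicitly because some C libraries are free to ignore the size argument when the buffer pointer is NULL.

```c
#include <stdio.h>
#include <stdlib.h>

/* Count the bytes in a file, reading through a stdio stream whose
 * buffer has the caller-chosen size. Returns -1 on error.
 * (count_bytes_buffered is a hypothetical helper name.) */
long count_bytes_buffered(const char *path, size_t bufsize)
{
    if (bufsize == 0)
        return -1;

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* The buffer must stay alive until the stream is closed. */
    char *buf = malloc(bufsize);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }

    /* setvbuf() has to be called before any other operation on the
     * stream; _IOFBF requests full buffering. */
    if (setvbuf(fp, buf, _IOFBF, bufsize) != 0) {
        fclose(fp);
        free(buf);
        return -1;
    }

    long total = 0;
    while (fgetc(fp) != EOF)
        total++;

    fclose(fp);
    free(buf);   /* safe only after fclose() */
    return total;
}
```

Calling count_bytes_buffered(path, 8192) reads the file through an 8k stdio buffer; whether the underlying read calls actually use that size is up to the C library.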

Anyhow, a little digging around suggests that you can "configure" PerlIO's buffer size in the file perlio.c:

STDCHAR *
PerlIOBuf_get_base(pTHX_ PerlIO *f)
{
    PerlIOBuf * const b = PerlIOSelf(f, PerlIOBuf);
    PERL_UNUSED_CONTEXT;

    if (!b->buf) {
        if (!b->bufsiz)
            b->bufsiz = 4096;    /* <--- here */
        b->buf = Newxz(b->buf,b->bufsiz, STDCHAR);
        if (!b->buf) {
            b->buf = (STDCHAR *) & b->oneword;
            b->bufsiz = sizeof(b->oneword);
        }
        b->end = b->ptr = b->buf;
    }
    return b->buf;
}

At least, I changed that 4096 to 8192, recompiled perl (v5.10.0), and now strace reveals that read(2) is being called for blocks of size 8192, when you execute something like

open my $fh, "<", $^X or die;
while (<$fh>) { }

while before the change, read blocks were of size 4096.
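To illustrate what the block size means at the syscall level, here is a small C sketch (not taken from perl; count_read_calls is a made-up name) that reads a file with read(2) in blocks of a given size and counts the calls, which is essentially what strace shows you.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Read a whole file in blocks of `blocksize` bytes and return the
 * number of read(2) calls issued, including the final call that
 * reports EOF. Returns -1 on error.
 * (count_read_calls is a hypothetical helper name.) */
long count_read_calls(const char *path, size_t blocksize)
{
    if (blocksize == 0)
        return -1;

    char *block = malloc(blocksize);
    if (block == NULL)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(block);
        return -1;
    }

    long calls = 0;
    for (;;) {
        ssize_t n = read(fd, block, blocksize);
        if (n < 0 && errno == EINTR)
            continue;          /* interrupted before any data: retry */
        if (n < 0) {           /* real error */
            calls = -1;
            break;
        }
        calls++;
        if (n == 0)            /* end of file */
            break;
    }

    close(fd);
    free(block);
    return calls;
}
```

On a regular 10000-byte file this should need 4 calls with 4096-byte blocks (three with data, one for EOF) and 3 calls with 8192-byte blocks. Pipes and sockets may return short reads, so the counts there can differ.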

Other than that, I haven't done any testing yet. So, no guarantees whatsoever (!) that it'll work in every respect. It's just something to play with at your own risk. Good luck!

Re^2: 4k read buffer is too small
by voeckler (Sexton) on Jun 17, 2008 at 04:15 UTC

    Thank you, this sounds like what I was looking for. I was poking at the Perl code today. I will try this tomorrow.

    PS: Do you think the Perl gods will make a buffer setting function available again in PerlIO? After all, C has setvbuf and C++ has myistream.rdbuf()->pubsetbuf(buf,bufsize) to let the user override defaults, if he so chooses.

      Do you think the Perl gods will make a buffer setting function available again in PerlIO?

      I can't really speak for the Perl gods, but considering that the configurability of the buffer size currently is near the lowest conceivable level [1], I'd think that making it user-settable (à la setvbuf with stdio) isn't prioritized very high at the moment.

      You might want to bring the issue up on p5p, however, if you feel determined and are well prepared with good arguments :) I do remember having come across a related discussion (the last time I felt like I needed setvbuf myself), but unfortunately, I can't find it at the moment [2]. I recall I did sense some reluctance to change in the overall tone of the thread...

      ___

      [1] "configurability levels" that I could think of:

      • (1) hardcoded magic constant in the code
      • (2) macro/constant (system-dependent) automatically determined during configure
      • (3) compile-time configure option
      • (4) user-configurable global runtime option affecting all buffers (switch, env-var, magic Perl var, whatever)
      • (5) user-configurable runtime option per IO handle (like setvbuf)
      • (6) user-configurable runtime option per PerlIO layer
      • (7) like (6), but dynamically reconfigurable on open/unflushed handles
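      As a rough idea of what level (4) could look like, here is a C sketch that lets an environment variable override a hardcoded default. DEMO_BUFSIZ and both function names are invented for this example; as far as the code quoted above shows, PerlIO in 5.10.0 reads no such variable.

```c
#include <errno.h>
#include <stdlib.h>

/* Level (1): the hardcoded fallback value. */
#define DEMO_DEFAULT_BUFSIZ 4096u

/* Parse a buffer size given as a decimal string. Anything that is not
 * a plain positive decimal number yields the default.
 * (demo_parse_buffer_size is a hypothetical helper name.) */
size_t demo_parse_buffer_size(const char *value)
{
    /* Require a leading digit: this rejects NULL, "", signs and
     * leading whitespace, which strtoul() would otherwise accept. */
    if (value == NULL || value[0] < '0' || value[0] > '9')
        return DEMO_DEFAULT_BUFSIZ;

    char *end = NULL;
    errno = 0;
    unsigned long parsed = strtoul(value, &end, 10);

    /* Reject overflow, trailing garbage and zero. */
    if (errno != 0 || *end != '\0' || parsed == 0)
        return DEMO_DEFAULT_BUFSIZ;

    return (size_t)parsed;
}

/* Level (4): one global, user-settable value for all buffers.
 * DEMO_BUFSIZ is a hypothetical environment variable. */
size_t demo_buffer_size(void)
{
    return demo_parse_buffer_size(getenv("DEMO_BUFSIZ"));
}
```

      With something like this, the "if (!b->bufsiz) b->bufsiz = 4096;" style of default would become a call to demo_buffer_size(), at the cost of one more thing to document and support.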

      [2] Googling the p5p archives (i.e. 'setvbuf site:www.xray.mpe.mpg.de') doesn't produce any hits, although there are definitely some mentions of setvbuf there (presumably due to a restrictive robots.txt file).

        As always, p5p welcomes well-crafted patches more openly than suggestions without code. The people needing such a configuration point may have to be the ones to put in the effort before it would even be considered for inclusion.
Re^2: 4k read buffer is too small
by DrHyde (Prior) on Jun 17, 2008 at 10:08 UTC
    That code surprises me. I would have at least expected it to be equal to the page size. And that varies with architecture. On Alpha, for instance, it's 8K.
      It surprises me more that there's a magic number like that buried down in the core. It appears that you should be able to configure that in your own custom IO layer and set the size as big as you wish.

      ACK: I would have expected a getpagesize() call, since pages are often natural boundaries. Or at least a reference to the BUFSIZ that many stdio implementations define, which, after several indirections, comes to 8k on my x86_64 Linux.
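      If you want to check both candidates on your own system, a short C sketch like the following will do. It uses sysconf(_SC_PAGESIZE), the POSIX way to ask for the page size, rather than the older getpagesize(); the two function names are made up for this example.

```c
#include <stdio.h>
#include <unistd.h>

/* Return the system page size in bytes, or -1 if it cannot be
 * determined. (page_size_bytes is a hypothetical helper name.) */
long page_size_bytes(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* Print the two values that could serve as a less arbitrary default
 * than a hardcoded 4096: the page size and stdio's BUFSIZ. */
void print_buffer_size_candidates(void)
{
    printf("page size: %ld bytes\n", page_size_bytes());
    printf("BUFSIZ:    %d bytes\n", (int)BUFSIZ);
}
```

      Both values are system-dependent, so the output will differ between architectures and C libraries.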
