http://qs321.pair.com?node_id=1067679

saintmike has asked for the wisdom of the Perl Monks concerning the following question:

I have a program that opens a Unix pipe (via pipe()), spawns a child process, and sends over a line of characters, terminated by a newline. The spawned child then reads the line from the pipe via IO::Handle->getline().

This works in 99% of all cases, but under load and with long lines I'm seeing sporadic failures (I haven't been able to reproduce them in a standalone script): IO::Handle->getline() does not return everything that has been stuffed into the pipe up to the newline, but stops at 73,728 bytes, which seems to be the pipe's capacity (google the number).

Are there situations where getline() (which seems to map to a scalar read from <fh>) stops before a newline even though there is no EOF either, only a momentary pipe clog?

I'm using IO::Handle 1.28 and perl 5.14.2. Were there fixes for this issue that I overlooked when I briefly viewed the core changes?

Re: IO::Handle->getline() partial reads from pipe
by oiskuu (Hermit) on Dec 18, 2013 at 21:09 UTC

    Have you tried any diagnostics? Check $! after getline? Interrupts shouldn't be the issue (Using Perl's readline...).

    Are you sure the problem is not on the *sending* side? How do you write the line? If syswrite, did you check return value? If print, did you set autoflush? Do you close the pipe?
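
A minimal sketch of the reader-side diagnostics suggested above, assuming a blocking reader handle. `read_line_checked` is a hypothetical helper name, not part of IO::Handle. It clears $! before the call, uses the handle's error() method to tell a clean EOF from a read error, and reports whether the returned data actually ends in a newline.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: read one line and classify the outcome.
# Returns ($line, $status), where $status is one of
# 'line', 'partial', 'eof' or 'error: <message>'.
sub read_line_checked {
    my ($fh) = @_;
    local $! = 0;                   # avoid misreading a stale errno
    my $line = $fh->getline;
    my $err  = "$!";                # capture errno text right away
    if (!defined $line) {
        return (undef, $fh->error() ? "error: $err" : 'eof');
    }
    # Data without a trailing newline is either the last line before
    # EOF or the kind of short read described in the question.
    return ($line, $line =~ /\n\z/ ? 'line' : 'partial');
}
```

Logging the status together with length($line) on the production reader would show whether the short reads come back as 'partial' data or as an error.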

Re: IO::Handle->getline() partial reads from pipe
by choroba (Cardinal) on Dec 18, 2013 at 22:38 UTC
    Buffered IO does not work with sockets, maybe pipes are a similar beast? I was having similar problems in A Game Using TCP Sockets.
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Well, buffered IO does work with sockets, you just have to take into account all of the corners of such use :)

      In this particular case one would need to see the actual code and analyze the data flow, for example with strace. In simple tests, getline works correctly with pipes: it blocks until it sees either EOF or the next "\n", and it does not return before one of those arrives. It is a completely different story when the reading end of the pipe is in non-blocking mode. There getline may produce incomplete lines, and reading in general requires a lot of care.
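
Since non-blocking mode is the one case named here in which getline can hand back incomplete lines, a cheap check is to ask the handle which mode it is in. A minimal sketch, assuming the reader is an IO::Handle; `ensure_blocking` is a hypothetical helper name. It relies on IO::Handle's blocking() method, which returns the current setting when called without an argument and undef on error.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: report whether the handle was in blocking mode,
# and switch it back to blocking before getline() is used on it.
# Returns 1 if it was already blocking, 0 if it had to be switched.
sub ensure_blocking {
    my ($fh) = @_;
    my $was_blocking = $fh->blocking();        # query only, no change
    defined($was_blocking) or die "cannot query blocking mode: $!";
    if (!$was_blocking) {
        warn "handle was in non-blocking mode; switching to blocking\n";
        defined($fh->blocking(1)) or die "cannot set blocking mode: $!";
    }
    return $was_blocking ? 1 : 0;
}
```

Calling this in the child right before the first getline() would either rule the non-blocking explanation out or leave a warning in the log.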

Re: IO::Handle->getline() partial reads from pipe
by wazat (Monk) on Dec 19, 2013 at 06:14 UTC

    I used to know something about pipes, but that was a long time ago. At the kernel level a pipe has a maximum size (once upon a time I think 16K may have been typical). If your writes are larger than that, then at the system level the child may get at most that maximum amount per read. I would think that getline() should buffer the read and read again until it has data with a newline. It sounds like this isn't happening in your case. I'd expect that kind of behaviour from sysread(), but not from getline().

    I'm curious if the long line is split between getline() calls. If it was you could check if the returned line was newline terminated. If it wasn't terminated then call getline() again and concatenate the data yourself. (I know this is UGLY.)
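
The re-read idea could be sketched roughly as follows. This is only a workaround under the assumption that the missing data arrives on a later getline() call; `getline_until_newline` is a hypothetical helper name, and $fh can be anything with a getline() method, such as an IO::Handle.

```perl
use strict;
use warnings;

# Hypothetical workaround helper: keep calling getline() until the
# accumulated data ends in "\n" or getline() returns undef (EOF or error).
# In list context it also returns how many getline() calls produced data,
# which is worth logging: a count above 1 means the line was split.
sub getline_until_newline {
    my ($fh) = @_;
    my ($line, $pieces) = (undef, 0);
    while (defined(my $chunk = $fh->getline)) {
        $line .= $chunk;
        $pieces++;
        last if $line =~ /\n\z/;    # complete line, stop here
    }
    return wantarray ? ($line, $pieces) : $line;
}
```

A result that does not end in "\n" still means EOF or an error was hit first, so the caller should keep checking for that.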

    You might also see if flushing the write side makes any difference. I wouldn't expect it to help but it may be worth a try.

    I am ASSUMING that your situation is as simple as described---a single parent writes and a single child reads, no other interaction between the two processes.

      I'm curious if the long line is split between getline() calls. If it was you could check if the returned line was newline terminated. If it wasn't terminated then call getline() again and concatenate the data yourself. (I know this is UGLY.)
      Yeah, I was curious about that as well, but haven't tried re-reading it because I think that'd be a bug in getline() and should be fixed there. getline() should always read until a newline (or EOF), and neither seems to be the case (unless the kernel erroneously sends EOF for some reason).

      The problem is that I can't reproduce it in a standalone script, it only happens randomly on a busy production server, so my options in debugging this are limited.

      You might also see if flushing the write side makes any difference. I wouldn't expect it to help but it may be worth a try.
      That one I've tried, using printflush() instead of print on the IO::Handle of the writer makes no difference.

      And yes, it's as simple as described, a single writer and a single reader.

        I suspect I'm of little help, but I'll throw out some dumb ideas.

        You could try a series of partial writes of a long line to see if that reproduces the problem.

        Is there any CRLF or UTF8 processing at play?

        Are you able to test using the exact production data?

        Is the failure a matter of a truncated line, or premature end of file?

        And did you test printflush() return value? The print, printflush, flush, close methods all return true for success.
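
A rough standalone test along these lines, combining the chunked writes with checked return values, might look like the sketch below. `roundtrip_length` and the second result pipe are assumptions made for illustration, not the original program. The parent writes one long line in pieces and dies if printflush() or close() reports failure; the child reads with getline() and reports the length it received.

```perl
use strict;
use warnings;
use IO::Handle;
use POSIX ();

# Hypothetical test helper: send $len bytes plus "\n" to a child in
# $chunk-sized writes and return the length the child's getline() saw
# (-1 if getline() returned undef).
sub roundtrip_length {
    my ($len, $chunk) = @_;
    pipe(my $r, my $w)         or die "pipe: $!";   # parent -> child data
    pipe(my $res_r, my $res_w) or die "pipe: $!";   # child -> parent result

    my $pid = fork();
    defined($pid) or die "fork: $!";
    if ($pid == 0) {                                # child: reader
        close $w;
        close $res_r;
        my $line = $r->getline;
        my $seen = defined($line) ? length($line) : -1;
        print {$res_w} "$seen\n";
        close $res_w;
        POSIX::_exit(0);                            # skip parent's END blocks
    }

    close $r;                                       # parent: writer
    close $res_w;
    my $payload = ('x' x $len) . "\n";
    for (my $off = 0; $off < length($payload); $off += $chunk) {
        $w->printflush(substr($payload, $off, $chunk))
            or die "printflush failed at offset $off: $!";
    }
    close $w or die "close failed: $!";
    my $seen = <$res_r>;
    waitpid($pid, 0);
    chomp $seen;
    return $seen;
}
```

With a blocking reader, roundtrip_length(200_000, 4096) would be expected to return 200001; a smaller number would reproduce the truncation outside production.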

Re: IO::Handle->getline() partial reads from pipe
by vsespb (Chaplain) on Dec 22, 2013 at 16:04 UTC
    You need to at least check "$!". Under 5.14, readline/print can be interrupted by signals: https://rt.perl.org/Ticket/Display.html?id=119097