PerlMonks  

HPUX corruption of file handle after fork

by Abe (Acolyte)
on Dec 10, 2007 at 14:40 UTC ( [id://656119]=perlquestion )

Abe has asked for the wisdom of the Perl Monks concerning the following question:

Dear Brothers of Logic -

I have a big sql file generated by a tool, which I want to scan into separate pieces of sql and execute in parallel. I open the file, loop, and fork upon each complete block, closing the file handle in the child.

I find the file handle is corrupted by the fork/child - I think the file position keeps flipping back, because I see the message to STDERR report early blocks repeatedly.

Here's the code:

    open $fh, '<', $ARGV[0] or die "Can't open $ARGV[0]: $!";
    while (<$fh>) {
        my $Sql ;
        my $ReportId ;
        if ( /SQL_REPORT_ID (\d+)/ ) {
            $ReportId = $1;
            while (<$fh>) {
                last if (/END SQL_REPORT_ID/) ;
                $Sql .= "$_";
            }
            my $pid = fork ;
            die "Can't fork: $!" unless defined $pid;
            if ($pid) {
                $Pids{$pid} = 1;
            }
            else {
                close $fh or die "Can't close $fh";
                #print "\n\n\n\n\n\n\n\n\n\n\n---------$ReportId-------------\n\n$Sql\n";
                print STDERR "Generated $ReportId\n";
                exit(0);
            }
        }
    }
    ..etc.

It doesn't matter whether I use a named file handle (FH) or the scalar you see, nor whether I close $fh in the child.

IT WORKS OFF STDIN!

I have tried sysopen() too - no good.

The likeliest thing is I've done something daft and fail to see it, and you monks'll spot it.

Otherwise I wonder if it's to do with HPUX specifically, and I've scoured the web for something about this, but see nothing. It works on Windows, but I don't think that's very significant, as that wraps fork() somehow.

I can work round this - fine, but I think it's essential to understand it, or we can't use fork(), and that'd be a problem for us (we can't be worrying about only using fork() if we've no files open.)
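One possible workaround, sketched here as an assumption rather than as the approach Abe has in mind, is to read every block into memory and close the file before the first fork, so that no child ever inherits the open handle. The helper name read_reports and the temporary test file are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: collect [report id, SQL text] pairs from an open handle.
sub read_reports {
    my ($fh) = @_;
    my @reports;
    while (my $line = <$fh>) {
        next unless $line =~ /SQL_REPORT_ID (\d+)/;
        my ($id, $sql) = ($1, '');
        while (defined($line = <$fh>)) {
            last if $line =~ /END SQL_REPORT_ID/;
            $sql .= $line;
        }
        push @reports, [$id, $sql];
    }
    return @reports;
}

# Made-up input in the same shape as the SQL file described above.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} map { "SQL_REPORT_ID $_\nfoo $_\nEND SQL_REPORT_ID\n" } 1 .. 4;
close $tmp or die "Can't close $path: $!";

open my $fh, '<', $path or die "Can't open $path: $!";
my @reports = read_reports($fh);
close $fh or die "Can't close $path: $!";   # closed before any fork

my %Pids;
for my $report (@reports) {
    my ($id, $sql) = @$report;
    my $pid = fork;
    die "Can't fork: $!" unless defined $pid;
    if ($pid) {
        $Pids{$pid} = $id;
        next;
    }
    # Child: it has no copy of the file handle to disturb.
    print STDERR "Generated $id\n";
    exit(0);
}

# Parent: reap every child and count non-zero exit statuses.
my $failed = 0;
for my $pid (keys %Pids) {
    waitpid($pid, 0);
    $failed++ if $? != 0;
}
```

The cost is holding all of the SQL in memory at once, which may or may not be acceptable for a large generated file.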

Has anyone seen this, or know of any guidance in the scriptures?

Thank you!

Abe

Replies are listed 'Best First'.
Re: HPUX corruption of file handle after fork
by roboticus (Chancellor) on Dec 10, 2007 at 16:07 UTC
    Abe:

    Remove the line:

    close $fh or die "Can't close $fh";
    The children messing with (closing, in this case) the file handle is likely the cause of your problem. You might still have a problem with the children sharing a single copy of $Sql, but I've never used fork in Perl before, so I couldn't say.

    ...roboticus

    Update: I forgot to tell why the line ought to be cut.

      Roboticus -

      Thanks very much for your swift answer.

      I had tried with and without the line, and no difference. I've just tried again to be quite sure.

      My understanding of fork at the OS level is that file descriptors in the parent are dup()'d to the child. The child has to close the fd or at least not move it by reading from it.
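A minimal sketch of the sharing described above, assuming a POSIX fork() (not the Windows emulation): after the fork, the child's descriptor refers to the same open file description as the parent's, so the two share one file offset. The temporary file and the offset 42 are made up for illustration. sysseek is used so that Perl's own buffering stays out of the picture, and POSIX::_exit keeps the child from running any exit-time cleanup.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR);
use POSIX ();
use File::Temp qw(tempfile);

# Made-up 100-byte file to seek around in.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "0123456789" x 10;
close $tmp or die "Can't close $path: $!";

open my $fh, '<', $path or die "Can't open $path: $!";

my $pid = fork;
die "Can't fork: $!" unless defined $pid;
if ($pid == 0) {
    # Child: move the offset through its copy of the descriptor.
    # Exit status 1 signals a failed seek; _exit skips Perl's cleanup.
    POSIX::_exit(defined sysseek($fh, 42, SEEK_SET) ? 0 : 1);
}
waitpid($pid, 0);

# Parent: ask where its own descriptor now points, without moving it.
my $pos = sysseek($fh, 0, SEEK_CUR);
print "parent sees offset $pos\n";   # 42 on a POSIX system
```

Anything in the child that seeks the descriptor, whether explicitly or behind the scenes, is therefore visible to the parent.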

      It's almost as if when I issue close() perl is repositioning the file pointer before close()ing at the OS level.

      If I get rid of the close(), I still have the problem, so maybe exit() does it implicitly.

      INTERESTING:

      If I change exit(0) to:

      exec '/bin/false';
      (so that I get _exit below decks instead of perl's wrapped exit(),) IT WORKS!!

      So I think perl is fiddling with file descriptors in the child in its exit function.

      If I then restore the explicit close(), the problem returns!!

      So perl close() is altering the position in the file via the child's file descriptor.
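If the exec of /bin/false helps only because it bypasses Perl's exit-time cleanup, then calling POSIX::_exit directly in the child might have the same effect without starting another program. The following is a sketch under that assumption and has not been tested on HP-UX or Perl 5.8.0. The temporary input file is made up for illustration. POSIX::_exit skips END blocks and destructors and does not flush buffered output, so the child should only write to unbuffered handles such as STDERR, or flush explicitly first.

```perl
use strict;
use warnings;
use POSIX ();
use File::Temp qw(tempfile);

# Made-up input in the same shape as the SQL file in the question.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} map { "SQL_REPORT_ID $_\nfoo $_\nEND SQL_REPORT_ID\n" } 1 .. 4;
close $tmp or die "Can't close $path: $!";

open my $fh, '<', $path or die "Can't open $path: $!";
my %Pids;
while (<$fh>) {
    next unless /SQL_REPORT_ID (\d+)/;
    my $ReportId = $1;
    my $Sql = '';
    while (<$fh>) {
        last if /END SQL_REPORT_ID/;
        $Sql .= $_;
    }
    my $pid = fork;
    die "Can't fork: $!" unless defined $pid;
    if ($pid) {
        $Pids{$pid} = $ReportId;
        next;
    }
    # Child: STDERR is unbuffered by default, so this line is not lost.
    print STDERR "Generated $ReportId\n";
    # Never touch $fh here, and leave without Perl's exit-time cleanup.
    POSIX::_exit(0);
}
close $fh or die "Can't close $path: $!";

# Parent: reap every child and count non-zero exit statuses.
my $failed = 0;
for my $pid (keys %Pids) {
    waitpid($pid, 0);
    $failed++ if $? != 0;
}
print scalar(keys %Pids), " children, $failed failed\n";
```

The explicit close in the child is dropped on purpose, since restoring it is what brought the problem back in the experiment above.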

      I wonder whether that happens on Solaris, Linux, AIX - I've only HPUX to look at.

      Regards, Abe

Re: HPUX corruption of file handle after fork
by almut (Canon) on Dec 10, 2007 at 16:31 UTC
    I think the file position keeps flipping back, because I see the message to STDERR report early blocks repeatedly.

    Are you sure they're being reported repeatedly, or just somewhat out of order (which would be the expected behaviour due to runtime differences of the forked processes)?

    I just tried it (with a simple test file containing dummy SQL_REPORT_ID sections) on HP-UX 11.00, 11.11 and 11.23 with Perl 5.8.4 and 5.8.8, and I'm unable to reproduce the problem you describe. I.e., the proper $Sql content, $ReportId etc. are reported once each, as expected.

    Are you doing anything else in the forked processes than what you've shown in the snippet above, or should that in fact already exhibit the problem?

      Thanks very much almut - yes I'm absolutely sure.

      However - we are on perl 5.8.0 (corporate stuff.)

      I really think this'll be a perl 5.8.0 issue - do you?

      We're probably not free to go to 5.8.8 right now, but I believe I understand the problem now and what to watch out for.

      See my earlier reply to self with what I found with exec /bin/false to force use of run-time lib _exit.

      Thanks so much for your help.

      Abe

        I just played around with this some more (I somehow had the feeling that you know what you're doing... ;)  and I can now reproduce the problem. Initially, I had rather short content in my dummy SQL_REPORT_ID sections. So, to simulate longer read times, I added a sleep 1 like this

            ...
            while (<$fh>) {
                last if (/END SQL_REPORT_ID/) ;
                $Sql .= "$_";
            }
            sleep 1;
            ...

        and now I'm getting this (when doing print STDERR "Generated $ReportId: $Sql"; in the child):

            Generated 1: foo 1
            Generated 2: foo 2
            Generated 3: foo 3
            Generated 4: foo 4
            Generated 2: foo 2
            Generated 3: foo 3
            Generated 4: foo 4
            (2..4 repeated ad infinitum...)

        This is on HP-UX (no difference between 11.00, 11.11 and 11.23) — on Linux, however, everything is fine. I'll try other platforms later and report back...

        For the record, my test file contains:

            SQL_REPORT_ID 1
            foo 1
            END SQL_REPORT_ID
            SQL_REPORT_ID 2
            foo 2
            END SQL_REPORT_ID
            SQL_REPORT_ID 3
            foo 3
            END SQL_REPORT_ID
            SQL_REPORT_ID 4
            foo 4
            END SQL_REPORT_ID

        Update: in case anyone is interested, here are the results for a couple of other platforms I could find:

        No problem on

        • Linux (various versions, going back to kernel 2.2)
        • AIX 4.3, 5.1, 5.3
        • Mac OS X 10.4.10

        but same problem as on HP-UX, on

        • Solaris 2.6 (SunOS 5.6), Solaris 8, 9 and 10
        • IRIX 6.5

        (don't know what to make of it, yet...)

Node Type: perlquestion [id://656119]
Approved by Corion