in reply to Re: reading from a file after a seek isn't working for me in thread reading from a file after a seek isn't working for me
@ikegami: Fellow Monks, can you please explain in detail the need for the explicit close here?
Normally opening with an existing FH closes the original file or at least I never noticed
a problem in cutting this corner in one-shots, one-liners or inline shell scripts
(but usually avoiding read, sysread, tty's and STDIN/OUT/ERR).
Thanx,
Peter
Update: - ok, any takers for this riddle with more time? Will summarize if pointed correctly with keywords and RTFM's to check :)
From perldoc -f close:
You don’t have to close FILEHANDLE if you are immediately going to do another "open" on it, because "open" will close it
for you. (See "open".) However, an explicit "close" on an input file resets the line counter ($.), while the implicit
close done by "open" does not.
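A minimal sketch of the $. difference described in that passage, assuming a scratch file created with File::Temp. The sub name line_counter_after_reopen is made up for illustration and is not part of the opener's code.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read two lines, reopen the same handle (with or
# without an explicit close in between), read one more line, and
# report what $. says at that point.
sub line_counter_after_reopen {
    my ($explicit_close) = @_;

    # Scratch file with three lines.
    my ($tmp, $path) = tempfile(UNLINK => 1);
    print {$tmp} "one\ntwo\nthree\n";
    close($tmp) or die "close: $!";

    open(my $in, '<', $path) or die "open: $!";
    scalar readline($in) for 1 .. 2;            # $. is now 2

    close($in) if $explicit_close;              # explicit close resets $.
    open($in, '<', $path) or die "reopen: $!";  # otherwise open closes implicitly

    scalar readline($in);
    my $count = $.;
    close($in);
    return $count;
}

printf "with close: %d, without close: %d\n",
    line_counter_after_reopen(1), line_counter_after_reopen(0);
```

If the perldoc wording holds, the first number should be 1 and the second 3. That covers the line counter only, so it does not by itself explain the read failure.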
There are a few more notes on pipes, but those don't seem to match the opener's situation either. Skimming perlopentut I didn't see pointers of interest - au contraire, it even _seems_ to imply that reopening w/o close is fine for STDIN/STDOUT (my reading of the lack of close() in the Playing with STDIN/STDOUT section). Or is there indeed some hardcoded magic triggered by it being STDOUT that we insist on reading from?
What am I missing?
Re^3: reading from a file after a seek isn't working for me
by almut (Canon) on Oct 21, 2009 at 21:44 UTC
|
can you please explain in detail the need for the explicit close here?
I think it has to do with PerlIO in combination with an implementation peculiarity.
When you compare the straces of both variants, you'll see something like:
# with explicit close
close(1) = 0
open("/tmp/stdout.log", O_RDWR|O_CREAT|O_TRUNC, 0666) = 1
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff2ba69a30) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(1, 0, SEEK_CUR) = 0
fstat(1, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fcntl(1, F_SETFD, 0) = 0
# without explicit close
open("/tmp/stdout.log", O_RDWR|O_CREAT|O_TRUNC, 0666) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff023c1390) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(4, 0, SEEK_CUR) = 0
fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
dup2(4, 1) = 1
close(4) = 0
fcntl(1, F_SETFD, 0) = 0
Now, the issue is (I think) that although the dup2 does
create a copy of fd 4 as fd 1 at the system level (and in fact
also closes the old fd 1), it does not copy the PerlIO part,
which is only handled properly when the filehandle is
created directly using Perl's open. For this reason, the
file descriptor is considered invalid from the PerlIO point of view (hence the "Bad file descriptor" message).
This is checked at the beginning of Perl's read using
PerlIOValid(f) [1] (even before doing any read system call).
Don't ask (me), however, why the indirect dup2 technique is being
used in the first place instead of simply closing the file descriptor
before the open... (Presumably, it did work before the introduction of PerlIO, and might just not have been adapted appropriately since.)
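To separate the system-level picture from the PerlIO one, here is a rough sketch that replays the "without explicit close" syscall sequence through the POSIX module, bypassing PerlIO entirely. The sub names reopen_via_dup2 and read_back_through_dup2 are invented for this illustration, and a throwaway descriptor on /dev/null stands in for the original write-only STDOUT so that fd 1 is left alone.

```perl
use strict;
use warnings;
use POSIX qw(O_WRONLY O_RDWR O_CREAT O_TRUNC SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper mirroring the strace: open the file on a fresh
# descriptor, dup2 it onto the target descriptor, close the fresh one.
sub reopen_via_dup2 {
    my ($target_fd, $path) = @_;
    my $tmp_fd = POSIX::open($path, O_RDWR | O_CREAT | O_TRUNC, 0666);
    defined $tmp_fd or die "open $path: $!";
    defined(POSIX::dup2($tmp_fd, $target_fd)) or die "dup2: $!";
    POSIX::close($tmp_fd);
    return $target_fd;
}

# Write through the re-pointed descriptor, rewind, and read back using
# raw descriptor calls only (no Perl filehandle involved).
sub read_back_through_dup2 {
    my ($tmp_fh, $path) = tempfile(UNLINK => 1);
    close($tmp_fh);

    # Stand-in for a descriptor that was originally write-only.
    my $fd = POSIX::open('/dev/null', O_WRONLY);
    defined $fd or die "open /dev/null: $!";

    reopen_via_dup2($fd, $path);
    defined(POSIX::write($fd, "abc\n", 4))  or die "write: $!";
    defined(POSIX::lseek($fd, 0, SEEK_SET)) or die "lseek: $!";
    my $buf = '';
    defined(POSIX::read($fd, $buf, 16))     or die "read: $!";
    POSIX::close($fd);
    return $buf;
}

print read_back_through_dup2();
```

At this level the descriptor reads back fine after the dup2, which is consistent with the refusal coming from Perl's own bookkeeping rather than from the kernel.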
___
[1] see perlio.c:
#define Perl_PerlIO_or_Base(f, callback, base, failure, args) \
if (PerlIOValid(f)) { \
const PerlIO_funcs * const tab = PerlIOBase(f)->tab;\
if (tab && tab->callback) \
return (*tab->callback) args; \
else \
return PerlIOBase_ ## base args; \
} \
else \
SETERRNO(EBADF, SS_IVCHAN); \
return failure
...
SSize_t
Perl_PerlIO_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
{
Perl_PerlIO_or_Base(f, Read, read, -1, (aTHX_ f, vbuf, count));
}
|
Thanx for the pointer, almut. That dup & perlio scrap is interesting.
But there must be more to it than that, as I don't see any special treatment for STDOUT in the perlio.c scrap (neither for the numeric FD's 0 to 2):
I was playing with the scrap below in the meantime.
I dupped SAVOUT on STDERR instead / simplifying system to
printing / using autoflush / opening STDOUT myself to /dev/tty first:
no change.
This however is interesting:
Changing the name of the handle STDOUT <=> ANYTHINGeLSE
manages to act as a toggle for the problem. Furthermore, w/o close, the tell on the STDOUT file pointer at begin prints 19 in the example below (might be due to the handle earlier being a tty, and something didn't quite catch the change to a plain file w/o explicit close?). Any other handle name prints 5 regardless of close or no close.
So it looks like we have some hard-coded STDOUT-related magic somewhere in the guts of PERLIO or even lower, with probably STDIN/ERR offering similar peculiarities.
Given that too much in Perl, esp wrt <> and stdio is magic, it's probably a good idea to stay strictly outside any possibly dusty corner whose smell is faintly related to something magic. Which in this case might just be the idea of reusing a special handle, and worse, reading from it.
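One way to stay out of that corner, sketched under the assumption that the goal is simply to capture a child's output: keep a separate read/write handle on the log file, point STDOUT at it for writing only, and do all the reading through the separate handle. The sub name capture_stdout_of is hypothetical, and a File::Temp scratch file takes the place of /tmp/stdout.log.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: run a command with STDOUT redirected into a
# scratch file and return what it wrote. STDOUT itself is never read.
sub capture_stdout_of {
    my @cmd = @_;

    # tempfile() hands back a handle opened for both reading and writing.
    my ($fh, $path) = tempfile(UNLINK => 1);

    open(my $saved, '>&', \*STDOUT) or die "dup STDOUT: $!";
    open(STDOUT, '>&', $fh)         or die "redirect STDOUT: $!";
    my $status = system(@cmd);
    open(STDOUT, '>&', $saved)      or die "restore STDOUT: $!";
    close($saved);
    die "command failed: $?" if $status != 0;

    # The duplicated descriptor shares its file offset with $fh,
    # so rewind before reading.
    seek($fh, 0, 0) or die "seek: $!";
    local $/;
    my $out = <$fh>;
    close($fh);
    return defined $out ? $out : '';
}

print capture_stdout_of($^X, '-e', 'print "hello world\n"');
```

This still goes through a dup onto STDOUT, but only for writing, so whatever Perl remembers about STDOUT's mode no longer matters for the read.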
How to classify this behaviour: What doc/code do we still miss? Or is this indeed, say, an easy-to-fix oversight in the documentation? Or is it a somewhat larger actual bug?
Still wondering (& vowing to step even more cautiously anywhere near STDIO magic),
My thanx to almut & ikegami for the work below!
less confused now (& busy scribbling away two new-to-me debugging tips along with a link to their demonstration here)
Peter
|
...as I don't see any special treatment for STDOUT in the perlio.c scrap
Just to be clear: the perlio.c snippet was only meant to show where the PerlIOValid() check happens for the read.
The decision between using a direct close vs. the indirect dup2, OTOH, is more likely to happen in Perl's open implementation (which I haven't yet had time to wade through; it's rather lengthy... and for a low-depth explanation I figured the manifestation of the difference in the strace should be sufficient evidence).
|
seek(STDOUT, -0, 2) or die $!;
print STDOUT "abc\n";
does indeed append "abc\n" to the file.
It's more like Perl remembers the handle's original mode and doesn't realize it can read from it now.
Update: I did a bit of Dumping and stracing of my own.
There is no difference in the IO objects. I'm now with you, leaning towards a PerlIO problem.
Seems that the "Bad file descriptor" message originates from Perl, not the system. Perl doesn't even attempt to read from STDOUT.
$ cat a.pl
use Devel::Peek;
open(SAVOUT, '>&STDOUT') or die $!;
close(STDOUT) if $ARGV[0];
open(STDOUT, '+>', "/tmp/stdout.log") or die $!;
Dump(*STDOUT{IO});
@argv = qw(/bin/echo hello world);
system(@argv);
print SAVOUT "before=", tell(STDOUT), "\n";
seek(STDOUT, 0, 0) or die $!;
print SAVOUT "after=", tell(STDOUT), "\n";
while (1) {
my $rv = read STDOUT, $_, 8192;
die $! if !defined($rv);
last unless $rv;  # 0 bytes read means EOF (a chunk of "0" is still data)
print SAVOUT "stdout=", $_;
}
print SAVOUT "at end=", tell(STDOUT), "\n";
close STDOUT;
$ diff -u <(strace perl a.pl 0 2>&1) <(strace perl a.pl 1 2>&1) | less
...
lseek(1, 0, SEEK_SET) = 0
lseek(1, 0, SEEK_CUR) = 0
-[ code to read locale-dependent version of error message]
-write(2, "Bad file descriptor at a.pl line"..., 37Bad file descriptor at a.pl line 17.
-) = 37
+read(1, "hello world\n", 4096) = 12
+read(1, "", 4096) = 0
+close(1) = 0
-write(3, "before=0\nafter=0\n", 17before=0
+write(3, "before=0\nafter=0\nstdout=hello wo"..., 46before=0
after=0
-) = 17
+stdout=hello world
+at end=12
+) = 46
close(3) = 0
-exit_group(9) = ?
-Process 4028 detached
+exit_group(0) = ?
+Process 4032 detached
|
But it's not, or at least not completely invalid. ...
Good point. Actually, when taking a closer look, I think Perl sets
EBADF one routine further down in PerlIOBase_read() (which is
being called from the macro Perl_PerlIO_or_Base), in case
the PERLIO_F_CANREAD flag isn't set:
PerlIOBase_read(pTHX_ PerlIO *f, void *vbuf, Size_t count)
{
STDCHAR *buf = (STDCHAR *) vbuf;
if (f) {
if (!(PerlIOBase(f)->flags & PERLIO_F_CANREAD)) {
PerlIOBase(f)->flags |= PERLIO_F_ERROR;
SETERRNO(EBADF, SS_IVCHAN);
return 0;
}
...
It's more like Perl remembers the handle's original mode and
doesn't realize it can read from it now.
Yes, and that's most likely because the dup2 doesn't copy
the perl-internal PERLIO* flags (well, how should it, it knows nothing about them).
The following snippet shows that the two STDOUT modes differ
depending on whether STDOUT is explicitly being closed first:
(I made use of Inline::C because I couldn't find a way to call PerlIO_modestr() directly via plain Perl)
#!/usr/bin/perl
use Inline C;
close STDOUT if $ARGV[0];
open(STDOUT, '+>', "/tmp/stdout.log") or die $!;
dumpmode(STDOUT);
__END__
__C__
void dumpmode(SV* fh) {
char buf[10];
PerlIO *f = IoIFP(sv_2io(fh));
PerlIO_modestr(f, buf);
fprintf(stderr, "mode = %s\n", buf);
}
Output:
$ ./802590.pl 0
mode = w
$ ./802590.pl 1 # with explicit close
mode = r+
Not really sure why it says "r+" instead of "w+", but I
suspect it's because the "+>" internally maps to the same mode as "+<",
after having clobbered the file...
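For comparison, the access mode the operating system has on record for a descriptor can be queried from plain Perl with fcntl and F_GETFL, no Inline::C needed. This is only a sketch: the sub name os_access_mode is made up, and it reports the kernel's view, not the PerlIO flags that PerlIO_modestr() prints.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL O_ACCMODE O_RDONLY O_WRONLY O_RDWR);
use File::Temp qw(tempfile);

# Hypothetical helper: ask the kernel (not PerlIO) how the descriptor
# behind a filehandle was opened.
sub os_access_mode {
    my ($fh) = @_;
    my $flags = fcntl($fh, F_GETFL, 0);
    defined $flags or die "fcntl: $!";
    my $acc = $flags & O_ACCMODE;
    return $acc == O_RDWR   ? 'read/write'
         : $acc == O_WRONLY ? 'write-only'
         :                    'read-only';
}

# Small demonstration on a scratch file opened the same way as the log.
my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp);
open(my $rw, '+>', $path) or die "open: $!";
print STDERR "OS-level mode = ", os_access_mode($rw), "\n";
close($rw);
```

Calling os_access_mode(\*STDOUT) right after the open in the snippet above would presumably report read/write in both variants, since the strace shows O_RDWR either way, and that is what makes the "mode = w" from PerlIO_modestr() stand out.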
Also, if you set PERLIO_DEBUG, you can see that the "w+"
mode is being applied to the PerlIO layers of fd 1 only in case it is
properly closed/opened:
$ PERLIO_DEBUG=/dev/tty ./802517.pl 1 # with explicit close
...
./802517.pl:0 openn(perlio,'(Null)','Iw',1,0,0,(nil),0,(nil))
./802517.pl:0 Layer 0 is unix
./802517.pl:0 Layer 0 is unix
./802517.pl:0 PerlIO_push f=0x6253c0 unix w 0x603b08
./802517.pl:0 fd 1 refcnt=1
./802517.pl:0 PerlIO_push f=0x6253c0 perlio Iw 0x603b08
./802517.pl:0 Layer 1 is perlio
...
./802517.pl:15 openn(perlio,'','w+',-1,0,0,(nil),1,0x60a178)
./802517.pl:15 Layer 0 is unix
./802517.pl:15 Layer 0 is unix
./802517.pl:15 PerlIO_push f=0x6253c0 unix w+ 0x603b08
./802517.pl:15 fd 1 refcnt=1
./802517.pl:15 PerlIO_push f=0x6253c0 perlio w+ 0x603b08
Otherwise, the "w+" is being applied to a different fd (here fd 8),
and thus disappears together with the fd when it is closed (after the dup2):
$ PERLIO_DEBUG=/dev/tty ./802517.pl 0
...
./802517.pl:0 openn(perlio,'(Null)','Iw',1,0,0,(nil),0,(nil))
./802517.pl:0 Layer 0 is unix
./802517.pl:0 Layer 0 is unix
./802517.pl:0 PerlIO_push f=0x6253c0 unix w 0x603b08
./802517.pl:0 fd 1 refcnt=1
./802517.pl:0 PerlIO_push f=0x6253c0 perlio Iw 0x603b08
./802517.pl:0 Layer 1 is perlio
...
./802517.pl:15 openn(perlio,'','w+',-1,0,0,(nil),1,0x60a178)
./802517.pl:15 Layer 0 is unix
./802517.pl:15 Layer 0 is unix
./802517.pl:15 PerlIO_push f=0x6253e8 unix w+ 0x603b08
./802517.pl:15 fd 8 refcnt=1
./802517.pl:15 PerlIO_push f=0x6253e8 perlio w+ 0x603b08
./802517.pl:15 fd 8 refcnt=0
./802517.pl:15 PerlIO_pop f=0x6253e8 perlio
./802517.pl:15 PerlIO_pop f=0x6253e8 unix
(irrelevant parts snipped)
Re^3: reading from a file after a seek isn't working for me
by ikegami (Patriarch) on Oct 21, 2009 at 21:09 UTC