Re: Why do here-docs have to end with a newline, not EOF?
by tachyon-II (Chaplain) on May 12, 2008 at 00:24 UTC
|
This behaviour stems from the implementation of scan_heredoc() found in toke.c in the Perl 5 source.
The basic logic in that function is to grok the token and add a newline to the end of it, then scan the following lines until the token (complete with the newline on the end) is found sitting on a line by itself, left-justified (i.e. no leading whitespace). RFC 111 for Perl 6 suggests making:
$var = <<STUFF;
...
STUFF
$var = <<STUFF;
...
    STUFF
$var = <<STUFF;
...
STUFF;
All of these would become legal syntax, i.e. the requirements for no leading whitespace and for the trailing newline would be removed. Discussion continues in RFC 162. Trailing whitespace on the closing token line also breaks heredocs, as the search token "string\n" is not found. Commenting out the line *d++ = '\n' looks like it would change the trailing newline requirement, but my analysis was brief, superficial and totally untested.
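To make the logic described above concrete, here is a rough Perl model of that search. It is not the C code from toke.c: the sub name find_heredoc_end and its return convention are invented for illustration. The only things taken from the description are that the search string is the token with "\n" appended and that it has to match a whole, left-justified line.

```perl
use strict;
use warnings;

# Hypothetical model of the terminator search (NOT the real toke.c code).
# $rest is the source text that follows the <<TOKEN line.
# Returns the heredoc body, or undef if no terminator line is found.
sub find_heredoc_end {
    my ($rest, $token) = @_;
    my $needle = $token . "\n";              # the token with a newline glued on
    my $body   = '';
    # Split after each newline, so every piece keeps its own "\n";
    # a final unterminated line comes through without one.
    for my $line (split /(?<=\n)/, $rest) {
        return $body if $line eq $needle;    # exact match on the whole line
        $body .= $line;
    }
    return undef;                            # "Can't find string terminator"
}

for my $rest ("abc\nENDHERE\n", "abc\nENDHERE", "abc\n  ENDHERE\n", "abc\nENDHERE \n") {
    (my $shown = $rest) =~ s/\n/\\n/g;       # make the newlines visible
    printf "%-20s %s\n", $shown,
        defined(find_heredoc_end($rest, 'ENDHERE')) ? 'found' : 'not found';
}
```

In this model only the first input is "found". The missing final newline, the leading whitespace and the trailing space all fail for the same reason: the line is not byte-for-byte equal to "ENDHERE\n".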
|
An even briefer glance at the code suggests that commenting out the newline assignment (I presume the one following the token size test) would let any terminator line match as long as it starts with the heredoc token string, no matter how much garbage follows it. The trailing newline is a little like an implicit \b.
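A toy illustration of that point, assuming that dropping the newline amounts to comparing only the first length(token) characters of each line. That is my reading of the post, not something checked against toke.c, and the sub names exact_match and prefix_match are made up.

```perl
use strict;
use warnings;

my $token = 'STUFF';

# Current behaviour as described: the whole line must equal "STUFF\n".
sub exact_match {
    my ($line) = @_;
    return $line eq "$token\n" ? 1 : 0;
}

# Assumed behaviour without the appended newline: only the prefix is compared.
sub prefix_match {
    my ($line) = @_;
    return substr($line, 0, length $token) eq $token ? 1 : 0;
}

for my $line ("STUFF\n", "STUFF;\n", "STUFFING and garbage\n", "  STUFF\n") {
    (my $shown = $line) =~ s/\n/\\n/g;       # make the newline visible
    printf "%-24s exact=%d prefix=%d\n", $shown, exact_match($line), prefix_match($line);
}
```

Under that assumption "STUFF;" and "STUFFING and garbage" would both terminate the heredoc, which is the garbage-tolerant behaviour described above. The indented line still fails either way, because the comparison starts at column one.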
Perl is environmentally friendly - it saves trees
Re: Why do here-docs have to end with a newline, not EOF?
by toolic (Bishop) on May 12, 2008 at 01:38 UTC
|
Why is this?
Because The Free Manual says so. According to perlop (search for <<EOF):
If the terminating identifier is on the last line of the program, you must be sure there is a newline after it; otherwise, Perl will give the warning Can't find string terminator "END" anywhere before EOF....
|
If the terminating identifier is on the last line of the program, you must be sure there is a newline after it
...but if there's a newline, then it isn't the last line, is it? Something of a documentation bug there. It would be better expressed as "the terminating identifier isn't allowed to be the last line".
Nobody says perl looks like line-noise any more
kids today don't know what line-noise IS ...
|
...but if there's a newline, then it isn't the last line, is it?
Newline is traditionally a line terminator, not a line separator. But even if you consider newline a line separator, it is still a bit of a stretch to call the 0 characters after the final newline "a line". I think you have been misled by quirks of Notepad (or its kin), which shows space for a line below the last line since that space is where the cursor goes when you press the final ENTER.
Certainly, <> won't read "a line" after a final newline in a file (it won't read anything, other than "end of file"). Now, if the last newline in a file is followed by more characters, then the historical view would be to consider that "a partial line" or "an unterminated line" or similar terminology. Traditionally, vi would refuse to create such files and if you used vi to edit such a file and saved any changes, vi would add a final newline silently (this has changed with newer vi replacements like, for example, vim).
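A small sketch of the point about <>. It reads from an in-memory filehandle, so no real files are needed; the helper name count_lines is made up for the example.

```perl
use strict;
use warnings;

# Count how many "lines" the readline operator returns for a given text.
sub count_lines {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    my @lines = <$fh>;          # list context: read every line
    close $fh;
    return scalar @lines;
}

print count_lines("one\ntwo\n"),   "\n";   # 2: nothing is read after the final newline
print count_lines("one\ntwo"),     "\n";   # 2: the second line is just unterminated
print count_lines("one\ntwo\n\n"), "\n";   # 3: an empty line still needs its own newline
```

So "one\ntwo\n" and "one\ntwo" both hold two lines as far as <> is concerned; the only difference is whether the last one is terminated.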
It would be better expressed as "the terminating identifier isn't allowed to be the last line".
Better to change the wording so it does not rely on either view of what "last line" means. Perhaps:
If the terminating identifier is at the very end of the program without even a newline after it, then Perl will give the warning Can't find string terminator "END" anywhere before EOF. So be sure to include a newline immediately after the terminating identifier.
Re: Why do here-docs have to end with a newline, not EOF?
by pc88mxer (Vicar) on May 11, 2008 at 23:45 UTC
|
It seems you'll only get that problem if the last line of your file doesn't end in a newline.
Don't know why that's the case, but otherwise there's no problem having a here document at the end of your program.
Re: Why do here-docs have to end with a newline, not EOF?
by turo (Friar) on May 12, 2008 at 00:11 UTC
|
I'm using Perl 5.8.8 to test your little snippet and it works fine. I ran it with and without the final LF, in Unix and in MS-DOS (CR-LF) format, and the snippet still works ...
perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'
|
Really? I can reproduce the OP's results on WinXP and Linux.
WinXP, Perl 5.8.8, ActivePerl 820
>perl -e"print qq{print <<ENDHERE\nabc\nENDHERE}" > test.pl
>perl test.pl
Can't find string terminator "ENDHERE" anywhere before EOF at test.pl line 1.
>perl -e"print qq{print <<ENDHERE\nabc\nENDHERE\n}" > test.pl
>perl test.pl
abc
>
Linux, Perl 5.8.4
$ perl -e'print qq{print <<ENDHERE\nabc\nENDHERE}' > test.pl
$ perl test.pl
Can't find string terminator "ENDHERE" anywhere before EOF at test.pl line 1.
$ perl -e'print qq{print <<ENDHERE\nabc\nENDHERE\n}' > test.pl
$ perl test.pl
abc
$
If I were to guess, I'd say the parser looks for <"\nENDHERE\n"> instead of <"\nENDHERE" followed by either "\n" or end of file>, and it does so because it's easier and because every line is supposed to end with a line feed in Unix.
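To spell out the difference between those two searches, here they are as a pair of regexes run over the two test programs from above. This only models the guess; it says nothing about what the parser really does internally, and the names $strict and $lenient are mine.

```perl
use strict;
use warnings;

# What the parser appears to look for: the terminator followed by a newline.
my $strict  = qr/\nENDHERE\n/;
# The alternative: the terminator followed by a newline OR the end of input.
my $lenient = qr/\nENDHERE(?:\n|\z)/;

my %program = (
    'no final newline'   => "print <<ENDHERE\nabc\nENDHERE",
    'with final newline' => "print <<ENDHERE\nabc\nENDHERE\n",
);

for my $name (sort keys %program) {
    my $src = $program{$name};
    printf "%-18s strict=%-3s lenient=%s\n", $name,
        ($src =~ $strict  ? 'yes' : 'no'),
        ($src =~ $lenient ? 'yes' : 'no');
}
```

The strict pattern misses the program with no final newline, which matches the error seen above; the lenient pattern accepts both.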
|
yeep ...
my editor was adding the final '0a' in the file without consulting me :-S ...
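For anyone else who wants to rule out their editor, one way to check is to look at the last byte of the file directly. A minimal sketch; the sub name ends_with_newline is made up, and the temporary file just stands in for whatever your editor saved.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return 1 if the file's last byte is a line feed (0x0a), 0 otherwise.
sub ends_with_newline {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "cannot open $path: $!";
    local $/;                          # slurp mode: read the whole file at once
    my $data = <$fh>;
    close $fh;
    return (defined($data) && $data =~ /\n\z/) ? 1 : 0;
}

# Write the test program without a final newline, bypassing any editor.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "print <<ENDHERE\nabc\nENDHERE";
close $out or die "cannot close $path: $!";

print ends_with_newline($path) ? "ends with LF\n" : "no final LF\n";   # no final LF
```

Writing the file from Perl, as in the one-liners earlier in the thread, sidesteps the editor entirely.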
perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'
Re: Why do here-docs have to end with a newline, not EOF?
by ambrus (Abbot) on May 12, 2008 at 18:18 UTC
|
If end of file were allowed in place of the newline, then you could write a (non-empty) Perl quine in just 22 bytes. I suppose the developers didn't want to take the blame for adding one more language feature just for obfus.