
Why do here-docs have to end with a newline, not EOF?

by Cody Pendant (Prior)
on May 11, 2008 at 23:37 UTC ( [id://686004]=perlquestion )

Cody Pendant has asked for the wisdom of the Perl Monks concerning the following question:

I recently discovered that the end of a here-doc can't be the last thing in a file, i.e. (line numbers for clarity):
1. print <<ENDHERE;
2. foo bar baz
3. ENDHERE
doesn't work, but
1. print <<ENDHERE;
2. foo bar baz
3. ENDHERE
4.

does. Why is this?



Nobody says perl looks like line-noise any more
kids today don't know what line-noise IS ...

Replies are listed 'Best First'.
Re: Why do here-docs have to end with a newline, not EOF?
by tachyon-II (Chaplain) on May 12, 2008 at 00:24 UTC

    This behaviour stems from the implementation of scan_heredoc(), found in toke.c in the Perl 5 source.

    The basic logic in that function is to grok the token and add a newline to the end of it, then scan the following lines until the token (complete with the newline on the end) is found sitting on a line by itself, left justified (i.e. no leading whitespace). RFC 111 for Perl 6 suggests making:

    $var = <<STUFF;
    ...
    STUFF
    $var = <<STUFF;
    ...
    STUFF
    $var = <<STUFF;
    ...
    STUFF;

    all legal syntax, i.e. the requirements for no leading whitespace and for the trailing newline would be removed. Discussion is ongoing in RFC 162. Trailing whitespace on the closing token line also breaks heredocs, as the search token "string\n" is not found. Commenting out the line *d++ = '\n' looks like it would remove the trailing newline requirement, but my analysis was brief, superficial and totally untested.

      An even briefer glance at the code suggests that commenting out the newline assignment (I presume the one following the token size test) would allow any terminating line that starts with the heredoc token string to match, even if any amount of garbage follows it. The trailing newline is a little like an implicit \b.
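The matching rule described in these two posts can be modelled with a small sketch. This is not the toke.c code; `find_terminator` and its `$strict` flag are hypothetical names used only for illustration. Strict mode looks for the token plus a newline at the start of a line, and loose mode imitates the "comment out the newline" idea by only checking that a line starts with the token.

```perl
use strict;
use warnings;

# Hypothetical model of the search described above, not the real toke.c code.
# Strict: a line must be exactly "TOKEN\n". Loose: a line only has to start
# with TOKEN, so trailing garbage (or EOF) is accepted.
sub find_terminator {
    my ($body, $token, $strict) = @_;
    my $needle = $strict ? "$token\n" : $token;
    my $pos = 0;
    while ($pos <= length $body) {
        # Does the current line begin with the needle?
        return $pos if substr($body, $pos, length $needle) eq $needle;
        my $nl = index($body, "\n", $pos);
        return -1 if $nl < 0;    # ran off the end: "anywhere before EOF"
        $pos = $nl + 1;          # start of the next line
    }
    return -1;
}

print find_terminator("foo bar baz\nENDHERE\n",  "ENDHERE", 1), "\n";  # 12
print find_terminator("foo bar baz\nENDHERE",    "ENDHERE", 1), "\n";  # -1
print find_terminator("foo bar baz\nENDHERE \n", "ENDHERE", 1), "\n";  # -1
print find_terminator("foo\nENDHEREjunk\n",      "ENDHERE", 0), "\n";  # 4
```

In this model the missing final newline and the trailing space fail for the same reason: the needle "ENDHERE\n" never appears at the start of a line.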


      Perl is environmentally friendly - it saves trees
Re: Why do here-docs have to end with a newline, not EOF?
by toolic (Bishop) on May 12, 2008 at 01:38 UTC
    Why is this?
    Because The Free Manual says so. According to perlop (search for <<EOF):
    If the terminating identifier is on the last line of the program, you must be sure there is a newline after it; otherwise, Perl will give the warning Can't find string terminator "END" anywhere before EOF....
      If the terminating identifier is on the last line of the program, you must be sure there is a newline after it
      ...but if there's a newline, then it isn't the last line, is it? Something of a documentation bug there. It would be better expressed as "the terminating identifier isn't allowed to be the last line".


      Nobody says perl looks like line-noise any more
      kids today don't know what line-noise IS ...
        ...but if there's a newline, then it isn't the last line, is it?

        Newline is traditionally a line terminator, not a line separator. But even if you consider newline a line separator, it is still a bit of a stretch to call the 0 characters after the final newline "a line". I think you have been misled by quirks of notepad (or its kin), which shows space for a line below the last line since that space is where the cursor goes when you press the final ENTER.

        Certainly, <> won't read "a line" after a final newline in a file (it won't read anything, other than "end of file"). Now, if the last newline in a file is followed by more characters, then the historical view would be to consider that "a partial line" or "an unterminated line" or similar terminology. Traditionally, vi would refuse to create such files and if you used vi to edit such a file and saved any changes, vi would add a final newline silently (this has changed with newer vi replacements like, for example, vim).
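The point about the readline operator can be checked with a short sketch that reads from an in-memory filehandle instead of a real file. `read_lines` is a hypothetical helper name, not something from the thread.

```perl
use strict;
use warnings;

# Read every "line" from an in-memory filehandle, the way <> would from a file.
sub read_lines {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

my @terminated   = read_lines("one\ntwo\n");
my @unterminated = read_lines("one\ntwo");

print scalar(@terminated),   "\n";   # 2: nothing is read after the final newline
print scalar(@unterminated), "\n";   # 2: the last element is "two" with no "\n"
```

Both inputs yield two lines; the only difference is that the last line of the second one is unterminated.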

        It would be better expressed as "the terminating identifier isn't allowed to be the last line".

        Better to change the verbiage to not rely on either view of what "last line" means. Perhaps:

        If the terminating identifier is at the very end of the program without even a newline after it, then Perl will give the warning Can't find string terminator "END" anywhere before EOF. So be sure to include a newline immediately after the terminating identifier.

        - tye        

Re: Why do here-docs have to end with a newline, not EOF?
by pc88mxer (Vicar) on May 11, 2008 at 23:45 UTC
    It seems you'll only get that problem if the last line of your file doesn't end in a newline.

    Don't know why that's the case, but otherwise there's no problem having a here document at the end of your program.

Re: Why do here-docs have to end with a newline, not EOF?
by turo (Friar) on May 12, 2008 at 00:11 UTC
    I'm using perl 5.8.8 version for testing your little snippet and it works well. I've executed it with and without the final LF; in unix and in MSDOS format (with CR-LF); and the snippet still works well ...

    perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'

      Really? I can reproduce the OP's results on WinXP and Linux.

      WinXP, Perl 5.8.8, ActivePerl 820

      >perl -e"print qq{print <<ENDHERE\nabc\nENDHERE}" > test.pl
      >perl test.pl
      Can't find string terminator "ENDHERE" anywhere before EOF at test.pl line 1.
      >perl -e"print qq{print <<ENDHERE\nabc\nENDHERE\n}" > test.pl
      >perl test.pl
      abc
      >

      Linux, Perl 5.8.4

      $ perl -e'print qq{print <<ENDHERE\nabc\nENDHERE}' > test.pl
      $ perl test.pl
      Can't find string terminator "ENDHERE" anywhere before EOF at test.pl line 1.
      $ perl -e'print qq{print <<ENDHERE\nabc\nENDHERE\n}' > test.pl
      $ perl test.pl
      abc
      $

      If I were to guess, I'd say the parser looks for <"\nENDHERE\n"> instead of <"\nENDHERE" followed by either "\n" or end of file>, and it does so because it's easier and because every line is supposed to end with a line feed in unix.
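The more lenient rule guessed at here (terminator followed by either a newline or end of file) can be written as a regex. This is only a sketch of that alternative, not how perl itself parses here-docs, and `lenient_terminator_pos` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical matcher: a line consisting of the token, ended either by "\n"
# or by the very end of the input (\z). Returns the match offset or -1.
sub lenient_terminator_pos {
    my ($body, $token) = @_;
    return $body =~ /^\Q$token\E(?:\n|\z)/m ? $-[0] : -1;
}

print lenient_terminator_pos("abc\nENDHERE\n",  "ENDHERE"), "\n";  # 4
print lenient_terminator_pos("abc\nENDHERE",    "ENDHERE"), "\n";  # 4
print lenient_terminator_pos("abc\nENDHERE \n", "ENDHERE"), "\n";  # -1
```

Unlike simply dropping the newline from the search string, this still rejects a terminator line with trailing characters.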

        yeep ... my editor was adding the final '0a' in the file without consulting me :-S ...
        perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'
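Since an editor can add or omit the final 0a silently, it may help to check the last byte of the file directly. The following is a minimal sketch; `ends_with_newline` is a hypothetical helper, and the temporary file only stands in for test.pl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return 1 if the file is non-empty and its last byte is "\n" (0x0a), else 0.
sub ends_with_newline {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "cannot open $path: $!";
    my $size = -s $fh;
    return 0 unless $size;
    seek $fh, -1, 2 or die "cannot seek in $path: $!";   # 2 = SEEK_END
    read $fh, my $last, 1;
    close $fh;
    return $last eq "\n" ? 1 : 0;
}

# Write the OP's snippet without a final newline and inspect it.
my ($out, $name) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "print <<ENDHERE;\nabc\nENDHERE";
close $out;
print ends_with_newline($name) ? "ends with newline\n" : "no final newline\n";
```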
Re: Why do here-docs have to end with a newline, not EOF?
by ambrus (Abbot) on May 12, 2008 at 18:18 UTC

    If an end of file were allowed instead of a newline, then you could write a (non-empty) perl quine in just 22 bytes. I suppose the developers didn't want to take the blame for adding one more language feature just for obfus.
