PerlMonks  

Re: magic-diamond <> behavior -- WHAT?!

by moritz (Cardinal)
on Oct 29, 2008 at 21:55 UTC ( [id://720345]=note )


in reply to magic-diamond <> behavior -- WHAT?!

That's known, and afaict there is now a module on CPAN that fixes it.

There have been a whole lot of threads about that on p5p recently, with the result (if any) that it won't be changed in core, because too much code (and too many hackers) rely on this feature.

Re^2: magic-diamond <> behavior -- WHAT?! (sanity)
by tye (Sage) on Oct 30, 2008 at 07:30 UTC

    And, I apologize in advance, but it is perhaps the perfect example of how p5p can produce the most inane decisions.

    There is a lot more code being used that relies on <> doing the sane thing. Code that uses -n or -p with a wildcard (very common) is clearly expecting sane behavior not dangerous leaking of file names into the execution stream. Almost all of the code that I've seen use <> is expecting it to read from the files named in @ARGV. Duh!
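To make the complaint above concrete, here is a minimal sketch of the leak, assuming a Unix-like system where the shell can find `echo`. The file name `echo LEAKED |` and the helper `slurp_diamond` are made up for illustration; nothing here comes from the p5p threads.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work in a scratch directory so nothing outside it is touched.
my $orig = getcwd();
my $dir  = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";

# Hypothetical, harmless file name that happens to end in '|'.
my $name = 'echo LEAKED |';
open(my $out, '>', $name) or die "create '$name': $!";   # three-arg open: the name is literal
print {$out} "real file contents\n";
close $out or die "close: $!";

# Hypothetical helper: return every line <> yields for the given argument list.
sub slurp_diamond {
    local @ARGV = @_;
    my @lines = <>;
    return @lines;
}

# <> opens each name with the magical two-arg open, so the trailing '|'
# turns the "file name" into a command whose output is read instead.
my @magic = slurp_diamond($name);
print "<> returned: $magic[0]";

chdir $orig or die "chdir $orig: $!";
```

On such a system the line that comes back should be the output of `echo`, not the contents of the file. That is the same thing that happens when `perl -ne '...' *` meets a file with a name like this.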

    So fixing <> would break some rare hackish code and fix a ton of simple code. People who write hackish code are much better suited to adding -Margv (or whatever it gets called) to get the historic, magical behavior. That makes much better sense than hoping everybody who uses <> in the normal way will know to use some special module or trick just to make things safe and sane.

    Heck, it would even be fairly easy to have <> default to be safe and sane while also warning when fed a file name that starts with a filemode character or ends with '|' (and the warning could mention -Margv -- something that would end the warning since the type of behavior would be specified explicitly).
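The check behind such a warning could be as small as the sketch below. `looks_magic` is a hypothetical helper, not anything in core or on CPAN, and the patterns are a conservative guess at what two-arg open treats specially (leading whitespace is skipped before the mode character is examined).

```perl
use strict;
use warnings;

# Hypothetical helper: true if two-arg open would probably treat $name as
# something other than a plain file to read.
sub looks_magic {
    my ($name) = @_;
    return 1 if $name =~ /^\s*[<>+|]/;   # leading mode character (conservative)
    return 1 if $name =~ /\|\s*$/;       # trailing pipe: "run this command"
    return 0;
}

for my $name ('notes.txt', 'make test |', '>clobber', '| mail someone') {
    printf "%-18s %s\n", "'$name'", looks_magic($name) ? 'magic' : 'plain';
}
```

A bare `-` is deliberately not flagged here; whether it should keep meaning STDIN is a separate question.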

    And the story about it having been designed that way is beyond suspicious. If <> had been designed to be the way that it is, then -p would not work the way it does. It was an accident of implementation. And the documentation was simply a restating of that implementation so it was also an accident that it was "documented" to work that way.

    The documentation never (unless it was recently updated) said anything close to "beware of file names that start with '<' or start or end with '|' because ..." or even "note that 'perl -pex *' is unsafe" or even "And look how cool it is if you have a file named 'make test |' ...".

    The documentation does say lots of things like:

    -n
    causes Perl to assume the following loop around your program, which makes it iterate over filename arguments
    find . -mtime +7 -print | perl -nle unlink
    The @ARGV array is then processed as a list of filenames.

    There is a lot more documentation that <> shouldn't react badly to the file name I close this node with (compared to the so-called "documentation" of the magic behavior by virtue of "is equivalent to the following Perl-like pseudo code" that uses some 'open' which isn't clearly declared to be as magical as Perl's two-arg open).
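For comparison, this is roughly what "iterate over filename arguments" looks like when the loop is written out by hand with a three-arg open, so that no character in a name is read as a mode or a pipe. It is a sketch only: `each_line_literal` is a hypothetical helper, and it leaves out the `-` means STDIN convention that `<>` also supports.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: call $callback->($line, $name) for every line of every
# named file, treating each name as a literal path.
sub each_line_literal {
    my ($callback, @names) = @_;
    for my $name (@names) {
        my $fh;
        unless (open($fh, '<', $name)) {    # three-arg open: no magic in $name
            warn "Can't open $name: $!\n";
            next;
        }
        while (my $line = <$fh>) {
            $callback->($line, $name);
        }
        close $fh;
    }
    return;
}

# Try it on a file whose name would be a command under two-arg open.
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'make test |');
open(my $out, '>', $path) or die "create '$path': $!";
print {$out} "just data\n";
close $out or die "close: $!";

my @seen;
each_line_literal(sub { push @seen, $_[0] }, $path);
print "read: $_" for @seen;
```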

    After hearing of people making noises like "Oh, sure, I've always known it was magic. Heck, everybody did. It is documented. Duh!" I did some searching trying to find evidence of all of these people having "known" this for so long. I only found evidence of people using <> like they expected it to iterate over the names of files in @ARGV.

    So, I loudly call "bull" on that decision and its justifications. Not that I (as I've said before) expect this to change anything. p5p has proved to be quite immune to persuasion from me over some years, so I gave it up years ago. It sounds like several people have tried on this point and it is clearly discussed as a fait accompli (if I'm not misusing that term too badly) so I suspect my prediction is pretty safe. Ugh. :)

    echo > 'echo "Perl is my bitch!" && rm -rf .. |'

    - tye        

      I feel for you. I'm not happy with their decision either, and some discussions on p5p turn out rather frustrating.

      It's a feature so magic (and so little known) that it can be considered a security hazard. IMHO.

      Even though we can't convince them, we can still do something about it: propose documentation patches. I'd like to write some, but in the last two weeks I haven't got around to anything perlish, so I don't think I'll get around to it any time soon.

      If nobody gets around to it, maybe we should write a patch against pod/perltodo.pod in perl.git to mark it as a TODO item.

      (Update: patch submitted, and it has been applied already.)

      The behavior gets several paragraphs of explicit mention in a rather common reference book. Not to mention that the Camel itself explicitly covers the behavior in its discussion of <> as well (p82, 3rd ed). Considering that both of what would have to be called the "standard reference books" on the language cover this behavior, one would grant plausibility to the "It is documented. Duh!" crowd.

      (Now that's not to say that I don't see where the "it shouldn't be on by default" crowd are coming from either, and agree that would be a "safer" default behavior; but it is doing just what it says on the tin . . .)

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

        I think you may be getting your carte blanc before your Camel. :)

        I read "the Camel" and I don't believe it mentioned any such thing (probably not the same revision of "the Camel" you refer to, of course). And at the time (quite a while ago) of the coming out party of the "It is documented. Duh!" proclaimers, I don't believe it was documented well in a popular book. In any case, I never saw mention of documentation of that in books in that time frame. I'm not at all surprised that it is documented in some books by now. But I also wouldn't be totally shocked if there was a book that covered it well way back then.

        But it is also true that bugs get documented in auxiliary reference material. The "It is documented" is more short-hand for the "We can't change it because the standard documentation has always said that it worked that way" claim, and that is the meaning that I call "bull" on.

        - tye        

      mistake or not, taint cures a lot of this

        Not really. It prevents odd file names from being treated as shell commands, but it does so by killing your program instead of treating them as the names of files to read as intended.

        It's like fixing a flat tire by removing the car's battery. Sure, you won't ruin your car by driving with a flat. But you also won't be driving your car.

        No, taint checking is a dang stupid idea of a "fix". It doesn't actually fix anything and it makes lots of parts of your program bring everything to a screeching halt if you don't get a bunch of extra work done just right. And proposing it as a "fix" is a pretty clear demonstration of "you just don't get it at all".

        An actual fix that is also not breaking tons of other parts of your code is simply $_= "< $_" for @ARGV; (done everywhere that @ARGV gets set for <> to be used, though).
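Here is a minimal sketch of that one-liner in use, again assuming a Unix-like system. The file name `echo LEAKED |` and the helper `slurp_diamond_readonly` are invented for the example.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work in a scratch directory so nothing outside it is touched.
my $orig = getcwd();
my $dir  = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";

# Hypothetical file whose name would be run as a command by a bare <>.
my $name = 'echo LEAKED |';
open(my $out, '>', $name) or die "create '$name': $!";
print {$out} "real file contents\n";
close $out or die "close: $!";

# Hypothetical helper: apply the "< " prefix, then read through <>.
sub slurp_diamond_readonly {
    local @ARGV = @_;
    $_ = "< $_" for @ARGV;    # force every argument to be opened for reading
    my @lines = <>;
    return @lines;
}

my @safe = slurp_diamond_readonly($name);
print "<> returned: $safe[0]";

chdir $orig or die "chdir $orig: $!";
```

With the prefix in place the trailing `|` is just part of the file name, so the file's own contents come back. Two caveats, as far as I know: two-arg open still trims whitespace at either end of a name, so names that begin or end with a space need extra care, and `$ARGV` will show the `< ` prefix in any messages that print it.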

        Now go fix just about every mention of <> in the documentation and hope that every person who ever uses <> non-hackishly jumps through your extra hoops and hope that the huge majority of them who won't (because it has been documented in dozens of places for decades that such hoops are not required) don't run into a truly evily-named file. And be happy that a few hackish programs don't require the slightest modification (even through a deprecation cycle) while every use of <> in the standard documentation is wrong.

        Oh, and have fun fixing the documentation for -i. That even more obviously puts the lie to "it was designed to work that way".

        - tye        

      p5p-the-list can endlessly debate issues like this, but don't mistake that for decision, justification, or anything like that. When it comes down to it, someone may produce a patch, and the blead pumpking may apply it. No one else matters except Larry.

      I have taken advantage of the misfeature, and probably will again, but would be happy to have it not be the default...except for one issue which was raised in the p5p noise: I think - should continue to indicate stdin. And once you have that one exception, you've already lost the battle for a "safe" *.

Re^2: magic-diamond <> behavior -- WHAT?!
by repellent (Priest) on Oct 29, 2008 at 22:06 UTC
    July 2008? That's very recent.

    Hey, as long as we're continuing down the hacker path, why not include ARGV::readonly in the core?

    Thanks for the sanity reference, moritz! :)
