PerlMonks  

Re: Accessing the hash name in perl

by haukex (Archbishop)
on Mar 22, 2017 at 14:24 UTC ( [id://1185455]=note )


in reply to Accessing the hash name in perl

I think that there are much better solutions than trying to parse a Perl file manually. However, what the best solution is depends on several things.

  • Can you change the program that generates this input file? If so, you should at least add the our and 1; as duyet showed, because the file you posted is actually not really valid Perl - if you try to run it, you'll get the error message "Global symbol "$test" requires explicit package name (did you forget to declare "my $test"?) at ..." On the other hand, if you can change how the data is stored, then it's probably much better to use a data serialization format such as JSON (using JSON::MaybeXS), for example.

  • Can you completely trust the source of this file? If so, you can make Perl parse it using one of the functions do, require, or use, as several monks have shown. However, note that this will execute arbitrary Perl code, so these functions can introduce huge security holes if any of the input files can be changed by untrusted users, for example.

  • If you're already married to Perl as a serialization format (which, as I said above, I wouldn't necessarily recommend), then I can suggest one of my own modules as a possibility to more safely parse the file - note however that it still has quite a few limitations, for example it will not parse the statements package hash; use strict; use warnings; in your input, so you'd have to strip those manually using e.g. a regex. You can read about my module at Undumping Perl.
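For the trusted-file case, here is a minimal sketch of loading such a file with do and checking for errors, following the pattern from perldoc -f do. The package name hash and the variable $test come from the thread; the keys and values are invented for illustration, and the sample file is written to a temporary location only so that the sketch is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical input file: package name and $test are from the thread,
# the keys and values are made up for this sketch.
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} <<'END_INPUT';
package hash;
use strict;
use warnings;
our $test = { name => 'example', size => 42 };
1;
END_INPUT
close $fh or die "Cannot close $filename: $!";

# do FILE runs the file as Perl code, so only use it on trusted input.
# On failure it returns undef and sets $@ (the file did not compile or
# died) or $! (the file could not be read).
my $ok = do $filename;
unless ($ok) {
    die "Could not parse $filename: $@" if $@;
    die "Could not read $filename: $!"  unless defined $ok;
    die "$filename did not return a true value";
}

# "our $test" inside "package hash" is the package variable $hash::test.
my $data = do { no warnings 'once'; $hash::test };
print "$_ => $data->{$_}\n" for sort keys %$data;
```

Without the our, the file fails to compile under strict and the first die reports the "Global symbol" error quoted above.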

Replies are listed 'Best First'.
Re^2: Accessing the hash name in perl
by vrk (Chaplain) on Mar 22, 2017 at 15:11 UTC

    One way to execute possibly unsafe code is with Safe. In this case, if all the external code does is define a data structure, you might even get away with something like my $s = Safe->new; $s->rdo($filename). But I definitely agree with trying to change the input format first, or at least making it a Perl module.
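A minimal sketch of the Safe approach mentioned above, assuming the external file does nothing but assign a data structure. The two sample files and their contents are hypothetical, and the compartment uses the default opcode mask, which is an assumption about what is appropriate rather than a recommendation.

```perl
use strict;
use warnings;
use Safe;
use File::Temp qw(tempfile);

# Write $content to a temporary file and return its absolute path.
sub write_temp {
    my ($content) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Hypothetical inputs: one file that only defines data, and one that
# also tries to run an external command.
my $good = write_temp(q{$test = { name => 'example', size => 42 };});
my $evil = write_temp(q{system('echo pwned'); $test = { size => 0 };});

# A compartment with the default opcode mask.
my $s = Safe->new;

# rdo returns the value of the last statement in the file, here the
# hash reference that was assigned to $test.
my $data = $s->rdo($good);
print "good file: size = $data->{size}\n";

# system is not permitted by the default mask, so this file is
# refused at compile time and rdo returns undef.
my $result = $s->rdo($evil);
print "evil file: ", (defined $result ? "loaded" : "rejected"), "\n";
```

Note that an input file containing use strict; would also be refused here, because the default mask does not permit require.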

      One way to execute possibly unsafe code is with Safe.

      True, but I normally don't mention Safe because I think it's a little too easy to misuse. It requires a good amount of knowledge of the Perl internals: one must know not only which opcodes need to be allowed, but also exactly what each one of them does; allowing even one too many can theoretically open a door for attackers. Plus, opcodes do sometimes change (rarely, but still), so that might have to be taken into account. Finally, the module currently appears unmaintained, and IIRC, it has had some security-related bugs in the past. If used properly, Safe can make eval safer, but not "safe". That's why if there's any doubt, I'd recommend to not eval at all.

        That's why if there's any doubt, I'd recommend to not eval at all.

        Correct. And that also excludes do FILE, require, and use. See Re^4: Using guards for script execution? for details.

        The sane way of handling foreign data is to use a non-executable format like JSON. Perl has several JSON parsers / generators on CPAN.
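A minimal sketch of the JSON route. It uses JSON::PP only because that module ships with Perl; JSON::MaybeXS, mentioned earlier in the thread, exports the same decode_json function. The data itself is hypothetical.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical data standing in for the hash from the question.
my $test = { name => 'example', size => 42, tags => ['a', 'b'] };

# Producer side: serialize to UTF-8 encoded JSON text. canonical()
# sorts the hash keys so that the output is stable between runs.
my $json = JSON::PP->new->utf8->canonical->encode($test);
print "$json\n";

# Consumer side: decoding only builds data structures, it never
# executes anything from the input.
my $copy = decode_json($json);
print "size = $copy->{size}, second tag = $copy->{tags}[1]\n";
```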

        YAML can contain executable code (but I think you have to explicitly enable that). XML does not contain executable code, but it may contain references to local and network resources (https://en.wikipedia.org/wiki/XML_external_entity_attack) and it may contain DoS attacks (https://en.wikipedia.org/wiki/Billion_laughs). YAML is also vulnerable to the latter attack. CSV and INI file formats may appear simple, but both are essentially underspecified and thus have some "interesting" edge cases. Text::CSV_XS can handle most, if not all CSV files, because it has support for most, if not all of the edge cases.

        Of course, it would be possible to define a very restricted subset of Perl for data exchange, much as JSON is a very restricted subset of JavaScript. But it seems nobody has done that yet. A basic requirement would be to forbid any executable code; and I think that cyclic references should also be avoided. The format may look very similar to the output of Data::Dumper, but without any variable names. Parsing that subset should be quite easy, as it would pretty much look like JSON.
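To give an idea of how small such a parser could be, here is a rough sketch. The grammar is my own guess at a useful subset (hashes, arrays, single-quoted strings, plain numbers, and undef), not an existing format or module, and all function names are made up. Double-quoted strings, variables, and references to other values are deliberately left out.

```perl
use strict;
use warnings;

# Report a parse error together with the current offset into the input.
sub _fail {
    my ($ref, $msg) = @_;
    die sprintf "Parse error at offset %d: %s\n", pos($$ref) // 0, $msg;
}

# In single-quoted Perl strings only \\ and \' are escape sequences.
sub _unquote {
    my ($s) = @_;
    $s =~ s/\\(['\\])/$1/g;
    return $s;
}

# Hash keys: a single-quoted string or a bareword identifier.
sub _parse_key {
    my ($ref) = @_;
    return _unquote($1) if $$ref =~ /\G\s*'((?:[^'\\]|\\.)*)'/gcs;
    return $1           if $$ref =~ /\G\s*([A-Za-z_]\w*)/gc;
    _fail($ref, "expected a hash key");
}

# Parse one value starting at the current match position of $$ref.
sub _parse_value {
    my ($ref) = @_;
    if ($$ref =~ /\G\s*\{/gc) {
        my %hash;
        while (1) {
            last if $$ref =~ /\G\s*\}/gc;    # also allows a trailing comma
            my $key = _parse_key($ref);
            $$ref =~ /\G\s*=>/gc or _fail($ref, "expected '=>'");
            $hash{$key} = _parse_value($ref);
            next if $$ref =~ /\G\s*,/gc;
            $$ref =~ /\G\s*\}/gc or _fail($ref, "expected ',' or '}'");
            last;
        }
        return \%hash;
    }
    if ($$ref =~ /\G\s*\[/gc) {
        my @array;
        while (1) {
            last if $$ref =~ /\G\s*\]/gc;    # also allows a trailing comma
            push @array, _parse_value($ref);
            next if $$ref =~ /\G\s*,/gc;
            $$ref =~ /\G\s*\]/gc or _fail($ref, "expected ',' or ']'");
            last;
        }
        return \@array;
    }
    return _unquote($1) if $$ref =~ /\G\s*'((?:[^'\\]|\\.)*)'/gcs;
    return 0 + $1       if $$ref =~ /\G\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)/gc;
    return undef        if $$ref =~ /\G\s*undef\b/gc;
    _fail($ref, "expected a hash, array, string, number or undef");
}

# Entry point: parse exactly one value, optionally followed by ";".
sub parse_perl_data {
    my ($text) = @_;
    my $value = _parse_value(\$text);
    $text =~ /\G\s*;?\s*\z/gc or _fail(\$text, "unexpected trailing input");
    return $value;
}

my $input = <<'END_INPUT';
{
  'name' => 'example',
  'list' => [ 1, 2.5, 'it\'s' ],
  'nested' => { ok => undef },
}
END_INPUT

my $data = parse_perl_data($input);
print "name = $data->{name}, third item = $data->{list}[2]\n";

# Code is not part of the grammar, so it is rejected rather than run.
my $rejected = !eval { parse_perl_data(q({ 'x' => system('true') })); 1 };
print "code rejected: ", ($rejected ? "yes" : "no"), "\n";
```

Because the grammar has no variable names, the input cannot refer back to itself, so cyclic structures cannot be expressed at all.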

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
