http://qs321.pair.com?node_id=92509
Category: Text Processing
Author/Contact Info Ton Sistos
antonio@moonlight.com
Description: Data::Dumper is a great utility that converts Perl structures into eval-able strings. These strings can be stored to text files, providing an easy way to save the state of your program.

Unfortunately, evaling strings from a file is usually a giant security hole; imagine if someone replaced your structure with system("rm -R /"), for instance. This code provides a non-eval way of reading in Data::Dumper structures.
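
To make the risk concrete, here is a minimal sketch with a harmless payload standing in for something destructive. The variable names and the tampered string are made up for illustration.

```perl
use strict;
use warnings;

our $side_effect = 0;

# Pretend this string was read from a state file that somebody else can write to.
# The first statement is the attacker's payload; the second is the "data".
my $tampered = '$main::side_effect = 1; +{ name => "widget" }';

# String eval runs every statement in the text, not just the data structure.
my $data = eval $tampered;

print "payload ran before we ever saw the data\n" if $side_effect;
print "name: $data->{name}\n";
```

The data comes back looking perfectly normal, which is exactly why this kind of tampering is easy to miss.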

Note: This code requires Parse::RecDescent.

Update: Added support for blessed references.
Update: Added support for undef, for structures like [3, undef, 5, [undef]]. Note that the undef support is extremely kludgy; better implementations would be much appreciated!
Update2: Swapped the order of FLOAT and INTEGER in 'term' and 'goodterm' productions. FLOAT must come before INTEGER, otherwise it will never be matched!
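
The ordering issue can be imitated in plain Perl. This is only a sketch of "first production that matches wins", not Parse::RecDescent itself; first_match is a hypothetical helper, and the FLOAT pattern is just the simplest of the three alternatives in the grammar.

```perl
use strict;
use warnings;

my $INTEGER = qr/[-+]?\d+/;
my $FLOAT   = qr/[-+]?\d*\.\d+/;

# Try each [name, regex] pair in order against the start of the input
# and return the first one that matches, along with the text it consumed.
sub first_match {
    my ($input, @productions) = @_;
    for my $production (@productions) {
        my ($name, $re) = @$production;
        return ($name, $1) if $input =~ /\A($re)/;
    }
    return;
}

# INTEGER first: it happily consumes the "4" and stops at the dot.
my @wrong = first_match('4.023421', [INTEGER => $INTEGER], [FLOAT => $FLOAT]);

# FLOAT first: the whole number is consumed.
my @right = first_match('4.023421', [FLOAT => $FLOAT], [INTEGER => $INTEGER]);

print "INTEGER first: $wrong[0] consumed '$wrong[1]'\n";
print "FLOAT first:   $right[0] consumed '$right[1]'\n";
```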

#####
# Package: Undumper
# Author: Ton Sistos
#
# Usage:
# 
# my $undumper = Undumper->new();
# my $string = <<'_EOSTRING_';
# {
#  '1' => {string=>"hello"}, 
#  '2' => [2,4,6,[0,3],[1,2]], 
#  bar => [1, 2, { this => 'that', 5, "world"}, baz],
#  5, 4.023421, 
#  'foo', "hello world"
# }
# _EOSTRING_
# my $struct = $undumper->Undump($string) or die "Bad string";
#####
package Undumper;

use Parse::RecDescent;
use strict;
use vars qw($grammar);

# Enable warnings within the Parse::RecDescent module.

$::RD_ERRORS = 1; # Make sure the parser dies when it encounters an error
$::RD_WARN   = 1; # Enable warnings. This will warn on unused rules &c.
$::RD_HINT   = 1; # Give out hints to help fix problems.


$grammar = <<'_GRAMMAR_';
    { my $u = '^%$&undef&$*!'; }

    # Terminals first
    INTEGER : /[-+]?\d+/
            { $return = int($item[1]); }
    FLOAT : /[-+]?\d*\.\d+[eE][-+]?\d+/
          | /[-+]?\d+\.\d*[eE][-+]?\d+/
          | /[-+]?\d*\.\d+/
    STRING : /"((.*?(\\\\)*(\\")*)*?)"/s
           { $return = $1; $return =~ s/\\"/"/g; $return =~ s/\\\\/\\/g; }
           | /'((.*?(\\\\)*(\\')*)*?)'/s
           { $return = $1; $return =~ s/\\'/'/g; $return =~ s/\\\\/\\/g; }
    SIMPLESTRING : /[a-zA-Z]\w*/

    term : FLOAT
         | INTEGER
         | STRING
         | SIMPLESTRING

    goodterm : FLOAT
             | INTEGER
             | STRING

    anystring : STRING
              | SIMPLESTRING
    
    hashpair : goodterm ',' expression
             { $return = [$item[1], $item[3] eq $u ? undef : $item[3]]; }
             | term '=>' expression
             { $return = [$item[1], $item[3] eq $u ? undef : $item[3]]; }
    
    arraylist : expression ',' arraylist
              { $return = [$item[1] eq $u ? undef : $item[1], @{$item[3]}]; }
              | expression ','
              { $return = [$item[1] eq $u ? undef : $item[1]]; }
              | expression
              { $return = [$item[1] eq $u ? undef : $item[1]]; }
    
    hashlist : hashpair ',' hashlist
             { $return = [@{$item[1]}, @{$item[3]}]; }
             | hashpair ','
             { $return = $item[1]; }
             | hashpair
             { $return = $item[1]; }
    
    array : '[' arraylist ']'
          { $return = $item[2]; }
          | '[' ']'
          { $return = []; }
    
    hash : '{' hashlist '}'
         { $return = { @{$item[2]} };  }
         | '{' '}'
         { $return = {}; }

    object : 'bless' '(' primitive ',' anystring ')'
           { $return = bless($item[3], $item[5]); }

    primitive : hash
              | array
              | term

    expression : object
               | 'undef'
               { $return = $u; }
               | primitive
        
    startrule : expression
              { $return = (($text =~ m/^[\s;]*$/) ? ($item[1] eq $u ? undef : $item[1]) : undef); }

_GRAMMAR_

sub new($$)
{
    my $invocant = shift;
    my $paramHRef = shift;
    my $class = ref($invocant) || $invocant;   # object or class name
    my $self = { };
    bless($self, $class);
    $self->_Initialize();
    return $self;
}

sub Undump($$)
{
    my $self = shift;
    my $string = shift;
    return $self->{'parser'}->startrule($string);
}

sub _Initialize($)
{
    my $self = shift;
    my $parser = Parse::RecDescent->new($grammar);
    $self->{'parser'} = $parser;
}

1; # A module file must end with a true value so that use/require succeeds.
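
One practical note when feeding real Data::Dumper output to this parser: the grammar starts at an expression, so it has no rule for the leading "$VAR1 = " that Dumper emits by default. A sketch of one way around that, assuming the package above has been saved as Undumper.pm somewhere on @INC (the file name and the sample data are assumptions):

```perl
use strict;
use warnings;
use Data::Dumper;

# Terse drops the leading "$VAR1 = " so the text starts directly at the structure.
$Data::Dumper::Terse    = 1;
$Data::Dumper::Sortkeys = 1;   # stable key order, handy when diffing state files

my $state = { name => 'widget', sizes => [1, 2, 3] };
my $text  = Dumper($state);
print $text;

# Assumption: Undumper.pm (the package above) is installed where require can find it.
# The block form of eval only traps the failure if it is not; no string is evaluated.
if (eval { require Undumper; 1 }) {
    my $copy = Undumper->new->Undump($text);
    print "round trip ok\n" if $copy && $copy->{name} eq 'widget';
}
```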
Re: Undumper
by mirod (Canon) on Jun 29, 2001 at 11:09 UTC

    You might also want to have a look at Data::Denter, which generates more compact output than Data::Dumper and which does not rely on eval to get the data back into Perl.

Re: Undumper
by DrZaius (Monk) on Jun 29, 2001 at 20:30 UTC
    Why not use Safe? You can restrict the compartment so that only assignment operators are allowed.
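
    A rough sketch of the Safe approach. It uses the compartment's default operator mask rather than the tighter assignment-only allow-list suggested here, so treat it as a starting point; the sample strings are invented for illustration.

```perl
use strict;
use warnings;
use Safe;

# A fresh compartment with Safe's default operator mask, which does not permit system().
my $compartment = Safe->new;

# Ordinary Data::Dumper-style text evaluates to the structure it describes.
my $good = $compartment->reval('$VAR1 = { name => "widget", sizes => [1, 2, 3] };');
print "name: $good->{name}\n";

# A tampered file is refused: reval returns undef and leaves the reason in $@.
my $bad   = $compartment->reval('system("echo pwned")');
my $error = $@;
print "rejected: $error" unless defined $bad;
```

    Narrowing the mask further with permit_only is possible, but the exact list of opcodes needed for Dumper output depends on the Perl version, so it is left out here.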

    Check out Data::Undumper, which I occasionally use. It isn't the nicest piece of code, and the interface is very touchy, but it works.

Re: Undumper
by mugwumpjism (Hermit) on Jul 01, 2001 at 20:47 UTC
    You might also want to check out the "Storable" module, which converts data structures to binary format & back with "freeze" & "thaw".
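
    A minimal freeze/thaw round trip, with made-up sample data:

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

my $state = { name => 'widget', sizes => [1, 2, undef, [3]] };

# freeze() serializes the structure to a binary string; thaw() rebuilds it.
my $frozen = freeze($state);
my $copy   = thaw($frozen);

print "name: $copy->{name}\n";
print "undef survived\n" unless defined $copy->{sizes}[2];
```

    Note that the Storable documentation carries its own warning about thawing data from untrusted sources, so this changes the format more than it changes the trust question.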
•Re: Undumper
by merlyn (Sage) on Mar 30, 2003 at 19:01 UTC
Re: Undumper
by epoptai (Curate) on Jun 29, 2001 at 11:57 UTC
      How does using 'require' differ from using 'eval'? The documentation of 'require' states that it is fundamentally a fancy 'do', which is just another way of saying 'eval'. It's not the "catching errors" part that worries people. It's the fact that your required file could contain stuff you didn't expect, such as a program to send your password file to some remote system.

      All in all, it is probably best to not require, include, or in any way run code that is arbitrary. 'use', being a compile time thing (outside of eval, of course) is a lot safer since the code can't really be modified while the program is running.

      However, if you are operating in a "clean room" environment, such as a dedicated server with strictly controlled access, where the output from Data::Dumper cannot be tampered with in any conceivable way, I would say that eval'ing that code is not as risky as some would have you believe.

      The real risk comes from running on shared systems with untrusted users who may be able to "deposit" files in your dump directory since they are using the same Web server user (i.e. nobody) and then are able to execute arbitrary code.
      I always thought just require'ing the file was enough (not that that makes it any more secure).
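
      The point that require and do run whatever the file contains can be checked with a throwaway file. This is only a sketch: the temp file and its contents are invented, and the payload is harmless.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

our $ran = 0;

# Write a "data" file that also carries an extra statement.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} '$main::ran = 1; +{ name => "widget" };';
close $fh or die "close failed: $!";

# do FILE compiles and runs the whole file, then returns its last value.
my $data = do $path;

print "extra statement ran\n" if $ran;
print "name: $data->{name}\n";
```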
