http://qs321.pair.com?node_id=298282

svsingh has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing an application that keeps track of contact information for my cousins. I just finished the part that updates the data file based on input into a form.

I'd like to ensure my data isn't tainted and the script isn't exploitable. After reading Untainting Safely and Ovid's CGI Tutorial, I came up with the code at the end of this posting.

If it's not too much touble, can I get some comments on my first attempt at untainting data. Am I missing anything or doing anything risky?

Thanks for your help.

#!/usr/local/bin/perl -wT . . . sub saveChanges { sub untaint { # remove any characters except those listed in regex + my $s = $_[0]; $s =~ s/[^\w \-\@\(\)\,\.\/]//g; # untaint the data return $s; } # only put valid keys in the hash my %c = (); my @keys = qw(name birthday email phone cell addr1 addr2 ICQ AIM MS +N Y!); foreach my $i (@keys) { if ( defined param("$i") && !param("$i") =~ /^\s*$/ ) { $c{$i} = untaint( param("$i") ); } } # update datafile my $tmpFile = $$config{"data"} . ".tmp"; open (TFILE, ">$tmpFile") or die "Can't open file: $!\n"; flock (TFILE, 2) or die "Can't lock file: $!\n"; my $cousins = &readData(); # read the current datafile foreach my $cousin ( keys %$cousins ) { if ( $cousin eq $c{"name"} ) { $$cousins{"name"} = \%c; } print TFILE "\nname :: $cousin\n"; foreach my $field ( keys %{$$cousins{$cousin}} ) { print TFILE "$field :: $$cousins{$cousin}{$field}\n"; } } flock(TFILE, 8); close(TFILE); rename( $tmpFile, $$config{"data"} ); }

Update: Added first line of the script to the code block. I'm assuming the -T argument runs the script in taint mode.

Replies are listed 'Best First'.
•Re: First Time Untainting Data
by merlyn (Sage) on Oct 10, 2003 at 15:32 UTC
    There's no point in flocking a file that you destroy before you obtain a flock.

    But, getting to your taint issue, there's also no place in having a generic "untaint" subroutine. The act of untainting is always specific to the narrowest definition of what is permitted in the data. You don't have just "untaint", you have "untaint_username" or "untaint_hostname". And "untaint_email_address" cannot exist, because every possible character is possible in an email address. {grin}

    Also, "tainting" is generally associated with programs running in "taint" mode, which I'm not seeing in your snippet. And when that happens, you need to execute a specific form of match to get rid of the taint. Something like:

    $data = /^([a-z]+)$/ or die "data isn't just alphabetic!"; $data = $1; # now grab the untainted version

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      There's no point in flocking a file that you destroy before you obtain a flock.

      Thanks for your comments. From what I learned in Order of flock and open, I cannot lock the file before I open (and destroy) it.

      Here, I was flocking the file to handle the case where two users submit the form data at the same time. I tried using flock with a test script and it seems to protect the temp file as it's being written. Won't that happen in this script as well? (I know what I'm observing, but my test conditions are fairly controlled. I'm happy to defer to your experience.)

        No, you've got a destruction going on. Consider:
        • process 1 opens for create - file deleted
        • process 1 flocks - and continues
        • process 1 writes its data
        • process 2 opens for create - blam, process 1 data is gone
        • process 1 closes, releasing the flock
        • process 2 flocks - and continues
        • process 2 writes its data
        • process 2 closes
        I guess if your goal is to have only the most recent data, you've succeeded, but you didn't need to do the flock for that... you can just leave the flocks entirely out.

        -- Randal L. Schwartz, Perl hacker
        Be sure to read my standard disclaimer if this is a reply.

        The problem merlyn is trying to point out is that when you open the file for writing (open (FH, ">file")), you already deleted the contents of the file regardless of any locks. From perlopentut:
          To get an exclusive lock, typically used for writing, you have to be careful. We "sysopen" the file so it can be locked before it gets emptied. You can get a nonblocking version using "LOCK_EX | LOCK_NB".
          use 5.004; use Fcntl qw(:DEFAULT :flock); sysopen(FH, "filename", O_WRONLY | O_CREAT) or die "can't open filename: $!"; flock(FH, LOCK_EX) or die "can't lock filename: $!"; truncate(FH, 0) or die "can't truncate filename: $!"; # now write to FH
        Hope that helps!

        -- zigdon

Re: First Time Untainting Data
by Abigail-II (Bishop) on Oct 10, 2003 at 15:39 UTC
    Two things: first, your untaint function isn't untainting the data. Removing parts from a tainted string doesn't make it untainted. Match (in parens) what you allow, and use that ($1) - that will be untainted. See the perlsec manual page.

    But looking at the rest of the program, I don't think you are doing anything that isn't insecure.

    Abigail

      Thanks for the tip. I rewrote the untaint subroutine as follows (borrowing from perlsec):
      sub untaint { my $s = $_[0]; if ($s =~ /^([\w \-\@\(\)\,\.\/]+)$/) { $s = $1; # $data now untainted } else { die "Bad data in $s"; # log this somewhere } return $s; }

      When I get home, I'll look into writing a separate untaint subroutine for each field, but is this more correct than the original?

      Thanks again.

        This is at least a syntactical valid way of untainting the data, but I do not know whether it's semantically correct. That depends on how the data is used. And do yourself (and all readers of your code), don't backwack everything. Only use a backslash when it's really needed. Like this:
        if ($s =~ m{^([-\w \@(),./]+)$}) {

        Abigail