Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation
 
PerlMonks  

Detect SQL injection

by jeanluca (Deacon)
on Aug 09, 2010 at 20:04 UTC ( [id://853909]=perlquestion: print w/replies, xml ) Need Help??

jeanluca has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks

I have this situation in which I have to create a table, but the user defines the column names and column definitions.
Anyway, I know now that the column name is handled by quote_identifier(..), but the definition, for example, a user might give me:
VARCHAR(100) ); DROP DATABASE mysql; --'
is a lot harder to validate. I couldn't find a function that does this.
So I wrote the following test script to detect this kind of SQL injection:
#! /usr/bin/perl -l use strict ; use warnings ; # USER INPUT my $SQL = 'VARCHAR(100) ); DROP DATABASE mysql; --'; validate_column_definition( $SQL ) ; my $query = sprintf( 'CREATE TABLE test ( a %s, b INT )', $SQL ) ; print $query ; sub validate_column_definition { my $SQL = shift ; # remove all strings (DEFAULT 'str' OR COMMENT 'str') $SQL =~ s/(['][^']+['])//g ; # remove quoted strings $SQL =~ s/(["][^"]+["])//g ; # remove quoted strings if ( $SQL =~ /#|--|\/\*|;|'|`|"/ ) { # detect invalid charact +ers print "SQL INJECTION: $SQL\n" ; } }
I think this could work, but I'm not sure; does it detect all possible attacks and doesn't it break legal SQL?
Any help would really be appreciated

cheers
LuCa

Replies are listed 'Best First'.
Re: Detect SQL injection
by graff (Chancellor) on Aug 09, 2010 at 22:44 UTC
    Given this as the starting point:

    I have this situation in which I have to create a table, but the user defines the column names and column definitions.

    If this is being done via an http request, I would structure the cgi query parameters for table columns as a list of related pairs: one member of each pair would be the "column name" (which needs to match a very simple regex) and the other would be the data type (which needs to come from a closed set of known alternatives).

    How you structure the user's input form to support that is up to you -- the cleanest approach would probably involve some javascripting that allows the user to accumulate one or more column definitions, each of which adds two related parameters to the request (a type-in for the column name, a pull-down menu for the data type) -- e.g. the query might contain params sort of like this:

    ...colname_1=foo&coltype_1=int&colname_2=bar&coltype_2=varchar...
    (You don't have to support all possible variations and sizes of data types -- a minimal set with default sizes ought to suffice.)

    When it comes to structuring the server-side activity to handle the request, of course, you can't assume that every request will come from your nifty input form page, but all you have to do is grep out the paired parameters, make sure that you have a usable name and a known data type in each pair, and assemble your SQL "create table ..." statement accordingly.

    That all assumes that the user requirements for creating tables do not involve "special features" like uniqueness constraints on one or more given columns, foreign-key relations to other tables, particular character-set and/or collation specs on varchars, and so on. (To some extent, some of these features could be included as options in a more elaborate input form.)

    In any case, by structuring the input task so that each user-specified column involves two separate user inputs (name and data type), I think you'll be making things easier and safer overall, for the user and for the server.

    (updated 2nd paragraph to be more coherent)

Re: Detect SQL injection
by psini (Deacon) on Aug 09, 2010 at 20:16 UTC

    I think that quote_identifier avoids the injection problem by itself, but I'm not sure, and it could depend on the database used.

    In my opinion the real question is: do you really need to allow users to use ANY valid name for a table/column? For it is certainly easier allowing only a subset of valid names (say /[_A-Z][_A-Z0-9]*/i) than trying to foresee any possible attack strategy.

    Rule One: "Do not act incautiously when confronting a little bald wrinkly smiling man."

Re: Detect SQL injection
by planetscape (Chancellor) on Aug 10, 2010 at 06:59 UTC
Re: Detect SQL injection
by JavaFan (Canon) on Aug 10, 2010 at 00:04 UTC
    Who is the "user" anyway? If it's someone who already has the power to drop tables, delete/modify rows, etc, there's little he can do via your program he can't do otherwise.

    Also, most databases have a fine grained access system. If your table creating program is running as a webservice, give the connection only permission to create tables - and nothing else.

    But parsing of types shouldn't be that hard - specially if you don't allow to set all kinds of constraints. I would just check for allowable types, and reject anything that doesn't look like a valid type definition (instead of trying to detect anything that may be a problem).

Re: Detect SQL injection
by aquarium (Curate) on Aug 10, 2010 at 00:11 UTC
    Another take on the problem is to internally use pre-computed table names and column names but setup a structure to display the aliases to table/column names as specified by user, which would actually be stored in a table. As these user defined aliases are ever used for display only, it prevents SQL injection problem. However there is some overhead initially setting up this structure.
    the hardest line to type correctly is: stty erase ^H
Re: Detect SQL injection
by proceng (Scribe) on Aug 09, 2010 at 23:31 UTC
    I think this could work, but I'm not sure; does it detect all possible attacks and doesn't it break legal SQL?
    Any help would really be appreciated
    I would be more apt to allow a user to only create/drop databases in their own directories or to filter out the unsafe commands (ie: They can create, but have to contact you to drop a database or index).

    Alternatively,You can force a (non-escaped) literal ';' to be the end of the statement under all circumstances.

    As an aside, is there some reason that you want minimal constraints on what the user can do (especially if this is a production environment)? Use taint mode for input and validate that input for ;, followed by any of the SQL commands that you don't want the user to execute occurring anywhere in the line. That way (given that there is a very small set of commands) you can throw an exception (and log it when they occur.

    UPDATE: Corrected for the real question :-(

    I have this situation in which I have to create a table, but the user defines the column names and column definitions.
    Anyway, I know now that the column name is handled by quote_identifier(..), but the definition, for example, a user might give me: VARCHAR(100) ); DROP DATABASE mysql; --'
    Within your code, make sure "taint" mode is enabled (at least in the code block where the user input is occurring), then redefine (locally) $\ (The input record separator, newline by default) to be ";". Scan each input line for any of the "dangerous" commands such as DROP DATABASE (see the command reference for your flavor of SQL platform) and throw an exception if they should occur.
Re: Detect SQL injection
by DeepThought (Initiate) on Aug 10, 2010 at 11:19 UTC

    DBI does the quoting for you. You should never use sprintf or other string formatting because the likeliness that your quoting is not sufficient is too big. Other people have solved that issue for you decades ago once and forever.

    I have to admit this is untested, but it is very likely to work like this.
    $dbh->do('CREATE TABLE test (a ?, b INT)', undef, $SQL);

    If not, have a look at http://search.cpan.org/~timb/DBI-1.613/DBI.pm#quote Or consolidate your SQL driver for quoting functions if you don't use DBI (Dare you!).

    Regards
    Tilman Baumann
      A few points.
      • Not every database(driver) allows place holders in just any place you want.
      • You do not want to quote your types. Even simple types usually consist of several (lexical) tokens, for instance char(3), which has 4 tokens. Quoting that makes it one token, and not valid SQL.
      • Why the Dare you! at the end? There can be many reasons to prefer more closely following the companies database's API instead of using a greatest-common-divisor strategy. I've done rapid prototyping, trying out solution using Perl, which then later got implemented in the company's main code repo, which was written in C. I wrote my solutions using sybperl, and DBlib/CTlib calls, because those are the calls the Sybase C-libraries provided. DBI would not have been a good choice.
      Thats what I was looking for.
      On the other hand, I just tried to execute a possible SQL-injected query
      $dbh->do( 'CREATE TABLE a (x INT) ; DROP TABLE a -- );' ) ; // or $dbh->do( 'CREATE TABLE a (x INT) ; DROP TABLE a' ) ;
      Both give an error. Is it a rookie SQL mistake or what. I'm confused now!!
Re: Detect SQL injection
by afoken (Chancellor) on Aug 11, 2010 at 12:51 UTC

    Generally: Don't try to detect injections, avoid them. Restrict every input to a set of save characters, abort whenever you see invalid input. Don't try to repair invalid input, doing so often opens another way for injections.

    Even more generally for any DBI application: Let DBI handle the quoting for you. Use prepared statements with placeholders, DBI will take care of proper quoting, should the database wire protocol require any quoting. Use the quote_identifier() method to quote any identifier (column names, table names). DBI might not contain code to handle your specific database requirements, but the DBD you are using sure does.

    Specifically for your problem: Don't allow free text input. You know (or can find out) what types are supported by your database. Use something like a dropdown list or radio buttons to let the user select one of the available types. Some types need additional data, like VARCHAR or NUMBER. You know when to ask for that, it depends on the selected type. Use one or two additional input fields, and accept only numbers there. NULL / NOT NULL: Use a checkbox. The same for the primary key constraint. Unique constraints need a name and a list of existing column names. Also no problem, ask for a name and offer a multi-select for the columns. The same rules apply for other constructs found in CREATE TABLE statement, like foreign keys. Finally, assemble all of the information bits into a big CREATE TABLE statement. Bonus points for detecting missing primary keys.

    You did not say anything about your environment, so I assumed web or GUI. In text mode, you have to ask a lot more questions than you do now, but the principle stays the same: Ask for one piece of information at a time, and validate the input.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Detect SQL injection
by ikegami (Patriarch) on Aug 11, 2010 at 16:02 UTC
    Sounds like you accept an SQL type from the user, so what you need to do is validate that you got a valid SQL type (or whatever subset you deem acceptable). Doing so doesn't involve removing characters.
    $type =~ s/\s+/ /g; # Liberal for inputs, strict for outputs. $type =~ s/^\s//; $type =~ s/\s\z//; $type = uc($type); $type =~ / ^ (?: VARCHAR (?: \s* \( \s* [0-9]+ \s* \) )? | CHAR (?: \s* \( \s* [0-9]+ \s* \) )? | ... ) \z /x or die;

    You could also accept two fields, one of them a drop down. The user wouldn't have to know SQL.

    if ($type eq 'VARCHAR') { if ($arg eq '') { $sql = $type; } elsif ($arg =~ /^[0-9]+\z/) { $sql = "$type($arg)"; } else { die } } elsif ($type eq 'CHAR') { if ($arg eq '') { $sql = $type; } elsif ($arg =~ /^[0-9]+\z/) { $sql = "$type($arg)"; } else { die } } ... else { die }
Re: Detect SQL injection
by mpeppler (Vicar) on Aug 15, 2010 at 07:41 UTC
    With Sybase, one of the best ways that I've found to avoid any risk of SQL Injection is to mandate the use of stored procedures, and to always use RPC semantics when calling the stored procedures rather than language commands.

    This means that there is no parsing of the input parameters (other than ensuring that they match with the parameter's datatypes), so any text that is passed in for a particular parameter can never be executed.

    Another advantage (on large systems) is that the procs encapsulate all the SQL, making it a lot easier to find offending (badly performing) queries, and tuning them independently of the client code.

    I realize that this isn't always practical, but once everyone in the team knows how this works it's actually quite efficient, in particular on large systems (200+ developers, several thousand tables, three million+ lines of SQL code...)

    Michael

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://853909]
Approved by Corion
Front-paged by ww
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others studying the Monastery: (7)
As of 2024-04-23 15:50 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found