http://qs321.pair.com?node_id=609632

derby has asked for the wisdom of the Perl Monks concerning the following question:

Update: As of 2007/11/30, CGI v3.31 includes the patch to accept PUT data correctly. Thanks to prodding from rhesa, CGI and CGI::Simple now support all the HTTP methods necessary to build REST services.


I finally started building a true REST webservice when I ran smack into a wall. The service is your basic CRUD service where an HTTP GET retrieves a product, DELETE deletes it, POST updates it and PUT creates one:

GET    http://foo.com/webservice/<productid>
DELETE http://foo.com/webservice/<productid>
POST   http://foo.com/webservice/<productid>
PUT    http://foo.com/webservice

Nothing out of the ordinary there, right? The thing is, I'm a big fan of CGI::Application, and it uses CGI at its core (but that's overridable). The wall is the way CGI handles the PUT method (it doesn't, really) and the way it handles the POST method -- it's designed for HTML form parsing. No problem, I thought. CGI::Application can swap out CGI for any other module as long as that module adheres to the CGI interface (well, not the entire interface).
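
For anyone who wants to see the method-to-action mapping above in code, here is a minimal sketch as a plain dispatch table. It is not taken from the service itself: the action names `retrieve` and `delete` and the helper `action_for_method` are made up for illustration (only `update` and `create` appear later in the post), and it assumes Perl 5.10 or later for the `//` operator.

```perl
use strict;
use warnings;

# Hypothetical action names keyed by HTTP method, following the
# GET/DELETE/POST/PUT scheme described in the post.
my %action_for = (
    GET    => 'retrieve',
    DELETE => 'delete',
    POST   => 'update',
    PUT    => 'create',
);

# Return the action name for a request method, or undef if the
# method is not one the service handles.
sub action_for_method {
    my ($method) = @_;
    return $action_for{ uc( $method // '' ) };
}

# Under CGI the web server puts the method in REQUEST_METHOD;
# fall back to GET when running from the command line.
my $method = $ENV{REQUEST_METHOD} // 'GET';
my $action = action_for_method($method);

print defined $action
    ? "$method -> $action\n"
    : "405 Method Not Allowed: $method\n";
```

In a CGI::Application subclass the looked-up name would presumably be used as the run mode; that wiring is left out here because it depends on how the application is set up.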

So I needed a module that would

  1. adhere to the CGI interface
  2. support the HTTP PUT method
  3. not form parse PUT and POST data

After much searching, I couldn't find a module that met those needs. The closest I came was CGI::XMLpost, but that wasn't even horseshoe close.

I finally decided to build one myself but given the nature of CGI, I was pretty sure it wasn't going to be quick and it wasn't going to be pretty. I had used CGI::Simple in the past and started thinking if there was a way to co-opt it into what I wanted.

After looking at the code, I figured out that all I needed to do was override its _read_parse method and then add accessors for the POST and PUT data. Only 4 changes were needed for _read_parse:

package Foo::CGI::Rest;

use base 'CGI::Simple';

sub _read_parse {
  my $self   = shift;
  my $data   = '';
  my $type   = $ENV{'CONTENT_TYPE'}   || 'No CONTENT_TYPE received';
  my $length = $ENV{'CONTENT_LENGTH'} || 0;
  my $method = $ENV{'REQUEST_METHOD'} || 'No REQUEST_METHOD received';

  # change #1 - added or "PUT" here ... we don't want
  # malicious PUTs either
  # first check POST_MAX Steve Purkis pointed out the previous bug
  if( ( $method eq 'POST' or $method eq "PUT" )
      and $self->{'.globals'}->{'POST_MAX'} != -1
      and $length > $self->{'.globals'}->{'POST_MAX'}) {
    $self->cgi_error(
      "413 Request entity too large: $length bytes on STDIN exceeds \$POST_MAX!" );
    # silently discard data ??? better to just close the socket ???
    while ($length > 0) {
      last unless sysread(STDIN, my $buffer, 4096);
      $length -= length($buffer);
    }
    return;
  }

  if( $length and $type =~ m|^multipart/form-data|i ) {
    my $got_length = $self->_parse_multipart;
    if( $length != $got_length ) {
      $self->cgi_error("500 Bad read on multipart/form-data! wanted $length, got $got_length");
    }
  # change #2 - or "PUT" here too
  } elsif( $method eq 'POST' or $method eq 'PUT' ) {
    if( $length ) {
      # we may not get all the data we want with a single read on large
      # POSTs as it may not be here yet! Credit Jason Luther for patch
      # CGI.pm < 2.99 suffers from same bug
      sysread(STDIN, $data, $length);
      while( length($data) < $length ) {
        last unless sysread(STDIN, my $buffer, 4096);
        $data .= $buffer;
      }

      # change #3 - don't send data to parse params ... it's not form data
      if( $length == length $data ) {
        $self->set_data( $data );
      } else {
        $self->cgi_error("500 Bad read on POST! wanted $length, got " . length($data));
      }
    }
  } elsif( $method eq 'GET' or $method eq 'HEAD' ) {
    $data = $self->{'.mod_perl'} ?
      $self->_mod_perl_request()->args() :
      $ENV{'QUERY_STRING'} || $ENV{'REDIRECT_QUERY_STRING'} || '';
    $self->_parse_params($data);
  } else {
    unless ($self->{'.globals'}->{'DEBUG'} and
            $data = $self->read_from_cmdline()) {
      $self->cgi_error("400 Unknown method $method");
    }
  }
}

# change #4 - create accessors
sub set_data {
  my( $self, $data ) = @_;
  $self->{_data} = $data;
}

sub get_data {
  my( $self ) = @_;
  return $self->{_data};
}

1;

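
The read loop in the override (keep reading until CONTENT_LENGTH bytes have arrived) can be tried on its own. The following is only a sketch under a few assumptions: `read_body` is a hypothetical helper, not part of CGI::Simple, and it uses `read` rather than `sysread` so that it can be run against an in-memory filehandle instead of a real STDIN.

```perl
use strict;
use warnings;

# Read exactly $length bytes from $fh. A single read may return fewer
# bytes than requested, so keep appending until we have them all.
# Returns the data, or undef if the stream ends early.
sub read_body {
    my ( $fh, $length ) = @_;
    my $data = '';
    while ( length($data) < $length ) {
        # The fourth argument appends at the current end of $data.
        my $got = read( $fh, $data, $length - length($data), length($data) );
        last unless $got;    # undef on error, 0 at end of file
    }
    return length($data) == $length ? $data : undef;
}

# Simulate a PUT body with an in-memory filehandle.
my $body = '<product><name>widget</name></product>';
open my $fh, '<', \$body or die "cannot open in-memory handle: $!";

my $data = read_body( $fh, length $body );
print defined $data
    ? "read " . length($data) . " bytes\n"
    : "short read\n";
```

Returning undef on a short read corresponds to the "500 Bad read" branch in the override, where the code refuses to hand a truncated body to the handlers.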
Now in my CGI::Application all I have to do is:

sub cgiapp_get_query {
  my $self = shift;
  require Foo::CGI::Rest;
  return Foo::CGI::Rest->new();
}

and in my handlers for POST and PUT:

sub update {
  my $self = shift;
  my $cgi = $self->query();
  my $xmlstr = $cgi->get_data();
  ...
}

sub create {
  my $self = shift;
  my $cgi = $self->query();
  my $xmlstr = $cgi->get_data();
  ...
}

The thing that worries me, though, is that REST has been around a few years now and CGI has been around forever. So any ideas why CGI and its derivatives treat PUT like a red-headed stepchild?

-derby

Update: Updated title.

Update: Given the great feedback from rhesa, I've further simplified the _read_parse override by setting the POSTDATA and PUTDATA params if the POST'ed or PUT'ed data is not of type 'application/x-www-form-urlencoded' ... hmmm, maybe I should submit this as a patch to the CGI::Simple author. Here it is in all its simpleness:

package Foo::CGI::Rest;

use base 'CGI::Simple';

sub _read_parse {
  my $self   = shift;
  my $data   = '';
  my $type   = $ENV{'CONTENT_TYPE'}   || 'No CONTENT_TYPE received';
  my $length = $ENV{'CONTENT_LENGTH'} || 0;
  my $method = $ENV{'REQUEST_METHOD'} || 'No REQUEST_METHOD received';

  # first check POST_MAX Steve Purkis pointed out the previous bug
  if( ( $method eq 'POST' or $method eq "PUT" )
      and $self->{'.globals'}->{'POST_MAX'} != -1
      and $length > $self->{'.globals'}->{'POST_MAX'}) {
    $self->cgi_error(
      "413 Request entity too large: $length bytes on STDIN exceeds \$POST_MAX!" );
    # silently discard data ??? better to just close the socket ???
    while ($length > 0) {
      last unless sysread(STDIN, my $buffer, 4096);
      $length -= length($buffer);
    }
    return;
  }

  if( $length and $type =~ m|^multipart/form-data|i ) {
    my $got_length = $self->_parse_multipart;
    if( $length != $got_length ) {
      $self->cgi_error("500 Bad read on multipart/form-data! wanted $length, got $got_length");
    }
  } elsif( $method eq 'POST' or $method eq 'PUT' ) {
    if( $length ) {
      # we may not get all the data we want with a single read on large
      # POSTs as it may not be here yet! Credit Jason Luther for patch
      # CGI.pm < 2.99 suffers from same bug
      sysread(STDIN, $data, $length);
      while( length($data) < $length ) {
        last unless sysread(STDIN, my $buffer, 4096);
        $data .= $buffer;
      }

      if( $length == length $data ) {
        if( $type !~ m|^application/x-www-form-urlencoded| ) {
          $self->_add_param( $method . "DATA", $data );
        } else {
          $self->_parse_params( $data );
        }
      } else {
        $self->cgi_error("500 Bad read on POST! wanted $length, got " . length($data));
      }
    }
  } elsif( $method eq 'GET' or $method eq 'HEAD' ) {
    $data = $self->{'.mod_perl'} ?
      $self->_mod_perl_request()->args() :
      $ENV{'QUERY_STRING'} || $ENV{'REDIRECT_QUERY_STRING'} || '';
    $self->_parse_params($data);
  } else {
    unless ($self->{'.globals'}->{'DEBUG'} and
            $data = $self->read_from_cmdline()) {
      $self->cgi_error("400 Unknown method $method");
    }
  }
}

1;
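
The content-type decision in this second version can be looked at in isolation. This is a rough sketch only: `body_param_name` is a hypothetical helper that mirrors the `$method . "DATA"` naming in the override, and it covers just the POST/PUT branch (multipart/form-data is handled earlier in the real method).

```perl
use strict;
use warnings;

# Decide what to do with a POST or PUT body:
#   - form-encoded data is parsed into ordinary params (returns undef)
#   - anything else is stored whole under "<METHOD>DATA"
sub body_param_name {
    my ( $method, $content_type ) = @_;
    return if $content_type =~ m|^application/x-www-form-urlencoded|;
    return $method . 'DATA';
}

for my $case (
    [ PUT  => 'text/xml' ],
    [ POST => 'application/x-www-form-urlencoded' ],
) {
    my ( $method, $type ) = @$case;
    my $name = body_param_name( $method, $type );
    print "$method $type: ",
        ( defined $name ? "store under $name" : "parse as form fields" ),
        "\n";
}
```

With the override in place, a handler would presumably fetch the raw body with `$cgi->param('PUTDATA')` or `$cgi->param('POSTDATA')` instead of the earlier `get_data()` accessor.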

Update: Submitted a patch to the author of CGI::Simple.