Stegalex has asked for the wisdom of the Perl Monks concerning the following question:

NOT an off-topic question, but bear with me...

My application needs to upload common document formats and store them in a BLOB field in Oracle. It also needs to allow users to download the documents via browser. Pretty easy, right?
br> The problem is that while my code works great for most file types, the three Office 2007 file formats give me problems (.docx, .pptx, and .xlsx).

I have checked to see that I am using the correct mime types in the Content-type header. Any feedback would be great.

BTW - I am using Embperl::Object

<!-- the Uploader (excerpt) --> # upload the file undef $buffer; open FILE, "> /tmp/bld$$" or $req->error("Could not open /tmp/bld$$. $ +!"); print FILE $buffer while read($fdat{filedata}, $buffer, 32768); close FILE; $filename=$fdat{filedata}; $filename=~s(^.*/|^.*\\)(); # depathify $filename =~ s/\s/_/g; # sanitize for security reasons, then apostrophe-quote the two updatea +ble fields. $subs->fdat_sanitize (\%fdat); # Note: we are doing this after the fi +ledata has been extracted $filename =~ s/'/''/g; $bld_comment = $fdat{bld_comment}; $bld_comment =~ s/'/''/g; # insert blank BUDGET_LINE_DOCUMENT $bld_id = $req->nextval(sequence_name => "BLD_SEQ"); $req->error("Budget Line with bld_id = $fdat{bl_id} does not exist") unless ($req->select_tupple ("select bl_id from budget_line where bl +_id = '$fdat{bl_id}'")); unless ($req->errors) { # -- Extension -- $extension = $filename; $extension =~ s/^.*(\.)/$1/; $extension = lc($extension); $mime_type = $req->select_tupple("select mime_type from mime_type wh +ere extension = '$extension'"); if ($mime_type) { # document format is recognized # -- Insert BUDGET_LINE_DOCUMENT row without DOCUMENT $sql = " insert into budget_line_document (bld_id, bl_id, username, bld_c +omment, mime_type, filename, document) values ('$bld_id', '$fdat{bl_id}', '$req->{username}', '$bld_com +ment', '$mime_type', '$filename', EMPTY_BLOB()) "; $req->{dbh}->do(qq{$sql}); $req->db_error_check(); # select row for update $sql = "select document from budget_line_document where bld_id = ' +$bld_id' for update"; my $sth = $req->{dbh}->prepare (qq{$sql}, { ora_auto_lob => 0 } ); $req->db_error_check(); $sth->execute; $req->db_error_check(); ($document) = $sth->fetchrow_array; $req->db_error_check(); $sth->finish; # update the document tupple from the temporary file open FILE, "/tmp/bld$$" or die; my $chunk_size = 4096; $buffer = ''; while ( read ( FILE, $buffer, $chunk_size) ) { $req->{dbh}->ora_lo +b_append ( $document, $buffer ); } close FILE; $req->message("$filename attached to BUDGET_LINE_DOCUMENT table fo +r budget_line $bl_id."); . . . <form name="fupload" id="fupload" method="post" enctype="multipart/for +m-data"> <label for="filedata">File: </label> <input type="file" name="filedata" id="filedata" style="width: 500px +;"> </form> <!-- The RETRIEVAL code (excerpt) --> [- $sql = "select document from budget_line_document where bld_id='$fda +t{bld_id}'"; $sth = $req->{dbh}->prepare(qq{$sql}, {ora_auto_lob => 0 } ); die $req->{dbh}->errstr if ($req->{dbh}->errstr); $sth->execute; die $req->{dbh}->errstr if ($req->{dbh}->errstr); $documentloc = $sth->fetchrow; die $req->{dbh}->errstr if ($req->{dbh}->errstr); $sth->finish; $offset=1; $chunk_size=4096; while ($data = $req->{dbh}->ora_lob_read ($documentloc, $offset, $ch +unk_size) ) { die $req->{dbh}->errstr if ($req->{dbh}->errstr); $document .= $data; $offset += $chunk_size; } die $req->{dbh}->errstr if ($req->{dbh}->errstr); $mime_type = select_tupple ("select mime_type from budget_line +_document where bld_id = '$fdat{bld_id}'"); $http_headers_out{'Content-type'} = "$mime_type"; -] [+ $document +]

Replies are listed 'Best First'.
Re: Retrieving Office 2007 files from a BLOB using Perl
by Corion (Patriarch) on Mar 10, 2010 at 19:28 UTC

    Maybe you would fare better if you used DBI placeholders instead of interpolating your data into your SQL? I think that a carefully crafted filename could still subvert your SQL, because you don't escape/remove newlines and other binary data from it.

    Also, maybe you need to use binmode when reading your file, but I don't know that much about Oracle. Also, why are you using such a low chunk size of 4096 instead of a chunk size that would fetch most of your documents in one go?

      Thanks for your remarks. I have increased the chunk size and am doing a binmode with the same results. Not sure how placeholders would impact my results as the ora_lob_append is feeding chunks to the tuple instead of trying to write the entire contents to the file then the tuple all in one shot.