Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"

Retrieving Office 2007 files from a BLOB using Perl

by Stegalex (Chaplain)
on Mar 10, 2010 at 19:05 UTC ( #827867=perlquestion: print w/replies, xml ) Need Help??

Stegalex has asked for the wisdom of the Perl Monks concerning the following question:

NOT an off-topic question, but bear with me...

My application needs to upload common document formats and store them in a BLOB field in Oracle. It also needs to allow users to download the documents via browser. Pretty easy, right?
br> The problem is that while my code works great for most file types, the three Office 2007 file formats give me problems (.docx, .pptx, and .xlsx).

I have checked to see that I am using the correct mime types in the Content-type header. Any feedback would be great.

BTW - I am using Embperl::Object

<!-- the Uploader (excerpt) --> # upload the file undef $buffer; open FILE, "> /tmp/bld$$" or $req->error("Could not open /tmp/bld$$. $ +!"); print FILE $buffer while read($fdat{filedata}, $buffer, 32768); close FILE; $filename=$fdat{filedata}; $filename=~s(^.*/|^.*\\)(); # depathify $filename =~ s/\s/_/g; # sanitize for security reasons, then apostrophe-quote the two updatea +ble fields. $subs->fdat_sanitize (\%fdat); # Note: we are doing this after the fi +ledata has been extracted $filename =~ s/'/''/g; $bld_comment = $fdat{bld_comment}; $bld_comment =~ s/'/''/g; # insert blank BUDGET_LINE_DOCUMENT $bld_id = $req->nextval(sequence_name => "BLD_SEQ"); $req->error("Budget Line with bld_id = $fdat{bl_id} does not exist") unless ($req->select_tupple ("select bl_id from budget_line where bl +_id = '$fdat{bl_id}'")); unless ($req->errors) { # -- Extension -- $extension = $filename; $extension =~ s/^.*(\.)/$1/; $extension = lc($extension); $mime_type = $req->select_tupple("select mime_type from mime_type wh +ere extension = '$extension'"); if ($mime_type) { # document format is recognized # -- Insert BUDGET_LINE_DOCUMENT row without DOCUMENT $sql = " insert into budget_line_document (bld_id, bl_id, username, bld_c +omment, mime_type, filename, document) values ('$bld_id', '$fdat{bl_id}', '$req->{username}', '$bld_com +ment', '$mime_type', '$filename', EMPTY_BLOB()) "; $req->{dbh}->do(qq{$sql}); $req->db_error_check(); # select row for update $sql = "select document from budget_line_document where bld_id = ' +$bld_id' for update"; my $sth = $req->{dbh}->prepare (qq{$sql}, { ora_auto_lob => 0 } ); $req->db_error_check(); $sth->execute; $req->db_error_check(); ($document) = $sth->fetchrow_array; $req->db_error_check(); $sth->finish; # update the document tupple from the temporary file open FILE, "/tmp/bld$$" or die; my $chunk_size = 4096; $buffer = ''; while ( read ( FILE, $buffer, $chunk_size) ) { $req->{dbh}->ora_lo +b_append ( $document, $buffer ); } close FILE; $req->message("$filename attached to BUDGET_LINE_DOCUMENT table fo +r budget_line $bl_id."); . . . <form name="fupload" id="fupload" method="post" enctype="multipart/for +m-data"> <label for="filedata">File: </label> <input type="file" name="filedata" id="filedata" style="width: 500px +;"> </form> <!-- The RETRIEVAL code (excerpt) --> [- $sql = "select document from budget_line_document where bld_id='$fda +t{bld_id}'"; $sth = $req->{dbh}->prepare(qq{$sql}, {ora_auto_lob => 0 } ); die $req->{dbh}->errstr if ($req->{dbh}->errstr); $sth->execute; die $req->{dbh}->errstr if ($req->{dbh}->errstr); $documentloc = $sth->fetchrow; die $req->{dbh}->errstr if ($req->{dbh}->errstr); $sth->finish; $offset=1; $chunk_size=4096; while ($data = $req->{dbh}->ora_lob_read ($documentloc, $offset, $ch +unk_size) ) { die $req->{dbh}->errstr if ($req->{dbh}->errstr); $document .= $data; $offset += $chunk_size; } die $req->{dbh}->errstr if ($req->{dbh}->errstr); $mime_type = select_tupple ("select mime_type from budget_line +_document where bld_id = '$fdat{bld_id}'"); $http_headers_out{'Content-type'} = "$mime_type"; -] [+ $document +]

Replies are listed 'Best First'.
Re: Retrieving Office 2007 files from a BLOB using Perl
by Corion (Patriarch) on Mar 10, 2010 at 19:28 UTC

    Maybe you would fare better if you used DBI placeholders instead of interpolating your data into your SQL? I think that a carefully crafted filename could still subvert your SQL, because you don't escape/remove newlines and other binary data from it.

    Also, maybe you need to use binmode when reading your file, but I don't know that much about Oracle. Also, why are you using such a low chunk size of 4096 instead of a chunk size that would fetch most of your documents in one go?

      Thanks for your remarks. I have increased the chunk size and am doing a binmode with the same results. Not sure how placeholders would impact my results as the ora_lob_append is feeding chunks to the tuple instead of trying to write the entire contents to the file then the tuple all in one shot.

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://827867]
Approved by Corion
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (2)
As of 2022-11-26 13:15 GMT
Find Nodes?
    Voting Booth?