in reply to Handling huge BLOB fields with DBI and MySQL

Storing a large piece of data in a database is not always the best solution, but sometimes it is the only thing you can do, so it's good to know that you can.

On the other hand, if I store something in a database, it's because I need to manipulate it. If I just store it in a database and then retrieve it without any additional benefit such as searching, counting, averaging, or summing, then I am wasting my time and would do better to use FTP.

For my part, I don't like keeping filenames in the database and the data in files, because it adds a level of complexity to my system. Not to the uploading part, which could actually be easier, but to the maintenance.

If I have everything in my database, I have only one thing to take care of. The whole system is in one place, for better or for worse.

I actually tried the filenames way and found that it is not as practical as I would like.

I noticed that when the database does not handle the storage, some operations become more complicated. For example, I need to guarantee the data integrity of my fields, and I do so by running a CRC function and comparing the results to other values stored in a table. I can do this with a single query, which is not very practical with files scattered across the filesystem. Or I may need to calculate the total size of my fields, depending on the values of some other fields. These are all things that you can do with files only if you have access to the server's filesystem, while the database can accommodate your needs with a simple call. Besides, you may want to store BLOBs that you need to search. They could be documents whose headers you need to index. I can find so many advantages in having the data in a database that I am very reluctant to give them up.

The only difficult part is sending the data and fetching it. Apart from that, the rest of the operations are easier in a database than in a collection of files.

I am not saying that the "filenames" solution is wrong. I just want to keep it simple. For my needs, BLOBs are the right choice.

I am basically agnostic, so I won't try to convince anybody that what I am doing is the thing to do. Rather, I give my reasons, and I welcome anybody who shares my thoughts. I know that many people want to go the BLOB route but have run into technical problems. So I am offering my solution.

 _  _ _  _  
(_|| | |(_|><

Re: Re: Handling huge BLOB fields with DBI and MySQL
by Rhose (Priest) on Mar 08, 2002 at 15:26 UTC
    While this is not a Perl question, how does storing these files in BLOBs impact your database recovery procedures?

    From what I understand, MySQL does have a (slightly crude) recovery method -- the database keeps an update log of all activity. You can then replay the update log which stores the changes since your last database backup to bring the system "up to date". (Although I have not played with this, I also assume you could edit the update log to simulate a point in time recovery.)

    It would seem that storing just the file names (assuming you do not have version control in place for the documents) means that you would have a very difficult time recovering from certain failures. However, (as I mentioned earlier, I have not played with the MySQL update logs) it would also seem that storing the changes to multi-meg or multi-gig fields would cause the update log to exceed the OS file size limitations.

    How have these concerns impacted your implementations?

      Recovering a database can be as easy as restoring your latest backup and resuming business, if you are well organized.

      If you are using binary logs, the system can recover fairly easily. BLOBs are not a problem here; they are just more data in your database.

      As for being organized, you might have noticed that I added a timestamp field to my table. This way, I can take an incremental backup of the fields that were modified in a given timeframe, to complement a full weekly backup.
      The subject deserves more space than we can dedicate here. The matter is explained much better than this in Paul DuBois' book, MySQL.

      Personally, I would say that storing BLOBs in scattered files makes your task more difficult, but TMTOWTDI, after all, and I might be wrong. Let's say that I am just more comfortable with my current architecture.

       _  _ _  _  
      (_|| | |(_|><
Re^2: Handling huge BLOB fields with DBI and MySQL
by Anonymous Monk on Jul 26, 2004 at 14:20 UTC
    If you are writing binary BLOBs to a file, don't you need to set binmode on your output stream? i.e. binmode OUTFILE;

      binmode is mandatory only on MS-DOS/Windows systems.

      perldoc -f binmode
      On some systems (in general, DOS and Windows-based systems) binmode() is necessary when you're not working with a text file.
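
To illustrate the binmode point, here is a minimal sketch of writing a BLOB to a file and reading it back. The helper names write_blob and read_blob and the file name blob_demo.bin are made up for this example and do not come from the original node. Calling binmode is harmless on Unix, so using it everywhere keeps the code portable.

```perl
use strict;
use warnings;

# Write raw bytes to a file. binmode turns off newline translation,
# which matters on DOS/Windows and is a no-op on Unix.
sub write_blob {
    my ($path, $data) = @_;
    open my $out, '>', $path or die "Cannot open $path for writing: $!";
    binmode $out;
    print {$out} $data or die "Cannot write to $path: $!";
    close $out or die "Cannot close $path: $!";
    return length $data;    # number of bytes written
}

# Read the whole file back as raw bytes.
sub read_blob {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot open $path for reading: $!";
    binmode $in;
    local $/;               # slurp mode: read everything in one go
    my $data = <$in>;
    close $in;
    return $data;
}

# Every byte value, including CR, LF and Ctrl-Z, which text-mode
# I/O on Windows would alter.
my $blob = join '', map { chr } 0 .. 255;
my $file = 'blob_demo.bin';    # hypothetical file name

write_blob($file, $blob);
my $copy = read_blob($file);
print $copy eq $blob ? "round trip OK\n" : "data changed\n";
unlink $file;
```

The same applies to a BLOB fetched through DBI: once the value is in a scalar, it can be passed to write_blob as-is.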