Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

DBD::RAM

by mirod (Canon)
on Nov 30, 2000 at 16:37 UTC ( [id://44114]=modulereview: print w/replies, xml ) Need Help??

Item Description: A DBI driver for files and data structures

Review Synopsis:

Description

DBD::RAM is Jeff Zucker's driver for the DBI that allows you to import files in memory and treat them as relational tables, with SQL queries.
Several tables can be created this way and SQL joins can be simulated through loops.
Changes to the data can be reflected to the original file (if the table is created using the catalog function) or the table can be export-ed to one of the formats supported by DBD::RAM.

DBD::RAM can process the following formats:

  • CSV (Comma Separated Values),
  • Fixed-width records, using pack to define the format,
  • Perl structures (arrays of arrayrefs or hasrefs) so you can create the data in the Perl script and then use it as a relational table,
  • XML,
  • User-defined, which allows the user to define a subroutine that will parse the input and return an array with the various fields,
  • Other DBI data bases, so you can load a table in memory, close the connection and then process the data,
  • MP3 headers from a group of directories.

The data in all formats can be input either locally, from strings, files or pipes, or remotely through LWP

Why use DBD::RAM

  • you are dealing with existing data, which format you have no control over, but you still want to access it through the DBI interface, using SQL,
  • you want to use SQL without installing a relational DB,
  • you want to prototype an application without a DB but think that you might add one down the line,
  • you want to convert data from a DBD::RAM supported format to an other,
  • you want to use an XML file as a table (exporting it back might not work for you though)

Why NOT use DBD::RAM

You will not use DBD::RAM essentially if you need a real data base.

  • you process huge amounts of data,
  • the data is already in an existing DB.

Example

The obligatory XML example:

# connect to the DB my $dbh= DBI->connect( "DBI:RAM:" , {RaiseError => 1} ); # create the table $dbh->func( { table_name => 'projects', data_type => 'XML', record_tag => 'projects project', col_names => 'pid,name,description', file_source => 'project.xml', }, 'import'); # prepare the SQL statement my $sth= $dbh->prepare( "SELECT * FROM projects"); # execute the statement $sth->execute(); # output the result of the query while( my $hashref= $sth->fetchrow_hashref()) { foreach my $field ( keys %$hashref) { print "$field: $hashref->{$field}\t"; } print "\n"; } # export the table back as XML $dbh->func( { data_type => 'XML', data_target => "new_projects.xml", data_source => "select * from projects", record_tag => 'projects project', col_names => 'pid,name,description', }, 'export' );

Note on XML import: make sure you include the whole hierarchy of tags in the record_tag argument, from the document root to the record tag itself.

Personal Notes

I really like DBD::RAM. It allows treating lots of structured data as a relational table, including XML (mostly for extracting data from an XML file though). It also allows quick prototyping without having to go through the pain of yet-another-mysql install.

The good
DBD::RAM is really flexible. For example in the CSV format the field and record separators can actually be redefined so it's more like ASV (Anything Separated Values). For most formats field definitions can also be extracted from the first line of the file instead of being hard coded in the script.
The XML import allows you a good deal of customizing the data you want to extract from the XML file, including having records inherit attributes from their parents. Encoding conversions, and especially latin-1 output are also handled.
The documentation is pretty good: it is comprehensive and includes lots of examples that can be cut-n-paste'd.

The not-so-good
The initial debugging of an application can be quite a pain though, as error messages on import are no too helpful, they tell you that something is wrong but not quite where the exact error is.
Some XML data is difficult to extract (for example if several parent tags have attributes with the same name you can't fold them properly) but this can be fixed by a simple pre-processing of the data.
The syntax of the XML option is slightly confusing (space separated list of tags for the record_tag argument, but comma (no space) for the col_names argument.
This should improve with future versions of the module.

Related Modules

For XML processing you might want to have a look at XML::RAX or at DBIx::XML_RDB (to export XML from a RDB).

The future

DBD::RAM should be replaced by AnyData and DBD::AnyData any time now, see this note for the planned architecture.

Update: after the author reviewing the... review I have fixed a couple of (embarassing) mistakes: DBD::RAM does file locking (through flock), but it does not do join. I have fixed the review accordingly (plus a couple of other points Jeff mentioned.)

Update (2): AnyData and DBD::AnyData are out and officially replace DBD::RAM.

Replies are listed 'Best First'.
Re: DBD::RAM
by clemburg (Curate) on Nov 30, 2000 at 22:48 UTC

    If you are interested in processing CSV files with DBI, take a look at DBD::CSV, too.

    Christian Lemburg
    Brainbench MVP for Perl
    http://www.brainbench.com

      Actually DBD::RAM uses DBD::CVS for CVS input

        CSV, not CVS, but I assume you mean that. Funny.

        Christian Lemburg
        Brainbench MVP for Perl
        http://www.brainbench.com

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: modulereview [id://44114]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others exploiting the Monastery: (8)
As of 2024-03-28 18:50 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found