PerlMonks  

Re: Basic Database?

by gildir (Pilgrim)
on Jun 11, 2001 at 19:14 UTC


in reply to Basic Database?

There are two possible solutions:

*DBM
Use a DBM file with serialized data structures as the values. You can use MLDBM for this, but since it uses Data::Dumper it could be somewhat slow for your purposes (see Serialization uncovered). The real pain, though, is indexes. A *DBM file gives you only one key, so you must maintain several other *DBM files that map secondary keys to primary keys, and you must keep them all consistent. If you want to do a substring search, there is no easy method except a linear search, and that can be terribly slow even for a small database. One more problem with *DBM files is parallel access: you cannot open a *DBM file for writing in two processes at once. And if you open and close the file around every access instead, the data cannot be cached and access is very inefficient.
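To make the bookkeeping concrete, here is a minimal sketch of a primary file plus one secondary index, followed by the linear substring scan. It is only an illustration under stated assumptions: it uses SDBM_File and Storable (both core modules) rather than MLDBM, and the record layout, the file names and the helpers store_user, fetch_by_email and search_name are all made up for the example.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;                   # core DBM flavour, chosen only for portability
use Storable qw(freeze thaw);    # binary serializer, swapped in for Data::Dumper
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1); # throwaway location for the example files

# Primary file: id => frozen record. Secondary file: email => id.
tie my %by_id,    'SDBM_File', "$dir/users",    O_RDWR|O_CREAT, 0640
    or die "Cannot open users: $!\n";
tie my %by_email, 'SDBM_File', "$dir/by_email", O_RDWR|O_CREAT, 0640
    or die "Cannot open by_email: $!\n";

# Every write has to touch both files, or the index goes stale.
sub store_user {
    my ($id, $rec) = @_;
    if (defined(my $old = $by_id{$id})) {
        delete $by_email{ thaw($old)->{email} };   # drop the old index entry
    }
    $by_id{$id}                = freeze($rec);
    $by_email{ $rec->{email} } = $id;
}

# Lookup by secondary key: two fetches, one per file.
sub fetch_by_email {
    my ($email) = @_;
    my $id = $by_email{$email};
    return defined $id ? thaw($by_id{$id}) : undef;
}

# Substring search: no index helps, so every record is thawed and tested.
sub search_name {
    my ($needle) = @_;
    my @hits;
    while (my ($id, $frozen) = each %by_id) {
        push @hits, $id if index(thaw($frozen)->{name}, $needle) >= 0;
    }
    my @sorted = sort @hits;
    return @sorted;
}

store_user(1, { name => 'Larry Wall', email => 'larry@example.org'  });
store_user(2, { name => 'John Smith', email => 'john@example.org'   });
store_user(2, { name => 'John Smith', email => 'jsmith@example.org' }); # email changed

print fetch_by_email('jsmith@example.org')->{name}, "\n";
print join(',', search_name('Smith')), "\n";
```

Note that nothing here protects the two files from each other: if the process dies between the two assignments in store_user, the index and the data disagree, which is exactly the consistency burden described above.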

database/directory
Use an RDBMS (Oracle, MySQL, ...). Choose a lightweight one and it will have little impact on performance compared with the programming comfort you gain. Or, even better, use a directory service such as LDAP. There is an excellent open-source LDAP server: OpenLDAP. It will do all the dirty index/schema/caching work for you. There are at least two modules on CPAN for LDAP access (Net::LDAP and perldap), and both work fine for me.

Replies are listed 'Best First'.
Re: Re: Basic Database?
by eduardo (Curate) on Jun 11, 2001 at 19:24 UTC
    Only one comment:
    And if you want to do a substring search, there is no easy method except for linear search. And that could be terribly slow even for small database.
    In DB_File under the section Matching Partial Keys we see:

    Matching Partial Keys

    The BTREE interface has a feature which allows partial keys to be matched. This functionality is only available when the seq method is used along with the R_CURSOR flag.
    $x->seq($key, $value, R_CURSOR) ;
    Here is the relevant quote from the dbopen man page where it defines the use of the R_CURSOR flag with seq:
    Note, for the DB_BTREE access method, the returned key is not necessarily an exact match for the specified key. The returned key is the smallest key greater than or equal to the specified key, permitting partial key matches and range searches.
    In the example script below, the match sub uses this feature to find and print the first matching key/value pair given a partial key.
        use strict ;
        use DB_File ;
        use Fcntl ;

        use vars qw($filename $x %h $st $key $value) ;

        sub match
        {
            my $key = shift ;
            my $value = 0;
            my $orig_key = $key ;
            $x->seq($key, $value, R_CURSOR) ;
            print "$orig_key\t-> $key\t-> $value\n" ;
        }

        $filename = "tree" ;
        unlink $filename ;

        $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0640, $DB_BTREE
            or die "Cannot open $filename: $!\n";

        # Add some key/value pairs to the file
        $h{'mouse'} = 'mickey' ;
        $h{'Wall'}  = 'Larry' ;
        $h{'Walls'} = 'Brick' ;
        $h{'Smith'} = 'John' ;

        $key = $value = 0 ;
        print "IN ORDER\n" ;
        for ($st = $x->seq($key, $value, R_FIRST) ;
             $st == 0 ;
             $st = $x->seq($key, $value, R_NEXT) )
          { print "$key -> $value\n" }

        print "\nPARTIAL MATCH\n" ;

        match "Wa" ;
        match "A" ;
        match "a" ;

        undef $x ;
        untie %h ;
    Here is the output:
               IN ORDER
               Smith -> John
               Wall  -> Larry
               Walls -> Brick
               mouse -> mickey
    
               PARTIAL MATCH
               Wa -> Wall  -> Larry
               A  -> Smith -> John
               a  -> mouse -> mickey
    
    So, although I don't know how DBM implements this internally, it does seem to give you a little more functionality (or at least a more elegant interface) than just a linear search. Comments?
      The returned key is the smallest key greater than or equal to the specified key

      That will allow you to search for foo*, but not for *foo or *foo*. A substring search is *foo*-like, but you are right about the 'added functionality'. In some cases even a foo* search can be sufficient.
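For the foo* case, the R_CURSOR positioning can be combined with R_NEXT to collect every key sharing a prefix, not just the first one. The following is a rough sketch, assuming DB_File is installed and the default BTREE key ordering is in effect; keys_with_prefix is a hypothetical helper name, and the data is the same as in the DB_File example, held in an in-memory tree (undef as the filename).

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;

# undef instead of a filename gives an in-memory BTREE.
my %h;
my $db = tie %h, 'DB_File', undef, O_RDWR|O_CREAT, 0640, $DB_BTREE
    or die "Cannot create BTREE: $!\n";

$h{'mouse'} = 'mickey';
$h{'Wall'}  = 'Larry';
$h{'Walls'} = 'Brick';
$h{'Smith'} = 'John';

# Collect all keys that start with $prefix (foo* style search).
sub keys_with_prefix {
    my ($db, $prefix) = @_;
    my ($key, $value) = ($prefix, '');
    my @found;
    # R_CURSOR lands on the smallest key >= $prefix; R_NEXT walks on in key order.
    for (my $st = $db->seq($key, $value, R_CURSOR);
         $st == 0;
         $st = $db->seq($key, $value, R_NEXT)) {
        last unless index($key, $prefix) == 0;   # walked past the prefix range
        push @found, $key;
    }
    return @found;
}

my @wa = keys_with_prefix($db, 'Wa');
my @a  = keys_with_prefix($db, 'A');   # cursor lands on "Smith", which is rejected

print "Wa* : @wa\n";
print "A*  : @a\n";
```

The prefix check inside the loop is what turns the "smallest key greater than or equal to" behaviour into a real foo* match. A *foo or *foo* search still has to visit every key.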
