
Re^2: Need DBM file that holds data up to 50,000 bytes

by pvaldes (Chaplain)
on Aug 12, 2014 at 11:01 UTC ( [id://1097098] )

in reply to Re: Need DBM file that holds data up to 50,000 bytes
in thread Need DBM file that holds data up to 50,000 bytes

I was asking myself exactly the same question: 'why not one of the big ones, like MySQL or Postgres or Firebird or...?'

Alternatively, you could consider a NoSQL database (e.g. MongoDB) or even plain Perl for this. Since you said that speed is secondary, and those use plain files for storage, a 'terabyte level' file size (probably far more than what you need) is supported on most systems.

See the MongoDB tutorial on CPAN.
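For the record, a plain DB_File tied hash already handles values of the size the OP asked about; a minimal sketch (the file name and key are made up for illustration):

```perl
use strict;
use warnings;
use Fcntl;      # for O_RDWR, O_CREAT
use DB_File;    # Berkeley DB tied-hash interface

# Tie a hash to a Berkeley DB file (file name is hypothetical)
tie my %db, "DB_File", "big_values.db", O_RDWR | O_CREAT, 0666, $DB_HASH
    or die "Cannot tie big_values.db: $!";

# Berkeley DB has no small per-value limit, so 50,000 bytes is fine
$db{record_1} = "x" x 50_000;
print length($db{record_1}), "\n";   # 50000

untie %db;
```

Unlike the older xDBM_File modules, which cap key+value size around 1 kB, this stores the 50,000-byte value without any tuning.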

Re^3: Need DBM file that holds data up to 50,000 bytes
by Tux (Canon) on Aug 12, 2014 at 11:28 UTC

    Though maybe interesting, Postgres' hstore feature is a language unto itself and does not integrate easily with how other access methods work. There is Pg::hstore, but the API is IMHO not very obvious. It is certainly not an easy replacement for DB_File.

    In my perception *all* databases suck. Not all suck in the same way, but there is no perfect database (yet). You will need to investigate your needs before making a choice. Oracle has NULL problems (and is costly), MySQL does not follow ANSI in its default configuration and uses stupid quoting, Postgres returns too much by default on big tables, Unify does not support varchar, CSV is too slow, SQLite does not support multiple concurrent sessions, Firebird has no decent DBD (yet), DB2 is bound to IBM, Ingres does not have many users in the Perl community, etc. etc.

    Too many factors to think about. For a single-user, easy DB_File replacement, BerkeleyDB comes first, then Tie::Hash::DBD in combination with DBD::SQLite. I say so because neither needs any special environment or configuration. Once you choose a major DB (whichever you choose), you will need additional knowledge or services. My choice then would be Postgres, as it is the easiest to work with and confronts me with the least irritation.
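A sketch of that Tie::Hash::DBD plus DBD::SQLite suggestion (the database file name is made up); the module takes a DBI DSN and hands back an ordinary tied hash, with no server or configuration needed:

```perl
use strict;
use warnings;
use Tie::Hash::DBD;   # ties a hash to any DBI backend

# Tie a hash to an SQLite file via DBD::SQLite (file name is hypothetical)
tie my %h, "Tie::Hash::DBD", "dbi:SQLite:dbname=cache.db";

# Values well past the 1 kB limit of the old DBM formats are no problem
$h{key} = "y" x 50_000;
print length($h{key}), "\n";   # 50000

untie %h;
```

The same tie line works against other DBD backends by swapping the DSN, which is what makes it a near drop-in DB_File replacement.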

    Nobody mentioned other alternatives yet: CDB_File, and the xDBM_File modules (GDBM_File, NDBM_File, ODBM_File, SDBM_File; don't use).

    Enjoy, Have FUN! H.Merijn
      Postgres will return too much by default on big tables

      What do you mean with "return too much"? That sounds serious.

      And I think you don't give enough credit to the freedom of the software. Oracle has a great database, but it's ridiculously expensive to run even a single instance, and MySQL and BerkeleyDB are pawns in Oracle's hands. In my opinion that is a *very* good reason not to use them (I kicked them out when Oracle took them over).

      With regard to cdb: its main annoyance is that it is for databases that do not change (this is by design: it is, after all, named cdb, "constant database"). Perhaps it fits the OP's requirements, but it is often a pain (kicked that out, too ;-))
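To illustrate that constraint: a cdb file is built in one shot and is then read-only forever. A sketch with CDB_File (file names are made up), assuming its documented create-then-tie interface:

```perl
use strict;
use warnings;
use CDB_File;   # bindings for D. J. Bernstein's constant database

# Build the whole database at once; a cdb file cannot be updated later,
# only rebuilt from scratch (the temp-file name is conventional)
my %data = (alpha => "a" x 50_000, beta => "b");
CDB_File::create(%data, "records.cdb", "records.cdb.$$")
    or die "create failed: $!";

# Reading it back is a plain tied hash
tie my %cdb, "CDB_File", "records.cdb" or die "tie failed: $!";
print length($cdb{alpha}), "\n";   # 50000
untie %cdb;
```

If the OP's data is write-once, read-many, this is fast and simple; if keys ever need updating, the full-rebuild requirement is exactly the pain described above.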

      SQLite is nice but pretty simple (and was, and is, inspired by PostgreSQL; its author, Richard Hipp, told us so in his talk at PGCon).

      Yeah, I agree: Too many factors to think about :)

      And IMHO PostgreSQL does not suck. :)

        I agree with most of your comments.

        I value freedom of software more than you can imagine. I supply patches for fixes or new features to many of the open-source projects I use in daily life. Today I enhanced Claws-Mail; pending further discussion, two of their main committers liked the patch and even want to make the new behavior the default (while still allowing the old behavior).

        Maybe that is also what I like so much about our beloved Perl community. CPAN motivates users to give valuable feedback, whether by patches, bug reports, or comments.

        I find myself using MySQL with more and more reluctance over the years. The problem is that none of my annoyances go away with new releases.

        I have to use Oracle and Unify because of our customers, but I keep wondering how Oracle gets away with its bad NULL/empty-string handling in varchar. By now, "backward compatibility" is no longer a valid excuse.

        I have close to no gripes with Postgres. What I remember is that select * from foo;, where foo has millions of records, will build the complete result set before it returns the first record. That can take a long time. Now that my Postgres servers are fast and have lots of RAM, it is no real concern to me, but I hate having to use LIMIT in a loop to get smaller sets. Things may have changed since then, and I am quite happy with Postgres 9.3.5.
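For what it is worth, a server-side cursor avoids both the full up-front result set and the LIMIT loop; a sketch with DBD::Pg (the connection details, cursor name, and table are made up):

```perl
use strict;
use warnings;
use DBI;

# Connection parameters are hypothetical; cursors require a transaction,
# hence AutoCommit => 0
my $dbh = DBI->connect("dbi:Pg:dbname=test", "", "",
                       { AutoCommit => 0, RaiseError => 1 });

# DECLARE a server-side cursor: rows are materialized per FETCH,
# not all at once as with a plain SELECT
$dbh->do("DECLARE big_cur CURSOR FOR SELECT * FROM foo");

while (1) {
    my $rows = $dbh->selectall_arrayref("FETCH 1000 FROM big_cur");
    last unless @$rows;
    # ... process up to 1000 rows at a time ...
}

$dbh->do("CLOSE big_cur");
$dbh->commit;
$dbh->disconnect;
```

The first batch comes back as soon as the first 1000 rows are available, which addresses the "returns too much by default" complaint without rewriting the query with LIMIT/OFFSET.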

        Unify is great for BLOB handling. It is completely transparent and has no limits, unlike that sucky 4k limit in Oracle.

        Enjoy, Have FUN! H.Merijn

      Care to elaborate on the "don't use" for the xDBM_File modules? What sort of issues?

        Mainly that they are old and largely unmaintained, and lack features that DB_File and BerkeleyDB ship with.

        Another issue is that you might not really be using the version you think you are, as many Linux systems nowadays ship with compatibility libraries (if the originals are available at all). GDBM_File might be what you expect it to be, but the libraries behind ndbm are likely not what you think they are; odbm and sdbm are getting rare.
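One quick way to see which DBM implementation a given system will actually pick is AnyDBM_File, which probes the candidates in preference order at load time; a small sketch:

```perl
use strict;
use warnings;
use AnyDBM_File;   # probes NDBM_File, DB_File, GDBM_File, SDBM_File, ODBM_File

# After loading, @AnyDBM_File::ISA holds the implementation that was
# actually found, so this prints what your ties would really use
print "Using: $AnyDBM_File::ISA[0]\n";
```

On a system shipping only compatibility shims, the module name printed here may not match the library you assumed was behind it, which is exactly the trap described above.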

        Enjoy, Have FUN! H.Merijn