Some clarifications on those points.
- If your final version won't need an SQL database, then
a dbm database is fine for the concept stage.
- What separates the need for a relational database from
a dbm is your data model. If you are starting to get into
relationships and correlations between data (e.g. taking
sales figures and getting reports of sales by customer,
by product etc.) then you clearly want a relational
database. If you want a simple lookup, then a dbm is
just fine.
- Berkeley DB is indeed an industrial strength database.
It is particularly well suited to situations which need
very high performance for simple tasks. (It is also great
for embedded use, but I digress.) The bottlenecks that
you will hit first have to do with the CGI model (a fresh
process for every request), not with the database.
- Yes. GDBM may well have BTrees. The wins of BTrees
here are that they keep data in order (hashes do not) and
get better locality of reference (a very organized access
pattern). If your data fits in memory then hashes are
generally faster. If not, then BTrees are.
- Yes. In high performance read-write situations,
locking is important and how it is done is going to be
your bottleneck. Most web applications are write seldom,
read many times.
- Yes. Backup. And don't expect that binary data
formats will be portable from machine to machine.
- If you want a website to scale, definitely. It is
much easier to balance a load across 5 webservers than to
keep 5 databases in sync. However, if you are anticipating
this need, using a dbm solution will likely involve some
custom work. Relational databases typically have the data
access segregated into its own process so the database can
be moved to another machine. dbms traditionally do not.
- I think you are dramatically overestimating the needed
resources.
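
To make the data-model point concrete, here is a minimal sketch in Python of the two cases: a plain key lookup, where a dbm is enough, and a sales-by-customer / sales-by-product report, where you want a relational database. Everything in it is an assumption for illustration: the SALES rows, the price values, and the helper names dbm_lookup and sales_report are made up, and sqlite3 is used only as a convenient stand-in for whatever SQL database you would really run.

```python
import dbm
import os
import sqlite3
import tempfile

# Hypothetical sales records: (customer, product, amount).
SALES = [
    ("alice", "widget", 30),
    ("alice", "gadget", 20),
    ("bob", "widget", 50),
]


def dbm_lookup(key):
    """Simple key -> value lookup, the case where a dbm is just fine."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "prices")
        # "c" creates the file if needed; values come back as bytes.
        with dbm.open(path, "c") as db:
            db["widget"] = "10"
            db["gadget"] = "5"
            return db[key].decode()


def sales_report(group_by):
    """Total sales grouped by one column, the case that wants SQL."""
    # Column names cannot be bound as parameters, so whitelist them.
    if group_by not in ("customer", "product"):
        raise ValueError("unsupported grouping: " + group_by)
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sales (customer TEXT, product TEXT, amount INTEGER)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", SALES)
        query = (
            f"SELECT {group_by}, SUM(amount) FROM sales "
            f"GROUP BY {group_by} ORDER BY {group_by}"
        )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(dbm_lookup("widget"))
    print(sales_report("customer"))
    print(sales_report("product"))
```

With a dbm you would have to walk every record and do the grouping yourself for each new report; with SQL the second report is just a different column name.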