http://qs321.pair.com?node_id=109898

pmas has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to decide how to implement my CGI script, which reads from a database into buffer-like structures. I am not ready to implement it in OO, DBIx::Recordset looks too complicated, and Tie:DB is, by its own account, rather slow... So I decided to go the KISS way: implement a simple layer around the DBI calls that covers 80% of the cases, and handle the rest in some custom way later, if and when needed.

Now, thinking about reading one record of data: I can fetch it as an array or as a hashref. Reading into a hash looks more promising (field values land in hash elements conveniently keyed by field name), but obviously that comes at a price. What is the price? I do not know how to find out.

I am aware that premature optimization is evil, but before developing rules and guidelines I would like to know the consequences of my decision. I can benchmark access time to compare hash and array, but what about memory usage? Right now it is not a big issue, but later my scripts should run a website with dozens of daily users under mod_perl.
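For the timing half of the comparison, a minimal sketch with the core Benchmark module might look like the following. The record and its field names (id, name, role) are made up for illustration and are not taken from a real schema; the numbers it prints will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical record with three fields, stored both ways.
my @row_array = (42, 'pmas', 'monk');
my %row_hash  = (id => 42, name => 'pmas', role => 'monk');

# Named indices keep array access readable; constants are inlined at compile time.
use constant { ID => 0, NAME => 1, ROLE => 2 };

# Each sub reads all three fields once.
sub read_array { return $row_array[ID] . $row_array[NAME] . $row_array[ROLE] }
sub read_hash  { return $row_hash{id} . $row_hash{name} . $row_hash{role} }

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    array => \&read_array,
    hash  => \&read_hash,
});
```

This only measures element access on an already-built record, not the cost of fetching from the database, so it answers just one part of the question.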

Before reading the source, I would like to tap the vast knowledge of our monks here and ask:

  1. What strategy should I use to assess the memory usage of different ways of implementing a feature? Is there something like a Benchmark::Memory package for it? Any links to articles or approaches?
  2. Your gut feeling: how much more memory does a hash use compared with the same data placed in a plain array? I read that some "buckets" exist, but I do not know the details. What should I read? What should I worry about?
  3. I will probably use many hashes. Is it better to use many hashes with a small number of items each, or is the per-hash overhead so big that a small number of hashes with many items is preferable? (I.e. JOIN multiple tables and read the result into one hash, or read every table into its own hash.) I am aware that reading everything in one DB query is better time-wise; I just want to know whether putting all JOIN-able records in one hash would be another incentive, namely saving memory.
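One possible way to put numbers on questions 2 and 3 is sketched below. It assumes the CPAN module Devel::Size is installed (it is not part of core Perl) and uses its total_size function; the field names and values are invented purely for the measurement.

```perl
use strict;
use warnings;
use Devel::Size qw(total_size);   # assumption: installed from CPAN

# Hypothetical column names for one record.
my @fields = map { "field$_" } 1 .. 10;

# The same ten values stored as an array and as a hash.
my @as_array = map { "value$_" } 1 .. 10;
my %as_hash;
@as_hash{@fields} = @as_array;

printf "array, 10 values:       %d bytes\n", total_size(\@as_array);
printf "hash, 10 values:        %d bytes\n", total_size(\%as_hash);

# 100 small hashes of 10 items versus one hash holding the same 1000 values.
my (@many_small, %one_large);
for my $r (1 .. 100) {
    my %rec = map { ("field$_" => "v$r-$_") } 1 .. 10;
    push @many_small, \%rec;
    # The combined hash needs the record number in the key to stay unique.
    $one_large{"$r:field$_"} = "v$r-$_" for 1 .. 10;
}

printf "100 hashes x 10 items:  %d bytes\n", total_size(\@many_small);
printf "1 hash x 1000 items:    %d bytes\n", total_size(\%one_large);
```

The two layouts are not byte-for-byte equivalent (the combined hash has longer keys, and the small hashes sit inside an extra array), so the output is a rough guide rather than an exact overhead figure.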

I know I should go the object way. I promise version 2 will be object-oriented. I am not ready yet... ;-)

pmas
To make errors is human. But to make a million errors per second, you need a computer.