PerlMonks  

Many times a database table fits nicely in a list of lists (a 2-dimensional array). Other times a hash works better. I worked on a project where a list of lists was chosen after extensive testing.

Questions to think about for hash versus array:

Is your data rectangular?
Regular database tables without NULLs fit nicely in a rectangular data structure. Irregular data fits better in hashes.
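
Here is a rough sketch of the two shapes. The column names and the data are made up for illustration, and the column() helper is my own, not something from a module.

```perl
use strict;
use warnings;

# Rectangular data: every row has the same columns, so a list of lists works.
# Column positions get names via constants (hypothetical columns).
use constant { ID => 0, NAME => 1, PRICE => 2 };

my @table = (
    [ 1, 'widget', 2.50 ],
    [ 2, 'gadget', 4.00 ],
    [ 3, 'gizmo',  1.25 ],
);

# Irregular data: rows carry different fields, so a hash per row fits better.
my %parts = (
    1 => { name => 'widget', price  => 2.50 },
    2 => { name => 'gadget', colour => 'red' },    # no price at all
);

# Pull one column out of a list of lists.
sub column {
    my ($rows, $col) = @_;
    return map { $_->[$col] } @$rows;
}

print join(', ', column(\@table, NAME)), "\n";
print exists $parts{2}{price} ? "priced\n" : "no price\n";
```

The list of lists only stores the values, while every hash row also has to store its own field names, which is part of why rectangular data tends to be cheaper as arrays.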

What sort of iteration will you need to do?
If you will know the key to find your data element, a hash is much better than searching over a list. If you need to do a comparison on each key anyway, an array might be easier to search. For example, if you are checking each key to see if it matches a regular expression, an array can be better. (Let me know if you want to know why :-).
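
A minimal sketch of the two access patterns, with invented data and helper names of my own:

```perl
use strict;
use warnings;

# Exact key known: one hash lookup, no scan.
sub find_by_key {
    my ($index, $key) = @_;
    return $index->{$key};
}

# Key only known by pattern: every key has to be tested anyway,
# so a plain array of [key, value] rows is scanned with grep.
sub find_by_pattern {
    my ($rows, $regex) = @_;
    return grep { $_->[0] =~ $regex } @$rows;
}

my @rows  = ( [ 'apple', 3 ], [ 'apricot', 7 ], [ 'banana', 5 ] );
my %index = map { ($_->[0] => $_->[1]) } @rows;

print find_by_key(\%index, 'banana'), "\n";    # 5
print join(', ', map { $_->[0] } find_by_pattern(\@rows, qr/^ap/)), "\n";    # apple, apricot
```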

Are the keys of the hash simple to implement?
If you have one database field that is the key, it is easy to use it as a hash key. If the database rows are keyed with multiple columns, the hash gets more complicated since you will need to combine columns to make the hash key.
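
One way to combine columns is to join them with a separator that cannot appear in the data. This sketch assumes "\0" is safe for that; make_key() and split_key() are names I made up.

```perl
use strict;
use warnings;

# Build one hash key from several key columns. The separator must be a
# character that cannot occur in the column values; "\0" is assumed here.
sub make_key  { return join "\0", @_ }

# Recover the original columns from a combined key.
sub split_key { return split /\0/, $_[0], -1 }

my %stock;
$stock{ make_key('east', 'widget') } = 40;
$stock{ make_key('west', 'widget') } = 15;

print $stock{ make_key('west', 'widget') }, "\n";    # 15
```

Perl also does this for you if you write $stock{'west', 'widget'}, which joins the subscripts with the value of $; behind the scenes. A nested hash such as $stock{west}{widget} is the other common choice, at the cost of one extra hash per first-level key.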

Do you need to keep reloading your data structures from the database, or are they static?
If they are static, you can use a few tricks that save memory. If you have a machine with shared libraries and copy-on-write virtual memory, you can get multiple mod_perl httpd processes to share the database data.
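
The idea behind the sharing is to load the data once in the parent process, before the children are forked. The sketch below shows that with plain fork() rather than a real mod_perl setup; load_table() is a stand-in for the database load and the data is invented.

```perl
use strict;
use warnings;

$| = 1;    # unbuffered output, so children do not re-flush the parent's buffer

# Stand-in for the real database load (hypothetical data).
sub load_table {
    return [ map { [ $_, "row $_" ] } 1 .. 1000 ];
}

# Fork $n children that only read from the already-loaded table.
# Returns the number of children that exited successfully.
sub run_readers {
    my ($table, $n) = @_;
    my @pids;
    for my $i (1 .. $n) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: only reads. Pages stay shared until something writes
            # to them (Perl's own bookkeeping can still dirty some pages).
            my $ok = $table->[$i][1] eq 'row ' . ($i + 1);
            exit($ok ? 0 : 1);
        }
        push @pids, $pid;
    }
    my $good = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $good++ if $? == 0;
    }
    return $good;
}

my $table = load_table();    # load once, before forking
print run_readers($table, 3), " readers finished\n";
```

How much memory actually stays shared depends on the operating system and on what the children do with the data, so this is something to measure rather than assume.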

You can presize arrays so that they have less memory overhead. For measuring memory consumption, the normal tools such as ps and top will work reasonably well.
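
Presizing looks roughly like this. The row count is an assumption (you might get it from a SELECT COUNT(*) first), and presized_array() is just a helper name for the sketch.

```perl
use strict;
use warnings;

# Return a reference to an array with $n slots already allocated.
sub presized_array {
    my ($n) = @_;
    my @a;
    $#a = $n - 1;    # assigning to $#array extends it in one step
    return \@a;
}

my $expected_rows = 10_000;    # assumed row count

my $rows = presized_array($expected_rows);

# Hashes can be presized too, by assigning to keys().
my %index;
keys(%index) = $expected_rows;

# Fill the slots by index; push would append after the empty slots.
$rows->[$_] = [ $_, "name $_" ] for 0 .. $expected_rows - 1;

print scalar(@$rows), "\n";    # 10000
```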

To see if high-level behavior such as copy-on-write is working properly, you need to stress test your server to see how much traffic load causes it to swap. You really need to use a development server for this type of testing.

Writing a stress-test program is fun! Create a program using LWP to simulate the behavior of a single user. Run a bunch of these programs at the same time to simulate the load caused by many users. You can get typical user behavior patterns by examining your server log files. This approach allows you to make impressive claims, such as "This system is scaled for two second page load times with 1000 simultaneous users."
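
A bare-bones version of such a program might look like the following. It assumes LWP::UserAgent is installed; the script name, the URL in the comment, and the helper names are placeholders of mine, and a real test would replay request sequences taken from your logs rather than a fixed list.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Fetch each URL in order, as one simulated user would, and return the
# elapsed seconds per request. LWP::UserAgent is loaded only when needed.
sub simulate_user {
    my (@urls) = @_;
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new(timeout => 30);
    my @times;
    for my $url (@urls) {
        my $start = time();
        my $resp  = $ua->get($url);
        warn "$url: ", $resp->status_line, "\n" unless $resp->is_success;
        push @times, time() - $start;
    }
    return @times;
}

# Slowest and mean response time for a list of timings.
sub summarize {
    my @t = @_;
    return (0, 0) unless @t;
    my ($max, $sum) = (0, 0);
    for (@t) { $sum += $_; $max = $_ if $_ > $max }
    return ($max, $sum / @t);
}

# Run as: perl stress.pl 20 http://devserver.example/   (placeholder URL)
if (@ARGV >= 2) {
    my ($users, @urls) = @ARGV;
    for (1 .. $users) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        next if $pid;    # parent keeps forking
        my ($max, $mean) = summarize(simulate_user(@urls));
        printf "%d: max %.2fs, mean %.2fs\n", $$, $max, $mean;
        exit 0;
    }
    1 while wait() != -1;    # wait for every simulated user to finish
}
```

Forking one process per simulated user keeps the sketch short; for very large user counts you would probably want to spread the clients over several machines so the test box itself is not the bottleneck.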

It should work perfectly the first time! - toma


In reply to Re: Comparing memory requirements for hashes and arrays by toma
in thread Comparing memory requirements for hashes and arrays by pmas
