Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
For a simple lookup at a given index, an array cannot be beaten for pure speed. I'm assuming what you were trying to do in the original script was check to see if a certain value was contained in an array. I will use a bit of terminology from lhoward's excellent post here, namely the big-O notation.

Hashing is done by taking the key of the hash, performing a "hashing" function on this key, and obtaining a number from it. Then, this number is used as an index into an array. A good hashing function will produce an array that is fairly well populated. For instance, if your key is 'name', a function will be called with the argument of 'name'. Some pseudocode might look like

my $random = hashing_function('name'); $array_of_hash_values[$random] = $value;
The hashing_function is usually something that calculates a value based on the string, and performs a modulo (%) by the fixed size of the array. Then, for any subsequent lookups into the hash table, you run the key through the hashing_function again, and take whatever is at that index.
sub lookup_value { my $index = hashing_function(shift()); return $array_of_hash_values[$index]; }
If more than one key hashes to the same index, usually an array is built off of that array, which is why hashes are -ideally- O(1), but not necessarily.

Array lookups are always O(1), and faster than hash lookups since they don't have to perform any hashing function beforehand. My guess is that you were having to do some traversing of the array to look for a certain value, in which case, your lookup function would be O(n), where n is the size of the array. By switching to a hash table, you were able to increase the speed of this lookup function. Other places where a hash would help would be in a delete function, as you could just get rid of the value, instead of having to worry about shifting all values down if the deleted index is in the middle of the array.

In general, hashes are much more efficient than arrays for anything that needs to have a certain relationship between index and its value, but arrays are much faster than hashes if all you want is a list of values where the index is unimportant.

Hope this all makes sense.


In reply to Re: Why are hashes so much faster? by plaid
in thread Why are hashes so much faster? by Anonymous Monk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others scrutinizing the Monastery: (5)
As of 2024-04-19 12:35 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found