Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation
 
PerlMonks  

Re^4: mathematical proof

by tilly (Archbishop)
on Feb 03, 2009 at 17:27 UTC ( [id://741064]=note: print w/replies, xml ) Need Help??


in reply to Re^3: mathematical proof
in thread mathematical proof

Hashes and arrays can both be implemented as in memory data structures, or on disk data structures. Therefore it is perfectly reasonable to talk about what each looks like in memory and on disk.

In Perl the two options don't even look different for hashes, you just add an appropriate tie. The difference is slightly larger for arrays because you don't want to use the built-in sort on a large array that lives on disk.

As for my duplicates comment, I am not denying that a hash is better than an array whether or not there are duplicates. However the speed of accessing a hash is basically independent of how many duplicates there are in your incoming data structure. (There are subtle variations depending on, for instance, whether you are just before or after a hash split, but let's ignore that.) The speed of doing a merge sort that eliminates duplicates ASAP varies greatly depending on the mix of duplicates in your incoming data structure. Therefore the array solution improves relative to the hash as you increase the number of duplicates. This doesn't make the array solution

In fact in the extreme case where you have a fixed number of distinct lines in your data structure, the array solution improves from O(n log(n)) to O(n). The hash solution does not improve, it is O(n) regardless.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://741064]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others exploiting the Monastery: (3)
As of 2024-04-20 01:21 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found