Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Re: An informal introduction to O(N) notation

by pg (Canon)
on Jan 18, 2003 at 23:05 UTC ( [id://228083]=note: print w/replies, xml ) Need Help??


in reply to An informal introduction to O(N) notation

  1. As hash is so heavily used in Perl, it is worth to explain why hash search is O(n).

    One runs into the worst case of a hash search when all the elements calculate to the same hash key. In this case, a hash becomes no more than a one-dimentional chain of elements.

    Even worse, the element one trying to look up happens to be the last element in the sequence. Now the searching engine has to go throguh the entire chain, all n elements, to find what one wants.

  2. (In rest of the post, I will just use small o instead of big O, as now I am more focused on complexity, doesn't matter whether it is for worst case, best case or whatever. Big O is just a notion saying that it is for the worst case.)

    In an average case, if n is the number of elements we have in a hash, and k is the number of all possible hash key values. Ideally all elements would spread nearly even among possible hash keys, so the chain length under each hash key is n/k. Averagely you need to go though half of the chain to get what you want, thus you need to go thru n/2k elements.

    So averagely a hash search would close to o(n/2k) (be very careful, n is a variable when k is a constant, this is serious), which ~ o(n).

    How come the average case is the same as the worst case, NO they are not the same, but they are ~.

  3. Some time it is seriously dangerous to casually simplify things that is seriously complex.

    o(n) is not o(n/2k), but o(n) ~ o(n/2k) (again, n is variable, and k is constant, this is very serious), the easiest way to explain ~, yet does not lose seriousness too much is that: the speed o(n) and o(n/2k) approach infinite is the same.

    Although o(n) ~ o(n/10^100), what takes o(n/10^100) algorithm 1 sec, might take a o(n) alogorithm 10^100 sec to finish. They are far from the same.
  • Comment on Re: An informal introduction to O(N) notation

Replies are listed 'Best First'.
Re: Re: An informal introduction to O(N) notation
by Jenda (Abbot) on Jan 19, 2003 at 17:54 UTC

    Most often k is not a constant, but a function of n. If for example k would be ~ c * n then the average length of the lists of items in each bucket would be ~ n / (c * n) which is ~ 1 / c. Which is a constant.

    Therefore the key lookup of the usual implementations of hashes has average complexity of o(1).

    Of course this implementation forces you to recreate the hash when the number of keys grows over some limit which can be time consuming.

    Jenda

    P.S.: I just looked at the number of buckets in the Perl hashes (I iteratively added elements, watched the number of buckets and printed the number of elements and buckets after every rehashing). This is the results:

    1 => 8 9 => 16 18 => 32 34 => 64 70 => 128 130 => 256 258 => 512 514 => 1024 1024 => 2048 2052 => 4096 4098 => 8192 8193 => 16384 16386 => 32768 32768 => 65536 65538 => 131072 Perl 5.8.0 built for MSWin32-x86-multi-thread
Re: Re: An informal introduction to O(N) notation
by theorbtwo (Prior) on Jan 18, 2003 at 23:49 UTC

    Thank you. This explains everything I was attempting to, but does it lucidly. Moreover, it goes in to a lot of things I didn't know about.

    Only one thing I have to add: You can find the number of items in a hash with scalar keys %hash, and the number of used/total buckets with ($used_buckets, $total_buckets) = split '/', scalar %hash. With those numbers in hand, you should be able to get more exact timings, or demonstrate to yourself that what pg says is true.


    Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://228083]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (2)
As of 2024-04-25 19:22 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found