Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Re^2: writing perl function like sql IN

by davido (Cardinal)
on Aug 31, 2004 at 06:00 UTC ( [id://387110]=note: print w/replies, xml ) Need Help??


in reply to Re: writing perl function like sql IN
in thread writing perl function like sql IN

grep may be a common solution, but it's not necessarily the most efficient. It probably doesn't matter for small sets or infrequent tests. But if efficiency does matter, one must keep in mind that the grep solution always has an O(n) outcome, even if the item you're looking for is the first element found. grep always searches the entire list.

Any linear search solution where the loop terminates as soon as the item is found will have a worst case of O(n), but an average case of O(n/2) (which isn't really true big-oh notation)...assuming randomly ordered list.

A binary search solution will achieve O(log N), again, assuming that the binary search quits as soon as the item is found. That's an order of magnitude faster than the linear search solution.

A hashing algorithm, well written, such as Perl's %hash can achieve O(1), and is not sensitive to order nor should it be affected by dataset size (see the O'Reilly Wolf book, Mastering Algorithms with Perl, pp. 158-159, 168). So for many purposes a good old fashioned Perl hash might be one of the best solutions. ...assuming the dataset is in memory.

For those unfamiliar with big-oh (O(x)) notation, it's often used as an analysis of order of growth of work an algorithm must do as a dataset grows. "O(1)" means no growth. "O(n)" means a one-to-one growth ratio. "O(log n)" is an order of magnitude slower growth than "O(n)", for example. The reason that I said O(n/2) for average case was not really valid big-oh notation is because big-oh isn't concerned with average cases. Big-O notation represents an "order of growth won't be worse than..." relationship. Just a quick refresher. ;)


Dave

Replies are listed 'Best First'.
Re^3: writing perl function like sql IN
by Aristotle (Chancellor) on Aug 31, 2004 at 09:00 UTC

    You forget to mention the other side of the trade-off: a hash trades memory for speed. grep will run in O(N), but also consume a lot less memory. It also requires a certain amount of work to set up the hash; if you're only going to test a particular set few times, it may well end up not amortizing that up front effort.

    TANSTAAFL.

    Makeshifts last the longest.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://387110]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having a coffee break in the Monastery: (5)
As of 2024-04-23 20:27 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found