Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Re^5: Heap sorting in perl

by adrianh (Chancellor)
on Apr 05, 2003 at 16:29 UTC ( [id://248313]=note: print w/replies, xml ) Need Help??


in reply to Re: Re^3: Heap sorting in perl
in thread Heap sorting in perl

I don't follow your reasoning.

We were talking at cross purposes. I thought that you were proposing to heap sort the entire dataset, not the subset of smallest elements.

I admit that my using sort to order the smallest subset is a hack, but unless N is large or the comparison function is expensive, it will be faster than a pure-perl heap.

As for overkill, using Perl to solve a problem that can be solved with the correct option to the Unix sort utility is overkill. But if you need a general purpose solution in Perl and you have a heap implementation there already, why not use it?

I don't see how the Unix sort applies. We don't want to sort the whole dataset - we want the smallest N elements (and we also don't know if the dataset consists of line oriented data.)

Replies are listed 'Best First'.
Re: Re^5: Heap sorting in perl
by Anonymous Monk on Apr 05, 2003 at 17:02 UTC
    I was proposing to solve the problem originally stated. I thought that my description of how to solve it using a heap was fairly clear on that point by telling you when to start removing elements while still adding them.

    I agree that until N is large (causing, eg, memory problems) or the comparison function is expensive, using sort will be faster. The heap solution outlined is helpful only for large N. For an expensive comparison function, it is helpful once you start pulling off elements to keep track of the largest that you have in your set. Anything which is larger than or equal to that one should not be added. This reduces the number of comparisons significantly.

    As for the sort utility, I had thought that it had an option to only return the largest N elements. But looking at it more carefully I don't see it. It must have been somewhere else that I saw that. My bad. :-(

      I was proposing to solve the problem originally stated. I thought that my description of how to solve it using a heap was fairly clear on that point by telling you when to start removing elements while still adding them.

      I know. Just misinterpreted it. Sorry. One of my many moments of stupidity ;-)

      I agree that until N is large (causing, eg, memory problems) or the comparison function is expensive, using sort will be faster. The heap solution outlined is helpful only for large N.

      A good XS heap implementation would still win over sort with small N. It's just a C vs Perl issue speedwise.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://248313]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having an uproarious good time at the Monastery: (4)
As of 2024-04-26 00:21 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found