Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

My best attempt at sorting. Kindly suggest improvements/modifications.

by pritesh (Scribe)
on Jan 29, 2018 at 21:50 UTC ( [id://1208073]=perlquestion: print w/replies, xml ) Need Help??

pritesh has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I've been trying to get back to Perl and was working on a sorting method.

The goal was, to sort a list of numbers. These numbers will be in decimal/hex/octal/binary format, negative and positive, and duplicates. Duplicates should not be removed.

After browsing the PerlMonks forum, I came across some interesting ideas, and came up with the following script:

Note:- kcott helpfully directed me to the inbuilt sort function, but what I was trying to do was create my own sorting function.

use strict; use warnings; my @array = (-10,30,5,0b1011,7,9,8, -01111,-10,1,3,9,-30,5,-100,0,-1,0 +xFF,-3,30,0b1011,3,2,-300,-0xFF,15); sub sortie { my @sorted; my @sortit = @_; my @biglist; my @uniq; my $biggest_so_far = 0; my $smallest_so_far = 0; foreach my $num(@sortit) { if ($num > $biggest_so_far) { $biggest_so_far = $num; } if ($num < $smallest_so_far) { $smallest_so_far = $num; } } @biglist = ($smallest_so_far..$biggest_so_far); foreach my $bignum ( @biglist) { foreach my $sortnum (@sortit) { if ($bignum == $sortnum) { push @sorted, $bignum; } } } print "\@array with ", scalar @array, " elements = @array\n +"; print "\@sorted with ", scalar @sorted, " elements = @sorted\n" +; } sortie(@array);

And here is the output:

PS C:\perl_practice> perl .\best_attempt_sort.pl @array with 26 elements = -10 30 5 11 7 9 8 -585 -10 1 3 9 -30 5 - +100 0 -1 255 -3 30 11 3 2 -300 -255 15 @sorted with 26 elements = -585 -300 -255 -100 -30 -10 -10 -3 -1 0 +1 2 3 3 5 5 7 8 9 9 11 11 15 30 30 255

Kindly suggest improvements/modifications.

Replies are listed 'Best First'.
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by kcott (Archbishop) on Jan 29, 2018 at 22:06 UTC

    G'day pritesh,

    Perl doesn't care how you present the numbers (decimal/octal/etc.). To sort numbers use the '<=>' operator (sometimes called the spaceship operator). The documentation for sort has a huge number of examples.

    All you really needed here was:

    my @unsorted = ( ... ); my @sorted = sort { $a <=> $b } @unsorted;

    Here's a quick command line test to show that.

    $ perl -E 'say "@{[ sort { $a <=> $b } 30,5,7,9,8, -01111,-10,1,3,9,-3 +0,5,-100,0,-1,0xFF,-3,30,3,2,-300,-0xFF,15 ]}"' -585 -300 -255 -100 -30 -10 -3 -1 0 1 2 3 3 5 5 7 8 9 9 15 30 30 255

    — Ken

Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by karlgoethebier (Abbot) on Jan 30, 2018 at 09:46 UTC
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by Laurent_R (Canon) on Jan 30, 2018 at 00:54 UTC
    This is inefficient for a number of reasons.

    To start with, suppose you want to sort only 3 numbers, all of them fairly large, say all positive, with 9 digits, and pretty close together. For example: 999,999,998, 999,999,997 and 999,999,999. None of these numbers will be smaller than 0, so that $smallest_so_far will be 0, although 0 is not one of your numbers. This means that your big list will contain all numbers between 0 and 999,999,999, so almost a billion numbers, although if you had used a correct method to find the smallest number so far, it would contain only three numbers. It depends of course on your hardware, but it is likely that Perl might not be able to create such a large array. And, even if it can, it will take ages to iterate over that huge array, and this is of course not necessary with just three numbers so close together. The same argument goes the other way around for $biggest_so_far if all numbers are hugely negative. The correction is that, at the very least, you should initialize $biggest_so_far and $smallest_so_far with a number belonging to the list.

    Correcting this error will still not save your day. Your algorithm will be very inefficient if you try to sort just two numbers that are very far apart, say 999,999,999 and -999,999,999. You would build an even larger list (almost two billion candidates), although you really need to sort just two numbers. For such a dataset, even bogosort (see https://en.wikipedia.org/wiki/Bogosort) is extremely likely to be more efficient.

    I could go on with other inefficiencies for quite a while.

    Computer science has spent a huge number of man-years studying sorting algorithms. It is extremely unlikely that you'll come up with something better on your first try (of course, if you do, you might end up with a Turing Award, but, again, this is extremely unlikely). So I would suggest that you take the time to study the known sorting algorithms (see, for example, https://en.wikipedia.org/wiki/Sorting_algorithm for a start, you might want to look at Robert Sedgewick's writings if you want to go deeper into the matter).

    Just for fun, this is a simple Perl implementation of quicksort, an algorithm invented by Tony Hoare in 1959:

    sub qs { return @_ if @_ < 2; my $p = pop; qs(grep $_ < $p, @_), $p, qs(grep $_ >= $p, @_); }
    For data which is already almost sorted, this implementation will be somewhat inefficient (merge sort is better in that case); for random data, it is one of the best known algorithms. A couple of small changes will make it efficient in all cases.

      And just for fun, here's a simple implementatin of a mergesort:

      sub mergesort { @_ < 2 and return @_; my @low = mergesort( @_[0 .. $#_ / 2] ); my @high = mergesort( @_[$#_ / 2 + 1 .. $#_] ); map !@high || @low && $low[0] <= $high[0] ? shift @low : shift @high +, 1..@_; }

      Also, while looking at that simple quicksort, I realized it could be made stable by adding a third grep:

      sub sqs # stable quicksort { @_ < 2 and return @_; my $p = $_[@_ / 2]; # pivot sqs( grep $_ < $p, @_ ), grep( $_ == $p, @_ ), sqs( grep $_ > $p, @_ + ); }

      Or, if three passes over the input is just too much for you, it can be done with only one pass:

      sub spsqs # single pass stable quicksort { @_ < 2 and return @_; my ($pivot, @items) = ($_[@_ / 2], [], [], []); push $items[$_ <=> $pivot]->@*, $_ for @_; spsqs($items[-1]->@*), $items[0]->@*, spsqs($items[1]->@*) }

      ( wondering if that's the first ever use of <=> in a subscript ? )

      Elegantly explained Sir...Thank you very much.

Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by AnomalousMonk (Archbishop) on Jan 30, 2018 at 03:20 UTC

    Update: Please note that this reply does not consider the OPed algorithm, but instead addresses a general approach to playing with algorithms and algorithm implementations.

    Any time you start playing with different algorithms that do the same thing or different implementations of the same algorithm (or, in fact, any new algorithm whatsoever), it's extremely useful to set up a testing framework of some kind. In fact, you might as well get used to setting up the testing framework before you do almost anything else because you'll almost certainly end up doing so sooner or later anyway. (All this assumes you're serious and professional about the work you're doing.) Such a framework allows testing of degenerate, corner/edge, and common cases (update: and exception cases!) easily and quickly. It may be so easy to add new cases that you end up with a huge set of them.

    As usual, Perl supports testing with Test::More, Test::Exception and others of that family. I don't say that what follows is ideal, but perhaps food for thought...

    (Of course, tests for correctness don't address questions of efficiency.)


    Give a man a fish:  <%-{-{-{-<

      Whoa...that's gonna take a while for me to sink in. Thanks for the guidance.

Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by BillKSmith (Monsignor) on Jan 29, 2018 at 23:42 UTC
    Writing your own sort can be a good exercise in learning Perl. Developing your own algorithm is surprisingly difficult and error-prone and adds little or nothing to the objective. There are several 'classic' algorithms. Each has its own advantages. Bubble-sort is almost certainly the easiest to understand and implement. This makes it a good candidate for your first sort routine. Find a specification (A flow chart, a detailed description, or perhaps an implementation in another language). When you understand the algorithm, it is time to start writing your Perl implementation. After you have finished, you may wish to create a second version making use of knowledge you have gained. At this point, you are ready to repeat the whole process with an algorithm better suited to the task you have in mind. Ask questions as they come up. Try to keep them very specific.
    Bill

      Hi,

      Thanks a lot for your suggestions.

Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by afoken (Chancellor) on Jan 30, 2018 at 23:18 UTC

    As long as you only want to sort a reasonable number of small positive integers, try sleep sort. It features a very nice behaviour regarding CPU load, but unfortunately, it is a little bit slow. Nevertheless, it does sort small positive integers quite well.

    Some implementations also allow sorting positive floating point numbers.

    By adding a sufficiently large number to the numbers to be sorted, the algorithm can easily be extended to sort both positive and negative numbers.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by pritesh (Scribe) on Jan 29, 2018 at 22:38 UTC

    Hi Anonymous Monk,

    Please note. I am not stating that this is the best one there ever was. Just stating that this is the best one I could come up with.

    As for the script not working when the biggest number is a negative, here is what I tried

    my @array = (-10,30,5,0b1011,7,9,8, -01111,-10,1,-0xFFF,-19923,3,9,-30 +,5,-100,0,-1,0xFF,-3,30,0b1011,3,2,-300,-0xFF,15);

    And here is what I get

    @array with 28 elements = -10 30 5 11 7 9 8 -585 -10 1 -4095 -1992 +3 3 9 -30 5 -100 0 -1 255 -3 30 11 3 2 -300 -255 15 @sorted with 28 elements = -19923 -4095 -585 -300 -255 -100 -30 -10 + -10 -3 -1 0 1 2 3 3 5 5 7 8 9 9 11 11 15 30 30 255

    Even when there are no negative numbers, the script seems to work fine

    my @array = (5,3,4,5,5,1,432,4,3,2)

    Output:

    @array with 10 elements = 5 3 4 5 5 1 432 4 3 2 @sorted with 10 elements = 1 2 3 3 4 4 5 5 5 432

    Kindly let me know which numbers did you try in the list when you noticed the issue

    As for CPAN having a lot of pre tested and production ready sort functions, I wholehearted agree with that, and I also know that Perl has it's own sort function, which serves the purpose pretty well. I was just doing this to pacify my own curiosity.

Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by tybalt89 (Monsignor) on Feb 02, 2018 at 22:51 UTC

    Instead if guessing what a starting value for the biggest and smallest values are,

    my $biggest_so_far = 0; my $smallest_so_far = 0;

    Use the first value in the array for both :

    my $biggest_so_far = my $smallest_so_far = $sortit[0];

    This way, you get the exact range.

Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by pritesh (Scribe) on Jan 29, 2018 at 22:30 UTC

    Hi kcott,

    Thanks for the update. I was trying to sort the list without using any inbuilt sorting.

      "Thanks ..."

      You're welcome.

      For future reference, please use the [reply] link associated with the post to which you're replying. I see a couple of instances, including this one to me, where you've actually replied to your original post. The problem with doing this is that it breaks the flow of the thread and, the person to whom you're replying, won't receive a notification of your response (you, in turn, may not receive further feedback which you might have requested).

      — Ken

Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by Anonymous Monk on Jan 29, 2018 at 22:32 UTC
    I say in all kindness that your algorithm is an example of a naiive, remarkably inefficient, and actually-broken sorting algorithm which would not work at all if the biggest number in the list was negative and which does not properly identify the smallest one unless it is less than zero. If you are interested in classical sorting algorithms – Shell sort, merge sort, and so on – working examples are readily available in source-code form either in Perl or in other languages that can be adapted to Perl. But, as you've already learned, Perl contains a very robust built-in sort verb. Beyond that, the CPAN library of contributed code contains battle-tested packages that are well suited to any production situation. You never have to write your own sort.

        +1 for Heavy Metal

        Three thousand years of beautiful tradition, from Moses to Sandy Koufax, you're god damn right I'm living in the fucking past

      sundialsvc4 a.k.a Mark, is that you? Did you even test the code?

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1208073]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others about the Monastery: (2)
As of 2024-04-26 04:09 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found