pritesh has asked for the wisdom of the Perl Monks concerning the following question:
Hi Monks,
I've been trying to get back to Perl and was working on a sorting method.
The goal was, to sort a list of numbers. These numbers will be in decimal/hex/octal/binary format, negative and positive, and duplicates. Duplicates should not be removed.
After browsing the PerlMonks forum, I came across some interesting ideas, and came up with the following script:
Note:- kcott helpfully directed me to the inbuilt sort function, but what I was trying to do was create my own sorting function.
use strict;
use warnings;
my @array = (-10,30,5,0b1011,7,9,8, -01111,-10,1,3,9,-30,5,-100,0,-1,0
+xFF,-3,30,0b1011,3,2,-300,-0xFF,15);
sub sortie {
my @sorted;
my @sortit = @_;
my @biglist;
my @uniq;
my $biggest_so_far = 0;
my $smallest_so_far = 0;
foreach my $num(@sortit) {
if ($num > $biggest_so_far) {
$biggest_so_far = $num;
}
if ($num < $smallest_so_far) {
$smallest_so_far = $num;
}
}
@biglist = ($smallest_so_far..$biggest_so_far);
foreach my $bignum ( @biglist) {
foreach my $sortnum (@sortit) {
if ($bignum == $sortnum) {
push @sorted, $bignum;
}
}
}
print "\@array with ", scalar @array, " elements = @array\n
+";
print "\@sorted with ", scalar @sorted, " elements = @sorted\n"
+;
}
sortie(@array);
And here is the output:
PS C:\perl_practice> perl .\best_attempt_sort.pl
@array with 26 elements = -10 30 5 11 7 9 8 -585 -10 1 3 9 -30 5 -
+100 0 -1 255 -3 30 11 3 2 -300 -255 15
@sorted with 26 elements = -585 -300 -255 -100 -30 -10 -10 -3 -1 0
+1 2 3 3 5 5 7 8 9 9 11 11 15 30 30 255
Kindly suggest improvements/modifications.
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by kcott (Archbishop) on Jan 29, 2018 at 22:06 UTC
|
G'day pritesh,
Perl doesn't care how you present the numbers (decimal/octal/etc.).
To sort numbers use the '<=>' operator (sometimes called the spaceship operator).
The documentation for sort has a huge number of examples.
All you really needed here was:
my @unsorted = ( ... );
my @sorted = sort { $a <=> $b } @unsorted;
Here's a quick command line test to show that.
$ perl -E 'say "@{[ sort { $a <=> $b } 30,5,7,9,8, -01111,-10,1,3,9,-3
+0,5,-100,0,-1,0xFF,-3,30,3,2,-300,-0xFF,15 ]}"'
-585 -300 -255 -100 -30 -10 -3 -1 0 1 2 3 3 5 5 7 8 9 9 15 30 30 255
| [reply] [d/l] [select] |
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by karlgoethebier (Abbot) on Jan 30, 2018 at 09:46 UTC
|
| [reply] [d/l] |
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by Laurent_R (Canon) on Jan 30, 2018 at 00:54 UTC
|
This is inefficient for a number of reasons.
To start with, suppose you want to sort only 3 numbers, all of them fairly large, say all positive, with 9 digits, and pretty close together. For example: 999,999,998, 999,999,997 and 999,999,999. None of these numbers will be smaller than 0, so that $smallest_so_far will be 0, although 0 is not one of your numbers. This means that your big list will contain all numbers between 0 and 999,999,999, so almost a billion numbers, although if you had used a correct method to find the smallest number so far, it would contain only three numbers. It depends of course on your hardware, but it is likely that Perl might not be able to create such a large array. And, even if it can, it will take ages to iterate over that huge array, and this is of course not necessary with just three numbers so close together. The same argument goes the other way around for $biggest_so_far if all numbers are hugely negative. The correction is that, at the very least, you should initialize $biggest_so_far and $smallest_so_far with a number belonging to the list.
Correcting this error will still not save your day. Your algorithm will be very inefficient if you try to sort just two numbers that are very far apart, say 999,999,999 and -999,999,999. You would build an even larger list (almost two billion candidates), although you really need to sort just two numbers. For such a dataset, even bogosort (see https://en.wikipedia.org/wiki/Bogosort) is extremely likely to be more efficient.
I could go on with other inefficiencies for quite a while.
Computer science has spent a huge number of man-years studying sorting algorithms. It is extremely unlikely that you'll come up with something better on your first try (of course, if you do, you might end up with a Turing Award, but, again, this is extremely unlikely). So I would suggest that you take the time to study the known sorting algorithms (see, for example, https://en.wikipedia.org/wiki/Sorting_algorithm for a start, you might want to look at Robert Sedgewick's writings if you want to go deeper into the matter).
Just for fun, this is a simple Perl implementation of quicksort, an algorithm invented by Tony Hoare in 1959:
sub qs {
return @_ if @_ < 2;
my $p = pop;
qs(grep $_ < $p, @_), $p, qs(grep $_ >= $p, @_);
}
For data which is already almost sorted, this implementation will be somewhat inefficient (merge sort is better in that case); for random data, it is one of the best known algorithms. A couple of small changes will make it efficient in all cases.
| [reply] [d/l] [select] |
|
sub mergesort
{
@_ < 2 and return @_;
my @low = mergesort( @_[0 .. $#_ / 2] );
my @high = mergesort( @_[$#_ / 2 + 1 .. $#_] );
map !@high || @low && $low[0] <= $high[0] ? shift @low : shift @high
+, 1..@_;
}
Also, while looking at that simple quicksort, I realized it could be made stable
by adding a third grep:
sub sqs # stable quicksort
{
@_ < 2 and return @_;
my $p = $_[@_ / 2]; # pivot
sqs( grep $_ < $p, @_ ), grep( $_ == $p, @_ ), sqs( grep $_ > $p, @_
+ );
}
Or, if three passes over the input is just too much for you,
it can be done with only one pass:
sub spsqs # single pass stable quicksort
{
@_ < 2 and return @_;
my ($pivot, @items) = ($_[@_ / 2], [], [], []);
push $items[$_ <=> $pivot]->@*, $_ for @_;
spsqs($items[-1]->@*), $items[0]->@*, spsqs($items[1]->@*)
}
( wondering if that's the first ever use of <=> in a subscript ? )
| [reply] [d/l] [select] |
|
| [reply] [d/l] |
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by AnomalousMonk (Archbishop) on Jan 30, 2018 at 03:20 UTC
|
Update: Please note that this reply does not consider the OPed algorithm, but instead addresses a general approach to playing with algorithms and algorithm implementations.
Any time you start playing with different algorithms that do the same thing or different implementations of the same algorithm (or, in fact, any new algorithm whatsoever), it's extremely useful to set up a testing framework of some kind. In fact, you might as well get used to setting up the testing framework before you do almost anything else because you'll almost certainly end up doing so sooner or later anyway. (All this assumes you're serious and professional about the work you're doing.) Such a framework allows testing of degenerate, corner/edge, and common cases (update: and exception cases!) easily and quickly. It may be so easy to add new cases that you end up with a huge set of them.
As usual, Perl supports testing with Test::More, Test::Exception and others of that family. I don't say that what follows is ideal, but perhaps food for thought...
(Of course, tests for correctness don't address questions of efficiency.)
Give a man a fish: <%-{-{-{-<
| [reply] [d/l] [select] |
|
| [reply] [d/l] |
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by BillKSmith (Monsignor) on Jan 29, 2018 at 23:42 UTC
|
Writing your own sort can be a good exercise in learning Perl. Developing your own algorithm is surprisingly difficult and error-prone and adds little or nothing to the objective. There are several 'classic' algorithms. Each has its own advantages. Bubble-sort is almost certainly the easiest to understand and implement. This makes it a good candidate for your first sort routine. Find a specification (A flow chart, a detailed description, or perhaps an implementation in another language). When you understand the algorithm, it is time to start writing your Perl implementation. After you have finished, you may wish to create a second version making use of knowledge you have gained. At this point, you are ready to repeat the whole process with an algorithm better suited to the task you have in mind. Ask questions as they come up. Try to keep them very specific.
| [reply] |
|
| [reply] [d/l] |
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by afoken (Chancellor) on Jan 30, 2018 at 23:18 UTC
|
As long as you only want to sort a reasonable number of small positive integers, try sleep sort. It features a very nice behaviour regarding CPU load, but unfortunately, it is a little bit slow. Nevertheless, it does sort small positive integers quite well.
Some implementations also allow sorting positive floating point numbers.
By adding a sufficiently large number to the numbers to be sorted, the algorithm can easily be extended to sort both positive and negative numbers.
Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
| [reply] |
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by pritesh (Scribe) on Jan 29, 2018 at 22:38 UTC
|
Hi Anonymous Monk,
Please note. I am not stating that this is the best one there ever was. Just stating that this is the best one I could come up with.
As for the script not working when the biggest number is a negative, here is what I tried
my @array = (-10,30,5,0b1011,7,9,8, -01111,-10,1,-0xFFF,-19923,3,9,-30
+,5,-100,0,-1,0xFF,-3,30,0b1011,3,2,-300,-0xFF,15);
And here is what I get
@array with 28 elements = -10 30 5 11 7 9 8 -585 -10 1 -4095 -1992
+3 3 9 -30 5 -100 0 -1 255 -3 30 11 3 2 -300 -255 15
@sorted with 28 elements = -19923 -4095 -585 -300 -255 -100 -30 -10
+ -10 -3 -1 0 1 2 3 3 5 5 7 8 9 9 11 11 15 30 30 255
Even when there are no negative numbers, the script seems to work fine
my @array = (5,3,4,5,5,1,432,4,3,2)
Output:
@array with 10 elements = 5 3 4 5 5 1 432 4 3 2
@sorted with 10 elements = 1 2 3 3 4 4 5 5 5 432
Kindly let me know which numbers did you try in the list when you noticed the issue
As for CPAN having a lot of pre tested and production ready sort functions, I wholehearted agree with that, and I also know that Perl has it's own sort function, which serves the purpose pretty well. I was just doing this to pacify my own curiosity.
| [reply] [d/l] [select] |
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by tybalt89 (Monsignor) on Feb 02, 2018 at 22:51 UTC
|
Instead if guessing what a starting value for the biggest and smallest values are,
my $biggest_so_far = 0;
my $smallest_so_far = 0;
Use the first value in the array for both :
my $biggest_so_far = my $smallest_so_far = $sortit[0];
This way, you get the exact range.
| [reply] [d/l] [select] |
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by pritesh (Scribe) on Jan 29, 2018 at 22:30 UTC
|
| [reply] [d/l] |
|
"Thanks ..."
You're welcome.
For future reference, please use the [reply] link associated with the post to which you're replying.
I see a couple of instances, including this one to me, where you've actually replied to your original post.
The problem with doing this is that it breaks the flow of the thread and, the person to whom you're replying,
won't receive a notification of your response
(you, in turn, may not receive further feedback which you might have requested).
| [reply] |
Re: My best attempt at sorting. Kindly suggest improvements/modifications.
by Anonymous Monk on Jan 29, 2018 at 22:32 UTC
|
I say in all kindness that your algorithm is an example of a naiive, remarkably inefficient, and actually-broken sorting algorithm which would not work at all if the biggest number in the list was negative and which does not properly identify the smallest one unless it is less than zero. If you are interested in classical sorting algorithms – Shell sort, merge sort, and so on – working examples are readily available in source-code form either in Perl or in other languages that can be adapted to Perl. But, as you've already learned, Perl contains a very robust built-in sort verb. Beyond that, the CPAN library of contributed code contains battle-tested packages that are well suited to any production situation. You never have to write your own sort. | [reply] |
|
In Re: My best attempt at sorting. Kindly suggest improvements/modifications. we seem to see that all your assertions are inaccurate. Could you perhaps—as already naïvely requested by pritesh, an absolute beginner not worthy of supplication to, say, a THIRTY YEAR VETERAN OF THE PSYC^W CODE WARS—supply test code that demonstrates the problems you describe? Otherwise the lack of code, lack of testing, and the battle tested idioms already noticed seem to point to the necessity of downvoting you. Downvoting you to the max.
See also–
| [reply] |
|
+1 for Heavy Metal
Three thousand years of beautiful tradition, from Moses to Sandy Koufax, you're god damn right I'm living in the fucking past
| [reply] |
|
sundialsvc4 a.k.a Mark, is that you? Did you even test the code?
| [reply] |
|
|