ketema has asked for the wisdom of the Perl Monks concerning the following question:
Is there a way to splice() a new value into the correct position of an array that has already been sorted by the sort function? Example:
@known = sort byIndex(@known);
splice(@known,(the correct Index),1,$newValue);
sub byIndex{
    $a->{'Attribute'} <=> $b->{'Attribute'};
}
print "Done\n";
Thanks, Ketema
P.S. The array holds references to hashes and is sorted on a numeric attribute of those hashes.
Re: Insert Into a Sorted Array
by Zaxo (Archbishop) on Sep 29, 2004 at 03:21 UTC
The mergesort version of sort in perl 5.8+ will do that as well as anything [1] by just sorting in the new value:
@known = sort byIndex @known;
@known = sort byIndex $newValue, @known;
The quicksort of previous versions of perl is bad for this. It has a quadratic worst case for nearly sorted lists.
[1] Actually not as good as a binary search for a single insertion point, which is O(log N). That is how you'd perform your own suggestion. As tachyon says, though, insertion with splice is still O(N): on average about N/2 elements have to be moved.
I would like this for ease of readability. I ran perldoc -f mergesort and didn't find anything. Where is this function documented?
Re: Insert Into a Sorted Array
by tachyon (Chancellor) on Sep 29, 2004 at 03:11 UTC
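my @ary = (1..4, 6..10);
print "@ary\n";
for my $val ( 0, 5, 11 ) {
    bubble( \@ary, $val );
    print "@ary\n";
}
# $_[0] is the array ref, $_[1] the new value. Put the value at the
# front, then swap it forward until the next element is no smaller.
sub bubble {
    unshift @{$_[0]}, $_[1];
    for my $i ( 0..@{$_[0]}-2 ) {
        last if $_[0]->[$i] <= $_[0]->[$i+1];
        ( $_[0]->[$i], $_[0]->[$i+1] ) = ( $_[0]->[$i+1], $_[0]->[$i] );
    }
}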
__DATA__
1 2 3 4 6 7 8 9 10
0 1 2 3 4 6 7 8 9 10
0 1 2 3 4 5 6 7 8 9 10
0 1 2 3 4 5 6 7 8 9 10 11
This works, but is it O(N)? Sort algorithms... man, I think I remember this. Can't I do a binary search? Would that find the correct placement faster, or am I getting my terms mixed up?
Thank you
Here's your bin search. Worst case time for the search (not counting the splicing): O(log N)
sub compare {
    $a cmp $b
}
sub binsearch {
    # Returns the index of the position before
    # the one where $s would be found, when $s
    # is not found.
    my ($f, $s, $list) = @_;
    my $i = 0;
    my $j = $#$list;
    my $k;
    my $c;
    $a = $s;
    for (;;) {
        $k = int(($j-$i)/2) + $i;
        $b = $list->[$k];
        $c = &$f();
        return $k if ($c == 0);
        if ($c < 0) {
            $j = $k-1;
            return $j if ($i > $j);
        } else {
            $i = $k+1;
            return $k if ($i > $j);
        }
    }
}
@a = qw( z b x c y a );
@b = qw( o m n );
@a = sort compare @a;
@b = sort compare @b;
splice(@a, binsearch(\&compare, $b[0], \@a)+1, 0, @b);
print(@a); # abcmnoxyz
Of course, in your case, it will be
splice(@known, binsearch(\&byIndex, $newValue, \@known)+1, 0, $newValue);
Update: Passing array to binsearch using a reference now. oops!
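For the array of hash references in the original question, the same idea can be written without assigning to the global $a and $b. What follows is only a sketch under assumptions: insert_sorted and its key-extractor argument are hypothetical names that do not appear elsewhere in this thread, and the list is assumed to be sorted ascending on a numeric key already.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): insert $item into @$list, which
# must already be sorted ascending by the numeric key that $key_of extracts.
sub insert_sorted {
    my ($list, $item, $key_of) = @_;
    my $key = $key_of->($item);
    my ($lo, $hi) = (0, scalar @$list);    # insertion point lies in [$lo, $hi]
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($key_of->($list->[$mid]) <= $key) {
            $lo = $mid + 1;                # equal keys: new item goes after them
        }
        else {
            $hi = $mid;
        }
    }
    splice @$list, $lo, 0, $item;          # length 0 inserts without replacing
    return $lo;                            # index where $item now sits
}

my @known = map { +{ Attribute => $_ } } 2, 3, 4, 5, 7, 8;
insert_sorted(\@known, { Attribute => $_ }, sub { $_[0]{Attribute} }) for 6, 1, 9;
print join(' ', map { $_->{Attribute} } @known), "\n";    # 1 2 3 4 5 6 7 8 9
```

The search itself takes O(log N) comparisons. The splice still has to shift everything after the insertion point, so the whole insert stays O(N).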
Yes, you can do a binary search, and yes, it will find the correct placement faster for large lists. In fact I have a module on CPAN called File::SortedSeek that implements binary searches in large files, so I appreciate the algorithm :-) However, in this scenario, having found the index you then have a practical problem: it is not possible to insert a value into the middle of an array without moving other elements, because the array is really just a contiguous sequence of SV*. What you have to do is make space for it. Splice does that by moving a large chunk (or the entire array). The bubble algorithm also does it and is (in past testing) about twice as fast as just calling sort. Here is a binary search implementation. Benchmarking the options is left to you.
my @ary = (1..4, 6..10);
for my $val ( 0, 5, 11, 0, 5, 11 ) {
    binary( \@ary, $val );
    print "@ary\n";
}
sub binary {
    my ( $ary, $val ) = @_;
    my ( $min, $max, $last, $i ) = ( 0, scalar @{$ary}, 0, 0 );
    while ( 1 ) {
        $i = int( ($min+$max)/2 );
        # print "i=$i\n";
        last if $last == $i;
        $last = $i;
        if ( $ary->[$i] < $val ) {
            $min = $i;
        }
        elsif ( $ary->[$i] > $val ) {
            $max = $i;
        }
        else {
            # values are equal so we have a valid index
            last;
        }
    }
    # insert after the element at $i, unless $val sorts before it or the
    # array is empty; this also covers a search that stops at index 0
    $i++ if $i < @$ary && $ary->[$i] <= $val;
    splice @$ary, $i, 0, $val;
}
__DATA__
0 1 2 3 4 6 7 8 9 10
0 1 2 3 4 5 6 7 8 9 10
0 1 2 3 4 5 6 7 8 9 10 11
0 0 1 2 3 4 5 6 7 8 9 10 11
0 0 1 2 3 4 5 5 6 7 8 9 10 11
0 0 1 2 3 4 5 5 6 7 8 9 10 11 11
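Since the benchmarking was left open, here is one way it could be set up with the core Benchmark module. Treat it as a sketch only: the sub names, the list size of 1000 and the iteration count are arbitrary assumptions, the three subs are simplified stand-ins for the approaches in this thread rather than the exact code posted, and no timings are claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Each sub takes a sorted list (array ref) and a value, and returns a new
# sorted list that includes the value. The input is copied every time so
# that repeated runs all start from the same data.

sub by_resort {                       # sort the new value in
    my ($src, $val) = @_;
    return [ sort { $a <=> $b } $val, @$src ];
}

sub by_bubble {                       # put it in front, swap forward into place
    my ($src, $val) = @_;
    my @ary = ($val, @$src);
    for my $i (0 .. $#ary - 1) {
        last if $ary[$i] <= $ary[$i + 1];
        @ary[$i, $i + 1] = @ary[$i + 1, $i];
    }
    return \@ary;
}

sub by_binary {                       # binary search for the slot, then splice
    my ($src, $val) = @_;
    my @ary = @$src;
    my ($lo, $hi) = (0, scalar @ary);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($ary[$mid] <= $val) { $lo = $mid + 1 } else { $hi = $mid }
    }
    splice @ary, $lo, 0, $val;
    return \@ary;
}

my @sorted = map { $_ * 2 } 1 .. 1000;    # even numbers, already sorted
my $val    = 1001;                        # belongs roughly in the middle

cmpthese(2000, {
    resort => sub { by_resort(\@sorted, $val) },
    bubble => sub { by_bubble(\@sorted, $val) },
    binary => sub { by_binary(\@sorted, $val) },
});
```

Every timing includes the cost of copying the list, which narrows the gap between the approaches. To measure in-place insertion instead, the data would have to be rebuilt outside the timed code. The position of $val matters as well: the bubble approach does the least work for values near the front and the most for values near the end.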
Re: Insert Into a Sorted Array
by gryphon (Abbot) on Sep 29, 2004 at 03:21 UTC
Greetings ketema,
I don't know how wise this is (it feels unwise to me), but here's my code:
my @known = (
    { Attribute => 5 },
    { Attribute => 4 },
    { Attribute => 8 },
    { Attribute => 3 },
    { Attribute => 7 },
    { Attribute => 2 }
);
sub byIndex { $a->{Attribute} <=> $b->{Attribute} }
@known = sort byIndex @known;
my $new_item = { Attribute => 6 };
my $inserted = 0;
for (my $x = 0; $x < @known; $x++) {
    if ($new_item->{Attribute} <= $known[$x]->{Attribute}) {
        splice(@known, $x, 0, $new_item); # length 0: insert, replace nothing
        $inserted = 1;
        last;
    }
}
push @known, $new_item unless $inserted; # larger than every existing item
print $_->{Attribute}, "\n" foreach (@known);
Is there any way you can add all the items to @known before your sort, then sort only once at the end? Seems to me that'd be better.
You and I must think alike, because this is almost exactly what I did by myself, but it is too slow. And no, I can't sort at the end for this application; we wind up sorting too much. :> I'm going to use the binary search provided above. Thanks for the input, though.
Re: Insert Into a Sorted Array
by Limbic~Region (Chancellor) on Sep 29, 2004 at 13:08 UTC
ketema,
I am surprised no one mentioned Tie::Array::Sorted. It is as simple as pushing a new element into the array. Additionally, your sort routine can be user-defined.