Sorting a subset

seaver has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Sorting a subset by tcf22 (Priest) on Dec 12, 2003 at 18:52 UTC
Try using grep, like so: `my @array = qw(Art bob joe andy willy Andrew john Archie); foreach(sort {substr($a,1) cmp substr($b,1)} grep /^A/, @array){ print "$_\n"; } __OUTPUT__ Andrew Archie Art` [download] Update:You could also do it using `substr` instead of a regex. I'm not sure which on is more efficient. `foreach(sort {substr($a,1) cmp substr($b,1)} grep {substr($_,0,1) eq ' +A'} @array)` [download] - Tom	[reply] [d/l] [select]
Re: Re: Sorting a subset by seaver (Pilgrim) on Dec 12, 2003 at 19:38 UTC
Hey, I like the grep substr combo, thanks for your help! Just one more thing, what if I wanted to return the resulting array: `return sort {substr($a,1) cmp substr($b,1)} grep {substr($_,0,1) eq 'A +'} @array;` [download] Can I do that? Cheers Sam	[reply] [d/l]
Re: Re: Re: Sorting a subset by BrowserUk (Patriarch) on Dec 12, 2003 at 19:55 UTC
It's a minor point, but as you know every string going into the sort starts with 'A', it is redundant, and much slower, to use substr within the sort block to exclude it. `return sort grep{ substr( $_ , 0, 1 ) eq 'A' } @array;` [download] Will produce the same result more quickly. Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail Hooray!	[reply] [d/l]
Re: Re: Re: Re: Sorting a subset by Not_a_Number (Prior) on Dec 12, 2003 at 20:04 UTC
Re: Re: Re: Re: Re: Sorting a subset by BrowserUk (Patriarch) on Dec 12, 2003 at 20:13 UTC
Re: Re: Re: Sorting a subset by tcf22 (Priest) on Dec 12, 2003 at 19:47 UTC
Yep. The below code outputs the same result. `my @array = &filter(qw(Art bob joe andy willy Andrew john Archie)); print "$_\n" for(@array); sub filter(){ return sort {substr($a,1) cmp substr($b,1)} grep {substr($_,0,1) e +q 'A'} @_; }` [download] - Tom	[reply] [d/l]
Re: Re: Sorting a subset by Roy Johnson (Monsignor) on Dec 12, 2003 at 19:48 UTC
~~The calls to `substr` in the sort function are misguided microoptimization.~~ Oops. Numerical sort of the substr. The PerlMonk `tr///` Advocate	[reply]
Re: Sorting a subset by dragonchild (Archbishop) on Dec 12, 2003 at 18:53 UTC
The first part, where you create @tempArray, is better written as: `my @tempArray = grep { /^A/ } @array;` [download] So, your entire part can be rewritten to: `foreach my $ref ( sort { substr($a, 1) <=> substr($b, 1) } grep { /^A/ } @array ) { ... do whatever ... }` [download] ------ We are the carpenters and bricklayers of the Information Age. Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.	[reply] [d/l] [select]
Re: Sorting a subset by flounder99 (Friar) on Dec 12, 2003 at 20:04 UTC
In your example the second `for` loop will `sort` based on the numerical value of array elements with the first character chopped off. If that is what you want then something like this will work: `for my $ref (sort {substr($a,1) <=> substr($b,1)} grep {substr($_,0,1) + eq 'A'} @array) { #whatever with $ref }` [download] If you want to sort asciibeticly just use `sort`. You don't need to do the `substr` since the first letter will be 'A' already. `for my $ref (sort grep {substr($_,0,1) eq 'A'} @array) { #whatever with $ref }` [download] A `substr` will be faster than doing a regex but if you want to make it short you could also write it like this. `for my $ref (sort grep /^A/, @array) { #whatever with $ref }` [download] -- flounder	[reply] [d/l] [select]
Re: Sorting a subset by hardburn (Abbot) on Dec 12, 2003 at 20:31 UTC
Your sort can be faster if you move the substr out of the sort routine and into a map, then peice it back together afterword. This is known as the "Schwartzian Transform": `my @data = map { $_->[1] } sort { $a->[0] <=> $b->[0] } map { [ substr($_,1), $_ ] } @tempArray;` [download] Or, since your data is a simple string, it shouldn't be too much of a problem to modify that string in the first map (counting in the order of execution) and thus eliminate the need for the middle sort block (which is called the ~~GMT~~ GRT (thanks Abigail-II)): `my @data = map { s/\A (.)(.) \z/$2$1/x } sort map { s/\A (.)(.) \z/$2$1/x } @tempArray;` [download] Using perl's internal sort should be faster than making it continually call a custom block. ---- I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident. -- Schemer `: () { :\|:& };:` Note: All code is untested, unless otherwise stated	[reply] [d/l] [select]
Re: Re: Sorting a subset by BrowserUk (Patriarch) on Dec 12, 2003 at 22:56 UTC
There is a problem with your implementation of the GRT. Your map blocks are returning the return code (0\|1) from the substitutions, rather than the modified values. As is, all you will get is an array of 1's in the resultant array. Besides that, as you know that the first character is 'A', there is little point in using the GRT to move that constant character to the end, do an alpha sort and then move it back to the front. A straight alpha sort on the original strings will produce the same results and you avoid the need for the two transformations. However, if the string following the first char is numeric (as implied by the OP, and your ST example), then moving the 'A' to the end prior to doing an alpha sort will not result in the correct ordering. `print map{ s/\A (.)(.) \z/$2$1/x; $_ } sort map{ s/\A (.)(.) \z/$2$1/x; $_ } (@temp = qw[ A1 A11 A111 A2 A5 A50 ]); A111 A11 A1 A2 A50 A5` [download] In order for the GRT to work, you need to ensure that the prefix applied to the strings will sort correctly when an alpha sort is applied to them. If the arbitrating values are numeric, then you need to prefix the string such that those numeric values are represented in such a way that they will sort numerically, even though an alpha sort is being used. That's where pack came into the original name. `perl> print map{ substr $_, 2 } sort map{ pack( 'n', $_ =~ m[(\d+)] ) . $_ } (@temp = qw[ A1 A11 A111 A2 A5 A50 ]); A1 A2 A5 A11 A50 A111` [download] Note: The choice of pack format implies that I know that the numeric values in the strings will fit in a 2-byte integer. If they won't, then moving to a 4-byte integer is called for. It is also important to realise that using 'v' instead of 'n' as the pack format would not have worked. Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail Hooray!	[reply] [d/l] [select]
Re: Sorting a subset by Abigail-II (Bishop) on Dec 12, 2003 at 21:32 UTC
which is called the GMT Greenwich Mean Time? ;-) It's a transformation invented by Uri Guttman and Larry Rossler. They called it something like 'Packed sort', but since it's not a different way of sorting (it's just doing pre- and postprocessing), I coined it the Guttman-Rossler Transform, and that name stuck. Abigail	[reply]
Re: Re: Sorting a subset by hardburn (Abbot) on Dec 12, 2003 at 21:42 UTC
Bah, one letter off ;)	[reply]
Re: Re: Sorting a subset by ihb (Deacon) on Dec 14, 2003 at 01:22 UTC
I'll keep this post short. Three important issues: `s///` returns boolean, not the modified string. An s-modifier is needed on the `s///` to make it match elements with newlines in them. The `s///` modifies `@tempArray`. Perhaps this is wanted, but probably not. `ihb`	[reply] [d/l] [select]
Re: Sorting a subset by !1 (Hermit) on Dec 12, 2003 at 21:20 UTC
Hmmm. `for (map "A$_", sort { $a <=> $b } map /^A\d+$/ ? substr($_,1) : (), @ +array) { ...` [download] Just another way. I am assuming you wanted numeric comparison. I also took the liberty of filtering out inputs not of the form A followed by numeric digits. Apologies for allowing neither negatives nor decimals. Although: `for (sort grep /^A/, @array) { ...` [download] Would work if you just want it for strings starting with A.	[reply] [d/l] [select]


Syntactic Confectionery Delight
	PerlMonks