Idiomatic Array Index Search?

footpad has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
(ar0n) Re: Idiomatic Array Index Search? by ar0n (Priest) on May 27, 2001 at 01:14 UTC
I'd probably write it as: `sub arrayindex { my $value = shift; for (0..$#_) { return $_ if $_[$_] eq $value; } return -1; }` [download] Update code changed to reflect MeowChow's comment. The wrong way to do it would be: `my $index; for $index (0..$#values) { last if $values[$index] eq $value; }` [download] [ ar0n ]	[reply] [d/l] [select]
Re: Re: Idiomatic Array Index Search? by MeowChow (Vicar) on May 27, 2001 at 04:13 UTC
What if there's no match? That would be indistinguishible from matching the last element. MeowChow s aamecha.s a..a\u$&owag.print	[reply]
Re: Idiomatic Array Index Search? by tachyon (Chancellor) on May 27, 2001 at 07:12 UTC
I think your code is fine. arOn's code is brief and certainly idiomatic. This is a little more self documenting and what I might use, depending on my mood as much as anything else! See below for the one liner. `sub getArrayIndex { my $value = shift; for my $index(0..$#_) { return $index if $value eq $_[$index]; } return -1; }` [download] The loop structure: `for ($i=0,$i<$x,$i++)` [download] is available and consistent with many other languagues but in Perl these idioms are shorter and sweeter: `# to just iterate over an array; each element assigned to $_ # changing $_ changes the original array element for (@array) { # assigning to $_ here will change original array # comparing to $_ of course will not } # to iterate over a range for (0..10) { # $_ will be set to 0 to 10 in sequence } # to iterate over a range with a named op for $i(0..10) { # $i will be set to 0 to 10 in sequence }` [download] With the iterate over a range options note that $#array (returns the last index) whereas @array (in scalar context) return a scalar corresponding to the length of the array so that: `for (0..$#array) { } for (0..@array) { }` [download] are almost equivalent. This originally read are equivalent which is absolutely wrong. As dvergin points out because @array returns the number of elements in @array the second example will actually access an undefined array element one past the real end of the array. OK so it still usually works but consider this code: `@array = qw(1); print "My array @array\n"; for (0..10) { for (0..@array) {$array[$_]++} } print "My array @array\n";` [download] If you run this you will discover a potentially nasty suprise. `My array 1 My array 12 11 10 9 8 7 6 5 4 3 2 1` [download] Auto-vivification means that in perl if you operate on a variable which does not exist then perl just creates it for you. As a result each time through the inner loop we create a new array element. As a result our initial one element array consists of twelve elements at the end. Nice neat memory leak there. Neither use strict nor warnings catch this either. In perl there are some outrageously idiomatic (read short and often difficult to understand) ways to do things, and while it is fun to play golf (make really short idiomatic code to do a task) it is not the best practice if you have to maintain the code in the future. There is a happy medium there somewhere but as with all things beauty is in the eye of the beholder. A "look how clever I am :p" really idiomatic solution `# really short idiomatic version sub getArrayIndex { map{return --$_ if $_[0] eq $_[$_]}(1..@_-1) and -1 } # test code @array = qw(0 2 4 6 8 10 12); print getArrayIndex(6,@array),"\n"; print getArrayIndex(5,@array),"\n";` [download] I think this is about as idiomatic as you can get. It performs the task, but is not perhaps as easy to understand! I will post an explanation if you want - send a chat message. tachyon nysus wanted an explanation of this GOLF from Idiomatic Array Index Search? sub getArrayIndex { map{return --$_ if $_[0] eq $_[$_]}(1..@_-1) and -1 } # First you need to know that a sub returns the last value it evals # Next you need to know that a function will generally return true # Next you need to know that the part after an and is only evaluated # if the first part evals true. # Next you need to understand that map{}@array is essentially the same as # for (@array) {} - it just uses two less chars (the parenths around @array) # In this case the function is passed two args - the value to search for # and the array to search for it in. These args appear in @_ so @_[0] # more correctly written as $_[0] is the search value. Everything # else in @_ is the search array. # So what we do is iterate from 1 to the end of the array. As you probably # know the last index of an @array is given by $#array. The number of # elements is @array (in scalar context) so the last index is @array-1 # By now you should see that 1..@_-1 is therfore a range from 1 to our # last index in @_. As a result our map iterates from 1..@_-1 assigning # the index value to $_ # At this point it is important to remember that element 0 of our search # array (in the real world) is actually at index 1 (in our sub) because # the first element in @_ is our search value, thus causing an offset. # So when we find a match the actual index will be ($_ - 1). # Now --$_ just happens to be $_ -1 so this is what we return when we find # that [0] eq $_[$_]. Note --$_ is required as we need # this to occur before the return so we return this. $_-- fails to work. # If we fail to return out of the map, eventually it finishes and evals to # true so the part on the right hand side of the and gets evaluated. -1 # evals to uhm ah -1 so the last value evaluated by the sub is -1 and # this is what it returns. # ultra GOLF sub getArrayIndex { map{return --$_ if[0]eq$_[$_]}1..@_-1and-1 } # different style, shaves a few chars sub getArrayIndex { $i--;map{$i=$_-1if[0]eq$_[$_]}1..$#_;$i } # shave a few more by being sneaky sub getArrayIndex { map{$i=$_ if[0]eq$_[$_]}1..$#_;--$i } # less GOLF, using for and an unnecessary $i insead of $_ sub getArrayIndex { for $i(1..$#_) { return ($i-1) if [0] eq $_[$i]; } -1 } # almost production code sub getArrayIndex { $find = shift; @array = @_; for $index(0..$#array) { return $index if $find eq $array[$index] } return -1 # we return -1 if search fails } So thats it. Idiomatic perl with comments!!!! Not as short as baby perl!	[reply] [d/l] [select]
Re: Re: Idiomatic Array Index Search? by dvergin (Monsignor) on May 27, 2001 at 07:52 UTC
Terrific write-up tachyon. Just one small (though crucial) quibble about the difference between '$#array' and 'scalar @array': `my @array = qw(a b c d e f); for (0..$#array) {print $_} # zero thru last index for (0..@array) {print $_} # zero thru size of array` [download] prints: '0123450123456'	[reply] [d/l]
Re: Re: Re: Idiomatic Array Index Search? by tachyon (Chancellor) on May 27, 2001 at 09:28 UTC
Ah... good point. Perls auto vivification has obviously let me silently access one element past the end of the real array in that example. I normally use $#array but recently had to use @_ as $#_ was causing a syntax error (it was wickedly twisted code). I had thus far failed to note the difference. Thanks very much. I have added a correction to previous post with appropriate credit of course. On more pearl of wisdom duly filed tachyon	[reply]
Re: Idiomatic Array Index Search? by jorg (Friar) on May 27, 2001 at 02:03 UTC
I think you just need a reassuring pad on the back here footpad You're looking for a direct case sensitive match and used eq, which is perfect. You want to preserve the order of the values. You could build a hash where the key is the arrayindex of the value. eg `my %data = ( 1 => 'test1', 2 => 'test2', 3 => 'test3);` I would see this as unnecessary complication. The usage of an array is perfectly adequate here. Grep is inefficient yes because it searches ALL the matches and thus always loops your entire list. Basically I like the code you've written, it's easy to understand, maintainable, fits it's purpose and didn't take you long to write. Now if only all code was as easy this.... Jorg "Do or do not, there is no try" -- Yoda	[reply] [d/l]
Re: Re: Idiomatic Array Index Search? by Vynce (Friar) on May 27, 2001 at 08:51 UTC
i agree that footpad has nothing to worry about here. however, i think the hash solution that makes sense isn't the one you argued against. there's a hash solution that makes sense if three things are true: you're finding indices in this list fairly often you're changing the array fairly rarely you have no repeats or a simple decision making process about what to do with them. what do i mean by "fairly" often and "fairly" frequently? i think probably if you are looking up 3 for each change you make to the array, and you can capture all array changes, then this is a good solution. as for #3, i'll assume you always want the 1st occurence. the idea is, basically, to construct a lookup hash where for any `$array[$i] == $value`, there's `$lookup{$value} == $i`. for instance, you might write it this way: `{ my %lookup; sub _init_lookup_hash { %lookup; for my $i (0..$#_) { $lookup{$_[$i]} = $i unless defined $lookup{$_[$i]}; } } sub list_index { return $lookup{$_[0]} or -1 } ; }` [download] .	[reply] [d/l] [select]
Re: Idiomatic Array Index Search? by danger (Priest) on May 27, 2001 at 03:12 UTC
As an alternative suggestion, you might consider using Tie::IxHash, which has an Indices() method you can use from the tied object: `#!/usr/bin/perl -w use strict; use Tie::IxHash; my $IxObj = tie my %hash, 'Tie::IxHash'; my @values = qw(zero one two three four five six); @hash{@values} = (1) x @values; my $index = $IxObj->Indices('two'); print "$index\n"; # prints: 2 my @indices = $IxObj->Indices(qw/three one five/); print "@indices\n"; # prints: 3 1 5 __END__` [download] As for maintaining order, it also has a Push() method for adding key/value pairs, and SortByKey(), SortByValue(), and Reorder() methods which may be sufficient for your odering purposes. I haven't bothered to bench anything, so you may want to do that if performance is an issue.	[reply] [d/l]
Re: Idiomatic Array Index Search? by MeowChow (Vicar) on May 27, 2001 at 02:01 UTC
`<pimp>` [download] You may want to check out my index and rindex for arrays. They save a constant factor over grep, considering that they don't bother to iterate over the rest of the array once a match is found; however, they may be slower due to the implementation of iteration in Perl rather than as a built-in... (maybe I'll update with a few benchmarks.) `</pimp>` However, if the arrays are sorted, as you've said, then I would do a fast binary search which will find your element in O(log n) time, rather than O(n). MeowChow s aamecha.s a..a\u$&owag.print	[reply] [d/l] [select]
Simple things should be coded simply (Re: Idiomatic Array Index Search?) by japhy (Canon) on May 27, 2001 at 09:04 UTC
Notes: A hash is useful for multiple look-ups, since you have to realize that the CREATION of the hash is going to require a look at the entire array. `grep` is going to go through your entire array, and "jumping" out of `grep` isn't a "nice" thing to do. That said, I would take an approach like: `# prototype here for "look and feel good"ness # if you remove it, be sure to pass array by # reference (this is done for speed reasons) # $index = get_index @array, $element; sub get_index (\@$) { my ($a, $e) = @_; my $i = 0; for (@$a) { return $i if $e eq $_; $i++; } return -1; }` [download] There's no reason to get more idiomatic, because it will cloud the meaning of your code. The most simple things should be coded simply. `japhy` -- Perl and Regex Hacker	[reply] [d/l]
Re: Simple things should be coded simply (Re: Idiomatic Array Index Search?) by merlyn (Sage) on May 28, 2001 at 20:33 UTC
`sub get_index (\@$) { my ($a, $e) = @_; my $i = 0; for (@$a) { return $i if $e eq $_; $i++; } return -1; }` [download] I don't like code like this because you've got two things in that loop that are being kept in sync only coincidentally: the iteration of the foreach, and the incrementing of the index. I'd let there be one thing, and use the indirection to ensure consistency: `sub get_index (\@$) { my ($a, $e) = @_; for (my $i = 0; $i <= $#$a; $i++) { return $i if $e eq $a->[$i]; } return -1; }` [download] This is clearer to me. Simple things should be coded simply. {grin} -- Randal L. Schwartz, Perl hacker	[reply] [d/l] [select]
Re: Simple things should be coded simply (Re: Idiomatic Array Index Search?) by mattr (Curate) on May 28, 2001 at 18:57 UTC
I agree with all the above (simple things simple etc.) but am wondering what happens if you have two matching elements. I don't remember anybody saying they were unique, and an array just looks like a set when you use qw() but it isn't one mathematically. So maybe you should return an array, sorted of course. Then may I propose the most idiomatic would be to use another language, SQL of course? I am thinking of DBD::RAM. You could say something like `print $dbh->selectcol_arrayref(qq [ SELECT phrase FROM my_phrases WHERE id = 1 ORDER BY phrase ])->[0];` [download] Some other neat looking modules may include Search::InvertedIndex, Search::Binary, and Math::IntervalSearch (Search where an element lies in a list of sorted elements). ..to answer japhy and merlyn below, honored to join this thread with you both. To be sure, grep is good. Thank you for your elegant code.	[reply] [d/l]
Re: Re: Simple things should be coded simply (Re: Idiomatic Array Index Search?) by japhy (Canon) on May 28, 2001 at 19:21 UTC
Then `grep()` would be a super choice: `sub get_index { my ($a, $e) = @_; my @idx = grep $a->[$_] eq $e, 0 .. $#$a; return wantarray ? @idx : $idx[0]; }` [download] `japhy` -- Perl and Regex Hacker	[reply] [d/l]
(dkubb) Re: (2) Idiomatic Array Index Search? by dkubb (Deacon) on May 27, 2001 at 12:35 UTC
I agree with japhy on this one, simple things should be kept simple. Here is a variation on his get_index routine that combines his routine with other idioms mentioned in this node: `my $index = get_index(\@array, $element); sub get_index { my ($a, $e) = @_; $e eq $a->[$_] and return $_ for 0..$#$a; return -1; }` [download]	[reply] [d/l]
Re: Idiomatic Array Index Search? by toma (Vicar) on May 28, 2001 at 22:42 UTC
I liked all these solutions so much that I couldn't resist benchmarking a few of them to help me choose my favorite. Thought I'd share the results in case any of you also like benchmarks. ar0n: 4 wallclock secs ( 5.30 usr + 0.00 sys = 5.30 CPU) @ 1826.79/s (n=9682) clever_tachyon: 4 wallclock secs ( 5.01 usr + 0.00 sys = 5.01 CPU) @ 605.19/s (n=3032) dkubb: 6 wallclock secs ( 5.22 usr + 0.00 sys = 5.22 CPU) @ 1821.65/s (n=9509) footpad: 5 wallclock secs ( 5.34 usr + 0.00 sys = 5.34 CPU) @ 385.96/s (n=2061) japhy: 6 wallclock secs ( 5.21 usr + 0.00 sys = 5.21 CPU) @ 2507.87/s (n=13066) merlyn: 6 wallclock secs ( 5.25 usr + 0.00 sys = 5.25 CPU) @ 1056.57/s (n=5547) tachyon: 6 wallclock secs ( 5.59 usr + 0.00 sys = 5.59 CPU) @ 1766.01/s (n=9872) I have edited this a few times since I didn't get my post perfect the first time use strict; use Benchmark 'timethese'; my @array; my $lookfor; my @array = qw ( Juan Montoya Buddy Lazier Eliseo Salazar Jeff Ward Ed +die Cheever Robby Gordon Jimmy Vasser Stephan Gregoire Scott Goodyear Scott Sharp +Mark Dismore Donnie Beechler Jaques Lazier Jeret Schroeder Billy Boat Raul Boesel J +ason Leffler Buzz Calkins Steve Knapp Davey Hamilton Robby McGehee Johnny Unser Sta +n Wattles Sam Hornish Jr Airton Dare Robbie Buhl Richie Hearn Andy Hillenburg Al + Unser Jr Jimmy Kite Sarah Fischer Lyn St. James Greg Ray ); $lookfor= "toma"; test_get_index($lookfor); $lookfor= $array[21]; test_get_index($lookfor); + sub test_get_index { my ($lookfor) = @_; print "Looking for $lookfor\n", "dkubb found ",get_index_dkubb(\@array, $lookfor),"\n", "merlyn found ",get_index_merlyn(\@array, $lookfor),"\n", "japhy found ",get_index_japhy(\@array, $lookfor),"\n", "clever tachyon found ",get_index_clever_tachyon($lookfor, @arra +y),"\n", "tachyon found ",get_index_tachyon($lookfor, @array),"\n", "footpad found ",get_index_footpad($lookfor, @array),"\n", "ar0n found ",get_index_ar0n($lookfor, @array),"\n"; } timethese(-5, { dkubb => sub {get_index_dkubb(\@array, $lookfor)}, merlyn => sub {get_index_merlyn(\@array, $lookfor)}, japhy => sub {get_index_japhy(\@array, $lookfor)}, clever_tachyon => sub {get_index_clever_tachyon($lookfor, @array)}, tachyon => sub {get_index_tachyon($lookfor, @array)}, footpad => sub {get_index_footpad($lookfor, @array)}, ar0n => sub {get_index_ar0n($lookfor, @array)}, }); sub get_index_dkubb { my ($a, $e) = @_; $e eq $a->[$_] and return $_ for 0..$#$a; return -1; } sub get_index_merlyn (\@$) { my ($a, $e) = @_; for (my $i = 0; $i <= $#$a; $i++) { return $i if $e eq $a->[$i]; } return -1; } sub get_index_japhy (\@$) { my ($a, $e) = @_; my $i = 0; for (@$a) { return $i if $e eq $_; $i++; } return -1; } sub get_index_tachyon { my $value = shift; for my $index(0..$#_) { return $index if $value eq $_[$index]; } return -1; } sub get_index_clever_tachyon { map{return --$_ if $_[0] eq $_[$_]}(1..@_-1) and -1 } sub get_index_footpad { my $value = shift; my @values = @_; my $result = -1; for ( my $index = 0; $index < @values; $index++ ) { if ( $value eq $values[ $index ] ) { $result = $index; last; } } return $result; } sub get_index_ar0n { my $value = shift; for (0..$#_) { return $_ if $_[$_] eq $value; } return -1; } [download] It should work perfectly the first time! - toma	[reply] [d/l]
Re: Idiomatic Array Index Search? by Zaxo (Archbishop) on May 28, 2001 at 12:32 UTC
Maybe the hash could be a perl hash: `%database_index; # Run me once my $idx=0; for (@database) { $database_index{$_} = $idx++; } # /Run me once # good candidate for a BEGIN block - caveat dynamic # This is pretty standard and not best for all situations` [download] Now $database_index{'query'} gives you what you want. This must be run again if the database changes, so it's awkward for dynamic situations. After Compline Zaxo	[reply] [d/l]


P is for Practical
	PerlMonks

Idiomatic Array Index Search?

A "look how clever I am :p" really idiomatic solution