Boolean array indexing

johnmillerflorida has asked for the wisdom of the Perl Monks concerning the following question:

Re: Boolean array indexing by BrowserUk (Patriarch) on Nov 22, 2013 at 00:22 UTC
If your arrays are small, go with one of the many grep/map/slice answers offered, but be aware that for large arrays they will create two or more huge intermediate lists that consume substantial amounts of memory, and thus time. The alternative that avoids those huge stack allocations requires only a single temporary scalar and runs very efficiently: `@a=(6,7,8); @b=(3,2,1); $i=0; $b[$_] > 1 and $c[ $i++ ] = $a[$_] for 0 .. $#a; print "@c";` which prints `6 7`. With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: Boolean array indexing by Anonymous Monk on Nov 22, 2013 at 07:11 UTC
Why not use `push` though? `$b[$_] > 1 and push(@c, $a[$_]) for 0 .. $#a;`
Re^3: Boolean array indexing by BrowserUk (Patriarch) on Nov 22, 2013 at 08:33 UTC
Good point. Does away with the need for a temp var. ++
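For reference, the loop-and-`push` idea from this sub-thread can be written as a strict-clean script. This is only a minimal sketch using the sample data from the thread; the variable names are carried over from the posts above. Perl iterates a range in a `foreach` loop as a counter rather than expanding it into a list, which is why this form avoids the large temporaries.

```perl
use strict;
use warnings;

my @a = (6, 7, 8);
my @b = (3, 2, 1);

# Walk the indices once and append matching elements directly;
# no intermediate list of indices is built.
my @c;
for my $i (0 .. $#a) {
    push @c, $a[$i] if $b[$i] > 1;
}

print "@c\n";    # prints "6 7"
```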
Re: Boolean array indexing by GrandFather (Saint) on Nov 21, 2013 at 22:29 UTC
A trivial change you might like is: `@a = map { $a[$_] } grep { $b[$_] > 1 } 0 .. $#b;` which doesn't reduce the complexity (it's essentially the same code) but does eliminate one manifest array. At the end of the day you pretty much can't reduce the complexity: you have to select elements based on your criteria, and you have to map those elements to your other array. If you are free to alter your data structures you could instead write: `my @data = ([6, 3], [7, 2], [8, 1]); my @result = grep { $_->[1] > 1 } @data;` which is much better if the data are paired. It may even be worth the effort to generate an array of paired data to gain the benefit of uniform handling in that way: `my @dataA = (6, 7, 8); my @dataB = (3, 2, 1); my @data = map { [ $dataA[$_], $dataB[$_] ] } 0 .. $#dataA;` True laziness is hard work
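Putting the pieces of the post above together, a rough sketch of the paired-data approach might look like this. Note that the `grep` returns the surviving pairs (array references), so one more `map` is needed to get back plain values; the names `@kept` and `@result` are illustrative only.

```perl
use strict;
use warnings;

my @dataA = (6, 7, 8);
my @dataB = (3, 2, 1);

# Pair the parallel arrays: each element becomes [value, criterion].
my @data = map { [ $dataA[$_], $dataB[$_] ] } 0 .. $#dataA;

# Keep the pairs whose second element exceeds 1 ...
my @kept = grep { $_->[1] > 1 } @data;

# ... then pull the first element out of each surviving pair.
my @result = map { $_->[0] } @kept;

print "@result\n";    # prints "6 7"
```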
Re: Boolean array indexing by linuxer (Curate) on Nov 21, 2013 at 22:16 UTC
First shot is to let map do the grep's work:

```perl
#! /usr/bin/env perl
use strict;
use warnings;

my @a = ( 6,7,8 );
my @b = ( 3,2,1 );

my @c = map { $b[$_] > 1 ? $a[$_] : () } 0 .. $#a;

print "@c\n";
```
Re^2: Boolean array indexing by johnmillerflorida (Initiate) on Nov 21, 2013 at 22:28 UTC
Thank you very much, linuxer, this is indeed one line :-) Still, is there no built-in function in perl which returns the indices of an array based on a boolean query? I am moving to perl from matlab (don't want to get into fights about which programming languages are the best (or whether matlab can be considered as one in the first place ;-)), but I think matlab has an edge here, since the equivalent matlab code would be: `a=a(find(b>1))`
Re^3: Boolean array indexing by hdb (Monsignor) on Nov 21, 2013 at 22:38 UTC
List::MoreUtils has a function `indexes` which does that. From its documentation: "indexes BLOCK LIST: Evaluates BLOCK for each element in LIST (assigned to $_) and returns a list of the indices of those elements for which BLOCK returned a true value. This is just like grep only that it returns indices instead of values: `@x = indexes { $_ % 2 == 0 } (1..10); # returns 1, 3, 5, 7, 9`" So you could write `@a[ indexes { $_ > 1 } @b ]`. Not quite as nice as in Matlab but close.
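If List::MoreUtils is not available, the same behaviour can be approximated in core Perl. The helper below is a hypothetical stand-in written for this note (the name `find_indices` is made up and is not part of any module); it is a sketch of the idea, not a replacement for the real `indexes`.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the indices of the elements for which
# the code block is true. The (&@) prototype allows block-first calls.
sub find_indices (&@) {
    my ($test, @list) = @_;
    my @found;
    for my $i (0 .. $#list) {
        local $_ = $list[$i];          # expose the element as $_ to the block
        push @found, $i if $test->();
    }
    return @found;
}

my @a = (6, 7, 8);
my @b = (3, 2, 1);

my @idx = find_indices { $_ > 1 } @b;  # indices where @b exceeds 1
my @c   = @a[@idx];                    # array slice, like a(find(b>1))

print "@idx\n";    # prints "0 1"
print "@c\n";      # prints "6 7"
```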
Re^3: Boolean array indexing by GrandFather (Saint) on Nov 21, 2013 at 22:35 UTC
matlab is a special purpose tool tuned for handling arrays. Perl is much more a general purpose scripting language. It is unsurprising that there are elegant ways of performing tasks within matlab's problem domain that Perl can't directly match. There are many things that Perl does nicely for which matlab has no equivalent. Ya pays ya money (a lot of money in the case of matlab if you're not in an educational institution) and takes ya choice.
Re^3: Boolean array indexing by hdb (Monsignor) on Nov 21, 2013 at 22:49 UTC
I had a number of projects in the past where I used Perl to create input files for Matlab, mainly matrices, and then kicked off Matlab to do the processing of those, and then back in Perl created nice reports to present the results (LaTeX files). Using multiple tools (if available) can be very efficient, using each for what it is best for.
Re: Boolean array indexing by Laurent_R (Canon) on Nov 21, 2013 at 22:51 UTC
In this specific case, I would probably go for a single `map` such as the solution proposed by linuxer. However, building on your own solution, I would like to point out that you don't need an intermediary array @ind if you just pipeline the `grep` and the `map`, as shown in the following Perl debugger session:

```
  DB<1> @a = ( 6,7,8 );
  DB<2> @b = ( 3,2,1 );
  DB<3> @c = map {$a[$_]} grep {$b[$_] > 1} 0..$#b;
  DB<4> p "@c";
6 7
```

Again, in this specific case, I would probably do everything in one map, as in the solution proposed by linuxer, but the possibility of pipelining various list operators as above can be a very powerful tool. This is an example of the same idea on the same problem, using two more list operators:

```
  DB<5> print join " ; ", map {$a[$_]} grep {$b[$_] > 1} 0..$#b;
6 ; 7
```

To understand this type of command pipeline, it has to be read from right to left (and from bottom to top if it is spread over several lines). Well-known examples of such constructs are the Schwartzian Transform and variations thereon, such as the Guttman-Rosler Transform, for efficiently sorting data.

Update: When I started to type this, there was only linuxer's answer. Then, while I was typing the above, my daughter came twice to ask me something, so that it took me 20 or 25 minutes to finish typing the above, and many good answers were delivered in between, so that the above is less useful. Alright, fair enough, you have to be fast if you want to be the first. ;-)
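For readers who have not met the Schwartzian Transform mentioned above, here is a minimal sketch of the pattern. The word list and the choice of string length as the sort key are assumptions made up for illustration; the point is only the bottom-to-top `map`/`sort`/`map` pipeline.

```perl
use strict;
use warnings;

my @words = qw(pear fig banana kiwi);

# Read from the bottom up:
my @sorted =
    map  { $_->[1] }                      # 3. unwrap the original word
    sort { $a->[0] <=> $b->[0]            # 2. sort on the cached key,
           or $a->[1] cmp $b->[1] }       #    breaking ties alphabetically
    map  { [ length($_), $_ ] }           # 1. pair each word with its key
    @words;

print "@sorted\n";    # prints "fig kiwi pear banana"
```

The key is computed once per element in step 1 instead of on every comparison, which is where the efficiency gain comes from when the key is expensive to compute.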
Re^2: Boolean array indexing by johnmillerflorida (Initiate) on Nov 21, 2013 at 23:19 UTC
Wow, I am very impressed by all your very helpful responses in such a short time! (this was my first post on perlmonks). Your multiple responses provide a great reference of how to best deal with this problem in different circumstances. This helps a lot. Thank you so much!
Re: Boolean array indexing by davido (Cardinal) on Nov 21, 2013 at 23:36 UTC
`use List::MoreUtils qw(indexes); my @a = ( 6, 7, 8 ); my @b = ( 3, 2, 1 ); @a = @a[ indexes { $_ > 1 } @b ];` Update: Others beat me to this one by hours. ;) Dave
Re: Boolean array indexing by hdb (Monsignor) on Nov 21, 2013 at 22:30 UTC
Instead of `map{ $a[$_] } @ind` you can write `@a[@ind]`, an array slice.
Re^2: Boolean array indexing by GrandFather (Saint) on Nov 21, 2013 at 22:42 UTC
Or even: `my @result = @a[grep {$b[$_] > 1} 0 .. $#b];`
Re: Boolean array indexing by roboticus (Chancellor) on Nov 22, 2013 at 14:33 UTC
johnmillerflorida: Sometimes altering your data structures can simplify things. Generally, maintaining parallel arrays is the wrong way to store and/or work with your data. If you have related data items, it's usually better to keep them together:

```perl
# you originally have two parallel arrays of related data
my @a = (6, 7, 8);
my @b = (3, 2, 1);

# So you need to use artificial means to put the data together, such as an index variable
for my $index (0 .. $#a) {
    my ($a,$b) = ($a[$index], $b[$index]);
    print $a, "\n" if $b>1;
}

# If instead you keep the related data items together in a two-dimensional array:
my @c = ( [6,3], [7,2], [8,1] );

# Then printing the related data is easy, and you needn't track the array index
for my $item (@c) {
    my ($a,$b) = @$item;
    print $a, "\n" if $b>1;
}
```

Hopefully the (untested) code above is clear enough. (I need more coffee...) Another reason to keep the data together is that it can be easier to do some error checking/handling, as you don't have to worry about your arrays being mismatched, etc. Finally, you may even want to use an array of hashes, so your code can be a little more self-documenting in that you don't have to remember that "slot 0 is a and slot 1 is b":

```perl
# An array of hash references can be pretty nice, too:
my @c = (
    { a=>6, b=>3 },
    { a=>7, b=>2 },
    { a=>8, b=>1 },
);

# Then printing the related data is easy, and you needn't track the array index
for my $item (@c) {
    print $item->{a},"\n" if $item->{b}>1;
}
```

...roboticus When your only tool is a hammer, all problems look like your thumb.
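If the data already arrives as two parallel arrays, one possible way to get to the array-of-hashes layout described above is sketched below, including the mismatch check the post alludes to. The name `@records` and the error message are assumptions for illustration; the `+{ ... }` form is used so that Perl parses the inner braces as an anonymous hash rather than a block.

```perl
use strict;
use warnings;

my @a = (6, 7, 8);
my @b = (3, 2, 1);

# Refuse to combine arrays of different lengths.
die "parallel arrays differ in length\n" unless @a == @b;

# One hash reference per record; the keys 'a' and 'b' follow the post above.
my @records = map { +{ a => $a[$_], b => $b[$_] } } 0 .. $#a;

# Selecting by one field and extracting another no longer needs an index.
my @selected = map { $_->{a} } grep { $_->{b} > 1 } @records;

print "@selected\n";    # prints "6 7"
```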

