Comparing a value to a list of numbers

g_speran has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Comparing a value to a list of numbers by davido (Cardinal) on Jan 29, 2021 at 17:29 UTC
The hashtable lookup is by far the fastest approach if you can spare the time to build it in the first place. You wouldn't use the hash approach if you are only doing a search one time on the data set. But if you are doing it many times, the hash approach quickly shines. Other viable approaches include regex alternation, any, and grep. Observe: #!/usr/bin/env perl use strict; use warnings; use List::Util qw(any); use Benchmark qw(cmpthese); my @numlist = map {int(rand(100_000))} 1..10_000; my %numlist_table; @numlist_table{@numlist} = (); my $numlist_re = do { my $numlist_str = join('\|', @numlist); qr/^(?:$numlist_str)$/; }; my $target = 42; sub sgrep { return grep {$target == $_} @numlist; } sub table { return exists $numlist_table{$target}; } sub alternation { return $target =~ m/$numlist_re/; } sub sany { return any {$target == $_} @numlist; } cmpthese(-3, { grep => \&sgrep, table => \&table, alternation => \&alternation, any => \&sany, }); [download] The results: `Rate grep any alternation table grep 4055/s -- -28% -100% -100% any 5641/s 39% -- -100% -100% alternation 3440637/s 84745% 60895% -- -93% table 48177809/s 1187946% 853994% 1300% --` [download] However, if you consider the time it takes to build the alternation regex, and to build the hash, they are less efficient options for small data sets, or for small numbers of lookups. For large datasets with large numbers of lookups, the hash is a good answer (unless memory is tight). Dave	[reply] [d/l] [select]
Re^2: Comparing a value to a list of numbers by LanX (Saint) on Jan 29, 2021 at 17:40 UTC
Well, there is another question lurking behind, which is IMHO not too tangential What if large ranges are involved? The OP showed at least a little one with 41-56. I'm sure a binary search would outperform a hash then.° (Of course not if using random keys like you did) Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery} update °) Well I need to be more precise: Large ranges are sometimes not practicable with hashes because of the memory complexity. This could lead to heavy swapping operations. Having a little slower code with binary search O(ln) is often better than hash lookup with O(1)	[reply]
Re^2: Comparing a value to a list of numbers by LanX (Saint) on Jan 30, 2021 at 02:30 UTC
Hi Dave, > `my $target = 42;` I think using a fixed number if some routines use a linear search might not lead to accurate results. But I have to admit, I don't know yet how to best combine a random $target with Benchmark Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery}	[reply] [d/l]
Re: Comparing a value to a list of numbers by Discipulus (Canon) on Jan 29, 2021 at 16:19 UTC
Hello g_speran, you can use `grep` like in `if grep{ $given == $_ } @nums` or `any` from List::Util Anyway what you posted is not perl: the equal sign `=` is an assignement operator and you need `==` to compare nums and `eq` for strings. `if` must be lowercase and square brackets are for anonymous arrays. L* There are no rules, there are no thumbs.. Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.	[reply] [d/l] [select]
Re: Comparing a value to a list of numbers by haukex (Archbishop) on Jan 29, 2021 at 18:47 UTC
a variable, which contains a number to a list of numbers, such as 1,2,5,6,9,10,41-56 I'm a little unclear on what you mean. Is this variable an array, like the other monks have assumed? Because if it's a string, you'll have to parse it first. Luckily, there is a module that can handle a string such as the one you've shown, Set::IntSpan. The advantage of this module is that if the spans are large, like say 1,2,3-100000, it'll save a lot of memory as compared to an array or a hash. Otherwise, this is just a TIMTOWTDI solution. `use warnings; use strict; use Set::IntSpan; my $set = Set::IntSpan->new("1,2,5,6,9,10,41-56"); for my $x (1,2,3,42,100) { print "$x is ", $set->member($x)?"":"NOT ", "in the set\n"; } __END__ 1 is in the set 2 is in the set 3 is NOT in the set 42 is in the set 100 is NOT in the set` [download]	[reply] [d/l]
Re^2: Comparing a value to a list of numbers by Bod (Parson) on Jan 29, 2021 at 23:25 UTC
Luckily, there is a module that can handle a string such as the one you've shown, Set::IntSpan I'm not sure why I continue to be surprised...but... It still amazes me just how many really useful modules are shared in the Monastery every day!	[reply]
Re: Comparing a value to a list of numbers by hippo (Bishop) on Jan 29, 2021 at 16:59 UTC
It's an FAQ. 🦛	[reply]
Re: Comparing a value to a list of numbers by Fletch (Bishop) on Jan 29, 2021 at 16:58 UTC
Straight from the examples in Quantum::Superpositions `use Quantum::Superpositions; if ($x == any($a, $b, $c)) { ... }` [download] The cake is a lie. The cake is a lie. The cake is a lie.	[reply] [d/l]
Re^2: Comparing a value to a list of numbers by LanX (Saint) on Jan 29, 2021 at 17:10 UTC
IIRC this is implemented by `any` returning an object overloading the `==` operator. And there is maybe also `tie`'ing involved. Hence many movable and slow parts. Essentially a very sweet syntax coming with a cost. Question: is there also intelligent support for ranges? I'd buy into this if yes. like `if ($x == any(1,2,5,6,9,10,[41,56]) { ... }` ? I suppose it's clear where the advantages are ... Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery}	[reply] [d/l] [select]
Re^3: Comparing a value to a list of numbers by Fletch (Bishop) on Jan 29, 2021 at 21:59 UTC
No clue, although I completely glossed on the range requirement in the OP. Having seen that I'd change my answer to Set::IntSpan probably. The cake is a lie. The cake is a lie. The cake is a lie.	[reply]
Re: Comparing a value to a list of numbers (in via hash) by LanX (Saint) on Jan 29, 2021 at 16:52 UTC
TIMTOWTDI Depending on your requirements on time and memory complexity, you could also use a hash `DB<28> %in = map { $_ => 1 } 1,2,5,6,9,10,41..56 DB<29> p $in{5} 1 DB<30> p $in{0} DB<31>` [download] update corrected C&P error omitting the `1` Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery}	[reply] [d/l] [select]
Re: Comparing a value to a list of numbers by jcb (Parson) on Jan 31, 2021 at 04:50 UTC
Other monks have suggested a binary search solution, but did not provide an example. I took writing an optimized binary search for this as a challenge and wrote this: #!/usr/bin/perl # -- CPerl -- use strict; use warnings; # sparse range structure: # array of arrayref/number # single integers represented as number # contigous ranges as arrayref [LOW,HIGH] inclusive # given: string "A,B,C,X-Y,Z" # return: sorted array [A, B, C, [X, Y], Z] sub parse ($) { my @elements = split /,/, shift; foreach (@elements) { s/\s+//g; # prune whitespace next unless m/\d+-\d+/; # skip conversion if single integer $_ = [split /-/, $_]; # convert ranges to arrayrefs } # sort range set @elements = sort {(ref($a)?$a->[0]:$a) <=> (ref($b)?$b->[0]:$b)} @el +ements; # merge overlapping loose elements into preceding ranges for (my $i = 0; $i < $#elements; $i++) { next unless ref $elements[$i]; # skip single integers while ($i+1 <= $#elements and $elements[$i+1] <= $elements[$i][1]) { splice @elements, $i+1, 1 } # remove elements included in r +ange } # coalesce contiguous integers into ranges for (my $i = 0; $i < $#elements; $i++) { next if ref $elements[$i]; # skip ranges if ($elements[$i]+1 == $elements[$i+1]) { my $j = 1+$i; $j++ while !ref($elements[$j]) && $elements[$j]+1 == $elements[$ +j+1]; splice @elements, $i, 1+$j-$i, [$elements[$i], $elements[$j]]; } } # merge adjacent loose elements into succeeding ranges for (my $i = 0; $i < $#elements; $i++) { next if ref $elements[$i]; # skip ranges next unless ref $elements[$i+1]; # but next element is a range # There can be at most one such element, since contiguous integers + were # coalesced into ranges above. if ($elements[$i]+1 == $elements[$i+1][0]) { splice @elements, $i, 2, [$elements[$i], $elements[$i+1][1]] } } # merge adjacent loose elements into preceding ranges for (my $i = 0; $i < $#elements; $i++) { next unless ref $elements[$i]; # skip single integers next if ref $elements[$i+1]; # but next element is a single int +eger # There can be at most one such element, since contiguous integers + were # coalesced into ranges above. if ($elements[$i][1]+1 == $elements[$i+1]) { splice @elements, $i, 2, [$elements[$i][0], $elements[$i+1]] } } # merge overlapping ranges for (my $i = 0; $i < $#elements; $i++) { next unless ref $elements[$i] and ref $elements[$i+1]; splice @elements, $i, 2, [$elements[$i][0], $elements[$i+1][1]] if $elements[$i][1] >= $elements[$i+1][0]; } # merge adjacent ranges for (my $i = 0; $i < $#elements; $i++) { next unless ref $elements[$i] and ref $elements[$i+1]; splice @elements, $i, 2, [$elements[$i][0], $elements[$i+1][1]] if $elements[$i][1]+1 == $elements[$i+1][0]; } return \@elements; } # given: sorted array from sub parse and integer # return true if the integer is in the sorted array sub search ($$) { my $set = shift; my $num = shift; my $left = 0; my $right = $#$set; my $i = $#$set >> 1; # bitshift for integer /2 while ($left < $right) { if (ref($set->[$i])) { # evaluate a range return 1 # number within this range if $num >= $set->[$i][0] && $num <= $set->[$i][1]; if ($num > $set->[$i][0]) { $left = $i+1 } else { $right = $i-1 } } else { # evaluate a single integer return 1 # number matched if $num == $set->[$i]; if ($num > $set->[$i]) { $left = $i+1 } else { $right = $i-1 } } $i = ($left + $right) >> 1; # bitshift for integer /2 } # last check if (ref($set->[$i])) { return $num >= $set->[$i][0] && $num <= $set->[$i][1] } else { return $num == $set->[$i] } } my $Set = parse shift; use Data::Dumper; print Data::Dumper->new([$Set],[qw(Set)])->Indent(0)->Dump,"\n"; foreach (@ARGV) { print $_, search($Set, $_) ? ' is' : ' is not', " in the set\n"; } [download] The script expects a set of numbers as its first command line argument, and then various numbers to test against the set as additional arguments. Examples: `$ ./bsearch.pl 1,2,5,6,9,10,41-56 1 4 42 17 $Set = [[1,2],[5,6],[9,10],[41,'56']]; 1 is in the set 4 is not in the set 42 is in the set 17 is not in the set` [download] `$ ./bsearch.pl 1,2,11-16,6,7,19,9,5-8,13,14,15,4 1 2 3 4 5 8 9 10 11 1 +2 16 17 18 19 20 $Set = [[1,2],[4,9],[11,16],19]; 1 is in the set 2 is in the set 3 is not in the set 4 is in the set 5 is in the set 8 is in the set 9 is in the set 10 is not in the set 11 is in the set 12 is in the set 16 is in the set 17 is not in the set 18 is not in the set 19 is in the set 20 is not in the set` [download] This latter example was used for developing the set-optimization logic.	[reply] [d/l] [select]
Re^2: Comparing a value to a list of numbers by haukex (Archbishop) on Jan 31, 2021 at 09:09 UTC
Other monks have suggested a binary search solution, but did not provide an example. Set::IntSpan uses a binary search.	[reply]
Re^3: Comparing a value to a list of numbers by LanX (Saint) on Jan 31, 2021 at 20:06 UTC
> Set::IntSpan uses a binary search. Interesting, so why is it restricted to Integers? Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery}	[reply]
Re^4: Comparing a value to a list of numbers by haukex (Archbishop) on Feb 01, 2021 at 15:48 UTC
Re^5: Comparing a value to a list of numbers by LanX (Saint) on Feb 01, 2021 at 16:06 UTC
Some notes below your chosen depth have not been shown here
Re^2: Comparing a value to a list of numbers by LanX (Saint) on Jan 31, 2021 at 19:55 UTC
> Other monks have suggested a binary search solution, but did not provide an example. FWIW I did, but didn't publish, since it didn't fit into davido's benchmark (which has also it flaws by only testing 42) `DB<47> sub tst { if ($_>=9) { if ($_>=41 and $_<=56) {1} elsif ($_<= +10) {1} } elsif ($_<=6) { if ($_>=5){1} elsif \ ($_<=2 and $_>=1) {1} }} DB<48> %in = map { $_ => 1 } 1,2,5,6,9,10,41..56 DB<49> tst != $in{$_} and print "error $_" for 1..190 DB<50>` [download] I was also pondering about how to optimize a solution (time and memory) for all cases. Like potentially many points many ranges floats instead of integers multiple dimensions (like rectangular areas in a plane) An "optimal" solution would involve combining different techniques ... And some techniques which were prominent in similar older threads haven't been mentioned yet. Anyway all of this is far beyond what the OP wanted. But yes: again "I'm suggesting but not providing an example" ;-) Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery}	[reply] [d/l]
Re^3: Comparing a value to a list of numbers by jcb (Parson) on Feb 01, 2021 at 03:20 UTC
I assumed that the set would vary (although using `eval` to convert the arrayref returned from `parse` into a closure like your `tst` could be interesting even if `compile` could end up longer than `parse`) and tried to optimize for the "many points" and "many ranges" cases; thus all those extra passes in the `parse` sub to simplify the result. I excluded floating point because exact comparisons are not reliable and it is easier to restrict the domain to integers. Multiple dimensions start to get complex enough that I suspect a general optimal solution is impossible — optimizing that case will depend on the actual data used. The "suggesting but not providing an example" remark was an excuse for posting my solution — usually these questions have been "done to death" before I get to them. ;-)	[reply] [d/l] [select]
Re^4: Comparing a value to a list of numbers by jcb (Parson) on Feb 01, 2021 at 05:52 UTC
Re: Comparing a value to a list of numbers (in via smartmatch) by LanX (Saint) on Jan 29, 2021 at 16:47 UTC
well, for completeness: There is `smartmatch ~~` (sometimes) acting like an `in` operator `DB<24> $x=5 DB<25> p $x ~~ [1,2,5,6,9,10,41..56] 1 DB<26> $x=0 DB<27> p $x ~~ [1,2,5,6,9,10,41..56] DB<28>` [download] but `WARNING: Smartmatch is experimental` so you'd need to deactivate the warning and hope it'll never vanish (at your risk) Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery}	[reply] [d/l] [select]
Re^2: Comparing a value to a list of numbers (in via smartmatch) by Marshall (Canon) on Jan 30, 2021 at 02:27 UTC
I personally would not use the smartmatch operator for anything. I did some experimenting with this experimental operator and found out that it is "too smart for me!". I could not accurately predict what this thing would actually do!	[reply]
Re^3: Comparing a value to a list of numbers (in via smartmatch) by LanX (Saint) on Jan 30, 2021 at 02:32 UTC
I agree, but there is a real need for an "in"-operator like in other languages. I hope this aspect will survive somehow and I listed it for completeness with full warning. Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery}	[reply]
Re: Comparing a value to a list of numbers by tybalt89 (Monsignor) on Feb 01, 2021 at 02:32 UTC
And now for something completely different... #!/usr/bin/perl use strict; # https://perlmonks.org/?node_id=11127703 use warnings; no warnings qw( qw ); use Search::Dict; @ARGV = qw( 1,2,5,6,9,10,41-56 1 4 42 17 ); my $file = join '', map "$_\n", split /,/, shift; open my $fh, '<', \$file or die; for my $n ( @ARGV ) { look $fh, $n, { comp => sub { $_[0] <=> $_[1] }, xfrm => sub { $_[0] =~ s/\d+-//r }, }; my $have = <$fh>; my $not = (defined $have and $have =~ /(\d+)-(\d+)/ ? $1 <= $n <= $2 : $have == $n) ? '' : ' no +t'; print "$n is$not in set\n"; } [download] Outputs: `1 is in set 4 is not in set 42 is in set 17 is not in set` [download]	[reply] [d/l] [select]
Re: Comparing a value to a list of numbers by Polyglot (Chaplain) on Feb 02, 2021 at 03:34 UTC
The variety of answers you've already gotten just shows how broad the TMTOWTDI principle actually is in Perl. I would do something entirely different than has yet been suggested--might not be the best, but it has worked for me in similar situations a number of times. `my $check = sprintf("%0.2d", $x); next if ('01,02,05,06,09,10,41,42,43,44,45,46,47,48,49,50,51,52,53,54, +55,56' =~ m/$check/);` [download] Obviously, if the list gets too long, this might not be practical. As written, this accommodates one- or two-digit numbers (though this could be tailored for larger ones, with the addition of more leading zeros). But for a list such as you suggested, it might work well enough--better than taking time to hash it all in terms of its simplicity, if not its efficiency. Blessings, ~Polyglot~	[reply] [d/l]
Re: Comparing a value to a list of numbers by Anonymous Monk on Jan 29, 2021 at 18:51 UTC
Seriously ... List::Util is your friend. (Even if you just "steal" a particular algorithm from it.) There's no reason to do something over if someone else has already done it.	[reply]
Re^2: Comparing a value to a list of numbers by Anonymous Monk on Jan 29, 2021 at 19:49 UTC
Please give an example (in perl code) of hose List::Util can be used to solve the OP's problem. Otherwise your "help" is just a waste of space and a waste of everybody's time.	[reply]
Re^3: Comparing a value to a list of numbers by syphilis (Archbishop) on Jan 30, 2021 at 00:44 UTC
Please give an example (in perl code) of hose List::Util can be used to solve the OP's problem `use warnings; use strict; use List::Util 'uniq'; my @given = (1,2,5,6,9,10,41..56); my @check = (3,4,9,11,48,102); my $c = uniq(@given); for(@check) { push(@given, $_); print "$_ is in the list\n" unless $c + 1 == uniq(@given); pop(@given); } __END__ Outputs: 9 is in the list 48 is in the list` [download] List::Util::uniq() utilizes the hash lookup referred to elsewhere in this thread. Some efficiency is lost in this script because uniq() has to generate the hash every time it is called. It would be more efficient if the perl code itself created the hash (just once), and then re-used that hash for each number that needs to be checked. Perhaps anonymous had some other List::Util routine in mind. Cheers, Rob	[reply] [d/l]
Re^4: Comparing a value to a list of numbers by LanX (Saint) on Jan 30, 2021 at 20:07 UTC
Re^5: Comparing a value to a list of numbers by eyepopslikeamosquito (Archbishop) on Feb 01, 2021 at 04:43 UTC
Some notes below your chosen depth have not been shown here
A reply falls below the community's threshold of quality. You may see it by logging in.


Syntactic Confectionery Delight
	PerlMonks

Comparing a value to a list of numbers

update

update