PerlMonks  

PDL's error or mine?

by jo37 (Deacon)
on Feb 22, 2023 at 14:05 UTC ( [id://11150528] : perlquestion )

jo37 has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks and nuns,

I'm facing a strange behaviour of PDL's setops. The intersection of a set having a single value and the empty set produces the singleton instead of the empty set. Am I missing something or is this a bug?

#!/usr/bin/perl
use v5.16;
use warnings;
use PDL;

say $PDL::VERSION;
my $e = zeroes 0;
say "$_ and $e: ", setops $_, 'AND', $e
    for pdl(1), ones(1), ones(4), +sequence(4);

__DATA__
2.025
1 and Empty[0]: [1]
[1] and Empty[0]: [1]
[1 1 1 1] and Empty[0]: [1]
[0 1 2 3] and Empty[0]: Empty[0]
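Until a fixed release is installed, one possible stopgap is to short-circuit empty operands before calling setops. The following is only a sketch: the helper name intersect_guarded is hypothetical (it is not part of PDL), and it covers only the empty-operand case shown above, not any other odd setops results.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical helper, not part of PDL: returns an empty ndarray
# whenever either operand is empty, otherwise defers to setops.
sub intersect_guarded {
    my ($x, $y) = @_;
    # Intersection with the empty set is always empty.
    return zeroes($x->type, 0) if $x->isempty or $y->isempty;
    return setops($x, 'AND', $y);
}

my $e = zeroes 0;
print intersect_guarded(pdl(1), $e), "\n";   # Empty[0]
print intersect_guarded(ones(4), $e), "\n";  # Empty[0]
```

Non-empty operands still go through setops unchanged, so the guard does not alter results that were already correct.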

Greetings,
-jo

$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$

Replies are listed 'Best First'.
Re: PDL's error or mine?
by choroba (Cardinal) on Feb 22, 2023 at 15:23 UTC
    I can confirm the behaviour in PDL 2.081 (latest). The whole intersect is weird.
    1 and [1 1]: [1 1]
    [1] and [1 1]: [1 1]
    [1 1 1 1] and [1 1]: [1 1]
    [0 1 2 3] and [1 1]: [1]

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: PDL's error or mine?
by syphilis (Archbishop) on Feb 23, 2023 at 01:54 UTC

      Opened an issue as suggested, including the strange behaviour as discovered by choroba

      Interestingly, the search for a suitable workaround resulted in a remarkable boost for my specific application. Here the sets represent indices into another piddle, and one operand is taken from a piddle holding several sets, which requires all sets to have equal dimensions. This is achieved by padding the sets with BAD values at the end. With a small modification, each value can instead be placed at the position it represents, with all remaining positions set to BAD. An intersection then takes little more than a simple dice.

      E.g.

      my $data = pdl ...; # 1-d
      #my $set1 = $data->where(something); # before Update 1
      #my $set1 = $data->which(something); # before Update 2
      my $set1 = which($mask1); # $mask1 having the shape of $data
      my $full1 = zeroes(indx, $data->dim(0))->setvaltobad(0);
      $full1->dice($set1) .= $set1;
      #my $set2 = $data->where(other); # before Update 1
      #my $set2 = $data->which(other); # before Update 2
      my $set2 = which($mask2); # $mask2 having the shape of $data
      # Now the intersection of $set1 and $set2 is:
      my $tmp = $full1->dice($set2)->sever;
      my $intersect = $tmp->where(isgood $tmp);
      For large $data, the result in $intersect can be found much faster with dice than with setops. An older issue with setops revealed that it calls uniq, which sorts the data and makes it O(N log N), while the dice approach should be O(N). In my application there is no overhead in constructing $full1.
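      A self-contained variant of the dice-based intersection might look like the sketch below. The sample data and the two masks are made up for illustration; the only assumption carried over from the description is that both sets hold indices into $data, which which() guarantees for a mask of the same shape.

```perl
use strict;
use warnings;
use PDL;

# Made-up sample data and masks, for illustration only.
my $data  = sequence(10);          # [0 1 2 3 4 5 6 7 8 9]
my $mask1 = ($data % 2) == 0;      # even values
my $mask2 = $data > 3;             # values above 3

# which() returns indices into $data, so each one is below $data->dim(0).
my $set1 = which($mask1);          # [0 2 4 6 8]
my $set2 = which($mask2);          # [4 5 6 7 8 9]

# Lookup table as long as $data: position i holds i if i is in $set1,
# and BAD otherwise.
my $full1 = zeroes(indx, $data->dim(0))->setvaltobad(0);
$full1->dice($set1) .= $set1;

# Look up every member of $set2; the good entries are the intersection.
my $tmp       = $full1->dice($set2)->sever;
my $intersect = $tmp->where($tmp->isgood);

print "$intersect\n";              # [4 6 8]
```

      Building $full1 costs one pass over $data, so the approach pays off mainly when the same first operand is intersected with many other sets.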

      Probably this is already described somewhere else.

      Update 1:
      It must be $data->which(...) instead of $data->where(...). Modified the example.

      Update 2:
      The example was still faulty. The key for this to work is $set1 and $set2 being piddles holding indices from $data as said in the paragraph above. Modified the example once again. Regard it as pseudocode.

      Greetings,
      -jo

      $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
        This has been fixed in PDL and (so far) dev-released as 2.081_01. Thank you!

        Wouldn't "dice" fail if $set1->maximum > $data->dim(0) ? (or even s/set1/data/ in general?)