Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

Re^2: Using hashes for set operations...

by LanX (Saint)
on May 23, 2011 at 08:51 UTC ( [id://906264]=note: print w/replies, xml ) Need Help??


in reply to Re: Using hashes for set operations...
in thread Using hashes for set operations...

Hmm interesting!

I was sure that using De Morgan's laws would just relocate the problem and not solve it. (you're now only deleting hash-slices)

But this relocation makes sense ... now it works for all cases where "undef" isn't allowed as part of an initial set. (undef deletes empty strings).

But you shouldn't use print to inspect data structures:

#!/usr/bin/env perl use strict; use warnings; use Data::Dump qw(pp); my @array1=(1..2,""); my @array2=(2..3,undef); my (%union, %sdiff, %sd1, %sd2, %inter); @union{(@array1,@array2)} = (@array1,@array2); %sd1 = %sd2 = %inter = %union; delete @sd1{@array2}; delete @sd2{@array1}; %sdiff = (%sd1,%sd2); delete @inter{keys %sdiff}; print "array1: ", pp(\@array1), "\n"; print "array2: ", pp(\@array2), "\n"; print "union: ", pp(\%union), "\n"; print "sdiff: ", pp(\%sdiff), "\n"; print "inter: ", pp(\%inter), "\n"; print "sd1: ", pp(\%sd1), "\n"; print "sd2: ", pp(\%sd2), "\n";

prints

Use of uninitialized value $array2[2] in hash slice at /home/lanx/B/PL +/PM/set2.pl line 12. Use of uninitialized value $array2[2] in delete at /home/lanx/B/PL/PM/ +set2.pl line 16. array1: [1, 2, ""] array2: [2, 3, undef] union: { "" => undef, 1 => 1, 2 => 2, 3 => 3 } sdiff: { 1 => 1, 3 => 3 } inter: { "" => undef, 2 => 2 } sd1: { 1 => 1 } sd2: { 3 => 3 }

But while you found a mathematical solution, needing so many operations for an intersection is in practice a little disappointing ...

Thanks anyway! :)

Cheers Rolf

Replies are listed 'Best First'.
Re^3: Using hashes for set operations...
by jaredor (Priest) on May 24, 2011 at 12:55 UTC

    Sir, to me "elegant" is equivalent to "mathematical"!

    (Even though I am a failed mathematician, it does sting a little that you felt the need to give me a link to De Morgan's laws.)

    True, I did treat this more as a little puzzle than a real world answer, but then what do you expect with only allowing the set operations, "union" and "subtraction"? As for that, you only get the union property as a side effect from hash key creation (which (I think) would not give even a squeak of protest if there were two key/value pairs with identical keys and different values, but by construction we avoid this issue).

    Although I've never used this behavior, apparently the delete function does return a list of keys values it's deleted, so delete(@A{keys B}) does return the intersection of A and B ... as well as an undef for every key value that was in B and not in A; so you're back to square 1 with ugly undefs in your answer. (Hence I still haven't used this behavior.) BTW, it just occurred to me that for the symmetric differences %sd1 and %sd2 don't have to be copies of the union, they only need to be the hash equivalents of @array1 and @array2, respectively.

    I love talking like a pedant, but in addition to this vice, I have to admit to a hypocrisy: I love using map and grep more, even to the point of using them for their side effects, which means a significant portion of my time in writing up my (few) posts on perlmonks is translating to while and for constructs. So I'm sincere when I say that your original post was elegant enough for me: Handling undefs in perl is just a fact of life, and if you can do it with just one swipe of the knife, then you are about as good as you're going to get without reformulating your approach.

    My last gratuitous squeak of protest is that print statements suffice if they work correctly for a code that is an illustration of a solution; in other words, it is not a data structure at that point so much as a formatted solution! (That being said, you've shamed me enough that I will try to start using Data::Dumper in my posts.)

    I've just gone through and removed all the smiley's from this post. So now I look smarter, but more crotchety. Anyone who reads this please note that I enjoy these kinds of "pure perl" puzzles and have enjoyed everyone's contribution to this query of LanX's. I spent years as a coder looking for good coders to critique my work with only modest success. Here at perlmonks you get high quality feedback quickly--and you still learn from the low quality feedback because you now have context in which to evaluate it. Good stuff, and thank you LanX.

      > apparently the delete function does return a list of keys it's deleted, so delete(@A{keys B}) does return the intersection of A and B ...

      Good observation! I forgot about this.... so one gets the difference and the intersection with one operation, thats brilliant.

      So a simple upgrade of delete with an extra switch to ignore undefs would give Perl a native primitive to handle set operations...

      And I enjoyed your post too, I'm still meditating about how to make use of your approach...

      > (Even though I am a failed mathematician, it does sting a little that you felt the need to give me a link to De Morgan's laws.)

      The link wasn't meant for you, believe me. :)

      I'm mathematician by training but not a native English speaker, so I need to elaborate my posts longer and stuff them with extra informations to avoid misunderstandings. (And the monastery is full of obsessed nitpickers...)

      Cheers Rolf

        Such a kind response deserves a little extra effort. Below is my "cleanest" attempt at solving your puzzle. It was motivated by two facts 1) I was wrong about the behavior of delete in that it returns the deleted values, not the deleted keys, so the "value equal key" array is necessary and 2) I've read a few responses on perlmonks that say hash elements that have undef as a value actually point to the same undef and thus potentially save some space. (I cannot find this stated in the documentation, but it probably is that I just don't know where to look; plus all the keywords that spring to mind are ubiquitous ;-)

        • Union is dead easy.
        • Intersection uses a temporary hash as well as a grep. I chose this way as being more succinct than efficient, but admit it violates your rules.
        • Symmetric Difference is about the same effort as two intersections; however, there is no grep!

        All three constructs were written to be independent of the others, but obviously that need not be a constraint; thus symmetric difference could be as easy as taking a copy of the union and deleting the intersection (as in your approach). I decided to make intersection "easier" than symmetric difference because I think intersections are more common, at least in the type of code I write. Lastly, I have decided to have all the result hashes "look the same" in that every element value is undef--which may or may not be more efficient!

        That's all I can do. Thanks, this has been fun.

        #!/usr/bin/env perl use strict; use warnings; use Data::Dump qw(pp); my @A = qw( 1 2 3 4 5 ); my @B = qw( 3 4 5 6 7 ); my %unionAB; @unionAB{@A,@B} = undef; my %interAB; { my (%tmpA); @tmpA{@A} = @A; @interAB{(grep {defined} delete @tmpA{@B})} = undef; } my %sdiffAB; { my (%tmpA, %tmpB); @tmpA{@A} = @A; @tmpB{@B} = @B; delete @tmpA{@B}; delete @tmpB{@A}; @sdiffAB{keys %tmpA, keys %tmpB} = undef; } # View results print 'unionAB: ', pp(\%unionAB), "\n"; print 'interAB: ', pp(\%interAB), "\n"; print 'sdiffAB: ', pp(\%sdiffAB), "\n";

        (Sadly, I feel your point about adding detail. Most of my time not coding was spent trying to guard against the charge that I'm an idiot who doesn't know set theory. Of course if my beloved Dr. Kaiser were on perlmonks and posted such, I would have to agree, but since he loves PROLOG, I think I'm safe.)

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://906264]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others imbibing at the Monastery: (2)
As of 2024-04-24 16:36 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found