http://qs321.pair.com?node_id=459349



2. Compare two arrays and show the elements that exist in both (I actually screwed up in my live question, but the answers were still informative). This is a great question because it shows how someone thinks. Lower-level devs use nested loops (I did, when this was thrown at me a few years ago). Smart people use hash slices. Really smart people use undef'd hash slices to save memory.
Could you give an example of the "hash slice" method? Using hashes is a good way (something like my %temp; my @doubles = grep { ++$temp{$_} == 2 } @a1, @a2; works nicely), but I can't seem to figure out the "(undef'd) hash slice" method. Please enlighten me :-)

Arjen

Re^3: Homework question list
by Tanktalus (Canon) on May 22, 2005 at 15:37 UTC

    Your example fails if @a1 or @a2 contains a value more than once. For example, take @a1 = (1, 2, 3, 2, 1) and @a2 = (4, 5, 6, 5, 4). Your grep would show 1, 2, 4, and 5 as elements existing in both arrays, when the right answer is obviously none.
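The failure is easy to reproduce. This is a minimal sketch that runs the counting grep from the question on the two arrays above; nothing is assumed beyond those values.

```perl
use strict;
use warnings;

my @a1 = (1, 2, 3, 2, 1);
my @a2 = (4, 5, 6, 5, 4);

# Count every value across both lists and keep each one the moment its
# count reaches 2. Nothing records which array a value came from, so a
# value repeated inside a single array also reaches 2.
my %temp;
my @doubles = grep { ++$temp{$_} == 2 } @a1, @a2;

print "@doubles\n";    # 2 1 5 4, although the arrays share no elements
```

The values come out in the order their second occurrence is met, which is why 2 precedes 1.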

    I will also disagree with cLive ;-) that "really smart people" use the undef'd hash slices to save memory. If memory is a concern, sure, but in general the difference is not going to be significant. Premature optimisation and all that. I've even found cases where using the standard "++$hash{$value}" turned out to be handy three months later, when the number of times a value shows up became relevant. I didn't need to change nearly as much code because I wasn't "really smart" according to cLive's definition.

    Anyway, as always, TIMTOWTDI, so being able to use the undef'd hash slice is still a tool to keep handy:

    my %a1; undef @a1{@a1}; my @in_both = grep { exists $a1{$_} } @a2;
    Unfortunately, this has the side effect of showing duplicates if @a1 has a value and @a2 has that value multiple times. That may, of course, be what you want, but it's not explicit in the original requirement. To get each common value only once, slice both arrays and grep over the keys of one hash:
    my (%a1, %a2); undef @a1{@a1}; undef @a2{@a2}; my @in_both = grep { exists $a1{$_} } keys %a2;
    Oh, and I also recommend better variable names than what I'm using here. :-)
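For anyone who wants to run the two variants side by side, here is a sketch with more descriptive names. The sample data and the names (@first, @second, %in_first, %in_second) are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;

my @first  = (1, 2, 3, 2, 1);
my @second = (2, 3, 3, 4);

# Variant 1: slice only the first array. Every key of %in_first exists
# with an undef value, so the test has to be exists(), not truth.
my %in_first;
undef @in_first{@first};
my @with_repeats = grep { exists $in_first{$_} } @second;
print "@with_repeats\n";     # 2 3 3 (3 repeats because @second holds it twice)

# Variant 2: slice both arrays, then walk the keys of one hash.
# Hash keys are unique, so each common value appears once. keys()
# returns them in no particular order, hence the numeric sort.
my %in_second;
undef @in_second{@second};
my @unique_common = sort { $a <=> $b }
                    grep { exists $in_first{$_} } keys %in_second;
print "@unique_common\n";    # 2 3
```

Note that the second variant loses the original ordering of @second; if order matters, the first variant plus a separate "seen" check may be the better fit.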

    Update: Added the italicised part in the last line.

      The example isn't a real good one, I know. I was just playing around with variations on a theme, not trying to get a bullet-proof solution. But still, good point about the example :-)

      cLive ;-)'s solution is a variation on a theme explained in the Cookbook. I was under the impression that there existed some really nifty hash slice trick to get the results that I didn't know about, hence the question.

      In my production code, I use good variable names. When experimenting, I'm not as picky :-) I probably should be when posting code here.

      Thanks, Arjen

Re^3: Homework question list
by cLive ;-) (Prior) on May 23, 2005 at 22:40 UTC

    Interestingly, you made the same mistake an applicant made when doing the compare - as Tanktalus has pointed out :)

    By the way, Tanktalus, by "really smart people", I was referring to tilly here ;-)

    cLive ;-)

      It is more accurate to say that smart people know that they can use the undef trick, and then don't, because it is an unnecessary micro-optimization that could easily confuse others.
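One possible reading of that advice, as a sketch only: store a true flag per value instead of relying on keys that exist with undef values. The sample data and the names (%in_first, %reported) are hypothetical, and this is not necessarily the code the poster had in mind.

```perl
use strict;
use warnings;

my @first  = (1, 2, 3, 2, 1);
my @second = (2, 3, 3, 4);

# Mark every value of @first with a true flag. This stores one small
# scalar per key, which the undef'd slice avoids, but lookups can then
# test plain truth instead of exists().
my %in_first = map { $_ => 1 } @first;

# %reported remembers values already emitted, so a value repeated in
# @second is kept only the first time it is seen.
my %reported;
my @common = grep { $in_first{$_} && !$reported{$_}++ } @second;

print "@common\n";    # 2 3
```

This keeps the order of @second and costs a little more memory, which is the trade-off the thread is arguing about.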