Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much

Hash order randomization is coming, are you ready?

by demerphq (Chancellor)
on Nov 22, 2012 at 12:02 UTC ( [id://1005122] : perlmeditation . print w/replies, xml ) Need Help??

Perl 5.18 will introduce per process hash randomization and almost certainly will feature a new hash function. This means that code that has improper dependencies on perls hash order will become much more obvious.

Is your code ready?

Test your code on 5.17.6, and make sure it is ready for 5.18!

You might be pleasantly surprised: it might help you track down that nasty heisenbug you could never quite explain.


Replies are listed 'Best First'.
Re: Hash order randomization is coming, are you ready?
by muba (Priest) on Nov 22, 2012 at 12:04 UTC

      Yes. I shepherded this change into blead. In the course of doing so I had to fix a surprisingly large amount of code that one way or another had bogus expectations of hash ordering. Some of these were surprisingly subtle. For instance code like this:

      my %hash= some_list_of_key_value_pairs(); my @keys= keys %hash; #... my %copy= %hash; #... my @values= values %copy; do_something_with($keys[$_],$values[$_]) for 0..$#keys;

      This code will work so long as the keys put into %hash all hash into different buckets. However if any collide it will fail. So it is very sensitive to what data is involved and to an uninformed observer would make no sense, especially if test samples happened to work out. With an older perl the same input would always produce the same output so a safe input set would remain safe for ever, on the other hand with hash seed randomization you are pretty much guaranteed to get a collision /sometime/ no matter what input you feed in, which will then make this code fail regularly.

      Ive seen all kinds of variants of this. All of them would be broken with old perls with tweaks to the state of the hash, such as pre-sizing the hash buckets to a particular size, or putting the same keys in two hashes with two different bucket array sizes. Things that would fail very very rarely, and would be very hard to debug. All of these kind of bugs will start happening much more often, and therefore become much easier to track down and fix.


        Hmm. I tried this test script:

        #!/usr/bin/perl use strict; use warnings; my %hash = map { $_ => 1 } (0..5); my %copy = %hash; my $hash_keys = join(' ', keys %hash); my $copy_keys = join(' ', keys %copy); print "$hash_keys --> $copy_keys, ", $hash_keys eq $copy_keys ? "Same" : "Different", " order \n";

        with several different versions of Perl. It prints "Different order" with all the versions I tried (5.8.9, 5.10.1, 5.12.5, 5.14.4, 5.16.3, 5.17.10). With all but 5.17.10, the output is the same each time the program is run:

        4 1 3 0 2 5 --> 1 4 0 3 2 5, Different order

        With 5.17.10, I get different orders each time I run the program.

        So, if I understand correctly, the code you posted is visibly broken and has been for a very long time.

Re: Hash order randomization is coming, are you ready?
by chrestomanci (Priest) on Nov 29, 2012 at 12:53 UTC

    Will the order of keys returned change from one call to the next on the same has in the same run of a script? Is there an explicit shuffle done every time you call keys etc?

    Consider the code:

    my $hashRef = someFuncCall(); my $keys1 = join(' ', keys %$hashRef); my $keys2 = join(' ', keys %$hashRef); print "Key order is not stable" if $keys1 ne $keys2;

      One certainly hopes they haven't broken one of the few guarantees given in the POD:

      keys HASH

      Returns a list consisting of all the keys of the named hash. (In scalar context, returns the number of keys.)

      The keys are returned in an apparently random order. The actual random order is subject to change in future versions of perl, but it is guaranteed to be the same order as either the values or each function produces (given that the hash has not been modified). Since Perl 5.8.1 the ordering is different even between different runs of Perl for security reasons (see Algorithmic Complexity Attacks in the perlsec manpage).

      Then again, it would have been useful to jave been told how this new randomisation varies from that which has been in place since 5.8.1?

      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      RIP Neil Armstrong

        A casual test looks like that statement is still true:

        use 5.17.6; use strict; use warnings; my %a = ( z => 1, y => 2, x => 3); foreach (1..3) { print join(",", keys %a), "\n"; print join(",", values %a), "\n"; while (my ($d,$e) = each %a) { print "$d = $e\n"; } }

        run 1

        z,y,x 1,2,3 z = 1 y = 2 x = 3 z,y,x 1,2,3 z = 1 y = 2 x = 3 z,y,x 1,2,3 z = 1 y = 2 x = 3

        run 2

        x,y,z 3,2,1 x = 3 y = 2 z = 1 x,y,z 3,2,1 x = 3 y = 2 z = 1 x,y,z 3,2,1 x = 3 y = 2 z = 1

        No, this has not changed.


      Since we have to maintain the same order between values() and keys() I do not think that we can change this. However IMO you should not depend on it.


        Maintaining the same order between values and keys should mean you can depend on keys returning the same order each time, otherwise how is values supposed to know which order to present its order?

        my @keys1 = keys %some_hash; my @keys2 = keys %some_hash; #different from @keys1? my @values = values %some_hash; # same order as @keys1 or same as @key +s2?
        That makes it look pretty dependable to me, assuming: a) same process (or a fork of it), and b) hash otherwise unchanged between the two calls to keys.

        Personally, I only rely on the order when I'm doing something like @hash2{keys %hash1} = values %hash1;, otherwise I don't think I ever rely on the order from keys (if I care, I sort them) or values (there's no real way to sort values the same as sorting keys anyway), so I doubt I'm affected, but only one way to find out. :-)

Re: Hash order randomization is coming, are you ready?
by LanX (Saint) on Dec 05, 2012 at 01:09 UTC
    As a side info:

    Recently a friend of mine showed how to attack webservers running with languages which do not randomize hashes.

    DOS attack with hash collisions (Perl rulez)

    Perl was immune, and should stay ahead of fraud evolution.

    Thanx for your work! =)

    Cheers Rolf

Re: Hash order randomization is coming, are you ready? (perlfaq4)
by Anonymous Monk on Jul 19, 2013 at 07:22 UTC
    Has perlfaq/perlfaq4 been "vetted" for this issue of hash order randomization?

    @hash1{ keys %hash2 } = values %hash2;

      For a single hash, keys and values are (still) guaranteed to return the items in the same order, as long as the hash remains unmodified between the calls; see keys. This means that this exact line still keeps working. Doing otherwise would likely break much code in very hard to detect ways.

Re: Hash order randomization is coming, are you ready?
by Anonymous Monk on Nov 23, 2012 at 00:40 UTC