Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number

Re^2: Hash order randomization is coming, are you ready?

by demerphq (Chancellor)
on Nov 22, 2012 at 12:18 UTC ( [id://1005128] : note . print w/replies, xml ) Need Help??

in reply to Re: Hash order randomization is coming, are you ready?
in thread Hash order randomization is coming, are you ready?

Yes. I shepherded this change into blead. In the course of doing so I had to fix a surprisingly large amount of code that one way or another had bogus expectations of hash ordering. Some of these were surprisingly subtle. For instance code like this:

my %hash= some_list_of_key_value_pairs(); my @keys= keys %hash; #... my %copy= %hash; #... my @values= values %copy; do_something_with($keys[$_],$values[$_]) for 0..$#keys;

This code will work so long as the keys put into %hash all hash into different buckets. However if any collide it will fail. So it is very sensitive to what data is involved and to an uninformed observer would make no sense, especially if test samples happened to work out. With an older perl the same input would always produce the same output so a safe input set would remain safe for ever, on the other hand with hash seed randomization you are pretty much guaranteed to get a collision /sometime/ no matter what input you feed in, which will then make this code fail regularly.

Ive seen all kinds of variants of this. All of them would be broken with old perls with tweaks to the state of the hash, such as pre-sizing the hash buckets to a particular size, or putting the same keys in two hashes with two different bucket array sizes. Things that would fail very very rarely, and would be very hard to debug. All of these kind of bugs will start happening much more often, and therefore become much easier to track down and fix.


Replies are listed 'Best First'.
Re^3: Hash order randomization is coming, are you ready?
by kst (Initiate) on Mar 29, 2013 at 05:25 UTC

    Hmm. I tried this test script:

    #!/usr/bin/perl use strict; use warnings; my %hash = map { $_ => 1 } (0..5); my %copy = %hash; my $hash_keys = join(' ', keys %hash); my $copy_keys = join(' ', keys %copy); print "$hash_keys --> $copy_keys, ", $hash_keys eq $copy_keys ? "Same" : "Different", " order \n";

    with several different versions of Perl. It prints "Different order" with all the versions I tried (5.8.9, 5.10.1, 5.12.5, 5.14.4, 5.16.3, 5.17.10). With all but 5.17.10, the output is the same each time the program is run:

    4 1 3 0 2 5 --> 1 4 0 3 2 5, Different order

    With 5.17.10, I get different orders each time I run the program.

    So, if I understand correctly, the code you posted is visibly broken and has been for a very long time.

      So, if I understand correctly, the code you posted is visibly broken and has been for a very long time.

      Well yes, it is broken, that was my point for bringing it up :-). Whether it is visibly broken or not depends on whether you have any collisions. In the case you posted there are two sets of collisions, 4 and 1, and 3 and 0. (You can tell because they reverse each copy in earlier perls.) However if changed it to be 1,2,3,5,7,8 you would not have any collisions and the order would appear the same. Eg try:

      my %hash = map { $_ => 1 } (1,2,3,5,7,8);

      In 5.18 the order will be different for every hash. No exceptions. Here is what perlfunc will say in 5.18:

      Hash entries are returned in an apparently random order. The actual r +andom order is specific to a given hash; the exact same series of operations on two hashes may result in a different order for each hash. Any inser +tion into the hash may change the order, as will any deletion, with the exc +eption that the most recent key returned by C<each> or C<keys> may be deleted without changing the order. So long as a given hash is unmodified you +may rely on C<keys>, C<values> and C<each> to repeatedly return the same o +rder as each other. See L<perlsec/"Algorithmic Complexity Attacks"> for details on why hash order is randomized. Aside from the guarantees provided here the exact details of Perl's hash algorithm and the hash traversal order are subject to change in any release of Perl.

        Ok, I just didn't expect that many collisions for such a small set of keys. I've never paid much attention to how Perl computes hashes for %hashes; I've been content so far to assume that It Just Works.