http://qs321.pair.com?node_id=439830


in reply to Space Efficiency of Hashes

Well, nothing beats a good TIAS with Devel::Size in the mix:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my @printables = map chr, 33 .. 126;

    # random string 30 characters long
    sub randstr { join '', map { $printables[rand(@printables)] } 1..30 }

    use Devel::Size qw(size total_size);

    # insert thousands separators, e.g. 1234567 -> 1,234,567
    sub commas {
        my $str = ''.reverse shift;
        return scalar reverse join ',', grep length, split /(.{3})/, $str
    }

    # note: => auto-quotes the bareword on its left, so this first key
    # is the literal string 'randstr'; only the value is random
    my %hash = ( randstr => randstr );

    for (0..5) {
        my $count = scalar keys %hash;
        print commas($count), " elements: ", commas(total_size\%hash), " bytes\n";
        # grow the hash tenfold before the next measurement
        for (1..9 * $count) {
            my $s = randstr;
            $hash{$s} = randstr;
        }
    }

    my $count = scalar keys %hash;
    print commas($count), " elements: ", commas(total_size\%hash), " bytes\n";
These are the results it printed out on my machine, for v5.8.4:
            1 elements:         176 bytes
           10 elements:       1,171 bytes
          100 elements:      11,249 bytes
        1,000 elements:     111,133 bytes
       10,000 elements:   1,135,573 bytes
      100,000 elements:  11,224,325 bytes
    1,000,000 elements: 111,194,341 bytes
(That last one took several minutes to complete on my system...)
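One way to read those numbers is to divide each total by its element count. The following is only a sketch: the figures are copied from the output above (nothing is re-measured), and `bytes_per_entry` is a helper name made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# [element count, total_size in bytes], copied from the results above
my @results = (
    [ 1,         176 ],
    [ 10,        1_171 ],
    [ 100,       11_249 ],
    [ 1_000,     111_133 ],
    [ 10_000,    1_135_573 ],
    [ 100_000,   11_224_325 ],
    [ 1_000_000, 111_194_341 ],
);

# hypothetical helper: average cost of one key/value pair
sub bytes_per_entry {
    my ($count, $bytes) = @_;
    return $bytes / $count;
}

for my $row (@results) {
    my ($count, $bytes) = @$row;
    printf "%9d elements: %7.2f bytes per entry\n",
        $count, bytes_per_entry($count, $bytes);
}
```

From 100 elements upward the average stays between roughly 111 and 114 bytes per entry, so for these 30-character keys and values the growth looks close to linear.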

Taking my last result and multiplying by five, 5 million {acc, build} pairs should need about 556 million bytes, which is only slightly more than half a gig of RAM, to store all those hash entries.
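The extrapolation is plain linear scaling, sketched below. It assumes the per-entry cost stays what it was in the million-element run, and that the real {acc, build} strings are about as long as the 30-character test strings; neither has been checked against the actual data. `estimate_bytes` is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Measured above: total_size of the 1,000,000-entry hash
my $measured_bytes   = 111_194_341;
my $measured_entries = 1_000_000;

# hypothetical helper: scale the measurement linearly to another entry count
# (assumption: per-entry cost does not change with hash size)
sub estimate_bytes {
    my ($entries) = @_;
    return $measured_bytes * $entries / $measured_entries;
}

my $entries  = 5_000_000;
my $estimate = estimate_bytes($entries);
printf "%d entries: about %.0f bytes (%.2f GiB)\n",
    $entries, $estimate, $estimate / 2**30;
```

If the real keys or values are much longer than 30 characters, the estimate needs to be scaled up accordingly.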

So if your machine has 2GB of RAM, you should be good to go.

--Stevie-O