Re: is rand random enough to simulate dice rolls?
by haukex (Archbishop) on Jan 03, 2021 at 15:16 UTC
|
But the case is I want to put this inside a .t file of a module
If this is for distribution to CPAN, then personally I would try and stay away from anything random - there are lots of different systems out there, and why risk literally random test failures? Instead, I'd ask what it is you're actually testing, because I personally wouldn't see the need for testing that Perl's rand is working correctly, that's the job of the Perl test suite. If you're testing that your code works properly given various inputs, then you can test that by mocking rand - or even write your code so that your PRNG is wrapped in a function, so if you've got a user who wants different sources of randomness, they can switch it out.
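A minimal sketch of the "wrap your PRNG in a function" idea, assuming a hypothetical roll_die helper (the name and signature are made up for illustration, they are not from any module in this thread). The random source is passed in as a code reference, so a test can replace it with a deterministic stand-in:

```perl
use strict;
use warnings;

# Hypothetical helper: roll one die, taking the random source as a code ref.
# The default simply wraps Perl's built-in rand.
sub roll_die {
    my ($sides, $rand) = @_;
    $rand ||= sub { rand($_[0]) };
    return 1 + int( $rand->($sides) );
}

# In a test, swap in a stand-in that replays fixed fractions in [0,1).
my @fractions = (0, 0.5, 0.999);
my $fake_rand = sub { my $sides = shift; return $sides * shift @fractions };

my @rolls = map { roll_die(6, $fake_rand) } 1 .. 3;
print "@rolls\n";    # prints "1 4 6"
```

With the stand-in, the test checks your own arithmetic (the 1 + int(...) mapping) and never depends on what rand happens to return on a given platform.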
|
Yup. You'll get an average of 3.5 after a large number of rolls (for some definition of 'large') .. but it's possible that you won't meet the (3.4,3.6) range based on a string of high or low runs in the random number generator, and your test will fail. At that point the user will go .. "Huh?" I'm not sure what their next step might be -- if they're familiar with module installation, they might try a make test, and see that it passes the next time, and continue on to do the install. And some people might ditch the installation completely.
I'd recommend you change that to a non-critical test.
Alex / talexb / Toronto
Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.
|
> but it's possible that you won't meet the (3.4,3.6) range based on a string of high or low runs in the random number generator, and your test will fail.
But that's exactly what the definition of a PRNG guarantees.
There *exists* a large number, such that the range is met in the run.
The only problematic part is the suggested number 10000, but my gut feeling says it's already big enough.
I'd say if the OP wants to be sure of that, then he should include it in his test-suite.
And if it's not critical enough to make the installation fail, he can still use it to emit a warning to the user.
|
"If this is for distribution to CPAN, then personally I would try and stay away from anything random - there are lots of different systems out there, and why risk literally random test failures?"
I wholeheartedly agree with this. Literal random failures.
The primary reason isn't that a user might run into random failures (all they have to do is re-run the installer and it will likely pass the next time). It's that I know for a fact that the maintainers of some of the Tester platforms spend time investigating why distributions fail, then spend more time contacting the author to explain what's broken, and oftentimes spend even more time sorting out and offering solutions.
I've been the recipient of dozens of said kind emails over the years. I would cringe to think I'd wasted their time after I uploaded tests that would be known to false-negative fail on me.
|
> there are lots of different systems out there, and why risk literally random test failures?
Not random test failures, but testing the implementation.
afoken linked to an article stating that
from 5.20 on, Perl brings its own PRNG and doesn't rely on the OS / C compiler anymore.
This also means there are "lots of different systems" out there which are less reliable.
And if this requirement is crucial for his app, he should abort installation and request another Perl version.
Re: is rand random enough to simulate dice rolls?
by afoken (Chancellor) on Jan 03, 2021 at 14:46 UTC
|
While the above seems to be ok on my platform, can I expect the same result on every platform? and for every perl version around the world?
At least old perls for Windows had a very limited PRNG, with an interval of only 2^14 (?). I've no idea how evenly distributed its output is/was. So, the quality of rand() differs from platform to platform and may also change with the perl version.
Should I repeat the check 100 times to have an average of averages?
Averaging averages won't help. After all, you just sum up n*m numbers and divide by n*m - either in n steps or all at once. Using the median of n averages might help.
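A quick sketch of that point, with illustrative batch sizes (100 batches of 1000 rolls, chosen arbitrarily): for equal-sized batches the average of the batch averages is the same number as the overall average, while the median of the batch averages is a different statistic.

```perl
use strict;
use warnings;
use List::Util qw(sum);

my ($n, $m) = (100, 1000);    # n batches of m rolls each (illustrative sizes)
my @batch_avg;
my $grand_total = 0;

for (1 .. $n) {
    my $total = 0;
    $total += 1 + int(rand(6)) for 1 .. $m;
    $grand_total += $total;
    push @batch_avg, $total / $m;
}

my $avg_of_avgs = sum(@batch_avg) / $n;
my $overall_avg = $grand_total / ($n * $m);

# Median of the batch averages (mean of the two middle values, since n is even here)
my @sorted = sort { $a <=> $b } @batch_avg;
my $median = ($sorted[$n/2 - 1] + $sorted[$n/2]) / 2;

printf "average of averages: %.5f\n", $avg_of_avgs;
printf "overall average:     %.5f\n", $overall_avg;
printf "median of averages:  %.5f\n", $median;
```

The first two lines should agree up to floating-point rounding; the median will usually differ slightly from both.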
Update:
See also Random numbers are not random enough on Windows and https://www.effectiveperlprogramming.com/2014/06/perl-5-20-uses-its-own-random-number-generator/.
Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
|
> At least old perls for Windows had a very limited PRNG, with an interval of only 2^14 (?).
Having an interval of 2**14 ~= 16_000 means the sequence repeats.
> I've no idea how evenly distributed its output is/was.
By definition it must hit the average 3.5 exactly after 2**14 runs.°
IMHO 10000 runs for a margin of ± 0.2 is a safe bet. ²
If this test fails, something must be done.
°) modulo rounding errors with Perl floats
²) that's more than 3% of the interval!
Re: is rand random enough to simulate dice rolls?
by jo37 (Deacon) on Jan 03, 2021 at 14:56 UTC
|
For random numbers that need not be cryptographically secure, but shall be "a bit better" than rand I usually use the Mersenne Twister.
It has a period of 2^19937 - 1 and can be seeded with up to 624 long integers.
Greetings, -jo
$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
Re: is rand random enough to simulate dice rolls?
by syphilis (Archbishop) on Jan 05, 2021 at 00:59 UTC
|
Hi,
According to section 5.4.4 of Handbook of Applied Cryptography (by Menezes, van Oorschot and Vanstone), FIPS 140-1 specified 4 statistical tests for randomness - namely "monobit", "runs", "long run", "poker".
I implemented those tests (undocumented and not exported) in Math::GMPz.
On Windows (whether randbits == 48 or randbits == 15), when I use rand() to generate the 20000-bit sequence that these tests require, I've always found the generated bit sequence to pass those tests.
Moreover, I can see no sign of any cycling - I take a 30-bit sequence from near the end of the string, and cannot find that same sub-string elsewhere within the 20000-bit sequence.
I've also added a fifth "autocorrelation" test from the same section of the book.
Here's the script I run:
use strict;
use warnings;
use Math::GMPz qw(:mpz);
use Test::More;
my $s;
# Create a string of 20000 bits
for(1..20000) {
  $s .= int(rand(2));
}
# Visually inspect the value
#open (my $fh, '>', 'val.txt') or warn "Open: $!";
#print $fh $s;
#close $fh;
# Vectorize that string into a
# Math::GMPz object
my $mpz = Math::GMPz->new($s, 2);
# Check that the no. of set bits is in
# the range 9655..10345
cmp_ok(Math::GMPz::Rmonobit($mpz), '==', 1,
'monobit test');
# Check that the longest run of the same bit is
# shorter than 34
cmp_ok(Math::GMPz::Rlong_run($mpz), '==', 1,
'long run test');
# Check that the numbers of 1-bit runs, 2-bit runs,
# 3-bit runs 4-bit runs, 5-bit runs and 6-bit runs
# (of both zeros and ones) meet expectations.
cmp_ok(Math::GMPz::Rruns($mpz), '==', 1,
'runs test');
# Check that the no. of occurrences of the various
# 4-bit sequences meets expectations.
cmp_ok(Math::GMPz::Rpoker($mpz), '==', 1,
'poker test');
# Check that the number of times that
# bit[pos] == bit[pos + 2] is in the
# range 9655..10345.
my @ret = Math::GMPz::autocorrelation($mpz, 2);
cmp_ok($ret[0], '>', 9654, 'autocorrelation count > lower limit');
cmp_ok($ret[0], '<', 10346, 'autocorrelation count < upper limit');
done_testing();
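For readers without Math::GMPz, here is a rough pure-Perl sketch of the simplest of these checks, the monobit count, using the 9655..10345 window quoted in the comments above. The monobit_ok function is a hypothetical stand-in written for illustration; it is not how Math::GMPz implements Rmonobit.

```perl
use strict;
use warnings;

# Count the set bits in a 20000-character string of 0s and 1s and check
# that the count lies in the 9655..10345 window.
sub monobit_ok {
    my ($bits) = @_;
    die "need exactly 20000 bits\n" unless length($bits) == 20_000;
    my $ones = ($bits =~ tr/1//);
    return ($ones >= 9655 && $ones <= 10345) ? 1 : 0;
}

my $s = join '', map { int(rand(2)) } 1 .. 20_000;
print "rand():      ", monobit_ok($s), "\n";
print "alternating: ", monobit_ok('10' x 10_000), "\n";   # exactly 10000 ones
print "all ones:    ", monobit_ok('1' x 20_000), "\n";    # far outside the window
```

Note that the alternating string passes the monobit check even though it is obviously not random, which is why the runs, long run, poker and autocorrelation tests are applied as well.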
Cheers, Rob
Re: is rand random enough to simulate dice rolls?
by shmem (Chancellor) on Jan 03, 2021 at 15:02 UTC
|
But I don't want to do a cryptographically secure application.
I just want to be sure that if I use rand to simulate a six sided die and I repeat the roll 10000 times I will get an average result of 3.5 as expected.
If that's the case, why bother? If the result is other than expected, just output that, a short remark about why this is bad, and let users of specific platforms scratch their own head :P
perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
|
my $tenk;
$tenk += 1 + int( rand(6) ) for 1..10000;
my $avg = $tenk/10000;
# $avg = 5; # uncomment this line to provoke the warning
if ( ($avg < 3.4) or ($avg > 3.6) ){
    diag("\n\n\nPROBLEM: you got an average of $avg while was expected a value > 3.4 and < 3.6\n\n\n".
         "The average was made on 10000 results.\n".
         "This can happen in old Perl distribution on some platform.\n".
         "In future distributions of this module you might be able to load a different random number generator\n\n\n\n")
}
else{
    ok ( $avg > 3.4, "average randomness ok (10000d6 / 10000 > 3.4)" );
    ok ( $avg < 3.6, "average randomness ok (10000d6 / 10000 < 3.6)" );
}
L*
UPDATE: nothing critical or crucial. I've just started a little project just to clean my rusty hands: it has been a long time (for me) without coding :)
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: is rand random enough to simulate dice rolls? (math definition)
by LanX (Saint) on Jan 03, 2021 at 15:08 UTC
|
The rand function is supposed to be a "Pseudorandom number generator".
And the requirements for PRNGs are mathematically defined.
One of them says that for every range epsilon ε around the expected average P(E) you can find a minimum set size N such that the actual average will always be within that range.
That's math-speak for "the bigger the closer".
And that's very similar to your test, with the exception that there is no guarantee about this set size 10000. Could be less, could be more.
From my tests it's very likely to hold for 10000 tho, but you can only be "sure"° after checking the implementation.
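As a back-of-the-envelope check on why 10000 is "very likely" enough, here is a sketch that assumes an ideal fair die with independent rolls (so it says nothing about the quality of any particular rand implementation). It computes how many standard errors the (3.4, 3.6) window spans for N = 10000:

```perl
use strict;
use warnings;

my $sides = 6;
my $n     = 10_000;
my $eps   = 0.1;                        # half-width of the (3.4, 3.6) window

my $mean     = ($sides + 1) / 2;        # 3.5 for a d6
my $variance = ($sides**2 - 1) / 12;    # 35/12 for a fair d6
my $std_err  = sqrt($variance / $n);    # standard deviation of the sample mean
my $z        = $eps / $std_err;         # window half-width in standard errors

printf "mean %.1f, standard error %.4f, window = +/- %.2f standard errors\n",
    $mean, $std_err, $z;
```

Under those assumptions the window is nearly six standard errors wide on each side, so a failure from chance alone would be extremely rare, though not impossible.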
HTH! :)
°) DISCLAIMER: security is relative when it comes to software and its implementation
Re: is rand random enough to simulate dice rolls?
by LanX (Saint) on Jan 03, 2021 at 16:43 UTC
|
> Should I repeat the check 100 times to have an average of averages?
Not an "average" of averages, this would just be like one test for 100*N with N=10000.
You could rather repeat the test 100 times for N tries and require that it never fails. (that's a much stronger condition)
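A minimal sketch of that stronger check, using the numbers from this thread (100 repetitions, N = 10000, window 3.4 to 3.6). The variable names are mine:

```perl
use strict;
use warnings;

my $runs  = 100;       # number of repetitions
my $n     = 10_000;    # rolls per repetition
my $fails = 0;

for my $run (1 .. $runs) {
    my $total = 0;
    $total += 1 + int(rand(6)) for 1 .. $n;
    my $avg = $total / $n;
    $fails++ if $avg <= 3.4 or $avg >= 3.6;
}

print "$fails of $runs runs fell outside (3.4, 3.6)\n";
```

Requiring $fails == 0 is the "never fails" condition; whether that belongs in an installation-blocking test is the judgment call discussed elsewhere in this thread.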
Again: N should be chosen to reflect a necessary condition for the correctness of your app.
Re: is rand random enough to simulate dice rolls?
by kcott (Archbishop) on Jan 04, 2021 at 00:08 UTC
|
G'day Discipulus,
About a decade ago, I wrote dice rolling code using almost exactly the same code as you show here.
It was for use with a table-top RPG; it handled dice with any number of sides.
Over the years, I added lots of features around the code (mostly ever-more fancy Tk interfaces)
but the core dice-rolling code was unchanged.
I've used this code with Perl versions ranging from 5.12 to 5.32.
Perl installations have included Perlbrew, Strawberry Perl and ActivePerl.
Platforms have included WinXP & Win10 (both +/- Cygwin) and various Mac OS X/macOS versions.
I don't possess any data on checking averages;
although, at different times, I have successfully run similar tests to those you show.
In real-world usage, I've never noticed any problems, such as persistently getting high or low rolls;
I appreciate that's not really any sort of strict or scientific check.
If your intended usage is similar to mine
— you only said "a little project just to clean my rusty hands" —
then I'd say rand() is sufficient.
If your use case differs, then I see a plethora of advice in other posts that might be more appropriate.
"... can I expect the same result on every platform? and for every perl version around the world?"
Given the number of platforms that can run Perl, and the number of Perl versions around,
I doubt you'd ever get a completely definitive answer to that question.
However, you could ask people to supply the output from the same code and at least get a representative answer.
Here's some suggested code (which I think should run on any version of Perl5):
$ perl -e '
    my $iters = 100_000;
    my $runs = 6;
    for my $sides (qw{1 2 3 4 6 8 10 20 100}) {
        printf "%-5s", "D$sides:";
        for (1 .. $runs) {
            my $tot = 0;
            for (1 .. $iters) {
                $tot += int(rand $sides)+1;
            }
            print " ", $tot/$iters;
        }
        print "\n";
    }
'
As a one-liner, for a quick copy-and-paste
(although, given use of the [download] link, copying the original might be just as quick):
perl -e 'my $iters = 100_000; my $runs = 6; for my $sides (qw{1 2 3 4 6 8 10 20 100}) { printf "%-5s", "D$sides:"; for (1 .. $runs) { my $tot = 0; for (1 .. $iters) { $tot += int(rand $sides)+1; } print " ", $tot/$iters; } print "\n"; } '
Here's sample output using my Win10/Cygwin/Perlbrew/Perl 5.32.0:
D1: 1 1 1 1 1 1
D2: 1.49943 1.5049 1.49996 1.49833 1.50001 1.50155
D3: 2.00469 1.99941 1.99753 1.99721 1.99762 2.00254
D4: 2.49589 2.49891 2.49487 2.50727 2.49977 2.49472
D6: 3.49716 3.49463 3.49886 3.50365 3.49936 3.50401
D8: 4.50082 4.51098 4.50972 4.49368 4.50158 4.5112
D10: 5.4949 5.48275 5.50242 5.50131 5.49563 5.51278
D20: 10.47763 10.47848 10.50873 10.48685 10.48474 10.51209
D100: 50.36542 50.49534 50.49152 50.40469 50.53791 50.34646
If you could get a few people to post results for a variety of
platforms, installations and Perl versions that they might have available,
you may get something approaching a reasonable answer to your "every platform ... every version" question.
And, just in case those three nested loops resulting in 5,400,000 calls to rand() is of concern,
the whole run only took about one second on my computer.
I have a reasonably high-end rig: YMMV but hopefully not too much; I don't think this will tie up anyone's computer for hours.
|
canis [shmem] /home/shmem > time perl -e 'my $iters = 100_000; my $runs = 6; for my $sides (qw{1 2 3 4 6 8 10 20 100}) { printf "%-5s", "D$sides:"; for (1 .. $runs) { my $tot = 0; for (1 .. $iters) { $tot += int(rand $sides)+1; } print " ", $tot/$iters; } print "\n"; } '
D1: 1 1 1 1 1 1
D2: 1.5017 1.50273 1.49927 1.50004 1.50069 1.50234
D3: 2.00005 2.0013 1.99667 2.00228 2.00141 2.0018
D4: 2.49651 2.49254 2.50251 2.5008 2.49389 2.50032
D6: 3.4947 3.49803 3.4938 3.50231 3.50249 3.49613
D8: 4.50561 4.51195 4.51012 4.50395 4.4961 4.50096
D10: 5.52471 5.50722 5.50845 5.4933 5.4983 5.49319
D20: 10.51722 10.48846 10.52023 10.50762 10.48594 10.52206
D100: 50.43492 50.46292 50.4675 50.66063 50.35462 50.46591
293.510u 0.450s 4:54.97 99.6% 0+1599k 4+7io 1pf+0w
canis [shmem] /home/shmem > uname -a
SunOS canis 4.1.4 5 sun4m
canis [shmem] /home/shmem > perl -v
This is perl, version 5.004
Copyright 1987-1997, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5.0 source kit.
SPARCstation voyager, cpu = SUNW,S240. Sorry, no PDP-11 available, and the Atari is out of order.
Update: looks like perl4 is significantly slower. But the numbers look right.
canis [shmem] /home/shmem > time perl4 -e '$iters = 100_000; $runs = 6; for $sides (1,2,3,4,6,8,10,20,100) { printf "%-5s", "D$sides:"; for (1 .. $runs) { $tot = 0; for (1 .. $iters) { $tot += int(rand $sides)+1; } print " ", $tot/$iters; } print "\n"; }'
D1: 1 1 1 1 1 1
D2: 1.4976199999999999513 1.4995000000000000551 1.5008399999999999519 1.5009200000000000319 1.4983500000000000707 1.4992000000000000881
D3: 1.9988500000000000156 2.0010099999999999554 1.9973600000000000243 1.9947699999999999321 2.0022500000000000853 1.9987600000000000922
D4: 2.4961099999999998289 2.5017000000000000348 2.5014599999999997948 2.4986999999999999211 2.4970900000000000318 2.5005099999999997884
D6: 3.50016000000000016 3.4905300000000001326 3.5027200000000000557 3.4986999999999999211 3.4940500000000001002 3.4981499999999998707
D8: 4.5008200000000000429 4.4963499999999996248 4.499399999999999622 4.505130000000000301 4.500880000000000436 4.4966600000000003234
D10: 5.5046299999999996899 5.4901400000000002422 5.4932299999999996132 5.4982899999999998997 5.4878000000000000114 5.481379999999999697
D20: 10.478590000000000515 10.511730000000000018 10.487120000000000886 10.511749999999999261 10.520690000000000097 10.513320000000000221
D100: 50.332399999999999807 50.534730000000003258 50.687139999999999418 50.558239999999997849 50.449489999999997281 50.479340000000000543
473.690u 154.350s 26:25.22 39.6% 0+4164k 0+7io 39159pf+0w
canis [shmem] /home/shmem > perl4 -v
This is perl, version 4.0
$RCSfile: perl.c,v $$Revision: 4.0.1.4 $$Date: 91/06/10 01:23:07 $
Patch level: 10
Copyright (c) 1989, 1990, 1991, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 4.0 source kit.
perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
|
D1: 1 1 1 1 1 1
D2: 1.50264 1.50089 1.49874 1.4986 1.49999 1.50096
D3: 1.99666 2.00076 1.99974 2.00305 1.99855 2.00081
D4: 2.49717 2.50553 2.49876 2.50477 2.50204 2.50298
D6: 3.49901 3.50656 3.49592 3.49852 3.51048 3.49704
D8: 4.49035 4.50043 4.48853 4.49734 4.49567 4.49206
D10: 5.50564 5.48776 5.50685 5.50959 5.50021 5.50671
D20: 10.50074 10.48721 10.49889 10.50993 10.48554 10.52566
D100: 50.45232 50.43541 50.54345 50.61081 50.64355 50.45597
Perl v5.30.1 MSWin32n
I added:
print "Perl $^V\ $^On";
As pointed out by the sharp-eyed shmem, kcott and NetWallah, I fat-fingered the \ for the \n. Thanks guys. I'm leaving the erroneous version in place to keep myself humble.
Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
|
perl -e "my $iters = 100_000; my $runs = 6; for my $sides (qw{1 2 3 4 6 8 10 20 100}) { printf '%-5s', qq(D$sides:); for (1 ..$runs) { my $tot = 0; for (1 .. $iters) { $tot += int(rand $sides)+1; } print ' ', $tot/$iters; } print qq(\n); } print qq(Perl $^V $^O\n)"
And here are my test results, all tests done on Win10 with various Strawberry Perl portable versions:
L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
|
I saw this post while looking up old PDL nodes, and you mentioned PDL there. PDL as of 2.062 uses a "better" (and thread-safe, for potential massive performance gainz) pseudo-random number generator, xoroshiro256plus (see link for thorough statistical analysis of it and competing algorithms) rather than the often-weak and usually non-thread-safe system rand() that Perl uses.
Re: is rand random enough to simulate dice rolls? -- adopted solution
by Discipulus (Canon) on Jan 11, 2021 at 07:45 UTC
|
Hello monks and nuns,
Following your advice and what I received on the #perl IRC channel, I both made the possibly failing test non-critical (using SKIP) and let users provide their own custom rand function. The relevant parts are:
# ./Games/Dice/Roller.pm
sub new{
    my $class = shift;
    my %opts = @_;
    if ( defined $opts{sub_rand} ){
        croak "sub_rand must be a code reference meant to replace core rand function"
            unless ref $opts{sub_rand} eq 'CODE';
    }
    return bless {
        sub_rand => $opts{sub_rand} // sub{ rand($_[0]) },
    }, $class;
}
sub single_die{
    my $self = shift;
    my $sides = shift;
    croak "single_die expect one argument" unless $sides;
    croak "Invalid side [$sides]" unless $sides =~/^(\d+)$/;
    $sides = $1;
    return 1 + int( $self->{sub_rand}($sides) );
}
# ./t/01-single-die.t
my $tenk;
$tenk += $dice->single_die(6) for 1..10000;
my $avg = $tenk/10000;
# $avg = 5; # uncomment this line to provoke the warning
if ( ($avg < 3.4) or ($avg > 3.6) ){
    diag(
        "\n\n\nPROBLEM: you got an average of $avg while was expected a value > 3.4 and < 3.6\n\n\n".
        "The average was made on 10000d6 rolls.\n".
        "This can happen in old Perl distribution on some platform.\n".
        "You can use sub_rand => sub{.. during constructor to provide an\n".
        "alternative to core rand function (using rand from Math::Random::MT for example).\n\n\n\n"
    );
}
else {
    pass "average randomness ok (3.4 < 10000d6 / 10000 < 3.6)" ;
}
# ./t/07-rand-custom-funcion.t
SKIP: {
    eval { require Math::Random::MT };
    skip "Math::Random::MT not installed", 2 if $@;
    my $gen = Math::Random::MT->new();
    my $mt_dicer = Games::Dice::Roller->new(
        sub_rand => sub{
            my $sides = shift;
            return $gen->rand( $sides );
        },
    );
    my ($res, $descr) = $mt_dicer->roll('13d4kh7');
    ok( $res >= 7, "successfully used rand from Math::Random::MT as random number generator");
    ok( $res <= 28, "successfully used rand from Math::Random::MT as random number generator");
}
The home of this toy project (for hippo's happiness ;) is on GitLab
Thanks to you all
L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: is rand random enough to simulate dice rolls?
by bliako (Monsignor) on Jan 04, 2021 at 21:21 UTC
|
In choosing a PRNG I would also consider the period (as others pointed out) as well as the CPU burden. And what about the "randomness" of any subset of those 10,000 rolls? That is, you don't want a PRNG which first outputs 10,000/6 1's, followed by 10,000/6 2's etc. The average will still be 3.5 but there is no randomness there.
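To make that concrete, here is a small sketch (with 6000 rolls instead of 10,000 so the count divides evenly by 6; the changes helper is my own illustrative name). A sorted sequence has a perfect mean, yet almost never changes value from one roll to the next:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Count positions where a roll differs from the one before it.
sub changes {
    my @r = @_;
    return scalar grep { $r[$_] != $r[$_ - 1] } 1 .. $#r;
}

my $n = 6000;                            # divisible by 6 for a clean example
my @sorted;
push @sorted, ($_) x ($n / 6) for 1 .. 6;    # 1000 ones, then 1000 twos, ...
my @random = map { 1 + int(rand(6)) } 1 .. $n;

printf "sorted: mean %.3f, changes %d\n", sum(@sorted) / $n, changes(@sorted);
printf "rand(): mean %.3f, changes %d (about %d expected)\n",
    sum(@random) / $n, changes(@random), ($n - 1) * 5 / 6;
```

The sorted sequence changes value only 5 times, whereas independent fair rolls should differ from their predecessor about 5 times out of 6. An average-only test cannot tell the two apart.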
Random tests are tricky as they may fail randomly while staying well within their design margins. But consider also how you are going to seed the PRNG for each repeated run. You may leave that to Perl and its "semi-random" way of choosing a seed for you, but is that good enough with repeated runs within the same script or from different ones?
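On the seeding question, a minimal sketch of what an explicit srand does (the seed 42 and the rolls helper are arbitrary choices for illustration). Within one perl, re-seeding with the same value replays the same sequence; if you never call srand, Perl picks a seed itself the first time rand is used.

```perl
use strict;
use warnings;

# Ten d6 rolls joined into one string for easy comparison.
sub rolls { return join ' ', map { 1 + int(rand(6)) } 1 .. 10 }

srand(42);                 # fixed seed: the sequence that follows is repeatable
my $first = rolls();
srand(42);                 # re-seeding with the same value restarts the same sequence
my $second = rolls();

print "first:  $first\n";
print "second: $second\n";
print "identical\n" if $first eq $second;
```

A fixed seed makes a test repeatable on one installation, but the actual numbers may differ between Perl versions or platforms, so it is safer not to hard-code expected rolls.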
Programmers have it easy: I don't know how real dice are tested but it must be hard work!
By definition "real" (dice) means imperfect geometry, painting, material density. If you want to emulate a real die then there's a lot more work than using a fair PRNG.
bw, bliako