Re^3: Split range 0 to M into N non-overlapping (roughly equal) ranges.

in reply to Re^2: Split range 0 to M into N non-overlapping (roughly equal) ranges.
in thread Split range 0 to M into N non-overlapping (roughly equal) ranges.

Good catch! As originally coded, the error rate across all M/N combinations (< 2**32), seems to come out at ~ 1 in 15 (6.66%).

#! perl -slw
use strict;
use List::Util qw[ min ];
use Math::Random::MT qw[ rand ];

$|++;

sub check {
    my( $m, $n ) = @_;
    my $step = ( $m +1 ) / $n;
    my $f = int( $n * $step ) -1;
    return if  $f != $m;
    return 1;
}

my $trials = 0;
my $fails = 0;
for ( 1 .. 1e6 ) {
    my $m = int( rand 2**32 );
    for ( 1 .. min( $m, 1000 ) ) {
        ++$trials;
        my $n = 1+int( rand $m );
        check( $m, $n ) or ++$fails;
    }
    printf "\r$_ : %f%%", $fails *100 / $trials;
}
__END__
C:\test>ranges
76977645 : 6.620262%
[download]

However, a simple ~~fudge~~ floating point rounding correction factor of 0.000001 seems to sort things out nicely:

#! perl -slw
use strict;
use List::Util qw[ min ];
use Math::Random::MT qw[ rand ];

$|++;

sub check {
    my( $m, $n ) = @_;
    my $step = ( $m +1.000001 ) / $n;
    my $f = int( $n * $step ) -1;
#    warn( "m:$m n:$n f:$f\n" )
    return if  $f != $m;
    return 1;
}

my $trials = 0;
my $fails = 0;
for ( 1 .. 1e6 ) {
    my $m = int( rand 2**32 );
    for ( 1 .. min( $m, 1000 ) ) {
        ++$trials;
        my $n = 1+int( rand $m );
        check( $m, $n ) or ++$fails;
    }
    printf "\r$_ : %f%%", $fails *100 / $trials;
}


__END__
 C:\test>ranges
6783635 : 0.000000%
[download]

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

Comment on Re^3: Split range 0 to M into N non-overlapping (roughly equal) ranges. Select or Download Code

Replies are listed 'Best First'.
Re^4: Split range 0 to M into N non-overlapping (roughly equal) ranges. by mr_mischief (Monsignor) on Mar 16, 2011 at 04:50 UTC
For the record, although developed independently my version at Re: Split range 0 to M into N non-overlapping (roughly equal) ranges. has the same sort of bug, triggered by at least one of the same input pairs. I guess when two people hit the same flaw, we at least know we're close to the right solution. Since the fix is already discussed for JavaFan's map version, I'm sure the translation of that fix to the for loop can be left as an exercise. Personally, my fix would probably be to pull in bignum, which works well since it's a rounding error. Broken: Read more... (4 kB) Fixed: [chris@pineapple ranges-892828]$ perl -Mbignum ranges -M=14 -N=11 0 : from 0 to 0 ( 1) 1 : from 1 to 1 ( 1) 2 : from 2 to 3 ( 2) 3 : from 4 to 4 ( 1) 4 : from 5 to 5 ( 1) 5 : from 6 to 7 ( 2) 6 : from 8 to 8 ( 1) 7 : from 9 to 9 ( 1) 8 : from 10 to 11 ( 2) 9 : from 12 to 12 ( 1) 10 : from 13 to 14 ( 2) [chris@pineapple ranges-892828]$ perl -Mbignum ranges -M=13 -N=12 0 : from 0 to 0 ( 1) 1 : from 1 to 1 ( 1) 2 : from 2 to 2 ( 1) 3 : from 3 to 3 ( 1) 4 : from 4 to 4 ( 1) 5 : from 5 to 6 ( 2) 6 : from 7 to 7 ( 1) 7 : from 8 to 8 ( 1) 8 : from 9 to 9 ( 1) 9 : from 10 to 10 ( 1) 10 : from 11 to 11 ( 1) 11 : from 12 to 13 ( 2) [download] It's maybe slower than fudging, but I have more faith in it. It's easy enough to add a line at the top of the program to let someone else deal with the intricacies since this isn't going to be the resource-hungry part of your code anyway.	[reply] [d/l] [select]
Re^5: Split range 0 to M into N non-overlapping (roughly equal) ranges. by BrowserUk (Patriarch) on Mar 16, 2011 at 05:15 UTC
It's maybe slower than fudging, but I have more faith in it. Having called it "fudging", there is actually very sound math sitting behind it. I'm not competent to explain the math, but I can explain my testing. I've run all permutations of M & N from 1 through 10,000; and 500 million random selections of M & N = 1 .. 2**32 without seeing a single error. Given that in the original form, there was a 1 in 15 error rate, the odds that if there were any errors with the "fudged" algorithm, that I have not detected 1 of them after 600 million trials, are so vanishingly small as to be ... well, very, very unlikely. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply]

In Section Seekers of Perl Wisdom