Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot
 
PerlMonks  

Re: Split range 0 to M into N non-overlapping (roughly equal) ranges.

by JavaFan (Canon)
on Mar 12, 2011 at 19:41 UTC ( #892857=note: print w/replies, xml ) Need Help??


in reply to Split range 0 to M into N non-overlapping (roughly equal) ranges.

Without additional variables, or special casing the last range:
my @ranges = map {[int($_*($M+1)/$N), int(($_+1)*(($M+1)/$N))-1]} 0..$ +N - 1;
All the ranges should differ at most 1 in size.
  • Comment on Re: Split range 0 to M into N non-overlapping (roughly equal) ranges.
  • Download Code

Replies are listed 'Best First'.
Re^2: Split range 0 to M into N non-overlapping (roughly equal) ranges.
by BrowserUk (Pope) on Mar 12, 2011 at 21:03 UTC

    With the addition of a little whitespace and factoring out the step size as that is often required for other parts of the code, that's the one I think:

    #! perl -slw use strict; our $N //= 2; our $M //= 1e6; my $STEP = ( $M + 1 ) / $N; my @ranges = map [ int( $_ * $STEP ), int( ( $_+1 ) * $STEP ) -1 ], 0 .. $N - 1; printf "%2d : from %7d to %7d (%7d)\n", $_, @{ $ranges[ $_ ] }, $ranges[ $_ ][ 1 ] - $ranges[ $_ ][ 0 ] + 1 for 0 .. $#ranges;

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: Split range 0 to M into N non-overlapping (roughly equal) ranges.
by ELISHEVA (Prior) on Mar 13, 2011 at 05:22 UTC

    FYI, there are occasional one-off rounding error issues with this algorithm (possibly platform dependent), 2.5% of the time partitioning ranges of 1..100 or less. Depending on your goals that may or may not matter, but here are some examples :

    $M,$N final array element (should be $M) (14,11) ==> final=13 (14,13) ==> final=13 (29,11) ==> final=28 (29,13) ==> final=28 (29,22) ==> final=28 (29,26) ==> final=28 (59,11) ==> final=58 (59,13) ==> final=58 (59,22) ==> final=58 (59,26) ==> final=58 (59,44) ==> final=58 (59,52) ==> final=58 (59,55) ==> final=58 (100,49) ==> final=99 (100,98) ==> final=99 (100,99) ==> final=99

    Test script

    Update added test script and added parenthetical remark about platform dependency.

      Good catch! As originally coded, the error rate across all M/N combinations (< 2**32), seems to come out at ~ 1 in 15 (6.66%).

      #! perl -slw use strict; use List::Util qw[ min ]; use Math::Random::MT qw[ rand ]; $|++; sub check { my( $m, $n ) = @_; my $step = ( $m +1 ) / $n; my $f = int( $n * $step ) -1; return if $f != $m; return 1; } my $trials = 0; my $fails = 0; for ( 1 .. 1e6 ) { my $m = int( rand 2**32 ); for ( 1 .. min( $m, 1000 ) ) { ++$trials; my $n = 1+int( rand $m ); check( $m, $n ) or ++$fails; } printf "\r$_ : %f%%", $fails *100 / $trials; } __END__ C:\test>ranges 76977645 : 6.620262%

      However, a simple fudge floating point rounding correction factor of 0.000001 seems to sort things out nicely:

      #! perl -slw use strict; use List::Util qw[ min ]; use Math::Random::MT qw[ rand ]; $|++; sub check { my( $m, $n ) = @_; my $step = ( $m +1.000001 ) / $n; my $f = int( $n * $step ) -1; # warn( "m:$m n:$n f:$f\n" ) return if $f != $m; return 1; } my $trials = 0; my $fails = 0; for ( 1 .. 1e6 ) { my $m = int( rand 2**32 ); for ( 1 .. min( $m, 1000 ) ) { ++$trials; my $n = 1+int( rand $m ); check( $m, $n ) or ++$fails; } printf "\r$_ : %f%%", $fails *100 / $trials; } __END__ C:\test>ranges 6783635 : 0.000000%

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        For the record, although developed independently my version at Re: Split range 0 to M into N non-overlapping (roughly equal) ranges. has the same sort of bug, triggered by at least one of the same input pairs. I guess when two people hit the same flaw, we at least know we're close to the right solution.

        Since the fix is already discussed for JavaFan's map version, I'm sure the translation of that fix to the for loop can be left as an exercise. Personally, my fix would probably be to pull in bignum, which works well since it's a rounding error.

        Broken:

        Fixed:

        [chris@pineapple ranges-892828]$ perl -Mbignum ranges -M=14 -N=11 0 : from 0 to 0 ( 1) 1 : from 1 to 1 ( 1) 2 : from 2 to 3 ( 2) 3 : from 4 to 4 ( 1) 4 : from 5 to 5 ( 1) 5 : from 6 to 7 ( 2) 6 : from 8 to 8 ( 1) 7 : from 9 to 9 ( 1) 8 : from 10 to 11 ( 2) 9 : from 12 to 12 ( 1) 10 : from 13 to 14 ( 2) [chris@pineapple ranges-892828]$ perl -Mbignum ranges -M=13 -N=12 0 : from 0 to 0 ( 1) 1 : from 1 to 1 ( 1) 2 : from 2 to 2 ( 1) 3 : from 3 to 3 ( 1) 4 : from 4 to 4 ( 1) 5 : from 5 to 6 ( 2) 6 : from 7 to 7 ( 1) 7 : from 8 to 8 ( 1) 8 : from 9 to 9 ( 1) 9 : from 10 to 10 ( 1) 10 : from 11 to 11 ( 1) 11 : from 12 to 13 ( 2)

        It's maybe slower than fudging, but I have more faith in it. It's easy enough to add a line at the top of the program to let someone else deal with the intricacies since this isn't going to be the resource-hungry part of your code anyway.

      Yeah, they are rounding errors. Rearranging the calculation fixes it (at least, on my platform):
      sub ranges_javafan { my ($M, $N) = @_; return [ map {[int($_/$N*($M+1)) , int((($_+1)/$N)*($M+1))-1] } 0..$N - 1]; }
      The most important is the final ($_+1)/$N*($M+1) - that should return $M+1 for $_ == $N-1. The impact of other rounding errors is that a range is 1 shorter than expected (but making the next one one larger - it shouldn't skip numbers).

      What?

      [ 6:16:11.35] C:\test>ranges -M=14 -N=11 0 : from 0 to 0 ( 1) 1 : from 1 to 1 ( 1) 2 : from 2 to 3 ( 2) 3 : from 4 to 4 ( 1) 4 : from 5 to 5 ( 1) 5 : from 6 to 7 ( 2) 6 : from 8 to 8 ( 1) 7 : from 9 to 9 ( 1) 8 : from 10 to 11 ( 2) 9 : from 12 to 12 ( 1) 10 : from 13 to 13 ( 1) ======================================== 14 [ 6:16:42.76] C:\test>ranges -M=14 -N=13 0 : from 0 to 0 ( 1) 1 : from 1 to 1 ( 1) 2 : from 2 to 2 ( 1) 3 : from 3 to 3 ( 1) 4 : from 4 to 4 ( 1) 5 : from 5 to 5 ( 1) 6 : from 6 to 7 ( 2) 7 : from 8 to 8 ( 1) 8 : from 9 to 9 ( 1) 9 : from 10 to 10 ( 1) 10 : from 11 to 11 ( 1) 11 : from 12 to 12 ( 1) 12 : from 13 to 13 ( 1) ======================================== 14 [ 6:17:01.29] C:\test>ranges -M=29 -N=11 0 : from 0 to 1 ( 2) 1 : from 2 to 4 ( 3) 2 : from 5 to 7 ( 3) 3 : from 8 to 9 ( 2) 4 : from 10 to 12 ( 3) 5 : from 13 to 15 ( 3) 6 : from 16 to 18 ( 3) 7 : from 19 to 20 ( 2) 8 : from 21 to 23 ( 3) 9 : from 24 to 26 ( 3) 10 : from 27 to 28 ( 2) ======================================== 29 [ 6:17:30.04] C:\test>ranges -M=100 -N=49 0 : from 0 to 1 ( 2) 1 : from 2 to 3 ( 2) 2 : from 4 to 5 ( 2) 3 : from 6 to 7 ( 2) 4 : from 8 to 9 ( 2) 5 : from 10 to 11 ( 2) 6 : from 12 to 13 ( 2) 7 : from 14 to 15 ( 2) 8 : from 16 to 17 ( 2) 9 : from 18 to 19 ( 2) 10 : from 20 to 21 ( 2) 11 : from 22 to 23 ( 2) 12 : from 24 to 25 ( 2) 13 : from 26 to 27 ( 2) 14 : from 28 to 29 ( 2) 15 : from 30 to 31 ( 2) 16 : from 32 to 34 ( 3) 17 : from 35 to 36 ( 2) 18 : from 37 to 38 ( 2) 19 : from 39 to 40 ( 2) 20 : from 41 to 42 ( 2) 21 : from 43 to 44 ( 2) 22 : from 45 to 46 ( 2) 23 : from 47 to 48 ( 2) 24 : from 49 to 50 ( 2) 25 : from 51 to 52 ( 2) 26 : from 53 to 54 ( 2) 27 : from 55 to 56 ( 2) 28 : from 57 to 58 ( 2) 29 : from 59 to 60 ( 2) 30 : from 61 to 62 ( 2) 31 : from 63 to 64 ( 2) 32 : from 65 to 67 ( 3) 33 : from 68 to 69 ( 2) 34 : from 70 to 71 ( 2) 35 : from 72 to 73 ( 2) 36 : from 74 to 75 ( 2) 37 : from 76 to 77 ( 2) 38 : from 78 to 79 ( 2) 39 : from 80 to 81 ( 2) 40 : from 82 to 83 ( 2) 41 : from 84 to 85 ( 2) 42 : from 86 to 87 ( 2) 43 : from 88 to 89 ( 2) 44 : from 90 to 91 ( 2) 45 : from 92 to 93 ( 2) 46 : from 94 to 95 ( 2) 47 : from 96 to 97 ( 2) 48 : from 98 to 99 ( 2) ======================================== 100

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        I'm seeing the same numbers as you. But what is your output for: ranges -M=20 -N=3 ?

        I get: 0-6, 7-13, 14-20.

        I expect the last index to be 19.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://892857]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (6)
As of 2021-10-22 10:50 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    My first memorable Perl project was:







    Results (85 votes). Check out past polls.

    Notices?