http://qs321.pair.com?node_id=272873

blokhead has asked for the wisdom of the Perl Monks concerning the following question:

I'm performing a lengthy computation that uses a lot of calls to rand, and I want to be able to pause the computation and resume it again later (using Storable or some other serialization). The process starts with a random seed that I provide, so that I can reproduce interesting results later. I want the results to be the same for a given initial random seed regardless of whether I paused/resumed the computation, so I need to somehow save a random seed in my serialization.

The only reasonable* solution I could think of was something along these lines:

## either load the saved random seed and state data from the ## serialization (resume a paused compuatation), or start with ## new state data and the given random seed (start new ## compuation) my ($foo, $bar, $random_seed) = deserialize('saved') || initial_values(); while (1) { srand($random_seed); ## some crunching on $foo, $bar, using lots of rand() ## decide what the next iteration's seed will be $random_seed = int rand( 2 << 31 ); if (check_for_user_interrupt()) { ## save state, along with the seed for next iteration serialize('saved', $foo, $bar, $random_seed); exit; } }
I use rand to generate a new seed for the next iteration, so it can be saved after any iteration. As soon as I wrote srand inside my while loop, red flags popped up in my head. The srand perldoc says:
Do not call srand() multiple times in your program unless you know exactly what you're doing and why you're doing it.
... but I wonder if I really know exactly what I'm doing ;). I found at least one post regarding multiple srand calls, but in that case the poster was using something like time ^ $$ to re-seed within a very fast loop, and expecting different random output for each loop.

What I would really like to know is in what situations is ok to call srand multiple times? Is using a call to rand for a seed considered bad? I can't use any external entropy to re-seed, because I want all the results to be based on the original random seed supplied at the beginning of the computation. Any thoughts on the serialization of randomly-generated state data like this?

*: In the worst case I could always wrap rand, save the original random seed, and simply save the number of times rand was called. Then to resume again, I just reseed, and call rand in void context a whole slug of times. I'd rather not do this!

blokhead

Replies are listed 'Best First'.
Re: Can I seed srand() with results from rand()?
by BrowserUk (Patriarch) on Jul 10, 2003 at 02:03 UTC

    My take on the "don't call srand multiple times" stuff is that doing so is likely to compromise the randomness of the sequences returned by rand rather than improve them as might be naively assumed.

    In this case, you specifically want to reproduce the same sequences, and you only intend to call srand once for each sequence, so I would say that you do know what you are doing.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


Re: Can I seed srand() with results from rand()?
by no_slogan (Deacon) on Jul 10, 2003 at 03:35 UTC

    If rand() is a linear congruential generator (usually the case) with modulus $M, then this

    srand( rand($M) )

    is likely to be almost equivalent to calling rand() once or twice. Not quite, because there might be extra magic in srand(), and some rand()s only return 15 bits of randomness (even though they're seeded with 31). You might not care about having your entropy reduced to that, so go ahead with your plan. Otherwise, look into an alternate PRNG that gives you access to its internal state, like Math::Random.

    What you want to avoid is using a seed with an obvious pattern in a loop. This is bad:

    srand( $n += 42 )

    And keep in mind that more things have obvious patterns than you might think.

      I should have been less cavalier. The generation of random numbers is too important to be left to chance, as they say. One of the desirable features of PRNGs is that they have a provably long period before they start repeating themselves. By reseeding this way, you lose that. After a large number of iterations, you might end up in a short limit cycle of possible values for the next seed. That will depend on the details of your rand() and exactly how many times you call it per loop. So I'm afraid I have to take back my advice. Don't reseed this way, epecially if you're going to do a lot of loops.

      You could use rand() with the initial seed to generate all the other seeds you need. You'd either have to know the number of loops ahead of time, or keep winding rand() forward from the initial seed.

      The best solution would be to use a different random number generator that lets you get at the current seed value. There are several available. They're also easy to write -- just don't try to pick the constants yourself.

        You could use rand() with the initial seed to generate all the other seeds you need.

        ... but that's a terrible idea if your rand() doesn't hide some bits from you. Gack. Stay away.

Re: Can I seed srand() with results from rand()?
by clintp (Curate) on Jul 10, 2003 at 02:47 UTC
    When I'm debugging something that I want to have pseudorandom behavior in real life, I use srand() with a constant to make the results predictable during debugging. Once I'm satisfied that things are working properly, I chuck the srand() completely (5.6+) and let things go.
Re: Can I seed srand() with results from rand()?
by Notromda (Pilgrim) on Jul 10, 2003 at 03:14 UTC
    I think that warning has as its intended audience those who don't really understand the significant difference between random and psuedo-random. Obviously the behaviour that you are looking for is pseudo-random, and I think you are all right to call it multiple times. But then again, I am not an expert. :P
Re: Can I seed srand() with results from rand()?
by mp (Deacon) on Jul 10, 2003 at 15:20 UTC
    This thread and its conclusion both dealing with a similar problem of producing repeatable pseudorandom sequences might be helpful to you. I used Math::Random::MT to generate random numbers and seeded it deterministically to get repeatable sequences without touching srand(), since srand()'s seed may have been depended on in other unrelated parts of the application (where less deterministic behavior was needed).

    Math::Random::MT keeps the seed state in an object instance, so it makes for easy construction of multiple deterministic pseudorandom sequence generators.

Re: Can I seed srand() with results from rand()?
by bunnyman (Hermit) on Jul 10, 2003 at 14:00 UTC
    The process starts with a random seed that I provide, so that I can reproduce interesting results later.

    The code you provided will make it hard for you to reproduce the results later.

    The only thing that is predictable about rand() is that given the exact same input to srand(), you will get the exact same outputs from rand() every time you do it.

    So:
    srand(42) -> 78, 102, 4, 19, ...
    srand(55) -> 8, 2, 66, 243, ...

    There isn't anything wrong with calling srand() more than once, except that doing so will not make the output more random, it may even make it less random.

    To reproduce the same outputs from rand(), you will have to call srand() with the same inputs in the same order every time, so this is a new part of state that you must save. On top of that, if the implementation of rand() changes in different versions of perl, you will not reproduce the same results on different versions.

Re: Can I seed srand() with results from rand()?
by thor (Priest) on Jul 10, 2003 at 12:11 UTC
    While I think that all of the warnings about this scheme eliminating the randomness of rand(), I think that they might be somewhat unwarranted. He is looking for predictability. Also, calling garden variety rand() does not reduce the entropy of the system because garden variety rand() is purely deterministic. So, nothing to worry about there, either.

    thor

      > He is looking for predictability.

      There's predictability and there's predictibility. It tends to be bad for simulations if your "random" numbers start being the same on every pass through the main loop.

      > Also, calling garden variety rand() does not reduce the entropy of the system because garden variety rand() is purely deterministic.

      Entropy measures the number of possible states of a system. Determinism is about the transitions between states. The two concepts are largely orthogonal.

        There's predictability and there's predictibility. It tends to be bad for simulations if your "random" numbers start being the same on every pass through the main loop.
        It all depends on the application. Perhaps you are looking for something in a large set with a certain property. So, rather than iterate through, you pick a random starting point. IIRC, there are non-deterministic primality tests that take a random input and tell you whether a number is composite or not. In this case, you want to keep the one random input that tells you that the number under examination is composite if you want to prove it. If I can find an example of such an algorithm later, I'll update this node.
        Entropy measures the number of possible states of a system. Determinism is about the transitions between states. The two concepts are largely orthogonal.
        Yes. However, this does not contradict anything that I said. Some people were concerned about repeated calls to rand() reducing the strength of randomness of data received from places like /dev/rand and such. I was saying that this was not the case, as perl uses a PRNG, which given a specific input is completely deterministic.

        thor

        Update: Check this out.

Caveat
by jbeninger (Monk) on Jul 10, 2003 at 14:45 UTC
    I'm pretty sure (not positive, so someone please correct me if I'm wrong) that rand() has system-specific behaviour, and that if you seed it on a Windows machine, you could get a different sequence from the exact same seed on a BSD machine. This is fine if it's in-house, but if you ever want to be able to say "hey - found a cool sequence - try out seed xxxxx", they might not get the same sequence.

    I'm not sure if there's a pure-perl PNRG on CPAN somewhere, but when I came across the problem in a C project, I ended up writing my own. It's a relatively simple algorithm, and a quick google shows up a number of results.

Re: Can I seed srand() with results from rand()?
by RollyGuy (Chaplain) on Jul 10, 2003 at 14:21 UTC
    The rand that perl uses is a PRNG (pseudo random number generator). This loosely means that it is *mostly* random allowing for some predictability. This is where the seed comes in. The seed set up the PRNG in such a way as to produce a set of randoms based on that seed. So, if you were to use a constant seed in a program, then every time the program runs, it will choose the same random values. The following chunk of code is a good example of this:
    srand(1); for(0..4){ print rand(10),"\n"; }

    Every time you run this code, it will spit out the same values even though we are calling rand. Another way to think about it (abstractly) is that the PRNG is a hash of random sequences and the seed is the key to the hash. Each time you set the key, you are selecting a particular random sequence. Then, each call to rand gets the next value in the set sequence.

    This leads us into calling srand multiple times. I do see how calling srand multiple times will help you in your situation, but keep in mind that calling it too often will hurt the randomness of your program. Your solution looks fine if you are willing to sacrifice some randomness.

    As far as other solutions go, you've provided one where when you resume, you call rand as many times as you had done up to that point. You've already got overhead of calling srand many times in a loop, so the overhead of calling rand many times when you resume may not be that bad of a tradeoff. You would eliminate two lines from within a loop. You would eliminate the srand calls and the rand calls to generate a new random seed. These would be replaced with a loop on resuming that just calls rand a number of times. This would probably make the resume startup slower, however. It's a tradeoff.

    Another solution would be saving extra data. For example, if you run 300 iterations and then must pause, save a random seed. Then when you resume, use the random seed to resume and finish the run. If this run is interesting, then to rerun, you must somehow provide a way to reset the seed 300 iterations into the run. This method will also suffer from not being the same as a run that was not paused/restarted, but you must decide if that is important. Sometimes it really isn't in the end.

    Enjoy.