Non-deterministic Testing

by moot (Chaplain)
on Jan 28, 2005 at 16:07 UTC ( [id://426018] )

moot has asked for the wisdom of the Perl Monks concerning the following question:

I have an algorithm that produces non-deterministic results due to in-built fuzziness, i.e. it is intended to return unpredictable results about 20% of the time.

My query is how best to build a test case for this. Obviously I could just throw 100000 tests against the code and verify that e.g. 78-82% of the results were expected, but I'm wondering if there's a more efficient way to do it?

I fully expect that one day the results will fall 0.1% outside the threshold and thus the test will fail, so I'd also like to guard against that if possible - that is, the testing method used should perhaps take an average over three or five runs.
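For concreteness, the brute-force version described above might look roughly like the sketch below. Here was_unpredictable() is only a hypothetical stand-in for the real algorithm, and the run count, trial count, and tolerance are illustrative:

    use strict;
    use warnings;
    use Test::More tests => 1;

    # Hypothetical stand-in for the code under test: returns true when the
    # algorithm produced an "unpredictable" result for this call.
    sub was_unpredictable { rand() < 0.20 }

    my $runs      = 5;         # average over several runs to smooth out flukes
    my $trials    = 100_000;   # trials per run
    my $target    = 0.20;      # intended fuzz rate
    my $tolerance = 0.02;      # i.e. accept 18%..22%

    my $total_rate = 0;
    for my $run ( 1 .. $runs ) {
        my $hits = 0;
        for ( 1 .. $trials ) {
            $hits++ if was_unpredictable();
        }
        $total_rate += $hits / $trials;
    }
    my $avg_rate = $total_rate / $runs;

    ok( abs( $avg_rate - $target ) <= $tolerance,
        sprintf( "average fuzz rate %.4f is within %.2f of %.2f",
                 $avg_rate, $tolerance, $target ) );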

Replies are listed 'Best First'.
Re: Non-deterministic Testing
by davido (Cardinal) on Jan 28, 2005 at 16:34 UTC

    Jot your algorithm down in the simplest layman's terms possible, and have a statistician look at it. It sounds like you're saying that you want a degree of randomness, but need tight control over the standard deviation of the frequency of non-expected results. In other words, if you plot the frequency of non-expected results as a bell curve, you want a really steep curve with very short tails. This sort of thing is what statisticians understand best.

    Without seeing the algorithm it'll be hard for us to suggest how to achieve a steep curve with short tails. And I'm not a statistician myself, so my helpfulness may be limited in that regard, though someone else may have the required understanding.

    This also sounds a lot like the production / defect problem. Companies that use statistical analysis in their manufacturing will set a tolerance for defects, and a confidence interval which may be one or two standard deviations from their tolerance level. Once their test samples start falling outside of the predetermined standard deviation from tolerance, they have a high degree of confidence that not just the samples, but also the entire production run, is edging toward unacceptable levels of defects. This is analyzed so that they will know when it's time to recalibrate machinery, rethink the quality of the raw materials being used, etc. Because this is a pretty advanced discipline, you may find a lot of good info by searching for production statistical analysis.


    Dave

Re: Non-deterministic Testing
by osunderdog (Deacon) on Jan 28, 2005 at 16:21 UTC

    It seems like if you know that it will produce unpredictable results around 20% of the time, you could determine the number of tests necessary to verify that boundary.

    I guess I would suggest trying to encapsulate the fuzzy bit into a function that can be swapped out with a less fuzzy (more predictable) function during testing. Then, test the core fuzzy bit separately using sample sizes that are statistically significant.

    Geeze, I sound like one of my old professors.
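    A minimal sketch of that swap-out approach, assuming (hypothetically) that the fuzzy decision lives behind a package-level coderef that the tests can localize:

        # Production code: the fuzzy bit is isolated behind a swappable coderef.
        # MyAlgorithm and its subs are hypothetical names.
        package MyAlgorithm;
        use strict;
        use warnings;

        our $FUZZ = sub { rand() < 0.20 };   # the fuzzy bit

        sub compute {
            my ($input) = @_;
            return $FUZZ->() ? _surprising_result($input) : _expected_result($input);
        }

        sub _expected_result   { "expected" }
        sub _surprising_result { "surprising" }

        # In the test file: pin the fuzzy bit down so the rest is deterministic.
        package main;
        use Test::More tests => 2;

        {
            local $MyAlgorithm::FUZZ = sub { 0 };   # never fuzzy
            is( MyAlgorithm::compute("x"), "expected", "deterministic path" );
        }
        {
            local $MyAlgorithm::FUZZ = sub { 1 };   # always fuzzy
            is( MyAlgorithm::compute("x"), "surprising", "fuzzy path" );
        }

    The real fuzzy coderef can then be exercised on its own with a statistically significant number of calls, as other replies describe.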


    "Look, Shiny Things!" is not a better business strategy than compatibility and reuse.

Re: Non-deterministic Testing
by itub (Priest) on Jan 28, 2005 at 16:30 UTC
    I would first try to test the deterministic components as much as possible in isolation. Then, for the stochastic part, if you really run enough tests and/or your tolerance is large enough, you can be reasonably sure that the tests will "virtually never" fail. For example, if you expect 80,000 out of 100,000, I would estimate a standard deviation of about 120 (assuming independent random events with p=0.8). That means that you will very rarely see deviations greater than 400, and virtually never greater than 1000 (which is half of the sample tolerance you gave).
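    For reference, the arithmetic behind that estimate, as a tiny script (it comes out near 126, in the same ballpark):

        use strict;
        use warnings;

        my $n = 100_000;                            # number of trials
        my $p = 0.8;                                # expected success rate
        my $sigma = sqrt( $n * $p * ( 1 - $p ) );   # binomial standard deviation
        printf "expected successes: %d, sigma: %.1f\n", $n * $p, $sigma;
        # prints: expected successes: 80000, sigma: 126.5
        # 400 is a bit over 3 sigma; 1000 is nearly 8 sigma, hence "virtually never"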
Re: Non-deterministic Testing (srand)
by tye (Sage) on Jan 28, 2005 at 16:35 UTC

    Force the starting seed (srand) during the unit tests.

    - tye        

      Caveat: this will make the tests reproducible only as long as you keep using the same platform (or perhaps even the same Perl version and compilation options). For example, the following program
      perl -le "srand(0); print rand"
      gives me the following results:
      0.00115966796875   v5.8.4 built for MSWin32-x86-multi-thread
      0.17082803610629   v5.8.0 built for i686-linux
      
        And of course we can fix this with the following code:

            BEGIN {
                my @list = 1 .. 100;   # or whatever
                *CORE::GLOBAL::rand = sub { return shift @list };   # or something
            }

        and now your rand will return whatever you want it to.
      But then you aren't testing the randomness of your function anymore. The OP wants to test whether about 20% of the tests return unpredictable results. By setting the seed of srand, or supplying the return values of rand as others suggest, you're no longer testing what you should be testing.
Re: Non-deterministic Testing
by xdg (Monsignor) on Jan 28, 2005 at 23:03 UTC

    In short -- your inputs may be non-deterministic (or effectively so) but your program is, by definition, deterministic. Isolate the input stream and replace it with a known stream and test that.

    If you're using the built-in rand to introduce random behavior, I humbly suggest my own Test::MockRandom. With it you can intercept the built-in rand function to return values that you preset in advance, and can test for correct behavior across the range of potential results.

    If the source of randomness is truly out of your control (e.g. network packet timings), then my suggestion is to break up your code so that the external dependency is isolated and wrapped in a class. Then use Test::MockObject to simulate the dependency with known results and test across the full range of likely inputs.
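    For instance, a rough sketch with Test::MockObject, assuming (hypothetically) that the algorithm takes a timing-source object and calls a next_delay method on it:

        use strict;
        use warnings;
        use Test::More tests => 1;
        use Test::MockObject;

        # Hypothetical code under test: classifies delays reported by a
        # timing source.  In production the source would wrap the network.
        sub classify_delays {
            my ( $source, $count ) = @_;
            return map { $source->next_delay > 0.8 ? "slow" : "fast" } 1 .. $count;
        }

        # In the test, the source is a mock that returns a known series.
        my $timer = Test::MockObject->new;
        $timer->set_series( next_delay => 0.1, 0.5, 0.9 );

        is_deeply( [ classify_delays( $timer, 3 ) ],
                   [qw( fast fast slow )],
                   "known inputs give known outputs" );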

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: Non-deterministic Testing
by tmoertel (Chaplain) on Jan 29, 2005 at 05:43 UTC
    Even if you cannot determine whether each result returned by your algorithm is correct, you probably can determine whether the distribution of results is correct. (This is what I had to do to test the random-data generators for Test::LectroTest.)

    First, analyze your algorithm to determine the expected distribution of the results. Then run your algorithm in a large number of random trials to generate a set of observed results. Finally, use statistical tests to determine whether the results conform to the expected distribution (for the level of confidence you desire).

    There is a trade-off between the degree of confidence you require and the "power" of statistical tests to detect small deviations from the norm. The more confidence you want, the less power the tests will have. The way to increase the power is to run more trials. Unless you have a really bizarre distribution, however, a few seconds of compute time will probably provide more than enough data to provide ample confidence and power. (But you should still do the math to verify this for your situation.)

    That is the idea. Now, to implement it, you will need to (1) determine the distribution of your expected results, (2) write a test driver to run the trials and collect the output results, and (3) write code to test the collected output against the expected distribution.

    Steps (1) and (3) require some familiarity with statistics. (If you need a refresher, take a look at the resources on the Statistics on the Web page maintained by Clay Helberg.) If you are not writing unit tests to be automated, you can perform step (3) by hand in a statistics program like R. Otherwise, you will need to code up some statistical tests or borrow them (e.g., Statistics::ChiSquare) and call them from your unit tests.
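    To make steps (2) and (3) concrete, here is a small hand-rolled sketch of a chi-square goodness-of-fit check against an 80/20 split (run_algorithm() is a hypothetical stand-in for the code under test):

        use strict;
        use warnings;
        use Test::More tests => 1;

        # Step (2): run the trials and tally the outcomes.
        sub run_algorithm { rand() < 0.80 ? "expected" : "fuzzy" }   # stand-in

        my $trials   = 100_000;
        my %observed = ( expected => 0, fuzzy => 0 );
        $observed{ run_algorithm() }++ for 1 .. $trials;

        # Step (3): chi-square statistic against the expected 80/20 split.
        my %expected = ( expected => 0.80 * $trials, fuzzy => 0.20 * $trials );
        my $chi2     = 0;
        $chi2 += ( $observed{$_} - $expected{$_} )**2 / $expected{$_} for keys %expected;

        # Two categories => 1 degree of freedom; the 95% critical value is 3.841.
        cmp_ok( $chi2, '<', 3.841, "observed split consistent with 80/20 (chi2=$chi2)" );

    At the 95% level a correct implementation will still trip this check about one run in twenty; if that is too flaky for an automated suite, use a stricter critical value (6.635 for 99%, 10.828 for 99.9%, still with one degree of freedom) at the cost of some power to detect real deviations.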

    I hope that this helps.

    Cheers,
    Tom

      I'm glad someone mentioned R. It is a very powerful language. Dare I say R is the Yin to Perl's Yang.

      It's also equally hard to master.


      "Look, Shiny Things!" is not a better business strategy than compatibility and reuse.

Re: Non-deterministic Testing
by lachoy (Parson) on Jan 28, 2005 at 21:44 UTC

    Rather than coding them up yourself, you might try generating lots of tests with something like Test::LectroTest and then checking whether the results fall in your expected range.

    Chris
    M-x auto-bs-mode

Re: Non-deterministic Testing
by fergal (Chaplain) on Jan 28, 2005 at 19:25 UTC
    Can you pick some good tests ("good" meaning they involve edge cases) and then just run each 100 times and see if the percentages come out OK? This seems more reliable than randomly generating 100 tests and running each of them.
