Re: Non-deterministic Testing
by davido (Cardinal) on Jan 28, 2005 at 16:34 UTC
Jot your algorithm down in the simplest layman's terms possible, and have a statistician look at it. It sounds like you're saying that you want a degree of randomness, but need tight control over the standard deviation of the frequency of non-expected results. In other words, if you plot the frequency of non-expected results as a bell curve, you want a really steep curve with very short tails. This sort of thing is what statisticians understand best.
Without seeing the algorithm it'll be hard for us to suggest how to achieve a steep curve with short tails. And I'm not a statistician myself, so my helpfulness may be limited in that regard, though someone else may have the required understanding.
This also sounds a lot like the production/defect problem. Companies that use statistical analysis in their manufacturing will set a tolerance for defects, and a confidence interval which may be one or two standard deviations from their tolerance level. Once their test samples start falling outside of the predetermined standard deviation from tolerance, they have a high degree of confidence that not just the samples, but also the entire production run, is edging toward unacceptable levels of defects. This analysis tells them when it's time to recalibrate machinery, rethink the quality of the raw materials being used, etc. Because this is a pretty advanced discipline, you may find a lot of good info by searching for production statistical analysis.
Re: Non-deterministic Testing
by osunderdog (Deacon) on Jan 28, 2005 at 16:21 UTC
If you know that it will produce unpredictable results around 20% of the time, it seems you could determine the number of tests necessary to verify that boundary.
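For instance, a back-of-the-envelope sample-size calculation (a sketch using the standard normal-approximation formula n = p(1-p)(z/E)^2; the numbers are illustrative, not from the original post):

    # How many runs to pin down a true 20% rate within +/-2%
    # at 95% confidence?
    my ( $p, $z, $e ) = ( 0.20, 1.96, 0.02 );
    my $n = $p * ( 1 - $p ) * ( $z / $e ) ** 2;
    printf "need about %d trials\n", $n + 0.5;    # ~1537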
I guess I would suggest trying to encapsulate the fuzzy bit into a function that can be swapped out with a less fuzzy (more predictable) function during testing. Then, test the core fuzzy bit separately using sample sizes that are statistically significant.
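A minimal sketch of that swap-out idea ($picker and decide() are made-up names for illustration):

    use strict;
    use warnings;

    our $picker = sub { rand() };    # production: genuinely random

    sub decide {
        my $roll = $picker->();
        return $roll < 0.8 ? 'expected' : 'unexpected';
    }

    # In the test suite, swap in a predictable picker:
    {
        local $picker = sub { 0.5 };    # always below the 0.8 cutoff
        print decide(), "\n";           # prints "expected"
    }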
Geeze, I sound like one of my old professors.
"Look, Shiny Things!" is not a better business strategy than compatibility and reuse.
Re: Non-deterministic Testing
by itub (Priest) on Jan 28, 2005 at 16:30 UTC
I would first try to test the deterministic components as much as possible in isolation. Then, for the stochastic part, if you run enough tests and/or your tolerance is large enough, you can be reasonably sure that the tests will "virtually never" fail. For example, if you expect 80,000 out of 100,000, the standard deviation is about 126 (assuming independent random events with p=0.8). That means that you will very rarely see deviations greater than 400, and virtually never greater than 1,000 (which is half of the sample tolerance you gave).
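A quick sanity check of that estimate (a sketch; the binomial standard deviation is sqrt(n*p*(1-p))):

    my ( $n, $p ) = ( 100_000, 0.8 );
    my $sd = sqrt( $n * $p * ( 1 - $p ) );    # about 126.5
    printf "sd = %.1f, 3*sd = %.0f\n", $sd, 3 * $sd;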
Re: Non-deterministic Testing (srand)
by tye (Sage) on Jan 28, 2005 at 16:35 UTC
Caveat: this will make the tests reproducible only as long as you keep using the same platform (or perhaps even the same perl version or compilation options). For example, the following one-liner
perl -le "srand(0); print rand"
gives me the following results:
0.00115966796875 v5.8.4 built for MSWin32-x86-multi-thread
0.17082803610629 v5.8.0 built for i686-linux
And of course we can fix this with the following code:
BEGIN {
    my @list = 1 .. 100;    # or whatever values you want
    *CORE::GLOBAL::rand = sub { return shift @list };    # or something similar
}
and now your rand will return whatever you want it to.
But then you aren't testing the randomness of your function anymore. The OP wants to test whether about 20% of the runs return unpredictable results. By fixing the seed with srand, or by supplying the return values of rand as others suggest, you're no longer testing what you should be testing.
Re: Non-deterministic Testing
by xdg (Monsignor) on Jan 28, 2005 at 23:03 UTC
In short -- your inputs may be non-deterministic (or effectively so) but your program is, by definition, deterministic. Isolate the input stream and replace it with a known stream and test that.
If you're using the built-in rand to introduce random behavior, I humbly suggest my own Test::MockRandom. With it you can intercept the built-in rand function to return values that you preset in advance, and you can test for correct behavior across the range of potential results.
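For example, a minimal sketch along the lines of the module's documented usage (My::Coin and its flip() method are made-up names for illustration; check the Test::MockRandom docs for the exact import details):

    use Test::More tests => 2;
    use Test::MockRandom 'My::Coin';    # install a mock rand() inside My::Coin

    # A hypothetical class whose behavior depends on the built-in rand():
    package My::Coin;
    sub new  { bless {}, shift }
    sub flip { rand() < 0.5 ? 'heads' : 'tails' }
    package main;

    # Queue the values that rand() will return inside My::Coin:
    Test::MockRandom::srand( 0.1, 0.9 );

    my $coin = My::Coin->new;
    is( $coin->flip, 'heads', 'rand below 0.5 gives heads' );
    is( $coin->flip, 'tails', 'rand at or above 0.5 gives tails' );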
If the source of randomness is truly out of your control (e.g. network packet timings), then my suggestion is to break up your code so that the external dependency is isolated and wrapped in a class. Then use Test::MockObject to simulate the dependency with known results and test across the full range of likely inputs.
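A sketch of that approach (the next_delay() method is a made-up stand-in for the wrapped dependency; set_series is Test::MockObject's canned-sequence helper):

    use Test::More tests => 3;
    use Test::MockObject;

    # Stand-in for the wrapped network-timing dependency:
    my $timer = Test::MockObject->new;
    $timer->set_series( 'next_delay', 0.01, 0.50, 2.00 );

    # The code under test would call $timer->next_delay instead of
    # measuring real packets; each call returns the next canned value.
    is( $timer->next_delay, 0.01, 'first canned delay' );
    is( $timer->next_delay, 0.50, 'second canned delay' );
    is( $timer->next_delay, 2.00, 'third canned delay' );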
-xdg
Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.
Re: Non-deterministic Testing
by tmoertel (Chaplain) on Jan 29, 2005 at 05:43 UTC
Even if you cannot determine whether each result returned by your algorithm is correct, you probably can determine whether the distribution of results is correct. (This is what I had to do to test the random-data generators for Test::LectroTest.)
First, analyze your algorithm to determine the expected distribution of the results. Then run your algorithm in a large number of random trials to generate a set of observed results. Finally, use statistical tests to determine whether the results conform to the expected distribution (for the level of confidence you desire).
There is a trade-off between the degree of confidence you require and the "power" of statistical tests to detect small deviations from the norm. The more confidence you want, the less power the tests will have. The way to increase the power is to run more trials. Unless you have a really bizarre distribution, however, a few seconds of compute time will probably provide more than enough data for ample confidence and power. (But you should still do the math to verify this for your situation.)
That is the idea. Now, to implement it, you will need to (1) determine the distribution of your expected results, (2) write a test driver to run the trials and collect the output results, and (3) write code to test the collected output against the expected distribution. Steps (1) and (3) require some familiarity with statistics. (If you need a refresher, take a look at the resources on the Statistics on the Web page maintained by Clay Helberg.) If you are not writing unit tests to be automated, you can perform step (3) by hand in a statistics program like R.
Otherwise, you will need to code up some statistical tests or borrow them (e.g., Statistics::ChiSquare) and call them from your unit tests.
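For example, here is a hand-rolled two-bin chi-square check built around the 80%/20% figures mentioned earlier (run_once() is a made-up stand-in for one trial of your algorithm):

    use strict;
    use warnings;
    use Test::More tests => 1;

    sub run_once { rand() < 0.8 ? 'expected' : 'unexpected' }

    my $n = 100_000;
    my %observed = ( expected => 0, unexpected => 0 );
    $observed{ run_once() }++ for 1 .. $n;

    my %expected = ( expected => 0.8 * $n, unexpected => 0.2 * $n );

    # Chi-square statistic: sum of (observed - expected)^2 / expected.
    my $chi2 = 0;
    for my $bin ( keys %expected ) {
        my $diff = $observed{$bin} - $expected{$bin};
        $chi2 += $diff * $diff / $expected{$bin};
    }

    # With 1 degree of freedom, chi-square exceeds 10.83 only about
    # 0.1% of the time, so this check "virtually never" fails by chance.
    cmp_ok( $chi2, '<', 10.83, 'observed counts fit the 80/20 split' );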
I hope that this helps.
Cheers, Tom
Re: Non-deterministic Testing
by lachoy (Parson) on Jan 28, 2005 at 21:44 UTC
Rather than coding them up yourself, you might try generating lots of tests with something like Test::LectroTest and then checking whether the results fall in your expected range.
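For instance, a minimal sketch following Test::LectroTest's documented Property syntax (fuzzy_pick() is a made-up function standing in for the code under test):

    use Test::LectroTest trials => 1_000;

    sub fuzzy_pick {    # hypothetical stand-in for the OP's code
        my $x = shift;
        return rand() < 0.8 ? $x : int rand 100;
    }

    Property {
        ##[ x <- Int( range => [ 0, 99 ] ) ]##
        my $result = fuzzy_pick($x);
        $result >= 0 && $result <= 99;
    }, name => "fuzzy_pick stays in range";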
Re: Non-deterministic Testing
by fergal (Chaplain) on Jan 28, 2005 at 19:25 UTC
Can you pick some good tests ("good" meaning they involve edge cases) and then just run each one 100 times to see if the percentages come out OK? This seems more reliable than randomly generating 100 tests and running each of them.
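For instance (a sketch; edge_case_test() is a made-up stand-in for a single run of one fixed test):

    use strict;
    use warnings;

    sub edge_case_test { rand() < 0.8 }    # stand-in for one edge-case run

    # grep in scalar context counts the passing runs.
    my $passes = grep { edge_case_test() } 1 .. 100;
    printf "passed %d%% of runs\n", $passes;
    print "within tolerance\n" if $passes >= 70 && $passes <= 90;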