Re: What is the best way to compare profiling results before and after a code change?

The average on its own is probably not a good measure; you will also have to look at the spread around the average (i.e. the variance). You can check whether the difference between the two versions of your program is significant (caused by your changes) or simply due to random (background) effects by running an ANOVA test. The null hypothesis would then be that the variance between runs of the program before and after you changed it is not bigger than the variance within each run.
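
For illustration only, here is a minimal sketch of the F statistic that a one-way ANOVA is built on: the mean square between the versions divided by the mean square within them. It is written in Python with the standard library only; the function name `f_statistic` and the timings are made up for the example and are not real measurements.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    k = len(groups)                      # number of program versions
    n = sum(len(g) for g in groups)      # total number of timed runs
    grand = mean(x for g in groups for x in g)
    # Spread of the version means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Spread of the individual runs around their own version's mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical wall-clock timings in seconds, five runs per version.
before = [1.02, 0.98, 1.05, 1.01, 0.99]
after = [0.91, 0.95, 0.89, 0.93, 0.92]

f = f_statistic([before, after])
print(f"F = {f:.2f}")
```

With these invented numbers F comes out at about 32.4, well above the 5% critical value of roughly 5.32 for 1 and 8 degrees of freedom, so the null hypothesis would be rejected. An F near or below 1 would mean the versions differ no more than repeated runs of the same version do.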

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re^2: What is the best way to compare profiling results before and after a code change? by ELISHEVA (Prior) on Apr 11, 2009 at 20:22 UTC
Thank you for taking the time to answer my question. You make an excellent point about variance, but shouldn't the null hypothesis be that there is no (statistically significant) change in mean/median/mode? I'm not sure what you mean by "the variance within each run".

Your idea of comparing the size of the variance before and after the change suggests a side effect that I hadn't thought of: changing the code can change the way the script competes with the operating system, and so may change the variance itself. I can see that in certain real-time situations where timing really matters, you might want to profile that as well as average performance time. However, in my case "usual" performance time, rather than consistency of performance time, is the primary concern.

As for using an ANOVA, that would only apply if the distribution of profiling results is normal. If the distribution is skewed or has overly thick or thin tails, then you would have to use other techniques to analyze the variance. Without knowing the distribution, it is very hard to tell how many standard deviations (sqrt of variance) are needed to make the difference between the old and new mean statistically significant (5-7 are needed for a normal distribution).

Best, beth
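
One technique that does not assume a normal distribution is a permutation test. Below is a rough sketch of one, again in Python with the standard library only. It is a swapped-in illustration, not something proposed in this thread: the function name `permutation_p_value`, its parameters, and the timings are all hypothetical. It shuffles the "before"/"after" labels many times and counts how often a difference in medians at least as large as the observed one turns up by chance.

```python
import random
from statistics import median

def permutation_p_value(a, b, stat=median, rounds=10_000, seed=0):
    """Estimate a two-sided p-value for |stat(a) - stat(b)| by label shuffling."""
    rng = random.Random(seed)            # fixed seed so the result is repeatable
    observed = abs(stat(a) - stat(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)              # randomly reassign runs to the two versions
        diff = abs(stat(pooled[:len(a)]) - stat(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)

# Hypothetical wall-clock timings in seconds, five runs per version.
before = [1.02, 0.98, 1.05, 1.01, 0.99]
after = [0.91, 0.95, 0.89, 0.93, 0.92]

p = permutation_p_value(before, after)
print(f"estimated p-value: {p:.4f}")
```

A small p-value suggests the observed difference is unlikely to be background noise. With only five runs per version the test has little resolution, so more runs would be needed for a firm conclusion. Passing `stat=mean` (from `statistics`) would test the mean instead of the median.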
Re^3: What is the best way to compare profiling results before and after a code change? by CountZero (Bishop) on Apr 11, 2009 at 20:42 UTC
What I mean is that each run of the same program will see a different outcome around the mean. As you know, the variance (or the standard deviation, if you like) is a measure of the spread of the actual results around the mean. The mean and variance of different versions of the same program will tell you whether these different versions are faster or not, but if the spread between different versions is less than the spread within each version, then the differences between the mean run times of the versions are not really significant.

CountZero

