… or how things tend to “average out”.
My family and I have been on holiday. It was lovely, thank you for asking.
My wife and I played a game of crazy golf: my three-year-old daughter gave up quickly. Here are our scores on the first five holes – of nine:
The writing was on the wall
I was feeling comfortable about an easy win by this stage. I shouldn’t have been. Crazy golf for people who play once a year, at most, is more a game of luck than skill. Five data points is hardly a statistically significant data set.
Let’s look at the last four holes:
I won… Just.
Regression to the mean
Now let’s be clear, nine data points is barely better than five in terms of statistical significance. But it is the best I have. From these data, my average score is 4.9 and Felicity’s 5.0. With standard deviations of around 2, this difference is of No Significance at all.
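For anyone who wants to check the "no significance" claim, here is a minimal sketch using only the summary figures above. It assumes both standard deviations are exactly 2.0 (I only said "around 2") and treats the two sets of nine scores as independent samples, which is a simplification since we played the same holes. The function name `welch_t` is my own label for the standard Welch t statistic.

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic for the difference between two sample means."""
    # Standard error of the difference between the two means.
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_b - mean_a) / standard_error

# Assumed inputs: means from the text, sd rounded to 2.0, nine holes each.
t = welch_t(4.9, 2.0, 9, 5.0, 2.0, 9)
print(f"t = {t:.2f}")
```

The gap of 0.1 strokes is only about a tenth of its own standard error, which is nowhere near the value of roughly 2 that is usually needed before anyone would call a difference significant.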
All we can conclude is that my wife and I are as good (or bad) as each other.
So what happened past the halfway point?
Nothing. I had had a run of better than average scores, so it should have been no surprise that my scores drifted in the wrong direction. The opposite happened for Felicity. Both of our scores “regressed to the mean”.
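The effect is easy to reproduce with a small simulation. This is only a sketch under assumptions of my own: two players of identical ability, every hole score drawn independently from a normal distribution with mean 5 and standard deviation 2 (real scores are whole numbers, so this is a rough stand-in). In each simulated round it picks whoever is ahead after five holes and compares that player's average on the first five holes with their average on the last four.

```python
import random
import statistics

def simulate(rounds=20000, true_mean=5.0, sd=2.0, seed=1):
    """Average per-hole score of the halfway leader, before and after halfway."""
    rng = random.Random(seed)
    leader_first5, leader_last4 = [], []
    for _ in range(rounds):
        a = [rng.gauss(true_mean, sd) for _ in range(9)]
        b = [rng.gauss(true_mean, sd) for _ in range(9)]
        # Lower is better in golf: the leader has the smaller total after 5 holes.
        leader = a if sum(a[:5]) < sum(b[:5]) else b
        leader_first5.append(statistics.mean(leader[:5]))
        leader_last4.append(statistics.mean(leader[5:]))
    return statistics.mean(leader_first5), statistics.mean(leader_last4)

first5, last4 = simulate()
print(f"Leader's average, holes 1-5: {first5:.2f}")
print(f"Leader's average, holes 6-9: {last4:.2f}")
```

The halfway leader looks clearly better than average over the first five holes, simply because that is how the leader was chosen. Over the last four holes the same player is back at about 5 a hole, with no change in ability at all.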
A Note of Caution
This example is purely illustrative. Nine data points are too few to be sure of the averages our scores would settle at if we were to play a lot more. And if we did, I’d like to think that some skill may start to intrude on our play!
The “so what?”
1. When you estimate the probabilities of risks that are based on the natural variability of events, be sure you have a big enough sample to calculate reliable averages.
2. Beware of runs of luck – a short run of good results may mask a less flattering underlying mean, in which case, at some stage, you will have a run of bad results.
3. Despite warnings that the past is a poor predictor of the future, it is often the only data we have and we are seduced into complacency or fear by runs of statistical chance. Look in any large set of truly random numbers and you will find runs of numbers with averages well above or well below the mean of the whole set.
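The last point is easy to check for yourself. The sketch below is illustrative only: the choice of 10,000 whole numbers between 1 and 9, the run length of five and the fixed seed are all my own assumptions. It averages every run of five consecutive numbers and reports the lowest and highest run averages alongside the mean of the whole set.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# A large set of random whole numbers between 1 and 9.
numbers = [rng.randint(1, 9) for _ in range(10_000)]

# Average of every run of five consecutive numbers.
window = 5
run_means = [
    statistics.mean(numbers[i:i + window])
    for i in range(len(numbers) - window + 1)
]

overall = statistics.mean(numbers)
print(f"Mean of the whole set: {overall:.2f}")
print(f"Lowest five-number run average: {min(run_means):.2f}")
print(f"Highest five-number run average: {max(run_means):.2f}")
```

The whole set averages close to 5, yet somewhere inside it there are runs of five that average far lower and far higher. Judge a risk by one of those runs alone and you would be badly misled in either direction.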