Statistical Significance and Conversion Optimization – 9 out of 10 people say it's significant!

Calvin and Hobbes on statistics

In our day job we deal with a lot of data, and with people, organisations and companies who want to turn that data into actionable insights. Inevitably this leads to a debate on how to run tests and prove results.

There is a right way and a wrong way to do this. This post explains how to run tests and make sure the results are statistically significant, and what to do when your volume of data is too low to prove significance.

Statistical significance (or a statistically significant result) is attained when a p-value is less than the significance level (denoted α, alpha). (Source: Wikipedia)

Or in plain English:

“Statistical significance helps quantify whether a result is likely due to chance or to some factor of interest,” says Thomas Redman, author of Data Driven: Profiting from Your Most Important Business Asset.
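To make the definition concrete, here is a minimal sketch of how a p-value could be calculated for a simple A/B test and compared against α. It assumes a two-sided, two-proportion z-test, which is one common choice and not necessarily what any particular testing tool uses. The function name and the visitor and conversion numbers are made up for illustration.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates.

    conv_a, conv_b: number of conversions in variants A and B
    n_a, n_b: number of visitors in variants A and B
    Uses a pooled two-proportion z-test (normal approximation).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # No variation at all (0% or 100% conversion): nothing to detect.
        return 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


ALPHA = 0.05  # significance level, a conventional but arbitrary choice

# Hypothetical test: 2.0% vs 2.6% conversion with 10,000 visitors each.
p_value = two_proportion_p_value(200, 10_000, 260, 10_000)
print(f"p-value = {p_value:.4f}, significant = {p_value < ALPHA}")
```

With these illustrative numbers the p-value comes out below 0.05, so the result would count as statistically significant at that level. The same conversion rates on 100 visitors per variant would not.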

If you are making critical changes to your website, you need to know that the impact you are having is real and not just the result of chance. A/B testing has become the mot du jour, and whilst it can be incredibly powerful, it does come with some risks. I should say we are huge proponents of A/B and multivariate testing when done properly.

When running A/B tests, or any other test, two key variables go into determining statistical significance: sample size and effect size.

Sample size refers to how large the sample for your experiment is. The larger your sample size, the more confident you can be in the result of the experiment (assuming that it is a randomised sample). If you are running tests on a website, the more traffic your site receives, the sooner you will have enough data to determine if there is a statistically significant result.

The second factor is effect size. If there is a small effect size (say a 0.1% increase in conversion rate) you will need a very large sample size to determine whether that difference is significant or just due to chance. However, if you observe a very large effect on your numbers, you will be able to validate it with a smaller sample size to a higher degree of confidence. (Hat Tip to Optimizely’s Optimization Glossary for the great explanation)
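The trade-off between effect size and sample size can be sketched with a standard approximation for comparing two conversion rates. This is only a rough estimate under stated assumptions: a two-sided test at α = 0.05, 80% statistical power (power is not discussed in this post, so treat it as our assumption), and a hypothetical 2% baseline conversion rate. The function name is ours, and the "0.1% increase" is read here as 0.1 percentage points.

```python
import math
from statistics import NormalDist


def visitors_per_variant(baseline, variant, alpha=0.05, power=0.8):
    """Approximate visitors needed in EACH variant to detect the difference
    between two conversion rates (normal approximation, two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # about 0.84 for 80% power
    variance = baseline * (1 - baseline) + variant * (1 - variant)
    effect = variant - baseline
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Small effect: 2.0% -> 2.1% conversion (hypothetical numbers).
small_effect = visitors_per_variant(0.020, 0.021)
# Large effect: 2.0% -> 3.0% conversion (hypothetical numbers).
large_effect = visitors_per_variant(0.020, 0.030)

print(f"small effect: {small_effect:,} visitors per variant")
print(f"large effect: {large_effect:,} visitors per variant")
```

Under these assumptions the small effect needs on the order of 300,000 visitors per variant, while the large effect needs only a few thousand. That is why low-traffic sites struggle to prove small improvements.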

To check whether a result is statistically significant, you can use the VWO A/B test calculator.

So if you have a large sample size, it is easier to run your experiments and reach statistical significance. But what do you do when your sample is small (i.e. you have tried Evan's Awesome A/B Tools and come up with no significant result)?

When your sample size is too small to prove significance you should not be focusing on things like A/B testing.

You should be:

  1. talking to your customers to find out what they want,
  2. focusing on the user experience and sign-up flows from a top-level view, rather than on whether the sign-up button should be blue or green,
  3. working out how you can acquire and convert more users cost-effectively and repeatedly,
  4. looking for the quick wins; then, once you have the traffic levels needed to prove significance, getting into the nitty-gritty of A/B testing.

Useful tools you should check out:

  1. Evan's Awesome A/B Tools
  2. VWO A/B test calculator
  3. Optimizely’s Optimization Glossary
  4. Loyalty Bay (full disclosure: I founded the company, but I do think we can help 😉)


Author: William Roberts

Founder and CEO of Loyalty Bay
