Statistical Significance
The statistical significance of a test indicates whether the results of the test are "real" or whether they are simply due to chance.
Overview
In any experiment that involves drawing a sample from a population, there is always the possibility that the observed effect (the result of the test) occurred simply by chance or because of a sampling error. Statistical significance is used to quantify this uncertainty and to judge whether the results of an experiment reflect the actual choices of the overall population.
The result of a test is considered to be statistically significant if the probability that the result could have occurred by chance is lower than a pre-defined threshold.
Statistically significant result = Probability (p) < Threshold (α)
Statistical Significance in A/B tests
An A/B test is an example of statistical hypothesis testing, a process whereby a hypothesis is made about the relationship between two data sets and those data sets are then compared against each other to determine if there is a statistically significant relationship or not.
To put this in more practical terms, a prediction is made that content variant B will perform better than content variant A, and then data from both the content variants are observed and compared to determine if B is a statistically significant improvement over A.
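As a rough illustration of such a comparison, the sketch below applies a one-sided two-proportion z-test to click counts for variants A and B. The test choice, the function name `two_proportion_p_value`, and the sample numbers are assumptions made for illustration; the article does not state which test Acoustic Personalization uses internally.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(clicks_a, visitors_a, clicks_b, visitors_b):
    """One-sided p-value for the hypothesis that B's click rate exceeds A's.

    Hypothetical helper: assumes both groups have visitors and that the
    pooled click rate is strictly between 0 and 1.
    """
    rate_a = clicks_a / visitors_a
    rate_b = clicks_b / visitors_b
    # Pooled click rate under the null hypothesis that A and B perform the same.
    pooled = (clicks_a + clicks_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Probability of seeing a difference at least this large by chance alone.
    return 1 - NormalDist().cdf(z)


alpha = 0.05  # pre-defined threshold (illustrative value)
p = two_proportion_p_value(clicks_a=100, visitors_a=1000,
                           clicks_b=130, visitors_b=1000)
print(f"p = {p:.4f}, statistically significant: {p < alpha}")
```

With these made-up counts, p falls below the 0.05 threshold, so B would be reported as a statistically significant improvement over A.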
For example, we have no way of knowing with 100% accuracy how the next 100,000 people who visit our channel will behave. We do not have this information today, and if we were to wait until those 100,000 people had visited our site, it would be too late to optimize their experience. What we can do is observe the next 1,000 people who visit our site and then use statistical analysis to predict how the following 99,000 will behave.
The complexities arise from all the ways a given “sample” can inaccurately represent the overall “population”, and from all the things we have to do to ensure that our sample accurately represents the population.
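One way to express how far a sample estimate may sit from the population value is a confidence interval. The sketch below uses the normal approximation for a click rate; the function name `click_rate_interval` and the counts are hypothetical, and the interval only covers random sampling error, not a biased sample.

```python
from math import sqrt
from statistics import NormalDist


def click_rate_interval(clicks, visitors, confidence=0.95):
    """Normal-approximation confidence interval for a click rate."""
    rate = clicks / visitors
    # z-score that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(rate * (1 - rate) / visitors)
    # Clamp to the valid range for a proportion.
    return max(0.0, rate - margin), min(1.0, rate + margin)


# Illustrative numbers: 120 clicks observed among the first 1,000 visitors.
low, high = click_rate_interval(clicks=120, visitors=1000)
print(f"Estimated click rate for the wider population: {low:.3f} to {high:.3f}")
```

A larger sample narrows the interval, which is why a test generally needs enough visitors before its result can be trusted.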
Statistical Significance in A/B tests in Acoustic Personalization
In Acoustic Personalization, you must specify the statistical significance for an A/B test as a percentage that indicates your confidence that the results of the A/B test are valid and free from errors caused by randomness. For example, if you set a statistical significance level of 95%, it means that you can be 95% confident that the observed results are real and not caused by chance.
The statistical significance value is calculated based on the click rate of visitors on the channel. To calculate the statistical significance, the control group and at least one of the content variants must have a non-zero click rate.
The Statistical Significance value for an A/B test in Acoustic Personalization should be within the range 50% to 100%. By default, the value is set to 90%.
It is not advisable to set the value below 90%, because the lower the threshold for statistical significance, the less certain you can be that an improvement in conversions (or whatever the Goal is) is due to the given variant being shown. Similarly, it is not advisable to set the statistical significance value to 100%, because this value is unlikely to be reached in practice during the test.
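The rules above can be sketched as a small validation helper. This is not the product's API: the function names are hypothetical, and reading a significance setting of S% as a p-value threshold of (100 - S) / 100 is a conventional interpretation assumed here rather than documented product behavior.

```python
DEFAULT_SIGNIFICANCE = 90.0  # default value stated in this article


def significance_to_alpha(significance_percent=DEFAULT_SIGNIFICANCE):
    """Convert a significance setting (percent) into a p-value threshold.

    Assumption: a setting of S% corresponds to a threshold of (100 - S) / 100.
    """
    if not 50 <= significance_percent <= 100:
        raise ValueError("Statistical significance must be between 50% and 100%")
    return (100 - significance_percent) / 100


def can_compute_significance(control_click_rate, variant_click_rates):
    """The control and at least one variant need a non-zero click rate."""
    return control_click_rate > 0 and any(rate > 0 for rate in variant_click_rates)


print(significance_to_alpha())      # threshold for the default 90% setting
print(significance_to_alpha(95))    # stricter threshold for a 95% setting
print(can_compute_significance(0.02, [0.0, 0.03]))
```

Note that a setting of 100% maps to a threshold of 0, which no finite test can satisfy; this matches the advice against choosing 100%.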
Statistical significance of an in-progress A/B test
For an in-progress A/B test, you may see a message on the Performance details page if the following conditions are fulfilled:
- The test is in progress.
- The statistical significance reached by the test is less than 90%.
- The test has run for less than a week.
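The conditions listed above can be expressed as a single check. This is a minimal sketch that assumes all three conditions must hold together; the function name and parameters are hypothetical and do not reflect how the product implements the message.

```python
from datetime import datetime, timedelta


def should_show_in_progress_message(in_progress, significance_percent,
                                    started_at, now):
    """Return True when all three listed conditions hold (assumed to be combined with AND)."""
    ran_less_than_a_week = (now - started_at) < timedelta(weeks=1)
    return in_progress and significance_percent < 90 and ran_less_than_a_week


# Illustrative values: a test started three days ago that has reached 72%.
now = datetime(2024, 1, 10, 12, 0)
started = now - timedelta(days=3)
print(should_show_in_progress_message(True, 72.0, started, now))
```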