Articles on statistical significance

Overgeneralization in A/B testing

Overgeneralization is a mistake in interpreting the outcomes of online controlled experiments (a.k.a. A/B tests) that can have a detrimental impact on any data-driven business. Overgeneralization is used in the typical sense of going above and beyond what the evidence at hand supports, with “evidence” being a statistically significant or non-significant outcome of an online […] Read more…

Posted in A/B testing, Conversion optimization | Also tagged ab testing, conversion rate optimization, cro, online experimentation

Stop AbUsing the Mann-Whitney U Test (MWU)

The Mann-Whitney U Test (MWU), also known as the Wilcoxon Rank Sum Test and the Mann-Whitney-Wilcoxon Test, continues to be advertised as the go-to test for analyzing non-normally distributed data. In online experimentation it is often touted as the most suitable for analyses of non-binomial metrics with typically non-normal (skewed) distributions such as average […] Read more…

Posted in A/B testing, Statistics | Also tagged arpu, average revenue per user, difference in medians, mann-whitney u test, mwu, skewed distribution, skewness, statistical power, stochastic difference

Fully Sequential vs Group Sequential Tests

What is the best design for a statistical test with sequential evaluation of the data at multiple points in time? This is a question that anyone who has realized that unaccounted-for peeking with intent to stop is the bane of A/B testing eventually comes to ask. So how does one go about answering that? This […] Read more…

Posted in AGILE A/B testing, Statistics | Also tagged ab testing, conversion rate optimization, sequential testing, sequential tests, statistical power

A/B Testing Statistics – A Concise Guide for Non-Statisticians

Navigating the maze of A/B testing statistics can be challenging. This is especially true for those new to statistics and probability. One reason is the obscure terminology popping up in every other sentence. Another is that the writings can be vague, conflicting, incomplete, or simply wrong, depending on the source. Articles sprinkled with advanced math, […] Read more…

Posted in A/B testing, Statistics | Also tagged ab testing, confidence intervals, p-value, statistical confidence, statistical power

P-values and Confidence Intervals Explained

Hundreds if not thousands of books have been written about both p-values and confidence intervals (CIs) – two of the most widely used statistics in online controlled experiments. Yet, these concepts remain elusive to many otherwise well-trained researchers, including A/B testing practitioners. Misconceptions and misinterpretations abound despite great efforts from statistics educators and experimentation evangelists. […] Read more…

Posted in A/B testing, Statistical significance, Statistics | Also tagged ab testing, confidence intervals, confidence threshold, p-value, random variable, variability

Top Misconceptions About Scientific Rigor in A/B Testing

Have you ever thought that statistically rigorous A/B tests are impractical? Or do you have trouble selling the need for rigor in testing to your clients, coworkers, or boss? This article debunks the top five myths about the necessity and difficulties of applying the scientific method in online A/B testing. Read more…

Posted in A/B testing, Conversion optimization | Also tagged ab testing, business experiments, confidence threshold, conversion rate optimization, scientific method, statistical confidence

The Effect of Using Cardinality Estimates Like HyperLogLog in Statistical Analyses

This article will examine the effects of using the HyperLogLog++ (HLL++) cardinality estimation algorithm in applications where its output serves as input for statistical calculations. A prominent example of such a scenario can be found in online controlled experiments (online A/B tests) where key performance measures are often based on the number of unique users, […] Read more…

Posted in A/B testing, Google Analytics, Statistics | Also tagged cardinality estimates, cardinality estimation, hyperloglog, hyperloglog++, sample ratio mismatch, statistical analysis

Error Spending in Sequential Testing Explained

Sequential analysis of experimental data from A/B tests has been quite prominent in recent years due to the myriad of Bayesian solutions offered by big industry players. However, this type of sequential analysis is not sequential testing proper as these solutions have generally abandoned the idea of testing and therefore error control, substituting it for […] Read more…

Posted in A/B testing, AGILE A/B testing, Statistics | Also tagged alpha spending, alpha spending functions, optional stopping, peeking, sequential testing, sequential tests

Statistical Methods in Online A/B Testing – the book

The long wait is finally over! “Statistical Methods in Online A/B Testing” can now be found as a paperback and an e-book on your preferred Amazon store. The book is a comprehensive guide to statistics in online controlled experiments, a.k.a. A/B tests, and tackles the difficult matter of statistical inference in a way accessible to […] Read more…

Posted in A/B testing, Conversion optimization, Statistics | Also tagged ab testing, ab testing methodology, risk reward analysis, statistical method

The A/B Testing Guide to Surviving on a Deserted Island

The secluded and isolated deserted island setting has been used as the stage for many hypothetical explanations in economics and philosophy, with the scarcity of things that can be developed as resources being a central feature. Scarcity and the need to keep risk low while aiming to improve one’s situation are what make it a […] Read more…

Posted in A/B testing, Conversion optimization, Statistics | Also tagged ab testing, concurrent a/b tests, conversion rate optimization, desert island, multivariate testing, mvt, online controlled experiments, revenue optimization, risk management, risk reward analysis, sequential testing, sequential tests, statistical power, survival

Designing successful A/B tests in Email Marketing

The process of A/B testing (a.k.a. online controlled experiments) is well-established in conversion rate optimization for all kinds of online properties and is widely used by e-commerce websites. On this blog I have already written in depth about the statistics involved as well as the ROI calculations in terms of balancing risk and reward for […] Read more…

Posted in A/B testing, Conversion optimization, Statistical significance, Statistics | Also tagged click rate optimization, confidence intervals, e-mail marketing, e-mail template optimization, email marketing, newsletter, newsletter template optimization, open rate optimization, sequential tests, subject line optimization

Analysis of 115 A/B Tests: Average Lift is 4%, Most Lack Statistical Power

Oct 2022 update: A newer, much larger and likely less biased meta-analysis of 1,001 tests is now available! What can you learn from 115 publicly available A/B tests? Usually, not much, since in most cases you would be looking at case studies with very basic data about what was tested and the outcome of […] Read more…

Posted in A/B testing, Conversion optimization | Also tagged ab testing, lift, meta analysis, mvt, online experiments, optional stopping, peeking, roi, sample size, statistical power, underpowered
