Calculate the required sample size for statistically significant A/B tests. Input baseline rate, minimum detectable effect, significance, and power.
Running an A/B test without enough samples leads to unreliable results — you might declare a winner that isn't actually better, or miss a real improvement because you stopped too early. Sample size calculation is the critical first step in experiment design, determining how many users you need in each variant to detect a meaningful difference with statistical confidence.
The required sample size depends on four key parameters: your baseline conversion rate (what the control currently achieves), the minimum detectable effect (the smallest improvement worth detecting), the significance level (typically 5%, controlling false positive risk), and statistical power (typically 80%, controlling false negative risk). Together, these determine whether your experiment can reliably detect the effect you care about.
This calculator uses the standard normal approximation for two-proportion tests to compute the required sample size per variant. It also estimates test duration based on your daily traffic and shows how different MDE levels affect the required sample size, helping you find the right balance between sensitivity and practical test duration.
Underpowered experiments are one of the biggest wastes in growth optimization. They lead to inconclusive results, false positives, and wasted development time on changes that weren't validated. This calculator ensures your experiments are properly sized before you start, gives you realistic test duration estimates, and helps you negotiate between statistical rigor and business timelines.
n = (Zα/2 + Zβ)² × 2p̄(1 − p̄) ÷ δ²

Where:
• Zα/2 = Z-score for the significance level (1.96 for 95%)
• Zβ = Z-score for power (0.84 for 80%)
• p̄ = pooled proportion, the average of the baseline and target rates
• δ = absolute difference to detect (baseline × relative MDE)

Total Sample = n × 2 (for two variants)
Test Duration = Total Sample ÷ Daily Traffic
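The formula maps directly onto a few lines of Python. This is a sketch, not any library's API: `sample_size_per_variant` is a hypothetical helper name, and `statistics.NormalDist` (Python 3.8+) supplies the z-scores.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline, mde_relative, alpha=0.05, power=0.80):
    """Per-variant sample size for a two-sided two-proportion z-test,
    using the pooled-variance normal approximation above."""
    target = baseline * (1 + mde_relative)         # e.g. 0.05 -> 0.055
    delta = target - baseline                      # absolute lift to detect
    p_bar = (baseline + target) / 2                # pooled proportion
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for 95% significance
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    n = (z_alpha + z_beta) ** 2 * 2 * p_bar * (1 - p_bar) / delta ** 2
    return math.ceil(n)

# 5% baseline, 10% relative MDE: roughly 31,200 users per variant
n = sample_size_per_variant(0.05, 0.10)
total = 2 * n
days = math.ceil(total / 5_000)  # assuming 5,000 daily visitors split across variants
```

Exact results differ by a user or two from rounded z-tables; the shape of the calculation is what matters.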
Result: n ≈ 31,234 per variant (62,468 total)
With a 5% baseline conversion rate, a 10% relative MDE (a lift from 5.0% to 5.5%), 95% significance, and 80% power, you need approximately 31,234 users per variant. With 5,000 daily visitors split evenly, the test would run for about 13 days. Tightening the MDE to 5% would require ~124,000 users per variant.
Sample size is a tradeoff between sensitivity, speed, and confidence. Larger samples detect smaller effects but take longer. The relationship is quadratic: detecting a 5% relative MDE requires roughly 4× the sample of a 10% MDE. This is why choosing the right MDE is crucial — don't over-specify sensitivity you don't need.
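The quadratic relationship is easy to check numerically. This sketch reuses the pooled-variance approximation from the formula above; the function name is illustrative.

```python
import math
from statistics import NormalDist

def n_per_variant(baseline, mde, alpha=0.05, power=0.80):
    # Same pooled-variance approximation as the calculator's formula.
    target = baseline * (1 + mde)
    p_bar = (baseline + target) / 2
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(z ** 2 * 2 * p_bar * (1 - p_bar) / (target - baseline) ** 2)

# Halving the relative MDE roughly quadruples the required sample:
for mde in (0.20, 0.10, 0.05):
    print(f"{mde:.0%} relative MDE -> {n_per_variant(0.05, mde):,} users per variant")
```

The ratio is not exactly 4× because the pooled proportion shifts slightly with the target rate, but it stays close.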
Beyond pure sample size, tests should run for complete weeks to capture day-of-week effects. A test that reaches sample size on a Thursday should still run through Sunday. Also account for novelty effects (early users react differently to changes) and external events (holidays, promotions) that can bias results.
For metrics with high variance (like revenue per user), you'll need much larger samples than for binary metrics (like conversion rate). Consider variance reduction techniques like CUPED or stratified sampling to reduce required samples by 30–50%. For high-traffic sites, use multi-armed bandit methods to balance learning and earning during the experiment.
MDE is the smallest improvement your test is designed to detect. For example, 10% MDE on a 5% baseline means you'd detect a lift to 5.5% or higher. Smaller MDEs require larger samples. Choose an MDE based on the smallest improvement that would justify implementing the change.
An underpowered test has a high risk of missing real effects (false negatives) or producing unreliable p-values. You might conclude "no difference" when there actually is one, or worse, declare a winner based on statistical noise. This leads to implementing ineffective changes or abandoning effective ones.
Statistical power is the probability of correctly detecting a real effect of the specified size. At 80% power, you have an 80% chance of detecting a true difference equal to or larger than your MDE. Higher power requires more samples. 80% is the standard default; some teams use 90% for high-stakes decisions.
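Since n scales with (Zα/2 + Zβ)², the extra cost of higher power can be read straight off the z-scores; a minimal sketch using Python's `statistics.NormalDist`:

```python
from statistics import NormalDist

# Sample size scales with (z_alpha/2 + z_beta)^2, so compare that factor directly.
z_alpha = NormalDist().inv_cdf(0.975)               # two-sided 95% significance
factor_80 = (z_alpha + NormalDist().inv_cdf(0.80)) ** 2
factor_90 = (z_alpha + NormalDist().inv_cdf(0.90)) ** 2
extra = factor_90 / factor_80 - 1                    # relative increase in n
print(f"90% power needs {extra:.0%} more samples than 80% power")
```

Moving from 80% to 90% power raises the required sample by about a third, which is the price of the lower false-negative risk.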
Two-sided tests are the standard because they detect both improvements and degradations. One-sided tests need fewer samples but only detect effects in one direction. Use two-sided unless you have a strong prior belief that the change can only improve (or only hurt) the metric. This calculator uses two-sided tests.
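The sample savings from a one-sided test can be quantified the same way; a small sketch:

```python
from statistics import NormalDist

alpha, power = 0.05, 0.80
z_two = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96: two-sided critical value
z_one = NormalDist().inv_cdf(1 - alpha)      # ~1.64: one-sided critical value
z_beta = NormalDist().inv_cdf(power)
saving = 1 - ((z_one + z_beta) / (z_two + z_beta)) ** 2
print(f"one-sided test needs {saving:.0%} fewer samples")
```

At these defaults the saving is roughly 20%, which is why the direction-of-effect assumption must be firm before taking it.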
If you're testing multiple metrics simultaneously, adjust for multiple comparisons using Bonferroni correction (divide significance by number of metrics) or False Discovery Rate control. Without correction, testing 20 metrics at 5% significance means you'll likely get at least one false positive.
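The arithmetic behind both claims is short (the uncorrected figure assumes the metric tests are independent):

```python
n_metrics, alpha = 20, 0.05

# Chance of at least one false positive across 20 independent metric tests:
p_any = 1 - (1 - alpha) ** n_metrics             # about 64%

# Bonferroni: test each metric at alpha / n_metrics instead.
bonferroni = alpha / n_metrics                    # 0.0025 per metric
p_any_corrected = 1 - (1 - bonferroni) ** n_metrics
print(f"{p_any:.0%} uncorrected vs {p_any_corrected:.1%} with Bonferroni")
```

Bonferroni restores the family-wide error rate to (just under) the nominal 5%, at the cost of larger samples per metric.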
Standard fixed-horizon tests should not be stopped early because p-values are unreliable until the planned sample is reached. If you need to monitor results continuously, use sequential testing methods (like group sequential designs or always-valid confidence intervals) that account for repeated analysis.
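A small simulation makes the peeking problem concrete (parameters are illustrative): run an A/A test with no real difference and check the two-proportion z-statistic after every batch. The "any peek crosses 1.96" rate lands far above the nominal 5%, while checking only once at the end stays near it.

```python
import math
import random

random.seed(1)
P, PEEKS, STEP, SIMS = 0.05, 10, 500, 400  # A/A test: both arms convert at 5%

def z_stat(ca, cb, n):
    """Two-proportion z-statistic with pooled variance, n users per arm."""
    p_pool = (ca + cb) / (2 * n)
    se = math.sqrt(2 * p_pool * (1 - p_pool) / n)
    return 0.0 if se == 0 else (ca - cb) / (n * se)

peek_hits = final_hits = 0
for _ in range(SIMS):
    ca = cb = 0
    crossed = False
    for k in range(1, PEEKS + 1):
        ca += sum(random.random() < P for _ in range(STEP))
        cb += sum(random.random() < P for _ in range(STEP))
        if abs(z_stat(ca, cb, k * STEP)) > 1.96:
            crossed = True       # would have stopped early at this peek
    peek_hits += crossed
    final_hits += abs(z_stat(ca, cb, PEEKS * STEP)) > 1.96

print(f"check once at the end: {final_hits / SIMS:.1%} false positives")
print(f"peek after every batch: {peek_hits / SIMS:.1%} false positives")
```

Sequential methods work by widening the per-peek threshold so the overall false-positive rate stays at the planned level.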