Statistical Significance Calculator

Test whether your A/B test results are statistically significant using a two-proportion Z-test. See p-value, confidence interval, and effect size.

About the Statistical Significance Calculator

After running an A/B test, you need to determine whether the observed difference between your control and variant is real or simply due to random chance. Statistical significance testing answers this question by calculating the probability that the observed difference (or a larger one) would occur if there were actually no difference between the two versions.

The standard approach for comparing two conversion rates is the two-proportion Z-test. It computes a Z-statistic from the observed rates and sample sizes, then converts it to a p-value — the probability of seeing such a result under the null hypothesis of no difference. A p-value below your significance threshold (typically 0.05) means the result is statistically significant and unlikely to be due to chance alone.
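The Z-to-p conversion described above uses the standard normal CDF, which can be written with the error function. A minimal Python sketch (the function name is ours, for illustration):

```python
import math

def two_tailed_p(z: float) -> float:
    """Two-tailed p-value for a Z-statistic under the standard normal.

    Phi(z) = 0.5 * (1 + erf(z / sqrt(2))) is the standard normal CDF.
    """
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
```

For example, `two_tailed_p(1.96)` returns roughly 0.05, which is why 1.96 is the familiar critical value for the 5% significance threshold.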

This calculator takes your A/B test results (visitors and conversions for each variant), performs the two-proportion Z-test, and reports the Z-statistic, p-value, confidence interval for the difference, and a clear verdict on significance. It helps you make confident decisions about whether to implement the winning variant.

Why Use This Statistical Significance Calculator?

Declaring A/B test winners without proper significance testing is a recipe for implementing changes that don't actually work. This calculator gives you the statistical rigor to separate real effects from noise. It produces the p-value, confidence interval, and relative lift with clear pass/fail verdicts, so you can make data-driven decisions with confidence.

How to Use This Calculator

  1. Enter the number of visitors (sample size) for the control group.
  2. Enter the number of conversions in the control group.
  3. Enter the number of visitors for the variant (treatment) group.
  4. Enter the number of conversions in the variant group.
  5. Set your significance threshold (default 5% / 95% confidence).
  6. Review the Z-statistic, p-value, confidence interval, and significance verdict.

Formula

p̂₁ = x₁ / n₁,  p̂₂ = x₂ / n₂
p̅ = (x₁ + x₂) ÷ (n₁ + n₂)  [pooled proportion]
Z = (p̂₂ − p̂₁) ÷ √(p̅(1 − p̅)(1/n₁ + 1/n₂))
CI = (p̂₂ − p̂₁) ± Zα/2 × √(p̂₁(1 − p̂₁)/n₁ + p̂₂(1 − p̂₂)/n₂)
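The formulas above translate directly into code. Here is a minimal Python sketch of the calculation (the function names and return shape are ours, not the calculator's actual implementation):

```python
import math

def phi(z: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def two_proportion_z_test(x1, n1, x2, n2, z_crit=1.96):
    """Two-proportion Z-test: control (x1/n1) vs variant (x2/n2).

    Returns (z, p_value, ci_low, ci_high) for the difference p2 - p1.
    z_crit=1.96 corresponds to a 95% confidence interval.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                       # p-bar
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se_pooled
    p_value = 2 * (1 - phi(abs(z)))                      # two-tailed
    # The CI uses the unpooled standard error
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_crit * se
    return z, p_value, (p2 - p1) - margin, (p2 - p1) + margin
```

Note the two standard errors: the Z-statistic pools the proportions (valid under the null hypothesis of no difference), while the confidence interval uses each group's own variance.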

Example Calculation

Result: p-value = 0.012, Significant at 95%

Control: 500/10,000 = 5.00%. Variant: 580/10,000 = 5.80%. The difference of 0.80 percentage points (16.0% relative lift) gives Z = 2.50 and p-value = 0.012. Since p < 0.05, this result is statistically significant at the 95% confidence level. The 95% CI for the difference is [0.17%, 1.43%], confirming the variant outperforms the control.

Tips & Best Practices

Interpreting P-Values Correctly

The p-value is the probability of observing results as extreme as yours if the null hypothesis (no difference) were true. It is NOT the probability that the null hypothesis is true. A p-value of 0.03 doesn't mean there's a 97% chance the variant is better; it means the data would be unlikely (3% chance) if there were no real difference. This distinction is crucial for proper interpretation.

Common Pitfalls in Significance Testing

Peeking at results during the test and stopping early when significance is reached inflates false positive rates dramatically. Multiple comparison problems occur when testing many metrics without correction. Simpson's paradox can make overall results misleading when subgroups have different patterns. Always pre-register your hypothesis, primary metric, sample size, and analysis plan.
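The peeking problem can be demonstrated with a small A/A simulation: both arms share the same true rate, so every "significant" result is a false positive. This sketch uses illustrative parameters of our own choosing, with the same total sample size in both conditions:

```python
import math
import random

def z_test_p(x1, n1, x2, n2):
    """Two-tailed p-value from a two-proportion Z-test."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = (x2 / n2 - x1 / n1) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def false_positive_rate(looks, n_per_look, p=0.1, sims=500, alpha=0.05, seed=42):
    """Fraction of A/A tests declared significant when stopping at the
    first look whose p-value drops below alpha (looks=1 means no peeking)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        c = v = n = 0                     # cumulative conversions and visitors
        for _ in range(looks):
            n += n_per_look
            c += sum(rng.random() < p for _ in range(n_per_look))
            v += sum(rng.random() < p for _ in range(n_per_look))
            if z_test_p(c, n, v, n) < alpha:
                hits += 1
                break                     # peeking: stop as soon as "significant"
    return hits / sims

# Same total sample size (2,000 per arm), very different error rates:
peeking = false_positive_rate(looks=10, n_per_look=200)
single = false_positive_rate(looks=1, n_per_look=2000)
```

With ten interim looks, the false positive rate climbs well above the nominal 5%, even though every individual test uses the same 0.05 threshold.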

Beyond Significance: Effect Size and Practical Impact

Report effect sizes (relative lift, absolute difference) alongside p-values. A 20% relative lift that's significant is more actionable than a 2% lift that's also significant. Combine statistical results with business context: implementation cost, opportunity cost, and long-term strategic value should all factor into the decision of whether to ship the winning variant.

Frequently Asked Questions

What is statistical significance?

Statistical significance means the observed result is unlikely to have occurred by random chance alone. In A/B testing, it means the conversion rate difference between control and variant is probably real, not noise. The p-value quantifies this: a p-value of 0.03 means there's only a 3% chance of seeing a difference at least this large if the variants were identical.

What p-value threshold should I use?

The standard threshold is 0.05 (5%), meaning you accept a 5% chance of false positives. For high-stakes decisions (pricing changes, major redesigns), some teams use 0.01 (1%). For exploratory tests, 0.10 may be acceptable. The key is to set the threshold before running the test and stick to it.

What is a confidence interval?

A confidence interval gives a range of plausible values for the true difference between variants. A 95% CI of [0.5%, 1.5%] means you're 95% confident the true difference lies between 0.5 and 1.5 percentage points. If the CI includes zero, the result is not significant at that confidence level.
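The zero-inclusion check described above is straightforward to apply in code. A minimal sketch (function names are ours, for illustration):

```python
import math

def diff_ci(x1, n1, x2, n2, z_crit=1.96):
    """CI for the difference in proportions p2 - p1 (z_crit=1.96 gives 95%)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p2 - p1) - z_crit * se, (p2 - p1) + z_crit * se

def is_significant(ci):
    """Significant at the CI's confidence level iff the CI excludes zero."""
    low, high = ci
    return low > 0 or high < 0
```

For example, `diff_ci(500, 10000, 580, 10000)` gives an interval of roughly [0.17%, 1.43%], which excludes zero, while a smaller difference such as 510 vs 500 conversions produces an interval straddling zero.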

Can a result be significant but not meaningful?

Absolutely. With very large sample sizes, even tiny differences can be statistically significant. A 0.01% conversion lift might be significant with millions of users but isn't worth implementing. Always consider practical significance (is the effect large enough to matter?) alongside statistical significance.

What if my test is not significant?

A non-significant result means you failed to detect a meaningful difference, not that there is no difference. The test may have been underpowered. Check whether you reached the planned sample size. If yes, the change likely has no practical impact. If no, consider running longer or accepting a larger MDE.

Should I use one-tailed or two-tailed tests?

Two-tailed tests are the standard because they detect both improvements and degradations. Use them unless you have a strong prior that the variant can only improve the metric (rare in practice). This calculator uses a two-tailed test, which is more conservative and appropriate for most A/B testing scenarios.

Related Pages