Calculate the expected loss of choosing a variant over the control in a Bayesian A/B test. Quantify the risk of making the wrong decision from test data.
Expected loss measures the average conversion rate you'd sacrifice by choosing the wrong variant. While P(B > A) tells you the probability B is better, expected loss tells you the cost of being wrong. A test with 90% P(B > A) and an expected loss of 0.01% is safe to ship because even if B is worse, the damage is negligible.
This calculator computes expected loss using Monte Carlo simulation on Beta posterior distributions. Enter visitors and conversions for control and variant, and see the expected loss for each decision (choosing A vs. choosing B).
Expected loss is arguably the most practical Bayesian metric because it directly answers: "How much would we lose if we make the wrong call?" This risk-based framing enables earlier, more confident decisions than probability alone. Whether you are a beginner or an experienced professional, this free online tool provides instant, reliable results without manual computation, saving time and reducing the risk of costly errors in your planning and decision-making.
Probability alone is insufficient for decision-making. A 92% probability B wins with 0.01% expected loss is safer than a 98% probability with 0.5% expected loss. This calculator adds the risk dimension that makes Bayesian testing actionable. Having a precise figure at your fingertips empowers better planning and more confident decisions.
Expected Loss of choosing B = E[max(θ_A − θ_B, 0)]
Expected Loss of choosing A = E[max(θ_B − θ_A, 0)]
Computed via Monte Carlo: sample from the posterior Beta distributions and average the losses.
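The Monte Carlo computation above can be sketched in a few lines. This is a minimal illustration, not the calculator's actual implementation; it assumes uniform Beta(1, 1) priors on both conversion rates, and the function name `expected_loss` is our own.

```python
import numpy as np

def expected_loss(conv_a, n_a, conv_b, n_b, samples=200_000, seed=0):
    """Monte Carlo expected loss for each decision, assuming Beta(1, 1) priors."""
    rng = np.random.default_rng(seed)
    # Posterior for each arm: Beta(1 + conversions, 1 + non-conversions)
    theta_a = rng.beta(1 + conv_a, 1 + n_a - conv_a, samples)
    theta_b = rng.beta(1 + conv_b, 1 + n_b - conv_b, samples)
    # Loss of choosing B: how much worse B is than A, when it is worse (else 0)
    loss_choose_b = np.maximum(theta_a - theta_b, 0).mean()
    # Loss of choosing A: how much worse A is than B, when it is worse (else 0)
    loss_choose_a = np.maximum(theta_b - theta_a, 0).mean()
    return loss_choose_a, loss_choose_b

# Example with the numbers used below: control 150/5,000 vs. variant 175/5,000
loss_a, loss_b = expected_loss(150, 5000, 175, 5000)
```

Note a useful sanity check: since max(x, 0) − max(−x, 0) = x, the two losses always differ by exactly the posterior mean lift, loss_A − loss_B = E[θ_B − θ_A].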
Result: Expected loss of choosing B = 0.02%
Control: 150/5,000 = 3.0%. Variant: 175/5,000 = 3.5%. If we ship B, the expected conversion-rate loss, averaged over the posterior uncertainty (including the chance that A is actually better), is only 0.02 percentage points. This is a negligible risk, making it safe to ship B even though the test isn't 99% conclusive.
Expected loss completes the Bayesian decision framework. P(B > A) tells you direction, expected loss tells you magnitude of risk. Together, they provide everything needed for a rational shipping decision without arbitrary significance thresholds.
For a high-traffic e-commerce site: 0.05% expected loss is very safe, 0.1% is acceptable, 0.5% warrants caution. For low-traffic or high-AOV sites, convert to dollar terms for a more intuitive threshold. The key insight is that expected loss should be compared to the cost of waiting (opportunity cost of not shipping).
An interesting property: expected loss often reaches actionable levels before P(B > A) reaches 95%. A test might show P(B > A) = 88% but expected loss of only 0.01%. This means you can make confident shipping decisions earlier than frequentist or even probability-based Bayesian methods suggest.
It depends on your traffic and AOV. A common rule: ship the variant if the expected loss is below 0.1% of your baseline CR (a relative threshold). For a 3% baseline CR, that works out to 0.003 percentage points of absolute CR lost on average — effectively nothing.
P(B > A) measures how likely B is to be better. Expected loss measures how much you'd lose if B is actually worse. You can have high P(B > A) but also high expected loss (rare but large potential downside), or moderate P(B > A) with negligible loss.
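Both metrics fall out of the same posterior draws, which makes the contrast easy to see. The numbers here are hypothetical, chosen to illustrate a smaller, noisier test where P(B > A) is high but the expected loss is not yet negligible; uniform Beta(1, 1) priors are assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical noisy test: control 30/1,000 (3.0%) vs. variant 42/1,000 (4.2%)
theta_a = rng.beta(1 + 30, 1 + 970, 100_000)
theta_b = rng.beta(1 + 42, 1 + 958, 100_000)

prob_b_better = (theta_b > theta_a).mean()            # direction: P(B > A)
loss_choose_b = np.maximum(theta_a - theta_b, 0).mean()  # magnitude of risk
```

Here P(B > A) is already around 90%, yet the expected loss of choosing B is still an order of magnitude larger than in the 5,000-visitor example above — the small sample leaves room for a meaningful downside.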
Expected loss directly quantifies risk in the same units as your metric (conversion rate). A stakeholder can understand "we'd lose about 0.02% CR on average" better than "p = 0.03" or "P(B>A) = 93%." It enables practical, risk-based decision-making.
Yes. Multiply expected loss (as a proportion) by traffic and AOV. If expected loss is 0.0002 (0.02%) and you have 100,000 monthly visitors at $80 AOV, the expected revenue loss is 100,000 × 0.0002 × $80 = $1,600/month. Very manageable.
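The conversion is simple multiplication; a one-line helper (our own naming, not part of the calculator) makes the units explicit:

```python
def loss_in_dollars(expected_loss_cr, monthly_visitors, aov):
    """Convert expected CR loss (a proportion, e.g. 0.0002 for 0.02%)
    into an expected monthly revenue loss in dollars."""
    return expected_loss_cr * monthly_visitors * aov

# 0.02% expected CR loss, 100,000 visitors/month, $80 average order value
monthly_risk = loss_in_dollars(0.0002, 100_000, 80)  # ≈ $1,600/month
```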
Yes. The same framework applies to any metric. For revenue per visitor, the expected loss is in dollars per visitor. The concept generalizes to any posterior distribution and any loss function.
Expected loss assumes your posterior accurately represents uncertainty. With very small samples (<100 per group) or extreme conversion rates (<0.1%), the posterior may not be well-calibrated. In such cases, gather more data before relying on expected loss.