Perform the Wilcoxon rank-sum (Mann-Whitney U) test for non-parametric comparison of two groups. Get U statistic, z-score, p-value, effect size, and rank table.
The Wilcoxon rank-sum test (also called the Mann-Whitney U test) is the non-parametric alternative to the independent two-sample t-test. It compares two groups without assuming normality, instead working with the ranks of the combined data to test whether one group tends to produce larger values than the other.
This calculator takes raw data from two groups, computes the Mann-Whitney U statistic, performs a z-approximation for the p-value, and provides effect sizes and the Hodges-Lehmann median difference estimator. A complete rank table shows every observation's assigned rank.
The Wilcoxon rank-sum test is ideal when data is ordinal (Likert scales, rankings), heavily skewed, contains outliers, or comes from small samples where normality cannot be verified. It's widely used in biomedical research, psychology, ecology, and quality control. Before reporting results, cross-check the output against a known reference case and use the worked steps to verify rounding and units.
The t-test assumes approximately normal distributions, an assumption often violated with small samples, skewed data, or ordinal measurements. The Wilcoxon rank-sum test makes no distributional assumptions and is robust to outliers because it uses ranks rather than raw values. This calculator automates the ranking, tie handling, and p-value computation.
Mann-Whitney U Statistic:
U₁ = R₁ − n₁(n₁+1)/2
U₂ = R₂ − n₂(n₂+1)/2
U = min(U₁, U₂)

Z Approximation (for large samples):
z = (U₁ − μ_U) / σ_U
μ_U = n₁n₂/2
σ_U = √(n₁n₂(N+1)/12)

Effect Size:
r = |z| / √N

Hodges-Lehmann Estimator:
Median of all n₁×n₂ pairwise differences
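As a sanity check, the formulas above can be reproduced in a few lines of Python. The sample data here is illustrative (not the calculator's example), and no tie correction is applied:

```python
import numpy as np
from scipy import stats

# Illustrative data, not the calculator's worked example
group1 = [12, 15, 14, 10, 18, 20, 16]
group2 = [8, 9, 11, 7, 13, 10]

n1, n2 = len(group1), len(group2)
N = n1 + n2

# Rank the combined sample; ties receive average ranks
ranks = stats.rankdata(group1 + group2)
R1 = ranks[:n1].sum()

# U statistics, per the formulas above (U1 + U2 = n1*n2)
U1 = R1 - n1 * (n1 + 1) / 2
U2 = n1 * n2 - U1
U = min(U1, U2)          # U1 = 38.5, so U = 3.5

# Normal approximation (no tie correction here)
mu_U = n1 * n2 / 2
sigma_U = np.sqrt(n1 * n2 * (N + 1) / 12)
z = (U1 - mu_U) / sigma_U          # z = 2.5 for this data
p = 2 * stats.norm.sf(abs(z))      # two-tailed p ~ 0.0124

# Effect size
r = abs(z) / np.sqrt(N)
```

For production use, `scipy.stats.mannwhitneyu` performs the same computation and also offers an exact method for small samples.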
Result: U = 14.5, z = 3.09, p = 0.002
Group 1 (median = 7) has significantly higher ranks than Group 2 (median = 5). U = 14.5 with z = 3.09 gives p = 0.002 (two-tailed), indicating a statistically significant difference. The effect size r = 0.69 suggests a large effect. The Hodges-Lehmann estimate of the median shift is 3.0.
Rank-based tests replace raw observations with their ranks in the combined sample, making the analysis robust to outliers and distributional assumptions. If a single outlier changes a value from 100 to 10,000, the rank changes by at most one position. This robustness comes at a small cost in power: when data truly is normal, the Wilcoxon test is about 95.5% as efficient as the t-test.
Tied observations receive the average rank. For example, if observations at positions 3, 4, and 5 all share the same value, each receives rank 4. When there are many ties, a correction factor adjusts the variance of the U statistic. With discrete data (like Likert scales), ties are common and the correction becomes important.
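The averaging rule in the example above can be checked directly; `rankdata` uses average ranks by default:

```python
from scipy.stats import rankdata

# The three 5s occupy sorted positions 3, 4, 5, so each gets rank (3+4+5)/3 = 4
values = [1, 2, 5, 5, 5, 9]
ranks = rankdata(values)   # method='average' is the default
# ranks -> [1. 2. 4. 4. 4. 6.]
```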
The Hodges-Lehmann estimator is the median of all pairwise differences d = x₁ᵢ − x₂ⱼ. It estimates the shift in location between the two distributions. Unlike the mean difference, it's resistant to outliers. A confidence interval for this estimator can be constructed using the distribution of U, providing a non-parametric analog to the confidence interval from a t-test.
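A minimal sketch of the estimator (the `hodges_lehmann` helper name and the data are illustrative):

```python
import numpy as np

def hodges_lehmann(x1, x2):
    """Median of all n1*n2 pairwise differences x1_i - x2_j."""
    diffs = np.subtract.outer(np.asarray(x1, float), np.asarray(x2, float))
    return float(np.median(diffs))

# 3x3 = 9 pairwise differences: -1, 1, 2, 3, 4, 4, 5, 7, 8 -> median 4
shift = hodges_lehmann([10, 12, 15], [7, 8, 11])
```

Because it is a median of differences, a single wild observation perturbs only a few of the n₁×n₂ differences and rarely moves the estimate much.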
They're the same test with different formulations. Wilcoxon uses rank sums directly; Mann-Whitney uses U statistics. They always give the same p-value. The names are used interchangeably.
When your data is ordinal, heavily skewed, contains outliers, or when sample sizes are too small to verify normality (n < 15-20 per group). If data is reasonably normal, the t-test is slightly more powerful.
U₁ counts the number of pairs in which a value from group 1 exceeds a value from group 2 (with ties counting as ½). U₁ close to n₁×n₂ or close to 0 means one group dominates the other, so the groups barely overlap; U₁ near n₁n₂/2 suggests extensive overlap. Equivalently, the reported U = min(U₁, U₂) is small when the groups are well separated and approaches n₁n₂/2 when they overlap.
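The counting interpretation can be sketched directly by comparing all n₁×n₂ pairs (the `u_by_counting` helper name is illustrative):

```python
import numpy as np

def u_by_counting(x1, x2):
    """U1 = number of (i, j) pairs with x1_i > x2_j, ties counting 1/2."""
    x1 = np.asarray(x1, float)
    x2 = np.asarray(x2, float)
    greater = (x1[:, None] > x2[None, :]).sum()
    ties = (x1[:, None] == x2[None, :]).sum()
    return greater + 0.5 * ties

# Complete separation: every group-1 value beats every group-2 value,
# so U1 = n1*n2 = 9 and min(U1, U2) = 0
u_sep = u_by_counting([10, 11, 12], [1, 2, 3])

# Overlapping groups with one tie (the pair 10 vs 10 contributes 1/2)
u_mix = u_by_counting([12, 15, 14, 10, 18, 20, 16], [8, 9, 11, 7, 13, 10])
```

This pairwise count agrees with the rank-sum formula U₁ = R₁ − n₁(n₁+1)/2; the two are just different bookkeeping for the same quantity.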
No, this test is for independent groups. For paired data, use the Wilcoxon signed-rank test instead, which is the non-parametric alternative to the paired t-test.
When two or more observations have the same value, they receive the average of the ranks they would have occupied. A tie correction factor can be applied to the variance formula for more accurate z-approximations.
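The standard tie correction replaces σ_U = √(n₁n₂(N+1)/12) with σ_U = √(n₁n₂/12 · [(N+1) − Σ(tⱼ³ − tⱼ)/(N(N−1))]), where tⱼ is the size of the j-th group of tied values. A sketch (the helper name is hypothetical):

```python
import numpy as np
from collections import Counter

def sigma_u_tie_corrected(x1, x2):
    """Std. dev. of U under H0 with the standard tie correction."""
    n1, n2 = len(x1), len(x2)
    N = n1 + n2
    # Sizes of each group of tied values in the combined sample
    t = np.array(list(Counter(list(x1) + list(x2)).values()))
    correction = (t**3 - t).sum() / (N * (N - 1))
    return float(np.sqrt(n1 * n2 / 12 * ((N + 1) - correction)))

sigma_no_ties = sigma_u_tie_corrected([1, 2, 3], [4, 5, 6])   # reduces to the basic formula
sigma_ties = sigma_u_tie_corrected([1, 1, 2], [2, 2, 3])      # smaller: ties shrink the variance
```

With no ties the correction term is zero and the basic formula is recovered; the more ties there are, the more the variance shrinks, which is why ignoring the correction with Likert-type data inflates p-values.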
It's the median of all n₁×n₂ pairwise differences between the two groups. It provides a robust, distribution-free estimate of the location shift between groups and serves as the non-parametric confidence interval center.