Pearson Correlation Calculator

Calculate Pearson's r with step-by-step computation, Fisher z confidence intervals, t-test significance, covariance, and a full deviation table.

About the Pearson Correlation Calculator

Pearson's r is the most widely used measure of linear association between two variables. Our calculator doesn't just give you the number — it shows the complete step-by-step computation: deviations from means, cross-products, sums of squares, and the final formula. You can verify every calculation by hand.

Beyond the coefficient, get Fisher z-transform confidence intervals (90%, 95%, 99%), t-test significance testing with customizable α levels, covariance, and comprehensive interpretation guidance. The computation table shows (x−x̄), (y−ȳ), their product, and squared deviations for each data point.

Preset datasets cover classic scenarios: height vs weight (strong positive), IQ vs income (moderate), temperature vs heating costs (strong negative). Each demonstrates a different correlation strength and direction. Before reporting a result, run a preset or another reference case with a known answer, and use the displayed steps to confirm rounding and units.

Why Use This Pearson Correlation Calculator?

This calculator's step-by-step computation is its key differentiator. Instead of a black-box r value, you see every deviation, cross-product, and sum. This is invaluable for statistics students learning the formula, researchers verifying software output, and anyone who values transparency.

The Fisher z confidence interval is a feature most basic calculators omit but is essential for proper statistical inference. Knowing r = 0.65 is less useful than knowing 95% CI [0.31, 0.85] — the interval communicates uncertainty.

How to Use This Calculator

  1. Enter paired X and Y values (comma-separated, same count).
  2. Or click a preset to load example relationships.
  3. Select your significance level α (0.01, 0.05, or 0.10).
  4. Review Pearson r, R², and confidence interval.
  5. Check the t-statistic and significance result.
  6. Study the computation steps table to verify the math.
  7. Use the interpretation guide for your specific r value.
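
The computation table from step 6 can also be reproduced outside the calculator. The following is a minimal Python sketch, not the calculator's own code; the function name `deviation_table` and the toy data are hypothetical and are not one of the presets.

```python
def deviation_table(xs, ys):
    """Return per-point rows (dx, dy, dx*dy, dx^2, dy^2) and the column sums."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    rows = []
    for x, y in zip(xs, ys):
        dx, dy = x - mean_x, y - mean_y
        rows.append((dx, dy, dx * dy, dx * dx, dy * dy))
    # Sum each column: the last three sums feed the formula for r.
    sums = tuple(sum(col) for col in zip(*rows))
    return rows, sums

# Hypothetical toy data (assumption, for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

rows, sums = deviation_table(xs, ys)
for row in rows:
    print("  ".join(f"{value:8.3f}" for value in row))

sxy, sxx, syy = sums[2], sums[3], sums[4]
r = sxy / (sxx * syy) ** 0.5
print(f"r = {r:.4f}")
```

For this toy data the cross-products sum to 6 and the squared deviations sum to 10 and 6, so r = 6/√60.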

Formula

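r = Σ(xᵢ−x̄)(yᵢ−ȳ) / √[Σ(xᵢ−x̄)²·Σ(yᵢ−ȳ)²]. t = r√(n−2)/√(1−r²), df = n−2. Fisher z = ½·ln((1+r)/(1−r)), SE_z = 1/√(n−3). CI for ρ = tanh(z ± z_crit·SE_z), where z_crit is the standard normal critical value (1.96 for 95%).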
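
As a rough illustration, the formulas above can be chained in a few lines of Python using only the standard library. This is a sketch under assumptions: the function name `pearson_summary`, the returned keys, and the toy data are hypothetical, and the confidence interval uses the standard back-transform tanh(z ± z_crit·SE_z).

```python
import math
from statistics import NormalDist

def pearson_summary(xs, ys, confidence=0.95):
    """Return r, t, df and a Fisher z confidence interval for paired data."""
    n = len(xs)
    if n != len(ys) or n < 4:
        raise ValueError("need equal-length inputs with at least 4 pairs")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    r = sxy / math.sqrt(sxx * syy)
    if abs(r) >= 1:
        raise ValueError("t and Fisher z are undefined when |r| = 1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r * r)
    z = 0.5 * math.log((1 + r) / (1 - r))      # Fisher z-transform
    se_z = 1 / math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Back-transform the interval from the z scale to the r scale.
    ci = (math.tanh(z - z_crit * se_z), math.tanh(z + z_crit * se_z))
    return {"r": r, "t": t, "df": df, "z": z, "se_z": se_z, "ci": ci}

# Hypothetical toy data (assumption, for illustration only).
result = pearson_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(result)
```

The p-value for t requires the Student t distribution, which the standard library does not provide, so the sketch stops at the t-statistic and degrees of freedom.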

Example Calculation

Result: r = 0.9972, R² = 0.9945, t = 35.58 (p < 0.001), 95% CI [0.9872, 0.9994]

Height and weight show a very strong positive linear correlation. About 99.45% of the variation in weight is linearly associated with height. The Fisher z CI indicates, with 95% confidence, that the population correlation ρ lies between 0.987 and 0.999.

Tips & Best Practices

Deriving Pearson's r

Pearson's r is the ratio of covariance to the product of standard deviations: r = Cov(X,Y)/(SD_X · SD_Y). Substituting Cov(X,Y) = Σ(xᵢ−x̄)(yᵢ−ȳ)/(n−1), SD_X = √[Σ(xᵢ−x̄)²/(n−1)], and SD_Y = √[Σ(yᵢ−ȳ)²/(n−1)], the (n−1) factors cancel and we get the familiar formula. The denominator normalizes the covariance to the [−1, +1] range regardless of variable scales.
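
The equivalence of the two forms, and the scale invariance, can be checked numerically. The sketch below is illustrative only; the function names `r_from_covariance` and `r_direct` and the data are hypothetical.

```python
import math
from statistics import mean, stdev

def r_from_covariance(xs, ys):
    """r = Cov(X,Y) / (SD_X * SD_Y), using the sample (n-1) versions."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return cov / (stdev(xs) * stdev(ys))

def r_direct(xs, ys):
    """r from sums of cross-products and squared deviations."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data (assumption, for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Both forms agree because the (n-1) factors cancel.
print(r_from_covariance(xs, ys), r_direct(xs, ys))

# Rescaling x (e.g. metres to centimetres) and shifting y leaves r unchanged.
xs_cm = [100 * x for x in xs]
ys_shifted = [y + 50 for y in ys]
print(r_direct(xs_cm, ys_shifted))
```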

If all points fall exactly on a line with positive slope, every (xᵢ−x̄)(yᵢ−ȳ) term is non-negative (zero only for a point at the means), and r = +1. If the line has negative slope, every term is non-positive, giving r = −1. Any departure from a straight line yields |r| < 1, and when the terms are a mix of positive and negative values they partially cancel, pulling r toward 0.
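
A quick numeric check of the limiting cases, as a sketch with made-up data (the helper `pearson_r` and all values are hypothetical):

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of cross-products and squared deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [0, 1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]      # exact line, positive slope
falling = [-3 * x + 10 for x in xs]   # exact line, negative slope
bumped = [1, 3, 5, 9, 9, 11]          # rising line with one point moved off it

print(pearson_r(xs, rising))    # +1 up to floating-point rounding
print(pearson_r(xs, falling))   # -1 up to floating-point rounding
print(pearson_r(xs, bumped))    # strictly between 0 and 1
```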

Hypothesis Testing for ρ

The null hypothesis H₀: ρ = 0 is tested using t = r√(n−2)/√(1−r²) with n−2 degrees of freedom. Reject H₀ when |t| > t_critical. For testing H₀: ρ = ρ₀ (some non-zero value), convert to z-scores: z = (z_r − z_ρ₀) / SE_z, where z_r = Fisher transform of r and SE_z = 1/√(n−3).
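
The test against a non-zero ρ₀ can be sketched with the standard library alone. The function name `fisher_z_test` and the example inputs (r = 0.65, n = 20, ρ₀ = 0.30) are assumptions chosen for illustration; the two-sided p-value comes from the standard normal distribution.

```python
import math
from statistics import NormalDist

def fisher_z_test(r, n, rho0):
    """Two-sided z test of H0: rho = rho0 using the Fisher transform."""
    if n < 4:
        raise ValueError("n must be at least 4 so that SE_z = 1/sqrt(n-3) exists")
    if not (-1 < r < 1 and -1 < rho0 < 1):
        raise ValueError("r and rho0 must lie strictly between -1 and 1")
    se_z = 1 / math.sqrt(n - 3)
    # math.atanh(r) equals 0.5 * ln((1 + r) / (1 - r)), the Fisher transform.
    z = (math.atanh(r) - math.atanh(rho0)) / se_z
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical inputs (assumption, for illustration only).
z, p_value = fisher_z_test(r=0.65, n=20, rho0=0.30)
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
```

With these inputs z is just under 1.96, so H₀: ρ = 0.30 would not be rejected at α = 0.05.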

Effect Size Interpretation

In psychology and social sciences, Cohen's guidelines classify r = 0.10 as small, 0.30 as medium, 0.50 as large. In medical research, r = 0.30 might be clinically meaningful. In physics, r < 0.99 might indicate measurement error. Always interpret r in context, not by universal cutoffs.

Frequently Asked Questions

What is Pearson's r and what is its range?

Pearson's r measures the strength and direction of linear association. It ranges from −1 (perfect negative) through 0 (no linear relationship) to +1 (perfect positive). Only linear relationships are captured: a parabola sampled symmetrically about its vertex gives r = 0 even though y is fully determined by x.
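
The parabola case is easy to check. In the sketch below (hypothetical helper `pearson_r`, made-up data), the same curve y = x² gives r = 0 when x is symmetric about the vertex and a large r when only one arm is sampled.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of cross-products and squared deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Symmetric sampling around the vertex at x = 0.
xs_symmetric = [-3, -2, -1, 0, 1, 2, 3]
r_symmetric = pearson_r(xs_symmetric, [x * x for x in xs_symmetric])

# Only the right-hand arm of the same parabola.
xs_one_arm = [0, 1, 2, 3, 4, 5, 6]
r_one_arm = pearson_r(xs_one_arm, [x * x for x in xs_one_arm])

print(f"symmetric: r = {r_symmetric:.4f}")
print(f"one arm:   r = {r_one_arm:.4f}")
```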

What's the Fisher z-transform for?

Raw r has a skewed sampling distribution, especially near ±1. Fisher's z-transform converts r to a z value that is approximately normally distributed with standard error 1/√(n−3), enabling confidence intervals and hypothesis tests for the population correlation ρ.

What assumptions does Pearson's r require?

Ideally: (1) both variables are continuous, (2) relationship is linear, (3) no extreme outliers, (4) bivariate normality for significance tests. It's robust to mild violations of normality with n ≥ 30.

How does Pearson differ from Spearman?

Pearson measures linear association using raw values. Spearman measures monotonic association using ranks. For linear relationships, both give similar results. For nonlinear monotonic relationships (log, exponential), Spearman is typically higher because ranks ignore the curvature.
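
The difference shows up clearly on an exponential curve. The following sketch computes Spearman's coefficient as Pearson's r on ranks; the helper names are hypothetical, the data are made up, and the simple `ranks` function assumes there are no tied values.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of cross-products and squared deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman's rho as Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Perfectly monotonic but strongly curved relationship.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [math.exp(x) for x in xs]

print(f"Pearson  r   = {pearson_r(xs, ys):.3f}")
print(f"Spearman rho = {spearman_rho(xs, ys):.3f}")
```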

Why does my confidence interval seem so wide?

CI width depends on n and |r|. With n=10 and r=0.50, the 95% Fisher z CI is roughly [−0.19, 0.86]. With n=40 it narrows to about [0.22, 0.70], so you need about n=40 for useful CIs with moderate correlations. Larger samples give narrower CIs.
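
To see how the interval narrows with sample size, the Fisher z interval can be tabulated for a fixed r. This is a minimal sketch; the function name `fisher_ci` and the chosen sample sizes are assumptions for illustration.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, confidence=0.95):
    """Fisher z confidence interval for the population correlation."""
    if n < 4 or not -1 < r < 1:
        raise ValueError("need n >= 4 and -1 < r < 1")
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit / math.sqrt(n - 3)   # z_crit * SE_z on the z scale
    z = math.atanh(r)
    return math.tanh(z - half_width), math.tanh(z + half_width)

for n in (10, 20, 40, 100):
    lower, upper = fisher_ci(0.50, n)
    print(f"n={n:3d}  95% CI [{lower:+.2f}, {upper:+.2f}]  width {upper - lower:.2f}")
```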

Can one outlier change r dramatically?

Yes. A single extreme point can inflate r from 0.1 to 0.8 or deflate it from 0.9 to 0.3. Always check for outliers before interpreting Pearson's r, or use Spearman's rank correlation instead.
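
The effect is easy to reproduce with made-up numbers. In this sketch (hypothetical helper `pearson_r`, invented data), five points with no linear association gain a very high r once a single extreme pair is appended.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of cross-products and squared deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented data with essentially zero linear association.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 3, 1, 2]
r_before = pearson_r(xs, ys)

# Append one extreme point far from the cluster.
r_after = pearson_r(xs + [20], ys + [20])

print(f"without outlier: r = {r_before:.3f}")
print(f"with outlier:    r = {r_after:.3f}")
```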
