Compute the best-fit line using least squares regression. Enter up to 10 data points to get slope, intercept, R², correlation coefficient, standard error, residuals table, and predictions.
Least squares regression is the most widely used method for fitting a straight line to a set of data points. Given a collection of (x, y) observations, the method finds the unique line ŷ = mx + b that minimizes the sum of the squared differences between the observed y values and the predicted ŷ values. The result is the "best fit" in the least-squares sense — no other straight line produces a smaller total squared error.
The slope m tells you how much y changes on average for each one-unit increase in x. The intercept b is the predicted y value when x is zero. Together they form the regression equation, which you can use to interpolate within your data range or cautiously extrapolate beyond it.
The coefficient of determination R² measures how well the line explains the variation in your data: R² = 1 means a perfect fit while R² = 0 means the line explains none of the variability. The correlation coefficient r captures both the strength and direction of the linear relationship, ranging from −1 (perfect negative) through 0 (no linear trend) to +1 (perfect positive). The standard error of the estimate quantifies the average scatter of data points around the line.
This calculator supports up to 10 data points and instantly computes slope, intercept, R², r, and standard error. A residuals table shows the observed value, predicted value, and residual for every point — with color coding for positive and negative deviations. An R² strength bar gives an at-a-glance quality rating. Eight preset datasets — from study hours vs grades to altitude vs temperature — let you explore regression concepts interactively. You can also enter an x value to get the corresponding prediction on the best-fit line. Whether you are learning statistics, analyzing experimental data, or building a quick predictive model, this tool has you covered.
Least squares regression problems often require several dependent steps, and a small arithmetic slip can propagate through every derived value. This calculator is tailored to that workflow: you enter your data points (optionally with an x value at which to predict ŷ and a decimal-places setting), and it returns the slope (m), intercept (b), R² (coefficient of determination), correlation coefficient (r), and standard error in one consistent pass. It is useful for homework checks, worksheet generation, tutoring walkthroughs, and fast field estimates where you need reliable regression results without rebuilding the full derivation each time.
Slope: m = (n·Σxy − Σx·Σy) / (n·Σx² − (Σx)²). Intercept: b = (Σy − m·Σx) / n. Coefficient of determination: R² = 1 − SS_res / SS_tot. Correlation: r = sign(m) × √R². Standard error: SE = √(SS_res / (n − 2)).
Data: (1,3), (2,5), (3,7), (4,9), (5,11). Slope = 2, intercept = 1, equation ŷ = 2x + 1, R² = 1.0 (perfect fit), r = 1.0.
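The worked example above can be reproduced directly from the closed-form formulas. The sketch below (plain Python, no libraries; the function name `least_squares` is our own) computes slope, intercept, R², r, and standard error for the sample data, then uses the fitted line for a prediction:

```python
# Least-squares fit of y = m*x + b via the closed-form normal equations:
# m = (n·Σxy − Σx·Σy) / (n·Σx² − (Σx)²), b = (Σy − m·Σx) / n
import math

def least_squares(xs, ys):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - m * sx) / n
    # Goodness of fit: R² = 1 − SS_res / SS_tot, r = sign(m)·√R²
    mean_y = sy / n
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    r = math.copysign(math.sqrt(r2), m)
    # Standard error of the estimate is defined only for n > 2
    se = math.sqrt(ss_res / (n - 2)) if n > 2 else float("nan")
    return m, b, r2, r, se

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
m, b, r2, r, se = least_squares(xs, ys)
print(f"ŷ = {m:g}x + {b:g}, R² = {r2:g}, r = {r:g}, SE = {se:g}")
print(m * 6 + b)  # predict at x = 6: 2·6 + 1 = 13
```

Because the data lie exactly on a line, SS_res is zero, so R² = 1 and the standard error is 0.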
Ordinary Least Squares (OLS) regression finds the unique line ŷ = mx + b that minimizes the sum of squared vertical residuals — the differences between each observed y value and the value predicted by the line. Why squared? Squaring makes every residual positive and penalizes large deviations more heavily than small ones, producing a single, differentiable objective function. Setting the partial derivatives with respect to m and b equal to zero yields the **normal equations**: m = (n·Σxy − Σx·Σy) / (n·Σx² − (Σx)²) and b = (Σy − m·Σx) / n. These closed-form formulas mean no iteration is needed — the best-fit line is computed directly from the data.
The **coefficient of determination R²** measures the proportion of variance in y explained by x. An R² of 0.85 means 85 % of the variability in the response is captured by the linear model; the remaining 15 % is unexplained scatter. R² = 1 − SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the total sum of squares around the mean. The **Pearson correlation coefficient r** is the signed square root of R²: it ranges from −1 (perfect negative trend) through 0 (no linear relationship) to +1 (perfect positive trend). The **standard error of the estimate** equals √(SS_res / (n − 2)) and measures the average scatter of points around the line in the same units as y.
The **residuals table** is often the most diagnostic output. A random scatter of positive and negative residuals around zero confirms the linear model is appropriate. Systematic patterns — a curve, a fan shape, or clustering — suggest the relationship is nonlinear, the variance is non-constant (heteroscedasticity), or an outlier is distorting the fit.
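A quick sketch of the diagnostic idea: fit a straight line to deliberately nonlinear data (y = x²) and tabulate the residuals. The `residuals_table` helper is hypothetical, but the slope and intercept below are the actual OLS values for this dataset, and the residual signs trace the U-shape that signals a nonlinear relationship:

```python
# Build an observed / predicted / residual table for a fitted line ŷ = m·x + b
def residuals_table(xs, ys, m, b):
    rows = []
    for x, y in zip(xs, ys):
        y_hat = m * x + b
        rows.append((x, y, y_hat, y - y_hat))
    return rows

xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]   # y = x², deliberately nonlinear
m, b = 6.0, -7.0         # OLS slope and intercept for this data
for x, y, y_hat, res in residuals_table(xs, ys, m, b):
    print(f"x={x}  y={y:5.1f}  ŷ={y_hat:5.1f}  residual={res:+5.1f}")
# Residual signs run +, −, −, −, + : a systematic U-shape, not random scatter
```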
Least squares regression is foundational across science, engineering, and business. In **physics**, it calibrates instruments by fitting a known signal against sensor output. In **economics**, it models how income predicts consumer spending or how advertising spend predicts sales. In **medicine**, dose-response curves and epidemiological trend lines are estimated by OLS. In **machine learning**, simple linear regression is the baseline model against which more complex algorithms are benchmarked.
The technique also underpins more advanced methods: multiple linear regression extends OLS to several predictors simultaneously; polynomial regression fits curved relationships by adding x², x³, … terms; weighted least squares downweights unreliable observations. Understanding the geometry of OLS — minimizing a sum of squared distances in the (x, y) plane — is the conceptual foundation for all of these generalizations.
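As a small illustration of one such generalization, here is a minimal weighted least squares sketch (our own function names; not the calculator's implementation). Each point gets a weight w; with all weights equal it reduces to ordinary least squares, and downweighting an outlier pulls the slope back toward the underlying trend:

```python
# Weighted least squares for a line: minimize Σ wᵢ·(yᵢ − m·xᵢ − b)².
# Closed form: m = (Σw·Σwxy − Σwx·Σwy) / (Σw·Σwx² − (Σwx)²), b = (Σwy − m·Σwx) / Σw
def weighted_least_squares(xs, ys, ws):
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    m = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)
    b = (swy - m * swx) / sw
    return m, b

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 20]    # underlying trend y = 2x + 1; last point is an outlier
m_eq, b_eq = weighted_least_squares(xs, ys, [1, 1, 1, 1, 1])       # plain OLS
m_dn, b_dn = weighted_least_squares(xs, ys, [1, 1, 1, 1, 0.05])    # outlier downweighted
print(f"equal weights: m = {m_eq:.2f}")   # outlier drags the slope well above 2
print(f"downweighted:  m = {m_dn:.2f}")   # slope recovers toward the true 2
```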
It means the line is chosen to minimize the sum of the squared vertical distances (residuals) between each data point and the line. Squaring ensures positive and negative deviations do not cancel out.
R² (coefficient of determination) ranges from 0 to 1. Values above 0.9 indicate an excellent linear fit; values below 0.5 suggest the linear model explains less than half the variability.
r is the Pearson correlation coefficient (−1 to +1) that shows direction and strength. R² = r² and shows the proportion of variance explained. R² is always non-negative.
A minimum of 2 points is required to define a line, but at least 5–10 points are recommended for statistically meaningful results. Treat small-sample fits as rough estimates rather than firm conclusions.
A low R² or a clear pattern in the residuals suggests the relationship is not linear. Consider polynomial, exponential, or logarithmic regression instead.
Yes. Enter an x value in the prediction field and the calculator returns the corresponding ŷ on the best-fit line.