Matthews Correlation Coefficient (MCC) Calculator

Calculate MCC from confusion matrix with full metrics: accuracy, F1, precision, recall, specificity, Cohen's kappa, informedness, markedness, and visual confusion matrix.

About the Matthews Correlation Coefficient (MCC) Calculator

The Matthews Correlation Coefficient (MCC) is the gold standard for evaluating binary classifiers, especially with imbalanced datasets. Unlike accuracy, which can be misleading when classes are unequal, MCC produces a balanced measure that accounts for all four quadrants of the confusion matrix.

Enter TP, TN, FP, and FN (or click a preset) and instantly get MCC alongside 13 other classification metrics: accuracy, balanced accuracy, precision, recall, specificity, F1, NPV, FPR, FNR, Cohen's kappa, informedness, markedness, and prevalence. The visual confusion matrix and step-by-step formula make results transparent.

MCC ranges from −1 (complete disagreement) through 0 (random) to +1 (perfect prediction). Try the "Imbalanced (95-5)" preset to see why accuracy = 94% can coexist with a mediocre MCC = 0.52: the dataset's imbalance inflates accuracy, while MCC reveals the true performance. Before reporting results, use the step-by-step formula to verify rounding and cross-check the output against a known reference case.

Why Use This Matthews Correlation Coefficient (MCC) Calculator?

In machine learning, reporting only accuracy and F1 can be deeply misleading. A spam filter with 99% accuracy might be letting 50% of spam through if only 2% of emails are spam. MCC cuts through such illusions by requiring balanced performance across all four confusion matrix quadrants.

This calculator's side-by-side comparison of 14 metrics exposes the relationships: you can see how the same confusion matrix produces different stories depending on which metric you read. The preset scenarios — including imbalanced classes and trivial classifiers — make these differences concrete and memorable.

How to Use This Calculator

  1. Enter TP (true positives), FP (false positives), FN (false negatives), and TN (true negatives).
  2. Or select a preset scenario to explore different classifier behaviors.
  3. Review the MCC value and its interpretation.
  4. Check the visual confusion matrix to verify your inputs.
  5. Compare MCC with accuracy, F1, and balanced accuracy.
  6. Examine the complete metrics table for a 360° view of classifier performance.
  7. Use the MCC interpretation guide to assess production-readiness.

Formula

MCC = (TP·TN − FP·FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN)). Range: [−1, +1], where −1 is perfect inverse prediction, 0 is random performance, and +1 is perfect prediction.
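The formula translates directly into code. A minimal sketch in plain Python (no external dependencies; function name is illustrative):

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator

# A balanced, fairly accurate classifier on 200 samples:
print(mcc(tp=85, tn=90, fp=10, fn=15))  # ≈ 0.75
```

Note that this sketch does not guard against a zero denominator, which occurs for degenerate classifiers that always predict one class.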

Example Calculation

Result: MCC = 0.7528, Accuracy = 87.5%, F1 = 0.8718, Precision = 89.47%, Recall = 85.0%

An MCC of 0.75 indicates strong agreement between predictions and reality. Despite a 12.5% error rate, the classifier handles both positives and negatives well, with slightly better specificity than sensitivity.
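The input counts behind this example are not listed above; one hypothetical confusion matrix consistent with the stated precision, recall, accuracy, and F1 is TP = 17, FP = 2, FN = 3, TN = 18 (an assumption for illustration, not taken from the calculator):

```python
import math

tp, fp, fn, tn = 17, 2, 3, 18  # hypothetical counts (assumption)

precision = tp / (tp + fp)                           # 17/19 ≈ 0.8947
recall = tp / (tp + fn)                              # 17/20 = 0.85
accuracy = (tp + tn) / (tp + tn + fp + fn)           # 35/40 = 0.875
f1 = 2 * precision * recall / (precision + recall)   # ≈ 0.8718
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)
print(round(precision, 4), round(recall, 4), round(accuracy, 4),
      round(f1, 4), round(mcc, 4))
```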

Tips & Best Practices

MCC vs. Accuracy: The Imbalanced Data Problem

Consider a disease affecting 1% of patients. A classifier that always predicts "healthy" achieves 99% accuracy, which looks impressive but is useless. Its MCC is 0 (the formula's denominator vanishes for such a degenerate classifier, and the conventional value for this case is 0), correctly indicating no discriminative ability. This example explains why MCC has become the recommended primary metric in machine learning competitions and medical AI.
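The degenerate case is worth handling explicitly in code: when the classifier never predicts one of the classes, a factor in the denominator is zero, and implementations conventionally return 0. A minimal sketch (function name is illustrative):

```python
import math

def safe_mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC, returning 0.0 when the denominator is zero (degenerate classifier)."""
    denominator = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / math.sqrt(denominator)

# "Always healthy" on 1000 patients with 1% disease prevalence:
tp, fp, fn, tn = 0, 0, 10, 990
accuracy = (tp + tn) / 1000   # looks impressive
print(accuracy, safe_mcc(tp, tn, fp, fn))  # 0.99 0.0
```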

Mathematical Properties of MCC

MCC is the Pearson correlation between actual and predicted binary labels (encoded 0/1). This means all properties of Pearson correlation apply: MCC = +1 implies perfect prediction, −1 implies perfect inverse prediction, and 0 implies independence. The formula (TP·TN − FP·FN)/√((TP+FP)(TP+FN)(TN+FP)(TN+FN)) is equivalent to Pearson r on binary variables.

When Other Metrics Complement MCC

MCC gives one number but hides the precision-recall tradeoff. If FP and FN have very different costs — false positive in cancer screening vs. false negative — you need precision and recall separately. ROC-AUC shows performance across all thresholds. Use MCC as the primary evaluation metric, then drill into precision, recall, and domain-specific costs for deployment decisions.

Frequently Asked Questions

Why is MCC better than accuracy for imbalanced data?

If 95% of samples are negative, predicting "always negative" gives 95% accuracy but MCC = 0 (no useful discrimination). MCC requires good performance on both classes, making it impossible to game with trivial strategies.

What's the relationship between MCC and other metrics?

MCC = √(Informedness × Markedness) when both are positive. It's also the geometric mean of the regression coefficients of the problem and its dual. MCC incorporates all four confusion matrix cells equally.
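The identity is straightforward to check numerically (illustrative counts chosen so both informedness and markedness are positive):

```python
import math

tp, tn, fp, fn = 85, 90, 10, 15  # illustrative counts

informedness = tp / (tp + fn) + tn / (tn + fp) - 1   # sensitivity + specificity − 1
markedness   = tp / (tp + fp) + tn / (tn + fn) - 1   # precision + NPV − 1
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(math.isclose(mcc, math.sqrt(informedness * markedness)))  # True
```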

Can MCC be used for multi-class classification?

Yes, MCC generalizes to multi-class problems using the full confusion matrix K: MCC = (c·s − Σ pₖ·tₖ) / √((s²−Σpₖ²)(s²−Σtₖ²)), where c = trace(K) (total correct predictions), s = total samples, pₖ = per-class predicted totals, and tₖ = per-class true totals.
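As a sketch, the multi-class formula can be computed directly from a confusion matrix (the 3×3 counts below are illustrative, not from the calculator):

```python
import math

# Confusion matrix K[i][j] = samples with true class i predicted as class j
K = [[50,  3,  2],
     [ 4, 40,  6],
     [ 1,  5, 44]]

n = len(K)
s = sum(sum(row) for row in K)                           # total samples
c = sum(K[k][k] for k in range(n))                       # trace: correct predictions
t = [sum(row) for row in K]                              # true totals per class
p = [sum(K[i][j] for i in range(n)) for j in range(n)]   # predicted totals per class

numerator = c * s - sum(pk * tk for pk, tk in zip(p, t))
denominator = math.sqrt((s * s - sum(pk * pk for pk in p))
                        * (s * s - sum(tk * tk for tk in t)))
mcc = numerator / denominator
print(round(mcc, 4))
```

scikit-learn's `sklearn.metrics.matthews_corrcoef` implements this same generalization and is a convenient cross-check.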

When is MCC = 0 exactly?

MCC = 0 when TP·TN = FP·FN. This occurs for random guessing; for the degenerate "always positive" (TN = FN = 0) and "always negative" (TP = FP = 0) strategies, the denominator is also zero and MCC is taken as 0 by convention. Any strategy that ignores the true labels gives MCC = 0 in expectation.

How does MCC compare to Cohen's kappa?

Both measure binary agreement beyond chance, but MCC is symmetric and treats both classes equally. Kappa can be biased by prevalence and marginal distributions. MCC is generally preferred in machine learning; kappa in inter-rater reliability studies.

What MCC threshold indicates a useful classifier?

No universal threshold exists, but MCC > 0.3 suggests the classifier adds value beyond random, MCC > 0.5 indicates moderate utility, and MCC > 0.7 is considered strong. The required threshold depends on the cost of false positives vs. false negatives in your domain.

Related Pages