Calculate attribute agreement percentages and Cohen's Kappa for pass/fail inspection decisions. Evaluate inspector consistency and accuracy.
Attribute Agreement Analysis evaluates the consistency and accuracy of inspectors making categorical decisions — pass/fail, accept/reject, or defect classification. Unlike variable Gage R&R which measures continuous data, attribute agreement addresses the binary or categorical inspection decisions that are common in visual inspection, go/no-go gaging, and defect sorting.
The analysis measures agreement within each inspector (repeatability), between inspectors (reproducibility), and between each inspector and the known standard (accuracy). Cohen's Kappa statistic adjusts for chance agreement, providing a more rigorous measure than simple percent agreement.
This calculator takes the total number of decisions and the number of matching decisions to compute percent agreement and Cohen's Kappa, helping you determine if your attribute inspection process is reliable.
Attribute agreement analysis fits lean manufacturing practice: it replaces guesswork about inspection reliability with measured, fact-based evidence. Accurate agreement and Kappa figures give production managers concrete targets for inspector training and continuous improvement across the shop floor.
Visual and attribute inspections are inherently subjective and variable. Attribute agreement analysis exposes inconsistency and bias, enabling targeted training that makes pass/fail decisions more reliable and defensible. Monitoring agreement over time helps teams catch drift early, and documented results simplify reporting, audit preparation, and discussions with management and stakeholders.
% Agreement = (Matching Decisions / Total Decisions) × 100

Cohen's Kappa: κ = (P_o − P_e) / (1 − P_e)

where:
• P_o = observed agreement (proportion of matching decisions)
• P_e = agreement expected by chance
• For a 2×2 confusion matrix with cells a (both pass), b and c (the two disagreement cells), d (both fail), and n = a + b + c + d, chance agreement comes from the marginal pass/fail rates: P_e = ((a + b)/n × (a + c)/n) + ((c + d)/n × (b + d)/n)
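A minimal Python sketch of these formulas, using the 2×2 cell counts a, b, c, d (the function name and the sample counts are illustrative assumptions):

```python
def agreement_and_kappa(a, b, c, d):
    """Percent agreement and Cohen's Kappa from 2x2 cell counts:
    a = both pass, d = both fail, b and c = the two disagreement cells."""
    n = a + b + c + d
    p_o = (a + d) / n                                # observed agreement
    # Chance agreement from the marginal pass/fail rates of each rater.
    p_e = ((a + b) / n) * ((a + c) / n) + ((c + d) / n) * ((b + d) / n)
    kappa = (p_o - p_e) / (1 - p_e)
    return p_o * 100, kappa

# 44 agreed passes, 44 agreed fails, 12 disagreements out of 100 decisions:
pct, kappa = agreement_and_kappa(44, 6, 6, 44)       # ~88.0, ~0.76
```

With balanced marginals the chance agreement works out to 0.50, reproducing the worked example below.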
Result: 88% Agreement, κ = 0.76
Out of 100 decisions, 88 matched the standard. Simple agreement is 88%. Adjusting for the 50% expected by chance: κ = (0.88 − 0.50) / (1 − 0.50) = 0.76, indicating substantial agreement beyond chance.
For pass/fail decisions against a known standard, the confusion matrix shows: true accepts, false accepts, true rejects, and false rejects. Each cell informs a different quality metric — escape rate, false alarm rate, and overall accuracy.
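As an illustration of how each cell feeds a different metric, the counts below can be turned into rates directly; the cell names and sample numbers are assumptions for the sketch, not the calculator's inputs:

```python
def inspection_metrics(true_accept, false_accept, true_reject, false_reject):
    """Operational rates from a pass/fail confusion matrix vs. the standard:
    true_accept  = good part passed,   false_accept = defect passed (escape),
    true_reject  = defect rejected,    false_reject = good part rejected."""
    defective = false_accept + true_reject
    good = true_accept + false_reject
    return {
        "escape_rate": false_accept / defective,   # defects passed as good
        "false_alarm_rate": false_reject / good,   # good parts rejected
        "accuracy": (true_accept + true_reject) / (defective + good),
    }

# 100 parts: 90 good (85 passed, 5 rejected), 10 defective (8 caught, 2 escaped).
metrics = inspection_metrics(85, 2, 8, 5)
```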
Common interventions include: creating limit sample boards with clear pass/fail boundaries, standardizing lighting and viewing conditions, implementing inspector certification programs, and rotating inspectors to prevent fatigue-related degradation.
IATF 16949 clause 7.1.5.1.1 requires attribute MSA for all subjective inspection processes referenced in the control plan. Third-party auditors specifically check for documented attribute agreement studies.
Kappa < 0.40 is poor agreement. 0.40–0.60 is moderate. 0.60–0.75 is good. Above 0.75 is excellent. For critical quality decisions, aim for κ > 0.75. These thresholds are guidelines — context matters.
Percent agreement doesn't account for agreement that occurs by chance. If defect rate is low (e.g., 5%), inspectors who always say "pass" get 95% agreement despite being useless. Kappa adjusts for this.
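The "always pass" scenario can be checked numerically; under these assumed numbers, Kappa falls to zero despite 95% raw agreement:

```python
# A lot with a 5% defect rate; the inspector passes every part.
p_o = 0.95                        # 95 of 100 decisions match the standard
# Chance agreement: inspector pass rate (1.0) x standard pass rate (0.95)
# plus inspector fail rate (0.0) x standard fail rate (0.05).
p_e = 1.0 * 0.95 + 0.0 * 0.05
kappa = (p_o - p_e) / (1 - p_e)   # 0.0: no discrimination beyond chance
```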
At least 50 parts, ideally including 25–50% that are borderline or defective. If your defect rate is low, deliberately include known defective parts to test detection capability.
Yes. Use Fleiss' Kappa for multiple raters making categorical decisions. You can also compute pairwise Kappa between each pair of inspectors to identify specific disagreement patterns.
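Fleiss' Kappa applies the same P_o versus P_e logic to m raters; a minimal sketch, assuming ratings are supplied as per-part category counts:

```python
def fleiss_kappa(ratings):
    """Fleiss' Kappa. ratings[i][j] = number of raters who placed part i
    in category j; every row must sum to the same rater count m."""
    n = len(ratings)                 # number of parts
    m = sum(ratings[0])              # raters per part
    k = len(ratings[0])              # number of categories
    # Per-part agreement: proportion of rater pairs that agree on that part.
    p_i = [(sum(c * c for c in row) - m) / (m * (m - 1)) for row in ratings]
    p_o = sum(p_i) / n
    # Overall proportion of ratings falling in each category.
    p_j = [sum(row[j] for row in ratings) / (n * m) for j in range(k)]
    p_e = sum(p * p for p in p_j)
    return (p_o - p_e) / (1 - p_e)

# Three raters, two categories, unanimous on every part:
k_perfect = fleiss_kappa([[3, 0], [0, 3], [3, 0]])   # 1.0
```

For pairwise comparisons, the same 2×2 Cohen's Kappa can simply be computed for each inspector pair.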
This is expected and reveals opportunities. Create clearer criteria, boundary samples, and comparison standards. Re-evaluate after training. Borderline disagreement is the norm, not the exception.
Low agreement with the standard means high escape rates (defective parts passed as good) and high false reject rates (good parts rejected). Quantifying agreement helps predict these operational impacts.