v2.4.0

Performance Gate (AUC-ROC, F1, Precision, Recall, Brier, Calibration)

The performance gate is the first of the four sequential validation gates in the CI pipeline. It computes the model’s accuracy metrics on the holdout test set and compares them against the thresholds declared in the AISDP. Any metric falling below its declared threshold fails the gate, and subsequent gates do not run.
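The sequential short-circuit behaviour can be sketched as follows. This is a minimal illustration, not the pipeline's actual API: the gate names, the `context` argument, and the report shape are assumptions.

```python
def run_gates(gates, context):
    """Run validation gates in declared order; stop at the first failure
    so later gates never execute against an already-failed candidate."""
    reports = []
    for gate in gates:
        # Each gate is assumed to return a dict with at least
        # {"gate": <name>, "passed": <bool>} plus its metric details.
        report = gate(context)
        reports.append(report)
        if not report["passed"]:
            break  # subsequent gates do not run
    return reports
```

A failed fairness gate, for example, leaves the robustness and documentation gates unexecuted, and the returned report list records exactly which gates ran.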

Two subtleties require attention. First, the holdout set must be truly held out: it must not have been used during training, hyperparameter tuning, or feature selection. If the holdout set has leaked into the training process, the performance gate is testing on training data and the results are unreliable. Second, the Technical SME computes the metrics with confidence intervals (bootstrap or cross-validation), and the threshold comparison uses the lower bound of the confidence interval rather than the point estimate.

A model achieving 0.86 AUC-ROC with a 95% confidence interval of [0.82, 0.90] is compared against the threshold using 0.82, because that represents the worst-case plausible performance. The gate produces a structured report recording the gate name, each metric’s value, the threshold, the confidence interval, and the pass/fail determination. This report is stored as a pipeline artefact and retained as Module 5 evidence.
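The lower-bound comparison described above can be sketched with a bootstrap over the holdout set. This is a minimal sketch using scikit-learn's `roc_auc_score`; the function names, report keys, and default bootstrap settings are illustrative assumptions, not the pipeline's actual interface.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_lower_bound(y_true, y_score, n_boot=1000, alpha=0.05, seed=42):
    """Bootstrap the AUC-ROC on the holdout set and return the lower
    bound of the (1 - alpha) confidence interval."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # resample with replacement
        if len(np.unique(y_true[idx])) < 2:  # skip degenerate resamples
            continue
        scores.append(roc_auc_score(y_true[idx], y_score[idx]))
    return float(np.quantile(scores, alpha / 2))

def performance_gate(y_true, y_score, threshold):
    """Pass only if the CI lower bound clears the declared threshold,
    not the point estimate."""
    point = roc_auc_score(y_true, y_score)
    lower = bootstrap_lower_bound(y_true, y_score)
    return {
        "gate": "performance",
        "metric": "auc_roc",
        "value": round(point, 4),
        "ci_lower": round(lower, 4),
        "threshold": threshold,
        "passed": lower >= threshold,
    }
```

The returned dict mirrors the structured gate report: for the worked example in the text, a point estimate of 0.86 with `ci_lower` of 0.82 would fail a declared threshold of 0.83 even though the point estimate clears it.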

Key outputs

  • Performance metrics computed on a genuinely held-out test set
  • Confidence interval computation with lower-bound threshold comparison
  • Structured gate report stored as a pipeline artefact
  • Module 5 AISDP evidence

Fairness Gate — Non-Negotiable (SRR, Equalised Odds, Predictive Parity, Calibration)

The fairness gate is the second validation gate, running only after the performance gate passes. It computes the full fairness evaluation suite (selection rate ratios, equalised odds, predictive parity, calibration within groups) across all measured protected characteristic subgroups. Any subgroup metric breaching its threshold fails the gate. This gate is non-negotiable: it cannot be overridden except through the formal risk acceptance process described below.

The fairness gate’s most common failure mode involves small subgroups. The model may pass fairness for all subgroups except one with a small sample size, where the metric is unreliable due to statistical noise. Gate design must handle this: either the small-subgroup metrics are reported with confidence intervals and compared using the lower bound, or the gate explicitly flags subgroups below the minimum sample size (typically 30 observations) as “insufficient data, manual review required.”
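The second handling option, flagging undersized subgroups for manual review, can be sketched for the selection rate ratio. This is a minimal pandas sketch under stated assumptions: the column names, the 0.8 SRR threshold, and the choice of the most-selected group as the reference are illustrative, not the pipeline's actual configuration.

```python
import pandas as pd

MIN_SAMPLE = 30  # below this, flag for manual review instead of auto pass/fail

def selection_rate_ratios(df, group_col, pred_col, srr_threshold=0.8):
    """Per-subgroup selection rate ratio against the most-selected group,
    flagging subgroups below the minimum sample size."""
    rates = df.groupby(group_col)[pred_col].agg(["mean", "size"])
    reference = rates["mean"].max()  # most-favoured group as reference rate
    results = []
    for group, row in rates.iterrows():
        srr = row["mean"] / reference if reference > 0 else float("nan")
        if row["size"] < MIN_SAMPLE:
            status = "insufficient data, manual review required"
        else:
            status = "pass" if srr >= srr_threshold else "fail"
        results.append({
            "subgroup": group,
            "n": int(row["size"]),
            "selection_rate": round(float(row["mean"]), 4),
            "srr": round(float(srr), 4),
            "status": status,
        })
    return results
```

The per-subgroup result rows map directly onto the disaggregated report described below: each row carries the metric value, the sample size, and an explicit status rather than an aggregate that could mask a small-group disparity.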

The fairness gate report disaggregates results by subgroup, providing per-subgroup metric values, confidence intervals, threshold comparisons, and pass/fail determinations. This disaggregated report is essential Module 5 evidence; an aggregate fairness metric that masks disparate impact within subgroups does not satisfy Article 10’s requirements.

Key outputs

  • Per-subgroup fairness metrics (SRR, equalised odds, predictive parity, calibration)
  • Small-subgroup handling (confidence intervals or manual review flagging)
  • Disaggregated fairness gate report
  • Module 5 and Module 6 AISDP evidence

Fairness Gate Override — AI Governance Lead Approval (Logged as Risk Acceptance)

The fairness gate cannot be overridden through standard deployment processes. If a candidate model fails the fairness gate, the only pathway to deployment is a formal risk acceptance approved by the AI Governance Lead. This approval is logged as a risk acceptance decision, not as a routine exception.

The risk acceptance record must document which specific subgroup metrics breached which thresholds, the root cause analysis (why the model fails fairness for this subgroup), the remediation plan (what steps will be taken to address the fairness gap), the interim risk assessment (what is the impact on affected persons during the period before remediation), and the time-bound commitment (when will the remediated model be deployed).
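The required fields of the risk acceptance record can be captured in a structured type so that an incomplete record cannot be constructed. This is a hypothetical sketch; the class name, field names, and serialisation format are assumptions, not a mandated schema.

```python
from dataclasses import dataclass, asdict
from datetime import date

@dataclass
class FairnessRiskAcceptance:
    """Structured record for an AI Governance Lead fairness-gate override.
    Every field from the documented record is mandatory by construction."""
    model_version: str
    breached_metrics: dict        # e.g. {"subgroup_x/equalised_odds": 0.12}
    root_cause: str               # why the model fails fairness for this subgroup
    remediation_plan: str         # steps to close the fairness gap
    interim_risk_assessment: str  # impact on affected persons before remediation
    remediation_deadline: date    # the time-bound commitment
    approved_by: str              # must be the AI Governance Lead
    decision: str = "risk_acceptance"  # logged as risk acceptance, not exception

    def to_evidence(self):
        """Serialise for retention as Module 6 evidence."""
        record = asdict(self)
        record["remediation_deadline"] = self.remediation_deadline.isoformat()
        return record
```

Because `remediation_deadline` is a required constructor argument, an open-ended override simply cannot be recorded, which enforces the time-limited nature of risk acceptance described below.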

The Conformity Assessment Coordinator retains the risk acceptance record as Module 6 evidence. Risk acceptance is not a permanent state; it is a time-limited acknowledgement that the system falls short of its declared fairness standards, with a documented plan to close the gap. The post-market monitoring system (Module 12) tracks progress against the remediation plan and escalates if the timeline slips.

Key outputs

  • Risk acceptance approval by the AI Governance Lead
  • Root cause analysis and remediation plan
  • Time-bound commitment for remediated model deployment
  • Module 6 and Module 5 AISDP evidence

Robustness Gate (Adversarial Examples, Input Perturbation)

The robustness gate is the third validation gate, testing the model’s stability under input perturbation. Article 15(3) requires high-risk AI systems to be resilient to errors, faults, and inconsistencies, and Article 15(5) requires that robustness measures be proportionate to the relevant circumstances and take due account of the state of the art as reflected in relevant harmonised standards or common specifications. IBM’s Adversarial Robustness Toolbox (ART) provides a comprehensive set of adversarial attack methods and defence evaluations. Perturbations are calibrated to realistic input noise levels, and the gate verifies that the model’s accuracy does not degrade beyond a defined tolerance.

For tabular models, feature perturbation at realistic noise levels is the practical approach: ±5% on continuous features and random category flips at a 1% rate are typical starting points. For neural networks on image or text inputs, ART provides systematic adversarial evaluation through attack methods such as FGSM, PGD, C&W, and DeepFool. The perturbation configuration is version-controlled alongside the threshold configuration.
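The tabular starting points above can be sketched directly, without ART. This is a minimal sketch under stated assumptions: the function names, the uniform ±5% noise model, and the tolerance default are illustrative, and the model is only assumed to expose a `predict` method.

```python
import numpy as np
import pandas as pd

def perturb_tabular(df, continuous_cols, categorical_cols,
                    rel_noise=0.05, flip_rate=0.01, seed=0):
    """Apply ±5% relative noise to continuous features and random
    category flips at a 1% rate, mirroring realistic input noise."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    for col in continuous_cols:
        noise = rng.uniform(-rel_noise, rel_noise, len(out))
        out[col] = out[col] * (1.0 + noise)
    for col in categorical_cols:
        flip = rng.random(len(out)) < flip_rate
        choices = out[col].unique()
        out.loc[flip, col] = rng.choice(choices, flip.sum())
    return out

def robustness_gate(model, X, y, score_fn, tolerance=0.02, **perturb_kw):
    """Fail if performance on perturbed inputs drops more than `tolerance`
    below the clean baseline."""
    baseline = score_fn(y, model.predict(X))
    perturbed = score_fn(y, model.predict(perturb_tabular(X, **perturb_kw)))
    return {"gate": "robustness",
            "baseline": baseline,
            "perturbed": perturbed,
            "degradation": baseline - perturbed,
            "passed": (baseline - perturbed) <= tolerance}
```

Because the perturbation parameters are plain keyword arguments, the same values can be read from the version-controlled perturbation configuration rather than hard-coded.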

Performance degradation exceeding the defined tolerance fails the gate. The gate report records the perturbation methods applied, the perturbation magnitudes, the model’s performance under each perturbation, and the tolerance comparison. A model that passes the performance and fairness gates but fails the robustness gate may be accurate under normal conditions but fragile to input variation, posing a risk in production environments where inputs are noisier than in controlled evaluation datasets.

Key outputs

  • Adversarial and perturbation testing using ART or equivalent
  • Perturbation configuration aligned to realistic input noise levels
  • Robustness gate report with per-perturbation performance results
  • Module 9 and Module 5 AISDP evidence

Documentation Gate (Model Card Completeness, AISDP Currency)

The documentation gate is the fourth and final validation gate. It verifies that the compliance documentation is complete and current for the candidate model version. A model that passes all technical gates but lacks a complete model card, or whose AISDP sections have not been updated to reflect the candidate’s characteristics, should not be deployed.

The gate checks that the model card has been generated and contains all required sections (architecture summary, training data version, evaluation metrics disaggregated by subgroup, intended use, known limitations). It verifies that the AISDP sections most closely tied to the model (Modules 3, 5, and 10) have been updated or re-generated to reflect the candidate version. It confirms that the evidence pack (model card, gate reports, data quality reports) is complete and that integrity hashes have been computed.

Auto-generated documentation simplifies this gate: if the model card and AISDP drafts are generated by the pipeline, the documentation gate primarily verifies that the generation succeeded and that the generated documents contain no empty or missing sections. For manually maintained documentation, the gate checks timestamps to confirm that the documentation has been updated since the candidate model was registered.
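The auto-generated case can be sketched as a completeness check plus integrity hashing. This is a minimal sketch: the snake_case section keys, function names, and report shape are hypothetical stand-ins for the required model card sections listed above, not the pipeline's actual schema.

```python
import hashlib
from pathlib import Path

# Hypothetical keys mirroring the required model card sections.
REQUIRED_SECTIONS = [
    "architecture_summary",
    "training_data_version",
    "evaluation_metrics_by_subgroup",
    "intended_use",
    "known_limitations",
]

def check_model_card(card: dict):
    """Return the required sections that are missing or empty."""
    return [s for s in REQUIRED_SECTIONS if not card.get(s)]

def evidence_pack_hashes(paths):
    """Compute SHA-256 integrity hashes for each evidence pack file."""
    return {str(p): hashlib.sha256(Path(p).read_bytes()).hexdigest()
            for p in paths}

def documentation_gate(card: dict, evidence_paths):
    """Pass only if the model card is complete and every evidence file exists."""
    missing_sections = check_model_card(card)
    present = [p for p in evidence_paths if Path(p).exists()]
    absent = [str(p) for p in evidence_paths if not Path(p).exists()]
    return {"gate": "documentation",
            "missing_sections": missing_sections,
            "missing_files": absent,
            "hashes": evidence_pack_hashes(present),
            "passed": not missing_sections and not absent}
```

The hashes in the returned report are what later integrity verification compares against, so recomputing them at gate time pins the evidence pack to the candidate model version.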

Key outputs

  • Model card completeness verification
  • AISDP currency verification for affected modules
  • Evidence pack integrity hash verification
  • Module 5 and Module 10 AISDP evidence