Model Version Pinning & Cryptographic Hash Verification
The inference layer must serve a specific, immutable model version. Model version pinning ensures that the exact model validated during testing is the model that runs in production. Switching to a different version constitutes a deployment event requiring human approval and CI/CD pipeline validation; it cannot happen silently through an automatic update.
The model registry enforces this through stage management. Models progress through defined stages: experimental, staging, production, and archived. Only models in the production stage can be loaded by the inference service, and promotion to production requires documented approval. MLflow, SageMaker, and Vertex AI model registries all support this pattern.
Cryptographic hash verification adds a further assurance layer. Each model artefact is hashed at the point of registration, and the inference service verifies the hash on load. This confirms that the model binary has not been tampered with between registration and deployment. The hash, together with the model version identifier and the deployment approval record, is logged as part of the Article 12 audit trail, enabling precise reconstruction of which model was serving at any given point in time.
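The hash-at-registration, verify-at-load pattern can be sketched as follows. This is a minimal illustration, not the implementation the text describes; the function names are hypothetical, and SHA-256 is assumed as the hash algorithm.

```python
import hashlib
from pathlib import Path


def sha256_of_artifact(path: Path) -> str:
    """Stream the model artefact through SHA-256 so large files never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_on_load(path: Path, registered_hash: str) -> None:
    """Refuse to serve an artefact whose hash differs from the value recorded at registration."""
    actual = sha256_of_artifact(path)
    if actual != registered_hash:
        # In the real service, this mismatch would also be written to the audit trail.
        raise RuntimeError(
            f"Model hash mismatch for {path}: expected {registered_hash}, got {actual}"
        )
```

The hash computed at registration is stored alongside the model version in the registry; the inference service recomputes it on load and refuses to serve on mismatch.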
Key outputs
- Model registry with stage management (experimental → staging → production → archived)
- Cryptographic hash computation at registration and verification at load
- Deployment event logging with version, hash, and approval evidence
- Module 10 audit trail entries
Confidence Thresholding — Below-Threshold → Human Review
Every classifier produces some form of confidence estimate: a probability score, a softmax output, or a distance from the decision boundary. Confidence thresholding routes predictions that fall below a defined threshold to human review before they are acted upon. This control prevents the system from acting on uncertain predictions, which are the most likely to diverge from intended behaviour.
The Technical SME calibrates the threshold carefully against the validation dataset, setting it at the level below which the model's error rate becomes unacceptably high. Too high a threshold, and most predictions are sent to humans, defeating the purpose of automation. Too low, and uncertain predictions slip through, with potential adverse consequences for affected persons.
The threshold value, the calibration methodology, and the resulting human review volume are all documented in the AISDP. The review volume has operational implications: the human oversight interface must be designed to handle the expected caseload without creating bottlenecks that tempt operators to reduce review thoroughness. Confidence thresholding is also subject to ongoing monitoring; if the distribution of confidence scores shifts in production, the threshold may need recalibration.
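A minimal sketch of both halves of this control, calibration on the validation set and routing at inference time, might look like the following. The function names, the 5% target error rate, and the route labels are illustrative assumptions, not values taken from the text.

```python
from dataclasses import dataclass
from typing import Sequence


def calibrate_threshold(
    confidences: Sequence[float],
    correct: Sequence[bool],
    max_error_rate: float = 0.05,  # assumed acceptable error rate for automated decisions
) -> float:
    """Pick the lowest threshold at which the error rate of above-threshold
    predictions on the validation set is acceptable. Candidate thresholds are
    the observed confidence values themselves."""
    for t in sorted(set(confidences)):
        kept = [ok for conf, ok in zip(confidences, correct) if conf >= t]
        if kept and 1 - sum(kept) / len(kept) <= max_error_rate:
            return t
    return 1.0  # no threshold meets the target: route everything to review


@dataclass
class RoutingDecision:
    label: str
    confidence: float
    route: str  # "automatic" or "human_review"


def route_prediction(label: str, confidence: float, threshold: float) -> RoutingDecision:
    """Send below-threshold predictions to human review instead of acting on them."""
    route = "automatic" if confidence >= threshold else "human_review"
    return RoutingDecision(label=label, confidence=confidence, route=route)
```

Searching the observed confidence values keeps the calibration simple and reproducible; the chosen threshold and the validation-set error figures behind it are what gets documented in the AISDP.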
Key outputs
- Defined confidence threshold with calibration methodology
- Documentation of expected human review volume at the chosen threshold
- Integration with human oversight interface workflow
- Module 7 and Module 3 AISDP entries
Output Constraint Enforcement (Pydantic Schema Validation)
Output constraint enforcement is the last-resort guard at the inference layer. It enforces hard bounds on what the model can output: scores must fall within a defined range, classifications must be drawn from a defined set, and generated text must conform to length and format constraints. This prevents pathological model behaviour, such as extreme score values from adversarial inputs or hallucinated classification labels, from propagating downstream.
Pydantic schema validation provides a clean implementation. The output schema is defined as a Pydantic model specifying field types, value ranges, and enumeration constraints. Every inference output is validated against this schema, and outputs that do not conform are rejected. The rejection is logged with the original output, the validation failure reason, and the request context, creating an audit record that supports both debugging and compliance evidence.
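As a sketch of this pattern, a hypothetical output schema for a risk classifier might look like the following. The field names, bounds, and label set are illustrative assumptions, not drawn from the text.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class InferenceOutput(BaseModel):
    # Enumeration constraint: hallucinated labels are rejected outright.
    label: Literal["low_risk", "medium_risk", "high_risk"]
    # Hard range bound: extreme score values cannot propagate downstream.
    score: float = Field(ge=0.0, le=1.0)
    # Length constraint on generated text.
    rationale: str = Field(max_length=500)


def validate_output(raw: dict) -> InferenceOutput:
    """Reject non-conforming outputs; the caller logs the failure and request context."""
    try:
        return InferenceOutput(**raw)
    except ValidationError as exc:
        # In the real service the original output, the failure reason, and the
        # request context would be written to the audit log here.
        raise ValueError(f"Output rejected: {exc.errors()}") from exc
```

Because the schema is declarative, it doubles as documentation of the system's operational bounds: the AISDP can reference the Pydantic model definition directly.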
This control is particularly important for robustness under adversarial conditions. Adversarial inputs may cause the model to produce outputs far outside its expected range. Without output constraints, these outputs would propagate to the post-processing layer, the explainability layer, and ultimately to human operators or affected persons. Output constraint enforcement ensures that even if the model misbehaves, the system as a whole remains within its documented operational bounds.
Key outputs
- Pydantic output schema definition
- Validation pipeline integrated into the inference service
- Rejection logging configuration
- Module 3 and Module 9 documentation of the constraint mechanism
Intent Drift Control — Production-Stage Models Only
A specific form of intent drift at the inference layer occurs when a model that has not completed the full validation and approval pipeline is inadvertently served in production. This can happen through misconfigured deployment scripts, manual overrides during debugging, or automation that promotes models between stages without the required governance checks.
The control is architectural: the inference service is configured to load only models that the model registry marks as being in the production stage. Promotion to production requires documented approval through the CI/CD compliance gates. The inference service verifies the model’s stage on load, and any attempt to serve a model in experimental, staging, or archived status is blocked and logged as a security event.
This constraint ensures that every model serving predictions has passed the conformity checks, fairness evaluations, and governance approvals documented in the AISDP. It also simplifies audit: the model registry’s stage history, combined with the inference service’s load logs, provides a complete record of which validated model was active at any point. Penetration testing should specifically verify that this constraint cannot be bypassed.
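The stage check on load can be sketched as below. The `registry` client here is hypothetical, exposing `get_stage()` and `get_artifact()` methods; in practice the equivalent MLflow, SageMaker, or Vertex AI registry call would be substituted.

```python
import logging

logger = logging.getLogger("inference.security")

ALLOWED_STAGE = "production"


class StageViolation(RuntimeError):
    """Raised when a non-production model is offered to the inference service."""


def load_model(registry, name: str, version: str):
    """Load a model artefact only if the registry marks that version as production-stage."""
    stage = registry.get_stage(name, version)
    if stage != ALLOWED_STAGE:
        # Blocked loads are logged as security events for the audit trail.
        logger.error("Blocked load of %s v%s in stage %r", name, version, stage)
        raise StageViolation(
            f"{name} v{version} is in stage {stage!r}, not {ALLOWED_STAGE!r}"
        )
    return registry.get_artifact(name, version)
```

Because the check lives in the inference service rather than the deployment scripts, a misconfigured script or manual override still cannot serve an experimental, staging, or archived model.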
Key outputs
- Inference service configuration restricting loads to production-stage models
- Model registry stage management with approval requirements
- Security event logging for attempted non-production model loads
- Penetration test verification of the constraint