Documentability is the first criterion. The question is: can the model’s architecture, hyperparameters, and decision process be described precisely enough to satisfy Annex IV, Section 2? Could a qualified reviewer reproduce the training process from the documentation alone?
The assessment examines the model architecture and determines whether its structure can be expressed in a technical specification document. A logistic regression model scores strongly: every parameter is a named coefficient with a clear interpretation. A gradient-boosted tree ensemble scores adequately: the architecture (number of trees, depth, splits) can be described, though enumerating every learned split across thousands of trees is impractical. A transformer with billions of parameters scores weakly: the architecture can be described at a structural level, yet the learned representations cannot be enumerated. The assessment identifies documentation gaps that would need compensating measures, such as detailed behavioural characterisation in lieu of parameter-level documentation.
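The three-level rubric above can be sketched as a small function. This is an illustrative assumption, not language from Annex IV or a prescribed test: it reduces each candidate to two coarse traits (whether individual parameters carry a clear interpretation, and whether the learned structure could in principle be enumerated) and maps them to a score.

```python
# Hypothetical rubric; trait names and the mapping are illustrative assumptions.
def documentability_score(parameters_interpretable: bool,
                          learned_structure_enumerable: bool) -> str:
    """Map two coarse architecture traits to a documentability level."""
    if parameters_interpretable:
        return "strong"    # e.g. logistic regression: every coefficient is a named, interpretable value
    if learned_structure_enumerable:
        return "adequate"  # e.g. gradient-boosted trees: describable, though full enumeration is impractical
    return "weak"          # e.g. billion-parameter transformer: structural description only

# The three worked examples from the text:
assert documentability_score(True, True) == "strong"     # logistic regression
assert documentability_score(False, True) == "adequate"  # gradient-boosted ensemble
assert documentability_score(False, False) == "weak"     # large transformer
```

A real assessment would weigh more than two traits, but the sketch captures the ordering: interpretability of individual parameters dominates, enumerability of learned structure comes second.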
The score (strong, adequate, or weak) is recorded in the compliance criteria scoring matrix alongside the evidence supporting the determination. The assessment is not merely a label; it must specify what can be documented, what cannot, and what compensating measures would be required if the architecture were selected.
Key outputs
- Documentability score per candidate model
- Identified documentation gaps, with proposed compensating measures
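One way the scoring-matrix entry described above might be structured is as a simple record capturing the score, the supporting evidence, what can and cannot be documented, and the compensating measures. The field names and example values are assumptions for illustration, not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class DocumentabilityRecord:
    """Hypothetical scoring-matrix entry for one candidate model."""
    model_name: str
    score: str                                  # "strong" | "adequate" | "weak"
    documentable: list[str] = field(default_factory=list)        # what can be documented
    gaps: list[str] = field(default_factory=list)                # what cannot
    compensating_measures: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)            # basis for the determination

# Illustrative entry for the gradient-boosted ensemble discussed above.
record = DocumentabilityRecord(
    model_name="gbt_ensemble",
    score="adequate",
    documentable=["number of trees", "maximum depth", "split criterion"],
    gaps=["enumeration of every learned split across the ensemble"],
    compensating_measures=["detailed behavioural characterisation"],
    evidence=["architecture can be expressed in a technical specification"],
)
```

Keeping the gaps and compensating measures in the same record as the score enforces the point that the assessment is not merely a label: an entry with a non-strong score but empty `gaps` would be flagged as incomplete.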