Determinism asks whether the model produces the same output for the same input on every execution. This property directly affects reproducibility for conformity assessment, auditability for Article 12 logging, and the testing methodology required for Article 15 evaluation.
The assessment determines whether the candidate architecture is inherently deterministic or inherently stochastic. Linear models, decision trees, and most ensemble methods are deterministic at inference time: given the same input and model version, the output is identical. LLMs, diffusion models, and other generative architectures are stochastic as typically deployed: they sample from a probability distribution, so outputs can vary across invocations even for identical inputs.
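One way to support this classification empirically is a repeatability probe: call the candidate several times with the same input and compare fingerprints of the outputs. The sketch below is illustrative only. The names `output_fingerprint`, `is_repeatable`, `linear_model`, and `sampling_model` are hypothetical, and the two toy models stand in for real candidates.

```python
import hashlib
import random
from typing import Callable


def output_fingerprint(output: object) -> str:
    """Hash the repr of an output so repeated runs can be compared cheaply."""
    return hashlib.sha256(repr(output).encode("utf-8")).hexdigest()


def is_repeatable(predict: Callable[[float], object], sample: float, runs: int = 20) -> bool:
    """Return True if `predict` yields the same fingerprint on every run."""
    fingerprints = {output_fingerprint(predict(sample)) for _ in range(runs)}
    return len(fingerprints) == 1


# Hypothetical stand-ins for candidate models (assumptions, not real systems).
def linear_model(x: float) -> float:
    """Fixed coefficients: the same input always gives the same output."""
    return 2.0 * x + 1.0


def sampling_model(x: float) -> float:
    """Adds unseeded random noise, mimicking a sampling-based generator."""
    return x + random.random()


if __name__ == "__main__":
    print("linear_model repeatable:", is_repeatable(linear_model, 3.0))
    print("sampling_model repeatable:", is_repeatable(sampling_model, 3.0))
```

A probe of this kind can reveal nondeterminism but cannot prove its absence, so a passing result supplements the architectural analysis and does not replace it.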
For stochastic architectures, the assessment specifies the controls needed to achieve sufficient reproducibility for compliance purposes. Clamping the sampling temperature to a low value reduces output variance. Fixing the random seed enables reproduction in testing environments. Output logging captures the actual output of each inference, compensating for cases where that output cannot be reproduced deterministically. The assessment should also evaluate the performance cost of these controls, since setting temperature to zero may degrade output quality for tasks where diversity is valuable.
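The three controls can be illustrated with a toy sampler. This is a minimal sketch under stated assumptions, not a real inference API: `sample_token`, `log_record`, the record field names, and the example logits are all hypothetical. Temperature zero is treated as greedy selection, a fixed seed makes a sampled draw repeatable, and the log record stores the actual output together with the settings that produced it.

```python
import hashlib
import json
import math
import random
from typing import Dict, Optional


def sample_token(logits: Dict[str, float], temperature: float, seed: Optional[int] = None) -> str:
    """Pick one token from `logits`; temperature <= 0 means greedy (argmax)."""
    if temperature <= 0.0:
        return max(logits, key=lambda token: logits[token])
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    peak = max(logits.values())
    # Softmax weights with temperature scaling; subtracting the peak avoids overflow.
    weights = [math.exp((value - peak) / temperature) for value in logits.values()]
    return rng.choices(list(logits), weights=weights, k=1)[0]


def log_record(model_version: str, prompt: str, output: str,
               temperature: float, seed: Optional[int]) -> str:
    """Serialize what was actually produced, plus the settings that produced it."""
    record = {
        "model_version": model_version,
        "input_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "output": output,
        "temperature": temperature,
        "seed": seed,
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical logits for a three-way decision.
    logits = {"approve": 2.0, "reject": 1.5, "review": 0.5}
    token = sample_token(logits, temperature=0.8, seed=42)
    print(log_record("model-v1", "example prompt", token, 0.8, 42))
```

In a real deployment, a fixed seed may not be enough on its own, because hardware, batching, and library versions can also influence results. That is one reason the logged output remains the authoritative record.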
The determinism score reflects two things: the architecture’s natural reproducibility and, where the architecture is not inherently deterministic, the feasibility and cost of the controls needed to achieve compliance-grade reproducibility.
Key outputs
- Determinism score per candidate model
- Reproducibility control specification for stochastic architectures