v2.4.0

Testability asks whether the model architecture supports the testing required by Article 15. Can accuracy, robustness, and fairness be evaluated meaningfully? Can adversarial robustness be tested systematically?

The assessment determines whether standard evaluation methodologies exist for the candidate architecture and whether they are sufficient for the system’s risk profile. Decision trees and linear models produce deterministic outputs that simplify testing: a given input always produces the same output, enabling straightforward pass/fail comparison against expected values. LLMs and diffusion models produce stochastic outputs, requiring statistical testing frameworks that evaluate output distributions rather than individual predictions. For each candidate, the assessment specifies the testing methodology needed and estimates the testing effort.
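The deterministic/stochastic distinction above can be sketched as two test harnesses. This is a minimal illustration, not a prescribed methodology; the model functions, the sample count, and the pass-rate threshold are all hypothetical.

```python
import random

def deterministic_check(model, x, expected):
    """Deterministic architectures: one call, exact pass/fail
    against an expected value."""
    return model(x) == expected

def stochastic_check(model, x, predicate, n_samples=200, min_pass_rate=0.9):
    """Stochastic architectures: evaluate a property over the output
    distribution -- sample repeatedly and require a minimum pass rate."""
    passes = sum(predicate(model(x)) for _ in range(n_samples))
    return passes / n_samples >= min_pass_rate

# Hypothetical models for illustration only.
def tree_model(x):
    # Deterministic: same input always yields the same output.
    return "approve" if x > 0.5 else "deny"

def llm_model(x):
    # Stochastic stand-in: output varies across calls.
    return "approve" if random.random() < 0.97 else "deny"

random.seed(0)
assert deterministic_check(tree_model, 0.7, "approve")
assert stochastic_check(llm_model, 0.7, lambda y: y == "approve")
```

The statistical harness trades a single exact comparison for a sampled estimate, which is why the testing-effort estimate grows for stochastic architectures.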

Adversarial robustness testing varies by architecture. Tabular models can be tested through feature perturbation. Image classification models have well-established adversarial example generation methods. LLMs require prompt injection testing, jailbreak evaluation, and content safety assessment. The assessment identifies which adversarial testing methods are applicable and whether they are mature enough to produce reliable results.
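For tabular models, the feature-perturbation approach mentioned above can be illustrated with a simple stability probe. This is a hedged sketch under assumed names: `credit_model`, the perturbation magnitude, and the trial count are illustrative, not part of any standard.

```python
import random

def perturbation_robustness(model, x, epsilon=0.05, n_trials=100):
    """Fraction of small random feature perturbations that leave the
    model's prediction unchanged -- a basic tabular robustness probe."""
    baseline = model(x)
    stable = 0
    for _ in range(n_trials):
        perturbed = [v + random.uniform(-epsilon, epsilon) for v in x]
        if model(perturbed) == baseline:
            stable += 1
    return stable / n_trials

# Hypothetical threshold model: inputs far from the decision
# boundary should be robust to small perturbations.
def credit_model(features):
    return int(sum(features) > 1.0)

random.seed(0)
score = perturbation_robustness(credit_model, [0.8, 0.9])
assert score == 1.0  # far from the boundary, fully stable
```

Analogous probes for LLMs (prompt injection, jailbreak suites) have no such closed form, which is one reason their testing maturity is assessed separately.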

The testability score reflects the combined ease and reliability of performance testing, fairness testing, and adversarial robustness testing for the candidate architecture.

Key outputs

  • Testability score per candidate model
  • Required testing methodology specification