Output Validation Testing
For systems where model outputs are consumed by downstream components (web interfaces, databases, APIs, workflow engines), the testing programme verifies that no model output can trigger a secondary vulnerability. Test cases include generating outputs containing SQL injection payloads, cross-site scripting vectors, command injection strings, and malformed data structures, then verifying that the output validation layer neutralises each payload before it reaches the downstream component.
The testing should cover every downstream consumption path identified in the threat assessment. Each path requires dedicated test cases because the injection mechanisms and encoding requirements differ: HTML encoding for web rendering, parameterised queries for database consumption, shell escaping for command execution. A test that verifies XSS protection does not also verify SQL injection protection.
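The per-path distinction above can be illustrated with a minimal sketch. The payloads, function names, and the use of `html.escape`, `shlex.quote`, and an in-memory SQLite database are illustrative assumptions, not the AISDP's prescribed tooling; the production validation layer and database driver will differ.

```python
import html
import shlex
import sqlite3

# Hypothetical payloads representative of each injection class.
XSS_PAYLOAD = "<script>alert(1)</script>"
SQLI_PAYLOAD = "'; DROP TABLE users; --"
CMD_PAYLOAD = "$(rm -rf /)"

def encode_for_html(model_output: str) -> str:
    """Neutralise output destined for web rendering via HTML entity encoding."""
    return html.escape(model_output, quote=True)

def encode_for_shell(model_output: str) -> str:
    """Neutralise output destined for command execution via shell quoting."""
    return shlex.quote(model_output)

def test_html_path_neutralises_xss():
    rendered = encode_for_html(XSS_PAYLOAD)
    assert "<script>" not in rendered

def test_shell_path_neutralises_command_injection():
    escaped = encode_for_shell(CMD_PAYLOAD)
    # shlex.quote wraps the payload in single quotes, so $() is never expanded.
    assert escaped.startswith("'")

def test_db_path_parameterisation_keeps_payload_inert():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE outputs (text TEXT)")
    # Parameterised query: the payload travels as data, not as SQL.
    conn.execute("INSERT INTO outputs VALUES (?)", (SQLI_PAYLOAD,))
    # A successful round-trip proves the payload was stored, not executed.
    assert conn.execute("SELECT text FROM outputs").fetchone()[0] == SQLI_PAYLOAD
```

Note that each test exercises exactly one consumption path, reflecting the principle that an XSS test says nothing about SQL injection protection.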
The Technical SME updates the test suite whenever a new downstream integration is added. The test results are documented as Module 9 evidence. A finding that any model output can reach a downstream component without validation is a critical failure requiring immediate remediation.
Key outputs
- Injection payload testing across all downstream consumption paths
- Per-path test cases (XSS, SQL injection, command injection, malformed data)
- Test suite updates on new downstream integration
- Module 9 AISDP evidence
Denial of Service Testing
Denial-of-service testing verifies the system’s resilience to resource exhaustion attacks. Three test categories are required. Sustained high-volume testing submits requests at rates exceeding the expected peak load by at least 3x, verifying that rate limiting activates correctly and the system maintains service for requests within the limit. Adversarial input testing submits inputs designed to maximise inference time (unusual dimensions, extreme values, pathological structures), verifying that timeouts terminate long-running inferences.
Combined testing submits high-volume legitimate requests simultaneously with adversarial inputs, simulating a realistic attack scenario. The pass criteria are that the system maintains the documented latency and throughput targets under load, that rate limiting and timeout enforcement function correctly, and that the system recovers automatically after the attack ceases. Recovery time should be measured and compared against the declared recovery objective.
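A minimal, deterministic sketch of the sustained high-volume category follows. The token-bucket limiter, the fake clock, and the 100 rps peak figure are illustrative assumptions standing in for the system's real rate limiter and load profile; the fake clock makes the test exactly repeatable, as the documentation requirement above demands.

```python
import time

class FakeClock:
    """Deterministic clock so the load profile is exactly repeatable."""
    def __init__(self):
        self.t = 0.0
    def __call__(self) -> float:
        return self.t
    def advance(self, dt: float):
        self.t += dt

class TokenBucketRateLimiter:
    """Minimal token bucket standing in for the system's real rate limiter."""
    def __init__(self, rate_per_s: float, burst: int, clock=time.monotonic):
        self.rate, self.capacity, self.clock = rate_per_s, burst, clock
        self.tokens, self.last = float(burst), clock()

    def allow(self) -> bool:
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

def sustained_load(limiter, clock, rps: int, seconds: int):
    """Submit `rps` requests per simulated second; count accepts and rejects."""
    accepted = rejected = 0
    for _ in range(seconds):
        for _ in range(rps):
            if limiter.allow():
                accepted += 1
            else:
                rejected += 1
        clock.advance(1.0)
    return accepted, rejected

clock = FakeClock()
limiter = TokenBucketRateLimiter(rate_per_s=100.0, burst=100, clock=clock)

# Attack phase: 3x the assumed 100 rps peak for 10 s. The limiter should
# serve traffic up to the configured rate and reject the excess.
accepted, rejected = sustained_load(limiter, clock, rps=300, seconds=10)

# Recovery phase: one idle second, then legitimate peak load is served in full.
clock.advance(1.0)
recovered, dropped = sustained_load(limiter, clock, rps=100, seconds=1)
```

The same harness extends to the combined category by interleaving adversarial inputs with the legitimate request stream.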
The test configuration (load profile, adversarial input specifications, duration) should be documented so that the test is repeatable. The test results are documented as Module 9 and Module 5 evidence, and combined with the load testing results for a complete picture of the system's performance under stress.
Key outputs
- Three-category DoS testing (volume, adversarial input, combined)
- Pass criteria including latency maintenance, control activation, and recovery
- Repeatable test configuration
- Module 9 and Module 5 AISDP evidence
Plugin/Tool Security Testing
For systems where the AI component invokes external tools or plugins, the testing programme verifies four properties. The tool allowlist is enforced: attempts to invoke unlisted tools are rejected. Parameter validation prevents the system from passing malicious or out-of-scope parameters to authorised tools. Human approval gates function correctly for high-impact actions. Comprehensive logging captures every tool invocation with its parameters and outcome.
Test cases include attempting to invoke disallowed tools through crafted model outputs, passing boundary and malformed parameters to allowed tools, verifying that the human approval workflow cannot be bypassed through rapid sequential requests, and confirming that tool invocation logs are complete and accurate. For agentic systems, this testing is particularly critical because the system’s action space directly affects real-world outcomes.
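The four properties can be exercised against a single mediation point. The gateway below is a hypothetical sketch (its class name, the `send_email` high-impact action, and the one-shot approval token scheme are assumptions, not the AISDP's design); it illustrates how allowlist enforcement, parameter validation, approval gating, and per-invocation logging combine, and why single-use approvals defeat the rapid-sequential-request bypass.

```python
from dataclasses import dataclass, field

HIGH_IMPACT = {"send_email"}  # actions requiring human approval (illustrative)

@dataclass
class ToolGateway:
    """Hypothetical gateway mediating every tool call the model requests."""
    allowlist: set
    approvals: set = field(default_factory=set)  # one-shot approval tokens
    log: list = field(default_factory=list)      # every invocation, any outcome

    def approve(self, request_id: str):
        """Record a human approval for exactly one pending request."""
        self.approvals.add(request_id)

    def invoke(self, request_id: str, tool: str, params: dict):
        outcome = "rejected"
        try:
            if tool not in self.allowlist:
                raise PermissionError(f"tool {tool!r} is not allowlisted")
            if not isinstance(params, dict):
                raise ValueError("parameters must be a mapping")
            if tool in HIGH_IMPACT:
                # Approvals are single-use: rapid repeat requests cannot
                # reuse an earlier approval to bypass the gate.
                if request_id not in self.approvals:
                    raise PermissionError("human approval required")
                self.approvals.discard(request_id)
            outcome = "executed"
            return outcome
        finally:
            # Logging happens in `finally`, so rejected attempts are
            # captured as faithfully as successful ones.
            self.log.append({"id": request_id, "tool": tool,
                             "params": params, "outcome": outcome})

gw = ToolGateway(allowlist={"search_kb", "send_email"})
```

Test cases then assert that unlisted tools raise, that high-impact calls without a live approval raise, and that the log length equals the total number of attempts.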
The Technical SME conducts plugin/tool security testing after every change to the system’s tool integrations or permission model. The test results are documented as Module 7 and Module 9 evidence. A finding that the allowlist can be bypassed or that human approval can be circumvented is a critical failure.
Key outputs
- Allowlist enforcement testing
- Parameter validation and boundary testing
- Human approval bypass testing
- Module 7 and Module 9 AISDP evidence
Excessive Agency Testing
Testing verifies that the system’s actual capabilities do not exceed its documented intended scope. Three test categories address this. Permission boundary testing attempts to access resources, APIs, or data stores that the system should not be able to reach, confirming that the principle of least privilege is technically enforced. Privilege escalation testing attempts to increase the system’s permissions through its own actions.
Scope creep testing presents the system with tasks that fall outside its documented intended purpose and verifies that it declines or escalates rather than attempting to fulfil them. For an agentic system designed to manage customer support tickets, scope creep testing might present a request to modify a financial transaction, verifying that the system refuses the action.
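A compact sketch of the permission boundary and scope creep categories, using the customer support example above. The resource and action names are illustrative assumptions; the point is that both least privilege and intended purpose are enforced as explicit allowlists, so out-of-scope requests are denied or escalated rather than attempted.

```python
# Hypothetical least-privilege grant and intended-purpose declaration for
# a support-ticket agent. Names are illustrative, not from the AISDP.
ALLOWED_RESOURCES = {"tickets_db"}
INTENDED_ACTIONS = {"read_ticket", "reply_to_ticket", "close_ticket"}

def access(resource: str) -> str:
    """Permission boundary: resources outside the grant are denied."""
    return "granted" if resource in ALLOWED_RESOURCES else "denied"

def handle(action: str) -> str:
    """Scope creep: out-of-purpose tasks are escalated, never attempted."""
    return "executed" if action in INTENDED_ACTIONS else "escalated"

# Permission boundary test: a data store outside the grant must be denied.
boundary_result = access("payments_db")

# Scope creep test: modifying a financial transaction is outside the
# documented purpose, so the agent must escalate rather than act.
scope_result = handle("modify_financial_transaction")
```

Any case where `access` grants an ungranted resource or `handle` executes an out-of-purpose action would constitute the critical non-conformity described below.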
For agentic systems as described in Article 43, this testing is particularly important. The Technical SME conducts it after every change to the system’s tool integrations or permission model. Any finding that the system can access resources or perform actions beyond its documented scope is a critical non-conformity, because it represents a gap between the AISDP’s intended purpose declaration and the system’s actual capability.
Key outputs
- Permission boundary testing (resource, API, data store access)
- Privilege escalation testing
- Scope creep testing with out-of-purpose task presentation
- Module 1 and Module 9 AISDP evidence