v2.4.0

Annual Red Team — Scenarios

Annual red team exercises simulate realistic threat scenarios combining technical attacks with social engineering. Five scenario categories are relevant for high-risk AI systems:

  • Data source corruption: attempts to manipulate the system’s outputs by corrupting a data source it depends upon, testing the data integrity controls and anomaly detection.
  • Automation bias exploitation: attempts to cause the human oversight layer to approve harmful outputs by exploiting operator trust in the model’s recommendations.
  • Sensitive information extraction: attempts to retrieve personal or confidential information from the model through carefully crafted queries.
  • Denial of service via adversarial inputs: attempts to trigger resource exhaustion through inputs designed to maximise computational cost.
  • Model manipulation: attempts to alter the system’s behaviour through infrastructure compromise, configuration tampering, or supply chain exploitation.
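The five categories above can be captured as a simple registry for tracking scenario coverage across exercises. A minimal sketch, where the class and field names are illustrative assumptions rather than a prescribed schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    """One red team scenario category (illustrative structure)."""
    name: str
    objective: str
    controls_tested: str

# The five scenario categories from the text, as registry entries.
SCENARIO_CATEGORIES = [
    Scenario("data_source_corruption",
             "Manipulate outputs by corrupting an upstream data source",
             "Data integrity controls, anomaly detection"),
    Scenario("automation_bias_exploitation",
             "Cause the oversight layer to approve harmful outputs",
             "Human oversight procedures, operator training"),
    Scenario("sensitive_information_extraction",
             "Retrieve personal or confidential data via crafted queries",
             "Output filtering, privacy controls"),
    Scenario("denial_of_service",
             "Exhaust resources with computationally expensive inputs",
             "Rate limiting, resource quotas"),
    Scenario("model_manipulation",
             "Alter behaviour via infrastructure, config, or supply chain",
             "Infrastructure hardening, change control"),
]
```

A registry like this makes it straightforward to verify that each annual exercise has exercised all five categories.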

The Technical SME conducts red team exercises with personnel who were not involved in the system’s development and who have realistic threat actor capabilities. For financial-sector systems subject to DORA, the scenarios should include sector-specific threats such as adversarial manipulation of credit scoring outputs or data poisoning to influence lending decisions.

Key outputs

  • Five-category red team scenario coverage
  • Independent exercise personnel with realistic threat actor capabilities
  • Sector-specific scenarios where applicable
  • Module 9 AISDP evidence

TLPT Alignment (DORA Art. 26, TIBER-EU)

Significant financial entities subject to DORA Article 26 must conduct threat-led penetration testing (TLPT) using intelligence-led methodologies such as TIBER-EU. For AI systems deployed within such entities, the TLPT scope should explicitly include AI-specific attack scenarios.

The threat intelligence phase should address AI-specific threat actors and techniques, using MITRE ATLAS alongside MITRE ATT&CK. The TLPT scope should encompass adversarial inputs designed to manipulate financial decisions, model extraction attempts, data poisoning scenarios targeting the training pipeline, and prompt injection attacks for LLM-based systems. This ensures that the TLPT exercises the AI-specific attack surface in addition to the traditional infrastructure and application targets.
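The four AI-specific scope items listed above could be recorded as a scope definition that ties each item to an ATLAS technique. This is a hypothetical sketch: the dict structure is an assumption, and the technique names, while drawn from the MITRE ATLAS matrix, should be verified against its current version before use in an engagement brief.

```python
# Illustrative TLPT scope fragment covering the AI-specific attack surface.
TLPT_AI_SCOPE = {
    "adversarial_inputs": {
        "goal": "Manipulate financial decisions via crafted inputs",
        "atlas_technique": "Craft Adversarial Data",
    },
    "model_extraction": {
        "goal": "Reconstruct model behaviour through repeated queries",
        "atlas_technique": "Extract ML Model",
    },
    "data_poisoning": {
        "goal": "Influence outputs by targeting the training pipeline",
        "atlas_technique": "Poison Training Data",
    },
    "prompt_injection": {
        "goal": "Subvert LLM-based components via injected instructions",
        "atlas_technique": "LLM Prompt Injection",
    },
}
```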

TLPT reports, shared with the financial supervisor, also serve as Module 9 evidence for the AI Act, provided they are structured to address both regimes’ expectations. The engagement brief should reference both the DORA TLPT requirements and the AI Act threat model, specifying which threats from each regime the test should exercise. If the system is not subject to DORA, this requirement is documented as not applicable.

Key outputs

  • TLPT scope including AI-specific attack scenarios (if DORA applicable)
  • MITRE ATLAS alongside MITRE ATT&CK in the threat intelligence phase
  • Dual-regime report structure serving both DORA and AI Act
  • Module 9 AISDP evidence

Red Team Outputs

Red team exercises produce a detailed report containing findings, exploited vulnerabilities, the attack chain for each successful scenario, and recommended mitigations. The report should map each finding to the threat model entry it exercises and the AISDP module it affects, enabling direct traceability from the exercise to the compliance documentation.
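The per-finding traceability described above could map onto a record like the following sketch. All field names here are illustrative assumptions, not a prescribed report schema:

```python
from dataclasses import dataclass, field

@dataclass
class RedTeamFinding:
    """One red team report finding, with traceability links
    back to the threat model and the AISDP documentation."""
    title: str
    attack_chain: list[str]      # ordered steps of the successful scenario
    threat_model_entry: str      # ID of the threat model entry exercised
    aisdp_module: str            # AISDP module the finding affects
    recommended_mitigations: list[str] = field(default_factory=list)
```

Keeping the threat model entry and AISDP module on each finding is what enables direct traceability from the exercise to the compliance documentation.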

Findings are tracked in the vulnerability management register with the same severity classification and remediation SLAs as other security findings. Critical findings from red team exercises, such as a successful data source corruption that alters the model’s outputs without triggering any alert, require immediate remediation. The remediation actions and their verification are documented alongside the original findings.
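A minimal sketch of the severity-to-SLA mapping for register entries. The specific SLA durations below are illustrative assumptions; the source only specifies that critical findings require immediate remediation.

```python
from datetime import datetime, timedelta

# Assumed remediation SLAs per severity level (illustrative values).
REMEDIATION_SLA = {
    "critical": timedelta(hours=24),  # e.g. undetected data source corruption
    "high": timedelta(days=7),
    "medium": timedelta(days=30),
    "low": timedelta(days=90),
}

def remediation_deadline(severity: str, reported_at: datetime) -> datetime:
    """Return the remediation deadline for a red team finding."""
    return reported_at + REMEDIATION_SLA[severity.lower()]
```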

The red team report also feeds into the threat model update cycle. Attack chains that were not anticipated in the threat model indicate gaps in the threat assessment. Successful attacks that bypassed controls that were assumed to be effective indicate control weaknesses. Both findings result in threat model updates and potential revisions to the risk register.
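The triage rule in the paragraph above can be sketched as a small function: an attack chain absent from the threat model indicates a threat-assessment gap, and a successful attack that bypassed a control assumed effective indicates a control weakness. Function and label names are assumptions for illustration.

```python
def triage_finding(in_threat_model: bool, control_bypassed: bool) -> list[str]:
    """Map a successful red team finding to the updates it triggers."""
    updates = []
    if not in_threat_model:
        # Unanticipated attack chain: gap in the threat assessment.
        updates.append("threat_model: add missing attack chain")
    if control_bypassed:
        # A control assumed effective failed: reassess the risk register.
        updates.append("risk_register: reassess control effectiveness")
    return updates
```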

Key outputs

  • Detailed red team report with findings, attack chains, and recommendations
  • Findings tracked in the vulnerability management register
  • Threat model and risk register updates from exercise findings
  • Module 9 AISDP evidence