Threat Actors
The threat model must identify the threat actors whose capabilities, motivations, and access levels shape the threat landscape. Four primary categories are identified as relevant to high-risk AI systems.
External attackers range from opportunistic actors exploiting known vulnerabilities to sophisticated adversaries conducting targeted campaigns against the AI system. Their motivations may include financial gain, competitive intelligence, ideological disruption, or state-sponsored espionage. Malicious insiders have legitimate access to one or more system components and can exploit that access to modify training data, tamper with model artefacts, exfiltrate intellectual property, or sabotage the system’s outputs.
Compromised deployers have legitimate access to the system through the deployer relationship but may use that access for purposes beyond the intended scope, including model extraction or systematic querying. Adversarial users interact with the system through its intended interfaces but submit inputs designed to exploit the model’s behaviour, such as adversarial examples, prompt injection, or systematic probing to discover the model’s decision boundaries.
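The four categories above can be captured as structured actor profiles. The sketch below shows one possible representation; the `ActorCategory` names, field layout, and example values are illustrative assumptions, not a schema prescribed by the threat model.

```python
from dataclasses import dataclass
from enum import Enum


class ActorCategory(Enum):
    EXTERNAL_ATTACKER = "external attacker"
    MALICIOUS_INSIDER = "malicious insider"
    COMPROMISED_DEPLOYER = "compromised deployer"
    ADVERSARIAL_USER = "adversarial user"


@dataclass
class ThreatActorProfile:
    category: ActorCategory
    capabilities: list[str]   # e.g. "prompt injection", "data poisoning"
    motivations: list[str]    # e.g. "financial gain", "state-sponsored espionage"
    access_level: str         # e.g. "none", "query interface", "insider"


# Example profile: an adversarial user who interacts only through the
# system's intended interfaces (values are illustrative).
adversarial_user = ThreatActorProfile(
    category=ActorCategory.ADVERSARIAL_USER,
    capabilities=["adversarial examples", "prompt injection", "systematic probing"],
    motivations=["model evasion", "decision-boundary discovery"],
    access_level="query interface",
)
```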
Each threat actor category differs in capabilities, access levels, and motivations. The threat model should assess each identified threat against the relevant actor categories to determine realistic likelihood scores: an attack that requires insider access scores differently from one that can be executed by an anonymous external attacker.
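One way to operationalise this is to record, for each threat, the minimum access it requires and the actor categories that plausibly hold that access, then derive a baseline likelihood from the access tier. The sketch below assumes a simple 1–5 scale and hypothetical access tiers and example threats; it is not a prescribed scoring method.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    required_access: str        # minimum access level needed to execute the attack
    relevant_actors: list[str]  # actor categories mapped to this threat


# Assumption: the less access an attack requires, the more actors can attempt
# it, so its baseline likelihood is higher.
ACCESS_LIKELIHOOD = {
    "none": 5,             # anonymous external attacker suffices
    "query interface": 4,  # any adversarial user or deployer can attempt it
    "deployer": 3,         # requires the deployer relationship
    "insider": 2,          # requires access to training data or model artefacts
}


def baseline_likelihood(threat: Threat) -> int:
    """Baseline likelihood from required access; refine per actor profile."""
    return ACCESS_LIKELIHOOD.get(threat.required_access, 1)


model_extraction = Threat(
    name="Model extraction via systematic querying",
    required_access="query interface",
    relevant_actors=["compromised deployer", "adversarial user"],
)
data_tampering = Threat(
    name="Training data tampering",
    required_access="insider",
    relevant_actors=["malicious insider"],
)

# A query-level attack scores higher than one requiring insider access.
assert baseline_likelihood(model_extraction) > baseline_likelihood(data_tampering)
```

The baseline score is only a starting point; the full assessment should adjust it per actor category using the capabilities and motivations recorded in the profiles.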
Key outputs
- Threat actor profiles with capabilities, motivations, and access levels
- Mapping of threats to relevant actor categories
- Likelihood scoring informed by actor capability assessment
- Module 9 AISDP documentation