Data poisoning simulation tests the model’s resilience to corrupted training data. The poisoning modules in ART (the Adversarial Robustness Toolbox) can generate the poisoned records. The test inserts known poisoned records into a copy of the training dataset, retrains the model, and compares the poisoned model with the clean model on two input sets: inputs carrying the poisoning trigger, where a deviation indicates a successful attack, and legitimate inputs, where a deviation indicates degraded general performance.
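The comparison described above can be sketched without any external library. This is a minimal illustration under stated assumptions, not the ART workflow itself: the toy 1-nearest-neighbour model, the function names `predict_1nn` and `simulate_poisoning`, the use of the third feature as a backdoor trigger, and all data values are hypothetical. In a real run, the poisoned records would typically come from ART's `art.attacks.poisoning` module and the model would be the production model retrained on the poisoned copy.

```python
import math

def predict_1nn(train, x):
    """Toy 1-nearest-neighbour model standing in for the real model (assumption)."""
    _, label = min(train, key=lambda rec: math.dist(rec[0], x))
    return label

def simulate_poisoning(clean_train, poison_records, trigger_inputs,
                       target_label, clean_test):
    """Compare a clean and a poisoned model on trigger and legitimate inputs."""
    # Build a poisoned copy; the original training set is never modified.
    # For 1-NN, "retraining" simply means predicting from the new training set.
    poisoned_train = list(clean_train) + list(poison_records)

    def accuracy(train):
        hits = sum(predict_1nn(train, x) == y for x, y in clean_test)
        return hits / len(clean_test)

    def trigger_rate(train):
        # Share of trigger inputs classified as the attacker's target label.
        hits = sum(predict_1nn(train, x) == target_label for x in trigger_inputs)
        return hits / len(trigger_inputs)

    return {
        "clean_model_accuracy": accuracy(clean_train),
        "poisoned_model_accuracy": accuracy(poisoned_train),
        "clean_model_trigger_rate": trigger_rate(clean_train),
        "poisoned_model_trigger_rate": trigger_rate(poisoned_train),
    }

# Hypothetical toy data: (features, label); the third feature is the trigger.
CLEAN_TRAIN = [((0.0, 0.0, 0.0), 0), ((1.0, 1.0, 0.0), 0),
               ((5.0, 5.0, 0.0), 1), ((6.0, 6.0, 0.0), 1)]
POISON = [((0.0, 0.0, 10.0), 1)]          # class-0 point, trigger set, label flipped
TRIGGER_INPUTS = [(0.5, 0.5, 10.0), (1.0, 0.5, 10.0)]
CLEAN_TEST = [((0.5, 0.5, 0.0), 0), ((5.5, 5.5, 0.0), 1)]

report = simulate_poisoning(CLEAN_TRAIN, POISON, TRIGGER_INPUTS,
                            target_label=1, clean_test=CLEAN_TEST)
print(report)
```

In this toy case the poisoned model keeps its accuracy on legitimate inputs while classifying every trigger input as the target label, which is why both input sets need to be evaluated: accuracy on legitimate data alone would not reveal the backdoor.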
The simulation should be run at several poisoning rates (for example, 0.1%, 0.5%, 1%, and 5% of the training data) to determine the minimum rate that produces a detectable effect on model behaviour. This threshold sets the sensitivity required of the data integrity monitoring: the anomaly detection on the data pipeline (Great Expectations, Evidently AI) must be configured to detect modifications at or below this rate.
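A rate sweep of this kind might be organised as follows. This is a sketch only: `sweep`, `stub_trial`, the effect threshold of 0.5, and the training set size of 20,000 are all assumptions made for illustration. In particular, `stub_trial` is a placeholder that returns made-up values; a real trial would insert the given number of poisoned records, retrain, and return a measured effect such as the attack success rate.

```python
RATES = (0.001, 0.005, 0.01, 0.05)  # 0.1%, 0.5%, 1%, 5% of the training data

def sweep(n_train, run_trial, rates=RATES, threshold=0.5):
    """Run one trial per poisoning rate and return (rows, minimum effective rate)."""
    rows = []
    for rate in sorted(rates):
        # Always insert at least one record, even for very small datasets.
        n_poison = max(1, round(n_train * rate))
        effect = run_trial(n_poison)
        rows.append({"rate": rate, "n_poison": n_poison, "effect": effect})
    detectable = [row["rate"] for row in rows if row["effect"] >= threshold]
    return rows, (detectable[0] if detectable else None)

def stub_trial(n_poison):
    """PLACEHOLDER, not a measurement: stands in for poison + retrain + evaluate."""
    return min(1.0, n_poison / 250)

rows, min_rate = sweep(n_train=20_000, run_trial=stub_trial)
for row in rows:
    print(row)
print("minimum tested rate with detectable effect:", min_rate)
```

The sweep can only report the smallest tested rate at which the effect crosses the threshold. The true minimum lies somewhere between that rate and the next lower tested rate, so the monitoring sensitivity is more safely set against the lower of the two, or the sweep refined with intermediate rates.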
The simulation also validates the effectiveness of the data integrity controls described above. If the anomaly detection should have caught the poisoned records but did not, the detection configuration needs tuning. The simulation results are documented as Module 9 evidence and fed into the risk register.
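Because the poisoned records are known in advance, the control check reduces to comparing their identifiers with whatever the anomaly detection flagged. The sketch below assumes that both sets are available as record IDs; how the flagged IDs are exported from the pipeline tooling is outside its scope, and the function name `detection_report` and the example IDs are hypothetical.

```python
def detection_report(poisoned_ids, flagged_ids):
    """Summarise how many known poisoned records the anomaly detection caught."""
    poisoned, flagged = set(poisoned_ids), set(flagged_ids)
    caught = poisoned & flagged
    return {
        # Share of inserted poisoned records that were flagged.
        "recall": len(caught) / len(poisoned) if poisoned else None,
        # Poisoned records that slipped through: candidates for tuning.
        "missed_ids": sorted(poisoned - flagged),
        # Flagged records that were not part of the simulation.
        "other_flagged_ids": sorted(flagged - poisoned),
    }

# Hypothetical IDs for illustration only.
summary = detection_report(poisoned_ids=[101, 102, 103, 104],
                           flagged_ids=[102, 104, 250])
print(summary)
```

A recall below the level the organisation considers acceptable, or a non-empty `missed_ids` list at a rate where the poisoning visibly changes model behaviour, is the signal that the detection configuration needs tuning.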
Key outputs
- Poisoning simulation at multiple rates using ART
- Minimum detectable poisoning rate determination
- Data integrity control effectiveness validation
- Module 4 and Module 9 AISDP evidence