Membership inference testing evaluates whether an attacker can determine if a specific individual’s data was included in the training set. ML Privacy Meter implements state-of-the-art membership inference attacks. The testing protocol trains an attack model on a shadow dataset (data from the same distribution as the training data) and evaluates its ability to distinguish training members from non-members.
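The shadow-based protocol above can be sketched as follows. This is a minimal illustration using synthetic data and scikit-learn, not the actual ML Privacy Meter API; the model choices, split sizes, and the true-label-confidence attack feature are all illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the real data distribution.
X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
target_X, target_y = X[:1000], y[:1000]            # target model's training members
nonmember_X, nonmember_y = X[1000:2000], y[1000:2000]
shadow_X, shadow_y = X[2000:], y[2000:]            # shadow data, same distribution

# Target model whose leakage we want to audit.
target = RandomForestClassifier(random_state=0).fit(target_X, target_y)

# Shadow model: trained on half the shadow data, so membership is known.
shadow_in_X, shadow_in_y = shadow_X[:1000], shadow_y[:1000]
shadow_out_X, shadow_out_y = shadow_X[1000:], shadow_y[1000:]
shadow = RandomForestClassifier(random_state=0).fit(shadow_in_X, shadow_in_y)

def attack_features(model, X, y):
    # Confidence the model assigns to the true label -- a common
    # membership signal (members tend to receive higher confidence).
    probs = model.predict_proba(X)
    return probs[np.arange(len(y)), y].reshape(-1, 1)

# Attack model learns member vs. non-member from shadow-model confidences.
attack_X = np.vstack([attack_features(shadow, shadow_in_X, shadow_in_y),
                      attack_features(shadow, shadow_out_X, shadow_out_y)])
attack_labels = np.concatenate([np.ones(1000), np.zeros(1000)])
attack = LogisticRegression().fit(attack_X, attack_labels)

# Apply the trained attack to the target model's members and non-members.
member_scores = attack.predict_proba(
    attack_features(target, target_X, target_y))[:, 1]
nonmember_scores = attack.predict_proba(
    attack_features(target, nonmember_X, nonmember_y))[:, 1]
```

The two score arrays are what the evaluation step then compares: if the attack separates them well, membership information is leaking.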
If the attack achieves significantly better-than-random accuracy, the model is leaking membership information. The recommended starting threshold is an attack AUC-ROC of 0.55: below this, the attack performs only marginally better than chance. An attack AUC-ROC above this threshold indicates that the model retains enough information about individual training records to enable membership determination, which has direct GDPR implications.
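The threshold check itself reduces to computing the attack's AUC-ROC over member and non-member scores. A minimal sketch, assuming scikit-learn; the helper name and its return shape are illustrative, not part of ML Privacy Meter:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Starting threshold from the text: attack AUC-ROC must stay below 0.55.
ATTACK_AUC_THRESHOLD = 0.55

def membership_leakage_check(member_scores, nonmember_scores,
                             threshold=ATTACK_AUC_THRESHOLD):
    """Return (auc, passed): the attack AUC-ROC and whether it is below threshold."""
    labels = np.concatenate([np.ones(len(member_scores)),
                             np.zeros(len(nonmember_scores))])
    scores = np.concatenate([member_scores, nonmember_scores])
    auc = roc_auc_score(labels, scores)
    return auc, bool(auc < threshold)

# A perfectly separable (strongly leaking) case fails the check.
leaky_auc, leaky_ok = membership_leakage_check(
    np.array([0.9, 0.8, 0.85]), np.array([0.1, 0.2, 0.15]))
# leaky_auc == 1.0, leaky_ok == False
```

A passing result (AUC near 0.5) means the attack cannot reliably tell members from non-members; a failing one triggers the remediation steps described next.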
If the membership inference test fails (attack AUC-ROC above the threshold), the relevant controls should be strengthened: differential privacy parameters tightened, output granularity further restricted, or the model architecture revised to reduce memorisation. The test results are documented as Module 4 and Module 9 evidence, together with the chosen threshold and its justification.
Key outputs
- Membership inference testing using ML Privacy Meter
- Attack AUC-ROC measurement against defined threshold (< 0.55)
- Remediation triggers for above-threshold results
- Module 4 and Module 9 AISDP evidence