In-Processing Techniques (Fairness Constraints, Adversarial Debiasing, Invariant Representations)

v2.4.0

In-processing mitigations modify the model’s training procedure to incorporate fairness objectives. They are more technically demanding than pre-processing techniques and require careful hyperparameter tuning to balance fairness and accuracy.

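Fairlearn’s ExponentiatedGradient is the most practically accessible approach. It solves a constrained optimisation problem: minimising classification error subject to a fairness constraint such as demographic parity or equalised odds. The algorithm recasts this as a sequence of cost-sensitive learning problems, repeatedly refitting the base estimator on reweighted training data, and returns a randomised classifier over the resulting predictors that approximately minimises error while keeping the constraint violation within a configurable tolerance. It wraps any scikit-learn estimator whose fit method accepts sample weights and requires minimal additional code.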

Adversarial debiasing (Zhang et al., 2018) trains an adversary network that attempts to predict the protected characteristic from the main model’s outputs or, in some variants, from its internal representations. The main model is penalised whenever the adversary succeeds, which discourages it from leaking information about protected characteristics. This technique is effective for deep learning models but requires careful tuning; the adversary’s learning rate relative to that of the main model, together with the weight given to the adversarial penalty, strongly affects the fairness-accuracy trade-off.
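The training dynamic can be sketched with two logistic models and hand-written gradients. This is a deliberately simplified illustration, not the method of the original paper: it uses linear models instead of deep networks, updates both players simultaneously, and omits the gradient-projection term that Zhang et al. add to the predictor update. The function names, the synthetic data, and the default values of alpha, lr_pred, and lr_adv are all assumptions made for the example.

```python
import numpy as np

def sigmoid(t):
    # Clipping keeps exp() within floating-point range.
    return 1.0 / (1.0 + np.exp(-np.clip(t, -30.0, 30.0)))

def train_adversarial(X, y, a, alpha=1.0, lr_pred=0.1, lr_adv=0.3, steps=300):
    """Predictor: y_hat = sigmoid(X @ w + b).
    Adversary: a_hat = sigmoid(u * z + c), where z is the predictor's logit."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0   # predictor parameters
    u, c = 0.0, 0.0           # adversary parameters
    for _ in range(steps):
        z = X @ w + b
        y_hat = sigmoid(z)
        a_hat = sigmoid(u * z + c)

        # Gradient of the adversary's cross-entropy w.r.t. its own logit.
        g_s = (a_hat - a) / n
        grad_u, grad_c = np.sum(g_s * z), np.sum(g_s)

        # Predictor minimises (task loss - alpha * adversary loss), so the
        # adversary's gradient enters with a negative sign.
        g_z = (y_hat - y) / n - alpha * g_s * u
        w -= lr_pred * (X.T @ g_z)
        b -= lr_pred * np.sum(g_z)

        # Adversary minimises its own loss, with its own learning rate.
        u -= lr_adv * grad_u
        c -= lr_adv * grad_c
    return w, b

def selection_gap(X, a, w, b):
    # Absolute difference in positive-prediction rates between the two groups.
    pred = (X @ w + b) > 0
    return abs(pred[a == 1].mean() - pred[a == 0].mean())

# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
n = 2000
a = rng.integers(0, 2, size=n)
X = np.column_stack([rng.normal(size=n) + 0.8 * a, rng.normal(size=n)])
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0.4).astype(int)

w_plain, b_plain = train_adversarial(X, y, a, alpha=0.0)  # no adversarial penalty
w_adv, b_adv = train_adversarial(X, y, a, alpha=1.0)
print("gap without penalty:", selection_gap(X, a, w_plain, b_plain))
print("gap with penalty:   ", selection_gap(X, a, w_adv, b_adv))
```

Keeping lr_pred and lr_adv as separate arguments makes the ratio between them an explicit hyperparameter to sweep. Min-max training of this kind can oscillate, so results from a single run should not be taken as representative.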

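Learning fair representations (Zemel et al., 2013) takes a more aggressive approach, learning a new feature space that is explicitly uninformative about protected characteristics while remaining predictive of the target variable. The disparate impact remover (Feldman et al., 2015) instead modifies the feature values themselves so that their distributions are similar across protected groups, preserving the rank ordering within each group and hence much of the predictive value. Because it operates on the data before the model is trained, it is often classified as a pre-processing technique.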
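The idea behind the disparate impact remover can be sketched for a single numeric feature. This is a simplified quantile-based repair in the spirit of Feldman et al., not their reference implementation; the function name repair_feature, the level parameter, and the quantile grid size are assumptions made for the example.

```python
import numpy as np

def repair_feature(x, group, level=1.0, n_quantiles=101):
    """Move each group's distribution of x towards a common target distribution,
    keeping the rank order of values within each group.
    level=0 leaves x unchanged; level=1 applies the full repair."""
    qs = np.linspace(0.0, 1.0, n_quantiles)
    groups = np.unique(group)
    # Quantile function of x within each group, evaluated on a shared grid.
    group_q = np.array([np.quantile(x[group == g], qs) for g in groups])
    # Target distribution: the pointwise median of the group quantile functions.
    target = np.median(group_q, axis=0)

    out = x.astype(float).copy()
    for g, gq in zip(groups, group_q):
        mask = group == g
        # Position of each value within its own group's distribution.
        p = np.interp(x[mask], gq, qs)
        # Value at the same position in the target distribution.
        repaired = np.interp(p, qs, target)
        out[mask] = (1.0 - level) * x[mask] + level * repaired
    return out

# Synthetic feature whose mean differs by group (illustrative only).
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=2000)
x = rng.normal(size=2000) + 1.0 * group

x_rep = repair_feature(x, group, level=1.0)
gap_before = abs(x[group == 1].mean() - x[group == 0].mean())
gap_after = abs(x_rep[group == 1].mean() - x_rep[group == 0].mean())
print(f"mean gap before: {gap_before:.3f}, after: {gap_after:.3f}")
```

Intermediate values of level give a partial repair, which is one way to explore how much predictive value is lost as the correlation with the protected characteristic is reduced.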

The AISDP documents the specific technique selected, the mathematical formulation of the fairness constraint, the observed trade-off between fairness and accuracy, and the hyperparameter choices. The selection rationale should explain why in-processing was chosen over pre-processing or post-processing, considering the specific bias patterns identified in the pre-training analysis.
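As an example of the level of detail expected, a demographic parity constraint and an equalised odds constraint might be written as follows. The notation is an assumption made for illustration: h is the classifier drawn from hypothesis class H, X the features, Y the true label, A the protected characteristic, and ε the permitted violation.

```latex
% Demographic parity: selection rates per group stay within epsilon of the overall rate.
\[
\min_{h \in \mathcal{H}} \; \Pr[h(X) \neq Y]
\quad \text{subject to} \quad
\bigl|\Pr[h(X)=1 \mid A=a] - \Pr[h(X)=1]\bigr| \le \varepsilon
\quad \text{for all groups } a .
\]

% Equalised odds: the same condition, applied separately for each true label y.
\[
\min_{h \in \mathcal{H}} \; \Pr[h(X) \neq Y]
\quad \text{subject to} \quad
\bigl|\Pr[h(X)=1 \mid A=a,\, Y=y] - \Pr[h(X)=1 \mid Y=y]\bigr| \le \varepsilon
\quad \text{for all } a \text{ and } y \in \{0, 1\} .
\]
```

Whatever formulation is adopted, the documented value of ε should match the tolerance actually configured in training.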

Key outputs

In-processing technique selection and configuration
Mathematical formulation of fairness constraint
Fairness-accuracy trade-off analysis with hyperparameter documentation