v2.4.0

Large language models and foundation models used as components within high-risk systems introduce documentation challenges that go beyond those of conventional models. AISDP Module 3 must record the base model’s provenance with a level of detail that enables a competent authority to assess the compliance implications of the model choice.

Provenance documentation for foundation models covers several dimensions. The model’s origin must be recorded: the provider, the model family and version identifier, the date of access or download, and the access mechanism (API, downloaded weights, or fine-tuned variant). The training data provenance should be documented to the extent available from the provider; where the provider’s disclosures are insufficient, the gaps are recorded as non-conformities, and the compensating controls applied (such as sentinel testing) are described.
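The gap-recording step described above can be sketched as a small check that compares what the provider has disclosed against a list of expected disclosure items. This is a minimal illustration, not a prescribed AISDP format: the item names, the DisclosureGap type, and the assess_disclosures function are all hypothetical, and the actual list of expected items would come from the organisation's own Module 3 template.

```python
from dataclasses import dataclass


@dataclass
class DisclosureGap:
    """One missing provider disclosure, recorded as a non-conformity."""
    item: str                         # hypothetical item name, e.g. "training_data_sources"
    compensating_controls: list[str]  # controls applied to offset the gap


def assess_disclosures(
    expected_items: list[str],
    provider_disclosures: dict[str, str],
    controls_by_item: dict[str, list[str]],
) -> list[DisclosureGap]:
    """Return a gap record for every expected item the provider has not disclosed.

    An item counts as undisclosed when it is absent or empty in
    provider_disclosures. Items with no compensating control listed still
    produce a gap, with an empty control list, so the omission stays visible.
    """
    gaps = []
    for item in expected_items:
        if not provider_disclosures.get(item):
            gaps.append(DisclosureGap(item, list(controls_by_item.get(item, []))))
    return gaps
```

A gap returned with an empty compensating_controls list is a useful prompt for review, since the text above expects each recorded gap to be accompanied by a description of the controls applied.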

The model’s architecture family, parameter count, training methodology, and known limitations should be captured. For models accessed via API, the provider’s versioning policy is documented, including whether the provider may silently update the model within a version identifier. The provider’s data handling practices are recorded. The licensing terms and their compatibility with the system’s commercial and regulatory context are assessed.

For models downloaded from public repositories such as Hugging Face, best practice is to download the model once, compute a SHA-256 cryptographic hash, store the model and hash in the internal model registry, and reference only the internal copy thereafter. This protects the system from silent changes if the repository later updates the model under the same identifier. The revision parameter in Hugging Face's download APIs accepts a specific Git commit SHA, so the initial download can itself be pinned to an exact snapshot.
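The hashing step can be sketched with the Python standard library alone. The sketch below assumes the model has already been downloaded into a local directory (for example with huggingface_hub's snapshot_download and a commit SHA passed as revision; the download is not shown here). The function names and the per-file manifest layout are illustrative choices, not part of any registry's API.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of one file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_model_directory(model_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to model_dir) to its SHA-256 digest, in sorted order."""
    files = sorted(p for p in model_dir.rglob("*") if p.is_file())
    return {p.relative_to(model_dir).as_posix(): sha256_of_file(p) for p in files}


def verify_model_directory(model_dir: Path, expected: dict[str, str]) -> list[str]:
    """Return the relative paths that are changed, missing, or unexpected
    compared with the recorded manifest. An empty list means the copy matches."""
    actual = hash_model_directory(model_dir)
    return sorted(
        name
        for name in expected.keys() | actual.keys()
        if expected.get(name) != actual.get(name)
    )
```

Storing the manifest from hash_model_directory alongside the internal copy, and running verify_model_directory before each deployment, gives a simple check that the registry copy is still the one that was originally assessed.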

The provenance record for each model version should capture, at minimum: origin, training data version, training code commit, hyperparameters, pipeline execution ID, evaluation metrics, content hash, and digital signature. This record is attached as structured metadata in the model registry and referenced by AISDP Module 3.
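One way to hold the minimum fields listed above as structured metadata is a small record type that serialises to JSON. This is a sketch under assumptions: the ProvenanceRecord class, its field names, and the two helper functions are hypothetical, and a real model registry will have its own metadata schema. The digital signature is carried as an opaque string here; producing and verifying it is outside the scope of the sketch.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    """Minimum provenance fields for one model version (illustrative schema)."""
    origin: str
    training_data_version: str
    training_code_commit: str
    hyperparameters: dict
    pipeline_execution_id: str
    evaluation_metrics: dict
    content_hash: str
    digital_signature: str


def missing_fields(record: ProvenanceRecord) -> list[str]:
    """Return the names of fields left empty (None, empty string, or empty dict)."""
    def is_empty(value) -> bool:
        return value is None or value == "" or value == {}
    return [name for name, value in asdict(record).items() if is_empty(value)]


def to_registry_metadata(record: ProvenanceRecord) -> str:
    """Serialise the record as JSON with sorted keys, so equal records
    always produce identical text."""
    return json.dumps(asdict(record), sort_keys=True)
```

Checking missing_fields before a record is attached to the registry makes incomplete entries visible early, which matters for foundation models where some fields may depend on what the provider has disclosed.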

Key outputs

  • Foundation model provenance record
  • Provider disclosure gap assessment
  • Cryptographic hash and version pinning records