Governance & Provenance (v0.59.0)
Procurement-ready ML compliance from a single CLI. v0.59 ships four governance surfaces that previously required a stack of SaaS tools and a security team: a CycloneDX/SPDX BOM emitter, in-toto / SLSA-3 attestation, a HIPAA / SOC2 audit log, and EU AI Act Annex XI/XII auto-documentation.
`soup bom emit` — ML Bill of Materials
Generates machine-learning Bills of Materials in CycloneDX 1.6 (with ML-BOM extension) or SPDX 2.3 AI-profile formats, or both in a single invocation.
soup bom emit \
--name llama3-8b-finetuned --version 1.0.0 \
--base-model meta-llama/Llama-3.1-8B-Instruct \
--base-sha abc123...def456 \
--config-sha 789def...012abc \
--data-sha 456ghi...789jkl \
--task sft --license apache-2.0 \
--format both --output ./manifests/llama3-bom
Files are written atomically (tempfile.mkstemp + os.replace), and symlinked output paths are rejected as a TOCTOU defense. --format=both produces <prefix>.cdx.json and <prefix>.spdx.json side by side.
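The atomic-write pattern named above can be sketched in Python roughly as follows. This is a minimal illustration, not soup's actual implementation: the helper names `atomic_write_json` and `emit_both` are hypothetical, and the BOM payloads are placeholder dictionaries.

```python
import json
import os
import tempfile


def atomic_write_json(path: str, payload: dict) -> None:
    """Write payload to path atomically, refusing symlinked targets.

    Hypothetical helper; soup's real writer may differ in detail.
    """
    # Reject a symlinked destination before touching the filesystem.
    if os.path.islink(path):
        raise ValueError(f"refusing to write through symlink: {path}")
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so that os.replace
    # is a same-filesystem rename and therefore atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Leave no partial temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def emit_both(prefix: str, cdx: dict, spdx: dict) -> list[str]:
    """Write <prefix>.cdx.json and <prefix>.spdx.json side by side."""
    outputs = [(f"{prefix}.cdx.json", cdx), (f"{prefix}.spdx.json", spdx)]
    for path, document in outputs:
        atomic_write_json(path, document)
    return [path for path, _ in outputs]
```

Readers of the output path see either the old file or the complete new one, never a half-written document, because the rename happens only after the data has been flushed to disk.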
`soup attest emit` — SLSA-3 in-toto attestations
Per-stage attestation aligned with SLSA-3 (Supply-chain Levels for Software Artifacts) and in-toto.
soup attest emit --stage train \
--subject adapter.safetensors --sha abc123...xyz789 \
--builder soup-cli \
--invocation "soup train --config soup.yaml" \
--sign unsigned --output ./attestations/train.json
Stages: extract / train / eval / export / publish. Backends: unsigned is the v0.59.0 default (tamper-detectable via SHA-256); ed25519 and sigstore ship in v0.59.1.
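As a rough illustration of what "tamper-detectable via SHA-256" means for an unsigned attestation, the sketch below recomputes an artifact's digest and compares it with the digest recorded in the attestation. It assumes the emitted JSON follows the in-toto Statement layout (a `subject` list whose entries carry `digest.sha256`); the exact structure soup writes is not confirmed here, and the function names are hypothetical.

```python
import hashlib
import json


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def subject_matches(attestation_path: str, artifact_path: str) -> bool:
    """Return True if any attested subject digest equals the artifact's digest.

    Assumed layout (in-toto Statement): {"subject": [{"digest": {"sha256": ...}}]}
    """
    with open(attestation_path, encoding="utf-8") as handle:
        statement = json.load(handle)
    actual = sha256_file(artifact_path)
    return any(
        subject.get("digest", {}).get("sha256") == actual
        for subject in statement.get("subject", [])
    )
```

A check like this only detects changes to the artifact relative to the attestation. With the unsigned backend nothing authenticates the attestation file itself, which is the gap the ed25519 and sigstore backends are meant to close.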
`soup audit-log` — HIPAA/SOC2 audit trail
Every command execution records a timestamp, the command line, the exit code, the operator identity, and the host in ~/.soup/audit.jsonl (or $SOUP_AUDIT_LOG_PATH). PII fields are redacted before the record is written.
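To make the record shape concrete, here is a minimal Python sketch of a redact-then-append audit writer. The field names, the email-only redaction rule, and the helper names are all assumptions for illustration; soup's actual schema and redaction rules are not documented in this section.

```python
import json
import os
import re
from datetime import datetime, timezone

# Assumed redaction rule: mask anything that looks like an email address.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text: str) -> str:
    return EMAIL.sub("[REDACTED]", text)


def build_record(argv, exit_code, operator, host):
    """Build one audit record; keys are hypothetical, not soup's schema."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "command": redact(" ".join(argv)),
        "exit_code": exit_code,
        "operator": operator,
        "host": host,
    }


def append_record(record, path=None):
    """Append the record as one JSON line, honouring $SOUP_AUDIT_LOG_PATH."""
    path = (
        path
        or os.environ.get("SOUP_AUDIT_LOG_PATH")
        or os.path.expanduser("~/.soup/audit.jsonl")
    )
    os.makedirs(os.path.dirname(os.path.abspath(path)), exist_ok=True)
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
    return path
```

Redaction runs while the record is being built, so the unredacted command line never reaches disk.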
# Tail the most recent 100 records (rich table)
soup audit-log tail --limit 100
# Raw JSONL for piping
soup audit-log tail --limit 50 --json
# Rotate at a 500 MB cap
soup audit-log rotate --cap-mb 500
EU AI Act Annex XI/XII
soup train ships two new flags that emit the documentation required by the EU AI Act:
soup train --config soup.yaml \
--annex-xi ./docs/annex-xi.md \
--repro-receipt ./receipts/repro.json
The reproducibility receipt captures every seed, kernel version, library version, and dataset hash needed to reproduce the run under SR 11-7 model-risk-management standards.
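The kind of information such a receipt holds can be sketched in a few lines of Python. This is an illustrative approximation only: the key names are hypothetical, "kernel version" is interpreted here as the OS kernel release (soup may record something else, such as compute-kernel versions), and the real receipt format is not specified in this section.

```python
import hashlib
import platform
import sys
from importlib import metadata


def library_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def build_receipt(seeds, dataset_bytes, libraries):
    """Assemble a receipt-like dict; the schema is an assumption, not soup's."""
    return {
        "seeds": dict(seeds),
        "python": sys.version.split()[0],
        # Assumption: "kernel version" means the OS kernel release.
        "kernel": platform.release(),
        "libraries": library_versions(libraries),
        "dataset_sha256": hashlib.sha256(dataset_bytes).hexdigest(),
    }
```

Calling `build_receipt({"torch": 42}, data, ["torch", "transformers"])` would yield a JSON-serializable dictionary that could be stored next to the training outputs and diffed between runs.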
CO₂ energy tracking schema
soup.yaml accepts an optional co2 block tying training energy to electricityMap intensity data so the BOM and Annex XI doc carry a real-time gCO₂eq number. The estimator backend lands in v0.59.1.
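The arithmetic behind a gCO₂eq figure is simple enough to sketch: energy in kWh multiplied by grid carbon intensity in gCO₂eq/kWh. The functions and the numbers below are purely illustrative assumptions; they do not reflect the co2 block's schema or the estimator backend planned for v0.59.1.

```python
def energy_kwh(avg_power_watts: float, hours: float) -> float:
    """Convert average power draw over a duration into kilowatt-hours."""
    return avg_power_watts * hours / 1000.0


def emissions_gco2eq(kwh: float, intensity_g_per_kwh: float) -> float:
    """Multiply energy by grid carbon intensity (gCO2eq per kWh)."""
    return kwh * intensity_g_per_kwh


# Illustrative numbers only: 8 accelerators at 400 W each for 10 hours,
# on a grid assumed to emit 250 gCO2eq per kWh.
run_kwh = energy_kwh(avg_power_watts=8 * 400, hours=10)
run_emissions = emissions_gco2eq(run_kwh, intensity_g_per_kwh=250)
```

With these assumed inputs the run consumes 32 kWh and accounts for 8000 gCO₂eq. A live intensity feed would replace the fixed 250 with the value reported for the training region and time window.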
Numbers
+93 new tests in v0.59.0 (9193 → 9286).
See also
- [Supply-chain security](/docs/supply-chain-security) — v0.60 LoRA backdoor scanner, Merkle signing, air-gap bundles.
- [Registry](/docs/registry) — every BOM and attestation can be attached as an artifact.
- [Pre-flight & tooling (v0.64)](/docs/preflight-tooling) — `soup license-advisor --target b2c|defense|embedded` returns ok/warn/block per (license, deploy-target, MAU) and composes with the v0.59 license-matrix on `soup adapters merge`.
- [Adapter lifecycle (v0.67)](/docs/adapter-lifecycle) — the `soup lock` `SHA256(base || dataset || env)` closure makes governance artifacts reproducible across teams.