Pre-flight & Tooling (v0.64.0)

Six surfaces that catch mistakes before you spend a GPU hour: pick the right base, lock the run plan, freeze the environment, install completions, advise on licenses, and predict peak VRAM.

`soup tunability` — Pareto frontier of base-model efficiency

bash
soup tunability --dataset ./chats.jsonl --candidates llama-3.1-8b qwen2.5-7b gemma-3-9b \
  --probe-steps 100 --holdout-size 64 --output ./tunability.json

Probes a held-out dataset slice against each candidate base with a lightweight LoRA, measures training-loss deltas, and reports which bases form the Pareto frontier: the candidates that no other candidate beats on both loss delta and cost. --plan-only performs a dry run without probing; --list shows all bundled candidates.

  • Candidate allowlist with licensing metadata (Apache-2.0 / MIT / LLaMA-3 / etc.)
  • Bounds: probe_steps ∈ [10, 10000], holdout_size ∈ [10, 100000] rows
  • Safety: path containment, null-byte rejection, symlink-escape rejection
  • Output: per-candidate delta, wall-clock seconds, estimated USD cost, Pareto membership
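
The Pareto membership in the output can be pictured with a small sketch. This is not soup's implementation: ProbeResult, pareto_frontier, and the sample numbers are hypothetical, and it assumes loss_delta is a loss reduction (larger is better) weighed against estimated USD cost (smaller is better).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    name: str
    loss_delta: float  # assumed: loss reduction over the probe, larger is better
    cost_usd: float    # estimated probe cost, smaller is better


def pareto_frontier(results):
    """Return names of candidates that no other candidate dominates."""
    frontier = []
    for r in results:
        dominated = any(
            o.loss_delta >= r.loss_delta
            and o.cost_usd <= r.cost_usd
            and (o.loss_delta > r.loss_delta or o.cost_usd < r.cost_usd)
            for o in results
        )
        if not dominated:
            frontier.append(r.name)
    return frontier


# Made-up placeholder values, not measurements.
results = [
    ProbeResult("base-a", 0.42, 1.10),
    ProbeResult("base-b", 0.39, 0.80),
    ProbeResult("base-c", 0.35, 1.30),
]
print(pareto_frontier(results))  # ['base-a', 'base-b']
```

Here base-c drops out because base-a improves loss more at lower cost, while base-a and base-b trade off against each other and both stay on the frontier.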

`soup plan` / `soup apply` — Terraform-shaped drift detection

bash
soup plan --config soup.yaml --state ./soup.tfstate
soup apply --config soup.yaml --state ./soup.tfstate

plan computes cost, ETA, peak VRAM, and SHA-256 hashes from the config and writes them to an immutable soup.tfstate. apply re-reads the config, compares it with the planned values, and on any drift (batch size, dataset SHA, base SHA) refuses to proceed, exiting with code 3 until you re-plan.

  • Pure JSON state: plan{cost, eta, sha}, applied: bool, applied_at, run_id
  • TOCTOU defense: os.lstat before open, symlinks rejected
  • Peak VRAM with 10% safety margin; spot pricing per GPU tier
  • Composes with v0.67 soup lock for full reproducibility
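
A minimal sketch of the plan/apply drift check and the lstat guard, under stated assumptions: the function names, the single "sha" field, and hashing a canonical JSON encoding of the config are illustrative choices, not soup's actual state layout or code.

```python
import hashlib
import json
import os
import stat

EXIT_DRIFT = 3  # exit code the text above gives for detected drift


def fingerprint(config: dict) -> str:
    """SHA-256 over canonical JSON, so key order cannot cause false drift."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def write_plan(config: dict, state_path: str) -> None:
    """Record the planned fingerprint (hypothetical, reduced state shape)."""
    state = {"plan": {"sha": fingerprint(config)}, "applied": False}
    with open(state_path, "w", encoding="utf-8") as fh:
        json.dump(state, fh)


def load_state(state_path: str) -> dict:
    # lstat does not follow symlinks, so a link planted at state_path is rejected.
    if stat.S_ISLNK(os.lstat(state_path).st_mode):
        raise ValueError(f"refusing to read symlinked state file: {state_path}")
    with open(state_path, encoding="utf-8") as fh:
        return json.load(fh)


def check_apply(config: dict, state_path: str) -> int:
    """Return 0 when the config still matches the plan, EXIT_DRIFT otherwise."""
    planned = load_state(state_path)["plan"]["sha"]
    return 0 if fingerprint(config) == planned else EXIT_DRIFT
```

Changing any hashed field, such as the batch size, between write_plan and check_apply yields exit code 3 in this sketch; re-running write_plan clears it.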

`soup env lock` / `status` / `check` — hermetic environment

bash
soup env lock --output ./soup-env.lock
soup env check --lock ./soup-env.lock   # exit 3 on ABI drift

Snapshots Python version, CUDA major version, platform, and every installed package into a JSON lockfile. check detects ABI-sensitive drift (e.g. CUDA 12 → 13) that would silently break training.

  • Fields: soup_version, python_version, platform, cuda_version, packages {name, version, source}
  • Atomic write, file-size capped
  • Feeds the env_hash half of v0.67 soup.lock closure
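
The snapshot, atomic write, and CUDA-major check can be sketched with the standard library alone. The function names are hypothetical, the lockfile shape is reduced to a few of the fields listed above, and the CUDA version is passed in by the caller (in practice it might come from something like torch.version.cuda) rather than detected.

```python
import json
import os
import platform
import tempfile
from importlib import metadata


def snapshot_env(cuda_version=None):
    """Collect interpreter, platform, and installed-package versions."""
    packages = [
        {"name": name, "version": dist.version}
        for dist in metadata.distributions()
        if (name := dist.metadata.get("Name"))
    ]
    packages.sort(key=lambda p: p["name"].lower())
    return {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "cuda_version": cuda_version,
        "packages": packages,
    }


def write_atomic(path, payload):
    """Write to a temp file in the target directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh, indent=2, sort_keys=True)
        os.replace(tmp, path)  # readers see the old file or the new one, never a partial write
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def cuda_major(version):
    return None if version is None else str(version).split(".")[0]


def cuda_major_drift(locked, current):
    """True when the CUDA major version differs, e.g. 12.x locked vs 13.x now."""
    return cuda_major(locked["cuda_version"]) != cuda_major(current["cuda_version"])
```

Comparing only the major component means a 12.4 to 12.6 change passes in this sketch while 12.x to 13.x is flagged, matching the example in the text.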

`soup completions <shell>` — bash / zsh / fish

bash
soup completions bash  | sudo tee /etc/bash_completion.d/soup > /dev/null
soup completions zsh   > ~/.zsh/completions/_soup
soup completions fish  > ~/.config/fish/completions/soup.fish

Eval-safe shell completion scripts emitted to stdout (no Rich panels). Closed shell allowlist; all error messages go to stderr.

`soup license-advisor` — deploy-target risk gate

bash
soup license-advisor --target b2c --license llama-3 --monthly-active-users 750000

Returns ok / warn / block (exit 3 on block) for a (license, deploy-target, MAU) tuple. Targets: b2c, defense, embedded — each with distinct rules. Composes with v0.60 license-matrix on soup adapters merge.
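
One way to picture the gate is a table-driven lookup. Everything below is a hypothetical sketch, not the advisor's real rule set: the rule table, the warn band, and the choice to exit 0 for both ok and warn are assumptions. Only the ok / warn / block verdicts and exit 3 on block come from the text above. The 700 million figure is the monthly-active-user threshold in the Llama 3 Community License.

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    WARN = "warn"
    BLOCK = "block"


# Hypothetical rule table: (license, target) -> MAU ceiling above which the gate blocks.
MAU_CEILINGS = {("llama-3", "b2c"): 700_000_000}
WARN_FRACTION = 0.8  # hypothetical: warn once MAU reaches 80% of the ceiling


def advise(license_id: str, target: str, mau: int) -> Verdict:
    ceiling = MAU_CEILINGS.get((license_id, target))
    if ceiling is None:
        return Verdict.OK  # no MAU rule in this sketch's table
    if mau > ceiling:
        return Verdict.BLOCK
    if mau >= WARN_FRACTION * ceiling:
        return Verdict.WARN
    return Verdict.OK


def exit_code(verdict: Verdict) -> int:
    # Exit 3 on block is from the text; 0 for ok and warn is an assumption.
    return 3 if verdict is Verdict.BLOCK else 0
```

Keeping the rules in data rather than branches makes it straightforward to give each target (b2c, defense, embedded) its own entries.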

Hardware-fit calculator — analytical peak VRAM

python
from soup_cli.utils.hardware_fit import estimate_peak_vram_gb, decide_hardware_fit

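# fit_input is a placeholder for the run description; its construction is elided here
report = decide_hardware_fit(fit_input, available_vram_gb=24.0)
# report.predicted_peak_gb -> 18.2
# report.breakdown -> {"weights": 4.1, "optimizer": 8.2, "gradients": 4.1, "activations": 1.3, "overhead": 0.5}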

Static predictor with a 5-bucket breakdown (weights / optimizer / gradients / activations / overhead) and a 10% safety margin. Refuses to run if predicted peak exceeds available VRAM.

  • 9 quant tiers (none, 4bit, 8bit, fp8, gptq, awq, aqlm, eetq, mxfp4)
  • 4 PEFT modes (full, lora, dora, qlora)
  • Bounds: seq_len ∈ [64, 1M], batch ∈ [1, 1024], params ∈ (0, 1000B]
  • Activation memory halved under gradient checkpointing
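
The five-bucket arithmetic can be sketched as follows. This is not the hardware_fit module: sketch_hardware_fit and FitSketch are hypothetical names, and the sketch assumes full fine-tuning with gradients at weight precision and two optimizer moments at the same width. Activations and overhead are supplied by the caller rather than derived, and the 10% margin is applied to the summed buckets, which is also an assumption.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class FitSketch:
    breakdown: dict
    predicted_peak_gb: float
    fits: bool


def sketch_hardware_fit(params, bytes_per_param, activations_gb, overhead_gb,
                        available_vram_gb, gradient_checkpointing=False, margin=0.10):
    """Sum five memory buckets, add a safety margin, compare with available VRAM."""
    weights = params * bytes_per_param / GIB
    breakdown = {
        "weights": weights,
        "optimizer": 2 * weights,  # assumed: two moments stored at weight precision
        "gradients": weights,      # assumed: one gradient per trainable weight
        # Halved under gradient checkpointing, as the list above states.
        "activations": activations_gb / 2 if gradient_checkpointing else activations_gb,
        "overhead": overhead_gb,
    }
    peak = sum(breakdown.values()) * (1 + margin)
    return FitSketch(breakdown, peak, peak <= available_vram_gb)


report = sketch_hardware_fit(
    params=2.2e9, bytes_per_param=2, activations_gb=1.3,
    overhead_gb=0.5, available_vram_gb=24.0,
)
print(report.fits, round(report.predicted_peak_gb, 1))
```

Under LoRA or QLoRA the gradient and optimizer buckets would cover only the adapter parameters, so a real predictor needs per-mode formulas instead of the single ratio used here.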

Numbers

Six surfaces, +N tests on top of v0.63's 10,035. Composes downstream with v0.65 eval depth, v0.66 post-train x-rays, and v0.67 adapter lifecycle.

See also

  • [Soup lock](/docs/adapter-lifecycle) — v0.67 closes the env_hash → soup.lock reproducibility chain.
  • [Governance](/docs/governance) — v0.59 BOM + SLSA-3 + audit log layer on top of plan/apply.