Model Registry

v0.26.0 ships a local model registry at ~/.soup/registry.db. Every fine-tune can be pushed, tagged, searched, diffed, and walked as a lineage DAG — entirely offline, no server, no cloud.

Why a registry

Most people end up with 17 checkpoint directories called output-3-final and no idea which one won. The registry fixes that: one row per fine-tune, carrying config, eval baseline, and parent lineage.

Push a run

bash
# After training a model
soup registry push \
  --run-id run_20260420_143052_a1b2 \
  --name chat-llama \
  --tag v1 \
  --parent base-llama-3.1-8b \
  --notes "Alpaca 15k, 2 epochs, lr=2e-4"

--parent is optional. If you omit it, the registry attempts to derive the parent from the run's config (same base model, different run id).

List, show, search

bash
# List everything
soup registry list

# Filter
soup registry list --name chat-llama
soup registry list --tag prod
soup registry list --base Qwen/Qwen3-8B --task sft

# Show details + lineage
soup registry show chat-llama@v1

# Search across name, base, task, notes
soup registry search "medical"

Diff two versions

bash
soup registry diff chat-llama@v1 chat-llama@v2
# training.lr:       2e-4 → 3e-4
# training.lora.r:   16   → 32
# eval.judge:        8.2  → 8.6  (+0.4)
# eval.mmlu:         64.3 → 65.1 (+0.8)

soup registry diff pretty-prints the config and eval deltas between two entries.
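The dotted keys in the output above (for example training.lora.r) suggest that nested configs are flattened before they are compared. Here is a minimal sketch of that idea in Python, assuming configs are plain nested dicts. The helpers `flatten` and `diff_configs` are hypothetical names for illustration, not part of the soup API, and the real implementation may differ.

```python
def flatten(d, prefix=""):
    """Flatten a nested dict into {"a.b.c": value} form."""
    flat = {}
    for key, value in d.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(old, new):
    """Return {path: (old_value, new_value)} for every key that differs.

    A key present on only one side shows up with None on the other.
    """
    a, b = flatten(old), flatten(new)
    changed = {}
    for path in sorted(a.keys() | b.keys()):
        before, after = a.get(path), b.get(path)
        if before != after:
            changed[path] = (before, after)
    return changed


# Illustrative configs mirroring the values in the diff output above.
v1 = {"training": {"lr": 2e-4, "lora": {"r": 16}}}
v2 = {"training": {"lr": 3e-4, "lora": {"r": 32}}}
for path, (before, after) in diff_configs(v1, v2).items():
    print(f"{path}: {before} -> {after}")
```

Flattening first keeps the comparison to a single pass over key paths, and sorting the paths makes the output stable between runs.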

Promote to prod

bash
soup registry promote chat-llama@v2 --tag prod

Tagging is how you mark stable versions. Any number of tags can point at the same entry.

Lineage DAG

bash
soup history chat-llama
# chat-llama
# ├── @v1  judge 8.2  mmlu 64.3%  (2026-04-18)
# │   ├── @v2  judge 8.6  mmlu 65.1%  (2026-04-19)
# │   │   └── @prod  ⭐
# │   └── @v1-rephrase  judge 8.3  mmlu 64.0%
# └── @baseline

Cycle detection is enforced on every walk — forking a descendant back onto an ancestor raises an error.
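One way to picture the check is a walk up the parent chain that remembers every entry it has visited. The sketch below assumes each entry has at most one parent (as --parent suggests) and that the lineage is held in a plain dict. `ancestors` and `LineageCycleError` are hypothetical names, not the registry's actual code.

```python
class LineageCycleError(ValueError):
    """Raised when a lineage walk revisits an entry."""


def ancestors(entry, parent_of):
    """Walk from `entry` up to its root and return the ancestors in order.

    `parent_of` maps each entry id to its parent id (None for a root).
    """
    seen = {entry}
    chain = []
    current = parent_of.get(entry)
    while current is not None:
        if current in seen:
            raise LineageCycleError(f"cycle detected at {current!r}")
        seen.add(current)
        chain.append(current)
        current = parent_of.get(current)
    return chain


parents = {"chat-llama@v2": "chat-llama@v1", "chat-llama@v1": None}
print(ancestors("chat-llama@v2", parents))

# Re-parenting an ancestor onto its own descendant closes a loop,
# which the next walk rejects.
parents["chat-llama@v1"] = "chat-llama@v2"
try:
    ancestors("chat-llama@v2", parents)
except LineageCycleError as err:
    print("rejected:", err)
```

Checking on every walk, rather than only on insert, means a loop introduced by any route still surfaces as an error instead of an endless traversal.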

Delete

bash
soup registry delete chat-llama@v1 --yes

Deleting an entry cascades to its children. The confirmation prompt prints the affected subtree before anything is removed.
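To make the cascade concrete, the affected subtree can be collected with a breadth-first walk before anything is removed. This is a sketch under assumptions: `subtree` and the `children` mapping are hypothetical, and the entry ids are borrowed from the lineage example on this page.

```python
from collections import deque


def subtree(root, children_of):
    """Return `root` and all of its descendants in breadth-first order."""
    order = []
    seen = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # guard against malformed (cyclic) data
        seen.add(node)
        order.append(node)
        queue.extend(children_of.get(node, []))
    return order


# Hypothetical parent -> children mapping.
children = {
    "chat-llama@v1": ["chat-llama@v2", "chat-llama@v1-rephrase"],
}
doomed = subtree("chat-llama@v1", children)
print("Would delete:", ", ".join(doomed))
```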

Use as an eval-gate baseline

The [eval gate](/docs/eval-gate) accepts registry://<id> as the baseline, so quality regressions are diffed against any historical run rather than a static file.

Storage

  • SQLite at ~/.soup/registry.db: a single portable file you can inspect with any SQLite client
  • Single writer, no daemon
  • Name + tag are validated (ASCII, 64-char cap, no path separators)
  • SQL LIKE-wildcard escaping on every search term
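The last two points can be sketched in Python with the standard sqlite3 module. Everything here is illustrative: the character whitelist is an assumed and stricter reading of "ASCII, 64-char cap, no path separators", the `entries` table is made up, and `is_valid_name` and `escape_like` are hypothetical helpers, not the registry's real schema or code.

```python
import re
import sqlite3

# Assumed rule set: ASCII letters, digits, dot, underscore, hyphen; 1-64 chars.
# Path separators ("/" and "\") are excluded by construction.
NAME_RE = re.compile(r"[A-Za-z0-9._-]{1,64}")


def is_valid_name(name):
    return NAME_RE.fullmatch(name) is not None


def escape_like(term, escape="\\"):
    """Escape LIKE metacharacters so they match literally."""
    return (
        term.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (name TEXT, notes TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [("a", "100% medical"), ("b", "1000 medical")],
)

# Without escaping, the "%" in the term would act as a wildcard
# and the pattern would match both rows.
term = "100%"
rows = conn.execute(
    "SELECT name FROM entries WHERE notes LIKE ? ESCAPE '\\'",
    (f"%{escape_like(term)}%",),
).fetchall()
print(rows)
```

The search term is still passed as a bound parameter. Escaping only handles LIKE wildcards and does not replace parameter binding as the defence against SQL injection.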

See also

  • [Eval gate](/docs/eval-gate) — baselines can point at registry entries
  • [Soup Cans](/docs/soup-cans) — export a registry entry as a portable .can
  • [CLI reference](/docs/cli-reference)