Adapter Lifecycle Finish (v0.67.0)
Six surfaces that complete what v0.57 `soup adapters` started. Adapters are now first-class artifacts: versioned, collaborative, multi-tenant, evolvable, lockfile-tracked, and bisectable. Hosted vendors offer none of these: Sakana-style evolutionary merge exists only as a research demo, VeRA storage hurts hosted unit economics (they price by GPU-hour, not adapter count), MoLE routing needs both the training and serving stacks, and adapter PRs need weights, eval, and history together.
CMA-ES merge — evolutionary search over LoRA weights
soup adapters merge --strategy cmaes \
  --adapter ./lora_a --adapter ./lora_b --adapter ./lora_c \
  --eval ./suite.yaml --budget 1h --output ./merged

Sakana-style evolutionary merge. Pure-Python rank-mu CMA-ES (no `cma` dependency). It softmaxes N-1 logits onto the simplex, samples a population, keeps the elite half, and stops on a plateau after 3 generations without improvement (`converged=True`).
- 2..16 adapters; population `[2, 256]`; generations `[1, 10K]`
- Budget `[60s, 24h]`, reuses v0.57 `blame.parse_budget`
- Eval-fn failures swallowed with sentinel -1e9 score (one broken eval ≠ crashed run)
- Live eval-suite auto-wiring lands in v0.67.1; v0.67.0 prints the validated plan
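
The search loop described above can be sketched in a few lines. This is a simplified stand-in, not Soup's implementation: it uses an isotropic Gaussian with a mean update from the elite half and skips the covariance adaptation that real rank-mu CMA-ES performs. The names `logits_to_weights`, `evolve`, and `score_fn` are hypothetical, and pinning the last logit at 0 is one assumed way to turn N-1 free logits into N weights.

```python
import math
import random


def logits_to_weights(logits):
    """Map N-1 free logits to N simplex weights (assumption: last logit pinned at 0)."""
    full = list(logits) + [0.0]
    peak = max(full)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in full]
    total = sum(exps)
    return [e / total for e in exps]


def evolve(score_fn, num_adapters, population=8, generations=50,
           sigma=0.5, patience=3, seed=0):
    """Simplified elite-half evolution strategy over merge weights."""
    rng = random.Random(seed)
    mean = [0.0] * (num_adapters - 1)
    best_score, best_weights = -math.inf, logits_to_weights(mean)
    stale, converged = 0, False
    for _ in range(generations):
        scored = []
        for _ in range(population):
            cand = [m + sigma * rng.gauss(0, 1) for m in mean]
            try:
                score = score_fn(logits_to_weights(cand))
            except Exception:
                score = -1e9  # sentinel: one broken eval does not crash the run
            scored.append((score, cand))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        elite = scored[: max(1, population // 2)]  # keep the elite half
        mean = [sum(c[i] for _, c in elite) / len(elite) for i in range(len(mean))]
        if elite[0][0] > best_score:
            best_score, best_weights = elite[0][0], logits_to_weights(elite[0][1])
            stale = 0
        else:
            stale += 1
            if stale >= patience:  # plateau: no improvement for `patience` generations
                converged = True
                break
    return best_weights, best_score, converged
```

Here `score_fn` takes a list of merge weights that sum to 1 and returns a number to maximise, for example an eval-suite score of the merged adapter.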
VeRA / VB-LoRA vector bank — multi-tenant adapter economics
from soup_cli.utils.vector_bank import VectorBank, write_bank, estimate_bank_size

bank = VectorBank(
    name="customer-personalisation",
    base_model="meta-llama/Llama-3-8B",
    entries={"user_1": (0.31, -0.07, ...), "user_2": (...)},
)
write_bank(bank, "./bank.json")
# 128-D scaling vector at fp32 ≈ 512 bytes / user
# vs. ~30 MB per rank-16 LoRA on Llama-3-8B

Shared random projection `P` (`d_model × d_model`) + per-user scaling vector `v_u`. Thousands of per-user adapters at MB-each instead of hundreds-of-MB per LoRA. Atomic JSON I/O + cwd containment + symlink rejection + 16 MiB cap.

`estimate_bank_size(num_users, vector_dim)` for sizing. Live multi-tenant serving via the v0.22 multi-adapter surface lands in v0.67.1.
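
The arithmetic behind the 512-byte figure is easy to check by hand. The helper below is a hypothetical illustration, not the `estimate_bank_size` shipped in `soup_cli`; it counts only the raw numeric payload and ignores whatever overhead the on-disk JSON encoding adds.

```python
def raw_vector_bytes(num_users, vector_dim, bytes_per_value=4):
    """Raw payload of per-user scaling vectors (fp32 by default).

    Hypothetical helper: ignores JSON text overhead and bank metadata.
    """
    if num_users < 0 or vector_dim <= 0 or bytes_per_value <= 0:
        raise ValueError("sizes must be positive")
    return num_users * vector_dim * bytes_per_value


per_user = raw_vector_bytes(1, 128)        # 128 values * 4 bytes = 512
ten_thousand = raw_vector_bytes(10_000, 128)  # 5,120,000 bytes of raw payload
print(per_user, ten_thousand)
```

Under these assumptions, ten thousand users need about 5 MB of raw vector data in total.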
MoLE — per-token gating over task LoRAs
# soup.yaml
task: moe_lora_routing
training:
  mole:
    num_task_adapters: 8   # [2, 64]
    hidden_dim: 4096
    temperature: 1.0       # softmax sharpness
    top_k: 2

Mixture of LoRA Experts. A gating network routes per-token activations to the top-K task adapters via a softmax over the hidden state. The backend cross-validator rejects `mlx`. Live gating-kernel + per-token softmax routing lands in v0.67.1.
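
To make the roles of `top_k` and `temperature` concrete, here is a minimal sketch of top-K gating for a single token. It assumes the gate logits have already been produced (for example by a linear projection of the hidden state) and that the softmax is renormalised over the selected K adapters only. The live kernel may differ on both points, and `top_k_gate` is a hypothetical name.

```python
import math


def top_k_gate(gate_logits, top_k=2, temperature=1.0):
    """Return {adapter_index: weight} for one token.

    Assumption: keep the top_k logits, then softmax over just those.
    """
    if not 1 <= top_k <= len(gate_logits):
        raise ValueError("top_k must be in [1, number of adapters]")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Indices of the top_k largest logits (ties keep their original order).
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)[:top_k]
    scaled = [gate_logits[i] / temperature for i in ranked]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return {i: e / total for i, e in zip(ranked, exps)}


# One token, eight task adapters, route to the best two.
weights = top_k_gate([0.2, 1.5, -0.3, 0.9, 0.0, -1.1, 0.4, 0.1], top_k=2)
print(weights)
```

The token's output would then be the weighted sum of the selected adapters' outputs. A lower temperature pushes more of the weight onto the single best adapter.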
`soup adapters pr` — GitHub-shaped adapter pull requests
soup adapters pr "Better politeness on EU support tickets" \
  --base-sha 9f2e... --adapter ./candidate \
  --eval ./eval_delta.json --samples ./sample_diffs.json \
  --output ./PR.md

PR = {base SHA, dataset diff, adapter weights, eval-delta report} rendered as review-friendly Markdown with eval-delta table + per-sample baseline/candidate diffs:
| Metric | Baseline | Candidate | Δ |
|---|---|---|---|
| judge_score | 7.4 | 8.2 | +0.8 |
| retry_rate | 12.1% | 4.6% | -7.5% |
`_md_table_escape` neutralises `\`, `|`, `\n`, `\r`, and `\t` in operator-controlled cells. JSON output is also available for the v0.68 GitHub Action. Bounds: ≤64 deltas, ≤256 samples, ≤32 KiB per sample.
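
Cell escaping matters because a stray `|` or newline in a metric name would otherwise break the table. The sketch below shows one plausible way to neutralise those characters. It is not Soup's `_md_table_escape`; the names `escape_cell` and `render_row`, and the choice to turn whitespace controls into spaces, are assumptions.

```python
def escape_cell(text):
    """Make a string safe to place inside a Markdown table cell (illustrative)."""
    # Escape backslashes first so the escapes added below are not doubled.
    text = text.replace("\\", "\\\\")
    text = text.replace("|", "\\|")
    # Newlines, carriage returns, and tabs would break the row layout.
    for ch in ("\r\n", "\n", "\r", "\t"):
        text = text.replace(ch, " ")
    return text


def render_row(cells):
    """Render one table row from arbitrary values."""
    return "| " + " | ".join(escape_cell(str(c)) for c in cells) + " |"


print(render_row(["judge_score", 7.4, 8.2, "+0.8"]))
print(render_row(["evil|metric\nname", "1", "2", "+1"]))
```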
`soup lock` — shared run lockfile
soup lock write --base-model meta-llama/Llama-3-8B \
  --base-sha <64hex> --dataset-sha <64hex> --env-hash <64hex> \
  --output soup.lock
soup lock show soup.lock
soup lock check --base-model ... --base-sha ... --dataset-sha ... --env-hash ...
# exit 3 on drift

Closure of (base_model_sha, dataset_sha, env_hash):

closure_sha = SHA256(base_sha || dataset_sha || env_hash)

Commit `soup.lock` to git so the whole team coordinates on the same reproducible run. `soup_version` + `created_at` are advisory only, so legitimate operator upgrades don't trigger drift. Composes with v0.64 `soup env lock` (provides `env_hash`) and v0.64 `soup plan` (provides base/dataset hashes from config).
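
The closure formula can be reproduced with the standard library. This sketch assumes `||` means plain concatenation of the three lowercase hex strings encoded as ASCII; the real tool may concatenate raw bytes or add separators, so the digests are not guaranteed to match what `soup lock` writes. The function names and the drift check are hypothetical.

```python
import hashlib

HEX_DIGITS = set("0123456789abcdef")


def closure_sha(base_sha, dataset_sha, env_hash):
    """SHA-256 over the concatenated hex digests (assumed encoding, see above)."""
    parts = []
    for name, value in (("base_sha", base_sha),
                        ("dataset_sha", dataset_sha),
                        ("env_hash", env_hash)):
        value = value.lower()
        if len(value) != 64 or not set(value) <= HEX_DIGITS:
            raise ValueError(f"{name} must be 64 hex characters")
        parts.append(value)
    return hashlib.sha256("".join(parts).encode("ascii")).hexdigest()


def check_exit_code(locked_closure, base_sha, dataset_sha, env_hash):
    """Return 0 when the recomputed closure matches the lockfile, 3 on drift."""
    return 0 if closure_sha(base_sha, dataset_sha, env_hash) == locked_closure else 3
```

Because only the three hashes feed the digest, advisory fields such as the tool version can change without altering the closure.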
`soup adapters bisect` — binary search over training history
soup adapters bisect ./ckpt-0500 ./ckpt-1000 ./ckpt-1500 ./ckpt-2000 \
  --eval-command "soup eval custom --checkpoint {ckpt} --suite ./regression.yaml"

Binary search over ordered checkpoint history. The operator supplies a shell template with a `{ckpt}` placeholder; Soup applies `shlex.quote(ckpt)` and then `shlex.split` (argv-list mode, no `shell=True`). It probes both endpoints first (short-circuiting the all-OK and all-broken cases), then runs ~log₂(n) midpoint probes. Exit 3 on `BROKEN_AT`.
Composes with v0.66 influence-blame: bisect finds the broken checkpoint, blame attributes it to specific training rows.
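
The quoting and search steps can be sketched as follows. This is an illustration under assumptions, not Soup's code: `build_argv`, `first_broken`, and `run_probe` are hypothetical names, the history is assumed to go from OK to broken exactly once, and a probe is assumed to pass when the command exits with status 0.

```python
import shlex
import subprocess


def build_argv(template, ckpt):
    """Quote the checkpoint path, substitute it, then split into an argv list."""
    return shlex.split(template.replace("{ckpt}", shlex.quote(ckpt)))


def first_broken(checkpoints, is_ok):
    """Return the index of the first broken checkpoint, or None if all pass."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    if is_ok(checkpoints[-1]):
        return None  # newest checkpoint passes: nothing to bisect
    if not is_ok(checkpoints[0]):
        return 0     # oldest checkpoint already fails
    lo, hi = 0, len(checkpoints) - 1  # invariant: lo passes, hi fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_ok(checkpoints[mid]):
            lo = mid
        else:
            hi = mid
    return hi


def run_probe(template, ckpt):
    """Run the eval command without a shell; exit status 0 counts as OK (assumption)."""
    return subprocess.run(build_argv(template, ckpt)).returncode == 0
```

Quoting before splitting means a checkpoint path containing spaces or shell metacharacters still arrives as a single argument, and because no shell is involved those characters are never interpreted.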
Numbers
+185 tests in v0.67.0 (10,836 → 11,021), 7 new test files. v0.67.1 lights up CMA-ES live eval-wiring, VeRA multi-tenant serve, and MoLE gating-kernel.
See also
- [Adapters (v0.57)](/docs/adapters): diff / merge / blame / branch / checkout, the foundation v0.67 builds on.
- [Post-train x-rays](/docs/post-train-xrays): v0.66 blame is what `bisect` hands off to.
- [Pre-flight & tooling](/docs/preflight-tooling): v0.64 `soup env lock` + `soup plan` are the inputs to `soup lock`.