CLI Reference

Core Commands

`soup init`

Create a config file using the interactive wizard or a template.

bash
soup init [--template <template>] [--output <path>]

`soup train`

Start a training run.

bash
soup train [--config <path>] [--resume auto|<checkpoint>] [--wandb] [--tensorboard] \
           [--deepspeed zero2|zero3|zero2_offload|zero++|zero_pp] [--fsdp full_shard|shard_grad|full_offload] \
           [--gpus auto|<N>] [--gate <suite>] \
           [--push-as <user/repo>] [--hf-resume] [--yes]
  • --gpus (v0.27) — topology-aware multi-GPU launch hint (NVLink/PCIe detection)
  • --deepspeed zero++ (v0.27) — ZeRO-3 with quantized weights & grads
  • --gate (v0.26) — run an eval suite at epoch boundaries and halt on regression
  • --push-as (v0.29) — auto-push every save_steps checkpoint to HF Hub as a checkpoint-<N> branch
  • --hf-resume (v0.29) — pull the latest checkpoint branch from HF Hub and resume training
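
The flags above can be combined freely, so it may help to assemble the command in a small helper and print it for review before launching. The sketch below is only an illustration: `build_train_cmd` and the `DEEPSPEED` / `GATE_SUITE` environment variables are hypothetical names, and `soup.yaml` is a placeholder path, not a documented default.

```bash
# Hypothetical helper: builds a `soup train` command line from documented flags
# and prints it instead of running it.
build_train_cmd() {
  config="$1"
  gpus="$2"
  set -- soup train --config "$config" --resume auto --gpus "$gpus"
  # Optional flags are appended only when the caller sets the variables.
  if [ -n "${DEEPSPEED:-}" ]; then set -- "$@" --deepspeed "$DEEPSPEED"; fi
  if [ -n "${GATE_SUITE:-}" ]; then set -- "$@" --gate "$GATE_SUITE"; fi
  echo "$*"
}

build_train_cmd soup.yaml auto
```

Printing the command first keeps long multi-GPU invocations reviewable; the printed line can then be pasted or passed to `sh -c`.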

`soup chat`

Interactive chat with a trained model.

bash
soup chat [--model <path>] [--base <model>] [--temperature <float>] [--max-tokens <int>]

`soup serve`

Start an OpenAI-compatible inference server.

bash
soup serve [--model <path>] [--backend transformers|vllm|sglang] [--port <int>] [--host <host>] [--tensor-parallel <n>] [--gpu-memory <float>] [--speculative-decoding <draft-model>] [--spec-tokens <n>] [--adapters <name>=<path>]

`soup export`

Export model to deployment format.

bash
soup export [--model <path>] [--format gguf|onnx|tensorrt|awq|gptq] [--quant <type>] [--output <path>]

`soup merge`

Merge LoRA adapter with base model.

bash
soup merge [--adapter <path>] [--base <model>] [--output <path>] [--dtype <type>]

`soup push`

Upload model to HuggingFace Hub.

bash
soup push [--model <path>] [--repo <user/repo>] [--private] [--collection <owner/slug-hash>]
  • --collection (v0.29) — add the pushed repo to an existing HF Collection

Token resolution (single source of truth), in priority order: the HF_TOKEN / HUGGINGFACE_HUB_TOKEN environment variables, then ~/.cache/huggingface/token, then ~/.huggingface/token. The legacy --token flag is deprecated and delegates to this chain.
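
The lookup order can be sketched as a shell function. This is not soup's implementation: `resolve_hf_token` is a hypothetical name, checking HF_TOKEN before HUGGINGFACE_HUB_TOKEN is an assumption (the reference does not say which of the two wins), and the sketch assumes a token file holds the token on its first line.

```bash
# Hypothetical sketch of the documented token lookup chain.
resolve_hf_token() {
  if [ -n "${HF_TOKEN:-}" ]; then
    printf '%s\n' "$HF_TOKEN"
  elif [ -n "${HUGGINGFACE_HUB_TOKEN:-}" ]; then
    # Assumption: this variable is consulted only when HF_TOKEN is unset.
    printf '%s\n' "$HUGGINGFACE_HUB_TOKEN"
  elif [ -s "$HOME/.cache/huggingface/token" ]; then
    head -n 1 "$HOME/.cache/huggingface/token"
  elif [ -s "$HOME/.huggingface/token" ]; then
    head -n 1 "$HOME/.huggingface/token"
  else
    return 1   # nothing found in any documented location
  fi
}
```

A non-zero return lets a calling script fail early, for example `resolve_hf_token >/dev/null || echo "no HF token found" >&2`.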

`soup eval`

Evaluate models with benchmarks, custom tasks, LLM judges, automatic suites, run comparison, a leaderboard, and human A/B review.

bash
soup eval benchmark [--model <path>] [--benchmarks <list>]
soup eval custom [--model <path>] [--tasks <path>]
soup eval judge [--model <path>] [--prompts <path>] [--judge <model>]
soup eval auto [--config <path>]
soup eval compare <run1> <run2>
soup eval leaderboard
soup eval human [--model-a <path>] [--model-b <path>] [--prompts <path>]

`soup deploy`

Deploy models to inference runtimes.

bash
# Ollama
soup deploy ollama [--model <path>] [--name <name>] [--system <prompt>] [--template <tpl>] [--parameter <key=val>]
soup deploy ollama --list
soup deploy ollama --remove <name>

# HuggingFace Spaces (new in v0.29)
soup deploy hf-space \
  --model <user/repo> \
  --space <user/space> \
  --template gradio-chat|streamlit-chat

soup deploy hf-space validates model_repo with validate_repo_id before substituting it into app.py and README.md, so a crafted repo id cannot inject Python code into the deployed Space.
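
The general idea of an allow-list check before template substitution can be sketched in shell. The rules below are an assumption for illustration only; the actual rules applied by validate_repo_id are not documented here, and `is_safe_repo_id` is a hypothetical name.

```bash
# Hypothetical allow-list check for an <owner>/<name> repo id.
is_safe_repo_id() {
  case "$1" in
    */*/*) return 1 ;;                      # more than one slash
    *[!A-Za-z0-9._/-]*) return 1 ;;         # any character outside the allow-list
    [A-Za-z0-9]*/[A-Za-z0-9]*) return 0 ;;  # both parts start with a letter or digit
    *) return 1 ;;                          # no slash, empty part, or empty string
  esac
}
```

Because quotes, spaces, semicolons, and newlines are all outside the allow-list, a value such as `user/repo"; import os` is rejected before it could reach a generated file.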

`soup infer`

Batch inference on a list of prompts.

bash
soup infer [--model <path>] [--input <path>] [--output <path>] [--max-tokens <int>] [--temperature <float>]

`soup migrate`

Import config from competing tools.

bash
soup migrate --from llamafactory|axolotl|unsloth <input-file> [--output <path>] [--dry-run]

`soup recipes`

Browse and use ready-made training configs.

bash
soup recipes list                          # List all 116 recipes
soup recipes show <name>                   # Print recipe YAML
soup recipes use <name> [--output <path>]  # Copy to soup.yaml
soup recipes search <query>                # Search by model/task/size

Data Commands

bash
soup data inspect <path>                          # View dataset stats
soup data validate <path> [--format <fmt>]        # Check format compliance
soup data convert <path> --to <fmt> --output <f>  # Convert between formats
soup data merge <f1> <f2> --output <f> [--shuffle]  # Combine datasets
soup data dedup <path> [--threshold <float>]      # Remove duplicates
soup data stats <path>                            # Extended statistics
soup data generate --prompt "..." --count <n>     # Generate synthetic data
soup data generate --provider ollama|anthropic|vllm  # Multi-provider support
soup data generate --template code|conversation|qa|preference|reasoning  # Domain templates
soup data generate --quality-pipeline             # Validate + filter + dedup
soup data filter <path> [--coherence <f>] [--perplexity <n>]  # Quality filter
soup data sample <path> --strategy random|diverse|hard --count <n>  # Sample subset
soup data split <path> --ratio 0.8,0.1,0.1                 # Train/val/test split
soup data search <query>                                     # Search HuggingFace Hub
soup data preview <dataset_id>                               # Preview remote dataset
soup data download <dataset_id> [--output <path>] [--samples <n>]  # Download from HF
soup data register <name> --path <path> --format <fmt>       # Register local dataset
soup data unregister <name>                                  # Remove from registry
soup data registry                                           # List registered datasets
soup data push --input <jsonl> --hf-dataset <user/repo>      # Publish JSONL as HF dataset (v0.29)
soup data from-traces --input <logs> --format langchain|openai|soup-serve  # Trace-to-preference (v0.26)
soup data review <pairs.jsonl>                                # Preview preference pairs (v0.26)
soup data augment --input <jsonl> --strategy rephrase|translate|style     # LLM augment (v0.25)
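
To get a feel for what a ratio such as 0.8,0.1,0.1 means for a given dataset size, the arithmetic can be sketched with awk. The rounding rule here (floor train and validation, give the remainder to test) is an assumption and may differ from what `soup data split` does; `split_sizes` is a hypothetical helper.

```bash
# Hypothetical helper: prints "train val test" row counts for a three-way ratio.
split_sizes() {
  # $1 = total rows, $2 = ratio string such as 0.8,0.1,0.1
  awk -v n="$1" -v r="$2" 'BEGIN {
    if (split(r, p, ",") != 3) exit 1
    sum = p[1] + p[2] + p[3]
    if (sum < 0.999999 || sum > 1.000001) exit 1   # ratios must sum to 1
    train = int(n * p[1] + 1e-9)                   # small epsilon guards float error
    val   = int(n * p[2] + 1e-9)
    print train, val, n - train - val              # remainder goes to test
  }'
}

split_sizes 1000 0.8,0.1,0.1
```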

Experiment Commands

bash
soup runs                         # List all training runs
soup runs show <run_id>           # Detailed run info + loss curve
soup runs compare <run1> <run2>   # Side-by-side comparison
soup runs delete <run_id>         # Remove from database

Other Commands

bash
soup deploy ollama [--model <p>] [--name <n>]   # Deploy GGUF to Ollama
soup profile [--model <m>] [--task <t>]        # Estimate memory/speed/GPU
soup adapters list [--path <dir>]              # Scan for LoRA adapters
soup adapters info <path>                      # Adapter metadata
soup adapters compare <a> <b>                  # Side-by-side comparison
soup sweep --config <path> --param key=v1,v2   # Hyperparameter search
soup diff --model-a <p> --model-b <p>          # Compare two models
soup doctor [--nccl]                            # Check system + deps (+ NCCL multi-GPU bandwidth)
soup quickstart [--dry-run] [--yes]             # One-command demo
soup ui [--port <int>] [--no-browser]           # Launch web UI
soup version [--full]                           # Show version info

v0.54 — Pre-flight Decision

bash
soup advise <data.jsonl> --goal "<task>"          # Classify task + rank PROMPT_ENG/RAG/SFT/DPO/GRPO
soup advise --probe                               # 100-step LoRA probe for empirical ROI
soup advise --record                              # Append to ~/.soup/advise_history.jsonl
soup advise explain                               # Print full rubric
soup advise compare <a.jsonl> <b.jsonl>           # Compare two candidate datasets

v0.55 — Eval Design

bash
soup eval design <data.jsonl> --goal "..."        # Goal-conditioned suite (TF-IDF + dispatch)
soup eval discover <data.jsonl>                   # Greedy farthest-first canaries + memorization probes
soup eval lock <design.json>                      # SHA-256-pin the suite as an artifact
soup eval coverage <design.json> --task <category>  # Gap analysis vs v0.54 taxonomy
soup eval gate-install --baseline <run-id>        # Install .git/hooks/pre-push regression gate
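
Pinning an artifact by SHA-256, as `soup eval lock` is described as doing, boils down to recording a digest and comparing it later. The sketch below shows that idea with standard tools only; it is not soup's implementation, and `sha256_of` and `verify_pin` are hypothetical names.

```bash
# Hypothetical helpers illustrating digest pinning with standard tools.
sha256_of() {
  # Prefer sha256sum (GNU coreutils); fall back to shasum where it is absent.
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d ' ' -f 1
  else
    shasum -a 256 "$1" | cut -d ' ' -f 1
  fi
}

verify_pin() {
  # $1 = file, $2 = 64-character hex digest recorded when the file was pinned
  [ "$(sha256_of "$1")" = "$2" ]
}
```

Any edit to the file, even a single byte, changes the digest, so `verify_pin` returns non-zero and a script can refuse to continue.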

v0.56 — Diagnose (Model Report Card)

bash
soup diagnose <run-id> [--attach-to-registry <id>]  # 6 failure-mode probes + OK/MINOR/MAJOR + SVG badge
soup train --diagnose-gate <evidence>               # Refuse final save on MAJOR regression (exit 2)

v0.57 + v0.67 — Adapter VCS

bash
# v0.57 — git for LoRA
soup adapters diff <a> <b>                            # Per-layer Frobenius + relative drift + SVD effective rank
soup adapters merge <a> <b> [c...] -o <out> \
  --strategy linear|ties|dare|svd|cmaes \
  [--eval <suite>] [--budget 1h]                       # cmaes (CMA-ES evolutionary search) added in v0.67; requires --eval and --budget
soup adapters blame --dataset <d> --layer <l> --budget 5m --shards 4
soup adapters branch <name> -c <config.yaml> --base <model> --dataset <data>
soup adapters checkout <name> -o <out.yaml>
soup adapters branches

# v0.67 additions
soup adapters pr <title> --base-sha <hex> --adapter <path> \
  --eval <eval_delta.json> --samples <sample_diffs.json> \
  --output <PR.md>                                     # GitHub-shaped PR markdown
soup adapters bisect <ckpt1> <ckpt2> ... \
  --eval-command "soup eval custom --checkpoint {ckpt} --suite <s.yaml>"
                                                       # Binary-search history for regression boundary

# Shared run lockfile (v0.67)
soup lock write --base-model <id> --base-sha <64hex> --dataset-sha <64hex> \
  --env-hash <64hex> [--output soup.lock]
soup lock show [soup.lock]
soup lock check --base-model <id> --base-sha <64hex> --dataset-sha <64hex> \
  --env-hash <64hex> [soup.lock]                       # Exit 3 on drift
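
Several commands in this reference signal drift with exit code 3. In CI it can be useful to tell that case apart from an ordinary failure. The wrapper below is a sketch: only "exit 3 means drift" comes from this reference, treating every other non-zero status as a generic error is an assumption, and `classify_exit` and `run_and_classify` are hypothetical names.

```bash
# Hypothetical helpers: run a command and label its exit status.
classify_exit() {
  case "$1" in
    0) echo "ok" ;;
    3) echo "drift" ;;   # documented: lock check exits 3 on drift
    *) echo "error" ;;   # assumption: anything else is a generic failure
  esac
}

run_and_classify() {
  # Capture the status without tripping `set -e` in the calling script.
  if "$@"; then rc=0; else rc=$?; fi
  classify_exit "$rc"
}

# Usage (placeholders left as in the synopsis above):
#   run_and_classify soup lock check --base-model <id> ... soup.lock
```

The label is printed after any output from the wrapped command, so redirect the command's own output if only the label is wanted.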

v0.58 — Production Data Flywheel

bash
soup loop init <served-model> --eval <suite> --baseline registry://<id> \
  --monthly-budget 50usd --max-runs-per-day 3
soup loop watch                                   # Long-running daemon (SIGTERM-safe)
soup loop status
soup loop pause
soup loop resume
soup loop canary <adapter> --traffic 5% --autoroll-on-regress
soup loop replay [<iter-id>]

v0.59-v0.62 — Governance, Supply Chain, Unlearn, RAG

bash
# v0.59 — Governance & Provenance
soup bom emit --name <m> --version <v> --base-model <hf> --base-sha <sha> \
  --config-sha <sha> --task <t> --license <spdx> --format cyclonedx|spdx|both
soup attest emit --stage extract|train|eval|export|publish \
  --subject <artifact> --sha <sha> --builder <id> --invocation <cmd> \
  --sign unsigned|ed25519|sigstore
soup audit-log tail
soup audit-log rotate
soup train --annex-xi <out.md> --repro-receipt <out.json>

# v0.60 — Supply Chain Security
soup adapters scan <dir>                          # Spectral LoRA backdoor scanner (exit 0/1/3)
soup adapters sign / verify [--strict]            # Merkle-root tamper detection
soup adapters check-safetensors --strict          # Refuse pickle/.bin/.pt
soup adapters merge --license <spdx>              # SPDX-license compatibility gate
soup airgap-bundle --model <dir> --dataset <dir>... --wheel <dir>... --kernel <dir>...

# v0.61 — Unlearning & Knowledge Edit
# (configured via task: unlearn in soup.yaml)
soup eval unlearning <run-id> --benchmark tofu|muse|wmdp
soup edit set --base <m> --method rome|memit|alphaedit --subject "..." --target "..." [--plan-only]
soup edit diff registry://before_id registry://after_id --probes probes.jsonl --top-k 10

# v0.62 — RAG & Activation Steering
soup steer train --base <m> --method caa|iti|repe --name <id> --pairs <jsonl>
soup steer apply --name <id> --strength <float>   # |strength| ≤ 10 enforced
soup steer list
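
Since `soup steer apply` enforces |strength| ≤ 10, a script that computes the strength can check the bound before calling the CLI. This is a sketch under the assumption that strength is a plain decimal number; `strength_ok` is a hypothetical helper, not part of soup.

```bash
# Hypothetical pre-check: succeeds only for plain decimals with |value| <= 10.
strength_ok() {
  awk -v s="$1" 'BEGIN {
    if (s !~ /^[+-]?([0-9]+\.?[0-9]*|\.[0-9]+)$/) exit 1   # not a plain decimal
    v = s + 0
    if (v < 0) v = -v                                      # absolute value
    exit !(v <= 10)
  }'
}

# Usage: strength_ok "$S" && soup steer apply --name <id> --strength "$S"
```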

v0.63 — Production Trace Ecosystem

bash
soup ingest --source langfuse|langsmith|helicone|openpipe|otel|openai-stored \
  --logs <jsonl> --output <out.jsonl>
soup prune-prompt --input <jsonl> --output <out.jsonl> --min-frequency 0.95
soup data active-sample --input <jsonl> --budget N
soup ab --input <jsonl> --metric latency|judge_score|retry_rate \
  [--alpha 0.05] [--beta 0.20] [--effect-size 0.1]
soup drift-alarm --reference <ref.jsonl> --live <live.jsonl> --threshold 0.2 \
  [--slack-url ...] [--discord-url ...]                # Exit 3 on drift

v0.64 — Pre-flight & Tooling

bash
soup tunability --dataset <path> [--candidates <names>] \
  [--probe-steps 100] [--holdout-size 64] [--output <path>] [--plan-only] [--list]
soup plan --config <soup.yaml> [--state ./soup.tfstate]
soup apply --config <soup.yaml> [--state ./soup.tfstate] [--dry-run]
soup env lock [--output ./soup-env.lock]
soup env status [--lock ./soup-env.lock]
soup env check [--lock ./soup-env.lock]                # Exit 3 on ABI drift
soup completions bash|zsh|fish
soup license-advisor --target b2c|defense|embedded \
  [--license <spdx>] [--monthly-active-users <N>]      # Exit 3 on block

v0.65 — Eval Depth (Failure Modes 6 → 10)

bash
soup eval behavior <run-id> --battery xstest|harmbench|jailbreakbench|elephant|syceval \
  [--evidence <responses.json>] [--output <report.json>]
soup eval capability <run-id> --suite full|fast|math|code [--output <report.json>]
soup eval checklist <spec.yaml> [--evidence <responses.json>] [--output <report.json>]
soup eval irt-subset <responses.jsonl> --size full|small|tiny [--output <plan.json>]

v0.66 — Post-train X-rays

bash
soup probe sae-diff <sae.safetensors> <pre_acts.json> <post_acts.json> \
  [--top-k 20] [--output <report.json>]
soup probe sleeper <base> [--evidence <activations.json>] [--output <result.json>]
soup probe interference <losses.json> [--output <matrix.json>]   # Exit 2 if worst ≥20%
soup probe pack <base> [--list] [--output <manifest.json>]

Global Flags

bash
soup --verbose <command>   # Full tracebacks instead of friendly errors

> Note: --verbose is a global flag — it must go before the command name, not after.