`soup advise` — decide before you train

Before you touch a GPU, ask: do you actually need to fine-tune? Or is your problem better solved with prompt engineering, RAG, DPO instead of SFT, or GRPO over rewards?

soup advise (v0.54.0) is a pre-flight decision engine. It classifies your task, profiles your dataset, and emits a ranked verdict across:

  • PROMPT_ENG — your data is too small / your goal is solvable with a better prompt
  • RAG — high-variance factual recall is better handled with retrieval
  • SFT — supervised fine-tuning is the right baseline
  • DPO — you already have preference pairs (chosen/rejected)
  • GRPO — you have reasoning traces + ≥ 500 rows, RL over verifiable rewards wins

Usage

```bash
soup advise data.jsonl --goal "polite customer support chat"
```

Output:

```text
SFT (recommended)
  why:
    - 4,213 rows (above _MIN_ROWS_FOR_TRAINING=50)
    - no preference pairs detected
    - no reasoning traces — GRPO ruled out
    - tone-shift goal — SFT is the right baseline
  next: soup autopilot --data data.jsonl --task sft --goal "polite customer support chat"
```

Heuristics

  • Task classification: keyword + structural signals. tool_calls field → tool_use; <think>...</think> blocks → reasoning; chat-shaped messages → input-extraction. The --goal string carries ~10× the row weight when classifying.
  • Dataset profile: row count, avg input/output chars, type-token diversity, label variance, has_chosen_rejected, has_reasoning_traces. Capped at 2,000-row sample for speed.
  • Verdict rubric (first match wins):
      - preference pairs detected → DPO
      - reasoning traces + ≥ 500 rows → GRPO
      - < 50 rows → PROMPT_ENG (training will overfit)
      - high-variance factual recall → RAG
      - default → SFT
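The rubric above can be written as a plain priority-ordered decision function. This is a minimal sketch; the `Profile` dataclass and its field names mirror the profile fields listed earlier but are illustrative, not the tool's actual internals:

```python
from dataclasses import dataclass

@dataclass
class Profile:
    # Illustrative stand-in for the dataset profile described above.
    rows: int
    has_chosen_rejected: bool
    has_reasoning_traces: bool
    high_variance_factual: bool

def verdict(p: Profile) -> str:
    # Rules are checked in priority order; the first match wins.
    if p.has_chosen_rejected:
        return "DPO"
    if p.has_reasoning_traces and p.rows >= 500:
        return "GRPO"
    if p.rows < 50:
        return "PROMPT_ENG"  # training would overfit
    if p.high_variance_factual:
        return "RAG"
    return "SFT"
```

The ordering matters: preference pairs beat everything else, so a 40-row dataset with chosen/rejected columns still gets DPO, not PROMPT_ENG.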

`--probe` — put real numbers on the ROI estimate

```bash
soup advise data.jsonl --goal "..." --probe
```

Runs a 100-step LoRA probe against held-out zero-shot, few-shot, and RAG baselines; each method's delta is clipped to [-1, 1]. The 600-second timeout is a hard wall.
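The per-method clipping can be sketched as below (the score scale and function name are illustrative assumptions, not the tool's actual internals):

```python
def probe_delta(baseline_score: float, probe_score: float) -> float:
    # Per-method delta: how much the 100-step LoRA probe beats (or trails)
    # one baseline method, clipped to [-1, 1] as described above.
    # Assumes metrics already live on a comparable scale.
    return max(-1.0, min(1.0, probe_score - baseline_score))
```

Clipping keeps one pathological metric from dominating the ROI estimate.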

`--record` — cross-project history

```bash
soup advise data.jsonl --goal "..." --record
```

Appends a frozen HistoryEntry to ~/.soup/advise_history.jsonl (atomic, file-locked via fcntl.flock on POSIX / msvcrt.locking on Windows, 64 KB per-line cap, 16 MiB file cap, 10k row cap). Future verdicts read this back via summarise_history so the engine learns across projects.
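The locked-append pattern can be sketched for the POSIX branch roughly as follows (function name and return convention are illustrative; the real tool also handles Windows via msvcrt.locking and enforces the 10k row cap):

```python
import fcntl
import json
import os

MAX_LINE_BYTES = 64 * 1024          # 64 KB per-line cap
MAX_FILE_BYTES = 16 * 1024 * 1024   # 16 MiB file cap

def append_history(path: str, entry: dict) -> bool:
    """Append one JSONL entry under an exclusive lock (POSIX sketch)."""
    line = json.dumps(entry, separators=(",", ":")) + "\n"
    data = line.encode()
    if len(data) > MAX_LINE_BYTES:
        return False  # oversized entries are rejected, not truncated
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # block until we own the file
        if os.fstat(fd).st_size + len(data) > MAX_FILE_BYTES:
            return False
        os.write(fd, data)  # O_APPEND makes the write atomic at EOF
        return True
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)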

Override the path with SOUP_ADVISE_HISTORY_PATH — containment-checked to $HOME / $CWD / tempfile.gettempdir(). Default file perms: 0o600.
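A containment check of this shape can be sketched as (`resolve_history_path` is a hypothetical name, not the tool's API):

```python
import os
import tempfile
from pathlib import Path

def resolve_history_path(raw: str) -> Path:
    # Containment check: the override must live under $HOME, $CWD,
    # or the system temp dir, as described above.
    path = Path(raw).expanduser().resolve()
    allowed = (
        Path.home().resolve(),
        Path.cwd().resolve(),
        Path(tempfile.gettempdir()).resolve(),
    )
    for root in allowed:
        if path == root or root in path.parents:
            return path
    raise ValueError(f"history path {path} escapes allowed roots")
```

Resolving both sides before comparing defeats `..` and symlink tricks that a plain string-prefix check would miss.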

Subcommands

```bash
soup advise run data.jsonl --goal "..."   # explicit (also the default)
soup advise explain                       # print the full rubric
soup advise compare a.jsonl b.jsonl       # which dataset is better for fine-tuning?
```

The top-level argv preprocessor _rewrite_advise_argv injects run so soup advise data.jsonl works without typing the subcommand.
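The injection logic can be sketched as below (a guess at the shape; the actual _rewrite_advise_argv and its subcommand set may differ):

```python
KNOWN_SUBCOMMANDS = {"run", "explain", "compare"}

def rewrite_advise_argv(argv: list[str]) -> list[str]:
    # If the first token after `advise` is neither a known subcommand
    # nor a flag, treat it as a data path and inject the implicit `run`.
    if not argv or argv[0] in KNOWN_SUBCOMMANDS or argv[0].startswith("-"):
        return argv
    return ["run", *argv]
```

So `soup advise data.jsonl` becomes `soup advise run data.jsonl` before argument parsing, while `soup advise explain` passes through untouched.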

See also

  • [Autopilot](/docs/autopilot) — the literal next step after soup advise
  • [Eval design](/docs/eval-design) — turn your data into evals
  • [Trace-to-preference](/docs/trace-to-preference) — distill production traffic into DPO pairs