# Trace-to-Preference

v0.26.0 turns traffic logs into DPO / KTO training data. Point `soup data from-traces` at LangChain, OpenAI, or Soup-serve logs. Soup extracts preference signals — thumbs, regenerations, user edits — and writes clean pairs.
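
This page doesn't spell out the exact schema `from-traces` writes. As a sketch, one pair per JSONL line in the conventional DPO layout (the `prompt` / `chosen` / `rejected` field names are an assumption, not confirmed here) would look like:

```python
import json

# Hypothetical example of one preference pair as it might appear in
# prefs.jsonl. The field names follow the common DPO convention; the
# actual keys soup emits may differ.
pair = {
    "prompt": "How do I revert a commit?",
    "chosen": "Use `git revert <sha>` to create an inverse commit.",
    "rejected": "Delete the .git folder and start over.",
}
line = json.dumps(pair)
print(line)
```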

## Ingest a log

```bash
soup data from-traces \
  --logs chat_logs.jsonl \
  --format langchain \
  --signal thumbs_up \
  --output prefs.jsonl
```

## Flags

| Flag | Meaning |
| --- | --- |
| `--logs` | JSONL trace file (or a directory for the `soup-serve` format) |
| `--format` | `langchain` · `openai` · `soup-serve` |
| `--signal` | `thumbs_up` · `regenerations` · `user_edit` |
| `--output` / `-o` | Destination JSONL (default `prefs.jsonl`) |

## Signals

- `thumbs_up` — explicit positive feedback becomes the chosen response
- `regenerations` — if the user hit regenerate, the later response is preferred
- `user_edit` — if the user rewrote the response, the edit becomes chosen
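
The `regenerations` rule above can be sketched in a few lines. The trace layout here (a list of attempts per prompt, oldest first) is an assumption for illustration, not Soup's internal format:

```python
# Sketch of the regenerations signal: if the user hit regenerate,
# the earliest attempt becomes rejected, the latest becomes chosen.
def pair_from_regenerations(prompt, attempts):
    """attempts: responses for one prompt, oldest first (assumed layout)."""
    if len(attempts) < 2:
        return None  # no regeneration happened, so no preference signal
    return {
        "prompt": prompt,
        "chosen": attempts[-1],
        "rejected": attempts[0],
    }

pair = pair_from_regenerations(
    "Summarise this thread.",
    ["First, too-long summary...", "Tighter second summary."],
)
```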

## Review before training

```bash
soup data review prefs.jsonl --sample 10
```

Previews the requested number of random pairs (up to 100) side-by-side with their chosen / rejected labels. Use it as a sanity check before burning a GPU on junk data.
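
The sampling step is easy to approximate yourself, e.g. to pipe pairs into another tool. This sketch assumes one JSON object per line; the `sample_pairs` helper and the demo file it reads are illustrative, not part of soup:

```python
import json
import random
import tempfile

def sample_pairs(path, n=10, seed=0):
    """Return up to n random pairs from a prefs JSONL file."""
    with open(path) as fh:
        pairs = [json.loads(line) for line in fh if line.strip()]
    random.Random(seed).shuffle(pairs)
    return pairs[:n]

# Tiny stand-in for prefs.jsonl so the sketch runs end to end.
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as fh:
    for i in range(25):
        fh.write(json.dumps({"prompt": f"q{i}", "chosen": "a", "rejected": "b"}) + "\n")
    demo_path = fh.name

sampled = sample_pairs(demo_path, n=10)
```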

## Train on them

DPO or KTO — the output format of `from-traces` matches both:

```yaml
task: dpo
data:
  train: prefs.jsonl
  format: dpo
```

## Safety

- Logs never execute model code — parsers only read JSON.
- Traces may contain PII or secrets; Soup warns on ingest and leaves scrubbing to you before sharing pairs externally.
- Output JSONL can be validated with the same `soup data validate` pipeline as any other dataset.
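
Since scrubbing is on you, a minimal redaction pass might look like the sketch below. The two regexes are illustrative only; real PII removal needs far more than this (names, addresses, API keys, ...):

```python
import re

# Minimal, illustrative scrubber. Catches obvious emails and phone-like
# digit runs; it is NOT a complete PII solution.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def scrub(text):
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

clean = scrub("Reach me at jane@example.com or +1 555 867 5309.")
```

Run this over both `chosen` and `rejected` fields before any pairs leave your machine.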

## See also

- [Data formats](/docs/data-formats)
- [DPO training guide](/docs/dpo-training-guide)