v0.53.6 / v0.53.7 — Plugins live, Anthropic API, tool endpoints

v0.53.6 — SoupPluginCallback live

The BasePlugin protocol introduced in v0.45.0 is now wired into every HF Trainer run via SoupPluginCallback. One bad plugin can't crash training: exceptions raised in plugin hooks are caught and logged, and the trainer keeps going.
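A minimal sketch of the fan-out, assuming a hypothetical `on_step_end` hook on BasePlugin (the real hook surface may differ):

```python
import logging

from transformers import TrainerCallback

logger = logging.getLogger(__name__)

class SoupPluginCallback(TrainerCallback):
    """Fans HF Trainer events into enabled plugins; a failing plugin is
    logged and skipped so the training run survives."""

    def __init__(self, plugins):
        self.plugins = plugins

    def on_step_end(self, args, state, control, **kwargs):
        for plugin in self.plugins:
            try:
                # hypothetical hook name; BasePlugin's real surface may differ
                plugin.on_step_end(args, state, control, **kwargs)
            except Exception:
                logger.exception("plugin %r failed; training continues", plugin)
```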

```bash
soup plugins list
soup plugins enable my-cool-plugin
# every soup train, dpo, grpo, ... run fans HF Trainer events into enabled plugins
```

Anthropic-compatible `/v1/messages` endpoint

```bash
soup serve --model ./output --backend transformers
# POST http://localhost:8000/v1/messages — Anthropic Messages shape
```

v0.53.6 ships the transformers backend; v0.53.7 adds vLLM parity plus Anthropic-shape SSE streaming.
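A request in the Anthropic Messages shape against the local server might look like this (a sketch; the `model` value is an assumption based on the serve command above):

```python
import requests

# Anthropic Messages shape: model, max_tokens, and role/content messages
resp = requests.post(
    "http://localhost:8000/v1/messages",
    json={
        "model": "./output",  # assumption: the --model path doubles as the model name
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Say hello."}],
    },
)
print(resp.json())
```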

N-gram speculative decoding

`prompt_lookup_num_tokens` is plumbed through `_generate_response`: n-gram prompt lookup now acts as a zero-config draft for inference workloads where a separate draft model is overkill.
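This maps onto the `prompt_lookup_num_tokens` argument of transformers' `generate()`; a standalone illustration (paths are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("./output")   # illustrative path
model = AutoModelForCausalLM.from_pretrained("./output")

inputs = tok("The quick brown fox", return_tensors="pt")
# n-gram lookup drafts continuation tokens from the prompt itself,
# so no separate draft model is ever loaded
out = model.generate(**inputs, max_new_tokens=64, prompt_lookup_num_tokens=3)
print(tok.decode(out[0], skip_special_tokens=True))
```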

v0.53.7 — Data Recipe DAG live runner

```bash
soup data recipe --execute --output ./out recipe.yaml
```

Live runner for 6 node kinds:

  • seed — load a starting JSONL
  • llm_text — generate via OpenAI / Ollama / Anthropic / vLLM
  • code — RLVR-sandbox code execution
  • judge — LLM-as-a-judge filter
  • validator — pydantic / regex / jsonschema gate
  • sampler — diversity-preserving subsample

Each node writes an atomic checkpoint, so an interrupted run resumes from the last finished node.
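Atomic here means the usual write-then-rename pattern; a sketch of the idea (not the runner's actual code):

```python
import json
import os

def write_checkpoint(path: str, rows: list[dict]) -> None:
    """Write JSONL to a temp file, fsync, then rename into place.
    os.replace is atomic on POSIX, so an interrupt leaves the old
    checkpoint or the new one, never a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)
```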

Tool endpoints live

```
POST /v1/tools/python      # RLVR sandbox + 5s timeout + 512MB RLIMIT
POST /v1/tools/web_search  # bearer-auth, allowlisted providers
POST /v1/tools/bash        # 501 — deferred to v0.53.8 (container/namespace work)
```

Trainer plugins lazy-imported live

Six trainer plugins ship with v0.53.7 lazy-import wiring (the pattern is sketched after this list):

  • grokfast — gradient grokking filter
  • spectrum — top-k parameter selection
  • llmcompressor — compressed inference export
  • sonicmoe — MoE routing acceleration
  • cce_plugin — Cut Cross-Entropy as a plugin
  • math_verify — math RLVR plugin
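Lazy import means a plugin's heavy backing dependency is only imported when the plugin is first used; a generic sketch of that pattern (not the actual wiring):

```python
import importlib

class LazyPlugin:
    """Defers importing a plugin's heavy backing module until first use,
    so `soup` startup stays fast with all six plugins registered."""

    def __init__(self, module_name: str):
        self._module_name = module_name  # e.g. "grokfast" (illustrative)
        self._module = None

    def module(self):
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return self._module
```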

Markdown + data forge polish

  • soup data ingest now does heading-aware Markdown splits (one JSONL row per section; sketch after this list)
  • soup data decontaminate --benchmark-file <path> accepts operator-supplied corpora
  • soup data forge --judge-provider {ollama,anthropic,vllm} live
  • soup data preprocess AOT-tokenizes → Arrow shards (live)
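A heading-aware split amounts to starting a new record at each Markdown heading; an illustrative splitter (not soup's actual implementation):

```python
import json
import re

def split_markdown(text: str) -> list[dict]:
    """One row per heading-delimited section."""
    rows, current = [], {"heading": "", "text": ""}
    for line in text.splitlines():
        if re.match(r"#{1,6}\s", line):
            if current["heading"] or current["text"].strip():
                rows.append(current)
            current = {"heading": line.lstrip("#").strip(), "text": ""}
        else:
            current["text"] += line + "\n"
    rows.append(current)
    return rows

for row in split_markdown(open("doc.md", encoding="utf-8").read()):
    print(json.dumps(row))  # one JSONL row per section
```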

Tests

  • v0.53.6: 7,998 → 8,051 (+53)
  • v0.53.7: 8,051 → 8,162 (+111)

See also

  • [Plugins & integrations](/docs/plugins)
  • [Data Forge](/docs/data-forge)