Recipes

Soup includes 116 ready-made configs for popular models, so there is no need to write YAML from scratch. The catalog grew in stages: v0.31 brought it to 80 recipes; v0.51 added 26 new families (GPT-OSS / GLM 4.6 / Kimi K2 / MiniMax M2 / QwQ-32B / QVQ-72B / Granite 4 / Voxtral / DeepSeek-OCR and more); v0.52 added 6 TTS / BitNet recipes (Orpheus / Sesame-CSM / Llasa / Spark / Oute / Falcon-E); and v0.62 added 3 RAG recipes (raft-llama3-8b, ra-dit-retriever, ra-dit-llama3-8b).

List Recipes

```bash
soup recipes list
```

Shows all available recipes with model name, task, and description.

Search Recipes

```bash
# Search by model name
soup recipes search llama

# Search by task type
soup recipes search grpo

# Search by model size
soup recipes search 8b
```

Show Recipe

```bash
# Print full YAML to stdout
soup recipes show llama3-sft
```

Use a Recipe

```bash
# Copy recipe to soup.yaml
soup recipes use llama3-sft

# Custom output path
soup recipes use qwen3-dpo --output my-config.yaml
```

Available Recipes (116 total)

| Category | Models |
| --- | --- |
| General SFT/DPO/GRPO/KTO/ORPO/SimPO/IPO/PPO/Embedding/Pretrain | Llama 3.1 / 3.2 / 4 (Scout + Maverick), Qwen 2.5 / 3 (incl. 30B MoE + 235B-A22B), Mistral, Mixtral 8x7B/8x22B, Gemma 3, Phi-4, DeepSeek R1 / V3 + all 6 R1-Distill sizes |
| v0.51 catalog expansion (26 new) | GPT-OSS 20B / 120B, GLM 4.6 / 5, Kimi K2 / K2-Thinking, MiniMax M2, QwQ-32B, QVQ-72B, Granite 4, LFM2, Cogito v2, Mistral Small 3 / Medium 3.5, Magistral / Devstral / Ministral, MedGemma, EmbeddingGemma, LLaVA-Next, InternVL 3.5, Voxtral, Baichuan 2, Qwen-Image, DeepSeek-OCR, Paddle-OCR-VL |
| Vision (multimodal) | Llama-3.2-Vision (11B + 90B), Pixtral-12B, Qwen2-VL (7B + 72B), InternVL 2.5 / 3.5, LLaVA-Next, MiniCPM-V 2.6, Qwen-Image, DeepSeek-OCR, Paddle-OCR-VL |
| Audio (speech) | Qwen2-Audio, SeamlessM4T v2 (translation), Whisper-large-v3 (ASR), Voxtral |
| TTS (v0.52, 5 families) | Orpheus, Sesame-CSM, Llasa, Spark, Oute |
| BitNet (v0.52) | Falcon-E |
| RAG (v0.62) | raft-llama3-8b, ra-dit-retriever, ra-dit-llama3-8b |
| Reasoning | All 6 DeepSeek-R1-Distill sizes (Qwen 1.5B/7B/14B/32B + Llama 8B/70B), Qwen3-Coder 30B, Qwen3-30B-A3B reasoning, Phi-4 reasoning, QwQ-32B |
| Small / edge / mobile | SmolLM2 (135M / 360M / 1.7B), Qwen2.5 (0.5B / 1.5B / 3B), Gemma 2 2B, Phi-3.5-mini, Llama-3.2 (1B / 3B), LFM2 |
| Domain specialists | BioMistral 7B, Meditron 7B, MedGemma (medical); CodeLlama (13B / 70B), Magicoder 6.7B (code); Mathstral 7B (math); Llama-2-13b-finance; Nemotron-4 340B; EmbeddingGemma |
| Multimodal reasoning | Llama-3.2-Vision GRPO, Pixtral DPO |
| Multi-GPU | llama3-70b-fsdp2, qwen3-32b-zeropp, deepseek-v3-pipeline |
| Apple Silicon (MLX) | llama3.1-8b / qwen3-8b / gemma3-9b SFT-MLX |
| Tool-calling / agentic | qwen3-8b-tools, llama4-scout-tools |

Each recipe ships with LoRA rank, learning rate, batch size, and quantization tuned to the model. Use soup recipes show <name> to print the full YAML.

Customizing Recipes

Recipes are a starting point. After soup recipes use, edit the generated soup.yaml to:

  • Point to your dataset (data.train)
  • Adjust epochs, learning rate, or LoRA rank
  • Add backend: unsloth for a 2-5x speedup
  • Enable evaluation by adding an eval config section
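
As a rough illustration of the edits listed above, a customized soup.yaml might look something like the sketch below. Only data.train, backend: unsloth, and the existence of an eval section come from this page. Every other key name, the nesting, and all values are assumptions made for illustration, so compare against the output of soup recipes show <name> for the recipe you actually copied before relying on any of them.

```yaml
# Sketch of an edited soup.yaml (not a verified schema).
# Confirmed by this page: data.train, backend: unsloth, an eval section exists.
# Everything else below is an assumed key name or placeholder value.

data:
  train: ./data/my_dataset.jsonl   # hypothetical path; point this at your dataset

training:                          # assumed section name
  epochs: 3                        # assumed key; placeholder value
  lr: 2e-5                         # assumed key; placeholder value
  lora:                            # assumed nesting
    r: 16                          # assumed key for LoRA rank; placeholder value

backend: unsloth                   # top-level placement is an assumption

# eval:                            # section name from this page; its fields are
#   ...                            # not documented here, so they are left out
```

Keeping the recipe's tuned values and changing one setting at a time makes it easier to tell which edit affected a training run.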