Recipes
Soup includes 116 ready-made configs for popular models, so there is no need to write YAML from scratch. The catalog grew in stages: v0.31 brought it to 80 recipes, v0.51 added 26 new families (GPT-OSS / GLM 4.6 / Kimi K2 / MiniMax M2 / QwQ-32B / QVQ-72B / Granite 4 / Voxtral / DeepSeek-OCR and more), v0.52 added 6 TTS / BitNet recipes (Orpheus / Sesame-CSM / Llasa / Spark / Oute / Falcon-E), and v0.62 added 3 RAG recipes (raft-llama3-8b, ra-dit-retriever, ra-dit-llama3-8b).
List Recipes
bash
soup recipes list
Shows all available recipes with model name, task, and description.
Search Recipes
bash
# Search by model name
soup recipes search llama
# Search by task type
soup recipes search grpo
# Search by model size
soup recipes search 8b
Show Recipe
bash
# Print full YAML to stdout
soup recipes show llama3-sft
Use a Recipe
bash
# Copy recipe to soup.yaml
soup recipes use llama3-sft
# Custom output path
soup recipes use qwen3-dpo --output my-config.yaml
Available Recipes (116 total)
| Category | Models |
|---|---|
| General SFT/DPO/GRPO/KTO/ORPO/SimPO/IPO/PPO/Embedding/Pretrain | Llama 3.1 / 3.2 / 4 (Scout + Maverick), Qwen 2.5 / 3 (incl. 30B MoE + 235B-A22B), Mistral, Mixtral 8x7B/8x22B, Gemma 3, Phi-4, DeepSeek R1 / V3 + all 6 R1-Distill sizes |
| v0.51 catalog expansion (26 new) | GPT-OSS 20B / 120B, GLM 4.6 / 5, Kimi K2 / K2-Thinking, MiniMax M2, QwQ-32B, QVQ-72B, Granite 4, LFM2, Cogito v2, Mistral Small 3 / Medium 3.5, Magistral / Devstral / Ministral, MedGemma, EmbeddingGemma, LLaVA-Next, InternVL 3.5, Voxtral, Baichuan 2, Qwen-Image, DeepSeek-OCR, Paddle-OCR-VL |
| Vision (multimodal) | Llama-3.2-Vision (11B + 90B), Pixtral-12B, Qwen2-VL (7B + 72B), InternVL 2.5 / 3.5, LLaVA-Next, MiniCPM-V 2.6, Qwen-Image, DeepSeek-OCR, Paddle-OCR-VL |
| Audio (speech) | Qwen2-Audio, SeamlessM4T v2 (translation), Whisper-large-v3 (ASR), Voxtral |
| TTS (v0.52, 5 families) | Orpheus, Sesame-CSM, Llasa, Spark, Oute |
| BitNet (v0.52) | Falcon-E |
| RAG (v0.62) | raft-llama3-8b, ra-dit-retriever, ra-dit-llama3-8b |
| Reasoning | All 6 DeepSeek-R1-Distill sizes (Qwen 1.5B/7B/14B/32B + Llama 8B/70B), Qwen3-Coder 30B, Qwen3-30B-A3B reasoning, Phi-4 reasoning, QwQ-32B |
| Small / edge / mobile | SmolLM2 (135M / 360M / 1.7B), Qwen2.5 (0.5B / 1.5B / 3B), Gemma 2 2B, Phi-3.5-mini, Llama-3.2 (1B / 3B), LFM2 |
| Domain specialists | BioMistral 7B, Meditron 7B, MedGemma (medical) — CodeLlama (13B / 70B), Magicoder 6.7B (code) — Mathstral 7B (math) — Llama-2-13b-finance — Nemotron-4 340B — EmbeddingGemma |
| Multimodal reasoning | Llama-3.2-Vision GRPO, Pixtral DPO |
| Multi-GPU | llama3-70b-fsdp2, qwen3-32b-zeropp, deepseek-v3-pipeline |
| Apple Silicon (MLX) | llama3.1-8b / qwen3-8b / gemma3-9b SFT-MLX |
| Tool-calling / agentic | qwen3-8b-tools, llama4-scout-tools |
Each recipe ships with LoRA rank, learning rate, batch size, and quantization tuned to the model. Use soup recipes show <name> to print the full YAML.
Customizing Recipes
Recipes are a starting point. After soup recipes use, edit the generated soup.yaml to:
- Point to your dataset (data.train)
- Adjust epochs, learning rate, or LoRA rank
- Add backend: unsloth for a 2-5x speedup
- Enable evaluation with an eval config section
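As a rough illustration of the edits listed above, a customized soup.yaml might look something like the sketch below. Only data.train and backend: unsloth come from this page. The training section, the epochs, lr, and lora.r key names, their values, the dataset path, and the top-level placement of backend are all assumptions made for illustration. Run soup recipes show <name> to see the real keys for the recipe you picked.

```yaml
# Illustrative sketch only. Key names other than data.train and backend
# are assumptions; compare against the YAML your recipe actually generates.
data:
  train: ./data/my_dataset.jsonl   # hypothetical path to your own dataset

backend: unsloth                   # named on this page; top-level placement is assumed

training:                          # assumed section name
  epochs: 3                        # assumed key and example value
  lr: 2.0e-5                       # assumed key and example value
  lora:
    r: 16                          # assumed key for LoRA rank

# eval: ...                        # evaluation section; its keys are not shown on this page
```

Starting from the generated file and changing one value at a time keeps the recipe's tuned defaults intact for everything you do not touch.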