Quant Menu — 9 Quantization Formats (v0.38.0)
Pick the right quantization format for your base model and hardware. Soup loads the appropriate quantization_config and trains LoRA on top.
```yaml
# Train LoRA on top of a pre-quantized GPTQ checkpoint:
base: TheBloke/Llama-2-7B-Chat-GPTQ
training:
  quantization: gptq   # or: awq, hqq:4bit, aqlm, eetq, mxfp4, fp8

# FSDP + QLoRA — set quant_storage:
training:
  quantization: 4bit
  bnb_4bit_quant_storage: bfloat16
```
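Under the transformers backend, the default 4bit path corresponds roughly to the usual bitsandbytes + PEFT recipe. The sketch below shows the general idea only, not Soup's internals; the base model and LoRA hyperparameters are placeholders.

```python
# Sketch of what "quantization: 4bit" + LoRA roughly maps to under the
# transformers backend. Not Soup's actual implementation.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    # For FSDP + QLoRA, store quantized weights in bf16 (see the YAML above):
    bnb_4bit_quant_storage=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # placeholder base model
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))
```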
Format matrix
| Format | Bits | Use case | Optional dep |
|---|---|---|---|
| 4bit | 4 | Default. Best general LoRA training. | bitsandbytes |
| 8bit | 8 | Larger memory budget, more accurate gradients. | bitsandbytes |
| none | 16/32 | Full fine-tuning or DPO/PPO without quant. | — |
| gptq | 2/3/4/8 | Train LoRA on top of an existing GPTQ checkpoint. | gptqmodel |
| awq | 4 | Train LoRA on top of an existing AWQ checkpoint. | autoawq |
| hqq:Nbit | 1, 2, 3, 4, 5, 6, 8 | Wide bit range; compose with LoRA. | hqq |
| aqlm | 2 | Extreme compression. | aqlm |
| eetq | 8 | Fast 8-bit kernel for SM75+. | eetq |
| mxfp4 | 4 | Newer 4-bit type with better activation distribution. | bitsandbytes ≥ 0.45 |
| fp8 | — | Train fp16/bf16 on top of FP8-released checkpoints. | transformers ≥ 4.45 |
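Each format string ultimately has to resolve to a transformers quantization config, or defer to the config already stored in a pre-quantized checkpoint. A rough sketch of that dispatch for a few formats; the function name and mapping are assumptions, only the config classes are real transformers classes.

```python
# Hypothetical dispatch sketch, not Soup's actual code: BitsAndBytesConfig
# and HqqConfig are real transformers classes, the mapping is an assumption.
import torch
from transformers import BitsAndBytesConfig, HqqConfig

def quant_config_for(fmt: str):
    if fmt == "4bit":
        return BitsAndBytesConfig(load_in_4bit=True,
                                  bnb_4bit_compute_dtype=torch.bfloat16)
    if fmt in {"gptq", "awq", "fp8"}:
        # Pre-quantized checkpoints ship their own config in the repo;
        # returning None lets from_pretrained pick it up as-is.
        return None
    if fmt.startswith("hqq:"):
        nbits = int(fmt.split(":")[1].removesuffix("bit"))  # "hqq:4bit" -> 4
        return HqqConfig(nbits=nbits)
    raise ValueError(f"unhandled quantization format: {fmt}")
```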
Compatibility matrix
soup train runs check_quant_distributed_compat() at startup. HQQ / EETQ / AQLM hard-fail with FSDP and ZeRO-3; BNB 4-bit + FSDP without bnb_4bit_quant_storage emits a yellow warning.
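A minimal sketch of the shape of that check, assuming placeholder config fields for the quantization format and distributed strategy; only the rules themselves come from the paragraph above.

```python
# Illustrative sketch of check_quant_distributed_compat(); the field
# names on cfg are placeholders, only the rules match the text above.
import warnings

SHARDING_INCOMPATIBLE = {"hqq", "eetq", "aqlm"}

def check_quant_distributed_compat(cfg):
    quant = cfg.quantization.split(":")[0]          # "hqq:4bit" -> "hqq"
    sharded = cfg.distributed in {"fsdp", "zero3"}  # placeholder field

    if quant in SHARDING_INCOMPATIBLE and sharded:
        raise RuntimeError(
            f"quantization={cfg.quantization} is not supported with "
            f"{cfg.distributed}"
        )
    if quant == "4bit" and cfg.distributed == "fsdp" and \
            getattr(cfg, "bnb_4bit_quant_storage", None) is None:
        warnings.warn(
            "FSDP + 4bit without bnb_4bit_quant_storage; set it to "
            "bfloat16 (see the YAML example above)"
        )
```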
Pre-quantized + QAT
gptq / awq / hqq:* / aqlm / eetq / mxfp4 / fp8 all carry their own quantization scales, so combining them with quantization_aware (int8 QAT or 'fp8') is rejected at config load. A sketch of that rejection follows.
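The sketch below uses placeholder config field names; only the rule that pre-quantized formats cannot be combined with QAT comes from the paragraph above.

```python
# Illustrative sketch of the config-load rejection; field names are
# placeholders, the rule itself is described in the text above.
PREQUANTIZED = {"gptq", "awq", "hqq", "aqlm", "eetq", "mxfp4", "fp8"}

def reject_prequantized_plus_qat(cfg):
    quant = cfg.quantization.split(":")[0]       # "hqq:4bit" -> "hqq"
    if quant in PREQUANTIZED and getattr(cfg, "quantization_aware", None):
        raise ValueError(
            f"quantization={cfg.quantization} already carries its own "
            f"scales; quantization_aware={cfg.quantization_aware} cannot "
            "be applied on top"
        )
```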
Scope
Wired into the SFT trainer on the transformers backend in v0.38.0. Expansion to the other trainers is tracked for v0.38.1 (mirroring the v0.27.0 MII / v0.37.0 multipack stub-then-live pattern). The MLX backend raises a distinct error message naming the actual reason.