Quant Menu — 9 Quantization Formats (v0.38.0)

Pick the right quantization format for your base model and hardware. Soup loads the appropriate quantization_config and trains LoRA on top.

```yaml
# Train LoRA on top of a pre-quantized GPTQ checkpoint:
base: TheBloke/Llama-2-7B-Chat-GPTQ
training:
  quantization: gptq        # or: awq, hqq:4bit, aqlm, eetq, mxfp4, fp8
```

```yaml
# FSDP + QLoRA: set the quant storage dtype.
training:
  quantization: 4bit
  bnb_4bit_quant_storage: bfloat16
```

Format matrix

| Format    | Bits          | Use case                                               | Optional dep         |
|-----------|---------------|--------------------------------------------------------|----------------------|
| 4bit      | 4             | Default. Best general LoRA training.                   | bitsandbytes         |
| 8bit      | 8             | Larger memory budget, more accurate gradients.         | bitsandbytes         |
| none      | 16/32         | Full fine-tuning or DPO/PPO without quantization.      |                      |
| gptq      | 2/3/4/8       | Train LoRA on top of an existing GPTQ checkpoint.      | gptqmodel            |
| awq       | 4             | Train LoRA on top of an existing AWQ checkpoint.       | autoawq              |
| hqq:Nbit  | 1/2/3/4/5/6/8 | Wide bit range; composes with LoRA.                    | hqq                  |
| aqlm      | 2             | Extreme compression.                                   | aqlm                 |
| eetq      | 8             | Fast 8-bit kernel for SM75+.                           | eetq                 |
| mxfp4     | 4             | Newer 4-bit type with better activation distribution.  | bitsandbytes ≥ 0.45  |
| fp8       | 8             | Train fp16/bf16 on top of FP8-released checkpoints.    | transformers ≥ 4.45  |
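
As a concrete example, the hqq format takes the bit width as part of the format string. A minimal sketch of a low-bit HQQ + LoRA run, assuming a plain fp16/bf16 Hugging Face base model (the model id here is illustrative):

```yaml
# HQQ quantizes the base model at load time, so an unquantized checkpoint works.
# Requires the optional hqq dependency from the table above.
base: meta-llama/Llama-2-7b-hf        # illustrative model id
training:
  quantization: hqq:2bit              # any of 1/2/3/4/5/6/8 from the Bits column
```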

Compatibility matrix

soup train runs check_quant_distributed_compat() at startup. HQQ, EETQ, and AQLM hard-fail when combined with FSDP or DeepSpeed ZeRO-3; BNB 4-bit under FSDP without bnb_4bit_quant_storage set emits a yellow warning rather than failing.
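
A minimal sketch of the two outcomes. The distributed key used below to select the strategy is a hypothetical name for illustration only; the quantization fields match the examples above:

```yaml
# Hard-fails at startup: HQQ cannot be combined with FSDP or ZeRO-3.
distributed: fsdp                     # hypothetical key name, for illustration only
training:
  quantization: hqq:4bit
---
# Passes without a warning: BNB 4-bit with the storage dtype set for FSDP.
distributed: fsdp                     # hypothetical key name, for illustration only
training:
  quantization: 4bit
  bnb_4bit_quant_storage: bfloat16
```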

Pre-quantized + QAT

gptq / awq / hqq:* / aqlm / eetq / mxfp4 / fp8 checkpoints already carry their own quantization scales, so combining any of them with quantization_aware (int8 QAT or 'fp8') is rejected at config load.
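
For example, a config along these lines is refused before training starts. Placing quantization_aware directly under training, next to quantization, is an assumption about the exact key layout:

```yaml
# Rejected at config load: the GPTQ checkpoint already carries its own scales,
# so layering int8 QAT on top is refused.
base: TheBloke/Llama-2-7B-Chat-GPTQ
training:
  quantization: gptq
  quantization_aware: int8            # key placement under training: is assumed here
```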

Scope

Quantization is wired into the SFT trainer and the transformers backend in v0.38.0. Expansion to the other trainers is tracked for v0.38.1, mirroring the stub-then-live pattern used for MII in v0.27.0 and multipack in v0.37.0. The MLX backend gets a distinct error message naming the actual reason quantized training is unavailable there.