Quant Menu II (v0.53.0)
The full advanced-quantization surface. Schema-only release; live llama.cpp imatrix + serve / merge / export writers land in v0.53.1.
Unsloth Dynamic 2.0 GGUF ladder
14-entry closed allowlist:
UD-Q8_K_XL · UD-Q6_K_XL · UD-Q5_K_XL · UD-Q4_K_XL · UD-Q3_K_XL · UD-Q2_K_XL · UD-IQ4_XS · UD-IQ3_M · UD-IQ3_XXS · UD-IQ2_M · UD-IQ2_XS · UD-IQ2_XXS · UD-IQ1_M · UD-IQ1_S
validate_ud_gguf_format is case-insensitive with canonical normalisation.
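A minimal sketch of how a case-insensitive, canonicalising validator over this allowlist could behave. The 14 entries are from this release; the function body and the lookup-table shape are assumptions, not the shipped implementation:

```python
# Sketch only: the table shape and function body are assumptions,
# not the shipped validate_ud_gguf_format implementation.
# Keys are uppercased spellings; values are the canonical names.
_UD_GGUF_FORMATS = {
    name.upper(): name for name in (
        "UD-Q8_K_XL", "UD-Q6_K_XL", "UD-Q5_K_XL", "UD-Q4_K_XL",
        "UD-Q3_K_XL", "UD-Q2_K_XL", "UD-IQ4_XS", "UD-IQ3_M",
        "UD-IQ3_XXS", "UD-IQ2_M", "UD-IQ2_XS", "UD-IQ2_XXS",
        "UD-IQ1_M", "UD-IQ1_S",
    )
}

def validate_ud_gguf_format(fmt: str) -> str:
    """Match fmt case-insensitively and return its canonical spelling."""
    canonical = _UD_GGUF_FORMATS.get(fmt.strip().upper())
    if canonical is None:
        raise ValueError(f"unknown UD GGUF format: {fmt!r}")
    return canonical
```

Because the allowlist is closed, anything off-list (including base llama.cpp types like Q4_K_M) is rejected rather than passed through.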
```shell
soup export --format gguf --quant UD-Q4_K_XL --output ./model.gguf
```
IQ + Apple/ARM GGUF flavours
- 12-entry IQ family (IQ1/2/3/4 — including IQ4_NL non-linear)
- 10-entry Apple/ARM-friendly set (Q4_0_4_4, Q4_NL, Q5_K_M, etc.)
Both wrapped in MappingProxyType metadata.
KV cache types
```yaml
training:
  kv_cache_type: fp8   # q8_0 | bf16 | f16 | fp8
```
FP8 is Hopper-only: the cross-validator rejects fp8 on the MLX backend, and an SM-capability check fires at serve construction.
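The two documented rules (fp8 rejected on MLX, SM capability checked at serve construction) can be sketched as a standalone check. The function name, signature, and error messages are assumptions for illustration; only the rules themselves come from this release:

```python
# Hypothetical cross-validator sketch; names and messages are assumptions.
VALID_KV_CACHE_TYPES = ("q8_0", "bf16", "f16", "fp8")

def check_kv_cache_type(kv_cache_type: str, backend: str, sm_major: int) -> None:
    """Reject invalid kv_cache_type / backend / SM-capability combinations."""
    if kv_cache_type not in VALID_KV_CACHE_TYPES:
        raise ValueError(f"unknown kv_cache_type: {kv_cache_type!r}")
    if kv_cache_type == "fp8":
        if backend == "mlx":
            raise ValueError("kv_cache_type='fp8' is not supported on MLX")
        if sm_major < 9:  # FP8 needs Hopper (SM 9.x) or newer
            raise ValueError("kv_cache_type='fp8' requires SM >= 9.0 (Hopper)")
```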
FP8 attention, NVFP4, native unsloth_bnb_4bit
```yaml
training:
  fp8_attention: true     # requires quantization_aware='fp8', non-MLX
  nvfp4: true             # CUDA + text only; Blackwell SM ≥ 12 (runtime check)
  unsloth_bnb_4bit: true  # requires backend='unsloth' + quantization='4bit'
```
LF / Axolotl parity
```yaml
training:
  bnb_4bit_use_double_quant: true  # requires quantization='4bit'
  llm_int8: true                   # asserts quantization='8bit'
  quantize_ref_model: true         # extends quant to ref model (DPO/IPO/SimPO/ORPO/BCO/KTO/GRPO/PPO/preference)
  quantize_reward_model: true      # PPO + reward_model tasks
```
Advanced save formats
```shell
soup merge --save-format 4bit   # | 4bit_forced
soup export --format torchao --quant-config quant.yaml
```
--save-format 4bit_forced writes a single BNB-4bit merged checkpoint directly, without a dequant / merge / requant cycle.
--quant-config accepts a closed TorchAO allowlist: Int4WeightOnly / Int8DynActInt4 / Float8DynActFloat8 / NVFP4.
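Since the TorchAO allowlist is closed, the --quant-config loader can reject off-list schemes before export starts. A sketch under assumed names (the config key, function, and error text are hypothetical; the four scheme names are from this release):

```python
# Closed TorchAO allowlist from this release; the parser below is a
# sketch, not the shipped --quant-config loader, and the 'scheme'
# key is a hypothetical config field.
TORCHAO_QUANT_ALLOWLIST = frozenset(
    {"Int4WeightOnly", "Int8DynActInt4", "Float8DynActFloat8", "NVFP4"}
)

def parse_quant_config(config: dict) -> str:
    """Return the requested TorchAO scheme, rejecting anything off-list."""
    scheme = config.get("scheme")
    if scheme not in TORCHAO_QUANT_ALLOWLIST:
        raise ValueError(
            f"scheme {scheme!r} not in TorchAO allowlist: "
            f"{sorted(TORCHAO_QUANT_ALLOWLIST)}"
        )
    return scheme
```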
Stats
- Net +157 tests (7,453 → 7,610 across 179 files)
- 154 tests in test_v0530.py
- 5 review agents ran in parallel; every CRITICAL / HIGH / MEDIUM / LOW finding fixed or documented
See also
- [Quant Menu (v0.38)](/docs/quant-menu) — the original 9-format menu
- [Speed & Memory](/docs/training-speed-memory) — FP8 training, Cut CE, kernel auto-compose
- [Multi-GPU](/docs/multi-gpu) — ZeRO++ / FSDP2 / pipeline