Recommended starter (Q4_K_M, ~5.24 GB)
Compact and practical first choice. Start here unless you already know your machine has enough RAM or VRAM for larger variants.
File: pb_qwen3_5_9b_dora_v7_stable_q4_k_m.gguf

Lightweight (Q4_K_S, ~4.98 GB)
Even more compact with good quality. Ideal for CPU offloading or when memory is tight.
File: pb_qwen3_5_9b_dora_v7_stable_q4_k_s.gguf

Smallest variant (Q4_0, ~4.95 GB)
The smallest and fastest variant. Best choice for pure CPU inference or very limited memory.
File: pb_qwen3_5_9b_dora_v7_stable_q4_0.gguf

Higher quality (Q6_K, ~6.85 GB)
Use this when you can spend more memory for stronger answers and more stable code-oriented behavior.
File: pb_qwen3_5_9b_dora_v7_stable_q6_k.gguf

High precision (Q8_0, ~8.87 GB)
For capable hardware where quality matters more than download size and runtime memory footprint.
File: pb_qwen3_5_9b_dora_v7_stable_q8_0.gguf

Maximum quality (F16, ~16.7 GB)
For demanding workflows that need high precision, e.g. code analysis, conversion, and detailed testing.
File: pb_qwen3_5_9b_dora_v7_stable_f16.gguf

Maximum quality GPU (BF16, ~16.7 GB)
Brain-float16 variant for maximum GPU performance while maintaining high quality.
File: pb_qwen3_5_9b_dora_v7_stable_bf16.gguf

SPR GGUF (Q4_K_M VRAM-safe, ~5.2 GB)
VRAM-safe variant with the same token-space structure and a safer memory footprint.
File: spr_qwen3_5_9b_dora_vramsafe_q4_k_m.gguf

SPR GGUF (Q6_K VRAM-safe, ~6.9 GB)
Higher precision while keeping memory pressure manageable.
File: spr_qwen3_5_9b_dora_vramsafe_q6_k.gguf

SPR GGUF (Q8_0 VRAM-safe, ~8.9 GB)
For stronger output quality in local runtimes with enough memory available.
File: spr_qwen3_5_9b_dora_vramsafe_q8_0.gguf

SPR GGUF (BF16 VRAM-safe, ~16.7 GB)
Higher precision variant for advanced inference and conversion workflows.
File: spr_qwen3_5_9b_dora_vramsafe_bf16.gguf
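
The guidance above boils down to "pick the largest file that still fits in your memory". A minimal Python sketch of that rule follows, using the file sizes and names of the standard variants listed above. The helper name `pick_variant` and the default headroom of 1.5 GB are assumptions made for illustration. Real memory use also depends on context length and on the runtime, so treat the result as a starting point, not a guarantee.

```python
from typing import Optional, Tuple

# (quantization, approximate file size in GB, filename), copied from the list above.
# BF16 has the same size as F16 and is left out to keep the choice unambiguous.
VARIANTS = [
    ("Q4_0", 4.95, "pb_qwen3_5_9b_dora_v7_stable_q4_0.gguf"),
    ("Q4_K_S", 4.98, "pb_qwen3_5_9b_dora_v7_stable_q4_k_s.gguf"),
    ("Q4_K_M", 5.24, "pb_qwen3_5_9b_dora_v7_stable_q4_k_m.gguf"),
    ("Q6_K", 6.85, "pb_qwen3_5_9b_dora_v7_stable_q6_k.gguf"),
    ("Q8_0", 8.87, "pb_qwen3_5_9b_dora_v7_stable_q8_0.gguf"),
    ("F16", 16.7, "pb_qwen3_5_9b_dora_v7_stable_f16.gguf"),
]


def pick_variant(
    memory_gb: float, headroom_gb: float = 1.5
) -> Optional[Tuple[str, float, str]]:
    """Return the largest variant whose file fits in memory_gb minus headroom_gb.

    headroom_gb is an assumed allowance for context cache and runtime
    buffers; it is not a figure taken from the list above.
    Returns None when no variant fits.
    """
    budget = memory_gb - headroom_gb
    fitting = [v for v in VARIANTS if v[1] <= budget]
    if not fitting:
        return None
    # Larger files correspond to higher-precision quantizations in this list.
    return max(fitting, key=lambda v: v[1])


if __name__ == "__main__":
    # With 8 GB available, the budget is 6.5 GB.
    print(pick_variant(8.0))
    # ('Q4_K_M', 5.24, 'pb_qwen3_5_9b_dora_v7_stable_q4_k_m.gguf')
```

With 12 GB available the same helper selects Q8_0, and with 4 GB it returns None, which signals that none of the listed files fit under the assumed headroom.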