PowerBASIC model library

Additional AI Model Downloads

PowerBASIC-focused Qwen 9B DoRA GGUF model variants for local AI runtimes such as LM Studio, Ollama, llama.cpp-compatible tools, KoboldCpp, text-generation-webui, and other GGUF-capable local inference environments.

These downloads are optional companions for users who want local model behavior tuned toward PowerBASIC-oriented code assistance. Choose the quantization level that matches your available RAM/VRAM.

The GGUF files are hosted in the shared download area outside AISPR and load in any of the runtimes listed above.

Start simple

Start with Q4_K_M unless you know your machine has enough memory for larger variants.

Raise quality

Use Q6_K when you want stronger output quality and can spend more memory.

Reference use

Use Q8_0 or F16/BF16 only for capable hardware, archival, conversion, or reference workflows.
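
The guidance above can be sketched as a small selection helper. This is only an illustration: the sizes are the approximate file sizes listed for each variant on this page, while the 20% headroom factor and the `pick_variant` name are assumptions made for this example, not values published with the models. Real memory use also depends on context length and the runtime.

```python
from typing import Optional

# Approximate file sizes in GB, copied from the variant list on this page.
VARIANT_SIZES_GB = {
    "Q4_0": 4.95,
    "Q4_K_S": 4.98,
    "Q4_K_M": 5.24,
    "Q6_K": 6.85,
    "Q8_0": 8.87,
    "F16": 16.7,
}

# Assumption: reserve about 20% on top of the file size for context and buffers.
HEADROOM = 1.2


def pick_variant(available_gb: float) -> Optional[str]:
    """Return the largest variant whose padded size fits, or None if none fits."""
    fitting = [
        name
        for name, size in VARIANT_SIZES_GB.items()
        if size * HEADROOM <= available_gb
    ]
    # File size is used here as a rough proxy for precision.
    return max(fitting, key=VARIANT_SIZES_GB.get, default=None)


if __name__ == "__main__":
    for budget in (4.0, 8.0, 12.0, 24.0):
        print(budget, "GB ->", pick_variant(budget))
```

Treat the result as a starting point and step down one level if the runtime reports out-of-memory errors.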

Hugging Face

Repository and model cards

All current model files and release metadata are published in one repository, including sizes, checksums, and direct download links.
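
Since checksums are published with the files, a download can be verified before it is loaded. The sketch below assumes the published checksum is a SHA-256 hex digest; the function names are hypothetical, and the expected value must be copied from the repository.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a multi-gigabyte GGUF is never read into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the file's digest with a published hex digest, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A mismatch usually points to an interrupted or corrupted download, in which case downloading the file again is the simplest fix.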

Available files

Qwen 9B PowerBASIC GGUF variants

Hosted on Hugging Face: https://huggingface.co/Theogott/pb-qwen3_5-9b-powerbasic-ggufs
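
For scripted downloads, a direct link can be built from the repository name and a file name. The sketch below uses the usual Hugging Face `resolve` URL pattern and assumes the files sit at the repository root on the `main` branch, which this page does not state; `resolve_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Repository name taken from the link above.
REPO_ID = "Theogott/pb-qwen3_5-9b-powerbasic-ggufs"


def resolve_url(repo_id, filename, revision="main"):
    """Build a direct-download URL (assumes the file is at the repository root)."""
    return (
        f"https://huggingface.co/{repo_id}/resolve/"
        f"{quote(revision)}/{quote(filename)}"
    )


if __name__ == "__main__":
    print(resolve_url(REPO_ID, "pb_qwen3_5_9b_dora_v7_stable_q4_k_m.gguf"))
```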

Recommended starter (~5.24 GB)

Q4_K_M

Compact and practical first choice. Start here unless you already know your machine has enough RAM or VRAM for larger variants.

pb_qwen3_5_9b_dora_v7_stable_q4_k_m.gguf

Lightweight (~4.98 GB)

Q4_K_S

Even more compact with good quality. Ideal for CPU offloading or when memory is tight.

pb_qwen3_5_9b_dora_v7_stable_q4_k_s.gguf

Smallest variant (~4.95 GB)

Q4_0

The smallest and fastest variant. Best choice for pure CPU inference or very limited memory.

pb_qwen3_5_9b_dora_v7_stable_q4_0.gguf

Higher quality (~6.85 GB)

Q6_K

Use this when you can spend more memory for stronger answers and more stable code-oriented behavior.

pb_qwen3_5_9b_dora_v7_stable_q6_k.gguf

High precision (~8.87 GB)

Q8_0

For capable hardware where quality matters more than download size and runtime memory footprint.

pb_qwen3_5_9b_dora_v7_stable_q8_0.gguf

Maximum quality (~16.7 GB)

F16

For demanding workflows that need high precision, such as code analysis, conversion, and detailed testing.

pb_qwen3_5_9b_dora_v7_stable_f16.gguf

Maximum quality, GPU (~16.7 GB)

BF16

BF16 (bfloat16) variant aimed at GPUs with native BF16 support, while maintaining high quality.

pb_qwen3_5_9b_dora_v7_stable_bf16.gguf

Now available

SPR Models

Hosted on Hugging Face: https://huggingface.co/Theogott/spr-qwen3_5-9b-dora-vramsafe-gguf

SPR GGUF (~5.2 GB)

Q4_K_M VRAM-safe

VRAM-safe variant with the same token-space structure and a safer memory footprint.

spr_qwen3_5_9b_dora_vramsafe_q4_k_m.gguf

SPR GGUF (~6.9 GB)

Q6_K VRAM-safe

Higher precision while keeping memory pressure manageable.

spr_qwen3_5_9b_dora_vramsafe_q6_k.gguf

SPR GGUF (~8.9 GB)

Q8_0 VRAM-safe

For stronger output quality in local runtimes with enough memory available.

spr_qwen3_5_9b_dora_vramsafe_q8_0.gguf

SPR GGUF (~16.7 GB)

BF16 VRAM-safe

Higher precision variant for advanced inference and conversion workflows.

spr_qwen3_5_9b_dora_vramsafe_bf16.gguf

Where to use them: These are standard GGUF model files for local inference. Use them in LM Studio, Ollama after creating/importing a Modelfile, llama.cpp-compatible tools, KoboldCpp, text-generation-webui, Jan, GPT4All-style GGUF loaders, and other runtimes that accept GGUF models. If a runtime asks for a model folder, place the downloaded file in your local model library and select it from there.
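
For the Ollama route mentioned above, the Modelfile can be as small as a single `FROM` line pointing at the downloaded GGUF. The following is a minimal sketch, not an official setup script: `write_modelfile` is a hypothetical helper, and it writes an absolute path so the Modelfile does not depend on the current working directory. Whether extra settings such as a chat template are needed is not covered by this page.

```python
from pathlib import Path


def write_modelfile(gguf_path, modelfile_path="Modelfile"):
    """Write a one-line Ollama Modelfile whose FROM points at a local GGUF file."""
    # Resolve to an absolute path; forward slashes keep it readable on Windows too.
    gguf = Path(gguf_path).resolve()
    content = f"FROM {gguf.as_posix()}\n"
    Path(modelfile_path).write_text(content, encoding="utf-8")
    return content


if __name__ == "__main__":
    print(write_modelfile("pb_qwen3_5_9b_dora_v7_stable_q4_k_m.gguf"))
```

After writing the file, the model can be registered with `ollama create pb-qwen-q4 -f Modelfile` and started with `ollama run pb-qwen-q4`, where `pb-qwen-q4` is an arbitrary local name chosen for this example.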