alexpotato 4 minutes ago
- llama.cpp
- OpenCode
- Qwen3-Coder-30B-A3B-Instruct in GGUF format (Q4_K_M quantization)
working on an M1 MacBook Pro (e.g. using brew).
It was a bit finicky to get all of the pieces together, so hopefully this can be used with these newer models.
https://gist.github.com/alexpotato/5b76989c24593962898294038...
solarkraft 22 minutes ago
Up until relatively recently, while people had long been making these claims, they came with the asterisk of “oh, but you can’t practically use more than a few K tokens of context”.
solarkraft 25 minutes ago
Edit: The unsloth quants seem to have been fixed, so they are probably the go-to again: https://unsloth.ai/docs/models/qwen3.5/gguf-benchmarks
kristianpaul 4 minutes ago
mark_l_watson 37 minutes ago
erelong 36 minutes ago
aliljet an hour ago
u1hcw9nx 2 hours ago
xenospn an hour ago