Qwen3.5 122B and 35B models offer Sonnet 4.5 performance on local computers

Posted by lostmsu |2 hours ago |29 comments

alexpotato 4 minutes ago[1 more]

I recently wrote a guide on getting:

- llama.cpp

- OpenCode

- Qwen3-Coder-30B-A3B-Instruct in GGUF format (Q4_K_M quantization)

working on a M1 MacBook Pro (e.g. using brew).

It was bit finicky to get all of the pieces together so hopefully this can be used with these newer models.

https://gist.github.com/alexpotato/5b76989c24593962898294038...

solarkraft 22 minutes ago[1 more]

Smells like hyperbole. A lot of people making such claims don’t seem to have continued real world experience with these models or seem to have very weird standards for what they consider usable.

Up until relatively recently, while people had already long been making these claims, it came with the asterisks of „oh, but you can’t practically use more than a few K tokens of context“.

solarkraft 25 minutes ago

What are the recommended 4 bit quants for the 35B model? I don’t see official ones: https://huggingface.co/models?other=base_model:quantized:Qwe...

Edit: The unsloth quants seem to have been fixed, so they are probably the go-to again: https://unsloth.ai/docs/models/qwen3.5/gguf-benchmarks

kristianpaul 4 minutes ago

They work great with kagi and pi

mark_l_watson 37 minutes ago[1 more]

The new 35b model is great. That said, it has slight incompatibility's with Claude Code. It is very good for tool use.

erelong 36 minutes ago[7 more]

What kind of hardware does HN recommend or like to run these models?

aliljet an hour ago[1 more]

Is this actually true? I want to see actual evals that match this up with Sonnet 4.5.

u1hcw9nx 2 hours ago[2 more]

[flagged]

xenospn an hour ago

Are there any non-Chinese open models that offer comparable performance?