
Granite 4.1: IBM's 8B Model Matching 32B MoE

Posted by steveharing1 | 2 hours ago | 63 comments

dash2 3 minutes ago

Nah, I ain't reading that. If they can't be bothered to get a human to write it, it can't be that important. I'm glad for them though. Or sorry that happened.

2ndorderthought 2 hours ago

I test-drove it yesterday. It's pretty impressive at 8B, and it runs quickly on commodity hardware.

Qwen3.6 35B A3B is still my local champion, but I may use this for autocomplete and small tasks. Granite has recent training data, which is nice. If the other small models were fine-tuned on recent data I don't know if I would use this at all, but that alone makes it pretty decent.

The 4B they released was not good for my needs, but it could probably handle tool calls or something.

cbg0 an hour ago

The real "sleeper" might be https://huggingface.co/ibm-granite/granite-vision-4.1-4b if the benchmarks hold up for such a small model against frontier models for table & semantic k:v extraction.

Havoc 2 hours ago

Interesting to see a pivot away from MoE by both IBM and Mistral, while the larger classes of SOTA models all seem to be sticking with it.

Quick vibe check (8B @ Q6): seems promising. Bit of a clinical tone, but I can see that being useful for data processing and similar. Sometimes you don't really want an LLM that spams you with emojis...

agunapal an hour ago

If you really think about why MoE came into existence, it's to save significant cost during training; I don't think there was ever any concrete evidence of performance gains for comparable MoE vs. dense models. Over the years, I believe all the new techniques being employed in post-training have made the models better.

100ms an hour ago

> Full stop.

Why do people not edit out obvious sloppification and still expect to have readers left?

dissahc 15 minutes ago

Qwen3.5 9B outperforms Granite 4.1 30B by a huge amount (32 vs. 15 on the Artificial Analysis benchmark)... I have no idea what made the writer of this article say so many demonstrably incorrect things.

robotmaxtron 6 minutes ago

"open source"

show me.

RugnirViking 2 hours ago

Sounds interesting. Here's hoping they release a 32B model; that's a pretty good sweet spot for feasibility of home setups.

Edit: I just realised they do actually have a 30B release alongside this. Haven't tried it yet.

mdp2021 2 hours ago

Wish they had also released an embedding model, along the lines of their previous one: compact (while good)...

tokenhub_dev 12 minutes ago

[flagged]

whalesalad an hour ago

[flagged]