hintymad an hour ago
I find it amazing how robust current deep learning models are. A simple linear combination of the two models' weights did not degrade performance, but enhanced it.
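For anyone wondering what "a simple linear combination of every weight" looks like in practice, here is a minimal sketch. It assumes both checkpoints share the same parameter names and shapes, and it uses plain Python lists in place of real tensors. The function name `merge_linear` and the mixing coefficient `alpha=0.5` are illustrative assumptions; the thread does not say which merge recipe or ratio was actually used.

```python
def merge_linear(weights_a, weights_b, alpha=0.5):
    """Return alpha * A + (1 - alpha) * B for every parameter.

    weights_a / weights_b: dicts mapping a parameter name to a flat list
    of floats (a stand-in for real tensors). alpha is a hypothetical
    mixing coefficient, not a value taken from the model card.
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("models must have the same parameter names")
    merged = {}
    for name, a in weights_a.items():
        b = weights_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for {name}")
        # Element-wise interpolation between the two checkpoints.
        merged[name] = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
    return merged


# Toy checkpoints, made up purely for illustration.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = merge_linear(model_a, model_b)
```

With `alpha=1.0` the result is just the first model, and with `alpha=0.0` it is the second, so the coefficient slides between the two parents.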
zinodaur 3 hours ago
unrvl22 4 hours ago
40 minutes ago
Comment deleted
jordz an hour ago
Havoc an hour ago
2 hours ago
Comment deleted
jrm4 2 hours ago
-- Bill Gates
AlienRobot 3 hours ago
>The model is built via a merge of https://huggingface.co/nex-agi/Nex-N2-Pro and https://huggingface.co/Qwen/Qwen3.5-397B-A17B, proceeded by On-Policy Distillation from a stronger model. We detected an incorrect upload in the previous version, where the base merged version was upload instead of the final distilled model. We are sorry for the confusion and apologize profusely.
Incidentally, are people using GitHub issues as blogs now?
fkozlowski 2 hours ago
MadrasTh0rn 2 hours ago
ekjhgkejhgk 2 hours ago
AnotherGoodName 3 hours ago
yieldcrv 2 hours ago
It's a fine-tune of Qwen
Not a conspiracy
flowbarai 11 minutes ago
Aurornis 2 hours ago
Comment deleted
antii 2 hours ago
Comment deleted
diego_moita an hour ago
Oh, I am so SHOCKED, so SHOCKED! /s
Explaining the joke: in Brazil, Rio de Janeiro is known as "Terra de bandido" (Gangster's Land).
Kinda like Chicago in the '20s or Naples and Palermo in the '90s.
elzbardico 3 hours ago
alfiedotwtf 3 hours ago