MiniMax M2.5 released: 80.2% in SWE-bench Verified (minimax.io)
logicprog 8 minutes ago [-]
Hm. The benchmarks look too good to be true and a lot of the things they say about the way they train this model sound interesting, but it's hard to say how actually novel they are. Generally, I sort of calibrate how much salt I take benchmarks with based on the objective properties of the model and my past experiences with models from the same lab.

For instance,

I'm inclined to generally believe Kimi K2.5's benchmarks, because I've found that their models tend to be extremely good qualitatively and feel actually well-rounded and intelligent instead of brittle and bench-maxed.

I'm inclined to give GLM 5 some benefit of the doubt, because while I think their past benchmarks have overstated their models' capabilities, I've also found their models relatively competent, and they 2X'd the size of their models, as well as introduced a new architecture and raised the number of active parameters, which makes me feel like there is a possibility they could actually meet the benchmarks they are claiming.

Meanwhile, I've never found MiniMax remotely competent. It's always been extremely brittle, tended to screw up edits and misformat even simple JavaScript code, get into error loops, and quickly get context rot. And it's also simply just too small, in my opinion, to see the kind of performance they are claiming.

mythz 52 minutes ago [-]
Really looked forward to this release as MiniMax M2.1 is currently my most used model thanks to it being fast, cheap and excellent at tool calling. Whilst I still use Antigravity + Claude for development, I reach for MiniMax first in my AI workflows, GLM for code tasks and Kimi K2.5 when deep English analysis is needed.

Not self-hosting yet, but I prefer using Chinese OSS models for AI workflows because of the potential to self-host in future if needed. Also using it to power my openclaw assistant since IMO it has the best balance of speed, quality and cost:

> It costs just $1 to run the model continuously for an hour at 100 tokens/sec. At 50 tokens/sec, the cost drops to $0.30.
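To put those quoted figures in per-token terms, here's a back-of-the-envelope conversion (the function is mine, just arithmetic on the numbers above):

```python
def cost_per_million(dollars_per_hour: float, tokens_per_sec: float) -> float:
    """Implied price per million tokens from an hourly running cost."""
    tokens_per_hour = tokens_per_sec * 3600
    return dollars_per_hour / tokens_per_hour * 1_000_000

# $1/hr at 100 tok/s -> ~$2.78 per million tokens
print(round(cost_per_million(1.00, 100), 2))  # 2.78
# $0.30/hr at 50 tok/s -> ~$1.67 per million tokens
print(round(cost_per_million(0.30, 50), 2))   # 1.67
```

So the two quoted price points imply slightly different effective per-token rates, which suggests the pricing isn't strictly linear in throughput.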

user2722 31 minutes ago [-]
!!!!!! Incredibly cheap!!!!!

I'll have to look for it in OpenRouter.

amunozo 18 minutes ago [-]
For the moment it's free in Opencode, if you want to try it.
denysvitali 11 minutes ago [-]
Btw, the model is free on OpenCode for now
3adawi 17 minutes ago [-]
Wish my company allowed more of these LLMs through GitHub Copilot. I'm stuck with the OpenAI, Anthropic and Google LLMs, which burn through my credit one week into the month.
jhack 29 minutes ago [-]
And it's available on their coding plans, even the cheapest one.
turnsout 22 minutes ago [-]
With the GLM news yesterday and now this, I'd love to try out one of these models, but I'm pretty tied to my Claude Code workflow. I see there's a workaround for GLM, but how are people utilizing MiniMax, especially for coding?
hasperdi 17 minutes ago [-]
You can use Claude Code with these models. You just need to pass the right env vars. Have a look at the client setup guide on z.ai.
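Concretely, it usually looks something like this. `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` are the standard Claude Code overrides; the endpoint URL below is a placeholder, so check the provider's own setup guide for the real one:

```shell
# Point Claude Code at an Anthropic-compatible third-party endpoint.
# The URL here is a placeholder -- substitute the provider's documented endpoint.
export ANTHROPIC_BASE_URL="https://api.provider.example/anthropic"
export ANTHROPIC_AUTH_TOKEN="your-provider-api-key"

# Claude Code then talks to that endpoint instead of Anthropic's API.
claude
```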
turnsout 7 minutes ago [-]
Interesting—thanks!
amunozo 17 minutes ago [-]
I use Opencode, where the model is free for the moment. I have not used Claude Code so I cannot compare.
claythearc 20 minutes ago [-]
Anything with an OpenAI-compatible endpoint can have claude-code-router put in front of it, AFAIK: https://github.com/musistudio/claude-code-router