NHacker Next
GLM-5.2 is probably the most powerful text-only open weights LLM (simonwillison.net)
besterman23 3 hours ago [-]
I wonder if multiple attempts at the opossum would produce better results.

If we didn’t have the previous example, I would interpret this as pretty solid evidence that labs were training on the Pelican “benchmark”.

I just can’t imagine a model dropping so significantly from one version to the next on such a silly task.
