Tiny hackable CUDA language model implementation (github.com)
yobbo 2 hours ago [-]
Looks very nice, but I can't find any numerical gradient checks, which are helpful for verifying that the backward pass is correct:

https://github.com/markusheimerl/gpt/blob/main/transformer/a...
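For readers unfamiliar with the idea, here is a minimal sketch of a numerical gradient check in plain Python. It is not taken from the linked repository: the tiny tanh layer, the MSE loss, and every function name (`loss`, `analytic_grad`, `numerical_grad`, `max_rel_error`) are hypothetical and chosen only for illustration. The hand-written backward pass is compared against central finite differences of the loss, and a small relative error suggests the two agree.

```python
import math
import random

def loss(w, x, y):
    """MSE of a tiny tanh layer: pred_i = tanh(sum_j w[i][j] * x[j]). Illustrative only."""
    total = 0.0
    for i, row in enumerate(w):
        pred = math.tanh(sum(wij * xj for wij, xj in zip(row, x)))
        total += (pred - y[i]) ** 2
    return total / len(w)

def analytic_grad(w, x, y):
    """Hand-derived backward pass for loss(): dL/dw[i][j]."""
    grad = []
    for i, row in enumerate(w):
        pred = math.tanh(sum(wij * xj for wij, xj in zip(row, x)))
        # Chain rule: dL/dz_i = 2 * (pred - y_i) / n * (1 - pred^2)
        dz = 2.0 * (pred - y[i]) / len(w) * (1.0 - pred * pred)
        grad.append([dz * xj for xj in x])
    return grad

def numerical_grad(w, x, y, eps=1e-5):
    """Central finite differences: (L(w + eps) - L(w - eps)) / (2 * eps) per weight."""
    grad = [[0.0] * len(row) for row in w]
    for i in range(len(w)):
        for j in range(len(w[i])):
            orig = w[i][j]
            w[i][j] = orig + eps
            loss_plus = loss(w, x, y)
            w[i][j] = orig - eps
            loss_minus = loss(w, x, y)
            w[i][j] = orig  # restore the weight before moving on
            grad[i][j] = (loss_plus - loss_minus) / (2.0 * eps)
    return grad

def max_rel_error(a, b):
    """Largest elementwise relative difference between two gradient matrices."""
    worst = 0.0
    for row_a, row_b in zip(a, b):
        for va, vb in zip(row_a, row_b):
            denom = max(abs(va), abs(vb), 1e-8)  # floor avoids division by zero
            worst = max(worst, abs(va - vb) / denom)
    return worst

random.seed(0)
w = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
x = [random.uniform(-1, 1) for _ in range(4)]
y = [random.uniform(-1, 1) for _ in range(3)]

err = max_rel_error(analytic_grad(w, x, y), numerical_grad(w, x, y))
print(f"max relative error: {err:.2e}")
```

In a CUDA or C codebase the same check would presumably copy the weights to the host, perturb one entry at a time, rerun the forward pass, and compare against the gradient buffer produced by the backward kernels. Doing the check in double precision helps, because float32 round-off can swamp the finite difference at small step sizes.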
