Tiny hackable CUDA language model implementation (github.com)
20 points by markusheimerl 3 days ago | 1 comment
yobbo 2 hours ago
Looks very nice, but I can't find any numerical gradient checks, which are helpful for verifying that the backward pass is correct:
https://github.com/markusheimerl/gpt/blob/main/transformer/a...
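For anyone unfamiliar with the idea, here is a minimal sketch of a numerical gradient check. It is not taken from the linked repository: the toy model, the function names (`loss_and_grad`, `numerical_grad`, `max_relative_error`) and the step size are all assumptions made for illustration. The analytic gradient from a hand-written backward pass is compared against a central finite-difference estimate of the same loss.

```python
def loss_and_grad(w, x, y):
    """Hypothetical toy model: pred = sum(w_i * x_i), loss = 0.5 * (pred - y)^2."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    err = pred - y
    loss = 0.5 * err * err
    grad = [err * xi for xi in x]  # hand-written (analytic) backward pass
    return loss, grad


def numerical_grad(f, w, eps=1e-5):
    """Central-difference estimate of df/dw, perturbing one parameter at a time."""
    grad = []
    for i in range(len(w)):
        orig = w[i]
        w[i] = orig + eps
        plus = f(w)
        w[i] = orig - eps
        minus = f(w)
        w[i] = orig  # restore the parameter before moving on
        grad.append((plus - minus) / (2 * eps))
    return grad


def max_relative_error(a, b, tiny=1e-12):
    """Largest element-wise relative difference between two gradients."""
    return max(abs(ai - bi) / max(abs(ai) + abs(bi), tiny) for ai, bi in zip(a, b))


# Fixed illustrative values (assumptions, not data from the project).
w = [0.5, -0.3, 0.8, 0.1]
x = [1.0, 2.0, -1.5, 0.25]
y = 0.3

_, analytic = loss_and_grad(w, x, y)
numeric = numerical_grad(lambda p: loss_and_grad(p, x, y)[0], w)
rel_err = max_relative_error(analytic, numeric)
print(f"max relative error: {rel_err:.2e}")
```

A small relative error suggests the backward pass matches the loss; a large one usually points to a bug in one of the two. This sketch runs in double precision, so a check against single-precision GPU kernels would likely need a larger `eps` and a looser tolerance.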