Hacker News
visarga on March 15, 2024 | on: Quiet-STaR: Language Models Can Teach Themselves t...
You're missing an important detail here: the number of tokens. Yes, you have 50 "steps" of network depth, but you can generate extra tokens. Assuming you don't run out of tape, there is no reason for LLMs to be limited to simple operations.
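A toy sketch of the point (hypothetical names, not from the paper): even if a single forward pass can perform only one simple operation because of fixed depth, emitting each intermediate result as an extra token lets the model chain arbitrarily many serial steps, limited only by context length (the "tape").

```python
# Toy illustration (hypothetical): a "model" whose single forward pass
# can do only ONE arithmetic step (fixed depth), but which chains
# arbitrarily many steps by appending each result to its own "tape".

def forward_pass(tape):
    """One depth-limited step: read the last value, apply one operation."""
    return tape[-1] * 2  # a single simple operation per pass

def generate(tape, extra_tokens):
    """Each extra token emitted buys one more serial computation step."""
    for _ in range(extra_tokens):
        tape.append(forward_pass(tape))
    return tape[-1]

# Computing 2^10 takes 10 serial doublings -- more than one pass allows,
# but fine as long as we don't run out of tape (context length).
print(generate([1], extra_tokens=10))  # 1024
```

With zero extra tokens the model is stuck at whatever one pass can compute; with n extra tokens it gets n additional serial steps.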