
No, because they're already taking that into account.

>Metric: generation throughput (token/s) = number of the generated tokens / (time for processing prompts + time for generation).

(Though they're batching, so the number is aggregate throughput across the whole batch, which makes this an unfair comparison. It would be interesting to see the speed at batch size 1.)
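To make the arithmetic concrete, here is a rough sketch of the quoted metric. The function name and every number below are made up for illustration, not measurements from any benchmark. It only shows that prompt processing time is already in the denominator, and that a batched figure counts tokens from all sequences in the batch.

```python
def generation_throughput(generated_tokens: int,
                          prefill_seconds: float,
                          decode_seconds: float) -> float:
    """Quoted metric: generated tokens / (prompt processing time + generation time)."""
    total_seconds = prefill_seconds + decode_seconds
    if total_seconds <= 0:
        raise ValueError("total time must be positive")
    return generated_tokens / total_seconds


# Hypothetical values, chosen only so the arithmetic is easy to follow.
batch_size = 32             # sequences generated together
tokens_per_sequence = 128   # tokens generated for each sequence
prefill_seconds = 10.0      # time spent processing the prompts
decode_seconds = 54.0       # time spent generating

# Aggregate throughput: tokens from every sequence in the batch are counted.
batched = generation_throughput(batch_size * tokens_per_sequence,
                                prefill_seconds, decode_seconds)

# Rate that any one sequence in the batch actually sees.
per_sequence = batched / batch_size

print(batched, per_sequence)
```

The per-sequence figure is not the same thing as a real batch size 1 run, which would have different prefill and decode times; that would have to be measured separately.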


