

ImprobableTruth on Feb 20, 2023 | on: Running large language models like ChatGPT on a si...

No, because they're already taking that into account.

> Metric: generation throughput (token/s) = number of the generated tokens / (time for processing prompts + time for generation).

(Though they're doing batching, so this is an unfair comparison. Would be interesting to get single batch speed.)
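To make the quoted metric concrete, here is a minimal Python sketch of the calculation. The function name, the batch size, and all the timing numbers are made up for illustration; they are not measurements from the linked project.

```python
def generation_throughput(generated_tokens: int,
                          prompt_time_s: float,
                          generation_time_s: float) -> float:
    """Tokens per second, with prompt processing time counted in the denominator."""
    total_time_s = prompt_time_s + generation_time_s
    if total_time_s <= 0:
        raise ValueError("total time must be positive")
    return generated_tokens / total_time_s


# Hypothetical numbers, for illustration only.
batch_size = 4
tokens_per_sequence = 32

# The metric counts every token generated across the whole batch.
throughput = generation_throughput(
    batch_size * tokens_per_sequence,
    prompt_time_s=2.0,
    generation_time_s=14.0,
)
print(throughput)  # 8.0

# Rate seen by one sequence inside the batch.
per_sequence = throughput / batch_size
print(per_sequence)  # 2.0
```

Note that dividing by the batch size only gives the per-sequence rate inside a batched run. It is not the same as a run with batch size 1, which would have to be measured separately.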
