Pythia was trained on only ~300B tokens (LLaMA saw 1T+), so it's pretty dumb by comparison.

Pythia 13B is worse than LLaMA-7B and requires roughly double the resources (memory and compute) to run.
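
As a rough sanity check on the "double the resources" claim, here's a back-of-envelope sketch (Python, assuming plain fp16 inference and counting weights only; activations and KV cache add more on top, and note the largest Pythia checkpoint is actually 12B, though it's often cited as 13B):

    def fp16_weights_gib(params_billion: float) -> float:
        """Approximate weight memory in GiB at fp16 (2 bytes/param)."""
        return params_billion * 1e9 * 2 / 2**30

    for name, size in [("LLaMA-7B", 7.0), ("Pythia-13B", 13.0)]:
        print(f"{name}: ~{fp16_weights_gib(size):.0f} GiB")
    # LLaMA-7B:   ~13 GiB
    # Pythia-13B: ~24 GiB  -> roughly double

So for weight memory alone, "double" checks out.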



Not all use cases need GPT-4 level performance. I'd argue that even LLaMA-7B is quite limited. Also, new and improved models are being released all the time.



