Hacker News

Aaaah thanks for the explanation. PANEL of 10 humans, as in, they were all together. I parsed the phrase as "10 random people" > "average human" which made little sense.


Actually, I believe he did mean 10 random people tested individually, not a committee of 10 people. The key is that the question counts as answered correctly if any one of the 10 people gets it right. This is similar to how LLMs are evaluated with pass@5 or pass@10 criteria: the LLM has no memory, so running it 10 times is more like asking 10 random people than asking the same person 10 times in a row.

I would expect 10 random people to do better than a committee of 10 people because 10 people have 10 chances to get it right while a committee only has one. Even if the committee gets 10 guesses (which must be made simultaneously, not iteratively) it might not do better because people might go along with a wrong consensus rather than push for the answer they would have chosen independently.
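To put rough numbers on that, here is a minimal sketch. It assumes every person answers independently and is right with the same probability p, and it models the committee as a strict-majority vote on a right-or-wrong answer. Those are my simplifying assumptions, not anything from the original claim, and the function names are made up for illustration. It deliberately ignores the consensus effect, which would make the committee look even worse.

```python
from math import comb


def any_of_k(p: float, k: int) -> float:
    """Chance that at least one of k independent attempts is correct."""
    return 1 - (1 - p) ** k


def majority_of_k(p: float, k: int) -> float:
    """Chance that a strict majority of k independent voters is correct."""
    need = k // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(k, j) * p**j * (1 - p) ** (k - j)
        for j in range(need, k + 1)
    )


# Compare the two scoring rules for a few per-person accuracies.
for p in (0.1, 0.3, 0.5):
    print(
        f"p={p}: any-of-10={any_of_k(p, 10):.3f}, "
        f"majority-of-10={majority_of_k(p, 10):.3f}"
    )
```

Under these assumptions the any-of-10 score can never be lower than the majority score, because every outcome where the majority is right also has at least one person right.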


He means 10 humans voting on the answer.



