> The question has been making the rounds online as a simple logic test, the kind any human gets instantly, but most AI models don't.

...

> They ran the exact same question with the same forced choice between "drive" and "walk," no additional context, past 10,000 real people through their human feedback platform.

> 71.5% said drive.

Well, that's a bit embarrassing.

That implies some models simply do better than the average human, at least on this question.

I don't think the technology needs to live up to some expectation of perfection; it just needs to beat the human average to be beneficial (though often, sadly, not to the workers themselves).