> i need to wash my helicopter at the helicopter wash. it is 50m away, should i walk or fly there with my helicopter.
Sonnet 4.6: Walk! Flying a helicopter 50 metres would be more trouble than it's worth — by the time you've done your pre-flight checks, spun up the rotors, lifted off, and then safely landed again, you'd have walked there and back twice. Just stroll over.
I think this is my favourite test. You can just tell it was programmed on smug Reddit comments talking about how Americans drive to places 50 metres away.
It's amusing, but when it comes to doing actual work, I just don't care if my LLM fails things like this.
I'm not trying to trick it, so falling for tricks is harmless for my use cases. Does it write quality, secure code? Does it give me accurate answers about coding/physics/biology? If it gets those wrong, that's a problem. If it fails to solve riddles, well, that'll be a problem iff I decide to build a riddle solver using it.
Additionally, I don't think that these kinds of failures say much about overall intelligence. Humans are largely visual creatures, and we fall prey to innumerable visual illusions where we fail to see what's actually there or imagine something that isn't there under certain visual patterns.
LLMs are largely textual creatures, and they fail to see things that are there or imagine things that aren't there under certain textual patterns.
I don't think you would say a human "isn't really intelligent" because it imagines grey spots at the intersection of black squares on a white background even though they aren't there.
TBH I would first walk there to check that they can take me on the spot, and if so, ask them to either come and clean it (it's only 50m away) or, if they cannot, fly it there myself. So walking seems very rational to me.
Sure, just pick up the building containing the compressors, water hoses/sprayers, soap, and required drainage and water filtration system, and bring it 50 metres down the road.
Ah yes, the new "how many r's in strawberry" question. Some poor intern has to go vacuum up all these gotcha social media posts so they can train the next model on them.