
To me the only acceptable answer would be “what do you mean?” or “can you clarify?” if we were to take the question seriously to begin with. People don’t intentionally communicate with riddles and subliminal messages unless they have some hidden agenda.



Sure, if an open ended response was allowed, but if it was a multiple choice question then you'd have to use your common sense and pick one.

However, the important issue here really isn't about the ability of humans or LLMs to recognize logic puzzles. If you were asking an LLM for real world advice, trying to be as straightforward as possible, you may still get a response just as bad as "walk", but not be able to recognize that it was bad, and the reason for the failure would be exactly the same as here - failure to plan and reason through consequences.

It's toy problems like this that should make you step back once in a while and remind yourself of how LLMs are built and how they are therefore going to fail.


Thing is, it's not a riddle or a subliminal message. Everything needed to answer the question is contained therein.

I don't think it is, though. Where is the car? Do you want to wash your car at the car wash? Both of those are rather important pieces of information. Everyone is relying on assumptions to answer the question, which is fine, but in my opinion not a great reasoning test.

If you want to argue that, then you could also argue that everything needed to challenge the questions’ motives and its validity is also contained therein.

This reminds me of people who answer with “Yes” when presented with options where both can be true but the expected outcome is to pick one. For example, the infamous: “Will you be paying with cash or credit sir?” then the humorous “Yes.”


That's precisely what makes it a "trick question" or a "riddle". It's weird precisely because all the information is there. People with functioning brains and complete information don't ask pointless questions (they would, obviously, just drive their car to the car wash). There's no functional or practical reason for the communication, which is what gives it the status of a puzzle: the phrasing, plus the exploitation of our tendency to assume questions are asked because information is incomplete, tricks us into bringing outside considerations to bear that don't matter.

If you were forced to answer one or the other, which would you pick? I think that's where the interesting dynamic comes from. Most humans would pick "drive", as seen in the human control group, even if that share is lower than I thought it'd be.

Sure, though then we’re in la la land. What’s a real life example of being forced to answer an absurd question other than riddles, games, etc? No longer a valid question through normal discourse at that point, and if context isn’t provided then I think the expected outcome still is to ask for clarification.

I would love to see LLMs start to ask clarifying questions. That feels like it would be a step up, similar to reasoning.

Claude Code has an entire tool for the LLM to ask clarifying questions - it'll give you three pre-written responses, or you can respond with your own text.

How is that a "subliminal message"? It's just a simple example of common sense, which LLMs fail because they can't reason, not because they are "overthinking". If somebody asks, "What's 2+2?", they might be insulting you, but that doesn't mean the answer is anything other than 4.

2+2 might well not equal 4, since you haven’t specified the base of the numbers or the modulus of the addition.
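To make the pedantry concrete, here's a minimal Python sketch; the `to_base` helper is my own illustration, not anything from the thread:

```python
def to_base(n, base):
    """Render a non-negative integer as a digit string in the given base."""
    if n == 0:
        return "0"
    digits = []
    while n:
        digits.append(str(n % base))
        n //= base
    return "".join(reversed(digits))

# Ordinary integers: 2 + 2 is four, but base 3 writes that value as "11".
print(to_base(2 + 2, 3))   # -> "11"

# Modular arithmetic: in Z/3Z, 2 + 2 is congruent to 1.
print((2 + 2) % 3)         # -> 1
```

So "2+2" only pins down an answer once you've fixed the usual conventions, which is the (deliberately obtuse) point being made above.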

And what if it’s a full service car wash and you’ve parked nearby because it’s full so you walk over and give them the keys?

Assumptions make asses of us all…


So you're saying it would be useful for an "AI assistant" to ask you for the base each time you give it a math problem? Do you also want it to ask you if you're using the conventional definitions of "2" and "+"? For the car wash, would you like it to ask if you're on Earth or on Mars? Do you have air in your tires? Is the car actually a toy car?

Some assumptions are always necessary and reasonable, that's why I'm saying the "AI" lacks common sense.


Seems like you’re the one not applying common sense now!

Yes, that is exactly the point of my comment. Illustrating that disregarding normal (=common) assumptions (=sense) is a lack of common sense.

It’s common sense to ask a question in riddle format? What’s the goal of the person asking the question? To challenge the other person? In what way? See if they get the obvious? Asking for clarification isn’t valid?

It's common sense to know that you need to have your car with you to wash it. Asking the question is a challenge in the obvious, yes. If you asked an AI "what's 2+2" and it said 3, would you argue that the question was a trick question?

No. I would expect it to say 4, given that it has an objective answer. For the other, without any context whatsoever, I would prefer that it ask for clarification. I would be okay if the way it asked for clarification went something like:

“What do you mean walk or drive? I don’t understand the options given you would need your car at the car wash. Is there something else I should know?”


"What do you mean two plus two? I don't understand the question given that it's basic math. Is there something else I should know?"

I fail to see how these things are one and the same. I get the point you are making, I just don't agree with it.

2+2 is a complete expression, the other is grammatically correct but logically flawed. Where is the logical fallacy in 2+2?


Well, I don't think you get my point, based on your last question. My point is that there is no logical fallacy in the car wash question, just like there is none in 2+2. How is it any more logically flawed than asking, "I want to shop for groceries. The shop is 50 meters away. Should I walk or drive?"

You're conflating something being phrased as a question with its being logically sound. The prior context is what introduces the flaw: the question on its own is fine, but given the information about the car it becomes absurd. Your new example illustrates something different; context cannot be ignored here, since it is what makes the entire thing what it is. In the car wash example, the context has a direct relationship with the question that determines the answer. That relationship matters so much that OP claims, for the benchmark's purposes, only "drive" is a valid answer. That special condition is what makes it a puzzle, a test of your attention, and a logically flawed proposition, despite being structured grammatically as a question. 2+2 carries no such relationship in its structure and presentation.

You're not making a fair comparison.

"What's 2 + 2" is a completely abstract question for mathematics that human beings are thoroughly trained mostly to associate with tests of mastery and intelligence.

The car wash question is not such a question. It is framed as a question regarding a goal oriented, practical behavior, and in this situation it would be bizarre for a person to ask you this (since a rational person having all the information in the prompt, knowing what cars are, which they own, and knowing what a car wash is, wouldn't ask anybody anything, they'd just drive their car to the car wash).

And as someone else noted, there are in fact situations in which it actually can be reasonable to ask for more context on what you mean by "2 + 2". You're just pointing out that human beings use a variety of social mores when interpreting messages, which is precisely why the car wash question would be silly/a trick were a human being to ask it without first prefacing it with a statement like "we're going to take an exam to test your logical reasoning".

As with LLMs, interpretation is all about context. The people who find this question weird (reasonably) interpret it in a practical context, not in a "this is a logic puzzle" context, because human beings wash cars far more often than they subject themselves to logic puzzles.


My point is that just because there's no practical reason to ask the question, that doesn't make it a weird question or make the answer anything other than obvious. You'd never ask somebody "Is the sky blue?", but that doesn't mean the answer is anything other than "Yes". The answer is clearly not "Well, is it night? Is it sunset?" etc.


