Looks like you're mixing up two things when testing: the correct answer and form...

XCSme · 2026-02-18T11:05:38 1771412738

Because the format can't always be strictly defined via structured output, and sometimes you have to write it in plain words. Imagine you also have a field within your JSON that needs its own specific format. It's AI: you don't want to write a 2000-line JSON schema to define what you need and how to parse it. That's the point of using AI instead of writing your own data extraction script.

Also, simply because a human would respect it properly. And it's quite clear what the request was.

Thanks for the suggestion to separate format following from answer correctness. Good idea, I'll think about it.

Still, some good AIs do it properly and as expected. Why would I change the tests specifically for Claude, which is basically the only one with this problem?
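For what it's worth, the suggested split between format following and answer correctness could look roughly like the sketch below. It is only an illustration: the `score_response` helper, the `"answer"` field name, and the substring match are assumptions made for the example, not the actual test harness discussed here.

```python
import json


def score_response(raw: str, expected_answer: str) -> dict:
    """Score format adherence and answer correctness independently.

    Hypothetical helper: assumes the requested format is a JSON object
    with a string field named "answer".
    """
    result = {"format_ok": False, "answer_ok": False}

    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = None

    if isinstance(parsed, dict) and isinstance(parsed.get("answer"), str):
        result["format_ok"] = True
        candidate = parsed["answer"]
    else:
        # Fall back to the raw text so that a correct answer in the
        # wrong format is still credited on the correctness axis.
        candidate = raw

    # Deliberately loose match (case-insensitive substring).
    result["answer_ok"] = expected_answer.strip().lower() in candidate.strip().lower()
    return result
```

With this split, a reply like `The answer is Paris.` would fail the format check but still pass the correctness check, so the two failure modes show up separately in the results.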

viraptor · 2026-02-18T12:05:44 1771416344

> Because the format can't also be strictly defined via structured output, and you have to write it in plain words.

That's not how structured output works. Check the docs https://platform.claude.com/docs/en/build-with-claude/struct...

The schema is enforced at inference time: non-conforming tokens are removed from the set of possible next tokens.
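To make "non-conforming tokens are removed" concrete, here is a toy sketch of constrained decoding by token masking. It is not how any particular provider implements it: real systems typically compile the JSON schema into a grammar or automaton, whereas this example just enumerates a tiny set of valid outputs. The vocabulary, the `fake_logits` model, and all function names are made up for illustration.

```python
import math


def allowed_next_tokens(prefix: str, vocab: list[str], valid_outputs: list[str]) -> set[int]:
    """Indices of tokens that keep the output a prefix of some valid string."""
    return {
        i
        for i, tok in enumerate(vocab)
        if tok and any(v.startswith(prefix + tok) for v in valid_outputs)
    }


def constrained_greedy_decode(logits_fn, vocab: list[str], valid_outputs: list[str]) -> str:
    """Greedy decoding where disallowed tokens are masked to -inf at every step."""
    prefix = ""
    while prefix not in valid_outputs:
        logits = logits_fn(prefix)
        allowed = allowed_next_tokens(prefix, vocab, valid_outputs)
        if not allowed:
            raise ValueError("vocabulary cannot complete any valid output")
        # Mask: tokens that would break the format can never be selected.
        masked = [l if i in allowed else -math.inf for i, l in enumerate(logits)]
        best = max(range(len(vocab)), key=masked.__getitem__)
        prefix += vocab[best]
    return prefix


# Toy setup: the "model" strongly prefers the invalid token "maybe".
VOCAB = ['{"ok": ', "true", "false", "}", "maybe"]
VALID = ['{"ok": true}', '{"ok": false}']


def fake_logits(prefix: str) -> list[float]:
    # Hypothetical stand-in for a model: same scores regardless of prefix.
    return [0.1, 0.5, 1.0, 0.2, 5.0]


decoded = constrained_greedy_decode(fake_logits, VOCAB, VALID)
# decoded == '{"ok": false}' even though "maybe" has the highest raw score
```

The point is that the format is guaranteed by construction rather than by the model choosing to follow instructions, which is why schema-expressible constraints behave differently from ones stated only in the prompt.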

XCSme · 2026-02-18T12:10:14 1771416614

I use structured output in many live AI systems, maybe my point was not clear.

For some tasks it's impossible to define a JSON schema. Let's say you want the message to end with "Thank you", in any language. Should I add 200 possible endings to my schema? What about all their variations and declensions in various languages?

Sometimes you have to define in natural language what you want the output to look like.