AI blind spots, side by side

Questions AI Still Gets Wrong

Some questions are not hard, but they expose the default frame a model reaches for. ChatHub puts several models on one screen so mistakes, hesitation, and correct answers are visible together.

Real-world constraint

The car wash is 50 meters away. Walk or drive?

The surface question looks like a transportation choice. The real constraint is that the car is the object of the service, not merely a way to get there.

Correct frame

Drive. The car is what needs to be washed, so the car must get to the car wash.

Prompt

I want to wash my car. The car wash is 50 meters from my home. Should I walk or drive?

GPT-5.5

OpenAI

Caught it

Drive — if you want to wash the car at the car wash, the car needs to be there.

Since it’s only 50 meters away, you could walk there first if you just need to check prices or ask something, but for the actual wash, drive the car over.

Human review: Rerun: directly identifies that the car has to be at the car wash.

Claude Opus 4.8

Anthropic

Caught it

Walking is the obvious choice here—but let me point out the funny catch:

If you're going to wash your car, the car kind of needs to be at the car wash. So unless it's a self-service situation where you're bringing buckets and sponges to a fixed location, you'll need to drive the car there anyway. That's the whole point!

So the real answer:

  • If it's a car wash facility (automatic or self-serve bay): You have to drive the car, since the car is what's getting washed. Walking 50 meters and leaving the car at home defeats the purpose. 🚗
  • If you're washing the car yourself with your own supplies at some spot 50 meters away: Just walk—it's only 50 meters, roughly a 30–40 second stroll. No reason to burn fuel and maneuver a vehicle that short distance.

My guess is you mean the first scenario, in which case: bring the car. 😄

Human review: Rerun: starts with the obvious walking frame, then catches the car-wash constraint.

Gemini 3.1 Pro

Google

Caught it

You should definitely drive!

If you walk the 50 meters to the car wash, your car will still be sitting at home, and you won't be able to wash it.

Human review: Rerun: gives the intended answer immediately.

DeepSeek V4 Pro

DeepSeek

Caught it

You should drive. The car wash is only 50 meters away, but if you walk, you'll arrive with no car to wash — and a still-dirty car left at home. Driving gets both you and the car there, so you can actually accomplish the goal. Unless, of course, you're planning to walk there to buy some soap and then carry it back to wash the car at home — but that's not what most people mean when they say "I'm going to the car wash." 😄

Human review: Rerun: clearly explains why walking defeats the goal.

Reddit and Xiaohongshu have many discussions around this prompt. Public posts show models often drifting into the wrong frame: exercise, emissions, or short-trip engine wear.

Spatial frame

If the hand appears on the left side of the mirror, which hand is it?

Models often reach for the familiar “mirrors reverse left and right” explanation. The actual test is whether they first fix the reference frame.

Correct frame

The left hand. The prompt is anchored to your own view, where the left side of the mirror image corresponds to your left side.
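Fixing the reference frame can be sketched in a few lines. This is an illustrative model only: the `Point` type, the axis conventions, and the sample coordinates are assumptions, with the mirror placed on the plane z = 0 and x measured from the viewer's own left (negative) to right (positive).

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: float  # viewer's left (-) to right (+)
    y: float  # down (-) to up (+)
    z: float  # distance in front of the mirror plane z = 0


def reflect(p: Point) -> Point:
    """A flat mirror at z = 0 flips depth only; x and y are unchanged."""
    return Point(p.x, p.y, -p.z)


def side_in_view(p: Point) -> str:
    """Which side of the viewer's field of view the point falls on."""
    return "left" if p.x < 0 else "right"


# Hypothetical position of a raised left hand, 1 m in front of the mirror.
left_hand = Point(-0.3, 1.2, 1.0)
image = reflect(left_hand)

print(side_in_view(left_hand))  # left
print(side_in_view(image))      # left
```

Because the reflection leaves x untouched, a hand on the viewer's left stays on the left of the image. The "right hand" answer only appears when the image is re-imagined as a second person facing the viewer.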

Prompt

I am standing in front of a mirror and raising one hand. In my view, that hand appears on the left side of the mirror image. Which hand am I raising in real life?

GPT-5.5

OpenAI

Caught it

You’re raising your left hand in real life.

A mirror doesn’t swap left and right from your viewpoint; the hand that appears on the left side of the mirror image is the hand that is on your left side.

Human review: Rerun: correctly anchors the answer to the viewer’s left side.

Claude Opus 4.8

Anthropic

Missed it

You are raising your right hand in real life.

Here's why: A mirror doesn't actually flip left and right—it flips front-to-back (depth). But because you turn around to face the mirror (or imagine your reflection as another person facing you), it appears as though left and right are swapped.

So when you raise your right hand:

  • Your reflection raises the hand on its left side (from the reflection's own perspective).
  • But from your viewpoint looking at the mirror, that hand appears on the left side of the image—which lines up with your own right hand.

Since you said the raised hand appears on the left side of the mirror image (from your point of view), that corresponds to your right hand in real life.

Human review: Rerun: reasons into the opposite answer and says right hand.

Gemini 3.1 Pro

Google

Caught it

You are raising your left hand.

Here is why:

Mirrors do not actually flip things left-to-right; they flip things front-to-back (depth).

When you raise your left hand, that hand is physically on the left side of your body. When you look straight ahead into the mirror, the reflection of that hand remains on the left side of your field of view.

The common confusion comes from imagining the reflection as another real person standing opposite you. If a real person were facing you and raised the hand on your left side, they would be raising their right hand. But because a mirror just bounces light straight back at you, whatever happens on your left physically stays on the left side of the glass.

Human review: Rerun: clearly separates mirror physics from the imagined-person perspective.

DeepSeek V4 Pro

DeepSeek

Caught it

When standing in front of a mirror, the reflection is reversed left-to-right from the perspective of the image as a separate person, but the physical position of your hand aligns directly with its position in the mirror. If you raise your left hand, it appears on the left side of the mirror from your own viewpoint. Therefore, if the raised hand appears on the left side of the mirror image, you are raising your left hand.

Human review: Rerun: correctly keeps physical left aligned with visual left.

Numerical lure

5 shirts dry in 5 hours. How long for 30 shirts?

A model can be pulled into a linear extrapolation and answer 30 hours. Drying clothes is not a serial task where each shirt waits its turn.

Correct frame

Probably still close to 5 hours, assuming enough space, sunlight, and airflow.
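The "enough space" assumption can be made explicit with a small batch model. This is a sketch, not a physical model of drying: the function name `drying_time_hours` and the `capacity` parameter (how many shirts the balcony holds without crowding) are hypothetical and do not come from the prompt.

```python
import math


def drying_time_hours(shirts: int, hours_per_batch: float, capacity: int) -> float:
    """Estimate total drying time when shirts dry in parallel batches.

    capacity is an assumed limit on how many shirts fit at once.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if shirts <= 0:
        return 0.0
    # Shirts in the same batch dry simultaneously, so only the
    # number of batches multiplies the time.
    batches = math.ceil(shirts / capacity)
    return float(batches * hours_per_batch)


# Room for all 30 shirts at once: one batch.
print(drying_time_hours(30, 5, capacity=30))  # 5.0

# Room for only 5 shirts at a time: six batches.
print(drying_time_hours(30, 5, capacity=5))   # 30.0
```

The linear answer of 30 hours only falls out when the balcony holds exactly 5 shirts at a time, which the prompt rules out by saying all 30 are dried at the same time.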

Prompt

It takes 5 hours to dry 5 shirts on the balcony. If I dry 30 shirts at the same time, how long will it take?

GPT-5.5

OpenAI

Caught it

It will take 5 hours, assuming all 30 shirts can be spread out properly and get the same sunlight/airflow.

Drying more shirts at the same time doesn’t make each shirt take longer unless they’re crowded or overlapping.

Human review: Rerun: correctly treats drying as simultaneous under enough-space conditions.

Claude Opus 4.8

Anthropic

Caught it

If the shirts all dry simultaneously (spread out in the sun/air at the same time), then drying is a parallel process — each shirt dries independently.

So drying 30 shirts at once still takes just 5 hours, the same as drying 5 shirts.

The trick in this classic puzzle is that the shirts don't have to "wait in line" — they all dry at the same time. The only way it would take longer is if you had limited space and had to dry them in batches.

Human review: Rerun: explains the parallel-process trap clearly.

Gemini 3.1 Pro

Google

Caught it

It will take 5 hours.

Since you are drying the shirts at the same time, they all dry simultaneously. The number of shirts doesn't change the amount of time it takes for the water to evaporate from the fabric, assuming you have enough space on the balcony to hang all 30 shirts so they get the same amount of air and sunlight.

Human review: Rerun: answers 5 hours with the right airflow and space caveat.

DeepSeek V4 Pro

DeepSeek

Caught it

If you dry 30 shirts at the same time on the same balcony, it will still take 5 hours.

Drying time depends on environmental factors like sunlight, air circulation, and humidity—not on the number of shirts, as long as they all have enough space to dry simultaneously. Each shirt dries independently and concurrently. So whether you hang 5 shirts or 30 shirts, they all finish drying in roughly the same amount of time.

Human review: Rerun: correctly ties drying time to environment, not shirt count.

This is a common-sense test that circulates widely in online communities. It is useful for showing whether a model checks if the task is parallel.

Relation reasoning

How many sisters does Alice’s brother have?

The prompt is short, but it requires switching from Alice’s perspective to her brother’s perspective. Models often stumble on pronouns and viewpoint changes.

Correct frame

3 sisters: Alice herself plus Alice’s 2 sisters.
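The viewpoint switch can be checked by writing the family out explicitly. The sketch below is only an illustration: the sibling names other than Alice are placeholders invented for the example, and `count_sisters` is a hypothetical helper.

```python
# Family as described in the prompt: Alice, her 2 sisters, her 3 brothers.
# Names other than "Alice" are placeholders.
family = [
    ("Alice", "F"),
    ("Sister 1", "F"),
    ("Sister 2", "F"),
    ("Brother 1", "M"),
    ("Brother 2", "M"),
    ("Brother 3", "M"),
]


def count_sisters(person: str, members: list) -> int:
    """Count female siblings of `person`, excluding the person themselves."""
    return sum(1 for name, sex in members if sex == "F" and name != person)


print(count_sisters("Alice", family))      # 2
print(count_sisters("Brother 1", family))  # 3
```

From Alice's perspective she is excluded from her own sister count; from a brother's perspective she is included, which is where the extra sister comes from.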

Prompt

Alice has 3 brothers and 2 sisters. How many sisters does one of Alice’s brothers have?

GPT-5.5

OpenAI

Caught it

One of Alice’s brothers has 3 sisters: Alice and her 2 sisters.

Human review: Rerun: correctly includes Alice as one of the brother’s sisters.

Claude Opus 4.8

Anthropic

Caught it

One of Alice's brothers has 3 sisters.

Here's the reasoning: Alice has 2 sisters, but Alice herself is also a sister to her brothers. So from a brother's point of view, the sisters are Alice plus her 2 sisters, giving a total of 2 + 1 = 3 sisters.

Human review: Rerun: gives the right count and reasoning.

Gemini 3.1 Pro

Google

Caught it

One of Alice's brothers has 3 sisters.

Here is why:
Alice has 2 sisters, which means there are 3 girls in the family altogether (Alice + her 2 sisters). Since they are all siblings, the brothers share the exact same sisters—meaning every brother has those 3 girls as sisters.

Human review: Rerun: reconstructs the full set of girls in the family.

DeepSeek V4 Pro

DeepSeek

Caught it

Alice has 3 brothers and 2 sisters. This means the family consists of Alice, her 2 sisters (so 3 sisters in total), and 3 brothers.

From the perspective of one brother:

  • He has all 3 girls as his sisters (Alice and her 2 sisters).
  • He has the other 2 boys as his brothers.

Thus, one of Alice's brothers has 3 sisters.

Answer: 3

Human review: Rerun: correctly answers 3 and explains the sibling set.

The "Alice in Wonderland" (AIW) paper and later coverage used similar kinship questions, showing that high-scoring models can still be unstable on simple relation reconstruction.

Ambiguous context

Which is larger, 9.11 or 9.9?

A good answer does not rush. It checks the comparison rule first. This case shows which models proactively clarify context.

Correct frame

As decimals, 9.9 is larger. As version numbers, it depends on the versioning rules.
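The two readings can be compared side by side. This is a minimal sketch: the helper names are hypothetical, and the version rule shown (dot-separated integers compared component by component) is one common convention assumed for illustration, not the only versioning scheme.

```python
from decimal import Decimal
from typing import Tuple


def larger_as_decimal(a: str, b: str) -> str:
    """Compare the strings as decimal numbers."""
    return a if Decimal(a) > Decimal(b) else b


def version_key(v: str) -> Tuple[int, ...]:
    """Split "9.11" into (9, 11) so each component compares as an integer."""
    return tuple(int(part) for part in v.split("."))


def larger_as_version(a: str, b: str) -> str:
    """Compare the strings as dot-separated version numbers (assumed rule)."""
    return a if version_key(a) > version_key(b) else b


print(larger_as_decimal("9.11", "9.9"))  # 9.9
print(larger_as_version("9.11", "9.9"))  # 9.11
```

The same two strings give opposite answers depending on the rule, which is why a complete answer names the comparison rule before answering.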

Prompt

Which is larger, 9.11 or 9.9?

GPT-5.5

OpenAI

Partly right

9.9 is larger.

You can compare it as 9.90 versus 9.11, and 9.90 > 9.11.

Human review: Rerun: correct as a decimal comparison, but does not mention version-number context.

Claude Opus 4.8

Anthropic

Partly right

9.9 is larger than 9.11.

To see why, compare them digit by digit after the decimal point:

  • 9.9 = 9.90
  • 9.11 = 9.11

Comparing the tenths place: 9 (from 9.90) versus 1 (from 9.11). Since 9 > 1, we have 9.9 > 9.11.

Human review: Rerun: correct decimal reasoning, with no context clarification.

Gemini 3.1 Pro

Google

Partly right

9.9 is larger than 9.11.

Here is why:

When comparing decimals, you look at the place value of the digits from left to right.

  1. Whole numbers: Both have a 9 in the ones place (9 = 9).
  2. Tenths place: Looking at the first digit after the decimal point, 9.9 has a 9 in the tenths place, while 9.11 has a 1 in the tenths place.

Since 9 is greater than 1, 9.9 is larger. It can help to add a zero to the end of 9.9 to make them the same length. Comparing 9.90 to 9.11 makes it much easier to see that 9.90 is the larger number.

Human review: Rerun: correct as decimal math, but assumes the comparison rule.

DeepSeek V4 Pro

DeepSeek

Partly right

The number 9.9 is equivalent to 9.90 when expressed with two decimal places. Comparing 9.11 and 9.90, we see that 9.90 > 9.11 because 90 hundredths is greater than 11 hundredths. Therefore, 9.9 is larger.

Human review: Rerun: correct decimal answer, but does not surface the ambiguity.

Communities such as V2EX repeatedly discuss this family of questions. The issue is not only arithmetic; it is whether the model notices missing context.

This library should keep growing

Want to try another question or more models?

Ask in ChatHub and compare GPT, Claude, Gemini, DeepSeek, and more models side by side on your own prompts.

Next topic candidate

Subjective and ethical dilemmas deserve their own page

Prompts like the trolley problem or the meaning of life should not be scored as simply right or wrong. They are better for a separate topic about value tradeoffs, boundaries, and answer styles across models.

The model answers on this page were rerun on 2026-07-01 with the current model IDs and the same prompts. Rerun the set again if model versions or routing change before publication.