Do Nielsen’s heuristics still matter for AI experiences? (Q&A)

We’re jumping into @Menno’s article today, Nielsen’s Heuristics for an AI-Powered Post-Screen World. He argues that Nielsen’s usability heuristics still matter, but only if we stop treating them as fixed rules from a screen-based past and start using them as living principles for today’s AI-driven, post-screen world.

If we apply the heuristics literally, we design for a world that no longer exists. He believes that heuristics still work, though, because they describe how people think, feel, and make mistakes (not because they describe buttons, pages, or layouts).

The main problem is that the world has changed. Interfaces now live in smart locks, wearables, voice systems, AR, VR, and AI copilots.

As technology stretches beyond screens, the heuristics get stress-tested. AI systems introduce probability, uncertainty, and risk, which means designers must expose confidence, explain reasoning, and make it easy for people to push back.

Let’s jump into the discussion:

I like Menno’s perspective… let’s not replace Nielsen’s heuristics. We need to translate them. Ask better questions when applying them.

We need to design for moments with no screen, no certainty, and real consequences. Stretch the heuristics to fit new contexts, because our tools are changing fast, but our responsibility to users has not.

Where have Nielsen’s heuristics helped you design a better AI experience, and where have they clearly fallen short?

Let’s dig into what it takes to ship an AI project. Menno Crammer is a featured Helio author and my first guest on Glaringly Obvious. Pumped to learn from him today.

2 Likes

@Menno, pumped to chat and would love others to jump into the conversation. Let’s start with some basics.

In your opinion, which heuristic breaks first when an AI system becomes probabilistic instead of deterministic?

2 Likes

I guess Consistency.

4: Consistency and Standards

Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform and industry conventions.

AI is so often under the hood, a black box: we “don’t know what happened,” “it didn’t share its reasoning”… We (sadly) are already getting accustomed to “not knowing.”

1 Like

Yes, I think this is where trust fades with AI (both in users and teams)

As soon as outputs vary, people stop expecting repeatable behavior and start tolerating “it just does that sometimes.” That creates a gap: once teams accept unpredictability as a given, frustration with the tools sets in. The bar drops fast. Investment declines.

The challenge with AI projects is finding consistency in outcomes, but also in intent, constraints, and recovery when things go wrong. I know over here we beat our heads against the wall sometimes.

If AI cannot be consistent in results, what must it be consistent about for users to trust it?

1 Like

True. One big problem I see, at least with current LLMs, is the urge to please the user.

You can tell it, “no, that is not right,” and it will answer, “you are absolutely right.”
It does make mistakes, and it does change its opinion.
Also, you can ask the same question twice and get different answers, in the same model as well as across different models.

What should stay consistent is its relationship with you. Then users have that as a basis to rely on: how it talks to you, what it knows about you, honesty about what it doesn’t know.
Don’t invent, don’t fill in the gaps that should stay gaps…

And my personal favourite… I still wish there was a “certainty percentage” for each answer… The equivalent of facial expressions when someone talks to you. There is no black and white in human communication… but there is with AI…
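A rough sketch of what that “certainty percentage” could look like in practice. Everything here is hypothetical (the `Answer` shape, the thresholds, and the idea that a model even reports a usable confidence score are all assumptions for illustration):

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    confidence: float  # 0.0 to 1.0; assumes a hypothetical model-reported certainty

def render(answer: Answer) -> str:
    """Attach a plain-language certainty cue to the answer, the way
    facial expressions qualify what a person says. Thresholds are arbitrary."""
    if answer.confidence >= 0.9:
        cue = "confident"
    elif answer.confidence >= 0.6:
        cue = "fairly sure"
    else:
        cue = "uncertain, please verify"
    return f"{answer.text} ({cue}, ~{answer.confidence:.0%})"
```

So a middling answer would surface as “fairly sure, ~70%” instead of reading exactly as confident as a near-certain one.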

1 Like

Yes! This maps directly back to Visibility of System Status, just at a deeper level than UI feedback.

In deterministic systems, status meant progress bars and confirmations. In AI systems, status is honesty.

→ Confidence, uncertainty, limits, and refusal are the new signals.

I think when an AI tries to please, fills gaps, or mirrors agreement, it violates that heuristic by hiding its true state. You have to really press to get to the heart of an answer.

If “visibility of system status” now includes uncertainty, confidence, and intent, how should we redesign this heuristic for AI without turning every response into a disclaimer?

1 Like

100%. Part of this is also expectation management, though. Right now, “AI can make mistakes, please validate your answers” is like airline T&Cs… everyone clicks yes, nobody reads them.

Visibility of System Status is actually quite complex when it’s not binary. We have process updates and labeled statuses… now we have the “thinking” or “accessing docs” indicators. But even there, they could just be time fillers and placeholders, while in reality all processes run in parallel…

We should come up with a way to communicate to users what is happening. Whether that is status updates, a percentage, a score, or multiple versions, we will have to see…
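One honest-by-construction version of this: only emit a stage label when that stage actually starts, so “thinking” can’t be pure theater. The pipeline stages and event shape below are made up for the sketch:

```python
import time

def stream_status(stages):
    """Yield a status event only when a stage actually begins and ends,
    so labels reflect real work rather than placeholder animations.
    `stages` is a list of (name, callable) pairs; both are hypothetical."""
    for name, work in stages:
        yield {"stage": name, "state": "started", "t": time.time()}
        result = work()
        yield {"stage": name, "state": "done", "t": time.time(), "result": result}

# hypothetical two-stage pipeline
events = list(stream_status([
    ("retrieving docs", lambda: 3),
    ("drafting answer", lambda: "draft"),
]))
```

The point of the design is that a UI consuming `events` cannot show “retrieving docs” unless the retrieval callable was actually invoked.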

1 Like

Yes. This is exactly the tension you articulate in your article.

The old heuristic assumes status was binary and visible. Loading or done. Success or error. AI breaks that assumption. What looks like “thinking” or “processing” can just as easily be theater.

So I think the question shifts from showing activity to setting expectations.

If visibility of system status is about expectation management, what signals would you trust most from an AI system today?

2 Likes

Honestly the same we would expect from humans…

Evidence. Reasoning.
Walk me through how you got there and why.

Explain and defend your point of view or “argument”.
And provide proof, references, links, etc. Maybe even disprove the opposite…

In order to design AI systems properly we should look into human communication and “rituals”.

I like these 10 principles for Agentic design.

Not that they are all correct, but they’re a good “take” on what it takes to design trustworthy systems.

1 Like

If trust signals in AI should mirror human communication, which heuristic do you think needs the biggest rewrite to support evidence and reasoning?

1 Like

In my opinion… the heuristic that needs the biggest rewrite is “Visibility of system status,” merged with “Match between the system and the real world.”

In AI, “status” isn’t progress, it’s epistemic status (confidence and justification combined). So the system should mirror how humans earn trust in conversation by making its evidence, reasoning, assumptions, uncertainty, and plausible alternatives legible in plain, domain-appropriate language (clearly separating what it knows from what it infers or guesses).

Now, “knowing” is a dangerous one… because what does it actually “know”…

I guess the key will lie in redesigning the essence of communication. We communicate verbally and nonverbally, consciously and subconsciously. Somehow we need to squeeze this into AI.

We need to make it more obvious, more clear, more real, more human.
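The “epistemic status” idea above (separating what the system knows from what it infers or guesses) could be sketched as a structured answer. The `Claim`/`EpistemicAnswer` shapes and the basis labels are assumptions, not any real API:

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    basis: str  # "known" (sourced), "inferred", or "guess" (hypothetical labels)

@dataclass
class EpistemicAnswer:
    claims: list
    sources: list = field(default_factory=list)

    def summary(self) -> dict:
        """Separate what the system can back up with sources
        from what it merely infers or guesses."""
        backed = [c.text for c in self.claims if c.basis == "known"]
        shaky = [c.text for c in self.claims if c.basis != "known"]
        return {"backed": backed, "unverified": shaky, "sources": self.sources}
```

A renderer could then present the two buckets differently, which is exactly the legibility the heuristic rewrite asks for.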

1 Like

Loving this conversation! @menno one thing I would propose is that the marketing and branding of LLMs and AI contribute to a disconnect with this heuristic:

Match Between System and the Real World
We have an expectation that AI will act as an assistant or intern when we prompt: “Give me ten reference articles from trusted organizations about niche topic {X}.”

A human intern is unlikely to present a list of 10 articles, with 5 of those being broken or invented links. The ‘real world’ expectation is broken not because the intelligence is artificial, but because of inherent issues within an LLM. @schuboxaz is always talking about how AI doesn’t recognize ontology, so the idea of “Give me 10 of these things that exist in the world” doesn’t translate.

I’m curious if this resonates, or if you think AI actually is a system closer to the real world and just in need of alignment?

3 Likes

True, true. This is not human at all… We are not walking encyclopedias (well, most of us at least aren’t). And on the ontology, I agree: “it” doesn’t understand what it writes, therefore it cannot “know.” All relationships are probabilistic; this is the major hurdle for our current LLMs to evolve.

On the human side, I really think the key is in communication and/or expectation management.
Do we expect it to be right or wrong? Do we tolerate mistakes or not? And is this different from human to human versus human to machine?

The intern will not “know,” but we also don’t expect the intern to know…

3 Likes

Fantastic thread. We’ll keep this open for more questions, as it’s a strong topic for product and design teams evaluating their AI initiatives. Thanks, @menno, for pushing the thinking deeper into heuristics and how they apply today!

2 Likes

Thanks for a great article here @menno :fire: This made a lot of sense to me. Especially the idea that heuristics are guidelines and not strict rules.

We’ve seen with AI that it can sound very confident even when it’s wrong, which can be confusing (and problematic) for users. How would you make this practical? Would you actually measure things like trust, transparency, and being able to undo actions?

Looking forward to this!

2 Likes

@nathaliesmith, thank you! Personally, I don’t “like” undo as a feature… I guess I would prefer “redo”! Most of all, we need more clarity on certainty: showcase how confident the AI is. I really think that is one of the biggest things missing with AI. 1: it is new for us humans. 2: It’s somewhere between magic, awesome, and scary. 3: It doesn’t tell us what it does, what it knows, or what it doesn’t… So a large part is just the “adoption hurdle,” which time will deal with. But we need better principles for designing these experiences: principles about trust, accuracy, or even agency and power.

1 Like

Love this POV: “It’s somewhere between magic, awesome, and scary.” Hahaha, made me chuckle because it’s so true. How can something be SO freaky and unknown, but wildly useful at the same time?

Agree with everything you said & appreciate the perspective so much! Cheers! Can’t wait for your next feature :fire:

2 Likes

Just one of the heuristics I wanted to pull out of the article

Recognition Rather Than Recall
Let people pick from what they see instead of remembering what to type. Menus, thumbnails, autofill chips, recently used prompts, these reduce memory load.

I’m such a “search” power user: for Slack, for programming, for Tailwind docs, for pretty much anything.

Curious if I’m just an outlier :rofl:
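The “recently used prompts” chips from that heuristic could be sketched like this. The class name and `chips()` method are made up for illustration; the idea is just a small, de-duplicated most-recent-first list:

```python
from collections import deque

class RecentPrompts:
    """Keep a short, de-duplicated list of recent prompts so users can
    pick from what they see instead of retyping from memory."""

    def __init__(self, limit: int = 5):
        self.items = deque(maxlen=limit)  # oldest entries fall off automatically

    def add(self, prompt: str) -> None:
        # re-using a prompt moves it back to the front instead of duplicating it
        if prompt in self.items:
            self.items.remove(prompt)
        self.items.appendleft(prompt)

    def chips(self) -> list:
        return list(self.items)
```

Rendering `chips()` as clickable suggestions is the recognition-over-recall move: the interface remembers so the user doesn’t have to.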

1 Like