What does human-in-the-loop really mean in practice? (Q&A)

@patrizia_bertini you’re dropping some gold in here :joy:

“I personally think of LLMs as less than a junior. I see them as hyper educated interns from Mars. They are eager to please, never contradict you, and incredibly good at producing convincing outputs. But those outputs can look great until you actually engage with them and test them.”

Very accurate (and funny) analogy. I have to bring up my (now 2nd) favorite description of AI provided by @ben :

“AI is a highly efficient gap filler. The bigger the gap, the worse the result is.”

1 Like

Some name ideas for the thrill of receiving a deliverable in minutes that will then require hours of QC and assessment:

  1. Rapid Assembly Malaise
  2. Rapid Euphoric Dissonance
  3. Euphoric Assessment Delivery

@ben you got more?

1 Like

YES. Somebody has to. What’s interesting is that companies and large governmental organizations run into this problem as well. Who is responsible when large amounts of toxins are found in paper receipts?

Awesome article and great thread.

A while ago he had a thread on HITL, HATL, HOTL, and HBTL (human in/at/on/behind the loop).

Has anyone found any good operating models/decision models for when and where to put the human?

1 Like

Seems like a fertile area to explore, especially when bringing in metrics to support judgment.

Most framework documents are either simple like this, or going more into the technical flow charting. @menno, are you thinking about the actual human reasoning at each step?

2 Likes

No, I was thinking more about the “in the loop”, “on the loop”, “above the loop” etc. distinctions…

Like if trust is high, focus on “on or above” or even “behind” the loop.

Or the larger an individual task, the more likely the human should be “in” the loop.

So a decision model for “where to put the human”. This may well vary per line of business and/or use case, but there must be generic principles: when the tasks go into the tens of thousands, you want to see a report; when there are few but “expensive” tasks, you want to be close, etc…

So, if volume goes up, human necessity goes down.

If risk goes up, human necessity goes up.
If trust is high, human need goes down, at least until a certain threshold?
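The heuristics above could be sketched as a tiny routing function. This is only an illustration of the idea; the function name, oversight labels, and all thresholds (0.7 risk, 10,000 tasks, 0.8 trust) are hypothetical, not an established model:

```python
# Minimal sketch of the "where to put the human" heuristics described above.
# All names and threshold values here are illustrative assumptions.

def place_human(volume: int, risk: float, trust: float) -> str:
    """Pick an oversight mode from task volume, risk (0-1), and trust (0-1)."""
    # High risk always pulls the human closer to each individual task.
    if risk >= 0.7:
        return "in the loop"      # review each item before it ships
    # At very high volume, per-item review is unrealistic: report instead.
    if volume >= 10_000:
        return "behind the loop"  # periodic summary reports and audits
    # High trust lets the human step back to monitoring by exception.
    if trust >= 0.8:
        return "on the loop"      # monitor, intervene only on exceptions
    return "in the loop"

print(place_human(volume=50, risk=0.9, trust=0.9))      # high risk wins
print(place_human(volume=50_000, risk=0.2, trust=0.6))  # volume pushes human back
```

Even a toy version like this makes the trade-offs explicit and debatable per line of business, which is arguably the point of having a decision model at all.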

1 Like

Here are a couple of research-oriented approaches, though I’m not seeing practical examples:

https://repository.lsu.edu/cgi/viewcontent.cgi?article=1464&context=honors_etd

2 Likes

Cheers and thanks for this topic!
The distinction between outputs and outcomes feels critical here. AI can flood teams with plausible outputs, but outcomes only emerge once someone applies judgment, context, and consequence thinking. When that layer is underdefined, responsibility is likely to quietly evaporate.

What’s tricky is scale. As volume increases, it becomes unrealistic to put humans “in the loop” for everything. That suggests the real design challenge isn’t whether humans are involved, but where judgment is most valuable and where it meaningfully changes risk.

I’m increasingly convinced we need clearer models for:

  • Which decisions are reversible vs irreversible

  • Where bias or harm would be most costly

  • What signals justify slowing the system down

Without that, HITL risks becoming theater — comforting, but ineffective.
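The three checks above could be expressed as a gate in front of an automated action. A minimal sketch, assuming hypothetical names and thresholds throughout (`Decision`, `harm_cost`, the 0.7 and 0.5 cut-offs are all invented for illustration):

```python
# Hypothetical gate implementing the three checks above: reversibility,
# cost of harm, and signals that justify slowing the system down.
from dataclasses import dataclass

@dataclass
class Decision:
    reversible: bool      # can this action be undone if it turns out wrong?
    harm_cost: float      # estimated cost of bias/harm if wrong, 0-1
    anomaly_score: float  # signal strength that something is off, 0-1

def requires_human_review(d: Decision, slow_down_threshold: float = 0.5) -> bool:
    """Escalate to a human when a decision is irreversible, costly, or anomalous."""
    if not d.reversible:
        return True  # irreversible decisions always get a human
    if d.harm_cost >= 0.7:
        return True  # costly harm gets a human even if reversible
    # Otherwise, only slow down when an anomaly signal justifies it.
    return d.anomaly_score >= slow_down_threshold
```

The design choice here is that reversibility acts as a hard override while anomaly signals are tunable, which keeps the human out of the high-volume reversible path without turning HITL into theater.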

5 Likes

Good stuff @jonathon.thomason_m2! Went digging a bit more to see which frameworks align with judgment and trade-offs. These are not specific to AI, but overlap from other disciplines.

Leader Character Framework
Defines effective leadership as rooted in a set of core character dimensions that shape how leaders think, decide and act to achieve sustained excellence. Not specifically AI, but a core group of ideas to support leadership.

Human-Machine Teaming (HMT)
Refers to the formation of collaborative partnerships between humans and artificially intelligent systems, characterized by shared goals, coactive problem-solving, mutual awareness, and synergistic adaptation.

The 30% Rule of AI: Automate a Third, Amplify the Rest
Guideline in AI integration where 30% of routine work is automated, and 70% is left for human judgment to handle tasks requiring creativity, context, and empathy. Not sure on this one… is this a balance that makes sense now?

Comparative Judgement (CJ)
Used primarily in education; combines the strengths of human judgment with the speed of Artificial Intelligence.

AI Debate (Adversarial) Framework
A method where AI systems argue opposing sides of a topic to surface mistakes, bias, and weak reasoning.

The Seven-Eyed Model
A supervision model for counsellors and psychotherapists with seven areas of focus.

Ultimately, it seems we will need to create new patterns @menno, but there are a lot of great areas to pull from.

3 Likes