Understanding How Your AI Agent Acts

AI agents are easy to use and hard to reason about.

For product and design leaders, that's a risky combination. When something goes wrong (and it will), users won't blame the model. They will blame your product.

I have been reading a lot about the Claude Code architecture and OpenAI's agent documentation, and building a few agents myself to understand how these systems behave. Along the way, I started mapping out a simple mental model for how agents think, loop, and execute.

This post's goal isn't to fully break down everything at the implementation level. It's more about giving non-engineering leaders enough background to evaluate agent behavior. It should help you spot weak assumptions and avoid shipping systems your team may not fully understand.

LLMs vs AI Agents

As a product leader deep in the software world, you need to understand the difference between these two technologies. They are NOT the same. When your engineering department comes to you talking about an LLM and an Agent, they’re talking about two different things.

The core technology behind any AI system is the LLM, or Large Language Model. Simply put, it takes inputs (from users) and produces outputs (funnily enough, what it actually outputs are numbers, aka tokens, that get translated into human-readable text).

With this, you can talk to the LLM. This is how chat tools like ChatGPT work. They’re dead simple in this way. You talk, the LLM responds.
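
To make that concrete, here's a minimal sketch of the single-turn pattern. `call_llm` is a hypothetical stand-in for whatever provider SDK your team uses; it's stubbed out so the shape is visible without any real API.

```python
# Minimal sketch of the single-turn LLM pattern: one prompt in, one response out.
# `call_llm` is a hypothetical stand-in for a real provider SDK call.

def call_llm(prompt: str) -> str:
    # A real implementation would send the prompt to a hosted model
    # and return the decoded text. Stubbed here for illustration.
    return f"(model response to: {prompt!r})"

answer = call_llm("Summarize our onboarding research in three bullets.")
print(answer)  # one pass, no tools, no memory beyond this prompt
```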

This can be useful in many contexts: getting direct answers to questions, finding information on the web, and so on. However, AI agents take this to the next level.

The Agent Loop

Core to the architecture of an Agent is the Agent Loop: the LLM runs more than once. "Why do I even need to know this?" you might be asking yourself. Well, you want to help your team perform, right? Then you have to understand what an Agent "Loop" entails.

Let’s start off with something you might have more familiarity with (hopefully). The human brain.

When you're doing any sort of task, what's really happening in that thick skull of yours? Our brains are very complex: billions of neurons firing at any given moment, loads of filtered context being pulled in, where you are, who you are, why you're doing this in the first place, and checks and balances telling you to stop when it makes sense to.

Agents are the same.

We're trying to do the exact same thing with an Agent Loop: applying checks and balances, pulling in context, consuming documents to understand its surroundings, and so on. To do all of this, the agent needs the ability to pull that context in itself: prompting the task-giver with questions, looking for documents online, exploring the current codebase's structure, and applying its own checks and balances against the outputs it's producing.
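
Here's a rough sketch of what "the LLM runs more than once" looks like in code. Everything in it is hypothetical: a stubbed `call_llm`, two toy tools, and a crude stop condition. Real agent loops are messier, but the shape is the same.

```python
# A rough sketch of an agent loop, with a stubbed model call and two toy tools.
# Real frameworks (Claude Code, OpenAI's agent tooling) are far more involved.

def call_llm(prompt: str) -> str:
    return f"(model decision for: {prompt[:40]!r}...)"  # stand-in for a real model call

def read_file(path: str) -> str:
    return f"(contents of {path})"                      # toy tool: pull a document into context

def ask_user(question: str) -> str:
    return f"(user's answer to {question!r})"           # toy tool: prompt the task-giver

TOOLS = {"read_file": read_file, "ask_user": ask_user}

def run_agent(task: str, max_steps: int = 10) -> str:
    context = [f"Task: {task}"]
    for _ in range(max_steps):                   # the loop: the LLM runs more than once
        decision = call_llm("\n".join(context))
        if decision.startswith("DONE:"):         # the agent's own check that it's finished
            return decision
        tool_name, _, arg = decision.partition(" ")
        if tool_name in TOOLS:
            context.append(TOOLS[tool_name](arg))  # feed new context back into the next pass
        else:
            context.append(decision)
    return "Stopped: step budget exhausted"      # a hard check-and-balance on runaway loops

print(run_agent("Summarize the current codebase structure"))
```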

Ultimately, just as we figured out how to fly by studying birds, we're doing the same thing with AI. An Agent is one step closer to a human than a bare LLM.

Here’s a diagram I whipped up to help contextualize what an Agent Loop consists of:

This specific Agent loop is more closely related to Claude Code’s structure, but I think we can apply it to a lot of other areas, not just programming.

Here are the core phases I was able to pull out (with a rough code sketch after the list):

  • The Reasoning Phase - The initial pass of the LLM, where it takes in the current high-level context, understands the user's ask, and reasons about what needs to happen next.

  • The Exploration Phase - The phase after initial reasoning that allows the agent to build more context, ask questions, and put together an execution plan.

  • The Execution Phase - The final phase, where the agent acts on the plan, building on its own work, adding context, and attempting to solve the problem.
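
To make the phases a bit more tangible, here's a toy sketch of how they might show up in one run. The function names are mine, not from any real framework, and in practice the phases interleave rather than running strictly in order.

```python
# Hypothetical sketch mapping the three phases onto one run.
# Uses a stubbed call_llm; real agents interleave these phases constantly.

def call_llm(prompt: str) -> str:
    return f"(model output for: {prompt[:40]!r}...)"

def handle_request(user_ask: str) -> str:
    # Reasoning: first pass over the ask and the high-level context.
    plan = call_llm(f"Understand this ask and outline next steps: {user_ask}")

    # Exploration: build more context before committing to work.
    findings = [call_llm(f"Gather context for: {step}") for step in plan.splitlines()]

    # Execution: act on the plan, building on what was found.
    return call_llm("Execute the plan using:\n" + "\n".join(findings))

print(handle_request("Refactor the onboarding flow"))
```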

These all come into play as we look at an agent from a higher level. The diagram below shows all of the different levers that AI agents can take advantage of:

There are three core spaces that we can think about here (sketched as a configuration after the list):

  • The Core LLM - The model has its own set of core capabilities that allow it to effectively navigate the world. Compute, context limits, training data. Most of the time, you won’t touch any of this (unless you have billions of $$$ to spend). You just need to understand it.

  • Agent Context - What the Agent can "see". It's super important to manage this effectively: a badly managed context is a badly managed agent. Our brains consume relevant information depending on the situation; agents need to work the same way.

  • Agent Capabilities - These are the tools the agent is either provided by default, or extended with through developer-written MCP (Model Context Protocol; a bit like an API, but AI-specific) tools, execution access to resources, and other means of acting such as APIs, uploaded documents, etc.
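
One way to picture the three spaces together is as a single configuration. Everything below is hypothetical (the model name, the MCP server, the file names); the point is which knobs you actually control.

```python
# Hypothetical configuration showing the three spaces side by side.
# None of these names refer to a real product or server; they're placeholders.

agent_config = {
    "core_llm": {                        # mostly fixed: you pick a model, not its training
        "model": "some-hosted-model",
        "context_window_tokens": 200_000,
    },
    "context": {                         # what the agent can "see" for this task
        "system_prompt": "You are a support triage agent.",
        "documents": ["escalation_policy.md", "current_ticket.txt"],
    },
    "capabilities": {                    # what the agent can actually do
        "built_in_tools": ["read_file", "web_search"],
        "mcp_servers": ["internal-ticketing"],   # hypothetical MCP server
    },
}

print(agent_config["capabilities"]["built_in_tools"])
```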

Take a step back, take a deep breath in, and take another look at those diagrams. How does everything connect?

All of these spaces have to be considered in order to create effective AI agents. Not knowing that a model's context window can only hold a limited number of tokens, for example, can severely limit how effectively you leverage context.
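
As a toy illustration of why that limit matters, here's a crude sketch that treats one word as roughly one token (real tokenizers differ) and drops the oldest context once the budget is full.

```python
# Toy illustration of a context budget, using "one word ≈ one token" as a rough proxy.
# Once the window is full, something has to be dropped or summarized.

def fit_to_budget(messages: list[str], budget_tokens: int) -> list[str]:
    kept, used = [], 0
    for message in reversed(messages):     # favor the most recent context
        cost = len(message.split())
        if used + cost > budget_tokens:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))

history = [
    "system: you are a planning agent",
    "user: here is a very long product brief ...",
    "assistant: summary of the brief",
    "user: now draft the rollout plan",
]
print(fit_to_budget(history, budget_tokens=12))  # oldest messages fall out first
```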

Not knowing these fundamentals is ultimately a foot-gun when you want to make improvements, or simply build an AI Agent that actually works. Knowing them lets you create solutions that build the agent up instead of dragging it down.

1 Like

I like this breakdown, and it’s interesting to read how you are considering the different pieces.

So… should the average worker be more invested in learning about agents?

1 Like

Love this @ben! Excited that I can talk to you later, because those charts are impenetrable for me. (No shade, I've seen multiple ones that look very close to these.)

1 Like

Yes, because it's a harder problem, with higher value potential. LLMs on their own (non-agentic) only really matter in a chat interface, without much room for any sort of actions they can actually perform.

With just an LLM, you're not solving for anything new or particularly worthwhile.

If you could create better ones, I'd definitely leverage them! (You should see the larger ones this came from lol)

I wanted to open up the conversation a bit more here to gauge interest:

  • Do people here care how agents work?
  • Is AI a good topic to be talking about?
  • Does the difference between LLMs, VLMs, and Agents actually matter to product leaders?

My hunch is that this will be a progressively more important conversation over the next year, but something tells me teams are still trying to wrap their heads around basic workflows and handoffs without agents.

I know @menno, @nikhil_mahen, @sean_savage, @schuboxaz, @Kike_Pena, @greg_nudelman are heavily invested in this work. Curious on your thoughts.

1 Like

So to clarify @ben, LLMs just produce an output from an input, while Agents account for context before giving their output?

Or in other words, is it like training a dog? A dog with no training (LLM) poops whenever and wherever it needs to go after it has eaten, while a trained dog (Agent) gets the same 'inputs' but knows to poop outside?

Such interesting questions, @ben. Here are some answers from my experience:

  1. People (users) shouldn't have to bother understanding how a product works; they just expect it to work before they use it or eventually pay for it. Maybe in some conversations it's helpful to use these terms, but it's not a familiar language for engaging with people; that practice makes us look niche and distant. For people building products, though, I think that understanding is essential to creating experiences.

  2. AI is the new core of technology, so yes, it’s a topic that shouldn’t be overlooked. In general, our reality is being affected by it.

The distinction between LLMs, VLMs, and Agents, I would say, matters when building something; understanding the technology gives you the power to strategize and make decisions. You don’t have to excel at everything, but being conscious of the technology is good enough.

2 Likes

Mostly! Here’s the difference:

An Agent can dynamically retrieve context without the help of a user. Basically, it can "feed" itself when it needs to. Example: if you don't know how to write posts, you do a web search and read through the pages. Agents can do the same.

An LLM can have the context, but it needs assistance from a user and/or a program (like how we've been injecting context such as images and knowledge into our prompts in the Design Analysis tool). Example: an LLM can't open a book and read it if it needs to learn something. (Of course, ChatGPT these days will do this, but I'd argue that's the agentic piece that's running.)

With your example, an LLM can be a pre-trained dog (if the training that goes into the core of the model overlaps, OR if you tell it how to be a good dog in its prompt), but if it doesn't know a trick, it can't really teach itself. Being an Agent gives it the capability to fill this type of gap.
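
Here's a tiny sketch of that difference, with hypothetical helpers (a stubbed model call and a toy web_search). The bare-LLM path only knows what we inject up front; the agent path can decide it's missing something and go get it.

```python
# Hypothetical sketch of "injected context" (LLM) vs "self-fed context" (Agent).

def call_llm(prompt: str) -> str:
    return f"(model output for: {prompt[:40]!r}...)"   # stand-in for a real model call

def web_search(query: str) -> str:
    return f"(top results for {query!r})"              # toy stand-in for a search tool

def llm_answer(question: str, provided_context: str = "") -> str:
    # LLM path: everything it knows about the task is handed to it up front.
    return call_llm(f"{provided_context}\n\nQuestion: {question}")

def agent_answer(question: str) -> str:
    # Agent path: it can notice a gap and "feed" itself before answering.
    lookup = call_llm(f"What would you need to look up to answer: {question}")
    found = web_search(lookup)
    return call_llm(f"Context: {found}\n\nQuestion: {question}")

print(llm_answer("How do I write a good forum post?", "tips we pasted in manually"))
print(agent_answer("How do I write a good forum post?"))
```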

Love the response @Kike_Pena!

I also agree that users shouldn’t be obligated to learn about agents at all (unless of course they want to).

1 Like

Wanted to follow up on this because I realized I didn’t talk about what the truly “average” worker should be focused on.

They don’t have to understand exactly how an Agent works, but they should have a high-level idea about what it can do.

Ultimately, if things keep trending the way they are now, we'll see a shift in how people need to think about work. A far wider range of patterns can now be automated, which means there's a huge amount of chaos to follow, much more than in previous tool revolutions.

Either we'll see governments start throwing the hammer down to slow the progress so that 90% of the workforce doesn't get automated out over the next few decades, or they'll lean in, but then they'll have to manage the stress that a large amount of chaos puts on society.

Why do you think there is so much apprehension around the adoption?

1 Like

I'd argue it's that the rate of change is so quick that people are unaware of what's available and what changes they need to make. The winners right now are making huge bets and hedging across a bunch of areas.

2 Likes

It’s scary and there’s resentment around the idea of robots taking jobs.

Yep, it's absolutely crazy. Surprised Claude is not considered here; I actually think they're starting to lead the way when it comes to automation.

I always go back to the rise and fall of switchboard operators. Jobs will be lost for sure. The question is, are they replaced by something else?

This is where it gets tricky. It’s hard for me to imagine a functioning society where 10% of people work to support everyone else.

In the very near term, I think people with a systems-thinking mindset will need to go even deeper into the systems that sustain a product or an organization.

While I have highlighted these as linear flows in my post, these systems are about to get crazily complex, with more feedback loops than anyone can keep track of.

1 Like

Agreed. I think ultimately though anyone with agency and a need for speed (to work lol) will end up leaning into meta systems thinking, and I think it’ll also be more fun and impactful than ever.

:light_bulb: Perhaps controversial: I watched about half of that interview with Musk and was totally wigged out. I'm unsure how peeps are excited / onboard with the idea that AI will take up so many jobs. Seems like we have the ability not to lean into that idea as a society, but what do I know. LOL

1 Like

I've heard that we'll probably see a fork in what work becomes. I think there's still a pretty big role for people in managing it.

Take a lawyer, for example: yes, AI could do 90% of the research, but ultimately people want a specialist to fall back on, and to blame, when something goes wrong.

Plus, there are still plenty of AI gaps. Hallucinations are still a persistent issue. Same with the models' propensity to lie.

1 Like