AI Does More Than You Think (But It Needs Your Help)

General LLMs can do a lot, but as we all know, they fall down hard in more specialized/complex areas.

Anthropic is doing a great job at creating Claude models that are fantastic at programming. All of the toolsets, harnesses, model training, and user-facing tools point towards programming.

Claude isn’t as good as ChatGPT at:

  • Helping you with daily health routines
  • Solving business problems (more or less)
  • Supporting your interpersonal communication skills

…and a bunch of other more general information and tasks, which makes complete sense.

What I see a lot of companies doing is specializing agents (i.e., creating a harness, training, etc. around a particular vertical), and some are having a lot of success doing so (Chatbase, a customer-facing AI harness, hitting 10m ARR).

Skills, I think, are one of those areas where you can give expertise to models. They supply what I like to call JIT (just-in-time) information, which guides the LLMs/agents and makes them more vertically inclined towards a specific skill set.
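To make the JIT idea concrete, here is a minimal sketch of loading skill instructions into a prompt only when the request calls for them. Everything in it is an assumption for illustration: the `Skill` class, the keyword triggers, and the sample skill text are hypothetical and are not how Glare or any particular agent harness actually selects skills.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    triggers: tuple          # hypothetical keywords that make the skill relevant
    instructions: str        # domain knowledge injected only when needed


def select_skills(request, skills):
    """Return the skills whose trigger words appear in the request."""
    text = request.lower()
    return [s for s in skills if any(t in text for t in s.triggers)]


def build_prompt(request, skills):
    """Prepend only the matching skills' instructions to the user request."""
    chosen = select_skills(request, skills)
    sections = [f"## Skill: {s.name}\n{s.instructions}" for s in chosen]
    return "\n\n".join(sections + [f"## Request\n{request}"])


# Hypothetical skills; the instruction text is placeholder content.
SKILLS = [
    Skill("user-needs", ("friction", "feedback"),
          "Map each signal to a user need before proposing fixes."),
    Skill("concept-testing", ("concept", "prototype"),
          "Compare concepts against a baseline score."),
]

prompt = build_prompt("Users report friction during checkout", SKILLS)
```

The point of the sketch is that the model's context stays small: only the expertise relevant to the current request gets pulled in, rather than every skill on every call.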

Normally, LLMs suck at doing things that ZURB is great at: defining business problems, aligning problem spaces, exploration, etc. But with the power of Glare, and the open source AI Skills that we’ve built for it, we’ve been able to emulate a decent amount of the quality and thinking that ZURB brings. It’s actually super cool to see in practice.

I’m really intrigued to see if other companies/teams start taking this direction as well, compacting their domain knowledge into more flexible, AI-usable formats.

Great stuff, Ben.

“That means we give the model the context, tools, workflows, guardrails, and human-in-the-loop systems to be the best ambassador for your brand”

Ok, for the nerds out there (@nikhil_mahen), here’s how we’re thinking about this with Glare.

  • Inputs
  • Validation Rules
  • Confidence Evaluation
  • Classification Constraints
  • Routing Logic
  • Output Contract
  • Escalation Conditions
  • Next Workflow State
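For anyone who wants to see the stages listed above as code, here is a rough sketch of a single decision block. It is only an illustration under stated assumptions: the labels, the routes, the 0.7 threshold, the `run_block` function, and the `fake_classifier` stand-in for a model call are all hypothetical, not Glare's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Optional

# All values below are hypothetical placeholders.
REQUIRED_INPUTS = ("signal", "source")
ALLOWED_LABELS = {"navigation", "trust", "clarity"}
ROUTES = {"navigation": "ia_review", "trust": "content_review", "clarity": "copy_review"}
CONFIDENCE_THRESHOLD = 0.7


@dataclass
class BlockResult:
    """Output contract: every block returns the same shape."""
    label: Optional[str]
    confidence: float
    next_state: str
    escalate: bool
    notes: list = field(default_factory=list)


def run_block(inputs, classify):
    # Inputs + validation rules: refuse to run on incomplete input.
    missing = [k for k in REQUIRED_INPUTS if not inputs.get(k)]
    if missing:
        return BlockResult(None, 0.0, "needs_input", True,
                           ["missing: " + ", ".join(missing)])

    # Confidence evaluation: classify() stands in for a model call.
    label, confidence = classify(inputs["signal"])

    # Classification constraints: only known labels are accepted.
    if label not in ALLOWED_LABELS:
        return BlockResult(None, confidence, "human_review", True,
                           [f"unknown label: {label}"])

    # Routing logic, then escalation conditions, then next workflow state.
    route = ROUTES[label]
    escalate = confidence < CONFIDENCE_THRESHOLD
    next_state = "human_review" if escalate else route
    notes = ["low confidence"] if escalate else []
    return BlockResult(label, confidence, next_state, escalate, notes)


def fake_classifier(signal):
    """Hypothetical classifier returning (label, confidence)."""
    return ("trust", 0.9) if "secure" in signal else ("clarity", 0.4)


ok = run_block({"signal": "is this secure?", "source": "survey"}, fake_classifier)
low = run_block({"signal": "confusing label", "source": "survey"}, fake_classifier)
bad = run_block({"signal": "no source given"}, fake_classifier)
```

Here `ok` routes straight to its next state, `low` is sent to human review because its confidence is under the threshold, and `bad` never reaches the classifier because a required input is missing.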

We’re building each decision block to include agent operations. Curious whether people have experience with these details and any learnings from diving deep into the workflows.

  • Required Inputs
  • Output Contract
  • Confidence Thresholds
  • Classification Constraints
  • Failure Modes
  • Transition Logic
  • Escalation Logic
  • Ambiguity Handling
  • Guardrails
  • Multi-Agent Workflows
  • Human Judgment

Diving deeper into these topics, it seems the Anthropic team is really leading the way here: https://www.anthropic.com/engineering


Yup. At different levels of granularity you can have the system architecture, process map, user journeys etc.

What you are trying to do will be awesome. Making it generic will be challenging though.

Maybe look at the architectural map.

That might be a bit more forgiving of generalisation.


🙂 We’re a bit ambitious.

We just got done rewriting the User Needs section in the Decision Map (haven’t updated the online documentation yet). The Skills are being built from these docs, and we have a Master router. The goal is to create tighter integration so that they are building on each other.
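As a rough picture of skills building on each other through a router, here is a small sketch. It is hypothetical throughout: the skill functions, the `REGISTRY`, the `"next"` hand-off key, and `master_router` itself are assumptions for illustration and do not describe how the actual Master router works.

```python
def user_needs(ctx):
    """Hypothetical skill: names a user need, then hands off to the next skill."""
    return {**ctx, "need": "clarity", "next": "concept_testing"}


def concept_testing(ctx):
    """Hypothetical skill: builds on the need identified by the previous skill."""
    return {**ctx, "concepts": [f"concept for {ctx['need']}"], "next": None}


# Registry of available skills, keyed by name (names are placeholders).
REGISTRY = {"user_needs": user_needs, "concept_testing": concept_testing}


def master_router(start, ctx, max_steps=5):
    """Run skills in sequence, passing each skill's output to the next."""
    current = start
    trail = []
    for _ in range(max_steps):      # guard against skills that loop forever
        if current is None:
            break
        if current not in REGISTRY:
            raise KeyError(f"unknown skill: {current}")
        trail.append(current)
        ctx = REGISTRY[current](ctx)
        current = ctx.pop("next", None)   # the skill decides where to go next
    return {**ctx, "trail": trail}


result = master_router("user_needs", {"signal": "users confused by pricing page"})
```

Each skill only adds to the shared context, so a later skill can rely on what an earlier one produced, and the `trail` records which skills ran and in what order.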

I think it’s also helpful to have good, original data to start from, like the UX metrics scores and the concept testing.

Here’s how we’re thinking about the Blocks, like the User Need pages:

  • Overview: Explains how User Needs helps teams turn signals, feedback, and behavior into clearer understanding of user friction and experience gaps.

  • Techniques: Covers methods for interpreting behavior, recognizing patterns, and identifying the underlying conditions driving user friction.

  • Playbook: Provides the workflows, prompts, inputs, and outputs teams and AI Skills use to structure user needs operationally.

  • References: Provides definitions, signals, UX metrics, heuristics, and evaluation guidance for all 20 user needs.

  • Examples: Shows realistic situations, emerging hunches, and near-miss cases that help teams and AI systems recognize user needs in practice.

  • Decisions: Helps teams identify their situation, determine where to start, and choose the next step that reduces uncertainty.

  • Agent Operations: Defines how AI Skills should behave operationally through routing, confidence rules, escalation logic, output contracts, and ambiguity handling.
