
The Pattern That Keeps AI Products From Turning Into Spaghetti

Andrew Williams · 8 min read

There is a particular kind of technical debt that only shows up once something is working.

You build an AI-powered system - a content engine, an outreach tool, a research pipeline. It does its job. Then someone asks: can we add lead generation to this? Or competitive monitoring? Or contract drafting?

And the answer, if you built it the normal way, is: yes, but it is going to be painful. Because the agents, the database schema, the sidebar navigation, the API routes - they are all coupled together in a way that made sense when there was only one use case and makes no sense now that there are two.

This is the unglamorous side of shipping AI products quickly. Moving fast is the right call. But if there is no structure underneath, every new capability becomes a surgery, not an addition.

Here is the pattern we reached for when we ran into this problem with a marketing engine we have been building.

The problem with monolithic AI pipelines

Most AI pipelines start life as a single workflow. There is an orchestrator that calls a handful of agents in sequence - researcher, strategist, writer, publisher - and the whole thing is wired together in one place.

That works fine for one use case. The problem starts when the system grows.

When we wanted to add a lead generation pipeline alongside the content pipeline, we had a choice: bolt it onto the existing orchestrator, or create a second parallel system that duplicated all the shared infrastructure. The first option makes the orchestrator a mess. The second means maintaining two versions of the same plumbing.

Neither is the right answer. The right answer is to make the system extensible before you need it to be.

What a capability registry is

A capability registry is not a framework. It is a convention and four functions.

The idea is that each distinct use case - content generation, lead generation, anything else - is a "capability." Each capability lives in its own folder, owns its own agents and database tables and API routes, and declares its own metadata (name, nav items, agent list) by calling registerCapability() at import time.

The core registry itself is about thirty lines:

// lib/capabilities/registry.ts
export interface CapabilityDefinition {
  id: string
  name: string
  description: string
  nav: { href: string; label: string; section: string }[]
  agents: { key: string; label: string; name: string; color: string }[]
  runType: string
}

const capabilities: CapabilityDefinition[] = []

export function registerCapability(def: CapabilityDefinition) {
  capabilities.push(def)
}

export function getAllCapabilities() {
  return capabilities
}

export function getAllNavItems() {
  return capabilities.flatMap(c => c.nav)
}

export function getAllAgents() {
  return capabilities.flatMap(c => c.agents)
}

That is genuinely it. There are no abstract base classes. No lifecycle hooks. No dependency injection. No config files that need updating every time you add a capability.

To register a capability, you create a file that calls registerCapability():

// lib/capabilities/leads.ts
import { registerCapability } from './registry'

registerCapability({
  id: 'leads',
  name: 'Lead Generation',
  description: 'Find, classify, and prepare outreach for potential leads',
  nav: [
    { href: '/leads', label: 'All Leads', section: 'Lead Generation' },
    { href: '/leads/generate', label: 'Generate Leads', section: 'Lead Generation' },
  ],
  agents: [
    { key: 'lead-finder', label: 'Lead Finder', name: 'finder', color: 'blue' },
    { key: 'lead-classifier', label: 'Lead Classifier', name: 'classifier', color: 'purple' },
    { key: 'pitch-generator', label: 'Pitch Generator', name: 'pitcher', color: 'green' },
  ],
  runType: 'lead-generation',
})

And then you add one import to a barrel file:

// lib/capabilities/index.ts
import './content'
import './leads'

The sidebar reads from getAllNavItems(). The agent runner reads from getAllAgents(). A new capability appears everywhere automatically.
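To make that concrete, here is a sketch of how a sidebar might consume getAllNavItems() - grouping the flat item list by section before rendering. The grouping helper and sample data are illustrative, not our actual sidebar code:

```typescript
// Hypothetical sidebar helper: group flat nav items by their section label.
type NavItem = { href: string; label: string; section: string }

function groupNavBySection(items: NavItem[]): Map<string, NavItem[]> {
  const groups = new Map<string, NavItem[]>()
  for (const item of items) {
    const bucket = groups.get(item.section) ?? []
    bucket.push(item)
    groups.set(item.section, bucket)
  }
  return groups
}

// With both capabilities registered, getAllNavItems() might return:
const navItems: NavItem[] = [
  { href: '/posts', label: 'All Posts', section: 'Content' },
  { href: '/leads', label: 'All Leads', section: 'Lead Generation' },
  { href: '/leads/generate', label: 'Generate Leads', section: 'Lead Generation' },
]

const grouped = groupNavBySection(navItems)
// One 'Content' section with one item, one 'Lead Generation' section with two.
```

The sidebar never imports a capability directly; it only sees whatever has been registered.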

What the capability actually owns

The registry handles discovery. The capability handles everything else.

A capability is a folder, not a class. For lead generation, that means:

Agents in lib/agents/leads/ - a finder that uses Claude with web search to identify potential leads, a classifier that scores and rates each one against your criteria, a pitch generator that produces a tailored angle and outreach draft for the strong leads. Each agent uses the same askClaude() and extractJson() helpers as the content agents. No new patterns to learn.

Database schema in db/schema-leads.ts - a leads table with company and contact information, classification scores and rationale, outreach drafts, and status tracking. Re-exported from the main schema file but owned by the capability.

API routes in app/api/leads/ - list, detail, and a generate endpoint that fires the pipeline and returns a run ID.

UI pages in app/leads/ - a list view with filters, a detail page with the full classification and pitch output, and a generation form that hands off to the existing agent session viewer once the run starts.

The content pipeline does not know any of this exists. The lead generation pipeline does not touch any content code. They share infrastructure (the run log, the agent runner, the SSE streaming) but own their own domains completely.
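One of those shared helpers, extractJson(), is worth a sketch. Model output often wraps a JSON payload in prose, so a helper along these lines - find the outermost JSON payload and parse just that slice - is a common shape. This is an assumption about what such a helper does, not our exact implementation:

```typescript
// Hypothetical sketch of a shared extractJson() helper. Models often wrap
// JSON output in prose or code fences, so we locate the outermost bracket
// pair and parse only that slice.
function extractJson<T>(text: string): T {
  const starts = ['{', '['].map(c => text.indexOf(c)).filter(i => i !== -1)
  const start = Math.min(...starts) // Infinity if no bracket found
  const end = Math.max(text.lastIndexOf('}'), text.lastIndexOf(']'))
  if (!Number.isFinite(start) || end < start) {
    throw new Error('No JSON payload found in model output')
  }
  return JSON.parse(text.slice(start, end + 1)) as T
}

// A lead-finder response might look like:
const raw =
  'Here are the leads:\n[{"company":"Acme","website":"acme.example"}]\nLet me know if you need more.'
const leads = extractJson<{ company: string; website: string }[]>(raw)
// leads[0].company === 'Acme'
```

Because every agent goes through the same helper, a new capability's agents inherit the same output-parsing behavior for free.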

The shared infrastructure changes

Adding a new capability does require widening a couple of shared pieces. These are small but deliberate.

The agentRuns table gets two new columns: runType (which capability owns this run) and result (a JSON blob for capability-specific output). Both are nullable, both are backwards compatible. Existing content runs continue to work without changes.

The RunEvent.agent type - previously a fixed union of the content agent names - gets widened to string, so lead generation agents can write to the run log without being listed in a hardcoded type definition.

That is the full scope of the shared infrastructure changes. Two columns and a type widening. Everything else is new files.
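In type terms, the widening looks roughly like this - fields beyond agent, runType, and result are illustrative, not the real schema:

```typescript
// Before: the run log's agent field was a closed union of content agents.
type ContentAgent = 'researcher' | 'strategist' | 'writer' | 'publisher'

// After: widened so any capability's agents can write run events.
type RunEvent = {
  agent: string    // was ContentAgent
  message: string  // illustrative field
}

// The run record gains two nullable, backwards-compatible columns.
type AgentRun = {
  id: string
  runType?: string | null                  // which capability owns this run
  result?: Record<string, unknown> | null  // capability-specific JSON output
}

// An existing content run without the new columns still typechecks:
const legacyRun: AgentRun = { id: 'run_1' }

// And a lead-generation event is now representable:
const event: RunEvent = { agent: 'lead-finder', message: 'Found 12 leads' }
```

Nullable columns mean no migration backfill; a closed union becoming string means no shared type file needs to know about new agents.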

The lead generation pipeline itself

The lead gen orchestrator follows the same fire-and-forget pattern as the content orchestrator.

It takes a search query and optional criteria, creates an agentRuns record with runType: 'lead-generation', and runs the pipeline:

  1. Lead Finder - Claude with web search, looking for companies and contacts that match the query. Returns a JSON array of raw leads with company name, website, contact details, and context.

  2. Lead Classifier - Takes each raw lead, scores it 0-100 against your ideal customer profile, assigns a classification (ideal, strong, possible, poor), and writes a rationale. This is where the brand context matters: what you are offering, who you are best suited to work with, what signals indicate a good fit.

  3. Pitch Generator - Runs on leads above a score threshold. Takes the classification rationale and company context, produces a pitch angle, sales notes, suggested service, and a first-touch outreach draft. Not a template - a tailored output based on what the classifier surfaced about that specific company.

  4. Leads are inserted into the database. The run record is updated with a summary result.

The whole pipeline streams through the existing SSE viewer at /agents/{runId}. If you have watched a content run generate and stream its output, lead generation looks exactly the same - because it uses exactly the same infrastructure.
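The four steps above can be sketched as a single async function. The agent calls and persistence are stubbed here - the real versions call Claude and write to the run log - but the control flow is the same shape:

```typescript
// Sketch of the lead-gen orchestrator's control flow; stubs stand in for
// the real agents and database writes.
type RawLead = { company: string; website: string }
type ClassifiedLead = RawLead & { score: number; classification: string; rationale: string }

const SCORE_THRESHOLD = 70 // illustrative cutoff for pitch generation

async function runLeadGeneration(query: string) {
  // 1. Lead Finder: Claude with web search (stubbed)
  const raw: RawLead[] = await findLeads(query)

  // 2. Lead Classifier: score each lead against the ideal customer profile
  const classified: ClassifiedLead[] = await Promise.all(raw.map(classifyLead))

  // 3. Pitch Generator: only for leads above the threshold
  const strong = classified.filter(l => l.score >= SCORE_THRESHOLD)
  const pitched = await Promise.all(strong.map(generatePitch))

  // 4. Insert leads, update the run record with a summary (stubbed)
  return { found: raw.length, pitched: pitched.length }
}

// --- stubs so the sketch is runnable ---
async function findLeads(_query: string): Promise<RawLead[]> {
  return [
    { company: 'Acme', website: 'acme.example' },
    { company: 'Globex', website: 'globex.example' },
  ]
}
async function classifyLead(lead: RawLead): Promise<ClassifiedLead> {
  const score = lead.company === 'Acme' ? 85 : 40
  return { ...lead, score, classification: score >= 70 ? 'strong' : 'possible', rationale: 'stub' }
}
async function generatePitch(lead: ClassifiedLead) {
  return { ...lead, pitch: `Tailored angle for ${lead.company}` }
}
```

Fire-and-forget means the API route kicks this off, returns the run ID immediately, and the client follows progress over SSE rather than waiting on the request.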

Why not a framework

The instinct when building extensible systems is to reach for a framework. Abstract base classes with required methods. A plugin interface. A lifecycle with hooks.

The problem with that approach at this stage is the overhead. Frameworks require documentation. They require every new capability to learn the interface. They create coupling in the opposite direction - the capability must conform to the framework, which means the framework must be designed carefully enough to accommodate capabilities you have not thought of yet.

The registry pattern sidesteps this. The convention is: put your agents in lib/agents/{id}/, your schema in db/schema-{id}.ts, your routes in app/api/{id}/, your pages in app/{id}/. Register metadata by calling one function. That is the entire interface.

When you want to add a third capability - say, contract generation, or competitor monitoring - you create the folder, write the code, and add one import. The sidebar updates. The agent runner sees the new agents. Nothing in the existing code changes.

Extensibility should feel like addition, not modification. The registry pattern enforces that without requiring a framework to hold it together.

The version you actually ship

There is a temptation, when writing about architectural patterns, to present a fully designed system and work backwards to the rationale. That is not how this happened.

The marketing engine started as a single pipeline because that is all it needed to be. The registry pattern was introduced when the second use case appeared - and it was introduced as the simplest possible thing that solved the problem, not a designed architecture to handle every future case.

The structure is light enough that adding a capability takes a few hours, not a few days. That is the measure that matters. Not elegance in the abstract, but friction in practice.

If you are building internal AI tooling and hitting the same wall - working pipelines that are painful to extend - the answer is usually not a framework. It is a convention, a registry, and a shared infrastructure that stays genuinely shared. The rest belongs to the capability.
