The `(ctx, next)` pattern for AI agents

Fast and minimalist agentic AI framework in TypeScript.
Only three concepts: Agent, Session, Middleware.
The middleware pattern from Express, Koa, and Hono, applied to building AI agents. Everything beyond the core agent loop (retries, cost tracking, memory, tools, guardrails, and more) is middleware.

const { text } = await new Agent({
  model: "anthropic/claude-sonnet-4-6",
  instructions: "You are a helpful assistant.",
})
  .use(observe.usage())
  .run("What is the meaning of life?").result

const agent = new Agent({
  model: "anthropic/claude-sonnet-4-6",
  instructions: "You are a weather assistant.",
})

agent.use(tools.function({
  name: "get_weather",
  description: "Get weather for a city",
  schema: z.object({ city: z.string() }),
  execute: async ({ city }) => `Weather in ${city}: 72°F`,
}))

const { text } = await agent.run("Weather in Tokyo?").result

const { text, state } = await new Agent({
  model: "anthropic/claude-sonnet-4-6",
  instructions: "You are a helpful assistant.",
})
  .use(observe.usage())              // token tracking
  .use(guard.budget({ limit: 1 }))    // $1 cost cap
  .use(model.retry({ maxRetries: 3 })) // auto-retry
  .run("What is the meaning of life?").result

console.log(state["observe:usage"])  // { inputTokens, outputTokens }


Try it now

$ npx create-agent-express --template research
✓ Creating research agent...
✓ Installing dependencies...
✓ Project ready at ./research-agent

Run: cd research-agent && npx agent-express dev

Built for production

Works with Anthropic, OpenAI, Google, Mistral, Groq, DeepSeek, and any other AI SDK provider.

600+ tests · TypeScript strict · MIT license · ESM only · Node.js 20+

🛡 Guardrails: Input/output validation
💰 Cost Tracking: Per-session budget caps
👤 Human-in-the-Loop: Tool approval gates
🔍 Search & RAG: Knowledge base + web search
🔐 PII Protection: Redact & restore for tools
💾 Session Persistence: SQLite, Redis, Postgres
🧠 Memory: Context compaction
🔗 MCP: Model Context Protocol
🔀 Model Routing: Complexity-based
📈 Observability: Logs, OTel metrics & traces
⚡ Streaming: Real-time SSE events
🔧 Testing: Mock models, replay, snapshots
📜 Structured Output: Zod-validated JSON

Everything is middleware

Every feature above is a (ctx, next) middleware. Same pattern, 7 namespaces.
Or build your own — same interface, infinite possibilities.
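As a rough illustration of what "same interface" means, here is a minimal, self-contained sketch of the (ctx, next) pattern with Koa-style composition. The `Ctx`, `Next`, `Middleware`, and `compose` names are hypothetical and defined only for this example; they are not agent-express's actual types, and its real context object may look different.

```typescript
// Hypothetical types for illustration only; not agent-express's real API.
type Ctx = { input: string; output?: string; state: Record<string, unknown> };
type Next = () => Promise<void>;
type Middleware = (ctx: Ctx, next: Next) => Promise<void>;

// Koa-style composition: each middleware wraps everything registered after it.
function compose(stack: Middleware[]): (ctx: Ctx) => Promise<void> {
  const dispatch = async (i: number, ctx: Ctx): Promise<void> => {
    const fn = stack[i];
    if (!fn) return; // end of the stack
    await fn(ctx, () => dispatch(i + 1, ctx));
  };
  return (ctx) => dispatch(0, ctx);
}

// A custom middleware: time everything downstream and record it in state.
const timing: Middleware = async (ctx, next) => {
  const start = Date.now();
  await next();
  ctx.state["timing:ms"] = Date.now() - start;
};

// Stand-in for the innermost step (a real agent would call the model here).
const echo: Middleware = async (ctx) => {
  ctx.output = `echo: ${ctx.input}`;
};

async function demo(): Promise<Ctx> {
  const ctx: Ctx = { input: "hello", state: {} };
  await compose([timing, echo])(ctx);
  return ctx;
}
```

Code before `await next()` runs on the way in, and code after it runs on the way out. That wrapping is what lets one function cover concerns such as timing, logging, or guards.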

Namespaces: guard, search, observe, model, memory, tools, dev

agent.use(guard.budget({ limit: 5 }))     // $5 USD cost cap
agent.use(guard.approve({ tool: "email" })) // require approval
agent.use(guard.timeout({ turn: 30_000 })) // 30s timeout

agent.use(search.file({ retrieve: myRetriever })) // RAG — knowledge base search
agent.use(search.web({ provider: braveProvider() }))  // web search tool

agent.use(observe.log())      // structured JSON logging with levels
agent.use(observe.metrics())  // OpenTelemetry metrics (Prometheus, OTLP)
agent.use(observe.traces())   // OpenTelemetry distributed tracing
agent.use(observe.usage())    // token tracking → state['observe:usage']
agent.use(observe.tools())    // tool call recording
agent.use(observe.duration()) // turn timing

agent.use(model.retry({ maxRetries: 3 }))  // exponential backoff
agent.use(model.router({                    // complexity routing
  tiers: [{ model: "anthropic/claude-haiku-4-5-20251001", maxTokens: 100 }]
}))

agent.use(memory.store({ backend: sqliteStore() }))  // session persistence
agent.use(memory.compaction({                       // context window management
  strategy: "hybrid",
  maxTokens: 8192,
}))

agent.use(tools.function({
  name: "get_weather",
  description: "Get weather for a city",
  parameters: z.object({ city: z.string() }),
  execute: async ({ city }) => `Weather in ${city}: 72°F`,
}))

agent.use(dev.console()) // full lifecycle terminal trace
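To make the "exponential backoff" comment on `model.retry` concrete, here is a sketch of how a retry middleware could be written in the (ctx, next) style. It is an assumption about the general technique, not the library's implementation: the `Ctx` shape, the `retry` function, and the `baseDelayMs` option are all hypothetical.

```typescript
// Hypothetical shapes; not agent-express's real types.
type Ctx = { attempts: number };
type Next = () => Promise<void>;
type Middleware = (ctx: Ctx, next: Next) => Promise<void>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry everything downstream, doubling the delay after each failure.
function retry(opts: { maxRetries: number; baseDelayMs?: number }): Middleware {
  const base = opts.baseDelayMs ?? 100;
  return async (_ctx, next) => {
    for (let attempt = 0; ; attempt++) {
      try {
        await next();
        return; // downstream succeeded
      } catch (err) {
        if (attempt >= opts.maxRetries) throw err; // retries exhausted
        await sleep(base * 2 ** attempt); // base, 2x base, 4x base, ...
      }
    }
  };
}

// Simulated flaky model call: fails twice, then succeeds on the third attempt.
async function demo(): Promise<number> {
  const ctx: Ctx = { attempts: 0 };
  const flakyModelCall: Next = async () => {
    ctx.attempts++;
    if (ctx.attempts < 3) throw new Error("transient failure");
  };
  await retry({ maxRetries: 3, baseDelayMs: 1 })(ctx, flakyModelCall);
  return ctx.attempts;
}
```

Because the retry logic only sees `next()`, it does not need to know whether the downstream step is a model call, a tool call, or another middleware.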

How it compares

Fewer concepts, more built-in capabilities

Feature	agent-express	Mastra	Vercel AI SDK	LangChain.js
Core concepts	3	15-20	5-8	30+
Extension model	Middleware `(ctx, next)`	Processors, Tools, Workflows	Hooks, Providers	Chains, Agents, Tools, Memory
Built-in testing	Yes	No	No	No
Cost control	`guard.budget()`	Manual	Manual	Manual
Human-in-the-loop	`guard.approve()`	Manual	No	Manual
Sessions	First-class `Session`	Manual state	No	Memory modules
RAG / Search	`search.file()` + `search.web()`	Built-in RAG	No	Retrievers + chains
Session persistence	`memory.store()`	Built-in	No	Memory modules
Memory management	`memory.compaction()`	Plugin-based	No	Memory modules
PII protection	`guard.piiRedact()`	No	No	Manual
Observability	Logs, OTel metrics & traces	OTel tracing	OTel spans	Callbacks / LangSmith
Streaming	SSE via `createHandler()`	Built-in	First-class (`streamText`)	Callbacks
TypeScript	Strict, ESM only	TypeScript	TypeScript	TypeScript

Start from a template

default

Minimal agent starter with one tool

npx create-agent-express --template default

coding

Coding assistant with file tools and approval gates

npx create-agent-express --template coding

research

Research agent with model routing and output guards

npx create-agent-express --template research

support-bot

Production support bot with budget, approval, and memory

npx create-agent-express --template support-bot

FAQ

How is this different from Mastra?

Mastra is a full AI platform with 15-20 concepts, including workflows, RAG, and deployment infrastructure. Agent Express is a focused middleware framework with 3 concepts. If you want an all-in-one platform, use Mastra. If you want a composable building block that fits into your existing stack, use Agent Express.

How is this different from Vercel AI SDK?

AI SDK focuses on streaming UI and provider abstraction. Agent Express focuses on the agent loop — the model-to-tool-to-model cycle with middleware at every level. They're complementary: Agent Express uses AI SDK's provider format internally.
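The model-to-tool-to-model cycle can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `Message`, `ModelReply`, `Model`, and `Tool` types and the `runLoop` function are hypothetical stand-ins, not agent-express or AI SDK types, and the model is a scripted fake so the example runs without an API key.

```typescript
// Hypothetical message and model shapes, for illustration only.
type ToolCall = { name: string; args: Record<string, string> };
type ModelReply = { text?: string; toolCalls?: ToolCall[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type Model = (messages: Message[]) => Promise<ModelReply>;
type Tool = (args: Record<string, string>) => Promise<string>;

// The loop: call the model, run any requested tools, feed results back, repeat.
async function runLoop(
  model: Model,
  tools: Map<string, Tool>,
  prompt: string,
  maxTurns = 5,
): Promise<string> {
  const messages: Message[] = [{ role: "user", content: prompt }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(messages);
    // No tool calls means the model has produced its final answer.
    if (!reply.toolCalls?.length) return reply.text ?? "";
    for (const call of reply.toolCalls) {
      const tool = tools.get(call.name);
      const result = tool ? await tool(call.args) : `Unknown tool: ${call.name}`;
      messages.push({ role: "tool", content: result });
    }
  }
  throw new Error("maxTurns exceeded");
}

// Scripted stand-in model: requests the weather tool once, then answers.
const fakeModel: Model = async (messages) => {
  const toolResult = messages.find((m) => m.role === "tool");
  return toolResult
    ? { text: `Answer: ${toolResult.content}` }
    : { toolCalls: [{ name: "get_weather", args: { city: "Tokyo" } }] };
};

const tools = new Map<string, Tool>([
  ["get_weather", async ({ city }) => `Weather in ${city}: 72°F`],
]);

async function demo(): Promise<string> {
  return runLoop(fakeModel, tools, "Weather in Tokyo?");
}
```

In a middleware-based design, hooks would wrap the model call and each tool call inside this loop, which is where concerns such as retries or approval gates could attach.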

Is it production-ready?

Agent Express is at v0.3.0 with 619 tests, 89% coverage, and TypeScript strict mode. The core middleware API is stable. 14 adapter packages are available for search, embedding, and session persistence.

What models are supported?

Any AI SDK v3 provider works out of the box. Install the provider package (e.g., npm install @ai-sdk/google) and pass a model string such as "google/gemini-2.0-flash". Supported providers include Anthropic, OpenAI, Google, Mistral, Groq, DeepSeek, Amazon Bedrock, Azure, xAI, Cohere, and more.

Can I use it with my existing Express/Hono/Fastify app?

Yes. createHandler(agent) from agent-express/http returns a Web-standard Request/Response handler. Mount it on any route. See the HTTP & Framework Integration guide.
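A Web-standard handler is a function from `Request` to `Response`, which is why it can be mounted almost anywhere. The sketch below uses a stand-in handler in place of the one `createHandler(agent)` is described as returning; the `{ message }` request body, the `{ text }` response body, and the `mount` helper are assumptions made for illustration only.

```typescript
// A Web-standard handler takes a Request and resolves to a Response.
type FetchHandler = (req: Request) => Response | Promise<Response>;

// Stand-in for the agent handler; the JSON shapes here are hypothetical.
const agentHandler: FetchHandler = async (req) => {
  const { message } = (await req.json()) as { message: string };
  return new Response(JSON.stringify({ text: `You said: ${message}` }), {
    headers: { "content-type": "application/json" },
  });
};

// Mount the handler on one route; every other path gets a 404.
function mount(path: string, handler: FetchHandler): FetchHandler {
  return (req) =>
    new URL(req.url).pathname === path
      ? handler(req)
      : new Response("Not found", { status: 404 });
}

const app = mount("/agent", agentHandler);

// Exercise the route with an in-memory Request; no server is needed.
async function demo(): Promise<{ status: number; body: unknown }> {
  const res = await app(
    new Request("http://localhost/agent", {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ message: "hi" }),
    }),
  );
  return { status: res.status, body: await res.json() };
}
```

Frameworks built on the Fetch API can usually pass their request object straight through (in Hono, for example, the raw `Request` is available as `c.req.raw`). Node-style frameworks such as Express typically need a small adapter that converts to and from `Request`/`Response`.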

How does testing work?

Agent Express ships with TestModel, FunctionModel, record/replay cassettes, and snapshot testing via agent-express/test. Tests run without real API calls, so they cost nothing and add no network latency. See the Testing guide.

Why middleware instead of graph topology?

Every JS/TS backend developer already thinks in (ctx, next) from Express, Koa, and Hono. Middleware composition replaces dedicated APIs for memory, guards, tools, and tracing. Graph-based topology (LangGraph) is more powerful for complex state machines, but most agents are linear loops with cross-cutting concerns. Middleware is the simpler abstraction for the common case.

What's the license?

MIT. Use it in personal projects, commercial products, SaaS, whatever you want. No strings attached. See the full license.
