
# Memory

Memory management middleware. Prevents context window overflow by compacting conversation history with five built-in strategies.

Automatically compacts messages when the token count exceeds a configured limit. Only `ModelContext.messages` is modified; `SessionContext.history` is never touched.

```ts
function memoryCompaction(config?: CompactionConfig): Middleware
```

```ts
// Simple truncation (default, zero cost)
agent.use(memory.compaction({ maxTokens: 8192 }))

// Hybrid: summarize old messages, keep recent ones verbatim (best quality)
agent.use(memory.compaction({
  maxTokens: 8192,
  strategy: "hybrid",
  summaryModel: mySummaryModel, // LanguageModelV3 instance
  keepRecentMessages: 10,
}))
```

Hooks: `model` checks the token count and compacts before calling `next()`.

Five strategies, ordered from gentlest to most aggressive:

| Strategy | Description | Cost |
| --- | --- | --- |
| `clear-tool-results` | Replace old tool results with placeholders | Free |
| `truncate` (default) | Drop oldest messages | Free |
| `window` | Keep last N messages | Free |
| `summarize` | LLM summarizes old messages | 1 LLM call |
| `hybrid` | Summarize old + keep recent verbatim | 1 LLM call |
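As a sketch of how the two remaining free strategies are configured, reusing the `agent` and `memory` objects from the examples above and only the options documented below:

```ts
// Window: keep only the last 20 messages (free, no LLM call)
agent.use(memory.compaction({
  maxTokens: 8192,
  strategy: "window",
  keepLast: 20,
}))

// Clear old tool results first, keeping the 3 most recent (free);
// useful when large tool outputs dominate the context window
agent.use(memory.compaction({
  maxTokens: 8192,
  strategy: "clear-tool-results",
  keepLastToolResults: 3,
}))
```

Because both are free, a reasonable pattern is to try `clear-tool-results` first and fall back to a harsher strategy only if the context still overflows.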

Config options:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `maxTokens` | `number` | `8192` | Token limit for the context window |
| `strategy` | `CompactionStrategy` | `"truncate"` | Compaction strategy |
| `keepLast` | `number` | `20` | For `window`: keep last N messages |
| `keepLastToolResults` | `number` | `3` | For `clear-tool-results`: keep N recent results |
| `keepRecentMessages` | `number` | `10` | For `summarize`/`hybrid`: keep N recent messages |
| `summaryModel` | `LanguageModelV3` | (none) | For `summarize`/`hybrid`: model for summaries |
| `tokenCounter` | `TokenCounter` | chars/4 | Token estimation function |
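The default counter estimates tokens as characters divided by 4. A minimal sketch of a custom `tokenCounter`, assuming `TokenCounter` is a function from text to a token count (check the actual type in your installation):

```typescript
// Same chars/4 heuristic as the default, made explicit.
// Swap in a real tokenizer here for exact counts.
const approxTokens = (text: string): number => Math.ceil(text.length / 4)

// Passing it in replaces the built-in estimate:
// agent.use(memory.compaction({ maxTokens: 8192, tokenCounter: approxTokens }))
```

A heuristic counter is fast but can under- or over-estimate; if compaction triggers too late, lower `maxTokens` to leave headroom.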