Overview / Description
ReasoningBank by Google is an AI agent memory framework that distills generalizable reasoning strategies from an agent's successful and failed experiences. It is aimed at AI developers building long-running autonomous agents. Published by Google Cloud research scientists and presented at ICLR, ReasoningBank addresses a core limitation of deployed AI agents: they cannot learn and improve after initial deployment without retraining.
Unlike trajectory memory approaches such as Synapse, which store exhaustive action-by-action logs, or workflow memory systems like Agent Workflow Memory, which document only successful runs, ReasoningBank distills higher-level strategic patterns into structured memory items. Each memory item contains a concise title, a description, and extracted reasoning steps or decision rationales drawn from past experiences. This structured format makes memories reusable across different contexts rather than tied to a specific task sequence.
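The three-part memory item described above can be pictured as a small record type. The following is a minimal sketch, not the project's actual schema: the class name MemoryItem, the field name content, and the helper render_for_prompt are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryItem:
    """One distilled strategy. Field names are illustrative assumptions."""
    title: str        # concise name of the strategy
    description: str  # short summary of what the strategy is about
    content: str      # extracted reasoning steps or decision rationale


def render_for_prompt(items: list[MemoryItem]) -> str:
    """Format memory items as plain text that could be added to an agent prompt."""
    blocks = []
    for index, item in enumerate(items, start=1):
        blocks.append(
            f"Memory {index}: {item.title}\n{item.description}\n{item.content}"
        )
    return "\n\n".join(blocks)


# Example item based on the navigation lesson mentioned in this overview.
example = MemoryItem(
    title="Verify page before loading more results",
    description="Guards against paginating from the wrong page.",
    content="Check the current page identifier first, then request more results.",
)
print(render_for_prompt([example]))
```

Because each item records a strategy rather than a literal action sequence, the same item can be rendered into the prompt for a different task without modification.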
The memory system operates in a closed retrieval-extraction-consolidation loop. Before acting, the agent retrieves relevant memories from the ReasoningBank. After completing a task, an LLM-as-a-judge self-assessment labels the trajectory as a success or a failure, and the system extracts success insights or failure reflections into new memory entries that are consolidated back into the bank. Critically, the system actively mines failed experiences for counterfactual signals. For example, from a navigation error it can learn to always verify the current page identifier before attempting to load more results, rather than storing only the procedural step itself.
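The loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not the released implementation: MemoryBank, run_task, and the stub functions are hypothetical names, retrieval uses simple word overlap as a stand-in for whatever similarity search the real system uses, and the agent, judge, and extractor are plain Python stubs where the real system would call an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MemoryItem:
    title: str
    description: str
    content: str


@dataclass
class MemoryBank:
    items: list[MemoryItem] = field(default_factory=list)

    def retrieve(self, query: str, k: int = 1) -> list[MemoryItem]:
        """Return up to k items sharing words with the query (assumed scoring)."""
        query_words = set(query.lower().split())

        def overlap(item: MemoryItem) -> int:
            text = f"{item.title} {item.description}".lower()
            return len(query_words & set(text.split()))

        ranked = sorted(self.items, key=overlap, reverse=True)
        return [item for item in ranked[:k] if overlap(item) > 0]

    def consolidate(self, new_items: list[MemoryItem]) -> None:
        """Append-only consolidation: no merging or deduplication."""
        self.items.extend(new_items)


def run_task(
    task: str,
    bank: MemoryBank,
    act: Callable[[str, list[MemoryItem]], str],
    judge: Callable[[str, str], bool],
    extract: Callable[[str, str, bool], list[MemoryItem]],
) -> bool:
    memories = bank.retrieve(task)                          # 1. retrieval
    trajectory = act(task, memories)                        # agent acts with memories
    succeeded = judge(task, trajectory)                     # self-assessment
    bank.consolidate(extract(task, trajectory, succeeded))  # 2-3. extract, consolidate
    return succeeded


# Stubs standing in for LLM calls (purely hypothetical behavior).
def stub_act(task: str, memories: list[MemoryItem]) -> str:
    return f"steps for {task} using {len(memories)} memories"


def stub_judge(task: str, trajectory: str) -> bool:
    return "using 0 memories" not in trajectory


def stub_extract(task: str, trajectory: str, succeeded: bool) -> list[MemoryItem]:
    kind = "Success insight" if succeeded else "Failure reflection"
    return [MemoryItem(f"{kind}: {task}", f"Lesson from task: {task}", trajectory)]


bank = MemoryBank()
first = run_task("load more search results", bank, stub_act, stub_judge, stub_extract)
second = run_task("load more search results", bank, stub_act, stub_judge, stub_extract)
print(first, second, [item.title for item in bank.items])
```

In this toy run the first attempt has no memories and is judged a failure, which still produces a memory item. The second attempt retrieves that failure reflection, which is the point of mining failed experiences as well as successful ones.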
When benchmarked on web browsing and software engineering tasks, ReasoningBank improved both agent success rates and efficiency (fewer steps per task) compared to baseline methods. The framework and code are available on GitHub as an open-source research artifact. ReasoningBank is best suited for AI researchers and developers building persistent, self-improving agent memory systems who want a foundation grounded in peer-reviewed research.
Used For
ReasoningBank is used by AI researchers and developers to build persistent, self-improving memory for autonomous agents, enabling them to learn from both successes and failures after deployment across tasks like web navigation and software engineering.
Pricing
Pros & Cons
Pros
- Distills both successful and failed experiences into structured memory items with title, description, and extracted reasoning steps
- Closed retrieval-extraction-consolidation loop enables continuous self-improvement after deployment without retraining
- Tolerates noise in the LLM-as-a-judge self-assessment, reducing the need for perfectly accurate self-evaluation
- Benchmarked on web browsing and software engineering tasks, showing higher success rates and fewer steps compared to baseline approaches
- Open-source code available on GitHub, grounded in a peer-reviewed ICLR paper
Cons
- Memory consolidation strategy is deliberately simple (append-only); more sophisticated merging and deduplication are left for future work
- Limited public documentation on production deployment, scaling to large memory banks, or integration with specific agent frameworks outside the paper's benchmarks
- Research-grade codebase aimed at AI researchers rather than a turnkey developer SDK with versioned releases and support
Alternatives
Synapse, Agent Workflow Memory, Mem0, Zep, MemGPT