
Stateful Agents: The Missing Link in LLM Intelligence

Research
February 6, 2025

Large language models possess vast knowledge, but they're trapped in an eternal present moment. While they can draw from the collected wisdom of the internet, they can't form new memories or learn from experience: beyond their weights, they are completely stateless. Every interaction starts anew, bound by the static knowledge captured in those weights. As a result, most “agents” are more akin to LLM-based workflows than agents in the traditional sense.

The next major advancement in AI won't come from larger models or more training data, but from agents that can actually learn from experience. This post introduces “stateful agents”: AI systems that maintain persistent memory and actually learn during deployment, not just during training.

The Fundamental Limitation of LLMs

The only information an LLM has is what is baked into its weights, and what is in its context window. This is why most “agents” today are essentially stateless workflows: they have no way to persist interactions beyond what fits into the context window. 

Why Current Memory Approaches Fall Short

Context Pollution
Most approaches to memory today rely on rudimentary retrieval mechanisms (e.g. embedding-based RAG) that pollute the context with irrelevant information. This “context pollution” is particularly problematic because it can degrade agent performance. Recently released “reasoning models” explicitly discourage developers from adding excessive in-context learning (ICL) examples or data retrieved via RAG. Because these newer models benefit from simpler, shorter prompts, stuffing the context window with potentially relevant “memories” is an even less adequate solution.
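To make the failure mode concrete, here is a toy sketch (not Letta code; the memory store and embeddings are invented for illustration) of why naive top-k retrieval pollutes context: it always injects k results, even when most are barely relevant to the query.

```python
# Toy illustration: naive top-k retrieval has no relevance threshold,
# so low-scoring "memories" get stuffed into context anyway.
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Hypothetical memory store: (text, embedding) pairs with toy 2-d embeddings.
memories = [
    ("User prefers Python",      [0.9, 0.1]),
    ("User lives in Berlin",     [0.1, 0.9]),
    ("User once mentioned cats", [0.2, 0.8]),
]

def naive_rag(query_vec, k=3):
    # Rank by similarity and take the top k, relevant or not.
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

query = [1.0, 0.0]  # a query about programming languages
print(naive_rag(query))  # all three memories injected, two of them irrelevant
```

A relevance threshold helps, but the deeper fix is letting the agent itself decide what belongs in context, as discussed below.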

Lack of Memory Consolidation
Forming memory is an iterative process: whether it's reviewing lecture notes or thinking back on what we should have said during that argument, our brains spend significant energy deriving new insights from past information. Unlike us, agents don't spend downtime reflecting on their memories.
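A minimal sketch of what such downtime reflection could look like (illustrative only; the `consolidate` function is a hand-written stand-in for an LLM reflection pass, not Letta internals): raw interaction history is distilled into compact, non-superseded facts instead of being stored verbatim.

```python
# Sketch of offline memory consolidation: re-read raw history during
# "downtime" and keep only the latest, still-valid facts.
raw_history = [
    "user: my flight to Tokyo is on March 3",
    "user: actually, it moved to March 5",
    "user: I want vegetarian meals on the flight",
]

def consolidate(history):
    """Stand-in for an LLM reflection pass: later facts supersede
    earlier ones, and verbose messages become compact entries."""
    facts = {}
    for msg in history:
        if "flight to Tokyo" in msg or "moved to" in msg:
            facts["tokyo_flight_date"] = "March " + msg.split("March")[-1].strip()
        if "vegetarian" in msg:
            facts["meal_preference"] = "vegetarian"
    return facts

memory = consolidate(raw_history)
print(memory)  # {'tokyo_flight_date': 'March 5', 'meal_preference': 'vegetarian'}
```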

Stateless Abstractions
LLM APIs and agentic frameworks are built around the assumption of statelessness. State is assumed to be limited to the duration of ephemeral sessions and threads, baking in the assumption that agents are, and always will be, stateless. “Memory” in these systems is a band-aid add-on, rather than a fundamental part of the system.

What makes an agent stateful?

A stateful agent has an inherent concept of experience. Its state represents the accumulation of all past interactions, processed into meaningful memories that persist and evolve over time. This goes far beyond just having access to a message history or knowledge base. Key characteristics include:

  1. A persistent identity providing continuity across interactions
  2. Active formation and updating of memories based on experiences
  3. Learning via accumulating state that influences future behavior
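The three characteristics above can be sketched in a few lines of plain Python (illustrative, not Letta's implementation): an identity that persists across process restarts, memory formation from interactions, and accumulated state that shapes future responses.

```python
# Minimal stateful agent sketch: identity and memories survive restarts,
# and accumulated state influences behavior.
import json, os, tempfile

class StatefulAgent:
    def __init__(self, state_path):
        self.state_path = state_path  # identity persists across processes
        if os.path.exists(state_path):
            with open(state_path) as f:
                self.state = json.load(f)
        else:
            self.state = {"name": "agent-1", "facts": [], "turns": 0}

    def step(self, user_msg):
        self.state["turns"] += 1
        # "Memory formation": extract and persist a fact from experience.
        if user_msg.startswith("remember:"):
            self.state["facts"].append(user_msg[len("remember:"):].strip())
        self._save()
        # Accumulated state influences the response.
        return f"[turn {self.state['turns']}] I know {len(self.state['facts'])} fact(s)."

    def _save(self):
        with open(self.state_path, "w") as f:
            json.dump(self.state, f)

path = os.path.join(tempfile.mkdtemp(), "agent.json")
a = StatefulAgent(path)
print(a.step("remember: user is vegetarian"))  # [turn 1] I know 1 fact(s).
b = StatefulAgent(path)                        # "restart": state survives
print(b.step("hello"))                         # [turn 2] I know 1 fact(s).
```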

The Technical Challenge: Context Window Management

The performance of stateful agents depends heavily on how we compile accumulated state into the limited context window. This isn't just a matter of token fitting and prompt engineering; it's about meaningfully representing learned experience. We're seeing advances in:

Tool-based memory management
Allowing agents to decide what information to retrieve (e.g. MemGPT) to ensure relevant context.

Agents specialized for context management
Using agents to manage their own context windows by writing to in-context memory, or even delegating to an external agent specialized in memory management (via multi-agent orchestration).

Reasoning & inference-time compute
Scaling inference-time compute and reasoning allows agents to learn more effectively: they can derive the most important insights from data and save those insights to context.
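The first technique above, tool-based memory editing in the style of MemGPT, can be sketched as follows. This is a hedged illustration: the `core_memory_replace` tool name echoes MemGPT's design, but the "LLM" here is a stub that returns a hard-coded tool call.

```python
# Sketch of tool-based memory editing: the model is offered a
# core_memory_replace tool and decides when to rewrite its own
# in-context memory. fake_llm stands in for a real model.
core_memory = {"human": "Name: Sarah. Lives in: unknown."}

def core_memory_replace(block, old, new):
    """Tool the agent can call to edit a named in-context memory block."""
    core_memory[block] = core_memory[block].replace(old, new)

def fake_llm(user_msg):
    """Stub: returns a tool call when it detects a fact worth
    persisting, otherwise a plain reply."""
    if "I moved to" in user_msg:
        city = user_msg.rsplit("I moved to", 1)[1].strip(" .")
        return {"tool": "core_memory_replace",
                "args": {"block": "human", "old": "unknown", "new": city}}
    return {"reply": "Got it!"}

out = fake_llm("By the way, I moved to Lisbon.")
if out.get("tool") == "core_memory_replace":
    core_memory_replace(**out["args"])
print(core_memory["human"])  # Name: Sarah. Lives in: Lisbon.
```

The key design point is that memory updates are explicit, inspectable tool calls rather than opaque retrieval heuristics.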

Introducing Letta: A Framework for Stateful Agents

At Letta, we've built the first comprehensive framework for creating and deploying stateful agents.

State Architecture

Letta manages persistence of state for long-running agents, including components for:

In-context memory
Persistent memory blocks that survive across LLM requests

External memory
Automatic recall memory for interaction history, plus general-purpose archival memory storage

Multi-agent orchestration
Built-in mechanisms for communication and shared state across multiple agents

In Letta, all state (including memory blocks) is queryable via our REST API.

Automated Context Management

Letta automatically manages the context window, which is composed of:

  • Read-only system prompts for core instructions
  • Editable memory blocks for learned information
  • Metadata about memories stored externally 
  • Recent messages for immediate context
  • A summary of historical messages that no longer fit in context
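A simple way to picture this layout is as a compiler that assembles the pieces in order and truncates recent messages to fit a budget. The sketch below is an assumed structure for illustration, not Letta's actual compilation code, and uses a character budget where a real system would count tokens.

```python
# Illustrative context "compiler": system prompt, memory blocks,
# external-memory metadata, summary, then as many recent messages
# as the budget allows.
def compile_context(system_prompt, memory_blocks, archival_count,
                    summary, recent_messages, budget=600):
    parts = [
        system_prompt,                                         # read-only
        "== MEMORY ==",
        *(f"<{name}>\n{text}" for name, text in memory_blocks.items()),
        f"== EXTERNAL == {archival_count} entries in archival memory",
        f"== SUMMARY == {summary}",
        "== RECENT ==",
    ]
    header = "\n".join(parts)
    # Keep as many recent messages as fit, newest first, then
    # restore chronological order.
    kept, used = [], len(header)
    for msg in reversed(recent_messages):
        if used + len(msg) + 1 > budget:
            break
        kept.append(msg)
        used += len(msg) + 1
    return header + "\n" + "\n".join(reversed(kept))

ctx = compile_context(
    system_prompt="You are a helpful assistant.",
    memory_blocks={"human": "Name: Sarah.", "persona": "Friendly, concise."},
    archival_count=42,
    summary="Earlier, Sarah asked about flights to Tokyo.",
    recent_messages=["user: any updates?", "assistant: still checking."],
)
print(ctx)
```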

Building with Stateful Agents

Building stateful agents shouldn't require managing complex memory systems. That's why we've focused on creating REST APIs purpose-built for stateful agents. Developers can focus on designing their agents' capabilities while Letta manages state persistence and memory management.

The Future of AI is Stateful

The next generation of AI applications won't just access static knowledge: they'll learn continuously, form meaningful memories, and develop deeper understanding through experience. This represents a fundamental shift from treating LLMs as components of stateless workflows to building agentic systems that truly learn from experience.

What This Enables

  • Personalized interactions that improve over time
  • Agents that learn from feedback and adjust behavior
  • Long-term relationship building between users and agents
  • Continuous improvement without retraining

Getting Started with Stateful Agents

You can get started with Letta today with Docker or Letta Desktop. If you’re interested in Letta Cloud, you can also sign up for early access here. We highly recommend joining our Discord to learn which models work best for agents and to get agent-building tips!
