FORGE: Enabling Agent Memory to Self-Evolve Without Updating Weights—A Paper with a Bold Approach

Making AI remember things is harder than it sounds.

Current approaches generally fall into two camps. One stuffs memories into model weights, which requires fine-tuning and costs time and compute. The other attaches an external database (retrieval-based memory), which is flexible but not internalized: the Agent has to "look up information" from scratch every time rather than truly "remembering."

The freshly published FORGE paper on arXiv (Self-Evolving Agent Memory With No Weight Updates via Population Broadcast) takes a third path: enabling a population of Agents to broadcast experiences to one another, achieving autonomous memory evolution without touching model weights.

What is Population Broadcast?

The core mechanism of FORGE is called "Population Broadcast." Its logic is straightforward:

When an Agent accumulates experience in a task—which strategies worked, which failed, and which information is worth retaining—it broadcasts this experience to other Agents in the population. Upon receiving it, other Agents integrate this experience into their own memories. The next time they encounter a similar scenario, they can directly call upon this "collective memory" without needing to re-explore.

The key point is: this process does not update model weights. The storage and retrieval of experiences are handled entirely by an external memory module. The model itself remains unchanged, but the Agent's behavioral capabilities continue to grow.
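To make the mechanism concrete, here is a minimal sketch of population broadcast as described above. It is not the paper's implementation: the Experience schema, the Agent and Population classes, and the exact-match recall are all hypothetical names and simplifications chosen for illustration. The only point it demonstrates is that knowledge moves between Agents through an external memory store while the model stays untouched.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Experience:
    """One lesson learned on a task (hypothetical schema, not from the paper)."""
    task: str
    strategy: str
    succeeded: bool
    source: str  # name of the agent that produced the experience


@dataclass
class Agent:
    name: str
    # External memory store; no model weights are involved anywhere here.
    memory: list[Experience] = field(default_factory=list)

    def integrate(self, exp: Experience) -> None:
        # Skip exact duplicates so repeated broadcasts do not bloat memory.
        if exp not in self.memory:
            self.memory.append(exp)

    def recall(self, task: str) -> list[str]:
        # Return strategies that worked on this task, whoever discovered them.
        return [e.strategy for e in self.memory if e.task == task and e.succeeded]


class Population:
    def __init__(self, agents: list[Agent]) -> None:
        self.agents = agents

    def broadcast(self, exp: Experience) -> None:
        # Every agent, including the sender, stores the experience.
        for agent in self.agents:
            agent.integrate(exp)


a, b = Agent("a"), Agent("b")
pop = Population([a, b])
pop.broadcast(Experience("maze", "follow the left wall", True, source="a"))
print(b.recall("maze"))  # ['follow the left wall']
```

Agent b never explored the maze itself, yet it can recall the strategy that Agent a found. A real system would presumably replace the exact task match with some form of similarity search, which this sketch leaves out.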

Why This Approach Is Interesting

Bypassing weight updates brings several key advantages:

Zero Fine-tuning Cost. Traditional continual learning approaches require either periodic model fine-tuning or complex incremental learning algorithms. FORGE requires no model modifications: memory growth is externalized and plug-and-play.

Instant Sharing. What one Agent learns can be instantly acquired by all other Agents. In traditional fine-tuning approaches, you would need to retrain the entire model and then deploy the new version to all instances. FORGE only requires broadcasting a single message.

Auditable and Reversible. Because memories are explicitly stored, you can view, edit, or delete any memory entry. If a memory is incorrect, just delete it. Weight updates offer no equivalent: there is no reliable way to locate and modify a specific "false belief" inside a model's weights.
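As an illustration of the auditable and reversible property, the sketch below keeps memory entries in a plain dictionary alongside an append-only log. MemoryStore, its methods, and the sample entries are hypothetical and are not taken from the paper. The sketch only shows why explicit storage makes inspection, deletion, and rollback straightforward.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical explicit memory store with an operation log."""
    entries: dict[int, str] = field(default_factory=dict)
    audit_log: list[tuple[str, int, str]] = field(default_factory=list)
    _next_id: int = 0

    def add(self, text: str) -> int:
        entry_id = self._next_id
        self._next_id += 1
        self.entries[entry_id] = text
        self.audit_log.append(("add", entry_id, text))
        return entry_id

    def delete(self, entry_id: int) -> None:
        # Keep the deleted text in the log so the deletion can be reversed.
        text = self.entries.pop(entry_id)
        self.audit_log.append(("delete", entry_id, text))

    def undo_last(self) -> None:
        # Reverse the most recent operation recorded in the log.
        op, entry_id, text = self.audit_log.pop()
        if op == "add":
            del self.entries[entry_id]
        else:
            self.entries[entry_id] = text


store = MemoryStore()
good = store.add("retry with backoff after a timeout")
bad = store.add("skip input validation to save time")
store.delete(bad)      # remove the incorrect memory
print(store.entries)   # {0: 'retry with backoff after a timeout'}
```

Calling store.undo_last() at this point would restore the deleted entry, which is the kind of targeted rollback that a weight update does not offer.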

Differences from RAG

You might ask: Isn't this just RAG (Retrieval-Augmented Generation)? An external knowledge base, retrieve relevant content, then generate an answer.

The difference lies in "self-evolution." RAG knowledge bases are typically human-maintained: you need to manually add documents, update content, and manage indexes. FORGE's memory is generated and maintained by the Agents themselves during interactions. The Agents decide which experiences are worth retaining, which should be forgotten, and which need to be integrated with existing memories.

This is much closer to human memory mechanisms. You don't memorize an encyclopedia every day—instead, you automatically form and update your memories through experience and reflection.
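The retain, forget, and integrate decisions described above could look something like the following sketch. This article does not cover how FORGE actually makes these decisions, so everything here is an assumption: the Entry fields, the consolidate function, the duplicate-merging rule, and the max_age threshold are hypothetical stand-ins for whatever policy the paper uses.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical memory entry with simple usage statistics."""
    lesson: str
    uses: int = 0  # how often the entry helped on a task
    age: int = 0   # consolidation rounds since it was last used


def consolidate(entries: list[Entry], max_age: int = 3) -> list[Entry]:
    merged: dict[str, Entry] = {}
    for e in entries:
        key = e.lesson.strip().lower()
        if key in merged:
            # Integration: fold near-identical lessons into one entry.
            kept = merged[key]
            kept.uses += e.uses
            kept.age = min(kept.age, e.age)
        else:
            merged[key] = Entry(e.lesson, e.uses, e.age)
    # Forgetting: drop entries that are stale and never proved useful.
    return [e for e in merged.values() if e.uses > 0 or e.age < max_age]


memory = [
    Entry("Check the cache first", uses=2, age=1),
    Entry("check the cache first ", uses=1, age=4),
    Entry("Always restart the server", uses=0, age=5),
]
result = consolidate(memory)
print([(e.lesson, e.uses, e.age) for e in result])
# [('Check the cache first', 3, 1)]
```

The two cache lessons are merged into one entry with combined usage, and the stale, unused lesson is dropped. No human curates the store at any step, which is the contrast with a manually maintained RAG index.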

The Paper's Experimental Design

The paper evaluates the approach on multiple Agent task benchmarks, including strategy tasks requiring long-term memory and complex scenarios requiring cross-task knowledge transfer. Results show that FORGE's population broadcast mechanism outperforms individual learning and traditional retrieval-based approaches in both memory efficiency and task performance.

Notably, as the Agent population size increases, learning performance exhibits superlinear growth, meaning the gains grow faster than in proportion to the number of Agents. This suggests that the population broadcast mechanism produces a genuine "collective intelligence" effect rather than a simple aggregation of information.

My Take

FORGE targets a core tension in current Agent systems: we want Agents to get smarter, but we don't want to retrain the model every time their capabilities improve.

The population broadcast approach offers an elegant solution. It decouples "learning" from "model updates"—learning occurs at the memory layer, while the model remains static. This means you can use a fixed model version and continuously enhance the Agent's capabilities through an ever-growing collective memory.

Of course, challenges remain. How are memory conflicts resolved? What happens if one Agent's erroneous experience is broadcast to the entire population? As memory volume grows, how do retrieval efficiency and memory quality remain stable?

The paper does not yet fully resolve these issues. But the direction it points to is clear: Agent memory should not be tied to model weights, but should exist as an independent, evolving, and shareable layer.

If this path proves successful, the deployment and maintenance costs for Agents will drop significantly. You won't need to frequently retrain models—you just need to let the Agent population learn, broadcast, and evolve on their own.

Primary Source:

arXiv:2605.16233 - FORGE
