Designing a memory optimised for storing codebase context

Memory systems have become popular, but most of them perform poorly because they are spread too thin across the use cases they try to support.

We have built Greplica - a memory designed specifically for the engineering context a coding agent needs. In other words, it is a coding agent’s employee handbook. Here we discuss what such a system should optimise for and the design choices we made to achieve it.

Developer teams create a lot of valuable data while working on a repository, such as aligned decisions, past trade-offs, standard workflows, tribal knowledge and follow-up work. With coding agents now writing the majority of new code, session transcripts, plan.md files, PR comments and issues have all become much richer.

Greplica stores this information and surfaces it to the agent when it needs it, making it better at the task.

At the highest level, Greplica's memory is a graph that keeps a record of the Components, Flows, and Claims of a code repository.

Components, Flows, and Claims

For reference, at the time of writing, the Greplica repo itself has 12 components, 9 flows and 69 claims after around a month of development.

Some principles we designed for:

Memory should be actively monitored to prevent stale or faulty information

The biggest pain in using AGENTS.MD to maintain context is that the information goes stale faster than teams realise. Our system must avoid that, since giving faulty information to the coding agent would mislead it and be self-defeating for us.

To ensure that, we borrowed the concepts of main and working branches from git. Developers code across many sessions, but not all of that work gets merged into the main branch.

We introduced the concept of Graph Scope - every member of the graph memory belongs to a scope.

export type GraphScopeKind = "main" | "working" | "branch" | "session" | "source";

All fresh memories are added to the working scope. A human must review what goes into main.
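The scope rule can be sketched in a few lines. This is only an illustration under stated assumptions, not Greplica's actual implementation: `GraphScopeKind` comes from the text above, while the `ScopedMemory` shape, the `addMemory` and `promoteToMain` helpers, the rule that only working memories can be promoted, and the example claim text are all hypothetical.

```typescript
// GraphScopeKind is taken from the article; everything else here is a hypothetical sketch.
type GraphScopeKind = "main" | "working" | "branch" | "session" | "source";

interface ScopedMemory {
  id: string;
  text: string;
  scope: GraphScopeKind;
  reviewedBy?: string; // set only when a human approves promotion into main
}

// Fresh memories always land in the working scope.
function addMemory(id: string, text: string): ScopedMemory {
  return { id, text, scope: "working" };
}

// Promotion into main requires a named human reviewer. Returns a new object
// rather than mutating the working copy.
function promoteToMain(memory: ScopedMemory, reviewer: string): ScopedMemory {
  if (reviewer.trim() === "") {
    throw new Error("a human reviewer is required to promote into main");
  }
  if (memory.scope !== "working") {
    throw new Error(`only working memories can be promoted, got ${memory.scope}`);
  }
  return { ...memory, scope: "main", reviewedBy: reviewer };
}

// Illustrative usage with made-up content.
const draft = addMemory("claim-1", "Auth tokens expire after 15 minutes");
const approved = promoteToMain(draft, "alice");
```

Keeping promotion as a separate, explicit step means an unreviewed memory cannot end up in main by accident.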

Future work will also link memories to git commits, so that facts can be prioritised according to when the related code made it into production.

What Greplica stores should be human-readable

For humans to be able to audit, verify and edit memory, the graph contents must remain comprehensible (unlike many black-box memory systems!). Components mimic the folder structure of a repo, since most repositories are organised so that semantic components are broken down hierarchically into folders and sub-folders. Flows resemble the way humans think about the most important processes in their system. Claims are standalone facts derived from work done by a developer.

Human-readable graph structure

We noticed this was the way many teams were exposing their context to agents already - by placing hierarchical AGENTS.MD files in important folders and subfolders to be progressively discovered by agents.
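As a rough sketch of how folder-shaped components support this kind of progressive discovery, the snippet below finds every component whose anchor folder contains a given file, outermost first. The `Component` shape mirrors the interface shown later in this post with ids simplified to strings. Treating `code_anchor` as a repo-relative folder path, the `componentsForFile` helper and the example paths are assumptions made for illustration.

```typescript
// Hypothetical sketch: code_anchor is assumed to be a repo-relative folder path,
// with "" standing for the repository root.
interface Component {
  id: string;
  name: string;
  code_anchor?: string;
}

// Returns the components whose anchor folder contains the file, outermost first.
function componentsForFile(components: Component[], filePath: string): Component[] {
  return components
    .filter((c) => {
      if (c.code_anchor === undefined) return false;
      const anchor = c.code_anchor.replace(/\/+$/, ""); // drop trailing slashes
      return anchor === "" || filePath === anchor || filePath.indexOf(anchor + "/") === 0;
    })
    // Every match is a prefix of the same path, so shorter anchors are outer folders.
    .sort((a, b) => (a.code_anchor ?? "").length - (b.code_anchor ?? "").length);
}

// Illustrative usage with made-up components.
const components: Component[] = [
  { id: "c-root", name: "repo", code_anchor: "" },
  { id: "c-auth", name: "auth", code_anchor: "src/auth" },
  { id: "c-src", name: "src", code_anchor: "src" },
  { id: "c-billing", name: "billing", code_anchor: "src/billing" },
];
const chain = componentsForFile(components, "src/auth/login.ts").map((c) => c.name);
// chain is ["repo", "src", "auth"]
```

An agent editing a file could read this chain from the root down, much as it would pick up nested AGENTS.MD files on the way to the folder it is working in.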

Memory should be built as automatically as possible

Although facts in memory need to be true, a memory is only usable if it gets built with little to no effort. The recent popularity of coding agents has made this possible, because session transcripts are now a rich first-order source of truth. Greplica auto-captures memories from the contents of a session transcript and the accompanying code written. Future work will include ingesting other sources like issues, PRs, PRDs, ADRs, etc.

What changed should be recorded and not just what is true now

When it comes to engineering context, what was true before and is false now is often as important as the latest true facts. This history tells anyone working on the code about past mistakes, trade-offs and learnings.

So we made our memory system append-only: nothing can be edited or deleted. Every update is made by adding a superseding item.

- `claim.supersedes[]`, `component.supersedes[]`, or `flow.supersedes[]` when replacing known existing memory
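A minimal sketch of what an append-only log with supersession could look like is shown below. Only the `supersedes` field name comes from this post. The `ClaimLog` class, its method names, the validation rules and the example claim texts are hypothetical and chosen for illustration.

```typescript
// Hypothetical sketch; only the `supersedes` field name comes from the article.
interface Claim {
  id: string;
  text: string;
  supersedes: string[]; // ids of earlier claims this one replaces
}

class ClaimLog {
  private readonly entries: Claim[] = [];

  // The only write operation is append. Existing entries are never edited or removed.
  append(claim: Claim): void {
    if (this.entries.some((c) => c.id === claim.id)) {
      throw new Error(`claim ${claim.id} already exists; append a superseding claim instead`);
    }
    for (const oldId of claim.supersedes) {
      if (!this.entries.some((c) => c.id === oldId)) {
        throw new Error(`cannot supersede unknown claim ${oldId}`);
      }
    }
    // Store a copy so later changes to the caller's object cannot alter the log.
    this.entries.push({ ...claim, supersedes: claim.supersedes.slice() });
  }

  // Claims that no other claim in the log has superseded.
  current(): Claim[] {
    const superseded: string[] = [];
    for (const c of this.entries) superseded.push(...c.supersedes);
    return this.entries.filter((c) => superseded.indexOf(c.id) === -1);
  }

  // Full history in insertion order, including superseded claims.
  history(): Claim[] {
    return this.entries.slice();
  }
}

// Illustrative usage with made-up claims.
const log = new ClaimLog();
log.append({ id: "c1", text: "Sessions are stored in Redis", supersedes: [] });
log.append({ id: "c2", text: "Sessions are stored in Postgres", supersedes: ["c1"] });
const currentIds = log.current().map((c) => c.id); // ["c2"]
const historyIds = log.history().map((c) => c.id); // ["c1", "c2"]
```

The superseded claim stays in the history, so the earlier decision and the fact that it was reversed both remain visible.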

Patterns emerging from such superseding claims can inform other parts of system design.

This philosophy is not limited to Claims; the other entity kinds follow it too. Flows and Components can contain other entities of the same kind. Consider how a UserLogin flow should change if a new UserPasswordChange flow is created. In the current design, the new flow would become part of the UserLogin flow, and any claims about UserPasswordChange would go with it.

- `component.contains[]` for Component -> Component
- `flow.contains[]` for Flow -> Flow
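The containment relation can be sketched as follows, using the UserLogin example. Only `contains` is taken from this post. The `Flow` and `FlowClaim` shapes, the `flowId` link from a claim to its flow, the helper names and the example claim texts are assumptions made for illustration.

```typescript
// Hypothetical sketch; only the `contains` field name comes from the article.
interface Flow {
  id: string;
  name: string;
  contains: string[]; // ids of flows nested inside this one
}

interface FlowClaim {
  id: string;
  flowId: string; // assumed link from a claim to the flow it describes
  text: string;
}

// Collects the flow and everything nested under it, depth-first, skipping cycles.
function flowWithDescendants(flows: Flow[], rootId: string): string[] {
  const seen: string[] = [];
  const visit = (id: string): void => {
    if (seen.indexOf(id) !== -1) return;
    const flow = flows.filter((f) => f.id === id)[0];
    if (flow === undefined) return; // unknown ids are ignored
    seen.push(id);
    flow.contains.forEach((childId) => visit(childId));
  };
  visit(rootId);
  return seen;
}

// Claims about a flow include the claims about the flows it contains.
function claimsForFlow(flows: Flow[], claims: FlowClaim[], rootId: string): FlowClaim[] {
  const ids = flowWithDescendants(flows, rootId);
  return claims.filter((c) => ids.indexOf(c.flowId) !== -1);
}

// Illustrative usage with made-up claims.
const flows: Flow[] = [
  { id: "f-login", name: "UserLogin", contains: ["f-pw"] },
  { id: "f-pw", name: "UserPasswordChange", contains: [] },
];
const claims: FlowClaim[] = [
  { id: "c1", flowId: "f-login", text: "Login issues a session cookie" },
  { id: "c2", flowId: "f-pw", text: "Password change invalidates existing sessions" },
];
const loginClaimIds = claimsForFlow(flows, claims, "f-login").map((c) => c.id); // ["c1", "c2"]
const pwClaimIds = claimsForFlow(flows, claims, "f-pw").map((c) => c.id); // ["c2"]
```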

A practical example is a team that needs to rework a release made for a customer a couple of months ago. Just as they would go back to the commit from that date and look up who worked on it, being able to find "what was true in my memory when we released" is equally important.
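Because nothing is ever edited or deleted, a "what was true at release time" lookup reduces to replaying the log up to a date. The sketch below assumes each claim carries a `recordedAt` timestamp, which this post does not specify. The `DatedClaim` shape, the `claimsAsOf` helper, the dates and the claim texts are all hypothetical.

```typescript
// Hypothetical sketch; `recordedAt` is an assumed field, not part of the article.
interface DatedClaim {
  id: string;
  text: string;
  supersedes: string[];
  recordedAt: string; // ISO 8601 timestamp
}

// Claims recorded on or before `asOf` that had not been superseded by that moment.
function claimsAsOf(entries: DatedClaim[], asOf: string): DatedClaim[] {
  const cutoff = Date.parse(asOf);
  if (isNaN(cutoff)) {
    throw new Error(`invalid date: ${asOf}`);
  }
  // Ignore everything recorded after the cutoff, including later supersessions.
  const known = entries.filter((c) => Date.parse(c.recordedAt) <= cutoff);
  const superseded: string[] = [];
  for (const c of known) superseded.push(...c.supersedes);
  return known.filter((c) => superseded.indexOf(c.id) === -1);
}

// Illustrative usage with made-up claims and dates.
const releaseLog: DatedClaim[] = [
  { id: "c1", text: "Exports run as a nightly cron job", supersedes: [], recordedAt: "2025-01-10T00:00:00Z" },
  { id: "c2", text: "Exports run on a queue worker", supersedes: ["c1"], recordedAt: "2025-03-02T00:00:00Z" },
];
const atRelease = claimsAsOf(releaseLog, "2025-02-01T00:00:00Z").map((c) => c.id); // ["c1"]
const afterChange = claimsAsOf(releaseLog, "2025-04-01T00:00:00Z").map((c) => c.id); // ["c2"]
```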

Memory should be grounded in code or other sources

It is important that each MemoryCommit is traceable to the exact session, doc or git commit it came from. So we made these sources first-class citizens of the graph.

export interface Component {
  id: ComponentId;
  name: string;
  // Code path this component is anchored to, if any
  code_anchor?: string;
}

Components anchor to code paths. Claims cite Sources (session, PRD, doc, etc.) via evidenced_by edges, and a single claim can cite multiple sources. Sources are referenced through edges; they are not scope-membership objects.
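One way to picture the evidence edges is the sketch below. The `evidenced_by` edge name and the idea of multiple sources per claim come from the text above. The `Source` and `EvidencedByEdge` shapes, the list of source kinds, the two helper functions and the example references are assumptions for illustration only.

```typescript
// Hypothetical sketch; only the `evidenced_by` edge name comes from the article.
type SourceKind = "session" | "prd" | "doc" | "git_commit"; // assumed set of kinds

interface Source {
  id: string;
  kind: SourceKind;
  ref: string; // for example a session id, document path or commit hash
}

interface EvidencedByEdge {
  type: "evidenced_by";
  claimId: string;
  sourceId: string;
}

// All sources a claim cites; a claim may have several.
function sourcesForClaim(claimId: string, edges: EvidencedByEdge[], sources: Source[]): Source[] {
  const ids = edges.filter((e) => e.claimId === claimId).map((e) => e.sourceId);
  return sources.filter((s) => ids.indexOf(s.id) !== -1);
}

// Claims with no evidence at all, which a reviewer might want to flag.
function ungroundedClaims(claimIds: string[], edges: EvidencedByEdge[]): string[] {
  return claimIds.filter((id) => !edges.some((e) => e.claimId === id));
}

// Illustrative usage with made-up references.
const sources: Source[] = [
  { id: "s1", kind: "session", ref: "session-42" },
  { id: "s2", kind: "prd", ref: "docs/prd/exports.md" },
];
const edges: EvidencedByEdge[] = [
  { type: "evidenced_by", claimId: "c1", sourceId: "s1" },
  { type: "evidenced_by", claimId: "c1", sourceId: "s2" },
];
const evidence = sourcesForClaim("c1", edges, sources).map((s) => s.ref);
// evidence is ["session-42", "docs/prd/exports.md"]
const ungrounded = ungroundedClaims(["c1", "c2"], edges); // ["c2"]
```

Keeping sources behind edges, rather than embedding them in each claim, lets one transcript or document back many claims without being duplicated.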

If you find this work interesting or have feedback, please find us on Discord.