The case for VanillaDB

Agents are here. The infrastructure isn't ready.

Software was built for humans. Agents operate differently: they run continuously, accumulate context, make decisions autonomously, and need to remember. The stack hasn't caught up yet. That's the problem we're working on.

01 / The shift

Agents aren't a feature anymore.

Two years ago, an "AI agent" meant a chatbot with a few tools bolted on. Today, agents are running code reviews, managing deployments, drafting and sending communications, processing documents at scale, and coordinating other agents. The use cases are expanding faster than anyone predicted.

This isn't hype about what agents might do eventually. Teams are shipping these systems into production right now. And as they do, a class of infrastructure problems that didn't exist before is becoming urgent.

When an agent runs for hours, coordinates with other agents, and makes hundreds of decisions, what it knew at step one matters at step two hundred. There's no mechanism for that in most current stacks.

The shift isn't just about scale. It's about the nature of computation changing. Agents are persistent processes, not request-response cycles. They accumulate state. They need to retrieve context selectively. They operate in loops that can run for minutes, hours, or days. The primitives we have now weren't designed for any of that.

02 / The infrastructure gap

Existing tools were built for human-paced systems.

Traditional databases are excellent at what they were designed for: storing records, serving queries, handling concurrent writes from human users. But they assume a human is on the other end of every interaction. Schema design, query structure, retrieval patterns, and even logging formats are all optimized for the way developers and analysts think about data.

Agents interact with data differently. They don't browse a UI. They need to ask questions that aren't pre-specified, reason over context that spans many prior sessions, and retrieve information based on semantic relevance rather than exact key lookups. Fitting that into a relational model is friction all the way down.

Context: Context windows are not memory. Stuffing everything into a prompt works for a single interaction. It doesn't scale to an agent that has been operating for months and needs to reason about what it knew three months ago.

Retrieval: Keyword search misses the point. Agents need to retrieve information by relevance and relationship, not by matching strings. A knowledge graph with proper structure is fundamentally different from a full-text search index.

Persistence: Session state isn't durable. When an agent process restarts, its working memory is gone. There's no standard mechanism for persisting the structured context an agent builds up across multiple sessions.

Ownership: Managed services create dependencies. Most agent memory solutions are cloud APIs with black-box internals. Teams ship production systems without knowing what's happening inside the layer their agents depend on most.

These aren't edge cases. They're the core problems every team building serious agent systems runs into. The solutions that exist are either too rigid, too opaque, or require giving up control of the data layer entirely.
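To make the retrieval point above concrete, here is a minimal sketch in plain Python contrasting a string match with a walk over relationships. It is not VanillaGraph's API; the facts, identifiers, and function names are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical facts an agent might have recorded (illustrative names only).
TRIPLES = [
    ("deploy-412", "caused", "incident-77"),
    ("incident-77", "affected", "checkout-service"),
    ("deploy-412", "authored_by", "agent-build-3"),
    ("incident-77", "resolved_by", "rollback-9"),
]


def keyword_search(term):
    """Return only the triples in which some field contains the literal string."""
    return [t for t in TRIPLES if any(term in field for field in t)]


def build_index(triples):
    """Index every edge in both directions so a walk can start from either end."""
    index = defaultdict(list)
    for subj, rel, obj in triples:
        index[subj].append((rel, obj))
        index[obj].append((f"inverse:{rel}", subj))
    return index


def related(index, start, max_hops=2):
    """Breadth-first walk: collect every node within max_hops of start."""
    seen = {start}
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for _rel, neighbor in index[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    seen.discard(start)
    return seen


index = build_index(TRIPLES)
print(keyword_search("checkout-service"))
print(related(index, "checkout-service"))
```

The string match finds the single fact that mentions the service by name. The two-hop walk also reaches the deploy that caused the incident and the rollback that resolved it, neither of which contains the search term.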

03 / The memory problem

Memory is what separates useful agents from disposable ones.

An agent without persistent memory degrades on every restart. It has no history of what it's already tried. No record of what a user cares about. No ability to connect a current decision to a pattern it observed three weeks ago. Every session starts from zero.

This is not a problem you can fully solve at the model layer. Models have fixed context limits and no built-in persistence. Memory is an infrastructure problem, which means it needs an infrastructure solution: a structured, queryable layer that sits outside the model and persists across every session, every restart, every model swap.

The most capable agent isn't the one with the most parameters. It's the one that can retrieve the right context at the right moment, reliably, across time.

What that layer looks like matters a lot. It needs to store structured knowledge, not raw text. It needs to be queryable by relationship and meaning, not just by key. It needs to be local-first so teams can run it without accounts, without cloud dependencies, without handing their operational data to a third party. And it needs to be simple enough that the people building on top of it can actually understand what's happening inside.
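As a rough illustration of those requirements, the sketch below keeps structured facts in a local SQLite file using only Python's standard library, so they survive a process restart without any account or network call. This is an assumption-laden toy, not how VanillaDB or VanillaGraph is implemented; the `facts` table and the `open_memory`, `remember`, and `recall` helpers are hypothetical names.

```python
import sqlite3

# Hypothetical schema: one row per (subject, relation, object) fact.
SCHEMA = """
CREATE TABLE IF NOT EXISTS facts (
    subject   TEXT NOT NULL,
    relation  TEXT NOT NULL,
    object    TEXT NOT NULL,
    UNIQUE (subject, relation, object)
)
"""


def open_memory(path):
    """Open (or create) the local store; the file on disk is the durable state."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def remember(conn, subject, relation, obj):
    """Record a fact; repeating the same fact is a no-op."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO facts VALUES (?, ?, ?)",
            (subject, relation, obj),
        )


def recall(conn, subject):
    """Return every (relation, object) pair known about a subject, oldest first."""
    rows = conn.execute(
        "SELECT relation, object FROM facts WHERE subject = ? ORDER BY rowid",
        (subject,),
    )
    return rows.fetchall()
```

Calling `open_memory` on the same path after a restart returns a connection to the facts written earlier, which is the property a context window alone cannot provide.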

04 / The untapped resource

Operational data is the most underused asset in agent systems.

Every system generates operational data continuously: logs, traces, decisions, documents, communications, outputs. In human-operated systems, this data is stored for compliance or debugging and rarely touched again. In agent systems, it's something different entirely. It's the record of what the agent has done, what worked, what didn't, and why.

The teams that figure out how to structure and query that operational record will have agents that genuinely improve over time. Not because the model got smarter, but because the infrastructure gives it access to its own history in a form it can reason over. That's a fundamentally different capability from anything you get from a static knowledge base.

We think this is one of the most important open problems in the space. Operational data is already being generated everywhere. The gap is the infrastructure to turn it into something agents can actually use: typed, structured, queryable, persistent, and open enough to adapt to whatever your specific system needs.
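One way to picture "typed, structured, queryable" is a small record type for agent decisions rather than free-form log lines. The sketch below is hypothetical: the `Decision` and `Outcome` types, their fields, and the sample history are assumptions made for illustration, not a schema from VanillaGraph.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


@dataclass(frozen=True)
class Decision:
    """One entry in an agent's operational record (hypothetical shape)."""
    agent: str
    action: str
    outcome: Outcome
    reason: str
    at: datetime


def failed_attempts(history, action):
    """Return earlier failures for an action so the agent can avoid repeating them."""
    return [
        d for d in history
        if d.action == action and d.outcome is Outcome.FAILURE
    ]


# Made-up sample history, for illustration only.
HISTORY = [
    Decision("agent-a", "deploy", Outcome.FAILURE, "migration lock timeout",
             datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)),
    Decision("agent-a", "deploy", Outcome.SUCCESS, "retried off-peak",
             datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc)),
    Decision("agent-a", "review", Outcome.FAILURE, "diff too large",
             datetime(2024, 1, 2, 10, 0, tzinfo=timezone.utc)),
]

for decision in failed_attempts(HISTORY, "deploy"):
    print(decision.at.isoformat(), decision.reason)
```

Because each field is typed, a question such as "what went wrong the last time I tried this?" becomes a filter over records instead of a search through raw text.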

This is why we build the way we do.

01 Infrastructure you can't read is infrastructure you can't trust.

02 The right primitive solves the problem and nothing else.

03 Local-first means you own your data, always.

04 Open source is the only way to build things that last.

The agent infrastructure space is filling up with managed services, black-box APIs, and opinionated frameworks that want to own the whole stack. We're building the opposite of that. Every tool we ship is MIT-licensed, fully readable, and designed to be understood end to end.

Our releases are intentionally minimal. We don't add abstractions you didn't ask for. We don't ship defaults you can't change. If something we build doesn't make sense on a first read, that's a bug in our design.

View VanillaGraph on GitHub

We believe the teams building the most serious agent systems aren't looking for another managed service. They're looking for primitives they can actually own: components that do one thing well, have clear interfaces, and can be extended without fighting the framework.

That's what we build. A knowledge graph layer that ingests your documents and surfaces your operational data as structured, queryable nodes. Local, open, intentionally vanilla. Fork it, extend it, make it yours.

