96% Cache Reads: Nine Months of Running AI Agents Affordably


If you have used a coding agent on anything bigger than a quick script, you have felt this. The agent does real work. It writes the code, runs the tests, hits the API. Then it stops, hands you the result, and waits for you to re-explain the goal, the priorities, and the constraint that changed two days ago. Do that across a dozen projects and you are no longer the engineer. You are the memory and the coordinator for the whole operation, and that job gets harder every time the agents get faster.

The capability is not the constraint anymore. The agents can do the work. What breaks down is everything around the work: holding the right context, keeping the agent pointed at the goal across many steps, and trusting what comes back enough to build on it. Supply all of that by hand, one prompt at a time, and you become the part that does not scale.

Festival is what I built so the work stops depending on me to hold it together. It is not a wrapper around a model or a helper bolted onto a single repo, but a filesystem-first workspace and planning system: agents get persistent context, structured direction, and a verifiable record, so long-running work across many projects stays coherent on its own. This post is a field report on nine months of running my workload through it, taken from the logs.

What Festival is

Festival is free and open source (npm install -g @obedience-corp/festival, or brew install --cask Obedience-Corp/tap/festival). It runs alongside whatever agent you already use: Claude Code, Codex, or anything else that can run shell commands. Two ideas do most of the work.

The first is the campaign: a git-tracked workspace, a monorepo with your projects as submodules, where the code, the context, the research, and the history live in one place. It persists between sessions. An agent that opens a campaign starts with the whole picture instead of a blank prompt and a paste of yesterday’s notes. Nothing has to be reconstructed because nothing was thrown away.

The second is the festival: a plan with structure. You give an outcome, and the agent breaks it into phases, then sequences, then individual tasks, each with its own verification step, all written to files in the repo. The agent works the plan one task at a time and commits as it goes, so progress shows up as real changes in git rather than a transcript you have to take on faith. You review outcomes instead of reconstructing intent from a chat log.
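As a rough mental model of that hierarchy, here is a minimal Python sketch. It is not Festival's actual file format or API, which this post does not show: the class names, the `verify` field (assumed here to be a shell command string), and the sample plan are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    name: str
    verify: str          # hypothetical: a command that checks the task's result
    done: bool = False


@dataclass
class Sequence:
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Phase:
    name: str
    sequences: List[Sequence] = field(default_factory=list)


@dataclass
class FestivalPlan:
    outcome: str
    phases: List[Phase] = field(default_factory=list)

    def next_task(self) -> Optional[Task]:
        """Return the first unfinished task in plan order, or None when all are done."""
        for phase in self.phases:
            for sequence in phase.sequences:
                for task in sequence.tasks:
                    if not task.done:
                        return task
        return None


# Hypothetical plan: one outcome, one phase, one sequence, two tasks.
plan = FestivalPlan(
    outcome="Add rate limiting to the API",
    phases=[
        Phase("implement", [
            Sequence("middleware", [
                Task("write limiter", verify="pytest tests/test_limiter.py"),
                Task("wire into router", verify="pytest tests/test_router.py"),
            ]),
        ]),
    ],
)
```

Because every task carries its own check and its own done flag, "where are we?" is a lookup rather than something to reconstruct from a transcript.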

That is what separates it from generic agent orchestration. Context lives on disk and stays there, direction is a structure the agent can pick back up, and verification is the git history itself. None of it depends on one long session staying alive, and none of it depends on me to carry state from one step to the next. The design comes out of years of working out how hard problems actually get carried to completion, turned into tooling instead of advice.

What it looks like to use

The loop is small. You set up a campaign once and add your projects. After that you work by describing what you want and handing it to the agent. The agent plans the festival around your idea: it generates the structure, breaks the goal into phases, sequences, and tasks, and writes the plan files itself. You do not fill those in. You read them, and when the plan looks right you have the agent run the fest next loop, where it pulls the next task, does the work, marks it done, and commits, over and over until the plan is finished. When you come back, the state is on disk: what is done, what was decided, what is next. You review a structured trail instead of reconstructing where the agent left off from a wall of scrollback.
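The shape of that loop can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Festival implements it: `run_plan`, the task dictionaries, and the `do_work`, `verify`, and `commit` callables are hypothetical stand-ins for the agent, the task's verification step, and a git commit.

```python
def run_plan(tasks, do_work, verify, commit):
    """Work tasks in order; return the first task that fails verification, else None."""
    for task in tasks:
        if task["done"]:
            continue                      # resumed session: skip work already recorded as done
        do_work(task)                     # stand-in for the agent doing the task
        if not verify(task):
            return task                   # stop and surface the failing task for review
        task["done"] = True               # state lives with the plan, not in a chat session
        commit(f"complete task: {task['name']}")
    return None


# Hypothetical plan state as it might look when a session resumes.
tasks = [
    {"name": "write limiter", "done": True},
    {"name": "wire into router", "done": False},
    {"name": "update docs", "done": False},
]

commits = []                              # stand-in for git history
blocked = run_plan(
    tasks,
    do_work=lambda task: None,            # no-op in this sketch
    verify=lambda task: True,             # assume every check passes
    commit=commits.append,
)
```

The first task is skipped because it was already marked done, which is the whole point: a new session picks up from recorded state and leaves one commit per finished task.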

The rest of this post is about whether that design holds up when you lean on it hard, for a long time, across a lot of work at once.

Does it hold up? Nine months of real use

I do not just build Festival. I run everything through it. The numbers here cover my whole working life over the period: fifteen campaign workspaces, from the Obedience Corp product to client consulting, a crypto project, and personal tooling, holding 138 repositories and 256 festivals between them. They come from local Claude Code, Codex, and Gemini usage logs, plus git history, between September 30, 2025 and June 25, 2026. This is how I actually work, not a demo I set up for a post.

What nine months of organized work looks like

Start with the raw output. Here is my GitHub activity for the year:

GitHub contribution graph showing 14,965 contributions in the last year, with the squares getting densest from February through May

That is 14,965 contributions in the trailing year, 706 of them pull requests, spread across a lot of different projects. Most of it is in private repositories, so the public count alone, about 5,000, badly understates it.

That is also the problem in one image. It is a huge amount of work across a huge number of projects, and if I tried to explain why each one exists and how it connects to the rest, it would take forever and lose you a third of the way through. Removing exactly that cost is the point of Festival. It keeps all of this organized so I never have to hold it in my head, and so the work can be shown instead of narrated.

Here is the same body of work as a graph, with every piece linked to the campaign workspace it belongs to:

A timeline linking each campaign workspace to its projects: a year of work organized into campaign workspaces, each branching into its repositories and plans

This is a timeline of that same work, with every campaign workspace linked to its projects, scrubbed across the nine months. Every node is real work, and every node hangs off the campaign workspace it lives in. The root fans out into fifteen separate workspaces, each into its repositories, and the slider walks the whole thing forward month by month. By June it covers 138 repositories and 256 festivals.

The point is not the node count. It is that none of it had to be explained to make sense. Each branch is a structured plan with its decisions and tasks recorded in files, every contribution sitting under the workspace and the intent that produced it. That is what makes running this much, across this many fronts, survivable for one person.

The overhead stayed flat as the work grew

Tokens used per month and the share that came from reused context, holding in the 90s and rising toward 97 percent

This is where the design pays off. Over the nine months the agents processed 26.5 billion tokens across 153 active days of work.

The number that matters is not the total. It is this: 95.8% of those tokens were cache reads, priced at 0.1x the standard input rate. When an agent takes a step, it re-reads the context it needs. A cache read is context the system had already loaded and held onto, served back instead of being rebuilt from scratch. A high cache-read share means the agents spent their budget reusing stable, already-established context instead of regenerating it every step.
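To put those two figures together, here is a back-of-the-envelope calculation. It uses only the numbers above plus one loud simplification: every token that was not a cache read is treated as if it were billed at the standard input rate. Real bills differ, since output tokens and cache writes are priced differently and the logs span several providers, so read the result as an illustration rather than an invoice.

```python
TOTAL_TOKENS = 26.5e9        # from the usage logs cited above
CACHE_READ_SHARE = 0.958     # share of tokens that were cache reads
CACHE_READ_PRICE = 0.1       # cache-read price, relative to standard input
OTHER_PRICE = 1.0            # ASSUMPTION: all remaining tokens at the standard input rate

cache_read_tokens = TOTAL_TOKENS * CACHE_READ_SHARE
other_tokens = TOTAL_TOKENS - cache_read_tokens

# Cost expressed in "standard input token" units.
billed_equivalent = cache_read_tokens * CACHE_READ_PRICE + other_tokens * OTHER_PRICE

# Cost relative to a world where nothing was cached.
relative_cost = billed_equivalent / TOTAL_TOKENS

print(f"cache-read tokens: {cache_read_tokens / 1e9:.1f}B")
print(f"relative cost vs. no caching: {relative_cost:.3f}")
```

Under that simplification the workload costs roughly 14% of what the same token volume would cost with no cache hits at all.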

Why care? Because the usual failure mode on a long project is the reverse. As a codebase grows, agents burn more and more of every request just re-deriving what the project even is, and the cost of each change creeps upward. Here that share stayed high for nine straight months and climbed as the work grew, finishing above 97%.

That reuse is a direct result of the architecture. The context the agents reuse is the campaign sitting on disk: the same project files, the same plans, the same recorded decisions, waiting where the last session left them and easy to find for both me and the agents. There is a fixed thing to reuse instead of a context to rebuild every time, so new work plugs into a stable base rather than forcing one to be reconstructed.

Try it, and star it

If the problem at the top of this post is familiar, the fastest way to understand Festival is to run it on something real and watch the work organize itself. It is free and open source.

Star the repo so other people building with agents can find it. One click on the Star button helps more than you would think: github.com/Obedience-Corp/festival

Install it:

# npm
npm install -g @obedience-corp/festival

# or, on macOS
brew install --cask Obedience-Corp/tap/festival

Create a workspace and your first plan:

camp init my-project && cd my-project
fest create festival --name "my-first-feature" --type standard

The full quickstart takes about five minutes.
