on agent design

File systems are the new primitive for AI agents

Models arrive already fluent in ls, cat, and grep. The cheapest memory you can give an agent is the one it has read trillions of tokens about.

agent — memory/
agent$ ls memory/
CLAUDE.md   team.md   tasks.md   audit.log
agent$ cat team.md
# Team · ## Maya
- [x] Review Q3 metrics   # synced by agent
agent$ grep -r "onboarding" .
agent$

An essay · with animated diagrams

01 — the shift

From schemas to a box that asks “What can I do for you?”

For two decades, building software meant starting with tables. A customer had requirements, and your brain went straight to “what schema do we need, what interface do we build?” You can't look at a web app without seeing its data model underneath.

But interfaces are collapsing into a single prompt, and agents are what's behind them. The question is what those agents should stand on.

[Diagram. Left panel, “2000: the data model”: tables users (id · name · owner_id) and tasks (id · assignee_id · status · user_id); precise · deterministic · developer-shaped. Right panel, “Now: the box”: “What can I do for you?”; ambiguous · exploratory · agent-shaped.]
Fig. 1 — The interface moved one layer earlier: from operations a developer pre-defined to intent an agent must still interpret.

02 — what agent memory needs

A good read/write memory does four things

🗄️ Retain: Hold facts across sessions, not just within one conversation.

🔍 Retrieve selectively: “Dump everything into context” stops working fast. Pull only what matters.

✏️ Update & correct: Revise itself over time, which rules out read-only approaches.

👁️ Stay inspectable: LLMs are non-deterministic; a black-box memory is a debugging nightmare.

Reaching for SQL, ORMs, or APIs is natural — we've built a lot of software that way. But those interfaces were designed for programs that already know exactly what they want.

03 — the mismatch

Agents live one layer earlier than APIs

A web app calls GET /tasks/123 because a developer already turned intent into a precise operation. An agent is still upstream of that — interpreting messy goals, partial context, ambiguous names, and human corrections.

[Diagram: messy goal → precise operation. Agent: “mark Maya’s onboarding task done” (interpreting · ambiguous names · corrections · partial context · changing assumptions). Deterministic software: PATCH /tasks/123 (intent already resolved by a developer). Hidden work: teach the schema · business rules · safe mutations · edge cases · keep it updated.]
Fig. 2 — Forcing an agent onto normalized tables makes it spend context rebuilding the mental model a developer already had.

The agent can call the API — but it burns reasoning budget reconstructing assumptions, and you burn engineering budget keeping its instructions in sync with the system.

What if there were an API agents had already been trained on — to the tune of trillions of tokens?

04 — the prior

LLMs already know how to use file systems

Really know. ls, cat, grep, cd, mkdir — every engineer recognizes these instantly, and so does every model, trained on 50+ years of Unix man pages, GitHub repos, and millions of documents about exactly how they behave.

[Diagram: $ ls · $ cat · $ grep · $ cd · $ mkdir · $ mv · $ diff. Man pages, GitHub repos, and tutorials · docs feed a foundation LLM fluent in filesystem semantics (trillions of tokens).]
Fig. 3 — You don't have to teach this interface. The model paid for it during pre-training.
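To make "you don't have to teach this interface" concrete, here is a minimal Python sketch of a tool layer an agent could be handed. The class name `FileTools`, its method names, and the workspace-root restriction are assumptions made for illustration, not something the essay specifies; the sketch also assumes every file under the root is UTF-8 text.

```python
from pathlib import Path


class FileTools:
    """Hypothetical tool layer: three read-only commands scoped to one root directory."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    def _resolve(self, rel):
        # Refuse paths that escape the workspace root (for example "../secrets").
        path = (self.root / rel).resolve()
        if path != self.root and self.root not in path.parents:
            raise ValueError(f"path escapes workspace: {rel}")
        return path

    def ls(self, rel="."):
        """Like `ls`: sorted entry names, directories marked with a trailing slash."""
        entries = self._resolve(rel).iterdir()
        return sorted(p.name + ("/" if p.is_dir() else "") for p in entries)

    def cat(self, rel):
        """Like `cat`: the full text of one file."""
        return self._resolve(rel).read_text(encoding="utf-8")

    def grep(self, pattern, rel="."):
        """Like `grep -rn` for a literal substring: (path, line number, line) tuples."""
        hits = []
        for path in sorted(self._resolve(rel).rglob("*")):
            if not path.is_file():
                continue
            lines = path.read_text(encoding="utf-8").splitlines()
            for number, line in enumerate(lines, start=1):
                if pattern in line:
                    hits.append((path.relative_to(self.root).as_posix(), number, line))
        return hits
```

The tool descriptions can be one line each because the names carry the semantics: a model that sees `ls`, `cat`, and `grep` already has a working guess about arguments and output.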

05 — the fit

A filesystem checks the boxes — and gives the model handles it knows

It isn't a complete memory architecture, and pretending it is would be a mistake. But it's an unusually good substrate for the part agents struggle with most: durable, inspectable, revisable working context. Files give the model names, paths, hierarchy, timestamps, permissions, diffs, and conventions it already reasons about.

Memory needs…            | …the filesystem already has
Retain across sessions   | paths & durable files on disk / cloud drive
Retrieve selectively     | ls, grep, hierarchy & naming conventions
Update & correct         | edit, mv, diff, timestamps
Stay inspectable         | open the file, read the diff, revert, comment
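The "update & correct" and "stay inspectable" rows can be combined in a single operation: edit a file and hand back the diff. The sketch below is one possible shape, using Python's standard `difflib`; the function name `revise` and the rule that the target line must match exactly once are assumptions chosen for illustration.

```python
import difflib
from pathlib import Path


def revise(path, old, new):
    """Replace one exact line in a text file and return a unified diff of the change.

    Raises ValueError when `old` is absent or ambiguous, so a stale
    assumption about the file fails loudly instead of silently.
    """
    path = Path(path)
    before = path.read_text(encoding="utf-8").splitlines()
    matches = before.count(old)
    if matches != 1:
        raise ValueError(f"expected exactly one line equal to {old!r}, found {matches}")
    after = [new if line == old else line for line in before]
    path.write_text("\n".join(after) + "\n", encoding="utf-8")
    # The diff is the inspectable record: a human or the agent can read it later.
    return "\n".join(difflib.unified_diff(before, after, "before", "after", lineterm=""))
```

Returning the diff rather than a bare success flag means every correction leaves something a reviewer can read, store, or use to revert.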

06 — the demo

Two markdown files, one agent, no bespoke API

A team-management agent backed by two files in Box — the same data denormalized across both. A human marked a task complete in the UI. Then: the agent noticed one file was newer, compared it to the other, synced the stale copy, and left an audit trail.

[Diagram: a human marks a task complete in the UI; the agent compares the two files, syncs the stale copy, and appends to audit.log.]
team.md (just edited, newer): # Team / ## Maya / - [x] Review Q3 metrics / - [ ] Draft onboarding
tasks.md (stale copy, synced by the agent): ## Todo / - Draft onboarding — Maya / ## Done / - Review Q3 metrics — Maya
audit.log: human: ✓ · agent: synced tasks.md
Agent steps: read files · compare timestamps · update the stale one · write a log
Fig. 4 — No task API, no query language, no tool contract — just filesystem semantics. It's almost too simple, which is exactly why it's interesting.
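The four steps in the demo (read files, compare timestamps, update the stale one, write a log) fit in a short Python sketch. Everything here is an assumption made for illustration rather than the demo's actual code: the function names, the log line format, a simplified tasks.md layout that puts the owner in parentheses, and the restriction to one direction (team.md is treated as the source when it is newer).

```python
import re
from datetime import datetime, timezone
from pathlib import Path

CHECKBOX = re.compile(r"^- \[( |x)\] (.+)$")


def parse_team(text):
    """Return (task, owner, done) triples from a team.md-style checklist."""
    owner, rows = None, []
    for line in text.splitlines():
        if line.startswith("## "):
            owner = line[3:].strip()
        elif (m := CHECKBOX.match(line)) and owner:
            rows.append((m.group(2), owner, m.group(1) == "x"))
    return rows


def render_tasks(rows):
    """Render the same rows as a Todo list followed by a Done list."""
    todo = [f"- {task} ({owner})" for task, owner, done in rows if not done]
    done = [f"- {task} ({owner})" for task, owner, done in rows if done]
    return "\n".join(["## Todo", *todo, "", "## Done", *done]) + "\n"


def sync(team_path, tasks_path, log_path):
    """If team.md is newer and tasks.md disagrees with it, rewrite tasks.md and log it."""
    team_path, tasks_path, log_path = Path(team_path), Path(tasks_path), Path(log_path)
    if team_path.stat().st_mtime <= tasks_path.stat().st_mtime:
        return False  # tasks.md is at least as fresh; nothing to do
    fresh = render_tasks(parse_team(team_path.read_text(encoding="utf-8")))
    if fresh == tasks_path.read_text(encoding="utf-8"):
        return False  # newer timestamp, but the content already agrees
    tasks_path.write_text(fresh, encoding="utf-8")
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with log_path.open("a", encoding="utf-8") as log:
        log.write(f"{stamp} agent: synced {tasks_path.name} from {team_path.name}\n")
    return True
```

In the demo the agent works this out itself from filesystem semantics; the sketch only shows how little machinery the task needs. A fuller version would also handle the case where tasks.md is the newer file.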

07 — once you see it

You start seeing files everywhere

It's not a coincidence that agent tooling keeps converging on file-shaped surfaces:

CLAUDE.md, SKILL.md, repo/ (read · edit · grep · test), mounted workspaces, document collections, notes/ · plans/ · logs/

CLAUDE.md is just a markdown file. A skill is often a directory with a SKILL.md. Coding agents work by reading, editing, searching, and testing repositories. Platforms keep exposing file-like workspaces as the place models do their work.

08 — why it matters

Shared legibility is the real payoff

If an agent writes a row to a database, a human needs a product surface, an admin tool, a SQL query, or a log pipeline to see what happened. If an agent edits a markdown file, a human just opens it.

Agent → database
▸ writes a row
▸ human needs an admin tool / SQL / log pipeline to inspect
▸ debugging the agent's memory is its own project
Agent → file
▸ edits a markdown file
▸ human opens it, reads the diff, comments, reverts, fixes it directly
▸ versioning, permissions, retention, audit — systems that already exist

Cloud filesystems make this richer still: a shared drive isn't just storage — it's collaboration, access control, version history, search, preview, comments, legal hold, retention, and auditability. Exactly the boring enterprise requirements demos ignore until they become production problems.

09 — the honest caveat

A markdown file is not a database

Reach for a real database when you need:

High-volume transactions · complex joins · strict consistency · arbitrary analytical queries · carefully enforced invariants. Pretending otherwise is how you end up with a very expensive shared document with extra steps.

But agents often don't need the database directly. What they need is a working set — plans, notes, task lists, policies, drafts, summaries, logs, corrections, decisions. For that layer, a filesystem-shaped interface is more legible to both the model and the humans supervising it.

10 — the broader move

Ride the model's priors

Every time you find yourself spending inference-time effort teaching an agent a custom abstraction, stop and ask whether there's already a paradigm the model knows deeply.

filesystems · email · spreadsheets · Git · calendars · issue trackers

The answer won't always be “filesystem.” The point is that models aren't blank slates — they arrive with operational priors learned from the digital world we already built. Good agent design uses them.

a fifty-year-old bet, with a new job

Thompson and Ritchie bet that devices, streams, programs, and state get easier to compose when they share a file-like interface. That idea scaled astonishingly far — into whatever device you're reading this on, and the tools building the most powerful LLMs on the planet.

Now agents are giving it a new job. The filesystem may be one of the most natural interfaces we have for giving them memory, context, and a place to work.

everything is a file — again