Architecture · HTML Docs Desktop

A local-first vault editor with agents built in

The HTML Docs Mac app (apps/desktop) is an Electron application that turns a folder of plain .html files into a Notion-style block editor — with version history, threaded comments, cloud sync, and embedded AI coding agents (Claude Code) that can plan and edit documents in place.

  • Electron 35 · main + preload + renderer
  • React 19 · TypeScript · Tailwind v4
  • electron-vite build
  • Shared @html-docs/editor engine
  • Local-first · files are the source of truth
01

The big picture

Unlike the web app (which stores documents in Supabase), the desktop app is local-first: a user picks a folder on disk — the vault — and every document is a self-contained .html file inside it. Edits write straight back to those files. Agents, git, and other editors can touch the same files, and the app reconciles their changes live.

The app is split across Electron's three contexts, connected by a single typed IPC contract:

Renderer (UI)

React app. Vault picker → workspace → editor. Renders editable regions in a Shadow DOM, plus the chat panel and the version/comment panels. No Node access.

Bridge: window.htmldocs, exposed through contextBridge

Main (Node)

Owns the file system, vault state, file watcher, version/comment stores, agent subprocesses, and cloud sync. All privileged work lives here.

The preload script is the only thing exposed across the isolation boundary. It mirrors the HtmlDocsApi interface in src/shared/ipc-contract.ts — every method round-trips through ipcRenderer.invoke(), every push event through ipcRenderer.on(). Channels are named <namespace>:<method> and registered one-for-one in src/main/ipc.ts.
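As a rough illustration of the channel convention, the sketch below builds a renderer-facing object whose methods forward to an injected invoke function. It is a minimal sketch, not the actual preload: in the real app the invoke function would be ipcRenderer.invoke and the object would be exposed through contextBridge. The names VaultApiSketch, channelName, and buildVaultApi are hypothetical; only vault.saveRegion and vault.applyStructural are taken from this document.

```typescript
// Hypothetical slice of the contract. Only saveRegion and applyStructural
// are named in this document; return types are left opaque on purpose.
type Invoke = (channel: string, ...args: unknown[]) => Promise<unknown>;

interface VaultApiSketch {
  saveRegion(docId: string, key: string, content: string): Promise<unknown>;
  applyStructural(docId: string, op: { kind: string }): Promise<unknown>;
}

// Channels follow the <namespace>:<method> convention.
function channelName(namespace: string, method: string): string {
  return namespace + ":" + method;
}

// Every method is a thin forwarder: no logic lives on the renderer side.
function buildVaultApi(invoke: Invoke): VaultApiSketch {
  return {
    saveRegion: (docId, key, content) =>
      invoke(channelName("vault", "saveRegion"), docId, key, content),
    applyStructural: (docId, op) =>
      invoke(channelName("vault", "applyStructural"), docId, op),
  };
}
```

Injecting invoke keeps the sketch runnable outside Electron and makes the one-channel-per-method mapping easy to check in a unit test.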

Subsystems & their colors

Each subsystem keeps its color throughout this document.

Renderer / Editor UI · Main process / IPC · Vault & storage · @html-docs/editor engine · AI agents · Cloud sync
02

The vault — files as the source of truth

Vault & storage

A vault is just a folder. Documents are normal HTML files (nestable in subfolders); everything else the app needs — version history, comments, agent sessions, sync state — lives in a hidden .htmldocs/ sidecar folder so the documents stay clean and portable.

my-vault/
├─ proposal.html               # a document; <meta name="htmldocs-id"> carries its UUID
├─ research/
│  └─ findings.html
├─ plans/
│  └─ migration-plan.html      # agent plan docs land here
├─ cloud/
│  └─ shared-doc.html          # docs pulled from html-docs.com
├─ assets/
│  └─ diagram.png              # pasted/dropped images
└─ .htmldocs/                  # hidden sidecar, never shown in the UI
   ├─ versions/<docId>/        # <ts>-<rand>.json snapshots per doc
   ├─ comments/<docId>.json    # flat comment rows per doc
   ├─ sessions/<localId>.json  # persisted agent chat transcripts
   └─ sync.json                # cloud ↔ local mapping + sync hashes

Document identity

Each real doc embeds <meta name="htmldocs-id" content="UUID"> in its <head>. Files not yet saved get a provisional path:<relPath> id; the first save injects a real UUID and registers both as aliases. This is what makes renames survivable — identity travels inside the file, not in a path.
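The identity rules above could be sketched roughly as follows. This is an illustration under assumptions, not the app's code: readDocId and ensureDocId are hypothetical names, the regex expects the attributes in the order shown in this document, and alias registration is left out.

```typescript
import { randomUUID } from "node:crypto";

// Matches <meta name="htmldocs-id" content="..."> with attributes in this order.
const ID_META = /<meta\s+name="htmldocs-id"\s+content="([^"]+)"\s*\/?>/i;

// Read the embedded id, or fall back to the provisional path-based id.
function readDocId(html: string, relPath: string): string {
  const match = ID_META.exec(html);
  return match ? match[1] : "path:" + relPath;
}

// On first save, inject a real UUID into <head>; later saves leave it alone.
function ensureDocId(
  html: string,
  makeId: () => string = randomUUID
): { html: string; id: string; injected: boolean } {
  const match = ID_META.exec(html);
  if (match) return { html, id: match[1], injected: false };

  const id = makeId();
  const tag = '<meta name="htmldocs-id" content="' + id + '">';
  const headOpen = /<head(\s[^>]*)?>/i; // avoids matching <header>
  const withTag = headOpen.test(html)
    ? html.replace(headOpen, (open: string) => open + tag)
    : "<head>" + tag + "</head>" + html;
  return { html: withTag, id, injected: true };
}
```

Because the id is read from the file content rather than derived from the path, moving or renaming the file leaves readDocId's result unchanged.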

Crash-safe writes

Region edits update the in-memory DocModel immediately; a 1 s debounced flush then reconstructs the full HTML and writes it atomically (temp .htmldocs-tmp~ file → rename). The SHA-1 of every write is recorded so the watcher can tell the app's own writes apart from external ones.
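A minimal sketch of the write path, assuming synchronous Node file APIs and a temp file placed next to the target. The exact temp-file naming, the ownWriteHashes map, and the function signatures are assumptions made for illustration; isOwnWrite is the name this document uses for the echo check, but its real signature is not shown here.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";
import * as path from "node:path";

// Hash of the last content this process wrote, keyed by file path (hypothetical bookkeeping).
const ownWriteHashes = new Map<string, string>();

function sha1Hex(content: string): string {
  return createHash("sha1").update(content, "utf8").digest("hex");
}

// Write to a temp sibling, then rename over the target. A rename within one
// directory replaces the file in a single step, so readers never see a
// half-written document.
function writeAtomic(filePath: string, content: string): string {
  const tmpPath = path.join(
    path.dirname(filePath),
    path.basename(filePath) + ".htmldocs-tmp~" // assumed naming scheme
  );
  const hash = sha1Hex(content);
  ownWriteHashes.set(filePath, hash); // recorded before the watcher can fire
  fs.writeFileSync(tmpPath, content, "utf8");
  fs.renameSync(tmpPath, filePath);
  return hash;
}

// Echo check: does this content match what we last wrote to this path?
function isOwnWrite(filePath: string, content: string): boolean {
  return ownWriteHashes.get(filePath) === sha1Hex(content);
}
```

Recording the hash before the rename matters: the watcher event can only arrive after the rename, so the lookup is always populated by the time it is needed.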

The file watcher — reconciling outside edits

VaultWatcher (chokidar) watches the vault for add / change / unlink events with a 300 ms stability threshold. The flow that keeps the editor honest:

1. Detect a file event, read the content, compute its SHA-1.
2. Echo check. If the hash matches what the vault just wrote (isOwnWrite), ignore it: it's our own save coming back.
3. External change. Otherwise re-index the file; if the doc is open, snapshot it first, reload the model, and broadcast vault:docChanged with reason external.
4. Rename detection. A delete is held for a 750 ms grace window; if a new file with the same embedded id appears, it's treated as a rename (reason rename), not delete + add.
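The decision logic in these steps can be sketched as a small classifier. This is an illustration only: it takes the current time as a parameter instead of using timers, and createWatcherSketch, WatchOutcome, and flushExpired are hypothetical names rather than parts of VaultWatcher.

```typescript
type WatchOutcome = "ignored-own-write" | "external" | "rename";

const RENAME_GRACE_MS = 750;

interface PendingDelete {
  relPath: string;
  deletedAt: number;
}

function createWatcherSketch(
  isOwnWrite: (relPath: string, sha1: string) => boolean
) {
  // Deletes parked during the grace window, keyed by embedded doc id.
  const pendingDeletes = new Map<string, PendingDelete>();

  return {
    // An unlink is held back instead of being reported immediately.
    onUnlink(docId: string, relPath: string, now: number): void {
      pendingDeletes.set(docId, { relPath, deletedAt: now });
    },

    // Classify an add/change event once the content hash and embedded id are known.
    onAddOrChange(docId: string, relPath: string, sha1: string, now: number): WatchOutcome {
      if (isOwnWrite(relPath, sha1)) return "ignored-own-write";
      const pending = pendingDeletes.get(docId);
      if (
        pending &&
        pending.relPath !== relPath &&
        now - pending.deletedAt <= RENAME_GRACE_MS
      ) {
        pendingDeletes.delete(docId);
        return "rename"; // same id reappeared at a new path in time
      }
      return "external";
    },

    // Deletes whose grace window elapsed are real deletes.
    flushExpired(now: number): string[] {
      const expired: string[] = [];
      pendingDeletes.forEach((pending, docId) => {
        if (now - pending.deletedAt > RENAME_GRACE_MS) expired.push(docId);
      });
      expired.forEach((docId) => pendingDeletes.delete(docId));
      return expired;
    },
  };
}
```

Passing now explicitly keeps the grace-window rule testable without waiting on real clocks.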

Portable assets: on disk, image src/href stay relative. On read they're rewritten to a vault://doc/<relPath> URL served by a privileged Electron protocol handler, so the renderer can load them regardless of its own origin. On write they're rewritten back to relative paths.
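The two rewrites might look like the sketch below. It assumes that <relPath> in the vault:// URL is the asset's path relative to the vault root, and it uses a simple attribute regex where a real implementation would more likely walk the parsed DOM. All function names here are hypothetical.

```typescript
import * as path from "node:path";

const VAULT_PREFIX = "vault://doc/";

// Disk form (relative to the document) -> URL served by the protocol handler.
function toVaultUrl(docRelPath: string, src: string): string {
  const hasScheme = /^[a-z][a-z0-9+.-]*:/i.test(src);
  if (hasScheme || src.startsWith("/") || src.startsWith("#")) return src;
  const docDir = path.posix.dirname(docRelPath);
  return VAULT_PREFIX + path.posix.join(docDir, src);
}

// URL form -> path relative to the document, as stored on disk.
function toRelative(docRelPath: string, url: string): string {
  if (!url.startsWith(VAULT_PREFIX)) return url;
  const assetRelPath = url.slice(VAULT_PREFIX.length);
  return path.posix.relative(path.posix.dirname(docRelPath), assetRelPath);
}

// Apply a URL mapping to every double-quoted src/href attribute in the HTML.
function mapAssetUrls(html: string, fn: (url: string) => string): string {
  return html.replace(
    /(\s(?:src|href)=")([^"]*)"/g,
    (_whole: string, head: string, url: string) => head + fn(url) + '"'
  );
}
```

For a document at research/findings.html, the src ../assets/diagram.png maps to vault://doc/assets/diagram.png on read and back to ../assets/diagram.png on write, while absolute URLs pass through untouched.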

03

The editing engine — shared with the web app

@html-docs/editor engine · Renderer

The hard part of the editor — parsing HTML into regions, the Shadow DOM viewer, block keymaps, slash commands, and structural mutations — lives in a workspace package, packages/editor (@html-docs/editor), imported as TypeScript source by both the Next.js web app and the desktop renderer. The desktop app contributes only the IPC plumbing around it.

Region-based model

Imported HTML is parsed (Cheerio) into a shell with {{region-xxx}} placeholders plus a map of regionKey → content. Block-level tags (p, h1–h6, li, td, blockquote, pre, …) become independently editable regions. The ShadowDomViewer renders the shell in a Shadow root (for style isolation), splices region content into the placeholders, and makes each region contenteditable.
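To make the shell-plus-regions shape concrete, here is a small sketch of the splice step only. The parsing side (done with Cheerio in the real engine) is not reproduced; RegionMap and spliceRegions are hypothetical names, and the sample shell layout is an assumption.

```typescript
// regionKey -> inner HTML of that region.
type RegionMap = Record<string, string>;

// Replace each {{region-xxx}} placeholder in the shell with that region's
// content. Unknown placeholders are left in place rather than dropped.
function spliceRegions(shellHtml: string, regions: RegionMap): string {
  return shellHtml.replace(
    /\{\{(region-[A-Za-z0-9_-]+)\}\}/g,
    (placeholder: string, key: string) =>
      Object.prototype.hasOwnProperty.call(regions, key)
        ? regions[key]
        : placeholder
  );
}

// Assumed shell shape: one placeholder per block-level element.
const sampleShell = "<h1>{{region-a1}}</h1><p>{{region-b2}}</p>";
const sampleRegions: RegionMap = {
  "region-a1": "Quarterly plan",
  "region-b2": "Ship the <b>sync</b> work first.",
};
const fullHtml = spliceRegions(sampleShell, sampleRegions);
```

Keeping the shell and the region map separate is what lets a character edit persist a single region without reparsing the whole document.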

Two kinds of edit

  • Character edits — typing/paste/delete. Debounced saveRegion() persists just that region's content. No undo snapshot.
  • Structural edits — split, merge, delete, convert type, move, indent, duplicate, slash-insert. Routed as a StructuralOp through the shared engine.

Pure mutation engine

All structural ops are pure transforms over (shellHtml, regions) → MutationResult. The result carries newHtml plus region deltas (insert / update / delete) and a focus target — so the renderer and disk stay in sync from one source of truth.

What a structural edit actually does

Example: pressing Enter mid-paragraph to split a block.

1. useBlockKeymap in the viewer computes head/tail HTML and calls applyOp({kind:'split', …}).
2. The controller pushes an undo snapshot, then invokes vault.applyStructural(docId, op) over IPC.
3. Main runs applyStructuralOp from the shared engine, persists the new shell + region rows, and returns the MutationResult.
4. The renderer updates shellHtml + live regions in one pass and focuses the new region. Cmd-Z restores the pre-op snapshot via restoreSnapshot.
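A split expressed as a pure transform might look like the sketch below. It is a simplified stand-in for applyStructuralOp, not its implementation: the field names on SplitOpSketch and MutationResultSketch are assumptions, the caller supplies the new region key, and blocks are located with a regex instead of a parsed tree.

```typescript
type RegionContents = Record<string, string>;

interface SplitOpSketch {
  kind: "split";
  regionKey: string; // block being split
  headHtml: string;  // content that stays in the original block
  tailHtml: string;  // content that moves to the new block
  newKey: string;    // key for the new block (assumed to be caller-supplied)
}

interface MutationResultSketch {
  newHtml: string;            // updated shell
  inserted: RegionContents;   // region deltas
  updated: RegionContents;
  deleted: string[];
  focusKey: string;           // where the caret should land
}

// Pure: inputs are never mutated, so the same function can run in main and
// be replayed in the renderer with identical results.
function applySplitSketch(
  shellHtml: string,
  regions: RegionContents,
  op: SplitOpSketch
): MutationResultSketch {
  if (!(op.regionKey in regions)) throw new Error("unknown region " + op.regionKey);
  const escapedKey = op.regionKey.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const block = new RegExp("<(\\w+)[^>]*>\\{\\{" + escapedKey + "\\}\\}</\\1>");
  const match = block.exec(shellHtml);
  if (!match) throw new Error("placeholder not found for " + op.regionKey);

  // Insert a sibling block of the same tag right after the original.
  const sibling = "<" + match[1] + ">{{" + op.newKey + "}}</" + match[1] + ">";
  const insertAt = match.index + match[0].length;
  return {
    newHtml: shellHtml.slice(0, insertAt) + sibling + shellHtml.slice(insertAt),
    inserted: { [op.newKey]: op.tailHtml },
    updated: { [op.regionKey]: op.headHtml },
    deleted: [],
    focusKey: op.newKey,
  };
}
```

Returning deltas instead of a whole new region map keeps the persisted update small and tells the renderer exactly which live regions to touch.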

The engine also ships the parser, block-type registry, slash-command menu, block hover controls, comment-tree shape, interactive-pattern runtime (re-arming imported accordions/tabs/carousels), linkify, image-resize, and markdown export.

04

AI agents — Claude Code, embedded

AI agents

The app can spawn AI coding agents that work directly inside the vault. The AgentManager drives pluggable adapters that translate each CLI's wire protocol into one unified AgentProcess. The Claude Code adapter is fully implemented; a Codex adapter is stubbed behind the same interface.

Long-lived stream-JSON

One persistent claude -p subprocess per session, spawned with cwd = vaultRoot and --input-format/--output-format stream-json. The app speaks NDJSON over stdin/stdout: user turns go in as JSON; system/stream_event/assistant/result events stream back and are parsed into AgentUiEvents forwarded to the renderer (text deltas, tool-use, tool-result, turn-complete).
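The NDJSON framing itself is simple to sketch: buffer incoming chunks, cut on newlines, and parse each complete line. The helpers below are hypothetical and deliberately stop at the top-level type field, because this document does not specify the payload shapes of the CLI's events or of AgentUiEvents.

```typescript
// Incrementally split a text stream into newline-delimited JSON values.
// A chunk may end mid-line, so the remainder is kept until the next chunk.
function createNdjsonDecoder(
  onEvent: (event: unknown) => void
): (chunk: string) => void {
  let buffer = "";
  return (chunk: string) => {
    buffer += chunk;
    let newlineAt = buffer.indexOf("\n");
    while (newlineAt !== -1) {
      const line = buffer.slice(0, newlineAt).trim();
      buffer = buffer.slice(newlineAt + 1);
      if (line.length > 0) onEvent(JSON.parse(line)); // throws on malformed JSON
      newlineAt = buffer.indexOf("\n");
    }
  };
}

// Read the top-level "type" (system / stream_event / assistant / result).
function topLevelType(event: unknown): string | undefined {
  if (typeof event === "object" && event !== null && "type" in event) {
    const value = (event as { type: unknown }).type;
    return typeof value === "string" ? value : undefined;
  }
  return undefined;
}

// Encode one outgoing message as a single NDJSON line for stdin.
function encodeNdjsonLine(message: object): string {
  return JSON.stringify(message) + "\n";
}
```

In practice the decoder would be fed from the child's stdout data events, and a production version would likely catch parse errors per line instead of letting one bad line throw.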

Resume as the safety valve

Sessions persist to .htmldocs/sessions/<localId>.json with the CLI's sessionId. Switching permission mode (or recovering a dead process) just kills the child and re-attaches with --resume — robust across CLI versions, no transcript loss.
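The kill-and-resume approach boils down to rebuilding the argument list with the stored session id. The sketch below uses only the flags named in this document; StoredSessionSketch and buildClaudeArgs are hypothetical, and the real invocation may need further flags (for example, how the permission mode is passed) that are not listed here.

```typescript
// Assumed shape of what is persisted per session; only the two ids are
// mentioned in this document.
interface StoredSessionSketch {
  localId: string;     // the app's own id, used for the sessions/<localId>.json file
  sessionId?: string;  // the CLI's session id, once known
}

// Build the argument list for a fresh or resumed subprocess.
function buildClaudeArgs(session: StoredSessionSketch): string[] {
  const args = [
    "-p",
    "--input-format", "stream-json",
    "--output-format", "stream-json",
  ];
  // A known CLI session id means we re-attach instead of starting over.
  if (session.sessionId) args.push("--resume", session.sessionId);
  return args;
}
```

Because the same function serves both the first spawn and every respawn, recovery from a dead process and a permission-mode switch share one code path.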

Three session kinds

Plan

Bound to a generated plans/<slug>.html doc. The agent maintains the plan file live (mode acceptEdits). submitPlan snapshots a version, then tells the agent the plan is approved — switching it from planning to execution.

Doc · Docsmith

A chat bound to one existing vault document (the "Docsmith" button in the editor). Answers questions about and edits that doc, scoped to its data-editable-region blocks.

Chat

Free-form vault chat with no document binding. Permission mode is user-selectable: plan / default / acceptEdits / bypassPermissions.

Two details worth knowing

05

Cloud sync — drive-style, manual, conflict-aware

Cloud sync

The desktop app can link vault files to documents on html-docs.com. Sync is explicit (the user pulls and pushes — there's no background loop) and never auto-merges — it detects conflicts and lets the human decide. Auth uses an hdk_ API key encrypted at rest with Electron safeStorage.

How state is computed

.htmldocs/sync.json records, per linked doc, the cloud updated_at and the local file SHA-1 as of the last sync. Status compares both sides against those baselines: local content via hash, remote via timestamp.

Local API: PUT /api/v1/docs/{id} (push) · GET …/{id} (pull)
Resolution: Force-pull overwrites local; force-push overwrites remote. No 3-way merge.

The six states

  • not-pulled: no mapping yet
  • in-sync: both match baseline
  • local-changes: only local hash differs → push
  • remote-changes: only cloud time differs → pull
  • conflict: both differ → user chooses
  • missing-local: file deleted → pull to restore
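The six states follow mechanically from the two baselines, as in this sketch. The field names and the order of the first two checks are assumptions; the document only states which comparison each state depends on.

```typescript
type SyncStateSketch =
  | "not-pulled"
  | "in-sync"
  | "local-changes"
  | "remote-changes"
  | "conflict"
  | "missing-local";

// What sync.json is described as recording per linked doc (field names assumed).
interface SyncBaselineSketch {
  cloudUpdatedAt: string; // cloud updated_at as of the last sync
  localSha1: string;      // local file SHA-1 as of the last sync
}

function computeSyncState(
  baseline: SyncBaselineSketch | undefined,
  localSha1: string | null, // null when the local file no longer exists
  cloudUpdatedAt: string
): SyncStateSketch {
  if (!baseline) return "not-pulled";
  if (localSha1 === null) return "missing-local";

  const localChanged = localSha1 !== baseline.localSha1;            // content via hash
  const remoteChanged = cloudUpdatedAt !== baseline.cloudUpdatedAt; // remote via timestamp

  if (localChanged && remoteChanged) return "conflict";
  if (localChanged) return "local-changes";
  if (remoteChanged) return "remote-changes";
  return "in-sync";
}
```

Comparing against the last-synced baseline rather than comparing the two sides directly is what makes conflict detection possible without downloading the remote content.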
06

Versions & comments — sidecar storage

Vault & storage

Version history

Snapshots (full shell + region array) are written to .htmldocs/versions/<docId>/ as timestamped JSON. The app auto-captures ~30 s after edits settle; users can also name versions. Pruning keeps named versions and the most recent snapshot forever; only unnamed auto-snapshots are capped. A restore first snapshots the current state, so the restore is itself undoable.
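The pruning rule can be sketched as a selection over snapshot metadata. The cap is passed in as a parameter because its actual value is not stated here; VersionMetaSketch and selectVersionsToPrune are hypothetical names.

```typescript
interface VersionMetaSketch {
  id: string;
  createdAt: number; // epoch milliseconds
  name?: string;     // present only for user-named versions
}

// Return the ids of snapshots that may be deleted: unnamed ones beyond the
// cap, never the most recent snapshot and never a named one.
function selectVersionsToPrune(
  versions: VersionMetaSketch[],
  maxUnnamed: number
): string[] {
  if (versions.length === 0) return [];
  const newestFirst = versions.slice().sort((a, b) => b.createdAt - a.createdAt);
  const latestId = newestFirst[0].id;
  const prunable = newestFirst.filter(
    (v) => v.name === undefined && v.id !== latestId
  );
  return prunable.slice(maxUnnamed).map((v) => v.id);
}
```

With five snapshots where only the second-oldest is named and a cap of one, the two oldest unnamed snapshots (other than the protected latest and the one kept under the cap) are selected for deletion.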

Comments

Threaded comments are stored as a flat DbCommentRow[] in .htmldocs/comments/<docId>.json — the same row shape the web app uses in Supabase, ready for future cloud sync. The renderer builds the tree with buildCommentTree(). Selected-text comments render as <mark> highlights in the Shadow DOM; deleting a comment cascades to its replies.
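Turning flat rows into a thread tree, and cascading a delete, might look like the sketch below. It is not buildCommentTree itself: the row fields (id, parent_id, body) are assumed for illustration, since the real DbCommentRow shape is not shown in this document.

```typescript
// Assumed minimal row shape; the real DbCommentRow likely has more fields.
interface CommentRowSketch {
  id: string;
  parent_id: string | null;
  body: string;
}

interface CommentNodeSketch extends CommentRowSketch {
  replies: CommentNodeSketch[];
}

// Build threads from flat rows in one pass over an id index.
function buildTreeSketch(rows: CommentRowSketch[]): CommentNodeSketch[] {
  const byId = new Map<string, CommentNodeSketch>();
  rows.forEach((row) => byId.set(row.id, { ...row, replies: [] }));

  const roots: CommentNodeSketch[] = [];
  rows.forEach((row) => {
    const node = byId.get(row.id) as CommentNodeSketch;
    const parent = row.parent_id ? byId.get(row.parent_id) : undefined;
    if (parent) parent.replies.push(node);
    else roots.push(node); // top-level comment, or orphan whose parent is gone
  });
  return roots;
}

// Remove a comment and every reply beneath it, returning the remaining rows.
function deleteWithReplies(rows: CommentRowSketch[], id: string): CommentRowSketch[] {
  const doomed = new Set<string>([id]);
  let grew = true;
  while (grew) {
    grew = false;
    rows.forEach((row) => {
      if (row.parent_id !== null && doomed.has(row.parent_id) && !doomed.has(row.id)) {
        doomed.add(row.id);
        grew = true;
      }
    });
  }
  return rows.filter((row) => !doomed.has(row.id));
}
```

Storing rows flat and deriving the tree on read keeps the on-disk JSON identical in shape to a database table, which is the property the document points to for future cloud sync.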

07

End to end — a region save

To tie the layers together, here's the full life of a single keystroke that edits text — crossing the renderer, the IPC bridge, the vault, and the watcher.

1. Renderer: You type in a contenteditable region inside the Shadow DOM. The viewer debounces and calls window.htmldocs.vault.saveRegion(docId, key, content).
2. Main: The IPC handler calls vault.saveRegion. The in-memory DocModel updates instantly; a 1 s debounced write is scheduled.
3. Vault: On flush, full HTML is reconstructed, asset URLs made portable, the doc UUID injected if first save, SHA-1 recorded, and the file written atomically.
4. Watcher: The write fires a chokidar event, but the hash matches the recorded one, so it's recognized as the app's own write and ignored. (A real outside edit would broadcast vault:docChanged instead.)
5. Versions: ~30 s after edits stop, an auto version snapshot is captured into .htmldocs/versions/, unless it's identical to the latest.

The same shape repeats everywhere: the renderer never touches disk; main owns all state; the file system is the single source of truth; and a hash-based echo check keeps the app's own writes from looping back through the watcher.