The HTML Docs Mac app (apps/desktop) is an Electron application that turns a folder of plain .html files into a Notion-style block editor — with version history, threaded comments, cloud sync, and embedded AI coding agents (Claude Code) that can plan and edit documents in place.
Unlike the web app (which stores documents in Supabase), the desktop app is local-first: a user picks a folder on disk — the vault — and every document is a self-contained .html file inside it. Edits write straight back to those files. Agents, git, and other editors can touch the same files, and the app reconciles their changes live.
The app is split across Electron's three contexts, connected by a single typed IPC contract:
window.htmldocs
The preload script is the only thing exposed across the isolation boundary. It mirrors the HtmlDocsApi interface in src/shared/ipc-contract.ts: every method round-trips through ipcRenderer.invoke(), every push event through ipcRenderer.on(). Channels are named <namespace>:<method> and registered one-for-one in src/main/ipc.ts.
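As a rough illustration of the <namespace>:<method> naming, the sketch below builds one namespace of a bridge from a list of method names. It is not the app's preload code: makeNamespace, the Invoke type, and the fake invoke function are hypothetical, and the invoke parameter stands in for ipcRenderer.invoke so the snippet runs without Electron.

```typescript
// Stand-in for ipcRenderer.invoke so the sketch runs outside Electron (assumption).
type Invoke = (channel: string, ...args: unknown[]) => Promise<unknown>;

type Bridge<M extends string> = Record<M, (...args: unknown[]) => Promise<unknown>>;

// Build one namespace of the bridge: each method forwards to "<namespace>:<method>".
function makeNamespace<M extends string>(
  namespace: string,
  methods: readonly M[],
  invoke: Invoke,
): Bridge<M> {
  const bridge = {} as Bridge<M>;
  for (const method of methods) {
    bridge[method] = (...args: unknown[]) => invoke(`${namespace}:${method}`, ...args);
  }
  return bridge;
}

// A fake invoke that records what would cross the isolation boundary.
const calls: Array<{ channel: string; args: unknown[] }> = [];
const fakeInvoke: Invoke = (channel, ...args) => {
  calls.push({ channel, args });
  return Promise.resolve(undefined);
};

const vault = makeNamespace("vault", ["saveRegion", "applyStructural"] as const, fakeInvoke);
void vault.saveRegion("doc-1", "region-a", "<b>hi</b>");
```

Deriving channel names from the method list keeps the renderer-side names and the main-side registrations from drifting apart, which is the point of a one-for-one contract.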
A vault is just a folder. Documents are normal HTML files (nestable in subfolders); everything else the app needs — version history, comments, agent sessions, sync state — lives in a hidden .htmldocs/ sidecar folder so the documents stay clean and portable.
Each real doc embeds <meta name="htmldocs-id" content="UUID"> in its <head>. Files not yet saved get a provisional path:<relPath> id; the first save injects a real UUID and registers both as aliases. This is what makes renames survivable — identity travels inside the file, not in a path.
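The identity rule can be sketched in a few lines. This is a simplified illustration that uses regular expressions where a real implementation would more likely use an HTML parser; readDocId and ensureDocId are hypothetical names, and the code assumes the document already has a <head> element.

```typescript
import { randomUUID } from "node:crypto";

// Matches the identity tag described above (attribute order is assumed fixed).
const ID_META = /<meta\s+name="htmldocs-id"\s+content="([^"]+)"\s*\/?>/i;

// Return the embedded id, or a provisional path-based id for unsaved files.
function readDocId(html: string, relPath: string): string {
  const match = ID_META.exec(html);
  return match ? match[1] : `path:${relPath}`;
}

// On first save, inject a fresh UUID; documents that already have one are untouched.
function ensureDocId(html: string): { html: string; id: string } {
  const existing = ID_META.exec(html);
  if (existing) return { html, id: existing[1] };
  const id = randomUUID();
  const meta = `<meta name="htmldocs-id" content="${id}">`;
  // Insert right after the opening <head> tag (assumes one is present).
  const withId = html.replace(/<head(\s[^>]*)?>/i, (open) => `${open}${meta}`);
  return { html: withId, id };
}

const raw = "<html><head><title>Notes</title></head><body></body></html>";
const provisionalId = readDocId(raw, "notes/a.html");
const saved = ensureDocId(raw);
```

Because the id is read from the file contents, moving notes/a.html to another folder changes nothing about what readDocId returns once the UUID has been injected.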
Region edits update the in-memory DocModel immediately, then a 1 s debounced flush reconstructs the full HTML and writes it atomically (temp .htmldocs-tmp~ file → rename). Every write's SHA-1 is recorded so the watcher can tell the app's own writes apart from external ones.
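A minimal sketch of the write-then-rename pattern with a hash-based echo check, using only Node's standard library. The function names, the in-memory map, and the exact temp-file naming (appending .htmldocs-tmp~ to the target path) are assumptions for illustration, not the app's actual code.

```typescript
import { createHash } from "node:crypto";
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const sha1 = (data: string): string => createHash("sha1").update(data).digest("hex");

// SHA-1 of the last content this process wrote, keyed by file path.
const ownWrites = new Map<string, string>();

// Write to a temp file next to the target, then rename over it.
function writeAtomic(filePath: string, html: string): void {
  const tmpPath = `${filePath}.htmldocs-tmp~`;
  writeFileSync(tmpPath, html, "utf8");
  renameSync(tmpPath, filePath); // readers see either the old or the new file
  ownWrites.set(filePath, sha1(html));
}

// Called from a watcher callback: does the file on disk match our last write?
function isOwnWrite(filePath: string): boolean {
  const onDisk = sha1(readFileSync(filePath, "utf8"));
  return ownWrites.get(filePath) === onDisk;
}

const dir = mkdtempSync(join(tmpdir(), "vault-"));
const file = join(dir, "note.html");
writeAtomic(file, "<p>hello</p>");
const echo = isOwnWrite(file); // true: the hash matches our last write
writeFileSync(file, "<p>edited elsewhere</p>", "utf8");
const external = isOwnWrite(file); // false: content changed outside writeAtomic
```

Comparing content hashes rather than timestamps means an external tool that rewrites identical bytes is also treated as a no-op, which is usually what an editor wants.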
VaultWatcher (chokidar) watches the vault for add / change / unlink events with a 300 ms stability threshold. The flow that keeps the editor honest:
Renames are reported as a single event (vault:docChanged with reason external.rename), not delete + add.
Portable assets: on disk, image src/href stay relative. On read they're rewritten to a vault://doc/<relPath> URL served by a privileged Electron protocol handler, so the renderer can load them regardless of its own origin. On write they're rewritten back to relative paths.
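The read/write rewriting of asset paths can be sketched as a pair of inverse functions. This is an illustration under a stated assumption: <relPath> in the URL is taken to be the asset's path relative to the vault root. The names toVaultUrl and toRelativeSrc are hypothetical, and the sketch does not guard against paths that escape the vault.

```typescript
import { posix } from "node:path";

const PREFIX = "vault://doc/";

// On read: turn a document-relative src/href into a vault:// URL.
function toVaultUrl(docRelPath: string, src: string): string {
  // Leave absolute URLs (https:, data:, ...), protocol-relative URLs and fragments alone.
  if (/^[a-z][a-z0-9+.-]*:|^\/\/|^#/i.test(src)) return src;
  const assetPath = posix.normalize(posix.join(posix.dirname(docRelPath), src));
  return PREFIX + assetPath.split("/").map(encodeURIComponent).join("/");
}

// On write: turn a vault:// URL back into a path relative to the document.
function toRelativeSrc(docRelPath: string, url: string): string {
  if (!url.startsWith(PREFIX)) return url;
  const assetPath = url
    .slice(PREFIX.length)
    .split("/")
    .map(decodeURIComponent)
    .join("/");
  return posix.relative(posix.dirname(docRelPath), assetPath);
}

const loaded = toVaultUrl("notes/a.html", "img/cat.png");
const stored = toRelativeSrc("notes/a.html", loaded);
```

Keeping the two functions as exact inverses for relative paths is what lets the file on disk stay portable while the renderer only ever sees vault:// URLs.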
The hard part of the editor — parsing HTML into regions, the Shadow DOM viewer, block keymaps, slash commands, and structural mutations — lives in a workspace package, packages/editor (@html-docs/editor), imported as TypeScript source by both the Next.js web app and the desktop renderer. The desktop app contributes only the IPC plumbing around it.
Imported HTML is parsed (Cheerio) into a shell with {{region-xxx}} placeholders plus a map of regionKey → content. Block-level tags (p, h1–h6, li, td, blockquote, pre, …) become independently editable regions. The ShadowDomViewer renders the shell in a Shadow root (for style isolation), splices region content into the placeholders, and makes each region contenteditable.
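The shell-plus-regions model is easiest to see in the reverse direction, where region content is spliced back into the placeholders. The sketch below assumes placeholders look exactly like {{region-xxx}} and that regions is a plain key-to-HTML map; renderShell is a hypothetical name, not the engine's API.

```typescript
type Regions = Record<string, string>;

// Replace every {{region-xxx}} placeholder with that region's content.
// Unknown keys are left in place so a missing region is easy to spot.
function renderShell(shell: string, regions: Regions): string {
  return shell.replace(/\{\{(region-[A-Za-z0-9_-]+)\}\}/g, (placeholder: string, key: string) =>
    Object.prototype.hasOwnProperty.call(regions, key) ? regions[key] : placeholder,
  );
}

const shell = "<h1>{{region-a}}</h1><p>{{region-b}}</p><p>{{region-c}}</p>";
const regions: Regions = {
  "region-a": "Title",
  "region-b": "Costs $1 <b>only</b>",
};
const html = renderShell(shell, regions);
```

A replacer function is used instead of a replacement string so that characters such as $ in region content are inserted literally.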
Text edits: saveRegion() persists just that region's content. No undo snapshot.
Structural edits: dispatched as a StructuralOp through the shared engine.
All structural ops are pure transforms over (shellHtml, regions) → MutationResult. The result carries newHtml plus region deltas (insert / update / delete) and a focus target, so the renderer and disk stay in sync from one source of truth.
Example: pressing Enter mid-paragraph to split a block.
1. The editor issues applyOp({kind:'split', …}).
2. The renderer calls vault.applyStructural(docId, op) over IPC.
3. Main applies the op and returns a MutationResult.
4. The renderer patches shellHtml + live regions in one pass and focuses the new region. Cmd-Z restores the pre-op snapshot via restoreSnapshot.
The engine also ships the parser, block-type registry, slash-command menu, block hover controls, comment-tree shape, interactive-pattern runtime (re-arming imported accordions/tabs/carousels), linkify, image-resize, and markdown export.
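To make the Enter-to-split example concrete, here is one way a pure split transform over (shellHtml, regions) could look. Everything in it is an assumption for illustration: the SplitOp fields (key, offset, newKey), the delta and result shapes, treating newHtml as the updated shell, and treating offset as a character offset into plain-text region content. The real engine's op shape is not shown in this document.

```typescript
type Regions = Record<string, string>;

type RegionDelta =
  | { kind: "insert"; key: string; content: string }
  | { kind: "update"; key: string; content: string }
  | { kind: "delete"; key: string };

interface MutationResult {
  newHtml: string; // assumed here to be the updated shell
  deltas: RegionDelta[];
  focusKey: string;
}

// Hypothetical op shape for illustration only.
interface SplitOp { kind: "split"; key: string; offset: number; newKey: string }

const escapeRegExp = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Pure: reads its inputs, returns a description of the change, mutates nothing.
function applySplit(shellHtml: string, regions: Regions, op: SplitOp): MutationResult {
  const content = regions[op.key];
  if (content === undefined) throw new Error(`unknown region ${op.key}`);
  // Find the element wrapping the placeholder, e.g. <p class="x">{{region-a}}</p>.
  const wrapper = new RegExp(`<(\\w+)([^>]*)>${escapeRegExp(`{{${op.key}}}`)}</\\1>`);
  const match = wrapper.exec(shellHtml);
  if (!match) throw new Error(`placeholder for ${op.key} not found`);
  const [whole, tag, attrs] = match;
  // Clone the wrapper as a sibling that holds the new region's placeholder.
  const sibling = `<${tag}${attrs}>{{${op.newKey}}}</${tag}>`;
  return {
    newHtml: shellHtml.replace(whole, () => whole + sibling),
    deltas: [
      { kind: "update", key: op.key, content: content.slice(0, op.offset) },
      { kind: "insert", key: op.newKey, content: content.slice(op.offset) },
    ],
    focusKey: op.newKey,
  };
}

const beforeShell = '<body><p class="lead">{{region-a}}</p></body>';
const beforeRegions: Regions = { "region-a": "Hello world" };
const result = applySplit(beforeShell, beforeRegions, {
  kind: "split", key: "region-a", offset: 5, newKey: "region-b",
});
```

Because the function returns deltas instead of mutating its inputs, the same result can be applied to the in-memory model in main and replayed in the renderer, and the untouched inputs serve as the undo snapshot.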
The app can spawn AI coding agents that work directly inside the vault. The AgentManager drives pluggable adapters that translate each CLI's wire protocol into one unified AgentProcess. The Claude Code adapter is fully implemented; a Codex adapter is stubbed behind the same interface.
One persistent claude -p subprocess per session, spawned with cwd = vaultRoot and --input-format/--output-format stream-json. The app speaks NDJSON over stdin/stdout: user turns go in as JSON; system/stream_event/assistant/result events stream back and are parsed into AgentUiEvents forwarded to the renderer (text deltas, tool-use, tool-result, turn-complete).
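NDJSON framing is the part of this protocol that is easy to get subtly wrong, because stdout chunks do not align with line boundaries. The sketch below shows a small line-buffering decoder and encoder; NdjsonDecoder and encodeLine are hypothetical names, and the sample payloads carry only a type field for illustration rather than reproducing the CLI's real event schema.

```typescript
// Accumulates stdout chunks and yields one parsed value per complete line.
class NdjsonDecoder {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on a newline).
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as unknown);
  }
}

// One JSON value per line is all the framing NDJSON needs on the way in.
function encodeLine(value: unknown): string {
  return JSON.stringify(value) + "\n";
}

const decoder = new NdjsonDecoder();
const first = decoder.push('{"type":"sys'); // [] because the line is not complete yet
const second = decoder.push('tem"}\n{"type":"result"}\n'); // two events
const outgoing = encodeLine({ text: "line one\nline two" });
```

JSON.stringify escapes embedded newlines, so a multi-line user message still occupies exactly one line on the wire.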
Sessions persist to .htmldocs/sessions/<localId>.json with the CLI's sessionId. Switching permission mode (or recovering a dead process) just kills the child and re-attaches with --resume — robust across CLI versions, no transcript loss.
Planning sessions are bound to a generated plans/<slug>.html doc. The agent maintains the plan file live (mode acceptEdits). submitPlan snapshots a version, then tells the agent the plan is approved, switching it from planning to execution.
A chat bound to one existing vault document (the "Docsmith" button in the editor). It answers questions about that doc and edits it, scoped to its data-editable-region blocks.
Free-form vault chat with no document binding. Permission mode is user-selectable: plan / default / acceptEdits / bypassPermissions.
The app scans ~/.claude/projects/<cwd-slug>/*.jsonl to find Claude sessions you started in a terminal in the same folder, and offers to resume them inside the app via --resume.
The desktop app can link vault files to documents on html-docs.com. Sync is explicit (the user pulls and pushes; there's no background loop) and never auto-merges: it detects conflicts and lets the human decide. Auth uses an hdk_ API key encrypted at rest with Electron safeStorage.
.htmldocs/sync.json records, per linked doc, the cloud updated_at and the local file SHA-1 as of the last sync. Status compares both sides against those baselines: local content via hash, remote via timestamp.
PUT /api/v1/docs/{id} (push) · GET …/{id} (pull)
| Status | Meaning |
| --- | --- |
| not-pulled | no mapping yet |
| in-sync | both match baseline |
| local-changes | only local hash differs → push |
| remote-changes | only cloud time differs → pull |
| conflict | both differ → user chooses |
| missing-local | file deleted → pull to restore |
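The status table maps directly onto a small decision function. The sketch below assumes a baseline record with two fields (hypothetically named cloudUpdatedAt and localSha1) and represents a deleted local file as null; the field names and the order in which the checks run are illustrative choices, not taken from the app.

```typescript
type SyncStatus =
  | "not-pulled"
  | "in-sync"
  | "local-changes"
  | "remote-changes"
  | "conflict"
  | "missing-local";

// Per-document baseline as of the last sync (field names are assumptions).
interface Baseline {
  cloudUpdatedAt: string;
  localSha1: string;
}

function syncStatus(
  baseline: Baseline | undefined, // undefined: no mapping recorded yet
  localSha1: string | null,       // null: the local file no longer exists
  cloudUpdatedAt: string,
): SyncStatus {
  if (!baseline) return "not-pulled";
  if (localSha1 === null) return "missing-local";
  const localChanged = localSha1 !== baseline.localSha1;       // local side: hash
  const remoteChanged = cloudUpdatedAt !== baseline.cloudUpdatedAt; // remote side: timestamp
  if (localChanged && remoteChanged) return "conflict";
  if (localChanged) return "local-changes";
  if (remoteChanged) return "remote-changes";
  return "in-sync";
}

const base: Baseline = { cloudUpdatedAt: "2024-05-01T10:00:00Z", localSha1: "abc" };
const status = syncStatus(base, "def", "2024-05-02T09:00:00Z");
```

Nothing in the function picks a winner when both sides changed; it only reports conflict, which matches the rule that the human decides.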
Snapshots (full shell + region array) are written to .htmldocs/versions/<docId>/ as timestamped JSON. The app auto-captures ~30 s after edits settle; users can also name versions. Pruning keeps named versions and the most recent forever, capping only unnamed auto-snapshots. Restoring snapshots the current state first, so a restore is itself undoable.
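The pruning rule can be sketched as a filter over version metadata. The cap value is not stated here, so it is a parameter; VersionMeta, pruneVersions, and the choice not to count the most recent version against the cap are assumptions made for illustration.

```typescript
// Hypothetical metadata shape; a real snapshot would also carry shell and regions.
interface VersionMeta {
  id: string;
  createdAt: number; // epoch milliseconds
  name?: string;     // present only for user-named versions
}

// Keep every named version and the most recent one; cap the remaining auto-snapshots.
function pruneVersions(versions: VersionMeta[], maxAuto: number): VersionMeta[] {
  const newestFirst = versions.slice().sort((a, b) => b.createdAt - a.createdAt);
  let autoKept = 0;
  return newestFirst.filter((version, index) => {
    if (version.name !== undefined) return true; // named: never pruned
    if (index === 0) return true;                // most recent: never pruned
    autoKept += 1;
    return autoKept <= maxAuto;                  // older auto-snapshots: capped
  });
}

const history: VersionMeta[] = [
  { id: "v1", createdAt: 1 },
  { id: "v2", createdAt: 2 },
  { id: "v3", createdAt: 3, name: "Before restructure" },
  { id: "v4", createdAt: 4 },
  { id: "v5", createdAt: 5 },
];
const kept = pruneVersions(history, 1).map((v) => v.id);
```

With a cap of 1, the result keeps v5 (most recent), v4 (the one allowed auto-snapshot), and v3 (named), and drops v2 and v1.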
Threaded comments are stored as a flat DbCommentRow[] in .htmldocs/comments/<docId>.json — the same row shape the web app uses in Supabase, ready for future cloud sync. The renderer builds the tree with buildCommentTree(). Selected-text comments render as <mark> highlights in the Shadow DOM; deleting a comment cascades to its replies.
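Building a tree from flat rows and cascading a delete are both short operations over a parent pointer. The sketch below is not the app's buildCommentTree: the row shape (id, parent_id, body) and the function names are assumptions, chosen only to show the technique.

```typescript
// Hypothetical flat row; the real DbCommentRow has more fields.
interface FlatComment {
  id: string;
  parent_id: string | null;
  body: string;
}

interface CommentNode extends FlatComment {
  replies: CommentNode[];
}

// Two passes: index every row by id, then attach each row to its parent.
function buildTree(rows: FlatComment[]): CommentNode[] {
  const nodes = new Map<string, CommentNode>();
  rows.forEach((row) => nodes.set(row.id, { ...row, replies: [] }));
  const roots: CommentNode[] = [];
  rows.forEach((row) => {
    const node = nodes.get(row.id)!;
    const parent = row.parent_id === null ? undefined : nodes.get(row.parent_id);
    // Rows whose parent is missing are surfaced as roots rather than dropped.
    if (parent) parent.replies.push(node);
    else roots.push(node);
  });
  return roots;
}

// Remove a comment and, transitively, every reply beneath it.
function deleteCascade(rows: FlatComment[], id: string): FlatComment[] {
  const doomed = new Set<string>([id]);
  let grew = true;
  while (grew) {
    grew = false;
    for (const row of rows) {
      if (row.parent_id !== null && doomed.has(row.parent_id) && !doomed.has(row.id)) {
        doomed.add(row.id);
        grew = true;
      }
    }
  }
  return rows.filter((row) => !doomed.has(row.id));
}

const rows: FlatComment[] = [
  { id: "c1", parent_id: null, body: "Is this section current?" },
  { id: "c2", parent_id: "c1", body: "Yes, updated last week." },
  { id: "c3", parent_id: "c2", body: "Thanks." },
  { id: "c4", parent_id: null, body: "Typo in the title." },
];
const tree = buildTree(rows);
const remaining = deleteCascade(rows, "c1").map((row) => row.id);
```

Storing rows flat keeps the on-disk JSON identical in shape to a database table, and the tree is a cheap derived view rebuilt whenever the rows change.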
To tie the layers together, here's the full life of a single keystroke that edits text — crossing the renderer, the IPC bridge, the vault, and the watcher.
1. The user types in a contenteditable region inside the Shadow DOM. The viewer debounces and calls window.htmldocs.vault.saveRegion(docId, key, content).
2. The call crosses the IPC bridge to vault.saveRegion. The in-memory DocModel updates instantly; a 1 s debounced write is scheduled.
3. The watcher sees the resulting file change and recognizes the app's own write by its hash. (An external edit would emit vault:docChanged instead.)
4. A version snapshot is later written to .htmldocs/versions/, unless it's identical to the latest.
The same shape repeats everywhere: the renderer never touches disk; main owns all state; the file system is the single source of truth; and a hash-based echo check keeps the app's own writes from looping back through the watcher.