HTML Docs → Wikis & Second Brains

01 Why this, why now

HTML Docs already has the hard parts of a documentation platform: instant CLI/agent publishing, collaborative region-based editing, comments with AI answers, and folders with cascading permissions. What it lacks is exactly three things — and all three are additive, not rewrites:

Gap	Today
No multi-page sites	Every published page is standalone. Folders group docs for permissions, not navigation. "Tabs" exist in the editor but only the root tab publishes.
No search	Zero server-side full-text search. The dashboard filters titles client-side; the chat source picker does a title-only `ilike`.
No repo bridge	No way to sync a repo's `/docs` markdown into a living site. The API already accepts markdown — but one doc per request, no manifest, no idempotence.

The differentiator against Notion/Confluence (internal) and Mintlify/GitBook (external) is the agent surface that already exists: hdk_ API keys with per-agent attribution, an MCP server, a published skill, and account-level webhooks. Nobody else's docs product treats the agent as a first-class maintainer.

Decisions locked

One "sites" primitive serves both wedges — internal wiki and public docs site are the same thing with different access settings.
App-canonical + repo sync — HTML Docs is the source of truth; markdown syncs in from repos via CLI/CI (one-way; two-way deferred).
Knowledge layer v1 = search + "ask this wiki" — full-text search and grounded AI answers first; wikilinks/backlinks/graph later.

02 The core idea: a site is a published folder

Folders already have everything a wiki needs for structure and access: nesting (25 deep), collaborator roles (viewer → admin), cascading permissions, visibility, share codes. So we don't invent a new container — we add a thin sites table that points at a folder and owns only the publishing concerns: the URL slug, theme, index page, and (later) a custom domain. Pages are ordinary documents inside the folder, each gaining a page slug and a position.

Fig. 1 — Blue marks what's new. The folder tree, permissions, and page-rendering pipeline are untouched; the chrome injection reuses the exact string-injection seam that already adds OG tags and the view tracker to every published page.

Internal wiki vs public docs site is one switch: a private site runs a folder-role check in the serving route (session cookies are already available there) and responds with no-store cache headers; a public site keeps today's CDN caching and gets indexed.
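
The switch described above can be sketched as one pure function that the serving route calls before rendering. This is a minimal sketch, not the actual route: the names `SiteAccess`, `ServeDecision`, and `decideServe`, the `/login` redirect target, and the exact `s-maxage=60` value for public sites are all assumptions made for illustration.

```typescript
// Hypothetical types: the plan only states "private = role check + no-store,
// public = today's CDN caching". Everything named here is illustrative.
type SiteAccess = "public" | "private";

interface ServeDecision {
  status: 200 | 302;
  headers: Record<string, string>;
}

function decideServe(access: SiteAccess, hasFolderRole: boolean): ServeDecision {
  if (access === "public") {
    // Assumed value: a short shared-cache lifetime standing in for today's CDN caching.
    return { status: 200, headers: { "Cache-Control": "public, s-maxage=60" } };
  }

  // Private sites must never be stored by the edge cache or indexed.
  const headers: Record<string, string> = {
    "Cache-Control": "private, no-store",
    "X-Robots-Tag": "noindex",
  };

  if (!hasFolderRole) {
    // Redirect target is a placeholder; the redirect itself is also uncacheable.
    return { status: 302, headers: { ...headers, Location: "/login" } };
  }
  return { status: 200, headers };
}
```

Keeping the decision in one function means the "one wrong cache header" risk can be covered by a unit test instead of by inspection of the route.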

03 The four phases

1 · Sites primitive: folder → published multi-page site with nav · effort L

2 · Repo sync: md → site via CLI/CI, agents maintain · effort M

3 · Knowledge: full-text search + ask-this-wiki · effort M/L

4 · Growth: domains · SEO · teams · imports · effort L, parallel tracks

Each phase ships independently and is useful on its own.

Phase 1 — Sites primitive

ships the product · effort L

Everything needed to publish a folder as a navigable site at /site/<site>/<page>.

Schema: new sites table (1:1 with a folder); documents gain site_id, page_slug, page_position. Page slugs unique per site; site slugs share the existing namespace with single-doc slugs (checked in both directions).
Routing: the single-slug route becomes a catch-all. One segment behaves exactly as today (zero regression for every existing published page), then falls through to site lookup; two segments resolve site + page.
Shared chrome: sidebar (logo, page tree grouped by subfolder sections, search box), breadcrumbs, prev/next — rendered in Declarative Shadow DOM so the page's own CSS can't bleed into it. Theme knobs (accent, logo) per site; an opt-out for full-viewport pages like dashboards.
Freshness over fan-out: nav is one indexed query per request; a renamed page appears across a 250-page site within the existing 60-second cache window instead of triggering 250 cache invalidations.
Private wikis: role check in the route + private, no-store headers + noindex. Public sites keep CDN caching.
Editor: "Publish as site" on folders, page reordering, "new page in site", and a link picker that inserts links to sibling pages.
Billing: each published site page consumes one existing plan page-slot — no pricing changes needed to ship.
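
The routing rule in the list above (one segment behaves as today, then falls through to a site lookup; two segments resolve site + page) could look roughly like the following. It is a sketch under stated assumptions: `resolveRoute`, the `Resolved` union, and the in-memory `Set`/`Map` lookups are hypothetical stand-ins for whatever database queries the real catch-all route would run, and `segments` is assumed to be the path segments the catch-all receives.

```typescript
// Hypothetical result type for the catch-all route.
type Resolved =
  | { kind: "doc"; slug: string }
  | { kind: "site-index"; site: string }
  | { kind: "site-page"; site: string; page: string }
  | { kind: "not-found" };

function resolveRoute(
  segments: string[],
  docSlugs: Set<string>, // existing single-doc slugs
  sites: Map<string, Set<string>>, // site slug -> page slugs in that site
): Resolved {
  const [first, second] = segments;
  if (first === undefined || segments.length > 2) return { kind: "not-found" };

  if (second === undefined) {
    // One segment: the existing single-doc lookup runs first, so every
    // already-published page resolves exactly as before.
    if (docSlugs.has(first)) return { kind: "doc", slug: first };
    // Only then fall through to the site lookup (serves the site's index page).
    if (sites.has(first)) return { kind: "site-index", site: first };
    return { kind: "not-found" };
  }

  // Two segments: site + page.
  if (sites.get(first)?.has(second)) {
    return { kind: "site-page", site: first, page: second };
  }
  return { kind: "not-found" };
}
```

Because site slugs and single-doc slugs share one namespace, the order of the two one-segment lookups should never matter in practice; checking documents first simply makes the no-regression guarantee explicit.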

Phase 2 — Repo sync & agent maintenance

the wedge · effort M

The "second brain that stays current" story: a repo's /docs folder becomes a site, and agents keep it alive.

Fig. 2 — One-way sync in; one-way notify out. When someone edits a synced page in the app, the repo owner's webhook fires with the source path so a bot can open an issue or PR.

New sync endpoint takes a manifest of pages (path, slug, title, order, hash, markdown) and upserts idempotently — unchanged files cost nothing, so CI runs are cheap.
Reuses the existing machinery: markdown→HTML conversion, document creation, and content replacement (which already snapshots a version first — in-app edit history survives syncs).
CLI + GitHub Action: npx @html-docs/cli sync ./docs --site acme; directory structure becomes nav sections; README/index becomes the index page.
Agent skill updated so any coding agent can create, sync, and maintain a project wiki with the keys/attribution that already exist.
Known limitation (documented): a full content replacement can detach comments anchored to rewritten regions; region-stable diffing is a later refinement.
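
To make the idempotence claim concrete, here is a minimal in-memory sketch of the upsert logic. The manifest fields (path, slug, title, order, hash, markdown) come from the list above; everything else is an assumption for illustration: the `StoredPage` shape, keying pages by source path, the `applyManifest` name, and the idea that the hash covers everything the caller wants compared. A real endpoint would convert the markdown and replace document content where this sketch only updates a map.

```typescript
interface ManifestPage {
  path: string;
  slug: string;
  title: string;
  order: number;
  hash: string;
  markdown: string;
}

// Hypothetical stored shape; the real documents table will differ.
interface StoredPage {
  slug: string;
  title: string;
  order: number;
  hash: string;
  archived: boolean;
}

interface SyncResult {
  created: string[];
  updated: string[];
  unchanged: string[];
  archived: string[];
}

function applyManifest(store: Map<string, StoredPage>, manifest: ManifestPage[]): SyncResult {
  const result: SyncResult = { created: [], updated: [], unchanged: [], archived: [] };
  const seen = new Set<string>();

  for (const page of manifest) {
    seen.add(page.path);
    const existing = store.get(page.path);

    // Same hash on a live page: skip all work, which is what keeps CI runs cheap.
    if (existing && !existing.archived && existing.hash === page.hash) {
      result.unchanged.push(page.path);
      continue;
    }

    // A real implementation would snapshot a version and replace content here.
    store.set(page.path, {
      slug: page.slug,
      title: page.title,
      order: page.order,
      hash: page.hash,
      archived: false,
    });
    (existing ? result.updated : result.created).push(page.path);
  }

  // Pages missing from the manifest are archived, never destroyed.
  store.forEach((stored, path) => {
    if (!seen.has(path) && !stored.archived) {
      stored.archived = true;
      result.archived.push(path);
    }
  });

  return result;
}
```

Running the same manifest twice should report every page as unchanged on the second run, which matches the P2 acceptance check later in this document.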

Phase 3 — Knowledge layer v1

the brain · effort M/L

Full-text search without a reindex pipeline: a generated tsvector column directly on the region table (the text already lives there), kept fresh by Postgres itself on every write. No dirty flags, no cron, and per-region snippets/deep-links come free. Title matches boosted in ranking.
Two surfaces: the site search box in the Phase-1 chrome (public for public sites; role-gated for private wikis), and real workspace search in the dashboard replacing today's title-only filter.
"Ask this wiki": Docsmith gains a retrieval tool over the site's pages and answers with page citations. Ships to members first; a public ask-widget on published sites comes later, gated by plan + owner-billed credits + durable rate limiting (the current in-memory limiter isn't enough for public abuse control).
Embeddings-ready: retrieval hides behind one interface, so swapping Postgres FTS for pgvector later changes the implementation, not the callers.
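
The "one interface" idea can be sketched as follows. This is illustrative only: `Retriever`, `Hit`, `Region`, `createInMemoryRetriever`, and `citedPages` are hypothetical names, the scoring is a toy term match with a title boost rather than Postgres ranking, and the interface is synchronous here for brevity where a database-backed version would be asynchronous. The point is that the caller depends only on `Retriever`, so a Postgres FTS or pgvector implementation could be swapped in without touching it.

```typescript
interface Hit {
  pageSlug: string;
  regionId: string;
  snippet: string;
  score: number;
}

// The single seam callers depend on (hypothetical shape).
interface Retriever {
  search(siteId: string, query: string, limit: number): Hit[];
}

interface Region {
  siteId: string;
  pageSlug: string;
  pageTitle: string;
  regionId: string;
  text: string;
}

// Toy stand-in for the Postgres FTS implementation: +1 per term found in the
// region text, +2 per term found in the page title (the "title boost").
function createInMemoryRetriever(regions: Region[]): Retriever {
  return {
    search(siteId: string, query: string, limit: number): Hit[] {
      const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
      const hits: Hit[] = [];
      for (const region of regions) {
        if (region.siteId !== siteId) continue; // results are site-scoped
        const body = region.text.toLowerCase();
        const title = region.pageTitle.toLowerCase();
        let score = 0;
        for (const term of terms) {
          if (body.indexOf(term) !== -1) score += 1;
          if (title.indexOf(term) !== -1) score += 2;
        }
        if (score > 0) {
          hits.push({
            pageSlug: region.pageSlug,
            regionId: region.regionId,
            snippet: region.text.slice(0, 80),
            score,
          });
        }
      }
      return hits.sort((a, b) => b.score - a.score).slice(0, limit);
    },
  };
}

// A caller such as "ask this wiki" sees only the interface and returns the
// distinct pages it would cite, best match first.
function citedPages(retriever: Retriever, siteId: string, question: string): string[] {
  const pages: string[] = [];
  for (const hit of retriever.search(siteId, question, 5)) {
    if (pages.indexOf(hit.pageSlug) === -1) pages.push(hit.pageSlug);
  }
  return pages;
}
```

The "swap test" in the P3 acceptance criteria would then amount to running the same caller against two `Retriever` implementations.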

Phase 4 — Growth & monetization

effort L · independent tracks

Track	Approach
Custom domains (first)	Host routing in middleware rewrites `docs.acme.com/*` to the site; Vercel Domains API + TXT verification from a settings panel. Business plan.
SEO	Per-site sitemap.xml + robots.txt from the same route; private sites are already noindex from Phase 1.
Teams / seats	Real org model is its own initiative. Interim: folder collaborators already give per-site teams; price by sites/pages, defer seats.
Versioned docs	Defer — sync already snapshots every version, so "view page as of vX" is a cheap read-only render later.
Import funnels	Notion / Confluence / GitBook exports are zips of md+html — they funnel through the Phase-2 sync endpoint as guided flows.
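
For the custom-domains track, the host rewrite could be as small as the function below. This is a sketch, not the middleware itself: `rewriteForCustomDomain` and the `verifiedDomains` map (hostname to site slug, populated only after TXT verification succeeds) are assumptions, and the target path follows the /site/<site>/<page> shape from Phase 1.

```typescript
// Returns the internal path to rewrite to, or null when the host is not a
// verified custom domain and the request should be handled as usual.
function rewriteForCustomDomain(
  host: string,
  pathname: string,
  verifiedDomains: ReadonlyMap<string, string>, // hostname -> site slug (hypothetical store)
): string | null {
  // Host headers are case-insensitive and may carry a port.
  const hostname = host.toLowerCase().replace(/:\d+$/, "");
  const siteSlug = verifiedDomains.get(hostname);
  if (siteSlug === undefined) return null;

  // "/" maps to the site's index page; anything else keeps its path.
  const rest = pathname === "/" ? "" : pathname;
  return `/site/${siteSlug}${rest}`;
}
```

With a map containing `docs.acme.com → acme`, a request for `docs.acme.com/getting-started` would be rewritten to `/site/acme/getting-started`, and an unknown host falls through untouched.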

04 Validate these three things before writing Phase-1 code

Risk	How we de-risk it
Chrome injection vs arbitrary user HTML	Published pages are arbitrary HTML — full-viewport dashboards, flex bodies, fixed headers. Prototype the shadow-DOM sidebar against 15–20 real production pages before committing. Fallback: iframe shell (worse for SEO).
Private wikis on a CDN	One wrong cache header leaks a private wiki to strangers. Verify on a Vercel preview that sessions are readable in the serving route and that `no-store` responses never hit the edge cache.
FTS backfill at scale	Adding a generated column rewrites every region row. Dry-run the migration on a staging copy with real row counts; fallback is a trigger-maintained column with identical query shape.

05 How we'll know each phase works

P1: a 10-page folder publishes as a site with nav/breadcrumbs/prev-next; a private site 302s logged-out visitors and never appears in the edge cache; every existing single-doc URL, OG image, and print view is regression-tested unchanged.
P2: syncing a real repo's docs twice makes the second run a no-op (hash idempotence); renames/deletes archive rather than destroy; an in-app edit of a synced page fires the repo webhook with the source path.
P3: seeded-corpus searches return ranked, snippeted, site-scoped results; private search is denied without a role; ask-this-wiki cites pages; the retrieval interface passes a swap test.
P4: a test custom domain serves end-to-end with correct canonical URLs.