Internal migration plan · v3

Hosting HTML-Docs Internally at Meta
Migration Plan

One codebase. Two deployment targets. Keep the external Vercel app running for outside users while shipping the same product internally on Meta's Nest platform — with first-class analytics and keyboard shortcuts on both.

Context

HTML-Docs currently runs on Vercel + Supabase (Auth + Postgres) + Liveblocks + OpenAI. None of these are usable for an internal Meta deployment: external SaaS is blocked by data policy, Supabase email/password isn't an accepted internal auth method, and Vercel isn't a Meta hosting target. We need to deploy this app to employees on the corp network while changing as little of the editor logic as possible — and the external version must keep running for outside users.

The right target is Nest (fbsource/nest/apps/) — Meta's internal Next.js platform. It runs the exact stack we already use (Next.js 16 + React 19 + Node 22), supports WebSockets and SSE through Proxygen/X2P, gives you employee auth at the edge for free, and ships first-class libraries for every infra dependency we have (XDB for Postgres, Manifold for storage, @nest/llm for OpenAI, Chronos for cron). The editor code itself — components/shadow-dom-viewer.tsx, lib/html-parser.ts, lib/interactive-patterns.ts, all the Tiptap / Radix UI — does not change.

User-confirmed scope

Maintenance model — one codebase, two targets

Because the external version keeps serving outside users while the internal version stands up for Meta employees, we need ONE codebase that builds to both. The pattern is a thin adapter layer + a build-time DEPLOY_TARGET env var (public | meta) that picks which implementation gets bundled. Editor code calls adapter interfaces, never vendor SDKs directly. Next.js tree-shakes the unused branch — the public build ships zero Nest code and the Meta build ships zero Supabase code.

DEPLOY_TARGET = public | meta (build-time switch)

EDITOR CORE (shared, written once; ~80% of the codebase)
  region parser · Shadow DOM viewer · Tiptap · comments · versions
  AI tool definitions · webhook state machine · Radix/shadcn UI
  lib/analytics · lib/shortcuts · all pages and API routes

ADAPTER INTERFACES · lib/{db, auth, llm, storage, realtime}

PUBLIC build → Vercel (lib/*/impl/public/)
  • Supabase Postgres
  • Supabase Auth
  • @ai-sdk/openai
  • Supabase Storage
  • Liveblocks
  • Vercel Cron
  • /admin/stats dashboard

META build → Nest (lib/*/impl/meta/)
  • XDB + Drizzle
  • @nest/intern-auth
  • @nest/llm
  • Manifold (@nest/ent)
  • WS stub → ws (Phase 4)
  • Chronos
  • Unidash dashboard
Figure 1. Editor core (including the new lib/analytics and lib/shortcuts modules) is shared; only the bottom layer swaps per target.

The contract

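// lib/db/index.ts: dispatcher. DEPLOY_TARGET must be a build-time constant
// so the bundler can drop the unused branch.
export const db = process.env.DEPLOY_TARGET === 'meta'
  ? await import('./impl/meta')
  : await import('./impl/public')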
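
A slightly more defensive way to resolve the target can be sketched as follows. This is a minimal sketch, not the plan's actual code: `resolveTarget` and `pickImpl` are hypothetical helpers, and the two in-memory objects only stand in for `./impl/public` and `./impl/meta`. It assumes DEPLOY_TARGET defaults to public.

```typescript
// Hypothetical helpers; the names are illustrative and not part of the plan.
type DeployTarget = 'public' | 'meta'

// Resolve the target from an env-like object. Defaults to 'public' and
// rejects anything else, so a typo fails loudly instead of silently
// selecting the wrong impl.
function resolveTarget(env: Record<string, string | undefined>): DeployTarget {
  const raw = env.DEPLOY_TARGET ?? 'public'
  if (raw === 'public' || raw === 'meta') return raw
  throw new Error(`Unknown DEPLOY_TARGET: ${raw}`)
}

// Pick one of two implementations that satisfy the same interface.
function pickImpl<T>(target: DeployTarget, impls: Record<DeployTarget, T>): T {
  return impls[target]
}

// Stand-ins for ./impl/public and ./impl/meta (assumed shapes).
interface DocumentsApi {
  backend(): string
}
const publicImpl: DocumentsApi = { backend: () => 'supabase-postgres' }
const metaImpl: DocumentsApi = { backend: () => 'xdb-drizzle' }

const chosen = pickImpl(resolveTarget({ DEPLOY_TARGET: 'meta' }), {
  public: publicImpl,
  meta: metaImpl,
})
console.log(chosen.backend()) // prints "xdb-drizzle"
```

A static lookup like `pickImpl` keeps both impls in the bundle, so it suits tests and scripts rather than the production dispatcher. The conditional dynamic import in the contract is what allows the unused branch to be dropped, assuming DEPLOY_TARGET is inlined at build time (for example through the `env` key in next.config).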

File layout

lib/
├── db/
│   ├── index.ts          ← dispatcher: imports public or meta
│   ├── types.ts          ← shared types both impls satisfy
│   └── impl/
│       ├── public/       ← current Supabase code, moved here
│       │   ├── client.ts
│       │   ├── documents.ts
│       │   └── …
│       └── meta/         ← new Drizzle/XDB code
│           ├── schema.ts
│           ├── documents.ts
│           └── …
├── auth/ storage/ llm/ realtime/   { index, types, impl/public, impl/meta }
├── analytics/            single impl, writes through lib/db
└── shortcuts/            pure editor-core, no impl split
Figure 2. Five infra adapters (split). Two shared modules (analytics, shortcuts) live alongside — no public/meta split needed.

Cost model

Cheap (~80% of code): Editor core, Tiptap, comments, region logic, AI tool defs, UI, analytics, shortcuts. Written once.
Expensive (~20% adapter): Two impls per adapter. New DB column = two migrations. AI tested on both Plugboard and OpenAI.
Permanent tax (~5–10%): Feature-time overhead. Manageable if vendor SDK imports never leak past the adapter boundary.
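
The "never leak past the adapter boundary" condition is mechanical enough to check in code. The sketch below shows the kind of predicate a lint rule or CI script could apply. The vendor prefixes come from this plan; the function names and the exact path pattern are assumptions made for illustration.

```typescript
// Vendor SDK prefixes named in this plan. The list is illustrative, not exhaustive.
const VENDOR_PREFIXES = ['@supabase/', '@liveblocks/', '@ai-sdk/openai', '@nest/']

// Assumed convention: vendor SDKs may only appear under lib/<adapter>/impl/<target>/.
const IMPL_PATH = /^lib\/[^/]+\/impl\/(public|meta)\//

// True when the import specifier points at a vendor SDK.
function isVendorImport(importSource: string): boolean {
  return VENDOR_PREFIXES.some((prefix) => importSource.startsWith(prefix))
}

// True when `filePath` is allowed to import `importSource`.
function isImportAllowed(filePath: string, importSource: string): boolean {
  if (!isVendorImport(importSource)) return true
  return IMPL_PATH.test(filePath)
}

console.log(isImportAllowed('components/chat-panel.tsx', '@supabase/supabase-js')) // false
console.log(isImportAllowed('lib/db/impl/public/client.ts', '@supabase/supabase-js')) // true
```

A stricter variant could also reject `@nest/*` under `impl/public/` and `@supabase/*` under `impl/meta/`, which is the property the per-target clean build check verifies after the fact.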

Escape hatch: if a second team starts maintaining the internal build, or the two versions need genuinely different feature sets, switch to extracting the editor core as an internal npm package consumed by two thin app shells. Until then, the single-repo adapter model is the right cost/benefit.

Detailed architecture

This section is the implementation companion to the plan above. It zooms into request lifecycle, deployment topology, authentication, the AI request flow, the database schema, per-adapter TypeScript contracts, and environment variables — enough detail to build either target without re-deriving the design.

Request lifecycle (side-by-side)

The same HTTP request flows through symmetric layers on both targets. The middle band — Next.js routing, server actions, adapter interfaces — is identical. Only the outer edges differ: what's in front of Next.js (Vercel CDN vs X2P/Proxygen) and what's behind the adapters (Supabase vs XDB, OpenAI vs Plugboard, Liveblocks vs a WS sidecar). That's the entire architectural payoff of the adapter model.

Layer | PUBLIC · Vercel | META · Nest
CLIENT | Browser · @ai-sdk/react · liveblocks/client; public internet | Browser · @ai-sdk/react · (no liveblocks); corp network only
EDGE | Vercel CDN / Edge Network; TLS termination · static caching · geo | X2P · Proxygen; corp ingress · WebSocket upgrade · CAT/OIDC
AUTH GATE | middleware.ts · supabase cookie refresh; redirects /settings → /auth/login if unauthed | proxy.ts · @nest/intern-auth · useOIDC: true; injects unixname/FBID into request headers
SHARED (identical on both targets) | Next.js route handler · server action; app/api/* · lib/actions/*; lib/{db, auth, llm, storage, realtime, analytics, shortcuts} adapter interfaces | (same)
IMPL | lib/*/impl/public/: @supabase/* · @ai-sdk/openai · @liveblocks/* · ai | lib/*/impl/meta/: @nest/drizzle-xdb · @nest/llm · @nest/ent · @nest/intern-auth
DATA | Supabase Postgres · Storage; OpenAI · Liveblocks rooms | XDB MySQL · Manifold; Plugboard (Claude/Llama/GPT) · own WS
OPS | Vercel Cron (1m) · @vercel/analytics → DROPPED; /api/cron/webhook-deliveries · own analytics module | Chronos cron · ODS / Scuba (optional tee); curl bearer-checked route · same analytics module

All requests flow top-to-bottom. Only the OUTER bands change between targets; the middle (purple-dashed) is one codebase. DEPLOY_TARGET=public|meta selects which impl folder gets bundled at build time.
Figure 3. Symmetric request flow on both targets. The shared band is the same code, the same imports, the same routes — only the edges (CDN/proxy) and the leaf services (DB, LLM, storage, realtime) differ.

Deployment topology

Where things actually run, and what crosses which network boundary.

PUBLIC · global internet
  Public internet: end users · @vercel/analytics CDN · third-party referrers
  Vercel Edge Network: TLS · static caching · geo routing · html-docs.com · *.vercel.app previews
  Vercel Serverless Functions · Node 22: Next.js 16 (Turbopack) · maxDuration up to 120s · Vercel Cron · 1m → /api/cron/webhook-deliveries · env: SUPABASE_* · OPENAI_API_KEY · LIVEBLOCKS_SECRET_KEY
  External SaaS providers:
    → Supabase: Postgres + Auth + Storage
    → OpenAI: chat completions + responses API
    → Liveblocks: realtime rooms + presence
    → Google: docs/slides public export (scraped)
    → Yahoo Finance / Ticketmaster (optional)
  Each crosses the public internet from Vercel functions; env-var keyed; rate limits per vendor's plan.

META · corp network only
  Corp browser: Meta-issued laptop · corp VPN / dogfood · unixname session via SSO
  X2P · Proxygen edge: OIDC handshake · CAT injection · WS upgrade · *.nest.x2p.facebook.net · internalmeta.com proxy
  Nest FaaS · Tupperware container: Next.js 16 (Turbopack) · Node 22 · scale-to-zero · Chronos cron · curl bearer-checked route · env via nest env set · secrets via Configerator
  Meta internal services:
    → XDB: MySQL shard via @nest/drizzle-xdb
    → Manifold: blob storage via @nest/ent (CDN URLs)
    → Plugboard: Claude / Llama / GPT routing
    → Scuba/ODS (optional tee for analytics_events)
    → Service Users / Keychain (for API tokens)
  All flow over corp backbone; no public internet egress. Data ownership stays inside Meta.
Figure 4. Two parallel topologies. Public uses a CDN+functions+SaaS pattern; meta uses corp-ingress+FaaS+internal services. Same app, different blast radius.

Authentication flows

Both targets land the viewer's identity in the same shape ({ id, name, email }) before any server action runs — but via very different handshakes.

Step | Public (Supabase) | Meta (OIDC)
1. First request | Browser sends cookies if present | Browser sends cookies if present
2. Unauthed? | middleware.ts redirects /settings → /auth/login | X2P bounces to OIDC IdP, sets short-lived CAT, redirects back
3. Login | email/password (or OAuth) → Supabase issues JWT in cookie | OIDC IdP issues 24h token → X2P stores in CAT
4. Authed request | middleware.ts calls supabase.auth.getUser() → refreshes cookie | proxy.ts calls @nest/intern-auth → injects unixname/FBID headers
5. Server action reads viewer | createClient().auth.getUser() → { id: uuid } | getViewer() → { id: fbid, unixname, name, email }
6. Session lifetime | ~1 hour token, refresh-on-touch | 24 hours (OIDC default)
7. Sign out | supabase.auth.signOut() → clears cookie | X2P drops CAT; user re-handshakes on next visit

The lib/auth/ adapter normalizes both into a single getViewer(): Promise<Viewer | null> call so server actions don't branch on target.
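
How the two handshakes could collapse into one Viewer shape is sketched below. This is a sketch under stated assumptions: the `PublicUser` shape is an assumed subset of what the Supabase client returns, and the header names (`x-fbid`, `x-unixname`, `x-display-name`, `x-email`) are hypothetical placeholders, since the plan does not say which headers the proxy injects.

```typescript
interface Viewer {
  id: string // uuid (public) | fbid (meta)
  name: string
  email: string
  unixname?: string // meta only
}

// Assumed subset of the user object returned on the public side.
interface PublicUser {
  id: string
  email?: string
  user_metadata?: { full_name?: string }
}

// Public: normalize the auth user into a Viewer, or null when unauthed.
function viewerFromPublicUser(user: PublicUser | null): Viewer | null {
  if (!user || !user.email) return null
  return {
    id: user.id,
    name: user.user_metadata?.full_name ?? user.email,
    email: user.email,
  }
}

// Meta: read identity from injected request headers.
// All header names here are hypothetical.
function viewerFromMetaHeaders(headers: Map<string, string>): Viewer | null {
  const fbid = headers.get('x-fbid')
  const unixname = headers.get('x-unixname')
  const email = headers.get('x-email')
  if (!fbid || !unixname || !email) return null
  return {
    id: fbid,
    name: headers.get('x-display-name') ?? unixname,
    email,
    unixname,
  }
}
```

Each impl's `getViewer()` would wrap one of these functions, so server actions see only `Viewer | null` and never learn which handshake produced it.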

AI request flow (Docsmith chat)

Most architecturally interesting path because it spans the client (streaming SSE), the LLM adapter (provider selection), and tool calling (which gates writes through user-approval cards).

Actors: Browser (chat-panel.tsx) · Route handler (app/api/chat/route.ts) · LLM adapter (lib/llm) · SDK + impl (impl/{public, meta}) · Provider (OpenAI / Plugboard)

  1. Browser → Route handler: POST /api/chat (SSE) with messages[], documentId, sources[]
  2. Route handler → LLM adapter: streamText({ model: llm.getModel('chat'), tools })
  3. LLM adapter → SDK + impl: AI SDK streamText call. public: openai('gpt-5.5') · meta: 'claude-sonnet-4.5'
  4. SDK + impl → Provider: HTTPS / Thrift; tools serialized to provider's native schema
  5. Provider → Browser: stream tokens (SSE chunks pass through all 5); UIMessage parts emitted as they arrive
  6. tool-call event: proposeEdits. No execute(); gated by user's accept/reject card
  7. backfillToolResults(): synthesize "delivered" output
  8. onFinish → admin.from('ai_chat_messages').insert (via @/lib/db once that migration lands)
  9. User clicks Accept on the card
  10. applyChatEdit(documentId, regionKey, newHtml): server action → db.regions.update → bust cache
Figure 5. Docsmith chat sequence. The middle three actors (route, adapter, SDK) are identical on both targets — only the rightmost provider changes. Tool calls are gated by client-side accept/reject; the model never sees writes happen automatically.
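
The accept/reject gate in the sequence above can be modeled as a small state machine. The sketch below is illustrative only: `ProposalGate` is a hypothetical name, the store is in-memory, and the `applyEdit` callback stands in for the `applyChatEdit` server action (synchronous here for brevity, whereas a real server action is async).

```typescript
type ProposalStatus = 'pending' | 'accepted' | 'rejected'

interface EditProposal {
  toolCallId: string
  documentId: string
  regionKey: string
  newHtml: string
  status: ProposalStatus
}

// Stand-in for the applyChatEdit server action.
type ApplyEdit = (documentId: string, regionKey: string, newHtml: string) => void

class ProposalGate {
  private proposals = new Map<string, EditProposal>()
  private applyEdit: ApplyEdit

  constructor(applyEdit: ApplyEdit) {
    this.applyEdit = applyEdit
  }

  // Called when the model emits a proposeEdits tool call. Nothing is written yet.
  propose(p: Omit<EditProposal, 'status'>): EditProposal {
    const proposal: EditProposal = { ...p, status: 'pending' }
    this.proposals.set(p.toolCallId, proposal)
    return proposal
  }

  // Called when the user clicks Accept: the only path that performs a write.
  accept(toolCallId: string): void {
    const p = this.requirePending(toolCallId)
    this.applyEdit(p.documentId, p.regionKey, p.newHtml)
    p.status = 'accepted'
  }

  // Called when the user clicks Reject: the proposal is closed without a write.
  reject(toolCallId: string): void {
    this.requirePending(toolCallId).status = 'rejected'
  }

  private requirePending(toolCallId: string): EditProposal {
    const p = this.proposals.get(toolCallId)
    if (!p) throw new Error(`Unknown tool call: ${toolCallId}`)
    if (p.status !== 'pending') throw new Error(`Already ${p.status}: ${toolCallId}`)
    return p
  }
}
```

Because a proposal leaves the pending state exactly once, a double-click on Accept or a replayed request cannot apply the same edit twice.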

Database schema

21 tables across 7 domain groups. Postgres has foreign-key constraints and one trigger (on document_events) that auto-records mutations; XDB drops the FK CONSTRAINTs (app-level invariants instead) and replaces the trigger with an explicit recordEvent() helper called from each mutation.

Core editor: documents (21 cols) · editable_regions · document_versions · document_events (trigger-fed)
Comments: document_comments · document_suggestions
Sharing: shared_documents · document_collaborators
AI chat: ai_chat_messages · chat_sessions · chat_session_messages
Knowledge: account_knowledge · document_appendix_files
Auth / API (public-only): users (auth.users) · agent_api_keys · personal_access_tokens
Webhooks: webhooks · webhook_deliveries
Analytics: page_views · analytics_events (new)
Legend: core (high traffic) · standard · added by this plan · FK (Postgres) / app-invariant (XDB)
Figure 6. Schema grouped by domain. Group color = ownership boundary. Lines = foreign-key relationships in Postgres; in XDB these become app-level checks, since MyRocks/MySQL is deployed without FK CONSTRAINTs at Meta.
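
On the meta side, the trigger and the FK constraints both turn into application code. A minimal sketch of that pattern follows, using in-memory arrays and maps in place of XDB tables. Only `recordEvent()` is named in the plan; its signature, the event type strings, and the other function names are assumptions.

```typescript
interface DocumentEvent {
  documentId: string
  eventType: string
  actorId: string
  occurredAt: Date
}

// In-memory stand-ins for the documents, editable_regions and document_events tables.
const documents = new Map<string, { htmlContent: string }>()
const regions: { documentId: string; regionKey: string }[] = []
const documentEvents: DocumentEvent[] = []

// Explicit replacement for the Postgres trigger on document_events.
function recordEvent(documentId: string, eventType: string, actorId: string): void {
  documentEvents.push({ documentId, eventType, actorId, occurredAt: new Date() })
}

// App-level invariant replacing the FK from editable_regions to documents.
function assertDocumentExists(documentId: string): void {
  if (!documents.has(documentId)) throw new Error(`Document not found: ${documentId}`)
}

// Every mutation records its own event, since no trigger does it implicitly.
function updateHtmlContent(documentId: string, html: string, actorId: string): void {
  assertDocumentExists(documentId)
  documents.get(documentId)!.htmlContent = html
  recordEvent(documentId, 'html_updated', actorId)
}

function insertRegion(documentId: string, regionKey: string, actorId: string): void {
  assertDocumentExists(documentId)
  regions.push({ documentId, regionKey })
  recordEvent(documentId, 'region_created', actorId)
}
```

The cost of this approach is discipline: a mutation that forgets to call `recordEvent()` silently drops history, which is why the verification list checks that document_events rows actually appear.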

Per-adapter TypeScript contracts

Each adapter ships a types.ts defining the shape both impls must satisfy. Snippets below show the minimum-viable shape after the first slice has landed; surfaces grow as more action files migrate.

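// lib/llm/types.ts: LLM adapter
import type { LanguageModel, Tool } from 'ai'

export type LLMCapability = 'chat' | 'refine' | 'review' | 'vision'

export interface LLMAdapter {
  // Maps a capability tag to the concrete model for the current target.
  getModel(capability: LLMCapability): LanguageModel
  // null when the target offers no web-search tool.
  webSearchTool(): Tool | null
}

// lib/db/types.ts: DB adapter (grows per table)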
export interface DBAdapter {
  documents: {
    getById(id: string): Promise<Document | null>
    listByOwner(ownerId: string): Promise<Document[]>
    getIdByShareCode(code: string): Promise<string | null>
    updateHtmlContent(id: string, html: string): Promise<void>
    getKnowledge(id: string): Promise<string>
    getOwnerId(id: string): Promise<string | null>
    updateKnowledge(id: string, knowledge: string): Promise<void>
  }
  regions: {
    listByDocument(documentId: string): Promise<EditableRegion[]>
  }
  accountKnowledge: {
    get(userId: string): Promise<string>
    upsert(userId: string, content: string): Promise<void>
  }
}

// lib/auth/types.ts: Auth adapter (planned)
export interface Viewer {
  id: string             // uuid (public) | fbid (meta)
  name: string
  email: string
  unixname?: string      // meta only
}

export interface AuthAdapter {
  getViewer(): Promise<Viewer | null>
  requireViewer(): Promise<Viewer> // throws if unauthed
}

// lib/storage/types.ts: Storage adapter (planned)
export interface StorageAdapter {
  uploadDocImage(documentId: string, file: Buffer, mime: string):
    Promise<{ url: string; path: string }>
  uploadPdfOriginal(documentId: string, file: Buffer):
    Promise<{ path: string }>
  getPdfDownloadUrl(path: string): Promise<string>
}

// lib/realtime/types.ts: Realtime adapter (stubbed for MVP)
import type * as React from 'react'

export interface RealtimeHooks {
  RoomProvider: React.FC<{ roomId: string; children: React.ReactNode }>
  useMyPresence(): [Presence, (next: Partial<Presence>) => void]
  useOthers(): readonly OtherUser[]
  useBroadcastEvent(): (event: RoomEvent) => void
  useEventListener(handler: (event: RoomEvent) => void): void
}

// lib/analytics/types.ts: shared module (no public/meta split)
// A bodiless function needs `declare` to compile in a .ts file.
export declare function track(eventName: string, properties?: Record<string, unknown>): void

// Written through lib/db; same call sites, same schema, both targets.

Environment variables per target

The variable space partitions cleanly. Most public secrets disappear on meta (the corresponding service is internal); meta-specific config is owned by Configerator and the Nest CLI.

Variable | Public | Meta | Purpose
DEPLOY_TARGET | public (default) | meta | Build-time impl dispatch
NEXT_PUBLIC_SUPABASE_URL | | | Supabase project URL
NEXT_PUBLIC_SUPABASE_ANON_KEY | | | Supabase anon key (client-safe)
SUPABASE_SERVICE_ROLE_KEY | | | Supabase admin key (server-only)
LIVEBLOCKS_SECRET_KEY | | | Liveblocks room auth
OPENAI_API_KEY | | | OpenAI API key
OPENAI_MODEL | optional | | Model override (public capabilities only)
POSTGRES_URL / DATABASE_URL | | | Postgres connection
CRON_SECRET | ✓ (Vercel) | ✓ (Chronos) | Bearer for cron auth, same route on both
ADMIN_USER_EMAILS / ADMIN_USER_IDS | optional | | Admin allowlist (use unixnames on meta)
NEXT_PUBLIC_COLLAB_ENABLED | true | false (Phase 4: true) | Stubs realtime hooks when off
NEXT_PUBLIC_SHARE_BASE_URL | optional | | Custom share URL on Vercel preview
Plugboard config | | via @nest/llm defaults | Model routing + actor identity
XDB connection | | via @nest/drizzle-xdb | Auto-provisioned per Nest app
Manifold bucket | | via Manifold portal + @nest/ent config | Image + PDF blob storage
OIDC config | | via proxy.ts + Configerator | Employee auth at the edge
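
A startup check can catch a mis-partitioned environment early. The sketch below is hedged: the plan's table does not spell out which variables are strictly required, so the `REQUIRED_ENV` lists are an assumed reading (Supabase, Liveblocks and OpenAI secrets on public; only CRON_SECRET on meta, where the rest comes from Nest and Configerator). `missingEnv` is a hypothetical helper.

```typescript
type DeployTarget = 'public' | 'meta'

// Assumed reading of the environment table; adjust to the real requirements.
const REQUIRED_ENV: Record<DeployTarget, readonly string[]> = {
  public: [
    'NEXT_PUBLIC_SUPABASE_URL',
    'NEXT_PUBLIC_SUPABASE_ANON_KEY',
    'SUPABASE_SERVICE_ROLE_KEY',
    'LIVEBLOCKS_SECRET_KEY',
    'OPENAI_API_KEY',
    'CRON_SECRET',
  ],
  meta: ['CRON_SECRET'],
}

// Returns the names of required variables that are unset or empty.
function missingEnv(
  target: DeployTarget,
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED_ENV[target].filter((name) => !env[name])
}

console.log(missingEnv('meta', { CRON_SECRET: 's3cret' })) // prints []
```

Calling `missingEnv` once at boot and failing fast on a non-empty result tends to be cheaper than debugging a half-configured deployment later.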

Target architecture (per-adapter)

Concern | Public impl | Meta impl | Adapter shape
Hosting | Vercel | Nest | Build pipeline level
Auth | Supabase email/password | @nest/intern-auth OIDC | getViewer() facade
Database | Supabase Postgres | XDB MySQL + Drizzle | Drop FK CONSTRAINTs; trigger → recordEvent()
File storage | Supabase Storage | Manifold via @nest/ent | Same { url, path } shape
LLM | OpenAI (gpt-4o, gpt-5.5) | @nest/llm: Claude Sonnet 4.5 / Llama 4 Maverick | Model-string + actorId
Realtime | Liveblocks | Stubbed (NEXT_PUBLIC_COLLAB_ENABLED=false) | Future: own ws server
Cron | vercel.json | Chronos → curl webhook route | Bearer-checked route, no app change
Analytics | Own implementation: single shared lib/analytics/ module on both targets | (same module) | No split needed
Shortcuts | Central registry + useShortcut() hook: pure editor-core, identical on both | (same module) | No split needed

Concrete file changes

A. New scaffolding files (Nest target)

Adapter scaffolding (both targets)

B. Data layer: refactor public, add meta alongside (two-step)

B.1 — Refactor existing public code (no behavior change)

B.2 — Add the meta implementation

Auth pieces

C. LLM adapter (thin)

Public impl re-exports @ai-sdk/openai as-is. Meta impl wraps @nest/llm/server behind the same interface — same streamText / generateText signatures, same tool definition shape. Callers import from @/lib/llm only.

Model selection: each call site passes a capability tag ('chat', 'refine', 'vision'); each impl maps capability → concrete model id (public: gpt-4o / gpt-5.5; meta: claude-sonnet-4.5 / llama4-maverick). Tool definitions stay identical.
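
The capability-to-model mapping could look like the table below. Treat it as a sketch: the plan confirms gpt-5.5 and claude-sonnet-4.5 for chat and llama4-maverick for PDF vision on meta, while every other cell is an assumed placeholder. `getModelId` is a hypothetical helper that returns a model id string, not the SDK's model object.

```typescript
type LLMCapability = 'chat' | 'refine' | 'review' | 'vision'
type DeployTarget = 'public' | 'meta'

// chat (both targets) and vision (meta) follow the plan; the rest are assumptions.
const MODEL_IDS: Record<DeployTarget, Record<LLMCapability, string>> = {
  public: { chat: 'gpt-5.5', refine: 'gpt-4o', review: 'gpt-4o', vision: 'gpt-4o' },
  meta: {
    chat: 'claude-sonnet-4.5',
    refine: 'claude-sonnet-4.5',
    review: 'claude-sonnet-4.5',
    vision: 'llama4-maverick',
  },
}

// Resolve the model id for a capability. OPENAI_MODEL overrides the
// default on the public target only, matching the env table.
function getModelId(
  target: DeployTarget,
  capability: LLMCapability,
  env: Record<string, string | undefined> = {},
): string {
  if (target === 'public' && env.OPENAI_MODEL) return env.OPENAI_MODEL
  return MODEL_IDS[target][capability]
}

console.log(getModelId('meta', 'vision')) // prints "llama4-maverick"
```

Keeping the table typed as `Record<DeployTarget, Record<LLMCapability, string>>` means adding a capability fails to compile until both targets have a model for it.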

D. Realtime stub (no consumer changes)

Move existing liveblocks.config.ts exports into lib/realtime/impl/public/. Meta impl provides hooks with identical TypeScript signatures but no-op runtime: RoomProvider renders children, useMyPresence returns default state, useOthers returns [], broadcast/listener hooks are no-ops.

Consumer components need no changes. app/api/liveblocks-auth/route.ts stays gated for PUBLIC. UI affordances read NEXT_PUBLIC_COLLAB_ENABLED and hide when off.
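
The no-op hooks described above might look like the sketch below. To stay self-contained it avoids React, so `RoomProvider` is a plain function that returns its children rather than a real component, and the `Presence`, `OtherUser` and `RoomEvent` shapes are assumed placeholders for the app's own types.

```typescript
// Assumed placeholder shapes; the real types live in the app.
type Presence = { cursor: { x: number; y: number } | null }
type OtherUser = { connectionId: number; presence: Presence }
type RoomEvent = { type: string }

// Stable references, so a consumer comparing by identity sees no change.
const DEFAULT_PRESENCE: Presence = { cursor: null }
const NO_OTHERS: readonly OtherUser[] = Object.freeze([])

const realtimeStub = {
  // Stand-in for the React component: renders children, opens no connection.
  RoomProvider<T>(props: { roomId: string; children: T }): T {
    return props.children
  },
  // Default presence plus a setter that ignores updates.
  useMyPresence(): [Presence, (next: Partial<Presence>) => void] {
    return [DEFAULT_PRESENCE, () => {}]
  },
  // Nobody else is ever in the room.
  useOthers(): readonly OtherUser[] {
    return NO_OTHERS
  },
  // Broadcasting drops the event.
  useBroadcastEvent(): (event: RoomEvent) => void {
    return () => {}
  },
  // The handler is never invoked.
  useEventListener(_handler: (event: RoomEvent) => void): void {},
}
```

Returning the same frozen array from every `useOthers()` call matters in a real React build, where a fresh `[]` on each render could trigger needless effect re-runs in consumers.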

E. Config (both targets coexist)

F. Webhook system (kept as-is)

lib/webhooks-ops.ts and app/api/cron/webhook-deliveries/route.ts stay shared. Both targets call them. Bearer check (CRON_SECRET) is already there — Vercel Cron and Chronos both pass the same header.
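
For reference, a bearer check of this kind can be sketched as below. This is not the existing route's code: `isAuthorizedCron` and `constantTimeEqual` are hypothetical names, and the sketch assumes the secret arrives as `Authorization: Bearer <CRON_SECRET>`, as the curl command in the verification list suggests.

```typescript
// Compare two strings without exiting early on the first mismatch,
// so response timing reveals less about how much of the secret matched.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false
  let diff = 0
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i)
  }
  return diff === 0
}

// True only when the header carries exactly the configured secret.
// Fails closed when the secret is unset, so a missing env var never
// turns the cron route into an open endpoint.
function isAuthorizedCron(
  authorization: string | null,
  secret: string | undefined,
): boolean {
  if (!secret || !authorization) return false
  const prefix = 'Bearer '
  if (!authorization.startsWith(prefix)) return false
  return constantTimeEqual(authorization.slice(prefix.length), secret)
}

console.log(isAuthorizedCron('Bearer s3cret', 's3cret')) // prints true
```

Because the check depends only on the header and the secret, the same function serves Vercel Cron and Chronos without knowing which one is calling.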

Analytics — own implementation, shared module

lib/analytics/
single shared module · writes through lib/db

Rather than treating analytics as a sixth public/meta adapter, we own the analytics layer end-to-end. One shared lib/analytics/ module that writes to the local DB on both targets via the existing lib/db adapter. This collapses what would have been an adapter into editor-core territory: no parity drift, one schema, identical behavior on Vercel and Nest.

Why own it

What we accept losing vs Vercel Analytics

None are blockers for an editor where the interesting questions are "did anyone use Docsmith?" / "which docs got viewed?" / "what's the p95 chat latency?"

Schema

One new table on both targets — identical shape in Postgres and MySQL:

analytics_events {
  id            uuid / bigint pk
  user_id       uuid / fbid    nullable   // null = anon viewer on public
  session_id    text
  event_name    text           // 'docsmith_message_sent', 'pdf_imported', etc.
  properties    jsonb / json
  path          text
  referrer      text           nullable
  user_agent    text
  occurred_at   timestamptz
}

Indexes on (event_name, occurred_at) and (user_id, occurred_at).

Call-site convention

import { track } from '@/lib/analytics'

track('docsmith_message_sent', { documentId, model, hasAttachments: true })
track('pdf_imported', { pageCount, viaVision: true })
track('document_published', { documentId, visibility })

Single import, no public/meta split needed at the call site. Implementation calls lib/db, which routes to Supabase or XDB.
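
One way `track()` could be built on top of lib/db is sketched below. It is a sketch under assumptions: `createTracker`, `RequestContext` and the injected `insert` function are hypothetical, the `id` column is assumed to be generated by the database, and the row shape mirrors the analytics_events schema above.

```typescript
// Row shape mirroring the analytics_events schema (id assumed DB-generated).
interface AnalyticsEventRow {
  user_id: string | null
  session_id: string
  event_name: string
  properties: Record<string, unknown>
  path: string
  referrer: string | null
  user_agent: string
  occurred_at: Date
}

// Hypothetical per-request context supplied by the caller.
interface RequestContext {
  userId: string | null
  sessionId: string
  path: string
  referrer: string | null
  userAgent: string
}

// Stand-in for the lib/db write, which routes to Supabase or XDB.
type InsertEvent = (row: AnalyticsEventRow) => Promise<void>

function createTracker(insert: InsertEvent, getContext: () => RequestContext) {
  return function track(
    eventName: string,
    properties: Record<string, unknown> = {},
  ): void {
    const ctx = getContext()
    const row: AnalyticsEventRow = {
      user_id: ctx.userId,
      session_id: ctx.sessionId,
      event_name: eventName,
      properties,
      path: ctx.path,
      referrer: ctx.referrer,
      user_agent: ctx.userAgent,
      occurred_at: new Date(),
    }
    // Fire and forget: a failed analytics write must not break the user action.
    void insert(row).catch(() => {})
  }
}
```

Injecting `insert` keeps the module free of any target-specific import, which is what lets analytics stay a shared module instead of becoming a sixth adapter.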

Files

Future escape valve

If Meta volume outgrows the XDB-write path, tee from lib/db/impl/meta/analytics.ts into Scribe (→ Scuba). Call sites don't change.

Keyboard shortcuts

lib/shortcuts/
pure editor-core · no impl split

A new editor feature that ships to both targets unchanged — exercising the adapter architecture as designed. Central registry + a useShortcut(id, handler, opts?) React hook. A single <ShortcutProvider> at the app root owns the registry and the underlying tinykeys listener (~400 bytes, no deps). Tooltips throughout the UI read from the registry so labels are platform-aware (⌘K on Mac, Ctrl+K elsewhere) without each component knowing the binding.
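
The registry and its platform-aware labels can be sketched without React or tinykeys. Everything here is illustrative: `ShortcutRegistry` and `formatCombo` are hypothetical names, and the `$mod` token borrows the modifier convention tinykeys uses for "Meta on Mac, Control elsewhere".

```typescript
type Scope = 'global' | 'document' | 'region' | 'modal'

interface ShortcutDef {
  id: string
  combo: string // e.g. '$mod+K' or '$mod+Shift+H'
  description: string
  scope: Scope
}

// Render a combo for display: ⌘K on Mac, Ctrl+K elsewhere.
function formatCombo(combo: string, isMac: boolean): string {
  const parts = combo.split('+').map((part) => {
    if (part === '$mod') return isMac ? '⌘' : 'Ctrl'
    if (part === 'Shift') return isMac ? '⇧' : 'Shift'
    return part
  })
  // Mac convention concatenates glyphs; other platforms join with '+'.
  return isMac ? parts.join('') : parts.join('+')
}

class ShortcutRegistry {
  private defs = new Map<string, ShortcutDef>()

  // Duplicate ids are rejected so two features cannot claim the same entry.
  register(def: ShortcutDef): void {
    if (this.defs.has(def.id)) throw new Error(`Duplicate shortcut id: ${def.id}`)
    this.defs.set(def.id, def)
  }

  // Tooltip label for a registered shortcut.
  label(id: string, isMac: boolean): string {
    const def = this.defs.get(id)
    if (!def) throw new Error(`Unknown shortcut id: ${id}`)
    return formatCombo(def.combo, isMac)
  }

  // Entries for the cheatsheet, optionally filtered by scope.
  list(scope?: Scope): ShortcutDef[] {
    const all = [...this.defs.values()]
    return scope ? all.filter((d) => d.scope === scope) : all
  }
}
```

Components ask the registry for a label by id, so rebinding a shortcut changes one registration and every tooltip follows.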

Scope rules

Initial shortcut set (v1)

Combo | Action | Scope
⌘K | Open command palette (uses existing cmdk dep) | global
⌘/ | Toggle Docsmith chat | document
⌘↵ | Submit chat / form | modal
⌘S | Capture version snapshot | document
⌘⇧H | Open version history | document
⌘⇧S | Open share / publish dialog | document
⌘. | Comment on current selection | document
⌘F | Find in document | document
⌘⇧F | Find & replace | document
/ | Slash menu (insert block at cursor) | region
Esc | Close active dialog / menu | modal
? | Show shortcut cheatsheet | global

Files

Phased rollout

Phase 0: Adapter refactor in public codebase (~1 week)
Pure refactor PR shipped to Vercel with no behavior change. Introduce lib/{db,auth,llm,storage,realtime}/ dispatchers, move existing Supabase/Liveblocks/OpenAI code under impl/public/, swap all call sites. Add an ESLint rule that blocks vendor SDK imports outside lib/*/impl/. Prerequisite to everything else.
Phase 1: Nest walking skeleton (~1 week)
nest init under fbsource/nest/apps/html-docs/. Landing page + read-only doc page rendering on *.nest.x2p.facebook.net with OIDC. META impls return mocks.
Phase 2: Meta data layer (~2–3 weeks)
Drizzle schema, nest xdb migrate, file-by-file adapter port. Manifold for storage. App becomes fully read/write on Meta build.
Phase 2.5: Analytics module, alongside Phase 2 (~3–4 days)
Build lib/analytics/, drop @vercel/analytics, generalize app/api/track/route.ts, add analytics_events migration on both targets, instrument ~20 high-value sites, ship /admin/stats on public + Unidash on meta. Lands as soon as lib/db exists on both.
Phase 3: Meta AI + cron (~1 week)
@nest/llm impl + Chronos cron. Smoke-test Docsmith, vision, comment AI, beautify. PUBLIC still calls OpenAI unchanged.
Phase 3.5: Keyboard shortcuts, parallelizable from Phase 1 (~2–3 days)
Pure editor-core feature that can land any time after Phase 0. Build lib/shortcuts/, wire <ShortcutProvider>, ship v1 set + cheatsheet + command palette, wire to analytics.
Phase 4: Realtime, deferred (later)
ws server in Nest container; swap lib/realtime/impl/meta/ stub for real impl. Flip NEXT_PUBLIC_COLLAB_ENABLED=true. PUBLIC still uses Liveblocks unchanged.
Ongoing: Feature workflow (forever)
New editor features: write once. New shortcuts: register in lib/shortcuts/ only. New tracked events: one track() call. New infra-touching features: update both impl/public/ and impl/meta/, test both.

Verification

  1. Phase 0 (public, no behavior change): existing tests + Vercel preview smoke. grep -r "@supabase\|@liveblocks\|@ai-sdk/openai" app components returns no matches; vendor imports appear only under lib/*/impl/.
  2. Local dev (meta): nest dev; internalmeta.com proxy serves home page; authed /documents populates useInternAuth().user.unixname.
  3. DB (meta): nest xdb migrate + UI create-doc; verify row in nest xdb shell; recordEvent() produces document_events rows.
  4. Storage (meta): Upload PNG; returned URL is scontent.xx.fbcdn.net and renders.
  5. AI (meta): Open Docsmith, stream message; SSE works; actorId in @nest/llm logs.
  6. PDF vision (meta): Upload 5-page PDF; llama4-maverick produces editable HTML.
  7. Cron (meta): curl --cert ... -H "Authorization: Bearer $CRON_SECRET" .../api/cron/webhook-deliveries returns 200; Chronos UI shows minute cadence.
  8. Realtime stub (meta): two tabs on same doc — edits do NOT appear live (expected); both save without errors.
  9. Analytics (both): Open doc + send Docsmith message + publish doc → confirm 3 corresponding analytics_events rows with correct user_id / event_name / properties. PUBLIC /admin/stats shows counts; META Unidash returns rows.
  10. Web Vitals (both): Scroll + navigate; web_vital rows for LCP / CLS / INP show up.
  11. Shortcuts (both): ⌘K opens command palette. ? shows cheatsheet. ⌘/ opens Docsmith. Confirm shortcut_used event lands in analytics_events.
  12. Per-target clean build: DEPLOY_TARGET=public next build ships no @nest/*. nest build ships no @supabase/* or @liveblocks/*.

What stays unchanged across both targets

Editor core — untouched

Identical-on-both-targets-by-design (new shared modules)