Status: v0.1 (drafted 2026-05-16); pre-1.0 Architect-lane ADR per MUX/UI Round 2 (Surface 5 user-facing search is post-1.0; this ADR commits to the index architecture before 1.0 so new surfaces have known indexing shape)

Date: 2026-05-16 (v0.1; third ADR in the MUX/UI Round 2 sequence: ADR-062 (e2e suite) → ADR-063 (audit-envelope read) → ADR-064 (search index))

Supersedes: None (extends existing fragmented search surfaces with a coherent project-wide architecture)

Issues: #786 (GLUE-HISTORY-DIFF; existing conversation search via title; predecessor); #1090 (MUX/UI gap; Round 2 ratified Surface 5 deferral with pre-1.0 index ADR commitment)

Related: ADR-054 (Cross-Session Memory Architecture; Layer 3 User History uses a similar text-search shape and is a prior reference instance), ADR-062 (Project-Scope E2E Suite; Phase 5 cross-host extension informs BYOC-distributed indexing), ADR-063 (User-Facing Audit Envelope Read Surface; audit envelope searchability is a forward-question this ADR scopes), Pattern-072 (Registries that Grow into Architectural Shapes, Proven; per-surface indexing declarations are same-shape registry pattern)

Deciders: Chief Architect (drafted); Lead Developer (implementation refinement when Surface 5 ships); CIO (methodology shelf consideration for per-surface indexing declarations)
The project has accumulated fragmented search surfaces across multiple domains:
| Surface | Current implementation | Index type |
|---|---|---|
| Conversation list filter | web/api/routes/conversations.py:262 (search: str query param; Postgres LIKE on title) | Text (title only) |
| User history search | web/api/routes/user_history.py:109 (/api/v1/users/me/history/search; title/preview/topics) | Text (multi-field) |
| Knowledge graph query | web/api/routes/knowledge_graph.py:266 (search_term on node names/descriptions) | Text (graph nodes) |
| Document ingestion | services/knowledge_graph/ingestion.py (ChromaDB vector store + Postgres FTS for metadata) | Vector + Text |
| Editorial draft/calendar | services/editorial/{draft,calendar}.py (Postgres FTS) | Text |
Each surface chose its own indexing shape based on local needs. No project-wide commitment exists for:
MUX/UI Round 2 ratified Surface 5 (user-facing search interface) as post-1.0 because the unified-search UX is its own project. The architect-lane commitment that lands pre-1.0 is what this ADR provides: the index architecture, so when new surfaces ship between now and 1.0, they have a known indexing shape to follow rather than each surface negotiating an ad-hoc indexing decision at filing time.
Three reasons the index decision can’t wait:
task_type, safe_surface(), probe registry). Naming the registry shape pre-1.0 means new declarations land in a consistent place rather than re-discovering the shape per surface.

Search index architecture is a per-surface declaration following a project-wide registry, layered across Postgres FTS (text-structured) + ChromaDB (vector-semantic), with query-time access control filtering and a synchronous-text, asynchronous-vector freshness model. Cross-host search distribution is deferred to BYOC Phase 5 (per ADR-062’s cross-host trigger), but the architecture is forward-compatible.
The principle is a synthesis of three commitments: layered storage, declarative registry, and access control discipline.
When a new surface ships, three questions decide its indexing:
Q1 — Should this surface be searchable?
Default: NO (every surface added to the search index adds maintenance + freshness + access-control cost). Surfacing requires explicit decision based on three criteria:
Surfaces NOT searched: internal request IDs, audit envelopes (Pattern-071 defensive posture), system telemetry, transient state.
Q2 — Which index type does this surface use?
Two layers, chosen by data shape:
tsvector columns; queries via to_tsquery with ranking

Default to Postgres FTS unless semantic similarity is the use case.
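To make the text layer concrete, the following is a minimal sketch of how a ranked Postgres FTS query might be assembled. The helper names (`to_tsquery_text`, `build_fts_query`), the table and column names, the `'english'` configuration, and the psycopg-style named placeholders are all assumptions for illustration; only `to_tsquery`, `ts_rank`, and the `@@` match operator are standard Postgres.

```python
import re
from typing import Dict, Tuple


def to_tsquery_text(user_input: str) -> str:
    """Turn free text into a to_tsquery expression that ANDs every word."""
    # Keep only word characters so tsquery operators (& | ! :) typed by the
    # user cannot change the shape of the query.
    terms = re.findall(r"\w+", user_input.lower())
    return " & ".join(terms)


def build_fts_query(table: str, tsv_column: str, user_input: str,
                    limit: int = 20) -> Tuple[str, Dict[str, object]]:
    """Return (sql, params) for a ranked full-text query.

    `table` and `tsv_column` are assumed to be trusted constants taken from
    code, never from user input; only the search text is parameterized.
    """
    sql = (
        f"SELECT id, ts_rank({tsv_column}, query) AS rank "
        f"FROM {table}, to_tsquery('english', %(q)s) AS query "
        f"WHERE {tsv_column} @@ query "
        "ORDER BY rank DESC LIMIT %(limit)s"
    )
    return sql, {"q": to_tsquery_text(user_input), "limit": limit}


# Hypothetical usage: "conversations" and "title_tsv" are illustrative names.
sql, params = build_fts_query("conversations", "title_tsv", "search index")
```

The resulting `sql` and `params` would be passed to a database driver's `execute` call. A surface whose use case is semantic similarity would go to the vector layer instead, which this sketch does not cover.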
Q3 — Freshness model for this surface?
Two models, chosen by index type:
tsvector columns regenerated in the same transaction as the write (via Postgres trigger or service-layer code). Query consistency: immediate

Following Pattern-072 (Registries that Grow into Architectural Shapes, Proven via #1094), each searchable surface declares its index shape in a central registry. Proposed location: services/search/index_declarations.py (or analogous).
```python
from dataclasses import dataclass
from typing import Callable, Literal


@dataclass
class IndexDeclaration:
    surface: str  # the surface name (e.g., "conversations", "user_history", "knowledge_graph_nodes")
    enabled: bool  # whether this surface is in the project-wide search index
    index_type: Literal["postgres_fts", "chromadb_vector", "both"]
    freshness: Literal["sync_on_write", "async_eager", "async_lazy"]
    access_control: Callable  # query-time filter; takes (user_id, raw_results) -> filtered_results
    notes: str  # rationale for inclusion / exclusion / configuration choices
```
The registry serves as:
notes field records the Q1/Q2/Q3 reasoning so future authors have a reference

The registry is at least the third application of the Pattern-072 shape (after the task_type registry and the probe registry from ADR-062). The pattern-recognition trigger for promoting the registry shape to “standard architectural primitive” has fired multiple times across distinct surface domains.
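One way the `freshness` value of a declaration could drive the write path is sketched below. This is an illustration under stated assumptions, not the planned implementation: the `WritePath` class, its in-memory queues, and the reading of `async_eager` as "enqueue for prompt background indexing" and `async_lazy` as "collect for a later batch" are hypothetical; the source fixes only the three literal values and the rule that text indexing is synchronous with the write.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Job = Tuple[str, str]  # (surface name, document id)


class WritePath:
    """Routes an index update according to a surface's freshness model."""

    def __init__(self, index_now: Callable[[str, str], None]) -> None:
        self.index_now = index_now           # performs the actual index update
        self.eager_queue: Deque[Job] = deque()  # drained soon after the write
        self.lazy_batch: List[Job] = []         # flushed on a slower schedule

    def on_write(self, freshness: str, surface: str, doc_id: str) -> None:
        if freshness == "sync_on_write":
            # Index before returning, so queries see the write immediately.
            self.index_now(surface, doc_id)
        elif freshness == "async_eager":
            self.eager_queue.append((surface, doc_id))
        elif freshness == "async_lazy":
            self.lazy_batch.append((surface, doc_id))
        else:
            raise ValueError(f"unknown freshness model: {freshness!r}")

    def drain_eager(self) -> int:
        """Index everything in the eager queue; returns the number processed."""
        processed = 0
        while self.eager_queue:
            self.index_now(*self.eager_queue.popleft())
            processed += 1
        return processed
```

In a real deployment the queues would presumably be a durable job system rather than in-process structures; the point here is only that the declaration, not the surface's own code, decides which branch runs.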
Search results are post-filtered by JWT user authorization at query time, not at index time alone.
Rationale:
Each surface’s access_control: Callable in the registry takes raw results and filters per user. Common shape: rejoin results against the source table with user-ownership check; drop entries the user can’t access.
Exception (acceptable index-time filtering): partition indices by user_id when the data is structurally per-user (e.g., user history is naturally user-scoped; the index is queryable only with user_id key). Cross-user-shared indices (knowledge graph nodes; document corpus) require query-time filtering.
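The "rejoin and drop" shape described above can be sketched as a callable matching the `(user_id, raw_results) → filtered_results` contract. The in-memory `OWNER_BY_ROW_ID` mapping stands in for a lookup against the source table and is purely hypothetical, as is the function name and the `{"id": ...}` hit shape.

```python
from typing import Dict, Iterable, List

# Hypothetical stand-in for the source table: row id -> owning user id.
OWNER_BY_ROW_ID: Dict[int, str] = {1: "alice", 2: "bob", 3: "alice"}


def filter_by_ownership(user_id: str, raw_results: Iterable[dict]) -> List[dict]:
    """Keep only hits whose source row still exists and belongs to user_id.

    Hits that point at rows missing from the source (for example, deleted
    after indexing) are dropped along with rows owned by other users.
    """
    return [
        hit for hit in raw_results
        if OWNER_BY_ROW_ID.get(hit["id"]) == user_id
    ]


# Raw index hits may include other users' rows and stale entries.
raw_hits = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 99}]
visible = filter_by_ownership("alice", raw_hits)
```

Here `visible` contains only the hits with ids 1 and 3. A cross-user-shared surface would replace the ownership equality with whatever authorization check applies to that surface.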
Server-side indexing remains canonical; cross-host search distribution is deferred to BYOC Phase 5 per ADR-062’s cross-host trigger.
When BYOC MCP server packaging ships:
This keeps the server as single-source-of-truth for index state; clients are stateless consumers. Per-host content (e.g., Slack messages that haven’t been mirrored to server-side substrate) is not in the unified search until the substrate-sync question is resolved (separate ADR or BYOC-side decision).
IndexDeclaration is third+ application of the same architectural primitive; Pattern-072 (Proven) recognition reinforces

enabled=False default; declaration is a one-line entry. Cost is non-zero but bounded.

sync_on_write if needed (most surfaces don’t need it)

The architecture is grounded by five existing search-adjacent implementations, each demonstrating one aspect of the principle:
| Instance | Validates |
|---|---|
| Conversation list filter (conversations.py:262) | Postgres LIKE → upgrade path to Postgres FTS; per-user partitioning works |
| User history search (user_history.py:109) | Multi-field Postgres-based text search; per-user query-time filtering |
| Knowledge graph node query (knowledge_graph.py:266) | Cross-entity text search; query-time access control |
| Document ingestion (knowledge_graph/ingestion.py) | ChromaDB vector store + Postgres FTS for metadata; layered storage in production |
| Editorial draft + calendar (editorial/{draft,calendar}.py) | Postgres FTS for structured text content |
Phase 2 implementation (when Surface 5 ships) folds these into the IndexDeclaration registry as the first five entries.
The IndexDeclaration registry is the third+ application of the registry-as-architectural-shape primitive Pattern-072 names (Proven via #1094 close-out 2026-05-15):
| Instance | Surface |
|---|---|
| 1 | task_type registry → model + handler dispatch |
| 2 | safe_surface() registry → permission-gating |
| 3 | Probe registry (ADR-062 Layer 1) → e2e suite |
| 4 | IndexDeclaration registry (this ADR) → search corpus management |
Pattern-072’s recognition discipline (typed enum, documented consumer set, explicit default policy, register-time validation) applies cleanly: IndexDeclaration is a dataclass (typed); the registry is a single file (documented consumers); enabled=False is the explicit default; register-time validation at Phase 2 confirms that required fields are present.
ADR-054 Layer 3 (User History) uses the same text-search shape this ADR generalizes. The User History search at user_history.py:109 is one of the five reference instances above; ADR-054’s Layer 3 commitment to per-user text search is the structural precedent for ADR-064’s per-surface declarative approach.
- docs/internal/architecture/current/adrs/adr-054-cross-session-memory-architecture.md
- docs/internal/architecture/current/adrs/adr-062-project-scope-e2e-suite.md
- docs/internal/architecture/current/adrs/adr-063-user-facing-audit-envelope-read-surface.md
- docs/internal/architecture/current/patterns/pattern-072-registries-that-grow-into-architectural-shapes.md
- docs/internal/architecture/current/patterns/pattern-070-cleanup-job-with-cancellation-hygiene.md
- mailboxes/{cohort}/inbox/mux-ui-gap-cxo-round-2-synthesis-2026-05-15.md
- mailboxes/arch/sent/memo-arch-to-cxo-lead-comms-ppm-cc-ceo-pa-exec-mux-ui-round-2-ceo-ratification-2026-05-16.md
- web/api/routes/conversations.py:262 (search param)
- web/api/routes/user_history.py:109 (history search)
- web/api/routes/knowledge_graph.py:266 (node query)
- services/knowledge_graph/ingestion.py (ChromaDB + Postgres FTS)
- services/editorial/{draft,calendar}.py (Postgres FTS)

sync_on_write for text + async_eager for vector; high-write surfaces may need async_lazy for cost reduction. Phase 2 per-surface decisions.

Chief Architect, 2026-05-16 v0.1 (Pre-1.0 Architect-lane ADR per MUX/UI Round 2 Surface 5 ratification; commits to project-wide search index architecture before Surface 5 user-facing search ships post-1.0)