Questions for Technical System Architect

Purpose: Prepare for a meeting with a technical system architect to discuss Piper Morgan.
Audience: PM / product lead meeting the architect.
Source: Derived from codebase review, ADRs, and open suggestions (Jan 2026).


How to Use This


Current Architecture (For Context)


1. Scaling & Deployment

  1. Multi-worker / horizontal scaling
    We moved ServiceContainer off a singleton to app state (ADR-048). What next steps would you recommend for running multiple uvicorn workers or scaling out behind a load balancer (e.g. session affinity, shared Redis), and what might we have missed?

  2. Stateful vs stateless
    Which remaining state would break under multiple instances (in-memory caches, background jobs, WebSockets)? How would you make the app “stateless enough” for horizontal scaling?

  3. Kubernetes vs simpler orchestration
    We have Docker Compose for staging (ADR-007). For a first production deployment, would you advise going straight to Kubernetes, or a simpler path (e.g. single-node + Compose, or managed app platform)? What would change in our architecture for K8s?


2. Data Model & Persistence

  1. Single source of truth for schema
    The DB schema lives in services/database/models.py (SQLAlchemy) and the Alembic migrations; the domain model and docs live elsewhere. We’re considering a single data dictionary / data model document that can also drive an ER diagram. What format and workflow would you recommend (e.g. schema-first codegen, doc-from-DB, or doc-as-source)?

  2. Cross-session memory and retention
    We have a three-layer memory model (conversation, history, composted learning; see ADR-054). How would you approach retention, archival, and GDPR-style “right to be forgotten” without breaking learning and context? Any patterns (e.g. soft delete, anonymization, separate analytics store)?

  3. Postgres + Redis + ChromaDB
    We use Postgres for relational data, Redis for cache/sessions, ChromaDB for embeddings. For a future multi-region or higher-availability deployment, what would you change (replication, failover, or consolidating/splitting stores)?
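
To make the retention question in item 2 concrete, here is a minimal sketch of the soft delete plus anonymization pattern it mentions. Everything in it is an assumption for discussion: `MemoryRecord`, `forget_user`, and the `ANONYMOUS_OWNER` sentinel are hypothetical names, not taken from the codebase or ADR-054, and whether composted learning can be treated as non-identifying is exactly the kind of thing to ask the architect.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional

ANONYMOUS_OWNER = "anonymized"  # hypothetical sentinel, not from the codebase


@dataclass(frozen=True)
class MemoryRecord:
    """Hypothetical stand-in for a row in any of the three memory layers."""
    record_id: str
    owner_id: str
    layer: str  # "conversation", "history", or "learning"
    content: str
    deleted_at: Optional[datetime] = None


def forget_user(records: list[MemoryRecord], owner_id: str,
                now: Optional[datetime] = None) -> list[MemoryRecord]:
    """Apply a "right to be forgotten" request for one user.

    Raw layers (conversation, history) are soft-deleted and their content
    is cleared. The learning layer keeps its content but loses the link
    to the user. Other users' records are returned unchanged.
    """
    now = now or datetime.now(timezone.utc)
    result = []
    for rec in records:
        if rec.owner_id != owner_id:
            result.append(rec)
        elif rec.layer == "learning":
            # Assumption: composted learning holds no personal data itself.
            result.append(replace(rec, owner_id=ANONYMOUS_OWNER))
        else:
            result.append(replace(rec, content="", deleted_at=now))
    return result
```

The design choice worth discussing is the split by layer: raw conversation data is removed, while derived learning is detached from the user rather than deleted.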


3. Integrations & Federation

  1. “Colleague that shows up where you are”
    The vision is Piper in Slack and on the web, with email and IDE later. Today a single backend serves multiple frontends. How would you architect for Slack-first vs web-first vs adding email/IDE later (e.g. event-driven, webhooks, adapter layer, or separate edge services)?

  2. Swappable backends (Jira vs GitHub vs Linear)
    We have an integration swappability pattern (Pattern-040). When we add Jira/Linear alongside GitHub, how would you avoid the “N+1 integration” maintenance trap (unified abstraction vs thin adapters, testing strategy, feature flags)?

  3. MCP and agent protocols
    We use MCP for some integrations. Where would you draw the line between “our API + MCP adapters” vs “MCP-first and we’re just one tool”? Any pitfalls or standards we should align with?
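
For the swappable-backends question in item 2, the "thin adapters" option can be sketched as one narrow interface, one small adapter per tracker, and a shared contract check that every adapter must pass. This is an illustration under assumptions, not how Pattern-040 is implemented: `Issue`, `IssueTracker`, both adapters, and `check_adapter_contract` are hypothetical names, and the payloads are trimmed dicts standing in for real API responses.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Issue:
    """Tracker-neutral issue shape; the fields are illustrative."""
    key: str
    title: str
    is_open: bool


class IssueTracker(Protocol):
    """The narrow interface the rest of the app would depend on."""
    def get_issue(self, key: str) -> Issue: ...


class GitHubAdapter:
    """Maps a trimmed GitHub-style payload (title, state) to Issue."""
    def __init__(self, payloads: dict[str, dict]):
        self._payloads = payloads  # stands in for HTTP calls

    def get_issue(self, key: str) -> Issue:
        raw = self._payloads[key]
        return Issue(key=key, title=raw["title"], is_open=raw["state"] == "open")


class JiraAdapter:
    """Maps a trimmed Jira-style payload (fields.summary, status category)."""
    def __init__(self, payloads: dict[str, dict]):
        self._payloads = payloads  # stands in for HTTP calls

    def get_issue(self, key: str) -> Issue:
        fields = self._payloads[key]["fields"]
        done = fields["status"]["statusCategory"]["key"] == "done"
        return Issue(key=key, title=fields["summary"], is_open=not done)


def check_adapter_contract(tracker: IssueTracker, known_key: str) -> Issue:
    """Shared contract check: run the same assertions against every adapter."""
    issue = tracker.get_issue(known_key)
    assert issue.key == known_key
    assert isinstance(issue.title, str) and issue.title
    assert isinstance(issue.is_open, bool)
    return issue
```

Running one contract check against every adapter is one way to keep the cost of the third and fourth integration close to the cost of the second.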


4. Security & Multi-Tenancy

  1. Enforcing user isolation
    ADR-058 is tightening multi-tenancy (user-scoped tokens, RequestContext, owner_id). What’s your checklist for “tenant isolation” in a system like ours (auth at edge, query-level filtering, audit logging, secrets per tenant)?

  2. API keys and secrets
    We store provider API keys (OpenAI, etc.) in a keychain and user OAuth tokens in the DB. How would you separate app secrets from user secrets, and handle rotation and audit, in a way that stays simple for a small team?

  3. Auth and identity
    Today: JWT, optional Slack/OAuth for some flows. If we add “login with Google/Microsoft” or SSO, what would you change in our auth layer (identity provider abstraction, token format, refresh)?
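
For the isolation checklist in item 1, "query-level filtering" can be illustrated with a repository that has no unscoped query path: every method takes a request context and filters on owner_id. This is a sketch under assumptions. The `RequestContext` shape, the `TodoRepository` class, and the `todos` table are hypothetical and may not match ADR-058; sqlite3 is used only so the example runs on its own.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RequestContext:
    """Hypothetical shape; the real RequestContext may carry more fields."""
    user_id: str


class TodoRepository:
    """Every read and write requires a RequestContext and filters on owner_id."""

    def __init__(self, conn: sqlite3.Connection):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS todos ("
            "id INTEGER PRIMARY KEY, owner_id TEXT NOT NULL, title TEXT NOT NULL)"
        )

    def add(self, ctx: RequestContext, title: str) -> int:
        cur = self._conn.execute(
            "INSERT INTO todos (owner_id, title) VALUES (?, ?)",
            (ctx.user_id, title),
        )
        return cur.lastrowid

    def list_titles(self, ctx: RequestContext) -> list[str]:
        rows = self._conn.execute(
            "SELECT title FROM todos WHERE owner_id = ? ORDER BY id",
            (ctx.user_id,),
        ).fetchall()
        return [row[0] for row in rows]

    def get(self, ctx: RequestContext, todo_id: int) -> Optional[str]:
        # Filtering on id AND owner_id means a guessed id from another
        # tenant looks the same as a missing row.
        row = self._conn.execute(
            "SELECT title FROM todos WHERE id = ? AND owner_id = ?",
            (todo_id, ctx.user_id),
        ).fetchone()
        return row[0] if row else None
```

A cross-tenant test like "user B requests user A's id and gets nothing" is a cheap regression check to ask the architect about.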


5. Observability & Operations

  1. LLM observability
    We have minimal tracing on the LLM adapter (no spans, run IDs, or cost attribution per request). We’re considering something LangChain-style or OTLP. What’s the minimal observability you’d want before production (per-request latency, token usage, errors, optional sampling)?

  2. Health and dependency checks
    We have health endpoints and a setup wizard that checks DB, Redis, ChromaDB, etc. What would you add for “production readiness” (dependency health, circuit breakers, graceful degradation, or runbooks)?

  3. Feature flags and config
    Config is a mix of env, PIPER.user.md, and DB. How would you introduce feature flags and environment-specific config without turning it into a big migration (e.g. single config service, flags in DB, or external system)?
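
For the LLM observability question in item 1, the minimal signals listed there (per-request latency, token usage, errors, a run ID) fit in a small wrapper around the adapter call. The sketch below is an assumption-laden starting point, not the existing adapter: `LLMResult`, `LLMCallRecord`, and `traced_call` are hypothetical names, and the `sink` callback stands in for whatever backend is chosen (logs, OpenTelemetry, or something else).

```python
import time
import uuid
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LLMResult:
    """Hypothetical adapter return type: text plus token counts."""
    text: str
    prompt_tokens: int
    completion_tokens: int


@dataclass
class LLMCallRecord:
    """One record per LLM call, emitted on success or failure."""
    run_id: str
    model: str
    latency_s: float
    prompt_tokens: int
    completion_tokens: int
    error: Optional[str] = None


def traced_call(model: str, prompt: str,
                call: Callable[[str, str], LLMResult],
                sink: Callable[[LLMCallRecord], None]) -> LLMResult:
    """Run one LLM call and send exactly one record to the sink."""
    run_id = uuid.uuid4().hex
    start = time.perf_counter()
    try:
        result = call(model, prompt)
    except Exception as exc:
        # Record the failure, then re-raise so callers still see the error.
        elapsed = time.perf_counter() - start
        sink(LLMCallRecord(run_id, model, elapsed, 0, 0, error=type(exc).__name__))
        raise
    elapsed = time.perf_counter() - start
    sink(LLMCallRecord(run_id, model, elapsed,
                       result.prompt_tokens, result.completion_tokens))
    return result
```

Cost attribution could later be derived from the token counts and model name in each record, which keeps pricing logic out of the request path.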


6. Strategic / Future

  1. Meta-platform (practitioner / demonstrator / enabler)
    ADR-000 describes Piper as practitioner (do PM work), demonstrator (show orchestration), and enabler (let PMs orchestrate agents). From a system architecture perspective, what would you build first to support “enabler” (e.g. workflow engine, agent registry, sandbox)?

  2. Tech debt and refactor order
    We have a large web/app.py, some remaining direct ServiceContainer use (deprecated per Issue #322), and plugin loading. If you had to pick one refactor to do before scaling or adding major features, what would it be and why?

  3. Documentation and onboarding
    We have many ADRs and internal docs. What would make the system easiest for a new architect or senior engineer to understand in one day (e.g. one-pager, C4 diagrams, runbook, or “architecture decision index”)?


References (If Architect Wants to Go Deeper)

Topic                                              Location
Meta-platform vision                               ADR-000
ServiceContainer lifecycle (no singleton)          ADR-048
Multi-tenancy isolation                            ADR-058
Cross-session memory                               ADR-054
Intent as universal entry                          ADR-032, ADR-039
Integration swappability                           Pattern-040
Staging / Docker                                   ADR-007
PR/FAQ (product narrative)                         docs/internal/design/piper-morgan-prfaq.md
Piper vs Jira (positioning)                        docs/internal/design/piper-morgan-by-analogy.md
Open suggestions (observability, data dictionary)  suggestions/SUGGESTIONS_ted.md