Pattern-073: Documentation-Asserted-Behavior Drift

Status

Proven — Promoted from Emerging 2026-05-18 (CIO catalog-management authority, per memo-cio-to-lead-cc-ceo-arch-host-exec-pa-pattern-073-promotion-ratified-emerging-to-proven-2026-05-18.md). Emerging filing 2026-05-16 by Lead Developer per CIO methodology disposition (memo-cio-to-lead-arch-cc-ceo-pattern-073-disposition-2026-05-16.md). Fourteen reference instances across eleven distinct surface layers logged during the May 15–20 evidence-accumulation window; cross-agent engagement during that window (Lead Dev + CIO + Architect + HOST + PM) validated the recognition discipline empirically even ahead of the formal doc-sync-sweep skill v1.0 cross-agent-application criterion. Methodology-29 (“Pattern Formation via Successful Imitation”) three-instance threshold massively exceeded; 11-layer breadth establishes the pattern as structural rather than layer-specific.

Resolution-shape framing (CIO Q5 disposition, Instance 11): the pattern’s characteristic fix is removing the misleading surface (the dead method, the asserted-but-unimplemented behavior, the stale claim), not racing to build the asserted behavior. Cleanup IS the resolution. The pattern’s name (“Documentation-Asserted-Behavior Drift”) describes the failure mode; the resolution discipline is cleanup-as-truth-restoration. This distinguishes Pattern-073 from generic Pattern-064 alive-scaffolding shapes — Pattern-064 names code that pretends to do work; Pattern-073 names narrative that asserts work the code doesn’t deliver. The fix is the same shape (remove the misleading surface) but the catch-trigger is different.

In-flight refinement (not gating Proven status): doc-sync-sweep skill v0.1 → v1.0 + cross-agent application remains queued as operational improvement on the now-Proven pattern. Slot 073 allocated after 12l pre-filing slot-availability check; 070/071/072 occupied. CIO methodology cosign on the Pattern-064-adjacent framing.

Product Relevance

Methodology / Discipline — Recognition discipline for a specific evolution shape that affects how teams maintain narrative artifacts (documentation, docstrings, comments, issue bodies, test fixtures, templated user-facing copy) as the code they describe drifts. Users will not encounter this pattern directly; agents and engineers reading and writing the project’s narrative surfaces will reach for it when judging whether an assertion in prose still matches the system’s behavior.

Context

Documentation, docstrings, comments, issue bodies, test fixtures, and user-facing canned response copy are all narrative artifacts about the system. They assert claims like “this function commits on success,” “this route reads request.state.user_id,” “all open PRs are less than 7 days old,” “moderate-complexity tasks produce 2-3 subtasks.” When the code that the narrative describes changes — or never matched the narrative — the assertion becomes drift: structurally well-formed, semantically wrong.

Where this surfaced

Fourteen independent instances within ≤120 hours (May 15-20, 2026) across eleven distinct surface layers:

  1. Methodology docs (May 15 PM): MULTI_AGENT_INTEGRATION_GUIDE.md + HOW_TO_USE_MULTI_AGENT.md referenced services/orchestration/engine.py after #1094 deleted it. A new agent following the guide verbatim would run from services.orchestration.engine import OrchestrationEngine and hit ImportError. Fix: deprecation banner. (Commit 19b33a89.)

  2. Repository docstring (May 16 AM): services/database/repositories.py:2335-2337 StandupConversationRepository.add() docstring asserted “Caller owns the transaction. For per-call sessions opened in StandupConversationManager, AsyncSessionFactory.session_scope() handles commit.” But session_scope() does NOT commit (it’s session-lifecycle-only). The docstring shaped the initial mental model during the audit; the divergence was the bug surface for #1079. (Fix commit b5d7972d.)

  3. Templated user-facing copy (May 16 PM) — Hard-coded canned responses asserted product behavior the code didn’t honor. “Please run the setup wizard” (no setup wizard exists; fixed in #1065). “All open PRs are less than 7 days old” (handler only checked 100 most recent items; reframed via #1064 → #1096 first slice, commit 289d57ca).

  4. require_request_context orphan dependency (May 16 PM): services/auth/auth_middleware.py:395 defined a FastAPI dependency require_request_context with a docstring advertising the pattern ctx: RequestContext = Depends(require_request_context). Zero production callers. Discovered by Architect during #1015 verification; deleted in #1015 Phase 2 (commit be9456b2).

  5. Test fixture vs. classification logic (May 16 PM): the tests/orchestration/test_multi_agent_coordinator.py::moderate_intent fixture’s message “Implement new API endpoint with validation and tests” triggered multiple domain expansions (testing via “validation”/“tests”, integration via “api”) which, combined with the EXECUTION-category default, landed at COMPLEX classification → 4 subtasks; the test asserted MODERATE → 2-3 subtasks. The fixture name + test name asserted “moderate” but the actual fixture exercised COMPLEX. (Fix commit 09076ada.)

  6. Incomplete pattern translation (May 16 PM) — #1038 issue body recommended applying the .with_variant(JSON, "sqlite") pattern from InsightDB to fix SQLite test compat for EthicsAuditLogDB. Body asserted the InsightDB pattern was a complete fix. But InsightDB.user_id was String, not UUID — so the with_variant alone was complete for InsightDB but incomplete for EthicsAuditLogDB’s UUID column (Python UUID objects can’t bind to SQLite). The body’s assertion (“apply same pattern”) didn’t account for the column-type difference. (Fix commit 6f429c85.)

  7. Inbox MANIFEST as derived index (May 17 AM): mailboxes/lead/inbox/MANIFEST.md asserted state _(empty)_ while the inbox directory held 12 memos. Cross-agent fanout creates duplicate inbox copies; each agent only updates manifests they own; recipient inbox MANIFESTs only sync on recipient triage. First instance at the derived-index layer, generalizing the pattern beyond docs/docstrings/dependencies to “derived artifacts that lag a source-of-truth substrate without enforcement.” Triaged + filed via 6c5f11e1 (memo) + ff403315 (CIO Option A disposition). CIO methodology cosign on the layer generalization.

A meta-eighth instance arrived during Pattern-073 authoring: a merge-commit body for #1096 slice 1 contained the line “Fixed:” as a section header, which GitHub’s close-parser interpreted as Fixed: #1096 and auto-closed the issue despite the prose explicitly saying “Does NOT close #1096.” Verb-form drift in a commit message asserting closure that wasn’t intended.

Post-promotion confirming instance (May 30, 2026): _fallback_classify production-orphan. services/intent_service/classifier.py:934 defines _fallback_classify with a method name + docstring asserting “fallback classification.” Production reality: 0 production callers; 8+ test callers in tests/unit/services/test_intent_search_patterns.py + 2 archive callers. The production fallback path is LowConfidenceIntentError → middleware → floor per ADR-060/061. Same code-surface production-orphan shape as instance #4 (require_request_context). Surfaced that afternoon by Architect’s #1016 (B) option close-after-fresh-verification: methodology-30 (Consumer-Trace) caught the assertion (method name advertising “fallback”) vs. reality (no production caller). Outside the original promotion window (May 15–20), so captured as ongoing-recurrence evidence: three production-orphan instances within ~2 weeks (May 16 require_request_context, May 30 _fallback_classify, plus May 15 methodology-core engine drift in the catalog #1 row) confirm the production-orphan sub-shape is recurring, not a one-off. CIO disposition 2026-05-30: filed (Arch’s weak preference; CIO concurs the recurring shape warrants catalog capture).

The recurring shape across all instances

A narrative artifact (prose, docstring, comment, issue body, test fixture name+content, user-facing copy, commit message) asserts a contract or describes a behavior. The code, system state, or current product behavior diverges from that assertion. The assertion is structurally well-formed (no syntax error, no missing reference) but semantically wrong (the assertion’s predicate doesn’t match the system’s actual behavior).

The asymmetry that makes this load-bearing:

Without a recognition discipline, the failure mode is invisible until acted on. Compare with Pattern-064 (“alive scaffolding that does the opposite”): code-Pattern-064 fails at runtime (eventually visible to users); doc-073 fails at next reader’s audit, often after the reader has already made a decision based on the drifted assertion. Pattern-073 is darker because the failure surface is one layer removed from runtime — the system doesn’t break, the reader’s mental model does.

Problem

The failure mode

Narrative artifact A asserts: "X behaves as Y"
   → Reader trusts A's assertion + makes decision D based on Y
   → Code C (the subject of A's assertion) behaves as Z (≠ Y)
   → Decision D is wrong; failure surfaces at D's downstream consequence
   → Diagnosis requires reading A AND tracing C AND noticing the Y≠Z gap

In contrast, Pattern-064 (Alive Scaffolding) fails at runtime:
   → Code C looks live but does nothing
   → Stress-test exercises C
   → C's no-op behavior surfaces immediately
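The first flow can be made concrete with a simplified, synchronous sketch modeled on instance 2. Everything here is illustrative: `FakeSession` and `save` are hypothetical names, and the real `AsyncSessionFactory.session_scope()` is async. The point is only that the docstring (A) asserts a commit, the body (C) manages lifecycle only, and the caller's decision (D) to skip its own commit fails silently.

```python
from contextlib import contextmanager


class FakeSession:
    """Hypothetical stand-in for a database session; records what was called on it."""

    def __init__(self) -> None:
        self.added: list[object] = []
        self.committed = False
        self.closed = False

    def add(self, obj: object) -> None:
        self.added.append(obj)

    def commit(self) -> None:
        self.committed = True

    def close(self) -> None:
        self.closed = True


@contextmanager
def session_scope(session: FakeSession):
    """Narrative A (drifted): "session_scope() handles commit."""
    try:
        yield session
    finally:
        session.close()  # Code C: lifecycle only; nothing here commits


def save(session: FakeSession, row: object) -> None:
    """Decision D: the caller trusts the docstring above and skips its own commit."""
    with session_scope(session) as s:
        s.add(row)


session = FakeSession()
save(session, {"id": 1})
print(session.added, session.committed, session.closed)
# prints: [{'id': 1}] False True
```

Nothing raises and the session closes cleanly, which is why the gap between Y and Z only shows up at the downstream consequence.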

Why the verb tense + quantifier matters

Across the six original reference instances (May 15-16), the drift was concentrated in assertions that named a specific code surface or behavior in present tense (“session_scope() handles commit”; “All open PRs are less than 7 days old”; “request.state.user_id is read by all routes”). Past-tense narration (“session_scope() handled commit until #1079”; “All open PRs were less than 7 days old in the 100-item scan”) would have been correct in many of these cases. The pattern’s recognition cue is therefore tied to verb form: present-tense assertion about a specific code surface or behavior, made by a narrative artifact that’s not auto-generated from the code itself.

Where this will surface

Five narrative-artifact layers in PM’s codebase are most prone:

  1. Code docstrings that name specific dependencies, contracts, or downstream behaviors
  2. Architecture / methodology docs that name specific code surfaces (file paths, class names, function names)
  3. Issue bodies that describe current state (“All other route handlers extract user_id from request.state.user_id”)
  4. Test fixture names + messages that assert categorical state (moderate_intent with content that triggers COMPLEX)
  5. User-facing canned response copy that asserts product behavior (“Please run the setup wizard”)

A sixth layer worth watching: commit messages with close-magic-strings or assertions about subsequent state. The auto-close meta-instance above shows GitHub’s parser as an enforcement layer for assertions about what a commit “does.”
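A pre-merge check for this sixth layer could look like the sketch below. It approximates GitHub's documented closing keywords (close, fix, resolve and their -s/-d forms, optionally followed by a colon) with a regular expression. The pattern and the helper name `issues_closed_by` are assumptions for illustration; the regex is not GitHub's actual parser and may differ from it at the edges, for example in how it treats line breaks.

```python
import re

# Approximation of GitHub's closing keywords followed by an issue reference.
# Assumption: whitespace (including a newline) may separate keyword and reference.
CLOSING_REF = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?):?\s+#(\d+)",
    re.IGNORECASE,
)


def issues_closed_by(message: str) -> list[int]:
    """Return the issue numbers this message would plausibly auto-close."""
    return sorted({int(number) for number in CLOSING_REF.findall(message)})


print(issues_closed_by("Fixed: #1096"))                # [1096]
print(issues_closed_by("Part of #1096, first slice"))  # []
```

A matcher of this kind has no notion of negation: under this regex, the sentence “Does NOT close #1096” also yields `[1096]`. Phrasing such as “Part of #1096” avoids the keyword entirely.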

The Pattern-064 sibling relationship

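Pattern-064 (Alive Scaffolding) and Pattern-073 share a structural shape but differ in failure surface and stress-test path. Both name infrastructure that looks present but doesn’t match its apparent contract. Pattern-064’s failure surfaces at runtime and its recognition discipline is testing; Pattern-073’s failure surfaces at the reader’s audit and its recognition discipline is verb-tense care plus a verification cadence.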

Where Pattern-064 governs the code’s truth-telling, Pattern-073 governs the project’s narrative truth-telling. A codebase can pass all Pattern-064 audits and still have widespread Pattern-073 drift — and vice versa.

Solution

Recognition trigger

A narrative artifact hits the drift threshold when:

  1. It makes a present-tense assertion about a specific code surface or behavior (named file, function, class, contract, quantifier-bounded claim), AND
  2. The asserted behavior cannot be confirmed by direct reading of the named surface — requires a verification step (run the test, trace the call chain, query the database state, check the API result).

A non-trivial subset of narrative artifacts will satisfy condition (1); that’s normal and load-bearing. The recognition trigger fires when the verification step named in condition (2) is not routinely performed as part of the artifact’s authoring process. That is, the author wrote the assertion without verifying it (or verified it once but did not re-verify as the code evolved).
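Condition (1) lends itself to a rough textual heuristic; condition (2) still needs a person or a test. The sketch below is one assumed way to shortlist sentences for verification, not part of the doc-sync-sweep skill. The regular expressions and the verb and hedge lists are illustrative and deliberately small.

```python
import re

# Tokens that look like a named code surface: a .py path, call syntax, or a dotted identifier.
CODE_SURFACE = re.compile(r"\b[\w./]+\.py\b|\b\w+(?:\.\w+)*\(\)|\b\w+(?:\.\w+)+\b")
# A small, illustrative list of present-tense assertion verbs; extend per project.
PRESENT_VERBS = re.compile(
    r"\b(?:handles|commits|reads|returns|extracts|owns|is|are)\b", re.IGNORECASE
)
# Markers showing the claim is already scoped in time or hedged.
HEDGES = re.compile(r"\b(?:until|tested|verify|as of|intended)\b", re.IGNORECASE)


def needs_verification(sentence: str) -> bool:
    """True when a sentence names a code surface in unhedged present tense."""
    return bool(
        CODE_SURFACE.search(sentence)
        and PRESENT_VERBS.search(sentence)
        and not HEDGES.search(sentence)
    )


print(needs_verification("session_scope() handles commit"))              # True
print(needs_verification("session_scope() handled commit until #1079"))  # False
```

The heuristic misses quantifier-bounded claims that name no identifier (“All open PRs are less than 7 days old”) and will flag dotted tokens that are not code, so it is a triage aid rather than a gate.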

Discipline (apply continuously)

  1. Verb-tense discipline at authoring time. When writing a narrative assertion about a specific code surface, prefer past-tense or scope-bounded present-tense:
    • ❌ “X handles commit” (universal, unverified-at-read-time)
    • ✅ “X handled commit until #N” (past-tense narration; correct + ages well)
    • ✅ “X is intended to handle commit; verify before relying on it” (present-tense + verification disclaimer)
    • ✅ “Tested 2026-05-16: X commits on success in the SQLite + Postgres dialects checked” (scope-bounded assertion)
  2. Doc-sync-sweep skill at change boundaries. After substantive code-shipping commits, run .claude/skills/doc-sync-sweep/SKILL.md: identify likely-affected narrative surfaces, audit each for drift, fix in place or capture as discovered work. Filed 2026-05-16 as the operational discipline for Pattern-073 instance prevention.

  3. Audit-cascade at multi-phase work transitions. Pattern-049 (Audit-Cascade) already enforces audit between phases of multi-step work. Add a doc-drift sub-step: at each phase boundary, audit the narrative artifacts that named the previous phase’s specific surfaces.

  4. Independent verification at high-stakes decisions. When a decision will turn on a narrative artifact’s assertion (e.g., “this is the canonical pattern; follow it”), the disciplined author runs a grep / test / trace to confirm the assertion still holds. A minute of verification saves hours of downstream wrong-decision rework. Example: Architect’s #1015 ratification verified every load-bearing claim in Lead Dev’s Phase 1 design memo before concurring; that audit found the fourth reference instance (orphan require_request_context).

  5. Recognize the failure mode in retrospect. When fixing a bug, ask: “Did a docstring / comment / issue body / fixture / canned copy shape my initial mental model in a way that turned out to be wrong?” If yes, the drift IS the bug surface (not just incidental to the code fix). File the instance + verify the rest of the same narrative surface hasn’t drifted similarly.
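One narrow slice of the sweep in step 2 can be automated: checking that file paths named in a doc still exist, which is the drift shape of instance 1. The sketch below is an assumption about what such a check could look like and does not reproduce the contents of .claude/skills/doc-sync-sweep/SKILL.md. `stale_path_refs` is a hypothetical helper, and it only recognizes slash-separated `.py` paths.

```python
import re
from pathlib import Path

# Slash-separated paths ending in .py, e.g. services/orchestration/engine.py.
PATH_REF = re.compile(r"\b(?:[\w-]+/)+[\w-]+\.py\b")


def stale_path_refs(doc_text: str, repo_root: Path) -> list[str]:
    """Return .py paths mentioned in doc_text that do not exist under repo_root."""
    refs = sorted(set(PATH_REF.findall(doc_text)))
    return [ref for ref in refs if not (repo_root / ref).is_file()]
```

Run against each doc after a commit that deletes or moves files, a non-empty result is a list of assertions to fix in place or capture as discovered work. It does not catch paths that still exist but whose described behavior has changed; those still need the manual audit.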

Architectural reasoning

The narrative artifacts in the project are themselves infrastructure for the team’s collective mental model. They have the same load-bearing property as code: structurally consistent, semantically dispatched-on at decision time, expensive to refactor when wrong.

Pattern-064 names the failure mode where code-infrastructure looks present but does nothing. Pattern-073 names the same failure mode at the narrative-infrastructure layer. The reason the two are siblings rather than the same pattern: the stress-test surface is different (runtime vs. reader), and the recognition discipline is different (testing vs. verb-tense + verification cadence).

Methodology-29 (Pattern Formation via Successful Imitation) predicts that recognition can run ahead of codification when the failure mode is vivid. The 6-instance-in-48-hours cluster on May 15-16 is the textbook signal: the same shape recurred in five different surface layers, recognized by both Lead Dev and Architect independently. Codification (this Pattern entry + the doc-sync-sweep skill) closes the recognition-to-discipline gap.

Forces / when to apply

Apply Pattern-073’s recognition discipline when:

Do NOT over-apply when:

Code references (reference instances)

The fourteen instances, with their resolution paths, are documented at the file-and-line level. Per CIO methodology disposition (2026-05-17): the cleanup is removing the misleading surface, not racing to build the asserted behavior. Pattern-073 catch + cleanup = surface removal; building the asserted behavior is a separate concern that may or may not follow. Instance 11 paradigmatically shows this split.

Anti-pattern recognition

When a narrative artifact is being read or written, the following surface-cues are signals to invoke Pattern-073’s verification discipline:

When ≥2 of these signals are present in an artifact, the verification cost-benefit tips toward running the doc-sync-sweep on adjacent surfaces.

Relationship to other patterns

Adjacent manifestations

(Per CIO’s filing disposition: file under the narrower “Documentation-Asserted-Behavior Drift” title; note the broader framing here. If broader instances accumulate, the title becomes a future evolution-note.)

The narrower title catches narrative artifacts asserting behavior. A broader formulation — “asserted-but-not-enforced contracts” — would extend to:

If two more instances of any of these accumulate independently of the canonical narrative-asserted form, the broader framing becomes an Evolution entry on this pattern. Until then, file under the narrower title.

The unifying insight (CIO disposition 2026-05-17)

Instance 7 (mailboxes/lead/inbox/MANIFEST.md) generalized the pattern from “narrative about code” to derived artifacts that lag a source-of-truth substrate without enforcement. The unifying lesson across all four observed layers:

Derived artifacts lag without enforcement; trust them only with awareness of the lag. When a derived artifact is the only signal a consumer reads (e.g., an autonomous loop checking “do I have work?” via a manifest count), the lag becomes a correctness bug. Mitigations cluster into three families: enforce sync at write time (hook/CI), poll the source of truth not the derived index, or accept the lag and tell consumers to.
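The second mitigation family (poll the source of truth, not the derived index) is small enough to sketch for the inbox case in instance 7. The layout assumed here is that memos are `.md` files sitting beside `MANIFEST.md` in the inbox directory; the `_(empty)_` marker comes from the instance description, while both function names are hypothetical.

```python
from pathlib import Path


def pending_memos(inbox: Path) -> list[str]:
    """List memo files actually present in the inbox directory (the source of truth)."""
    return sorted(p.name for p in inbox.glob("*.md") if p.name != "MANIFEST.md")


def manifest_is_stale(inbox: Path) -> bool:
    """True when MANIFEST.md claims an empty inbox while memo files exist."""
    manifest = inbox / "MANIFEST.md"
    claims_empty = manifest.is_file() and "_(empty)_" in manifest.read_text(encoding="utf-8")
    return claims_empty and bool(pending_memos(inbox))
```

An autonomous loop asking “do I have work?” would call `pending_memos` directly. `manifest_is_stale` fits the first family instead, as a hook or CI check that flags the lag at write time.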

Promotion criteria (historical — pattern promoted 2026-05-18)

When this pattern was filed Emerging (May 16), promotion to Proven required:

  1. One more independent instance within 14 days (by 2026-05-30) — not from a related investigation; surfaced by a different agent or a different surface layer than the six May 15-16 instances.
  2. The doc-sync-sweep v0.1 skill operates cleanly on a fresh-fix flow — applied by an agent who didn’t draft the skill, surfacing a real drift instance that’s then fixed via the documented procedure.

Promoted 2026-05-18 (CIO catalog-management authority): Criterion 1 was comfortably exceeded — seven additional instances (instances 7-13) across five new surface layers landed during the May 16-18 window, surfaced by Lead Dev + CIO (manifest-sync) + Architect (Q3+Q4 ratification on Instance 11). Cross-agent engagement (5 distinct roles: Lead Dev / CIO / Architect / HOST / PM) validated the recognition discipline empirically. Criterion 2 (literal doc-sync-sweep skill v1.0 cross-agent application) was not literally satisfied; CIO disposition was that methodology-29’s substantive evidence bar — independent recognition of the same pattern-shape, without enforcement, with carried recognition discipline — was decisively met on the empirical signal, making the literal skill-cross-agent criterion an in-flight refinement rather than a Proven-gating requirement.

Operational follow-up (not gating; on the now-Proven pattern):

Cross-references

— Lead Developer, 2026-05-16 (draft); CIO methodology cosign pending review