ADR-037: Test-Driven Locking Strategy

Status: Accepted Date: September 26, 2025 Deciders: Christian Crumlish (PM), Claude Opus (Chief Architect) Category: Testing, Quality, Methodology

Context

The 75% pattern revealed a critical weakness in our development process: completed work can become disabled through “temporary” comments, TODO markers, or partial implementations. During CORE-GREAT-1, we discovered QueryRouter had been 75% complete but disabled with TODO comments for months, blocking 80% of features.

We need a systematic approach to prevent completed work from regressing, being accidentally disabled, or degrading in quality. This approach must balance preventing regression with maintaining development velocity.

Decision

We will implement a comprehensive Test-Driven Locking Strategy that makes regression impossible without deliberate test modification. Each completed component or epic will have multiple lock mechanisms that prevent various forms of degradation.

Lock Categories

1. Existence Locks

Tests that verify a component exists and is initialized:

Component cannot be None/null
Initialization cannot be commented out
Required imports must succeed
Core functionality must be callable

2. Performance Locks

Tests that prevent performance degradation:

Baseline performance metrics established
Tolerance threshold defined (typically 20%)
CI/CD fails if thresholds exceeded
Realistic targets based on actual performance, not aspirations

3. Coverage Locks

Tiered enforcement based on component status:

Completed components: 80% minimum coverage
Active development: 25% minimum coverage
Legacy/baseline: 15% minimum (prevent further degradation)
Coverage must increase with each epic

4. Integration Locks

Tests that verify component connections:

API contracts cannot be broken
Integration points must remain functional
Data flow between components verified
Plugin interfaces maintain compatibility

5. Quality Locks

Automated checks that maintain standards:

Pre-commit hooks for TODO format compliance
Link checkers for documentation
Configuration validation
Pattern detection (prevent dual implementations)

Implementation Requirements

For Each Epic/Component

During Development
- Write tests for new functionality
- Establish performance baselines
- Document expected behavior
At Completion
- Create regression test suite
- Set coverage thresholds
- Configure CI/CD gates
- Remove old patterns physically
Post-Completion
- Monitor for violations
- Update baselines only with justification
- Maintain lock tests through refactors

Lock Lifecycle

Creation

Locks are created when a component reaches “complete” status
Must be part of epic completion criteria
Reviewed during Phase Z (completion) of epic

Modification

Requires architectural review
Must maintain or strengthen protection
Changes documented in commit messages

Removal

Only when component is being replaced
Replacement must have equivalent or better locks
Transition period may have dual locks

CI/CD Integration

All locks must be enforced in CI/CD pipeline:

- name: Run Lock Tests
  run: |
    pytest tests/regression/ -v
    pytest tests/unit/ --cov --cov-fail-under=80
    pytest tests/performance/ --benchmark-fail-if-slower=threshold

Developer Experience

Locks must not significantly impede development:

Fast feedback (lock tests run quickly)
Clear error messages explaining violations
Documentation on how to work with locks
Escape hatches for legitimate changes (with review)

Consequences

Positive

Regression Prevention: Completed work cannot be accidentally disabled
Quality Maintenance: Performance and coverage cannot degrade silently
Confidence: Changes can be made knowing locks will catch breaks
Documentation: Lock tests serve as executable specifications
75% Pattern Prevention: Incomplete work becomes immediately visible

Negative

Initial Overhead: Time required to create comprehensive locks
Maintenance Burden: Lock tests must be updated with legitimate changes
False Positives: Overly strict locks may block valid improvements
Learning Curve: Developers must understand lock patterns

Mitigation

Start with critical locks, add others incrementally
Use pragmatic thresholds based on reality, not perfection
Document lock bypass procedures for emergencies
Provide templates and examples for common lock patterns

Examples from Implementation

GREAT-1C QueryRouter Locks

# Existence Lock
def test_queryrouter_must_be_enabled_in_orchestration_engine():
    engine = OrchestrationEngine()
    assert engine.query_router is not None
    assert callable(engine.query_router.route_query)

# Performance Lock
def test_performance_requirement_queryrouter_initialization_under_500ms():
    times = []
    for _ in range(10):
        start = time.time()
        QueryRouter()
        times.append(time.time() - start)
    assert sum(times) / len(times) < 0.5

CI/CD Configuration

# Coverage Lock
- run: pytest tests/ --cov=services/orchestration/queryrouter --cov-fail-under=80

# Performance Lock
- run: pytest tests/performance/ --benchmark-fail-threshold=5400ms

ADR-035: Inchworm Protocol - Sequential completion methodology
ADR-036: QueryRouter Resurrection Strategy - First application of locks
ADR-032: Intent-Based Architecture - Requires lock protection

Review Schedule

This decision should be reviewed after:

Completion of CORE-GREAT-5 (full epic sequence)
First major refactor under lock protection
Six months of operational experience

Notes

The test-driven locking strategy emerged from painful experience with the 75% pattern. It represents a shift from trust-based development (“this should work”) to evidence-based development (“this is proven to work and cannot be broken without deliberate action”).

The key insight: regression tests are not just about detecting breaks, they’re about making breaks impossible without conscious override.

“Tests make incomplete work visible. Locks prevent ‘temporary’ disabling.”