PM-056: Domain/Database Schema Validator

Status: ✅ Complete and Production Ready Implementation Time: 2.5 hours Prevention Focus: Catches schema drift that causes runtime errors

Overview

The Schema Validator is an automated tool that prevents domain/database model drift by comparing field names, types, and structures between domain models (services/domain/models.py) and SQLAlchemy database models (services/database/models.py).

This tool would have caught the object_id vs object_position mismatch that caused debugging complexity in PM-078’s Slack spatial adapter.

Key Features

✅ Field Validation

✅ Type Compatibility

✅ Enum Validation

✅ Relationship Validation

Usage Instructions

For Developers

  1. Run validation to check domain/database model alignment
  2. Review domain models (docs/architecture/models-architecture.md) for field details
  3. Check recent updates (docs/development/domain-model-updates-2025-07-31.md) for changes

For Code Team

  1. Address database issues identified by validator
  2. Resolve SQLAlchemy conflicts (metadata field naming)
  3. Add missing database fields for complete alignment

For CI/CD Integration

  1. Use –ci flag for automated validation in pipelines
  2. Check exit codes (0 = success, 1 = critical issues)
  3. Monitor validation results for schema drift

Usage

Command Line Interface

# Validate all models
python tools/schema_validator.py

# Validate specific model
python tools/schema_validator.py --model Product

# CI mode (exits with error code on critical issues)
python tools/schema_validator.py --ci

# Verbose output with model mappings
python tools/schema_validator.py --verbose

Makefile Integration

# Run schema validation
make validate-schema

# Full validation suite
make validate-all

# CI validation (for pipelines)
make ci-validate

Test Suite

# Run validator tests
PYTHONPATH=. python -m pytest tests/test_schema_validator.py -v

# Integration test
PYTHONPATH=. python tools/demonstrate_validator.py

Output Format

Issue Categories

Current Status (July 31, 2025)

Recent Improvements

Known Issues

Example Output

🔍 PM-056: Domain/Database Schema Validator
   Domain models: 26
   Database models: 10
   Model mappings: 10

🚨 Issues Found:
   [ERROR] Task.result: Field 'result' exists in domain model but not in database model
  Suggestion: Add Column('result', ...) to Task database model

   [WARNING] Task.updated_at: Field 'updated_at' exists in database model but not in domain model
  Suggestion: Add 'updated_at' field to Task domain model

📊 Schema Validation Summary:
   Errors: 15
   Warnings: 24
   Info: 9
   Total Issues: 48

Model Discovery

Automatic Mapping

The validator automatically discovers model pairs:

# Direct mappings
Product  Product
Feature  Feature
Workflow  Workflow

# Special naming patterns
Project  ProjectDB
ProjectIntegration  ProjectIntegrationDB
UploadedFile  UploadedFileDB

SQLAlchemy Introspection

Uses SQLAlchemy’s mapper system to accurately extract:

Prevention Examples

Object ID vs Position Issue

Problem from PM-078: Spatial adapter used object_id (string) instead of object_position (integer)

How Validator Catches This:

[ERROR] SpatialModel.object_position: Field exists in domain but not database
[ERROR] SpatialModel.object_id: Field exists in database but not domain

Type Mismatch Detection

# Domain model
class Task:
    result: Optional[Dict[str, Any]] = None

# Database model (incorrect)
class Task(Base):
    result = Column(String)  # Should be JSON!

Validator Output:

[ERROR] Task.result: Type mismatch: domain expects dict, database has String
Suggestion: Change database column to one of: ['JSON']

Enum Consistency

# Domain uses enum
class Task:
    status: TaskStatus = TaskStatus.PENDING

# Database uses string (missing enum)
class Task(Base):
    status = Column(String)  # Should be Enum(TaskStatus)!

Validator Output:

[ERROR] Task.status: Domain model uses enum TaskStatus but database uses String
Suggestion: Change database column to Enum(TaskStatus)

CI/CD Integration

Exit Codes

GitHub Actions Example

- name: Validate Schema
  run: make ci-validate

Pre-commit Hook

#!/bin/sh
PYTHONPATH=. python tools/schema_validator.py --ci

Architecture

Class Structure

class SchemaValidator:
    def __init__(self):
        self.domain_models = self._discover_domain_models()
        self.db_models = self._discover_db_models()
        self.model_mappings = self._create_model_mappings()

class ValidationIssue:
    def __init__(self, severity, category, model, field, description, suggestion):
        # Structured issue representation

Validation Pipeline

  1. Discovery: Find all domain and database models
  2. Mapping: Create domain ↔ database model pairs
  3. Field Extraction: Get fields from both models using introspection
  4. Validation: Compare fields, types, and enums
  5. Reporting: Generate structured issue reports with suggestions

Testing Strategy

Unit Tests (tests/test_schema_validator.py)

Integration Tests

Performance Tests

Success Metrics

Implementation Achievement

Prevention Capability

Strategic Impact

Future Enhancements

Cursor’s CI/CD Phase (Ready for Implementation)

Advanced Features (Phase 2)

Conclusion

PM-056 Schema Validator is production-ready and successfully prevents the type of domain/database drift that caused debugging complexity in PM-078. The tool provides:

The validator exemplifies our Systematic Excellence methodology - rather than fixing issues reactively, we now prevent them proactively through automated validation.