
Technical Migration Guide: v5 to v6 Agent Architecture

Executive Summary

This guide provides a comprehensive roadmap for migrating CloudAudioAI from the current v5 configuration-driven architecture to the new v6 Pydantic tool-based architecture. The migration focuses on three core improvements:

  1. Split Prompts: Moving from monolithic 800+ line prompts to 6 modular text files
  2. Tool-Based Validation: Replacing text generation with Bedrock tool-calling using Pydantic models
  3. Simplified Configuration: Eliminating schema.json and mapping files in favor of direct Pydantic validation

Estimated Effort: 2-3 weeks for implementation and testing

Risk Level: Low (backward compatible, with fallback to v5)

Business Impact: Improved accuracy (first-attempt validation success, 85% → 95% target), easier prompt maintenance, faster onboarding


High-Level Architecture Changes

Current v5 Architecture

Transcription → Load Config Files → Text Generation → JSON Parsing → Manual Mapping → DynamoDB
   (S3)         (prompt/schema/map)   (Bedrock)       (Fragile)      (Complex)        (100+ fields)

New v6 Architecture

Transcription → Load 6 Text Files → Tool Calling → Pydantic Validation → Direct Storage → DynamoDB
   (S3)         (modular prompts)    (Bedrock)     (Automatic)           (Simple)         (100+ fields)

Critical Design Decision: DynamoDB Schema Compatibility

Why We Must Flatten the Pydantic Model

The v6 architecture uses Pydantic for validation but MUST maintain the existing flat DynamoDB schema because:

  1. Frontend Dependency: The Vue3 dashboard queries expect flat fields:

    // Frontend expects: item.staff_name
    // NOT: item.ai_call_analysis.staff_name
    

  2. Analytics Queries: All GSI queries and filters use flat field names:

    # Current queries expect:
    FilterExpression="customer_type = :type AND primary_outcome = :outcome"
    # NOT nested paths like:
    FilterExpression="customer_profile.type = :type"
    

  3. Backward Compatibility: 100+ existing fields must remain unchanged for:
     - Historical data consistency
     - Report generation
     - API contracts with mobile apps
     - Third-party integrations

Flattening Strategy

The solution keeps both benefits:

  - Pydantic: Structured validation during AI processing
  - Flat Schema: Compatible storage for existing consumers

Hybrid Flattening Approach (Core + Custom)

To handle client-specific customizations efficiently:

  1. Core Fields (80+ fields): Hardcoded in Lambda for all clients
     - Standard fields like staff_name, customer_type, primary_outcome
     - Consistent across all franchises
     - Optimized performance with no JSON parsing

  2. Custom Fields: Configured via flattening-v6.json per client
     - Client-specific fields like membership_tier, insurance_coverage
     - No code changes needed for new custom fields
     - Optional: clients without custom fields skip this file

  3. Benefits:
     - New clients start with core fields only
     - Add custom fields without modifying Lambda code
     - Share core logic across all clients
     - Version custom fields independently

# Pydantic model (nested, validated)
{
    "ai_call_analysis": {
        "staff_name": "John",
        "call_state": "completed"
    },
    "customer_profile": {
        "type": "member"
    }
}

# DynamoDB item (flat, compatible)
{
    "staff_name": "John",        # Flattened from ai_call_analysis
    "call_state": "completed",    # Flattened from ai_call_analysis
    "customer_type": "member"     # Renamed from customer_profile.type
}

Component-Level Changes Required

1. Lambda Functions to Modify

A. TranscriptionResultProcessor_MultiTenant.py

Location: /s3-dynamodb-backend-design-code-2025-08-02/02-LAMBDA-CODE/centralized-config-multitenant/src/

Current Implementation:

  - Uses ConfigurationManager.load_config() to load prompt, schema, mappings
  - Calls Bedrock via converse() API in text mode
  - Parses JSON from text response manually
  - Applies dual mapping pattern (AI + provider mappings)

Required Changes:

# CHANGE 1: Update ConfigurationManager to load v6 prompts
def load_config_v6(self, franchise, site_id):
    """Load 6 prompt files instead of single prompt + schema + mappings.

    Called only when the client's prompt version is 'v6'; the caller
    performs that check (see Appendix B).
    """
    files = [
        '01-franchise-core.txt',
        '02-knowledge.txt',
        '03-coaching.txt',
        '04-staff-list.txt',
        '05-studio-output-instructions.txt',  # CRITICAL: Contains taxonomy
        '06-custom-requirements.txt'
    ]
    # Load and concatenate all 6 files in this order
    # (full loop with error handling: Appendix A, load_prompt_v6)

# CHANGE 2: Replace call_bedrock() with tool-based invocation
def call_bedrock_with_tool(transcript, prompt, model_id):
    """Use Bedrock tool-calling with Pydantic schema"""
    # bedrock_call.py is copied into the Lambda package (see "Files to Add"),
    # so it is imported as a top-level module, as in Appendix B
    from bedrock_call import ask_claude_for_analysis

    result = ask_claude_for_analysis(
        transcript=transcript,
        prompt_text=prompt,  # Concatenated 6 files
        model_id=model_id    # Optional: override model
    )
    # Validated against the Pydantic CallAnalysis model, returned as a dict
    return result

# CHANGE 3: Keep flattening logic but source from Pydantic model
# CRITICAL: Must preserve DynamoDB's 100+ flat attribute structure
# The Pydantic model is hierarchical but DynamoDB and frontend expect flat fields
def process_with_pydantic_and_flatten(transcript, prompt, call_event, metadata):
    """
    Use Pydantic for validation but flatten for DynamoDB compatibility

    Current v5 flow:
    1. Bedrock returns unstructured JSON text
    2. ai-mapping.json defines JSONPath extraction ($.ai_call_analysis.staff_name -> staff_name)
    3. DataProcessor.flatten_with_dual_mapping() creates 100+ flat attributes
    4. Frontend queries expect flat fields like 'staff_name' not nested 'ai_call_analysis.staff_name'

    New v6 flow:
    1. Bedrock returns validated Pydantic model via tool-calling
    2. Convert Pydantic model to dict: model.model_dump()
    3. Apply same flattening logic: extract nested fields to top-level
    4. Frontend continues working with same flat schema
    """
    from datetime import datetime, timedelta, timezone  # used for createdAt / ttl below
    # bedrock_call.py is copied into the Lambda package (see "Files to Add")
    from bedrock_call import ask_claude_for_analysis

    # Get validated dict from Bedrock. ask_claude_for_analysis calls model_dump()
    # internally, so it returns a dict, not a Pydantic model.
    model_dict = ask_claude_for_analysis(transcript, prompt)

    # Create flattened structure for DynamoDB (100+ attributes)
    flattened = {}

    # Core provider fields (from call_event, not AI)
    flattened['telephonySessionId'] = call_event.get('telephonySessionId')
    flattened['franchise'] = metadata.get('franchise')
    flattened['siteId'] = metadata.get('site_id')
    flattened['recordingId'] = call_event.get('recordingId')
    flattened['direction'] = call_event.get('direction')
    flattened['startTime'] = call_event.get('startTime')

    # Flatten AI analysis fields (nested -> flat)
    # ai_call_analysis.staff_name -> staff_name
    if 'ai_call_analysis' in model_dict:
        for key, value in model_dict['ai_call_analysis'].items():
            flattened[key] = value  # e.g., staff_name, call_state, intro_type

    # customer_profile.type -> customer_type
    if 'customer_profile' in model_dict:
        profile = model_dict['customer_profile']
        flattened['customer_type'] = profile.get('type')
        flattened['customer_name'] = profile.get('name')
        # ... other customer fields

    # primary_topic fields
    if 'primary_topic' in model_dict:
        topic = model_dict['primary_topic']
        flattened['primary_category'] = topic.get('category')
        flattened['primary_subcategory'] = topic.get('subcategory')
        outcome = topic.get('outcome')
        if outcome:  # key may be present but None after model_dump()
            flattened['primary_outcome'] = outcome.get('result')
        flattened['credit_card_captured'] = topic.get('credit_card_captured')

    # Keep arrays/complex types as-is
    flattened['secondary_topics'] = model_dict.get('secondary_topics', [])

    # Metadata and computed fields
    flattened['createdAt'] = datetime.now(timezone.utc).isoformat()
    flattened['ttl'] = int((datetime.now(timezone.utc) + timedelta(days=90)).timestamp())

    return flattened

# CHANGE 4: Replace ai-mapping with flattening-v6.json, which covers custom fields only
# Core fields are hardcoded, custom fields remain configurable

Files to Add:

  - models.py - Copy from /agent-design/src/essential-code/
  - bedrock_call.py - Copy from /agent-design/src/essential-code/

B. config_loader.py (If separate module)

Location: /s3-dynamodb-backend-design-code-2025-08-02/02-LAMBDA-CODE/centralized-config-multitenant/src/

Required Changes:

  - Add load_prompt_v6() method
  - Handle file concatenation
  - Remove schema/mapping loading for v6

2. S3 Structure Changes

Current Structure (v5)

{franchise}/{site_id}/
├── prompts/prompt-v5.txt            # Monolithic 800+ lines
├── schemas/schema-v5.json           # JSON validation schema
├── mappings/
│   ├── ai/ai-mapping-v5.json       # Field extraction rules
│   └── providers/ringcentral/mapping-v5.json
└── active-versions.json              # Version control (JSON format)

New Structure (v6)

{franchise}/{site_id}/
├── prompts/
│   ├── prompt-v5.txt                # Keep for fallback
│   └── v6/
│       ├── 01-franchise-core.txt    # ~165 lines
│       ├── 02-knowledge.txt         # ~428 lines
│       ├── 03-coaching.txt          # ~150 lines
│       ├── 04-staff-list.txt        # ~30 lines
│       ├── 05-studio-output-instructions.txt  # ~180 lines - CRITICAL
│       └── 06-custom-requirements.txt         # ~50 lines
├── flattening/
│   └── flattening-v6.json          # Custom field mappings (optional)
└── active-versions.json             # Version selector

Version Configuration (active-versions.json):

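{
  "prompt": "v6",
  "schema": "v5",
  "ai_mapping": "v5",
  "provider_mapping": "v5"
}

In v6, "schema" is ignored (Pydantic validates) and "ai_mapping" is ignored (hardcoded flattening); "provider_mapping" is still used for call metadata. The file itself must be strict JSON, so it cannot carry inline comments.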
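A minimal sketch of how the Lambda might interpret this file, assuming a missing or unparseable file should fall back to v5. The names resolve_versions and components_to_load are hypothetical and do not come from the existing codebase.

```python
import json

# Assumption: v5 is the safe default for every component.
DEFAULT_VERSIONS = {
    "prompt": "v5",
    "schema": "v5",
    "ai_mapping": "v5",
    "provider_mapping": "v5",
}


def resolve_versions(raw_json):
    """Parse active-versions.json content, falling back to v5 defaults."""
    try:
        loaded = json.loads(raw_json) if raw_json else {}
    except json.JSONDecodeError:
        loaded = {}
    if not isinstance(loaded, dict):
        loaded = {}
    # Only known keys override the defaults; unknown keys are dropped.
    known = {k: v for k, v in loaded.items() if k in DEFAULT_VERSIONS}
    return {**DEFAULT_VERSIONS, **known}


def components_to_load(versions):
    """Return the config components that are actually read for this client."""
    if versions["prompt"] == "v6":
        # schema and ai_mapping are ignored in v6; provider mapping is still used
        return ["prompt", "provider_mapping"]
    return ["prompt", "schema", "ai_mapping", "provider_mapping"]
```

Keeping the fallback in one place means a corrupt version file degrades to the v5 path instead of failing the whole invocation.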

Hybrid Flattening Configuration (flattening-v6.json):

{
  "version": "v6",
  "description": "Custom field mappings for client-specific fields",
  "standard_fields_handled_by_code": true,
  "custom_projections": [
    {
      "name": "membership_tier",
      "path": "$.site_specific.membership_tier",
      "type": "string"
    },
    {
      "name": "custom_metric",
      "path": "$.site_specific.custom_metric",
      "type": "number"
    }
  ]
}
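The custom projections above could be applied roughly as follows. This is a sketch, not the actual HybridFlattener: resolve_path and apply_custom_projections are hypothetical names, only simple "$.a.b" paths are handled, and numbers are converted to Decimal because boto3 rejects Python floats when writing to DynamoDB.

```python
from decimal import Decimal

# Assumption: only the two types shown in the sample config need casting.
_CASTS = {
    "string": str,
    "number": lambda v: Decimal(str(v)),
}


def resolve_path(data, path):
    """Resolve a simple '$.a.b.c' path against nested dicts; None if missing."""
    if not path.startswith("$."):
        raise ValueError(f"Unsupported path: {path}")
    current = data
    for key in path[2:].split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def apply_custom_projections(model_dict, config, flattened):
    """Copy each configured custom field into the flat item."""
    for projection in config.get("custom_projections", []):
        name = projection["name"]
        value = resolve_path(model_dict, projection["path"])
        if value is None:
            continue  # custom fields are optional
        if name in flattened:
            # Never let a custom field silently overwrite a core field
            raise ValueError(f"Custom field collides with existing field: {name}")
        cast = _CASTS.get(projection.get("type"))
        flattened[name] = cast(value) if cast else value
    return flattened
```

Rejecting name collisions keeps a client config from changing the meaning of a core field that the dashboard already queries.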

Migration Script Required:

# Script to split existing prompt-v5.txt into 6 files
python scripts/split_prompt_v5.py \
  --input s3://bucket/{franchise}/{site_id}/prompts/prompt-v5.txt \
  --output s3://bucket/{franchise}/{site_id}/prompts/v6/
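The core of such a split script might look like the sketch below. It assumes the v5 prompt can be cut into contiguous line ranges, which may not hold if sections are interleaved; split_prompt and the range values in the usage note are hypothetical, and S3 download/upload is left out.

```python
def split_prompt(text, ranges):
    """Split prompt text into named parts.

    ranges: list of (filename, start_line, end_line), 1-indexed and inclusive.
    Ranges must be contiguous and cover every line exactly once.
    """
    lines = text.splitlines()
    expected_start = 1
    parts = {}
    for filename, start, end in ranges:
        if start != expected_start or end < start:
            raise ValueError(f"Range for {filename} is not contiguous: {start}-{end}")
        parts[filename] = "\n".join(lines[start - 1:end]) + "\n"
        expected_start = end + 1
    if expected_start - 1 != len(lines):
        raise ValueError(
            f"Ranges cover {expected_start - 1} lines, prompt has {len(lines)}"
        )
    return parts
```

A call such as split_prompt(text, [("01-franchise-core.txt", 1, 165), ("02-knowledge.txt", 166, 593), ...]) would use boundaries determined per site by reading the v5 prompt. The coverage check guards against silently dropping prompt lines during the split.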

3. DynamoDB Changes

client-configurations Table

Current Schema:

{
    'client_id': 'orange-theory-5736520',
    'config_versions': {
        'prompt': 'v5',
        'schema': 'v5',         # Not used in v6
        'ai_mapping': 'v5',     # Not used in v6
        'provider_mapping': 'v5' # Still used in v6 for call metadata
    }
}

Migration Required:

# Update existing clients to v6 (PartiQL; the hyphenated table name must be double-quoted)
UPDATE "client-configurations"
SET config_versions.prompt = 'v6'
WHERE client_id = 'orange-theory-5736520'

# Lambda code should ignore schema/mapping fields when prompt='v6'
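The same update can be issued from Python through the boto3 Table resource. A sketch under two assumptions: client_id is the table's sole partition key, and the config_versions map already exists on the item. The helper names are hypothetical.

```python
def build_prompt_version_update(client_id, version):
    """Build update_item arguments that set config_versions.prompt."""
    if version not in ("v5", "v6"):
        raise ValueError(f"Unknown prompt version: {version}")
    return {
        "Key": {"client_id": client_id},
        # '#p' placeholder avoids any clash with DynamoDB reserved words
        "UpdateExpression": "SET config_versions.#p = :v",
        "ExpressionAttributeNames": {"#p": "prompt"},
        "ExpressionAttributeValues": {":v": version},
        # Fail instead of creating a partial item for an unknown client
        "ConditionExpression": "attribute_exists(config_versions)",
    }


def set_prompt_version(table, client_id, version):
    """Apply the update; 'table' is a boto3 DynamoDB Table resource."""
    return table.update_item(**build_prompt_version_update(client_id, version))
```

Usage would be along the lines of set_prompt_version(boto3.resource("dynamodb").Table("client-configurations"), "orange-theory-5736520", "v6"). Because the function takes the version as a parameter, the same call with "v5" serves as the rollback.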

call-analysis Table

No Changes Required - Same 100+ fields, just populated via Pydantic instead of mappings

4. Pydantic Models Integration

Files to Add to Lambda Package:

/centralized-config-multitenant/src/
├── models.py                    # NEW: Pydantic CallAnalysis model
├── bedrock_call.py             # NEW: Tool-calling wrapper
└── TranscriptionResultProcessor_MultiTenant.py  # MODIFIED

Key Model Structure (models.py):

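from typing import Any, Dict, List, Optional
from pydantic import BaseModel, Field

# AICallAnalysis, CustomerProfile, PrimaryTopic, SecondaryTopic and SiteContext
# are BaseModel subclasses defined elsewhere in models.py (not shown here).
class CallAnalysis(BaseModel):
    # Core fields (mandatory)
    ai_call_analysis: AICallAnalysis
    customer_profile: CustomerProfile
    primary_topic: PrimaryTopic
    secondary_topics: List[SecondaryTopic] = Field(default_factory=list)

    # Site-specific extensions
    site_specific: Dict[str, Any] = Field(default_factory=dict)  # Flexible fields
    site_context: Optional[SiteContext] = None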

5. Bedrock API Changes

Current v5 Implementation

# Text generation mode
response = bedrock_runtime.converse(
    modelId=model_id,
    messages=[{"role": "user", "content": [{"text": prompt}]}],
    inferenceConfig={"maxTokens": 4000, "temperature": 0.7}
)
# Manual JSON extraction from text response

New v6 Implementation

# Tool-calling mode
response = bedrock_runtime.converse(
    modelId=model_id,
    messages=messages,
    toolConfig={"tools": [{"toolSpec": tool_spec_from_model()}]},
    inferenceConfig={"maxTokens": 2000}
)
# Automatic Pydantic validation
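A sketch of the two pieces around the converse() call: building the toolConfig and pulling the structured arguments out of the response. The function names are hypothetical and this is not the actual bedrock_call.py. The JSON schema would typically come from CallAnalysis.model_json_schema(), and forcing the tool through toolChoice assumes the selected model supports it.

```python
def build_tool_config(tool_name, description, json_schema):
    """Build the toolConfig argument for bedrock_runtime.converse()."""
    return {
        "tools": [
            {
                "toolSpec": {
                    "name": tool_name,
                    "description": description,
                    "inputSchema": {"json": json_schema},
                }
            }
        ],
        # Force the model to answer through this tool instead of free text
        "toolChoice": {"tool": {"name": tool_name}},
    }


def extract_tool_input(response, tool_name):
    """Return the tool arguments (already a dict) from a converse() response."""
    content = response.get("output", {}).get("message", {}).get("content", [])
    for block in content:
        tool_use = block.get("toolUse")
        if tool_use and tool_use.get("name") == tool_name:
            return tool_use["input"]
    raise ValueError(f"No toolUse block for tool '{tool_name}' in response")
```

The returned dict can then be passed to CallAnalysis.model_validate(...) so that a malformed tool call surfaces as a Pydantic ValidationError rather than a bad DynamoDB item.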

Migration Steps

Phase 1: Preparation (Week 1)

  1. Split Existing Prompts
     - Create script to split prompt-v5.txt into 6 files
     - Map content to appropriate files based on line numbers
     - Upload to S3 under prompts/v6/ directory

  2. Add Pydantic Models
     - Copy models.py to Lambda package
     - Copy bedrock_call.py to Lambda package
     - Test locally with sample transcripts

Phase 2: Implementation (Week 2)

  1. Modify Lambda Code
     - Add load_prompt_v6() to ConfigurationManager
     - Integrate bedrock_call.ask_claude_for_analysis()
     - Add fallback logic for v5 clients

  2. Update DynamoDB Client Configs
     - Set config_versions.prompt = 'v6' for test client
     - Keep v5 for production clients initially

  3. Deploy to Test Environment
     - Deploy modified Lambda with PROMPT_VERSION=v6 flag
     - Test with sample transcripts
     - Validate Pydantic output matches expected schema

Phase 3: Rollout (Week 3)

  1. Pilot Deployment
     - Enable v6 for one site (e.g., orange-theory-5736520)
     - Monitor for 24-48 hours
     - Compare output quality with v5

  2. Production Rollout
     - Update remaining sites to v6
     - Monitor error rates and processing times
     - Keep v5 files for emergency rollback

Phase 4: Cleanup (Week 4)

  1. Remove Legacy Code
     - Delete AI mapping functions once all clients are on v6
     - Remove schema loading logic
     - Archive v5 configuration files

Risk Mitigation

Rollback Strategy

# Instant rollback via active-versions.json in S3
# Upload a file with prompt version set back to v5
echo '{"prompt": "v5", "schema": "v5", "ai_mapping": "v5", "provider_mapping": "v5"}' | \
  aws s3 cp - s3://bucket/{franchise}/{site_id}/active-versions.json

# Or via DynamoDB client-configurations table
# (PartiQL; the hyphenated table name must be double-quoted)
UPDATE "client-configurations"
SET config_versions.prompt = 'v5'
WHERE client_id = 'orange-theory-5736520'

Testing Requirements

  1. Unit Tests
     - Test Pydantic model validation
     - Test prompt file loading
     - Test tool-calling integration

  2. Integration Tests
     - End-to-end processing with real transcripts
     - Validate DynamoDB storage
     - Compare v5 vs v6 outputs

  3. Performance Tests
     - Measure latency difference
     - Monitor token usage
     - Check memory consumption
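For the v5 versus v6 comparison, a small field-level diff of the two flat items is usually enough. A sketch with a hypothetical helper name; it assumes createdAt and ttl always differ between runs and should be ignored.

```python
def diff_flat_items(v5_item, v6_item, ignore=("createdAt", "ttl")):
    """Compare two flat DynamoDB items produced from the same transcript."""
    report = {"missing_in_v6": [], "added_in_v6": [], "changed": {}}
    keys = (set(v5_item) | set(v6_item)) - set(ignore)
    for key in sorted(keys):
        if key not in v6_item:
            report["missing_in_v6"].append(key)   # would break existing consumers
        elif key not in v5_item:
            report["added_in_v6"].append(key)
        elif v5_item[key] != v6_item[key]:
            report["changed"][key] = (v5_item[key], v6_item[key])
    return report
```

Any entry under missing_in_v6 points to a flat field that the frontend or analytics queries may still expect, so an integration test could fail on a non-empty list there while only logging the changed values for manual review.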

Monitoring and Success Metrics

Key Metrics to Track

  1. Validation Success Rate
     - Target: >95% first-attempt success (up from 85%)
     - Monitor via CloudWatch custom metrics

  2. Processing Time
     - Target: <26 seconds (current: 25 seconds)
     - Acceptable: Up to 30 seconds

  3. Error Rates
     - DLQ message rate should decrease
     - Pydantic validation errors should be rare

  4. Cost Impact
     - Token usage may decrease (structured output)
     - S3 costs minimal (6 small files vs 1 large)

CloudWatch Alarms

# Add new alarms
- PydanticValidationErrors > 5 per hour
- PromptFilesMissing > 0
- V6ProcessingTime > 30 seconds
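These alarms need the Lambda to publish the underlying custom metrics. A sketch of one way to do that with the CloudWatch put_metric_data call; the namespace, dimension names, and helper names are assumptions, while the metric names are taken from the alarm list above.

```python
def build_v6_metric_data(franchise, site_id, validation_errors, processing_seconds):
    """Build the MetricData list for one processed call."""
    dimensions = [
        {"Name": "Franchise", "Value": franchise},   # assumed dimension names
        {"Name": "SiteId", "Value": site_id},
    ]
    return [
        {
            "MetricName": "PydanticValidationErrors",
            "Dimensions": dimensions,
            "Value": validation_errors,
            "Unit": "Count",
        },
        {
            "MetricName": "V6ProcessingTime",
            "Dimensions": dimensions,
            "Value": processing_seconds,
            "Unit": "Seconds",
        },
    ]


def publish_v6_metrics(cloudwatch, franchise, site_id, validation_errors,
                       processing_seconds, namespace="CloudAudioAI/V6"):
    """Publish metrics; 'cloudwatch' is a boto3 CloudWatch client."""
    cloudwatch.put_metric_data(
        Namespace=namespace,  # hypothetical namespace
        MetricData=build_v6_metric_data(
            franchise, site_id, validation_errors, processing_seconds
        ),
    )
```

Publishing a zero for PydanticValidationErrors on successful calls keeps the metric continuous, which makes the hourly alarm threshold easier to evaluate.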

File-by-File Change Summary

Files to CREATE

  1. /src/models.py - Pydantic models (copy from essential-code)
  2. /src/bedrock_call.py - Tool-calling wrapper (copy from essential-code)
  3. /prompts/v6/*.txt - 6 split prompt files per site
  4. /scripts/split_prompt_v5.py - Migration script

Files to MODIFY

  1. /src/TranscriptionResultProcessor_MultiTenant.py
     - Add v6 prompt loading
     - Replace text generation with tool-calling
     - Replace AI mapping logic with flattening of the Pydantic output (keep the v5 path until cleanup)

  2. /src/config_loader.py (if exists)
     - Add load_prompt_v6() method
     - Skip schema and AI mapping loading for v6

Files to DELETE (after migration)

  1. Schema files (schema-v5.json) - No longer needed with Pydantic validation
  2. AI mapping files (ai-mapping-v5.json) - Field paths now hardcoded in flattening logic
  3. Provider mapping files - Do NOT delete; still needed for non-AI fields (call metadata)
  4. Legacy v5 code paths - After full migration complete

Files UNCHANGED

  1. TranscribeProcessor_MultiTenant.py - No changes needed
  2. RingCentralWebhookHandler_MultiTenant.py - No changes needed
  3. DynamoDB table schemas - Same structure, different population method
  4. Frontend/BFF API - Consumes same DynamoDB data

Appendix: Code Snippets

A. Load v6 Prompts

import logging

logger = logging.getLogger(__name__)

def load_prompt_v6(s3_client, bucket, franchise, site_id):
    """Load and concatenate 6 prompt files"""
    base_path = f"{franchise}/{site_id}/prompts/v6"
    files = [
        "01-franchise-core.txt",
        "02-knowledge.txt",
        "03-coaching.txt",
        "04-staff-list.txt",
        "05-studio-output-instructions.txt",
        "06-custom-requirements.txt"
    ]

    prompt_parts = []
    for filename in files:
        try:
            obj = s3_client.get_object(
                Bucket=bucket,
                Key=f"{base_path}/{filename}"
            )
            content = obj['Body'].read().decode('utf-8')
            prompt_parts.append(content)
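        except s3_client.exceptions.NoSuchKey as e:
            # Only a missing object is handled here; other errors
            # (permissions, throttling) propagate unchanged
            if '06-custom' in filename:
                # Only file 6 is optional
                logger.info(f"Optional file not found: {filename}")
                continue
            # Files 1-5 are mandatory
            raise ValueError(f"Required file missing: {filename}") from e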

    return "\n\n".join(prompt_parts)

B. Integration Point

def process_transcript(transcript, franchise, site_id, model_id, call_event, metadata):
    """Main processing function with v6 support - maintains flat DynamoDB schema"""

    # Check version
    client_config = get_client_config(franchise, site_id)
    version = client_config.get('config_versions', {}).get('prompt', 'v5')

    if version == 'v6':
        # New v6 path with Pydantic validation
        prompt = load_prompt_v6(s3_client, bucket, franchise, site_id)

        # Use tool-calling for validation (returns dict, not Pydantic model)
        from bedrock_call import ask_claude_for_analysis
        model_dict = ask_claude_for_analysis(
            transcript=transcript,
            prompt_text=prompt,
            model_id=model_id  # Optional: override model
        )

        # CRITICAL: Flatten for DynamoDB compatibility
        # Import the flattening module
        from flattening_v6 import HybridFlattener, FlatteningConfigLoader

        # Load custom flattening if exists
        flattening_loader = FlatteningConfigLoader(s3_client, bucket)
        custom_flattening = flattening_loader.load_custom_flattening(franchise, site_id)

        # Flatten using hybrid approach
        # Note: HybridFlattener includes built-in DynamoDB sanitization
        flattened_item = HybridFlattener.flatten_model(
            pydantic_model=model_dict,  # Already a dict from ask_claude_for_analysis
            call_event=call_event,
            metadata=metadata,
            custom_flattening=custom_flattening
        )

        # No need for separate sanitization - HybridFlattener handles it internally
        # The flatten_model method already calls _sanitize_for_dynamodb()

        # Save flat structure to DynamoDB
        save_to_dynamodb(flattened_item)
    else:
        # Legacy v5 path - unchanged
        config = load_config(franchise, site_id)
        result = call_bedrock_text_mode(transcript, config['prompt'])

        # Apply dual mapping pattern
        processor = DataProcessor()
        mapped = processor.flatten_with_dual_mapping(
            result,
            config['ai_mapping'],
            config.get('provider_mapping'),
            call_event,
            metadata
        )
        save_to_dynamodb(mapped)

Contact and Support

Technical Lead: [Your Team]

Slack Channel: #cloudaudioai-v6-migration

Documentation: This guide + design docs in /agent-design/

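Emergency Rollback Contact: DevOps on-call

Rollback Command:

aws lambda update-function-configuration --function-name TranscriptionResultProcessor --environment "Variables={PROMPT_VERSION=v5}"

Note: --environment replaces the function's entire set of environment variables, so include any other variables the function already relies on.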