Site-Specific Output Architecture - Balancing Standardization with Customization (v1.1)

The Challenge

Each site may want to track different metrics while maintaining a validated, structured output. How do we balance:

- Standardization: Core fields that all sites need (for aggregation/reporting)
- Customization: Site-specific fields (local promotions, special programs)
- Validation: Pydantic schema enforcement via Bedrock tool-calling (no schema.json needed)

Recommended Architecture: Core + Extensions

1. Pydantic Model Structure

# models.py - Modified to support site-specific extensions

from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Field

# AICallAnalysis, CustomerProfile, PrimaryTopic, SecondaryTopic, RevenuePriority,
# FollowUp, and PracticalCoaching are the existing core models, assumed to be
# defined earlier in this module.

class SiteContext(BaseModel):
    """Structured site-specific information"""
    # Defined before CallAnalysis so the annotation there resolves at class creation
    promotions_mentioned: List[str] = Field(default_factory=list)
    special_programs_discussed: List[str] = Field(default_factory=list)
    competitor_mentioned: Optional[str] = None
    local_events_referenced: List[str] = Field(default_factory=list)
    custom_metrics: Dict[str, Any] = Field(default_factory=dict)

class CallAnalysis(BaseModel):
    """Core fields required for ALL sites"""

    # Standard fields every site needs
    ai_call_analysis: AICallAnalysis
    customer_profile: CustomerProfile
    primary_topic: PrimaryTopic
    secondary_topics: List[SecondaryTopic] = Field(default_factory=list)
    revenue_priority: RevenuePriority
    summary: str
    follow_up: FollowUp
    practical_coaching: PracticalCoaching

    # NEW: Site-specific extensions
    site_specific: Optional[Dict[str, Any]] = Field(
        default_factory=dict,
        description="Site-specific custom fields"
    )

    # NEW: Captured site context
    site_context: Optional[SiteContext] = None

2. How Sites Define Their Requirements

Important: No schema.json file is needed. The Pydantic models define validation, and the output instructions guide the AI on what to populate.
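
Because the Pydantic model is the single source of truth, the JSON Schema that the tool definition needs can be generated from the model at call time. The following is a minimal sketch, assuming Pydantic v2 (`model_json_schema()`) and the `toolSpec` layout of the Bedrock Converse API. `MiniCallAnalysis`, `build_tool_config`, and the tool name `record_call_analysis` are hypothetical stand-ins for illustration, not the project's actual code.

```python
from typing import Any, Dict, List, Optional, Type

from pydantic import BaseModel, Field


class SiteContext(BaseModel):
    """Trimmed-down copy of SiteContext, for illustration only."""
    promotions_mentioned: List[str] = Field(default_factory=list)
    competitor_mentioned: Optional[str] = None


class MiniCallAnalysis(BaseModel):
    """Hypothetical stand-in for CallAnalysis with a single core field."""
    summary: str
    site_context: Optional[SiteContext] = None
    site_specific: Dict[str, Any] = Field(default_factory=dict)


def build_tool_config(model: Type[BaseModel], tool_name: str) -> Dict[str, Any]:
    """Wrap a model's generated JSON Schema in a Converse-style tool config."""
    return {
        "tools": [
            {
                "toolSpec": {
                    "name": tool_name,
                    "description": model.__doc__ or "",
                    "inputSchema": {"json": model.model_json_schema()},
                }
            }
        ],
        # Ask the model to answer through this tool rather than free text.
        "toolChoice": {"tool": {"name": tool_name}},
    }


tool_config = build_tool_config(MiniCallAnalysis, "record_call_analysis")
schema = tool_config["tools"][0]["toolSpec"]["inputSchema"]["json"]
```

Only `summary` ends up in the schema's `required` list, since the two site fields carry defaults. That matches the intent of strict core fields with optional site extensions.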

File Path: `{franchise}/{siteId}/prompts/v6/05-studio-output-instructions.txt`

STUDIO-SPECIFIC OUTPUT REQUIREMENTS FOR WEST HARLEM:

REQUIRED SITE-SPECIFIC TRACKING:
You must capture the following in the site_context field:

promotions_mentioned:
- "student_discount" (20% off for Columbia students)
- "healthcare_worker_special" (Hospital partnerships)
- "new_year_challenge" (January only)

special_programs_discussed:
- "6am_warriors" (Early morning program)
- "columbia_crew" (Student group classes)
- "transformation_challenge" (8-week program)

competitor_mentioned:
- Track if these competitors are mentioned:
  - "equinox" (Premium competitor 2 blocks away)
  - "planet_fitness" (Budget alternative)
  - "ymca" (Community option)

custom_metrics:
- "spanish_language_request": true/false
- "parking_concern_raised": true/false
- "student_status_mentioned": true/false

IMPORTANT: Always populate site_context when these items are discussed.

File Path: `{franchise}/{siteId}/prompts/v6/06-custom-requirements.txt`

CUSTOM BUSINESS CONTEXT FOR WEST HARLEM:

CURRENT ACTIVE PROMOTIONS (as of January 2025):
- Student Special: 20% off unlimited for verified Columbia/City College students
- Healthcare Heroes: $99/month for hospital employees (usually $179)
- New Year Challenge: Sign up in January, get February 50% off

OUR UNIQUE PROGRAMS:
- 6AM Warriors: M/W/F intensive sessions for finance professionals
- Columbia Crew: Thursday college night classes with DJ
- Transformation Challenge: 8-week program with nutrition coaching ($299 add-on)

WHAT TO LISTEN FOR:
- Time constraints due to work schedules (suggest 6AM Warriors)
- Student budget concerns (offer student discount)
- Comparison shopping with Equinox (emphasize community over luxury)

SALES PRIORITIES:
1. Convert intro to Elite membership (highest margin)
2. Upsell Transformation Challenge to new members
3. Capture student market before semester starts

3. The Key Architectural Split

| Content Type | Which File | Purpose |
| --- | --- | --- |
| What to track | `05-studio-output-instructions.txt` | Tells AI what site-specific fields to populate |
| Business context | `06-custom-requirements.txt` | Provides context about programs/promotions |
| How to analyze | `06-custom-requirements.txt` | Business logic and priorities |
| Output structure | `05-studio-output-instructions.txt` | Field definitions and requirements |
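
The split above implies that a site's prompt is assembled by reading the files in its folder and concatenating them in order. A minimal sketch of that assembly follows, assuming every prompt file carries a two-digit numeric prefix so that filename order equals intended order. `build_prompt` and the franchise name `acme` are hypothetical.

```python
from pathlib import Path
import tempfile


def build_prompt(root: Path, franchise: str, site_id: str, version: str = "v6") -> str:
    """Concatenate every *.txt prompt file for a site, ordered by filename."""
    prompt_dir = root / franchise / site_id / "prompts" / version
    files = sorted(prompt_dir.glob("*.txt"))  # 05-... sorts before 06-...
    if not files:
        raise FileNotFoundError(f"No prompt files found in {prompt_dir}")
    return "\n\n".join(f.read_text(encoding="utf-8").strip() for f in files)


# Demo against a throwaway directory; "acme" is a made-up franchise name.
with tempfile.TemporaryDirectory() as tmp:
    site_dir = Path(tmp) / "acme" / "5736520" / "prompts" / "v6"
    site_dir.mkdir(parents=True)
    (site_dir / "06-custom-requirements.txt").write_text(
        "BUSINESS CONTEXT", encoding="utf-8"
    )
    (site_dir / "05-studio-output-instructions.txt").write_text(
        "WHAT TO TRACK", encoding="utf-8"
    )
    prompt = build_prompt(Path(tmp), "acme", "5736520")
```

Failing loudly on an empty folder is a deliberate choice in this sketch: a site with no prompt files would otherwise silently run with no site-specific instructions at all.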

4. Implementation Options

Option A: Flexible Dict Field (Recommended for v1.0)

class CallAnalysis(BaseModel):
    # ... core fields ...

    site_specific: Dict[str, Any] = Field(
        default_factory=dict,
        description="Site-specific fields defined in studio-output-instructions"
    )

Pros:

- Maximum flexibility
- No code changes for new sites
- Sites can add fields immediately

Cons:

- No strict validation on site fields
- Potential inconsistency between sites

Option B: Structured Site Context (Better for v2.0)

from typing import Dict, Type

from pydantic import BaseModel

class WestHarlemExtensions(BaseModel):
    """Site-specific model for West Harlem"""
    student_discount_mentioned: bool = False
    competitor_equinox: bool = False
    spanish_request: bool = False
    morning_class_interest: bool = False

class TotowaExtensions(BaseModel):
    """Site-specific model for Totowa"""
    suburban_parking_discussed: bool = False
    family_membership_interest: bool = False
    kids_program_mentioned: bool = False

# Dynamic loading based on site_id
def get_site_model(site_id: str) -> Type[BaseModel]:
    models: Dict[str, Type[BaseModel]] = {
        "5736520": WestHarlemExtensions,
        # Keyed by site name here; use the site's ID if one exists
        "Totowa": TotowaExtensions,
    }
    # Unknown sites fall back to the empty BaseModel (no site-specific fields)
    return models.get(site_id, BaseModel)

Pros:

- Full validation per site
- Type safety
- Clear documentation

Cons:

- Requires code deployment for new fields
- More complex to maintain

Option C: Hybrid Approach (Recommended Long-term)

class CallAnalysis(BaseModel):
    # Core validated fields (mandatory)
    ai_call_analysis: AICallAnalysis
    customer_profile: CustomerProfile
    # ... other core fields ...

    # Semi-structured site fields (validated categories)
    site_context: SiteContext  # Promotions, programs, competitors

    # Fully flexible site fields (anything goes); named site_specific to match
    # the field used elsewhere in this document
    site_specific: Dict[str, Any] = Field(default_factory=dict)  # For experimental metrics
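
To make the hybrid trade-off concrete, the sketch below shows a structured `site_context` rejecting a malformed value while the free-form dict accepts anything. It assumes Pydantic is installed. `HybridAnalysis` and `try_validate` are hypothetical, cut-down stand-ins rather than the real models.

```python
from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Field, ValidationError


class SiteContext(BaseModel):
    """Trimmed-down copy of SiteContext, for illustration only."""
    promotions_mentioned: List[str] = Field(default_factory=list)
    competitor_mentioned: Optional[str] = None


class HybridAnalysis(BaseModel):
    """Hypothetical cut-down CallAnalysis: one core field plus both site fields."""
    summary: str
    site_context: SiteContext = Field(default_factory=SiteContext)
    site_specific: Dict[str, Any] = Field(default_factory=dict)


def try_validate(payload: Dict[str, Any]) -> List[str]:
    """Return the dotted paths of fields that failed validation (empty if OK)."""
    try:
        HybridAnalysis(**payload)
        return []
    except ValidationError as exc:
        return [".".join(str(part) for part in err["loc"]) for err in exc.errors()]


ok = try_validate({
    "summary": "Caller asked about student pricing.",
    "site_context": {"promotions_mentioned": ["student_discount"]},
    "site_specific": {"anything": ["goes", 1, True]},  # no validation here
})
bad = try_validate({
    "summary": "Caller asked about student pricing.",
    "site_context": {"promotions_mentioned": "student_discount"},  # must be a list
})
```

The returned paths are the kind of feedback that could be passed back to the model on a retry.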

5. How AI Populates Site-Specific Fields

The AI reads `05-studio-output-instructions.txt` and knows to populate:

{
  "ai_call_analysis": { ... },
  "customer_profile": { ... },
  "primary_topic": { ... },

  "site_context": {
    "promotions_mentioned": ["student_discount"],
    "special_programs_discussed": ["6am_warriors"],
    "competitor_mentioned": "equinox",
    "custom_metrics": {
      "spanish_language_request": false,
      "student_status_mentioned": true
    }
  },

  "site_specific": {
    "intro_scheduled_time": "6:00 AM",
    "commute_time_mentioned": "20 minutes",
    "price_sensitivity_score": "high"
  }
}
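
Since the identifiers in `site_context` come from plain-text instructions rather than an enum, nothing stops the AI from returning a value that no site defined. One way to catch that is a warning-level check after validation. This sketch assumes the allowed identifiers are kept in a Python mapping that mirrors the instruction file. `ALLOWED` and `unknown_identifiers` are hypothetical names.

```python
from typing import Any, Dict, List, Set

# Assumption: mirrors the identifiers listed in 05-studio-output-instructions.txt.
ALLOWED: Dict[str, Set[str]] = {
    "promotions_mentioned": {
        "student_discount", "healthcare_worker_special", "new_year_challenge",
    },
    "special_programs_discussed": {
        "6am_warriors", "columbia_crew", "transformation_challenge",
    },
    "competitor_mentioned": {"equinox", "planet_fitness", "ymca"},
}


def unknown_identifiers(site_context: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each tracked field to the values that are not in its allowed set."""
    problems: Dict[str, List[str]] = {}
    for field, allowed in ALLOWED.items():
        value = site_context.get(field)
        if value is None:
            continue
        # competitor_mentioned is a single string; the others are lists.
        values = value if isinstance(value, list) else [value]
        extras = [v for v in values if v not in allowed]
        if extras:
            problems[field] = extras
    return problems


sample = {
    "promotions_mentioned": ["student_discount", "summer_sale"],
    "special_programs_discussed": ["6am_warriors"],
    "competitor_mentioned": "equinox",
}
report = unknown_identifiers(sample)
```

Here `summer_sale` is flagged because it does not appear in the West Harlem instructions. Everything else passes.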

6. Validation Strategy with Bedrock Tool-Calling

# No schema.json loading - Pydantic models ARE the schema
import logging
from typing import Optional

from pydantic import ValidationError

from models import CallAnalysis
from bedrock_call import ask_claude_for_analysis

logger = logging.getLogger(__name__)

def process_with_validation(
    transcript: str, prompt: str, site_id: Optional[str] = None
) -> CallAnalysis:
    """
    Use Bedrock tool-calling for automatic Pydantic validation
    No schema.json needed - the tool definition includes the schema
    """
    # 1. Call Bedrock with tool (automatic validation)
    try:
        result = ask_claude_for_analysis(
            transcript=transcript,
            prompt_text=prompt  # Concatenated 6 files
        )
        # Result is already validated by Pydantic!
    except ValidationError as e:
        # One retry with error feedback; a second failure propagates to the caller
        error_prompt = f"{prompt}\n\nFIX THESE ERRORS:\n{e.errors()}"
        result = ask_claude_for_analysis(
            transcript=transcript,
            prompt_text=error_prompt
        )

    # 2. Site fields are flexibly validated (warnings only)
    if site_id == "5736520":
        expected = ["student_discount_mentioned", "morning_interest"]
        site_fields = result.site_specific or {}
        for field in expected:
            if field not in site_fields:
                logger.warning(f"Expected site field missing: {field}")

    return result

7. Dashboard/Reporting Considerations

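# `query` is assumed to be the project's existing database helper, and the dotted
# paths assume a store that exposes nested JSON fields this way.

# Core fields: Aggregatable across all sites
def get_franchise_metrics(franchise: str):
    # These work for all sites
    return query("""
        SELECT
            COUNT(*) as total_calls,
            SUM(CASE WHEN primary_topic.outcome = 'success' THEN 1 ELSE 0 END) as successes
        FROM call_analysis
        WHERE franchise = ?
    """, franchise)

# Site fields: Site-specific queries
def get_site_metrics(site_id: str):
    if site_id == "5736520":  # West Harlem
        # student_status_mentioned lives under site_context.custom_metrics
        return query("""
            SELECT
                SUM(CASE WHEN site_context.custom_metrics.student_status_mentioned THEN 1 ELSE 0 END) as student_interests,
                SUM(CASE WHEN site_context.promotions_mentioned LIKE '%student%' THEN 1 ELSE 0 END) as student_promo_mentions
            FROM call_analysis
            WHERE site_id = ?
        """, site_id)
    # No site-specific metrics are defined for other sites yet
    return None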

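The SQL above depends on how the storage layer exposes nested JSON. The same two aggregations can be sketched in plain Python over already-loaded records, which also shows why site metrics need to tolerate missing fields while core metrics do not. The function names and the `no_sale` outcome value are illustrative assumptions.

```python
from typing import Any, Dict, Iterable, List


def franchise_metrics(records: Iterable[Dict[str, Any]]) -> Dict[str, int]:
    """Core metrics: rely only on fields every site populates."""
    rows = list(records)
    successes = sum(
        1 for r in rows if r["primary_topic"].get("outcome") == "success"
    )
    return {"total_calls": len(rows), "successes": successes}


def student_promo_mentions(records: Iterable[Dict[str, Any]], site_id: str) -> int:
    """Site metric: tolerate records where site_context is absent or null."""
    count = 0
    for r in records:
        if r.get("site_id") != site_id:
            continue
        promos = (r.get("site_context") or {}).get("promotions_mentioned", [])
        if any("student" in p for p in promos):
            count += 1
    return count


calls: List[Dict[str, Any]] = [
    {"site_id": "5736520", "primary_topic": {"outcome": "success"},
     "site_context": {"promotions_mentioned": ["student_discount"]}},
    {"site_id": "5736520", "primary_topic": {"outcome": "no_sale"},
     "site_context": None},
    {"site_id": "Totowa", "primary_topic": {"outcome": "success"}},
]
core = franchise_metrics(calls)
west_harlem_students = student_promo_mentions(calls, "5736520")
```
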
Recommendations

For Initial Implementation (v1.0):

- Use Option A (flexible Dict field)
- Put tracking requirements in `{franchise}/{siteId}/prompts/v6/05-studio-output-instructions.txt`
- Put business context in `{franchise}/{siteId}/prompts/v6/06-custom-requirements.txt`
- Let sites experiment with fields without code changes
- No schema.json needed - Pydantic models handle all validation via tool-calling

For Future Evolution (v2.0):

- Analyze which site fields are commonly used
- Promote common fields to site_context (semi-structured)
- Keep site_specific dict for experimentation
- Consider site-specific Pydantic models only if strict validation becomes critical
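
The first step, finding which site fields are commonly used, can be approximated by counting how many distinct sites populate each `site_specific` key. The sketch below assumes stored analyses are available as dicts with `site_id` and `site_specific` keys. `promotion_candidates` and the `min_sites` threshold are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Set, Tuple


def promotion_candidates(
    records: Iterable[Dict[str, Any]], min_sites: int = 2
) -> List[Tuple[str, int]]:
    """Return (field, site_count) for keys used by at least min_sites sites."""
    sites_per_field: Dict[str, Set[str]] = {}
    for r in records:
        for key in (r.get("site_specific") or {}):
            sites_per_field.setdefault(key, set()).add(r["site_id"])
    counts = Counter({k: len(v) for k, v in sites_per_field.items()})
    # most_common() orders by site count, highest first.
    return [(k, n) for k, n in counts.most_common() if n >= min_sites]


history = [
    {"site_id": "5736520",
     "site_specific": {"price_concern": "high", "student_mentioned": True}},
    {"site_id": "Totowa",
     "site_specific": {"price_concern": "low", "kids_program": True}},
    {"site_id": "Totowa", "site_specific": {"price_concern": "high"}},
]
candidates = promotion_candidates(history)
```

Counting distinct sites rather than raw occurrences keeps one high-volume site from dominating the result.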

Example Migration Path:

Month 1: West Harlem experiments

site_specific: {
    "student_mentioned": true,
    "morning_preference": true,
    "price_concern": "high"
}

Month 3: Pattern emerges, promote to structured

site_context: {
    "customer_segment": "student",
    "preferred_time": "morning",
    "price_sensitivity": "high"
}

Month 6: Standardize across franchise

# Add to core CustomerProfile
customer_profile: {
    "type": "prospect",
    "segment": "student",  # Now standard field
    "price_sensitivity": "high"  # Now standard field
}
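
The Month 1 to Month 3 step amounts to a key mapping, which matters when older records need backfilling into the structured form. A minimal sketch, assuming the mapping is exactly the one shown in the example above. `promote_site_fields` is a hypothetical helper.

```python
from typing import Any, Dict


def promote_site_fields(site_specific: Dict[str, Any]) -> Dict[str, Any]:
    """Translate Month 1 experimental keys into Month 3 site_context keys."""
    promoted: Dict[str, Any] = {}
    if site_specific.get("student_mentioned"):
        promoted["customer_segment"] = "student"
    if site_specific.get("morning_preference"):
        promoted["preferred_time"] = "morning"
    if "price_concern" in site_specific:
        promoted["price_sensitivity"] = site_specific["price_concern"]
    return promoted


month_1 = {"student_mentioned": True, "morning_preference": True, "price_concern": "high"}
month_3 = promote_site_fields(month_1)
```

Flags that are false or absent produce no key at all, so a backfilled record does not claim a segment or preference that was never observed.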

Summary

The simplified architecture:

- All files under → `{franchise}/{siteId}/prompts/v6/` (for now)
- What to output → `05-studio-output-instructions.txt`
- Business context → `06-custom-requirements.txt`
- Core validation → Pydantic models via Bedrock tool-calling (strict)
- Site flexibility → Dict fields (flexible)
- No schema.json → Pydantic models ARE the schema

This gives each site the ability to:

1. Track their unique metrics immediately (no code changes)
2. Maintain core data quality (automatic Pydantic validation)
3. Experiment and iterate (flexible dict fields)
4. Promote successful experiments to standard fields over time
5. Avoid complexity of separate schema files