CloudAudioAI Simplified Technical Requirements - v1.1

Overview

This document specifies the minimal technical changes needed for v1.1: split prompts (all under site path), staff list in prompt file, and Bedrock tool validation without schema.json. We explicitly defer complex features and maximize reuse of existing infrastructure. The key simplification is keeping all 6 files under {franchise}/{siteId}/prompts/v6/ initially, with migration to franchise-level sharing only when scale demands it.

Three Core Changes Only

1. Split Prompt Files (6 Files from 1)

1.1 File Creation from prompt-v5.txt

Create these files in S3 (all under site path for simplicity):

s3://{bucket}/{franchise}/{siteId}/prompts/v6/
├── 01-franchise-core.txt
├── 02-knowledge.txt
├── 03-coaching.txt
├── 04-staff-list.txt
├── 05-studio-output-instructions.txt
└── 06-custom-requirements.txt

Note: No schema.json needed - Pydantic models define the schema

Content Mapping from prompt-v5.txt (or prompt-v6.txt):

01-franchise-core.txt:
- Lines 1-165: Role, task, processing steps, categorization

02-knowledge.txt:
- Lines 168-596: Sales, cancellation, freeze modules
- Keep sections clearly separated with headers

03-coaching.txt:
- Lines 270-347, 389-402, 527-540, 583-595: All scenarios
- Group by topic (sales, billing, cancellation, freeze)

04-staff-list.txt:
- Simple text list of staff names with roles
- Include aliases and clarifications
- No JSON format - plain text instructions

05-studio-output-instructions.txt:
- Lines 599-619: JSON structure introduction
- Lines 620-739: TAXONOMY section (category/subcategory mappings)
- Lines 740-781: ENUMS, terminology, outcome classification
- Add at end: "INSTRUCTIONS: You MUST respond by calling the tool `record_call_analysis`"

06-custom-requirements.txt:
- Site-specific context, promotions, local market info
- Business priorities for the quarter
- Competitive positioning notes

Important: The taxonomy section (lines 620-739) is CRITICAL as it defines the valid values for the Pydantic enum fields. This ensures the AI knows exactly which subcategories are allowed for each category.
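
The category/subcategory constraint described above can be illustrated with a small standard-library sketch. This is not the contents of models.py (which uses Pydantic); the `Category` values, the `TAXONOMY` mapping, and the `CallLabel` class are hypothetical stand-ins chosen only to show the idea that each category admits a fixed set of subcategories.

```python
from dataclasses import dataclass
from enum import Enum


class Category(str, Enum):
    # Hypothetical category values; the real ones come from the taxonomy section.
    SALES = "sales"
    CANCELLATION = "cancellation"


# Hypothetical taxonomy: category -> allowed subcategories.
TAXONOMY = {
    Category.SALES: {"intro_booking", "membership_upgrade"},
    Category.CANCELLATION: {"relocation", "cost"},
}


@dataclass(frozen=True)
class CallLabel:
    category: Category
    subcategory: str

    def __post_init__(self):
        # Coerce raw strings to the enum; an unknown category raises ValueError.
        object.__setattr__(self, "category", Category(self.category))
        allowed = TAXONOMY[self.category]
        if self.subcategory not in allowed:
            raise ValueError(
                f"{self.subcategory!r} is not valid for {self.category.value!r}; "
                f"expected one of {sorted(allowed)}"
            )
```

If the prompt taxonomy and the model enums drift apart, the AI may return values that a check like this rejects, which is why file 05 has to stay in sync with the schema.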

1.2 Lambda Code Change

File to Modify: TranscriptionResultProcessor_MultiTenant.py

Current Code (around line 1135):

prompt_content = load_versioned_config(
    s3_client, bucket, franchise, site_id,
    'prompts', f'prompt-{version}.txt'
)

New Code (Simplified - All Files in Site Path):

def load_prompt_v6(s3_client, bucket, franchise, site_id):
    """Load and concatenate v6 split prompt files from site directory"""
    # Simple single-path loading - all files under site
    base_path = f"{franchise}/{site_id}/prompts/v6"
    files = ['01-franchise-core.txt', '02-knowledge.txt', '03-coaching.txt',
             '04-staff-list.txt', '05-studio-output-instructions.txt',
             '06-custom-requirements.txt']
    prompt_parts = []

    for filename in files:
        try:
            obj = s3_client.get_object(
                Bucket=bucket,
                Key=f"{base_path}/{filename}"
            )
            content = obj['Body'].read().decode('utf-8')
            prompt_parts.append(content)
            logger.info(f"[PROMPT] Loaded {filename}")
        except s3_client.exceptions.NoSuchKey:
            # Only a missing object is handled here; other S3 errors (throttling,
            # permissions) propagate. Without s3:ListBucket, S3 reports a missing
            # key as AccessDenied rather than NoSuchKey.
            if '06-custom' in filename:
                logger.info(f"[PROMPT] Optional file not found: {filename}")
                continue  # Only file 6 is truly optional
            else:
                # Files 1-5 are REQUIRED (file 05 carries the taxonomy needed for tool validation)
                logger.error(f"[PROMPT] Required file missing: {filename}")
                raise ValueError(f"Required prompt file missing: {filename}. Files 1-5 are mandatory for v6.")

    return '\n\n'.join(prompt_parts)

# In main processing:
version = os.environ.get('PROMPT_VERSION', 'v5')
if version == "v6":
    prompt_content = load_prompt_v6(s3_client, bucket, franchise, site_id)
else:
    prompt_content = load_prompt_v5(...)  # existing code
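
The required/optional behavior of the loader can be exercised locally without AWS. The following is a minimal sketch under stated assumptions: `FakeS3Client` is a hypothetical in-memory test double (not part of boto3), it signals a missing key with `KeyError` instead of the real S3 error, and `load_parts` is a simplified re-statement of the loading rule rather than the production function.

```python
import io


class FakeS3Client:
    """In-memory stand-in for the S3 client (hypothetical test double)."""

    def __init__(self, objects):
        self._objects = objects  # maps object key -> text content

    def get_object(self, Bucket, Key):
        if Key not in self._objects:
            raise KeyError(Key)  # stands in for S3's missing-key error
        return {"Body": io.BytesIO(self._objects[Key].encode("utf-8"))}


REQUIRED = ["01-franchise-core.txt", "02-knowledge.txt", "03-coaching.txt",
            "04-staff-list.txt", "05-studio-output-instructions.txt"]
OPTIONAL = ["06-custom-requirements.txt"]


def load_parts(client, bucket, base_path):
    """Concatenate prompt files in order; only the optional file may be absent."""
    parts = []
    for name in REQUIRED + OPTIONAL:
        try:
            body = client.get_object(Bucket=bucket, Key=f"{base_path}/{name}")["Body"]
        except KeyError:
            if name in OPTIONAL:
                continue
            raise ValueError(f"Required prompt file missing: {name}")
        parts.append(body.read().decode("utf-8"))
    return "\n\n".join(parts)


# Example: a site that has files 1-5 but no custom requirements file.
base = "demo-franchise/site-001/prompts/v6"  # hypothetical path values
client = FakeS3Client({f"{base}/{name}": f"<{name}>" for name in REQUIRED})
prompt = load_parts(client, "demo-bucket", base)
```

A test like this makes it cheap to confirm that a missing file 06 is tolerated while a missing file 01-05 fails loudly before any Bedrock call is made.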

2. Staff Registry in Prompt File

2.1 Include Staff in 04-staff-list.txt (Simplest Approach)

Location Pattern: s3://{bucket}/{franchise}/{siteId}/prompts/v6/04-staff-list.txt

File Content Example (plain text, not JSON):

KNOWN STAFF MEMBERS:
The following staff members work at this location:
- Drew (Sales Manager, works M-F mornings)
- Nadia (Head Coach, specializes in strength training)
- Jess (Front Desk Lead, handles billing issues)
- Holly (Assistant Manager, membership specialist)
- Ryanne (Sales Associate, Spanish speaker)
- Ryan (Front Desk, part-time evenings)
- Will (Coach, runs transformation challenges)
- Coach Marcus (Lead Coach, nutrition certified)

STAFF NAME CLARIFICATIONS:
- "Ryan" and "Ryanne" are different people
- "Jessica" is the same person as "Jess"
- "Marcus" should be identified as "Coach Marcus"

IDENTIFICATION RULE:
If you cannot identify the staff member with >80% confidence,
use "unidentified staff member" rather than guessing.

Benefits of Text Format:

  1. No JSON parsing needed
  2. Easy to edit directly in S3 console
  3. Natural language instructions for AI
  4. Included in prompt concatenation automatically
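
When many sites need the same layout, the text file could also be generated rather than hand-edited. The sketch below is one possible approach, not an existing tool: `render_staff_list` and its arguments are hypothetical, and it simply reproduces the three sections shown in the example above.

```python
def render_staff_list(staff, clarifications):
    """Render the plain-text body of 04-staff-list.txt.

    staff: list of (name, role description) tuples
    clarifications: list of free-text clarification sentences
    """
    lines = [
        "KNOWN STAFF MEMBERS:",
        "The following staff members work at this location:",
    ]
    lines += [f"- {name} ({role})" for name, role in staff]
    lines += ["", "STAFF NAME CLARIFICATIONS:"]
    lines += [f"- {note}" for note in clarifications]
    lines += [
        "",
        "IDENTIFICATION RULE:",
        "If you cannot identify the staff member with >80% confidence,",
        'use "unidentified staff member" rather than guessing.',
    ]
    return "\n".join(lines) + "\n"


text = render_staff_list(
    staff=[("Drew", "Sales Manager, works M-F mornings"),
           ("Jess", "Front Desk Lead, handles billing issues")],
    clarifications=['"Jessica" is the same person as "Jess"'],
)
```

The output stays ordinary text, so it can still be edited directly in the S3 console afterwards.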

2.2 Future Enhancement: Lambda-Based Updates

For Franchise-Wide Updates (When Needed):

SHARED_FILES = ['01-franchise-core.txt', '02-knowledge.txt', '03-coaching.txt']

def update_shared_files_across_sites(s3_client, bucket, franchise, updated_files, validate_site):
    """
    Update files 1-3 across all sites of a franchise.
    Writes to one test site first and rolls out only if validation passes,
    which keeps site-level isolation for a safe rollout.
    validate_site is a caller-supplied callable: site_id -> bool.
    """
    sites = list_all_sites(franchise)  # helper assumed to exist elsewhere

    # Only shared files are pushed; site-specific files 4-6 are never overwritten
    shared = {name: body for name, body in updated_files.items() if name in SHARED_FILES}

    def push(site):
        for filename, content in shared.items():
            s3_client.put_object(
                Bucket=bucket,
                Key=f"{franchise}/{site}/prompts/v6/{filename}",
                Body=content
            )

    # Test on one site first
    test_site = sites[0]
    push(test_site)

    # After validation, roll out to other sites
    if validate_site(test_site):
        for site in sites[1:]:
            push(site)

This maintains site-level isolation while enabling controlled rollouts!

3. Bedrock Tool-Based Validation

3.1 Add Existing Files to Lambda

Copy these files to Lambda package:

  • /agent-design/new prompt/models.py - Pydantic schema (no changes)
  • /agent-design/new prompt/bedrock_call.py - Tool calling logic (minimal changes)

3.2 Modify bedrock_call.py

Changes needed:

# Line 9 - Update model ID for production
MODEL_ID = os.environ.get('BEDROCK_MODEL_ID', 'us.deepseek.r1-v1:0')

# Line 20 - Load prompt from parameter instead of file
def ask_claude_for_analysis(transcript: str, prompt_text: str, context: str = None) -> dict:
    # Change line 20 from:
    # SYSTEM_PROMPT = open("prompt.txt", "r").read()
    # To:
    SYSTEM_PROMPT = prompt_text

    # Rest of function remains the same
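
For readers unfamiliar with tool-based validation, the following sketch shows the two data shapes involved, assuming bedrock_call.py talks to Bedrock through the Converse API (the document does not confirm which API it uses). `build_tool_config` and `extract_tool_input` are hypothetical helper names; the tool name `record_call_analysis` is the one referenced in 05-studio-output-instructions.txt. No AWS call is made here.

```python
TOOL_NAME = "record_call_analysis"  # tool name referenced in file 05


def build_tool_config(input_schema: dict) -> dict:
    """Build a Converse-style toolConfig that offers and forces one tool."""
    return {
        "tools": [{
            "toolSpec": {
                "name": TOOL_NAME,
                "description": "Record the structured analysis of one call.",
                "inputSchema": {"json": input_schema},
            }
        }],
        # Forcing a specific tool is only honored by some models.
        "toolChoice": {"tool": {"name": TOOL_NAME}},
    }


def extract_tool_input(response: dict) -> dict:
    """Return the arguments of the first matching toolUse block in a response."""
    for block in response["output"]["message"]["content"]:
        tool_use = block.get("toolUse")
        if tool_use and tool_use["name"] == TOOL_NAME:
            return tool_use["input"]
    raise ValueError(f"Model response contained no {TOOL_NAME} tool call")
```

With Pydantic v2, the schema passed to `build_tool_config` could come from the model's `model_json_schema()` and the extracted input could be checked with `model_validate()`; whether models.py targets Pydantic v1 or v2 is not stated, so treat that as an assumption. Support for forced tool choice also varies by model, so it is worth confirming for the configured BEDROCK_MODEL_ID.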

3.3 Replace Bedrock Invocation in Lambda

Current Code (around line 1250):

# Invoke Bedrock
response = bedrock_client.invoke_model(
    modelId=model_id,
    contentType='application/json',
    accept='application/json',
    body=json.dumps({
        "prompt": full_prompt,
        "max_tokens": 4000,
        "temperature": 0.7
    })
)

New Code (Clean, Deterministic):

# Use tool-based invocation
from pydantic import ValidationError
from bedrock_call import ask_claude_for_analysis

# IMPORTANT: prompt_content is the CONCATENATED prompt from all 6 files:
# 01-franchise-core + 02-knowledge + 03-coaching + 04-staff-list
# + 05-studio-output-instructions + 06-custom-requirements (if present).
# The taxonomy (lines 620-739 from prompt-v6.txt) is in 05-studio-output-instructions.txt

try:
    # First attempt with tool validation
    # The concatenated prompt becomes a single system message
    analysis_result = ask_claude_for_analysis(
        transcript=transcript_text,
        prompt_text=prompt_content,  # All prompt files concatenated with \n\n
        context=None  # No memory in core pipeline
    )

    # Result is already validated by Pydantic!
    logger.info("[TOOL_SUCCESS] Analysis completed with validation")

except ValidationError as e:
    logger.warning(f"[VALIDATION_ERROR] First attempt failed: {e}")

    # One retry with error feedback
    try:
        error_prompt = f"{prompt_content}\n\nPREVIOUS ERRORS TO FIX:\n{e.errors()}"
        analysis_result = ask_claude_for_analysis(
            transcript=transcript_text,
            prompt_text=error_prompt,
            context=None
        )
    except Exception:
        # Re-raise so the message goes to the DLQ for manual review
        logger.error("[VALIDATION_FAILED] Moving to DLQ after retry")
        raise
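
The single-retry rule above can be factored into a small helper that is testable without Bedrock. This is a sketch, not the pipeline's code: `analyze_with_retry` is a hypothetical name, `analyze` is any callable taking `(transcript, prompt)`, and `retry_on` defaults to `ValueError` only so the example runs with the standard library (in the pipeline it would be Pydantic's `ValidationError`).

```python
def analyze_with_retry(analyze, transcript, prompt, retry_on=ValueError, max_retries=1):
    """Call analyze(); on a validation failure, retry with the error appended.

    Re-raises the last error once retries are exhausted so the caller can
    route the message to the DLQ.
    """
    current_prompt = prompt
    for attempt in range(max_retries + 1):
        try:
            return analyze(transcript, current_prompt)
        except retry_on as err:
            if attempt == max_retries:
                raise
            # Feed the validation errors back; always build from the original prompt.
            current_prompt = f"{prompt}\n\nPREVIOUS ERRORS TO FIX:\n{err}"


# Example with a stub analyzer that fails once, then succeeds.
seen_prompts = []

def flaky_analyzer(transcript, prompt):
    seen_prompts.append(prompt)
    if len(seen_prompts) == 1:
        raise ValueError("outcome: field required")
    return {"outcome": "booked"}

result = analyze_with_retry(flaky_analyzer, "caller: hi ...", "BASE PROMPT")
```

Keeping the retry count as a parameter with a default of 1 matches the "just one retry" scope while leaving the limit explicit.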

Testing Requirements

1. Pre-Production Testing

Test Data Preparation

# Select 10 test transcripts covering:
- 3 intro bookings
- 2 cancellations
- 2 service calls
- 3 general inquiries

# Store in: s3://{bucket}/test-transcripts/

Validation Tests

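# Test script to run locally (skeletons: each docstring lists the steps to implement)
def test_prompt_splitting():
    """
    1. Load v5 prompt
    2. Load and concatenate v6 prompts
    3. Verify the staff list section (04-staff-list.txt) is present
    4. Compare structure (should be identical except for the staff list)
    """
    raise NotImplementedError

def test_tool_validation():
    """
    1. Load test transcript
    2. Call ask_claude_for_analysis
    3. Verify Pydantic validation passes
    4. Check all required fields populated
    """
    raise NotImplementedError

def test_staff_registry():
    """
    1. Load 04-staff-list.txt
    2. Confirm it appears in the concatenated prompt
    3. Verify staff identification improved
    """
    raise NotImplementedError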

2. Production Rollout Testing

Feature Flag Implementation

# Add to Lambda environment variables
PROMPT_VERSION = "v6"  # or "v5" for rollback

# In Lambda code:
version = os.environ.get('PROMPT_VERSION', 'v5')
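if version == "v6":
    prompt_content = load_prompt_v6(s3_client, bucket, franchise, site_id)  # new logic
else:
    prompt_content = load_prompt_v5(...)  # existing logic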

Monitoring During Rollout

  • Track validation success rate in CloudWatch
  • Compare output quality between v5 and v6
  • Monitor processing time (should be within 100ms of v5)
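
One way to get these numbers into CloudWatch without extra API calls is the Embedded Metric Format (EMF): a JSON log line that CloudWatch turns into metrics when it is written from Lambda. The sketch below only builds the log line; the namespace, metric names, and dimension are hypothetical choices, not existing conventions in this project.

```python
import json
import time


def validation_metric_line(prompt_version: str, succeeded: bool, duration_ms: float) -> str:
    """Build one CloudWatch Embedded Metric Format log line."""
    return json.dumps({
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # epoch milliseconds
            "CloudWatchMetrics": [{
                "Namespace": "CloudAudioAI/Analysis",  # hypothetical namespace
                "Dimensions": [["PromptVersion"]],
                "Metrics": [
                    {"Name": "ValidationSuccess", "Unit": "Count"},
                    {"Name": "ProcessingTime", "Unit": "Milliseconds"},
                ],
            }],
        },
        "PromptVersion": prompt_version,
        "ValidationSuccess": 1 if succeeded else 0,
        "ProcessingTime": duration_ms,
    })


# In the Lambda handler, printing the line to stdout is enough to emit it.
print(validation_metric_line("v6", True, 25120.0))
```

Because `ValidationSuccess` is recorded as 0 or 1, its average over a period is the validation success rate, and the `PromptVersion` dimension allows a direct v5 versus v6 comparison.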

Files to Modify Summary

S3 Files (New)

  1. Create 6 prompt files under {franchise}/{siteId}/prompts/v6/
  2. Maintain the staff list in 04-staff-list.txt per site (no separate staff-registry.json)

Lambda Files (Modify)

  1. TranscriptionResultProcessor_MultiTenant.py:
       • Add prompt loading function (20 lines)
       • Add staff registry loading (10 lines)
       • Replace Bedrock invocation (30 lines)
       • Total: ~60 lines of code changes
  2. Add to Lambda package:
       • models.py (existing, no changes)
       • bedrock_call.py (2 line changes)

Environment Variables (Add)

  • PROMPT_VERSION: "v5" or "v6"
  • Already exists: BEDROCK_MODEL_ID

Current v1.0 Approach Clarification

What We're Doing NOW:

  • Single Tool: One CallAnalysis Pydantic model for all calls
  • Single Pass: One AI invocation with concatenated prompts
  • Simple Concatenation: Just join the 6 prompt files with \n\n
  • AI Determines Everything: Call type, category, subcategory from transcript
  • Industry: Only fitness/gym for now

What We're NOT Doing (Explicitly Deferred)

Not Creating in v1.0:

  • Module manifest files
  • Composition engines
  • Multiple tool definitions (deferred to v2.0)
  • New Lambda functions
  • New DynamoDB tables (yet)
  • Multi-turn conversations
  • Dynamic prompt selection based on call metadata

Not Implementing:

  • Memory system (prepared but not active)
  • Complex retry logic (just one retry)
  • A/B testing framework
  • Aggregation services
  • Cross-franchise module sharing
  • Tool catalogs
  • Control flow abstractions

Migration Checklist

Week 1: Preparation

  • Split prompt-v5.txt into 6 files locally
  • Write the staff list (04-staff-list.txt) for both franchises
  • Test concatenation produces same structure
  • Add models.py and bedrock_call.py to Lambda package

Week 2: Deployment

  • Upload v6 prompt files to S3 for one franchise
  • Upload staff registry to S3
  • Deploy Lambda with PROMPT_VERSION=v5 (safe default)
  • Test with PROMPT_VERSION=v6 on test transcripts
  • Monitor CloudWatch for errors

Week 3: Rollout

  • Enable v6 for orange-theory-5736520
  • Monitor for 24 hours
  • Enable v6 for orange-theory-Totowa
  • Remove v5 code after 1 week stable

Success Criteria

Must Pass

  • Zero increase in DLQ messages
  • Validation success rate >95%
  • Processing time <26s (current: 25s)
  • Staff identification >85% (current: 75%)
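
The "must pass" thresholds can be turned into an automated gate for the rollout. This is a minimal sketch: `RolloutStats` and `failed_criteria` are hypothetical names, and how the measurements are collected is left open.

```python
from dataclasses import dataclass


@dataclass
class RolloutStats:
    dlq_messages_before: int          # DLQ count in the baseline (v5) window
    dlq_messages_after: int           # DLQ count in the v6 window
    validation_success_rate: float    # fraction between 0 and 1
    processing_time_s: float          # processing time per call, seconds
    staff_identification_rate: float  # fraction between 0 and 1


def failed_criteria(stats: RolloutStats) -> list:
    """Return the names of the must-pass criteria that are not met."""
    failures = []
    if stats.dlq_messages_after > stats.dlq_messages_before:
        failures.append("Zero increase in DLQ messages")
    if not stats.validation_success_rate > 0.95:
        failures.append("Validation success rate >95%")
    if not stats.processing_time_s < 26:
        failures.append("Processing time <26s")
    if not stats.staff_identification_rate > 0.85:
        failures.append("Staff identification >85%")
    return failures


stats = RolloutStats(3, 3, 0.97, 25.4, 0.88)
print(failed_criteria(stats) or "all must-pass criteria met")
```

An empty list means the rollout can proceed; any entry is a reason to stay on, or roll back to, PROMPT_VERSION=v5.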

Nice to Have

  • Reduced prompt update time (measure how long to edit)
  • Cleaner CloudWatch logs with validation status
  • Easier debugging with specific validation errors

Rollback Plan

Instant Rollback

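# Change environment variable
# Note: --environment replaces the function's entire set of variables, so list
# every existing variable (such as BEDROCK_MODEL_ID) alongside PROMPT_VERSION.
aws lambda update-function-configuration \
  --function-name TranscriptionResultProcessor \
  --environment "Variables={PROMPT_VERSION=v5,BEDROCK_MODEL_ID=<current value>}"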

Full Rollback (if needed)

  1. Set PROMPT_VERSION=v5
  2. Redeploy previous Lambda code
  3. Keep v6 files in S3 (no harm)
  4. Investigate issues before retry

Cost Impact

Minimal Additional Costs

  • S3 Storage: 5 additional files (6 instead of 1) = ~$0.0001/month
  • S3 Requests: 5 extra GETs per invocation (6 instead of 1) = ~$0.01/month
  • Lambda Runtime: +100ms concatenation = ~$0.05/month
  • Total Additional Cost: <$0.10/month
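
The request-cost estimate can be reproduced with simple arithmetic. The sketch assumes an S3 Standard GET price of $0.0004 per 1,000 requests and a volume of 5,000 calls per month; both figures are assumptions for illustration and should be replaced with current pricing and real traffic. Six prompt files instead of one means five extra reads per invocation.

```python
GET_PRICE_PER_1000 = 0.0004  # USD per 1,000 GET requests (assumed; check current pricing)


def extra_get_cost(calls_per_month, extra_gets_per_call, price_per_1000=GET_PRICE_PER_1000):
    """Monthly cost in USD of the additional S3 GET requests."""
    return calls_per_month * extra_gets_per_call / 1000 * price_per_1000


# Assumed volume: 5,000 calls per month, 5 extra GETs each (6 files instead of 1).
monthly_cost = extra_get_cost(calls_per_month=5000, extra_gets_per_call=5)
print(f"${monthly_cost:.4f} per month")
```

Under these assumptions the result is about one cent per month, which is the same order of magnitude as the estimate listed above.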

Cost Savings

  • Reduced development time for prompt updates
  • Fewer deployment cycles for staff changes
  • Less debugging time with clear validation errors

Architectural Principle: Core Pipeline Remains Deterministic

Why Keep Core Pipeline Simple

The transcription → AI analysis pipeline is the core product we deliver to clients. It must:

  • Remain stable and predictable
  • Never break due to new feature additions
  • Process every call the same way
  • Maintain consistent performance

Future Agent Features: Separate Orchestrator Lambda

When ready to add the 7 foundations (Tools, Control, Memory, Feedback), create a new orchestrator Lambda that:

  • Sits alongside the core pipeline (not within it)
  • Can call the core pipeline as one of its tools
  • Handles user queries through chat/API interface
  • Manages complex multi-agent workflows

Example Architecture:

Current (v1.0):
Webhook → SQS → TranscribeProcessor → S3 → TranscriptionResultProcessor → DynamoDB

Future (v2.0):
Webhook → SQS → TranscribeProcessor → S3 → TranscriptionResultProcessor → DynamoDB
                                                    (Analysis Results)
User Query → API Gateway → AgentOrchestrator → [Memory, Tools, Control, Feedback]
                          (Can query DynamoDB, call tools, manage context)

What Belongs Where

Core Pipeline (TranscriptionResultProcessor):

  • Prompt splitting and loading ✓
  • Staff registry lookup ✓
  • Bedrock tool validation ✓
  • Simple retry logic ✓
  • DynamoDB storage ✓

Future Orchestrator Lambda (Separate):

  • Memory system (customer context)
  • Tool catalog and execution
  • Control flow decisions
  • Human feedback capture
  • Cross-call analytics
  • Chatbot interface

This separation ensures the core pipeline remains a reliable, deterministic workflow while allowing unlimited innovation in the orchestrator layer.

Future Enhancements (Not Blocking v1.0)

v1.1 - Staff Registry to DynamoDB (When UI Ready)

When clients need self-service staff updates, migrate from S3 JSON to DynamoDB:

# DynamoDB table: staff-registry
{
  "tenantId": "orange-theory#5736520",  # PK
  "versionNumber": 1,                   # SK
  "staffList": ["Drew", "Nadia"],
  "aliases": {"Jessica": "Jess"},       # alias -> canonical name
  "updatedBy": "user@client.com",
  "updatedAt": "2025-03-01T10:00:00Z",
  "isActive": true
}

# Benefits of migration:
# - Clients can update via UI (no S3 access needed)
# - Audit trail (who changed what, when)
# - Atomic updates (no overwrite risk)
# - Version history (rollback capability)
# - Real-time updates (no cache delays)

Migration is seamless: The Lambda code already has fallback logic:

try:
    # Try DynamoDB first (v1.1+)
    staff_data = load_from_dynamodb(franchise, site_id)
except Exception:
    # Fallback to S3 JSON (v1.0)
    staff_data = load_from_s3(franchise, site_id)

v1.2 - Basic Memory (When Ready)

# Simple memory - 2 operations only
def get_customer_context(phone_number):
    # Query DynamoDB for last 3 calls
    return {"customer_type": "existing", "last_outcome": "purchase"}

def save_interaction(phone_number, session_id, outcome):
    # Write to DynamoDB
    pass

v1.2 - Multi-Industry Support (Leveraging Existing DynamoDB)

Since we already have client-configurations DynamoDB table, adding industries is straightforward:

# 1. Add industry field to client-configurations
{
  "client_id": "wellsfargo-nyc-001",
  "industry": "banking",  # NEW FIELD
  "config_versions": {
    "prompt": "banking-v1",
    "schema": "banking-v1"
  }
}

# 2. Store industry templates in S3
s3://bucket/templates/
├── fitness/     # Current Orange Theory type
├── healthcare/  # Future: Medical clinics
└── banking/     # Future: Financial institutions

# 3. Lambda loads config based on industry
client_config = get_from_dynamodb(client_id)
industry = client_config.get('industry', 'fitness')
schema = load_template(f"templates/{industry}/schemas/v1")
AnalysisModel = build_model_for_industry(industry, schema)

This reuses ALL existing infrastructure - just adds industry awareness!

v1.3 - More Granular Modules

  • Split into 10-15 files for fine-grained control
  • Add conditional loading based on industry (not call type - AI determines that)
  • Still use simple concatenation

v2.0 - Multiple Tools Architecture (PREFERRED FUTURE)

This is the preferred evolution path:

# Different tools for different schemas
if industry == "fitness":
    tools = [
        {"name": "analyze_fitness_sales", "schema": FitnessSalesSchema},
        {"name": "analyze_fitness_retention", "schema": FitnessRetentionSchema}
    ]
elif industry == "banking":
    tools = [
        {"name": "analyze_banking_service", "schema": BankingServiceSchema},
        {"name": "analyze_banking_fraud", "schema": BankingFraudSchema}
    ]

# AI selects ONE tool based on transcript
# Returns only that tool's specific fields
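
On the receiving side, a multiple-tools design needs a dispatch step keyed on the tool the AI selected. The sketch below shows that routing with standard-library code only; the required field names and the `require_fields` / `handle_tool_call` helpers are hypothetical, and in practice each tool would map to its own Pydantic schema.

```python
def require_fields(*names):
    """Build a tiny validator that checks the payload has the given keys."""
    def validate(payload: dict) -> dict:
        missing = [name for name in names if name not in payload]
        if missing:
            raise ValueError(f"missing fields: {missing}")
        return payload
    return validate


# Hypothetical per-tool validators (stand-ins for per-tool schemas).
VALIDATORS = {
    "analyze_fitness_sales": require_fields("lead_source", "outcome"),
    "analyze_fitness_retention": require_fields("cancel_reason", "outcome"),
}


def handle_tool_call(tool_use: dict):
    """Route one tool call to the validator for the tool the AI selected."""
    name = tool_use["name"]
    if name not in VALIDATORS:
        raise ValueError(f"unknown tool selected: {name}")
    return name, VALIDATORS[name](tool_use["input"])


selected, fields = handle_tool_call({
    "name": "analyze_fitness_retention",
    "input": {"cancel_reason": "relocation", "outcome": "cancelled"},
})
```

The pipeline stays single-pass: one invocation, one selected tool, one validation step chosen by name.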

Benefits:

  • Each industry has unique fields (no pollution)
  • AI intelligently routes to correct tool
  • Clean separation of concerns
  • Still single-pass (not multi-turn)

v3.0 - Advanced Features (Only if Complexity Demands)

  • Multi-turn conversations (only for very complex analysis)
  • Step Functions orchestration
  • Real-time feedback capture
  • Cross-franchise module marketplace