CloudAudioAI Simplified Technical Requirements - v1.1¶
Overview¶
This document specifies the minimal technical changes needed for v1.1: split prompts (all under site path), staff list in prompt file, and Bedrock tool validation without schema.json. We explicitly defer complex features and maximize reuse of existing infrastructure. The key simplification is keeping all 6 files under {franchise}/{siteId}/prompts/v6/ initially, with migration to franchise-level sharing only when scale demands it.
Three Core Changes Only¶
1. Split Prompt Files (6 Files from 1)¶
1.1 File Creation from prompt-v5.txt¶
Create these files in S3 (all under site path for simplicity):
s3://{bucket}/{franchise}/{siteId}/prompts/v6/
├── 01-franchise-core.txt
├── 02-knowledge.txt
├── 03-coaching.txt
├── 04-staff-list.txt
├── 05-studio-output-instructions.txt
└── 06-custom-requirements.txt
Note: No schema.json needed - Pydantic models define the schema
Content Mapping from prompt-v5.txt (or prompt-v6.txt):
01-franchise-core.txt:
- Lines 1-165: Role, task, processing steps, categorization
02-knowledge.txt:
- Lines 168-596: Sales, cancellation, freeze modules
- Keep sections clearly separated with headers
03-coaching.txt:
- Lines 270-347, 389-402, 527-540, 583-595: All scenarios
- Group by topic (sales, billing, cancellation, freeze)
04-staff-list.txt:
- Simple text list of staff names with roles
- Include aliases and clarifications
- No JSON format - plain text instructions
05-studio-output-instructions.txt:
- Lines 599-619: JSON structure introduction
- Lines 620-739: TAXONOMY section (category/subcategory mappings)
- Lines 740-781: ENUMS, terminology, outcome classification
- Add at end: "INSTRUCTIONS: You MUST respond by calling the tool `record_call_analysis`"
06-custom-requirements.txt:
- Site-specific context, promotions, local market info
- Business priorities for the quarter
- Competitive positioning notes
Important: The taxonomy section (lines 620-739) is CRITICAL as it defines the valid values for the Pydantic enum fields. This ensures the AI knows exactly which subcategories are allowed for each category.
1.2 Lambda Code Change¶
File to Modify: TranscriptionResultProcessor_MultiTenant.py
Current Code (around line 1135):
prompt_content = load_versioned_config(
s3_client, bucket, franchise, site_id,
'prompts', f'prompt-{version}.txt'
)
New Code (Simplified - All Files in Site Path):
def load_prompt_v6(s3_client, bucket, franchise, site_id):
"""Load and concatenate v6 split prompt files from site directory"""
# Simple single-path loading - all files under site
base_path = f"{franchise}/{site_id}/prompts/v6"
files = ['01-franchise-core.txt', '02-knowledge.txt', '03-coaching.txt',
'04-staff-list.txt', '05-studio-output-instructions.txt',
'06-custom-requirements.txt']
prompt_parts = []
for filename in files:
try:
obj = s3_client.get_object(
Bucket=bucket,
Key=f"{base_path}/{filename}"
)
content = obj['Body'].read().decode('utf-8')
prompt_parts.append(content)
logger.info(f"[PROMPT] Loaded {filename}")
except:
# CRITICAL: File 05 contains taxonomy - MUST be present for tool validation
if '06-custom' in filename:
logger.info(f"[PROMPT] Optional file not found: {filename}")
continue # Only file 6 is truly optional
else:
# Files 1-5 are REQUIRED - fallback defeats split architecture
logger.error(f"[PROMPT] Required file missing: {filename}")
raise ValueError(f"Required prompt file missing: {filename}. Files 1-5 are mandatory for v6.")
return '\n\n'.join(prompt_parts)
# In main processing:
version = os.environ.get('PROMPT_VERSION', 'v5')
if version == "v6":
prompt_content = load_prompt_v6(s3_client, bucket, franchise, site_id)
else:
prompt_content = load_prompt_v5(...) # existing code
2. Staff Registry in Prompt File¶
2.1 Include Staff in 04-staff-list.txt (Simplest Approach)¶
Location Pattern: s3://{bucket}/{franchise}/{siteId}/prompts/v6/04-staff-list.txt
File Content Example (plain text, not JSON):
KNOWN STAFF MEMBERS:
The following staff members work at this location:
- Drew (Sales Manager, works M-F mornings)
- Nadia (Head Coach, specializes in strength training)
- Jess (Front Desk Lead, handles billing issues)
- Holly (Assistant Manager, membership specialist)
- Ryanne (Sales Associate, Spanish speaker)
- Ryan (Front Desk, part-time evenings)
- Will (Coach, runs transformation challenges)
- Coach Marcus (Lead Coach, nutrition certified)
STAFF NAME CLARIFICATIONS:
- "Ryan" and "Ryanne" are different people
- "Jessica" is the same person as "Jess"
- "Marcus" should be identified as "Coach Marcus"
IDENTIFICATION RULE:
If you cannot identify the staff member with >80% confidence,
use "unidentified staff member" rather than guessing.
Benefits of Text Format: 1. No JSON parsing needed 2. Easy to edit directly in S3 console 3. Natural language instructions for AI 4. Included in prompt concatenation automatically
2.2 Future Enhancement: Lambda-Based Updates¶
For Franchise-Wide Updates (When Needed):
def update_shared_files_across_sites(franchise, updated_files):
"""
Update files 1-3 across all sites for testing
Keeps site-level isolation for safe rollout
"""
sites = list_all_sites(franchise)
# Test on one site first
test_site = sites[0]
for filename in ['01-franchise-core.txt', '02-knowledge.txt', '03-coaching.txt']:
if filename in updated_files:
s3.put_object(
Bucket=bucket,
Key=f"{franchise}/{test_site}/prompts/v6/{filename}",
Body=updated_files[filename]
)
# After validation, roll out to other sites
if test_successful:
for site in sites[1:]:
for filename, content in updated_files.items():
s3.put_object(
Bucket=bucket,
Key=f"{franchise}/{site}/prompts/v6/{filename}",
Body=content
)
This maintains site-level isolation while enabling controlled rollouts!
3. Bedrock Tool-Based Validation¶
3.1 Add Existing Files to Lambda¶
Copy these files to Lambda package:
- /agent-design/new prompt/models.py - Pydantic schema (no changes)
- /agent-design/new prompt/bedrock_call.py - Tool calling logic (minimal changes)
3.2 Modify bedrock_call.py¶
Changes needed:
# Line 9 - Update model ID for production
MODEL_ID = os.environ.get('BEDROCK_MODEL_ID', 'us.deepseek.r1-v1:0')
# Line 20 - Load prompt from parameter instead of file
def ask_claude_for_analysis(transcript: str, prompt_text: str, context: str = None) -> dict:
# Change line 20 from:
# SYSTEM_PROMPT = open("prompt.txt", "r").read()
# To:
SYSTEM_PROMPT = prompt_text
# Rest of function remains the same
3.3 Replace Bedrock Invocation in Lambda¶
Current Code (around line 1250):
# Invoke Bedrock
response = bedrock_client.invoke_model(
modelId=model_id,
contentType='application/json',
accept='application/json',
body=json.dumps({
"prompt": full_prompt,
"max_tokens": 4000,
"temperature": 0.7
})
)
New Code (Clean, Deterministic):
# Use tool-based invocation
from bedrock_call import ask_claude_for_analysis
# IMPORTANT: prompt_content is the CONCATENATED prompt from all 4 files
# This includes: 01-core + 02-knowledge + 03-coaching + 04-output-instructions
# Including the taxonomy (lines 620-739 from prompt-v6.txt) in 04-output-instructions
try:
# First attempt with tool validation
# The concatenated prompt becomes a single system message
analysis_result = ask_claude_for_analysis(
transcript=transcript_text,
prompt_text=prompt_content, # All 4 files concatenated with \n\n
context=None # No memory in core pipeline
)
# Result is already validated by Pydantic!
logger.info(f"[TOOL_SUCCESS] Analysis completed with validation")
except ValidationError as e:
logger.warning(f"[VALIDATION_ERROR] First attempt failed: {str(e)}")
# One retry with error feedback
try:
error_prompt = f"{prompt_content}\n\nPREVIOUS ERRORS TO FIX:\n{str(e.errors())}"
analysis_result = ask_claude_for_analysis(
transcript=transcript_text,
prompt_text=error_prompt,
context=None
)
except:
# Send to DLQ for manual review
logger.error(f"[VALIDATION_FAILED] Moving to DLQ after retry")
raise
Testing Requirements¶
1. Pre-Production Testing¶
Test Data Preparation¶
# Select 10 test transcripts covering:
- 3 intro bookings
- 2 cancellations
- 2 service calls
- 3 general inquiries
# Store in: s3://{bucket}/test-transcripts/
Validation Tests¶
# Test script to run locally
def test_prompt_splitting():
# Load v5 prompt
# Load and concatenate v6 prompts
# Verify staff placeholder exists
# Compare structure (should be identical except placeholder)
def test_tool_validation():
# Load test transcript
# Call ask_claude_for_analysis
# Verify Pydantic validation passes
# Check all required fields populated
def test_staff_registry():
# Load staff JSON
# Replace in prompt
# Verify staff identification improved
2. Production Rollout Testing¶
Feature Flag Implementation¶
# Add to Lambda environment variables
PROMPT_VERSION = "v6" # or "v5" for rollback
# In Lambda code:
version = os.environ.get('PROMPT_VERSION', 'v5')
if version == "v6":
# Use new logic
else:
# Use existing logic
Monitoring During Rollout¶
- Track validation success rate in CloudWatch
- Compare output quality between v5 and v6
- Monitor processing time (should be within 100ms of v5)
Files to Modify Summary¶
S3 Files (New)¶
- Create 4 prompt files under
prompts/v6/ - Create
staff/staff-registry.jsonper tenant
Lambda Files (Modify)¶
TranscriptionResultProcessor_MultiTenant.py:- Add prompt loading function (20 lines)
- Add staff registry loading (10 lines)
- Replace Bedrock invocation (30 lines)
-
Total: ~60 lines of code changes
-
Add to Lambda package:
models.py(existing, no changes)bedrock_call.py(2 line changes)
Environment Variables (Add)¶
PROMPT_VERSION: "v5" or "v6"- Already exists:
BEDROCK_MODEL_ID
Current v1.0 Approach Clarification¶
What We're Doing NOW:¶
- Single Tool: One CallAnalysis Pydantic model for all calls
- Single Pass: One AI invocation with concatenated prompts
- Simple Concatenation: Just join 4 prompt files with \n\n
- AI Determines Everything: Call type, category, subcategory from transcript
- Industry: Only fitness/gym for now
What We're NOT Doing (Explicitly Deferred)¶
Not Creating in v1.0:¶
- Module manifest files
- Composition engines
- Multiple tool definitions (deferred to v2.0)
- New Lambda functions
- New DynamoDB tables (yet)
- Multi-turn conversations
- Dynamic prompt selection based on call metadata
Not Implementing:¶
- Memory system (prepared but not active)
- Complex retry logic (just one retry)
- A/B testing framework
- Aggregation services
- Cross-franchise module sharing
- Tool catalogs
- Control flow abstractions
Migration Checklist¶
Week 1: Preparation¶
- Split prompt-v5.txt into 4 files locally
- Create staff-registry.json for both franchises
- Test concatenation produces same structure
- Add models.py and bedrock_call.py to Lambda package
Week 2: Deployment¶
- Upload v6 prompt files to S3 for one franchise
- Upload staff registry to S3
- Deploy Lambda with PROMPT_VERSION=v5 (safe default)
- Test with PROMPT_VERSION=v6 on test transcripts
- Monitor CloudWatch for errors
Week 3: Rollout¶
- Enable v6 for orange-theory-5736520
- Monitor for 24 hours
- Enable v6 for orange-theory-Totowa
- Remove v5 code after 1 week stable
Success Criteria¶
Must Pass¶
- Zero increase in DLQ messages
- Validation success rate >95%
- Processing time <26s (current: 25s)
- Staff identification >85% (current: 75%)
Nice to Have¶
- Reduced prompt update time (measure how long to edit)
- Cleaner CloudWatch logs with validation status
- Easier debugging with specific validation errors
Rollback Plan¶
Instant Rollback¶
# Change environment variable
aws lambda update-function-configuration \
--function-name TranscriptionResultProcessor \
--environment Variables={PROMPT_VERSION=v5}
Full Rollback (if needed)¶
- Set PROMPT_VERSION=v5
- Redeploy previous Lambda code
- Keep v6 files in S3 (no harm)
- Investigate issues before retry
Cost Impact¶
Minimal Additional Costs¶
- S3 Storage: 4 additional files = ~$0.0001/month
- S3 Requests: 3 extra GETs per invocation = ~$0.01/month
- Lambda Runtime: +100ms concatenation = ~$0.05/month
- Total Additional Cost: <$0.10/month
Cost Savings¶
- Reduced development time for prompt updates
- Fewer deployment cycles for staff changes
- Less debugging time with clear validation errors
Architectural Principle: Core Pipeline Remains Deterministic¶
Why Keep Core Pipeline Simple¶
The transcription → AI analysis pipeline is the core product we deliver to clients. It must: - Remain stable and predictable - Never break due to new feature additions - Process every call the same way - Maintain consistent performance
Future Agent Features: Separate Orchestrator Lambda¶
When ready to add the 7 foundations (Tools, Control, Memory, Feedback), create a new orchestrator Lambda that: - Sits alongside the core pipeline (not within it) - Can call the core pipeline as one of its tools - Handles user queries through chat/API interface - Manages complex multi-agent workflows
Example Architecture:
Current (v1.0):
Webhook → SQS → TranscribeProcessor → S3 → TranscriptionResultProcessor → DynamoDB
Future (v2.0):
Webhook → SQS → TranscribeProcessor → S3 → TranscriptionResultProcessor → DynamoDB
↓
(Analysis Results)
↓
User Query → API Gateway → AgentOrchestrator → [Memory, Tools, Control, Feedback]
↓
(Can query DynamoDB, call tools, manage context)
What Belongs Where¶
Core Pipeline (TranscriptionResultProcessor): - Prompt splitting and loading ✓ - Staff registry lookup ✓ - Bedrock tool validation ✓ - Simple retry logic ✓ - DynamoDB storage ✓
Future Orchestrator Lambda (Separate): - Memory system (customer context) - Tool catalog and execution - Control flow decisions - Human feedback capture - Cross-call analytics - Chatbot interface
This separation ensures the core pipeline remains a reliable, deterministic workflow while allowing unlimited innovation in the orchestrator layer.
Future Enhancements (Not Blocking v1.0)¶
v1.1 - Staff Registry to DynamoDB (When UI Ready)¶
When clients need self-service staff updates, migrate from S3 JSON to DynamoDB:
# DynamoDB table: staff-registry
{
"tenantId": "orange-theory#5736520", # PK
"versionNumber": 1, # SK
"staffList": ["Drew", "Nadia"],
"aliases": {"Ryan": "Ryanne"},
"updatedBy": "user@client.com",
"updatedAt": "2025-03-01T10:00:00Z",
"isActive": true
}
# Benefits of migration:
# - Clients can update via UI (no S3 access needed)
# - Audit trail (who changed what, when)
# - Atomic updates (no overwrite risk)
# - Version history (rollback capability)
# - Real-time updates (no cache delays)
Migration is seamless: The Lambda code already has fallback logic:
try:
# Try DynamoDB first (v1.1+)
staff_data = load_from_dynamodb(franchise, site_id)
except:
# Fallback to S3 JSON (v1.0)
staff_data = load_from_s3(franchise, site_id)
v1.2 - Basic Memory (When Ready)¶
# Simple memory - 2 operations only
def get_customer_context(phone_number):
# Query DynamoDB for last 3 calls
return {"customer_type": "existing", "last_outcome": "purchase"}
def save_interaction(phone_number, session_id, outcome):
# Write to DynamoDB
pass
v1.2 - Multi-Industry Support (Leveraging Existing DynamoDB)¶
Since we already have client-configurations DynamoDB table, adding industries is straightforward:
# 1. Add industry field to client-configurations
{
"client_id": "wellsfargo-nyc-001",
"industry": "banking", # NEW FIELD
"config_versions": {
"prompt": "banking-v1",
"schema": "banking-v1"
}
}
# 2. Store industry templates in S3
s3://bucket/templates/
├── fitness/ # Current Orange Theory type
├── healthcare/ # Future: Medical clinics
└── banking/ # Future: Financial institutions
# 3. Lambda loads config based on industry
client_config = get_from_dynamodb(client_id)
industry = client_config.get('industry', 'fitness')
schema = load_template(f"templates/{industry}/schemas/v1")
AnalysisModel = build_model_for_industry(industry, schema)
This reuses ALL existing infrastructure - just adds industry awareness!
v1.3 - More Granular Modules¶
- Split into 10-15 files for fine-grained control
- Add conditional loading based on industry (not call type - AI determines that)
- Still use simple concatenation
v2.0 - Multiple Tools Architecture (PREFERRED FUTURE)¶
This is the preferred evolution path:
# Different tools for different schemas
if industry == "fitness":
tools = [
{"name": "analyze_fitness_sales", "schema": FitnessSalesSchema},
{"name": "analyze_fitness_retention", "schema": FitnessRetentionSchema}
]
elif industry == "banking":
tools = [
{"name": "analyze_banking_service", "schema": BankingServiceSchema},
{"name": "analyze_banking_fraud", "schema": BankingFraudSchema}
]
# AI selects ONE tool based on transcript
# Returns only that tool's specific fields
Benefits: - Each industry has unique fields (no pollution) - AI intelligently routes to correct tool - Clean separation of concerns - Still single-pass (not multi-turn)
v3.0 - Advanced Features (Only if Complexity Demands)¶
- Multi-turn conversations (only for very complex analysis)
- Step Functions orchestration
- Real-time feedback capture
- Cross-franchise module marketplace