CloudAudioAI Architecture Evolution Diagrams
%% This document visualizes the evolution from the current monolithic system to v1.0 with Pydantic validation, and then to the future AgentOrchestrator architecture %%
1. Current System Architecture (Production Today)
graph TB
subgraph "Telephony Providers"
RC[RingCentral]
TW[Twilio]
end
subgraph "AWS Infrastructure - Current State"
subgraph "Entry & Queue"
API[API Gateway<br/>**Webhook Handler**]
SQS[SQS Queue<br/>**5min delay**]
DLQ[Dead Letter Queue]
end
subgraph "Processing Pipeline"
TP[TranscribeProcessor<br/>Lambda]
TRP[TranscriptionResultProcessor<br/>Lambda]
end
subgraph "Storage"
S3[(S3 Bucket<br/>Recordings/Transcripts/Analysis)]
DB[(DynamoDB<br/>call-events & call-analysis)]
end
subgraph "Configuration Files"
CF[**MONOLITHIC**<br/>prompt-v5.txt<br/>800+ lines<br/>Static staff list]
end
subgraph "AI Services"
TR[AWS Transcribe]
BR[AWS Bedrock<br/>**Text Generation Mode**<br/>❌ No validation]
end
subgraph "Frontend"
BFF[BFF API Lambda]
VUE[Vue3 Dashboard]
end
end
RC -->|Webhook| API
TW -->|Webhook| API
API -->|Queue| SQS
SQS -->|Process| TP
SQS -.->|Failed| DLQ
TP -->|Download audio| S3
TP -->|Transcribe| TR
TR -->|Store transcript| S3
S3 -->|S3 Event| TRP
CF -->|Load prompt| TRP
TRP -->|Generate analysis| BR
BR -->|**Unstructured JSON**| TRP
TRP -->|Store results| S3
TRP -->|Update status| DB
DB -->|Query| BFF
BFF -->|API| VUE
style BR fill:#ff9999
style CF fill:#ffcc99
Current System Characteristics
- Monolithic Prompt: Single 800+ line prompt file
- Static Staff List: Hardcoded in prompt
- Unstructured Output: Bedrock returns JSON as text, prone to malformation
- No Validation: ~15% of responses have missing/invalid fields
- Manual Updates: Requires code deployment for staff changes
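To make the validation gap concrete, the sketch below shows two separate ways a text-mode response can go wrong: it may not parse as JSON at all, or it may parse but lack expected fields. The field names and the helper `check_response` are hypothetical (only `follow_up_needed` appears elsewhere in this document); the real analysis schema is not shown here.

```python
import json

# Hypothetical required fields; the real analysis schema is not shown in this document.
REQUIRED_FIELDS = ("summary", "sentiment", "follow_up_needed")


def check_response(raw_text: str) -> list[str]:
    """Return a list of problems found in a text-mode model response."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        # The model returned text that is not parseable JSON (e.g. truncated output).
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    # Parseable JSON can still be missing fields the dashboard depends on.
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]


if __name__ == "__main__":
    # Truncated response: the closing brace is missing.
    print(check_response('{"summary": "Asked about pricing", "sentiment": "positive"'))
    # Well-formed JSON, but two fields are absent.
    print(check_response('{"summary": "Asked about pricing"}'))
```

Without a check of this kind, both failure modes reach storage unnoticed, which is the situation the "No Validation" bullet describes.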
2. New v1.0 Architecture (With Pydantic Validation)
graph TB
subgraph "Telephony Providers"
RC[RingCentral]
TW[Twilio]
end
subgraph "AWS Infrastructure - v1.0 Changes"
subgraph "Entry & Queue - **Unchanged**"
API[API Gateway<br/>Webhook Handler]
SQS[SQS Queue<br/>5min delay]
DLQ[Dead Letter Queue]
end
subgraph "Processing Pipeline - **Enhanced**"
TP[TranscribeProcessor<br/>Lambda - **Unchanged**]
TRP[TranscriptionResultProcessor<br/>**+ Pydantic Models**<br/>**+ Tool Calling**]
end
subgraph "Storage - **Unchanged**"
S3[(S3 Bucket)]
DB[(DynamoDB)]
end
subgraph "Configuration - **Modularized (All Site-Level)**"
subgraph "{franchise}/{siteId}/prompts/v6/"
P1[01-franchise-core.txt<br/>Role & Task]
P2[02-knowledge.txt<br/>Sales/Cancel]
P3[03-coaching.txt<br/>Scenarios]
P4[04-staff-list.txt<br/>**Staff Names (Text)**]
P5[05-studio-output.txt<br/>**Taxonomy & Format**]
P6[06-custom-requirements.txt<br/>Local Context]
end
end
subgraph "AI Services - **Tool Mode**"
TR[AWS Transcribe]
BR[AWS Bedrock<br/>**Tool Calling Mode**<br/>✅ With Pydantic Schema]
end
subgraph "Validation Layer - **NEW**"
PY[**Pydantic CallAnalysis Model**<br/>Enforced Schema<br/>Type Validation]
end
subgraph "Frontend - **Unchanged**"
BFF[BFF API Lambda]
VUE[Vue3 Dashboard]
end
end
RC -->|Webhook| API
TW -->|Webhook| API
API -->|Queue| SQS
SQS -->|Process| TP
SQS -.->|Failed + Retry| DLQ
TP -->|Download audio| S3
TP -->|Transcribe| TR
TR -->|Store transcript| S3
S3 -->|S3 Event| TRP
P1 & P2 & P3 & P4 & P5 & P6 -->|**Concatenate All 6**| TRP
P4 -->|**Staff in Text File**| TRP
TRP -->|**Tool Request**| BR
BR -->|**Structured Output**| PY
PY -->|**Validated**| TRP
TRP -.->|**Retry on Error**| BR
TRP -->|Store results| S3
TRP -->|Update status| DB
DB -->|Query| BFF
BFF -->|API| VUE
style BR fill:#99ff99
style PY fill:#99ccff
style P4 fill:#ffffcc
v1.0 Improvements
- Split Prompts: 6 modular text files for easier maintenance
- Staff in Prompt File: Plain text list in 04-staff-list.txt, no JSON parsing
- Tool-Based Validation: Bedrock is invoked in tool-calling mode with a schema derived from the Pydantic CallAnalysis model
- >95% Valid Output: Pydantic validation with self-correction retry
- No schema.json: Pydantic models define all validation rules
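The "validation with self-correction retry" loop can be sketched as follows. This is a stdlib-only stand-in, not the project's code: the `CallAnalysis` dataclass below only imitates the role of the Pydantic `CallAnalysis` model, its fields are hypothetical, and `call_model` is a placeholder for whatever function performs the Bedrock tool request and returns the tool input as a dict.

```python
from dataclasses import dataclass
from typing import Callable

ALLOWED_FOLLOW_UP = {"yes", "no"}


class ValidationFailure(Exception):
    """Raised when a tool payload does not match the expected shape."""


@dataclass(frozen=True)
class CallAnalysis:
    # Hypothetical fields; the real model's fields are not listed in this document.
    summary: str
    staff_name: str
    follow_up_needed: str

    @classmethod
    def from_payload(cls, payload: dict) -> "CallAnalysis":
        errors = []
        for name in ("summary", "staff_name", "follow_up_needed"):
            if not isinstance(payload.get(name), str):
                errors.append(f"{name}: expected a string")
        if not errors and payload["follow_up_needed"] not in ALLOWED_FOLLOW_UP:
            errors.append("follow_up_needed: expected 'yes' or 'no'")
        if errors:
            raise ValidationFailure("; ".join(errors))
        return cls(payload["summary"], payload["staff_name"], payload["follow_up_needed"])


def analyze_with_retry(
    call_model: Callable[[str], dict], prompt: str, max_attempts: int = 3
) -> CallAnalysis:
    """Call the model, validate its output, and retry with error feedback."""
    feedback = ""
    for _ in range(max_attempts):
        payload = call_model(prompt + feedback)
        try:
            return CallAnalysis.from_payload(payload)
        except ValidationFailure as exc:
            # Feed the validation errors back so the next attempt can self-correct.
            feedback = f"\n\nYour previous output was rejected: {exc}. Return corrected output."
    raise ValidationFailure(f"no valid output after {max_attempts} attempts")


if __name__ == "__main__":
    # Fake model: the first reply is incomplete, the second is valid.
    replies = iter([
        {"summary": "Asked about pricing"},
        {"summary": "Asked about pricing", "staff_name": "Alex", "follow_up_needed": "yes"},
    ])
    print(analyze_with_retry(lambda _prompt: next(replies), "Analyze this transcript."))
```

Bounding the number of attempts keeps the pipeline deterministic in cost: a call either yields a validated record or fails explicitly, rather than looping.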
3. Future Architecture with AgentOrchestrator (v2.0+)
graph TB
subgraph "Core Pipeline - Deterministic & Unchanged"
subgraph "Telephony"
RC[RingCentral]
TW[Twilio]
end
API[API Gateway]
SQS[SQS Queue]
TP[TranscribeProcessor]
TRP[TranscriptionResultProcessor<br/>with Tool Validation]
S3[(S3 Bucket)]
DB[(DynamoDB<br/>call-events<br/>call-analysis)]
end
subgraph "New Agent Layer - Flexible & Innovative"
subgraph "Orchestrator Components"
AO[AgentOrchestrator Lambda<br/>Command Router]
subgraph "Seven Foundations"
MEM[Memory System<br/>Customer Context]
TOOLS[Tool Catalog<br/>Multiple Schemas]
CTRL[Control Flow<br/>Decision Engine]
FDBK[Feedback Loop<br/>Human Input]
INTEL[Intelligence<br/>Prompt Library]
VAL[Validation<br/>Multi-Schema]
REC[Recovery<br/>Error Handling]
end
end
subgraph "New Tools"
FUT[Follow-Up Checklist Tool]
DIG[Staff Digest Tool]
PERF[Performance Report Tool]
CHAT[Chat Interface Tool]
AGG[Analytics Aggregation Tool]
end
subgraph "New Storage"
CTX[(Context Memory DB<br/>Customer History)]
TASK[(Task Queue<br/>Follow-ups)]
SR2[(Staff Registry DB)]
CC2[(Client Config DB)]
end
UAPI[User API Gateway<br/>Chat/Query Interface]
USR[Users/Managers]
end
RC & TW --> API
API --> SQS --> TP --> TRP --> DB
USR -->|Query| UAPI
UAPI --> AO
AO <--> MEM
AO <--> TOOLS
AO <--> CTRL
AO <--> FDBK
AO -->|Execute| FUT & DIG & PERF & CHAT & AGG
FUT & DIG & PERF -->|Return JSON| AO
DB -.->|Read Only| AO
AO <-->|get/save context| CTX
AO -->|Queue tasks| TASK
AO -.->|Read staff/aliases| SR2
AO -.->|Read industry/config| CC2
AO -->|Store outputs| S3
style AO fill:#ff99ff
style MEM fill:#99ffcc
style TOOLS fill:#99ffcc
Future Architecture Benefits
- Separation of Concerns: Core pipeline remains stable while agent layer innovates
- No Risk to Core: Agent features can fail without affecting call processing
- Flexible Tools: Different tools for different industries/use cases
- Memory System: Track customer interactions across calls
- Human Feedback: Incorporate manager corrections into future analyses
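A minimal sketch of the "Command Router" idea, assuming tools are plain functions registered by name. In the diagram, isolation from the core pipeline comes mainly from the deployment boundary (a separate Lambda with read-only access to DynamoDB); the code below only illustrates the router-level part, where a failing tool yields an error result instead of an unhandled exception. The registry, decorator, and tool bodies are all hypothetical.

```python
from typing import Callable

ToolFn = Callable[[dict], dict]

# Hypothetical registry; tool names mirror the "New Tools" boxes in the diagram.
TOOL_REGISTRY: dict[str, ToolFn] = {}


def register_tool(name: str) -> Callable[[ToolFn], ToolFn]:
    """Decorator that adds a tool function to the registry under `name`."""
    def decorator(fn: ToolFn) -> ToolFn:
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator


@register_tool("follow_up_checklist")
def follow_up_checklist(request: dict) -> dict:
    calls = request.get("calls", [])
    return {"items": [f"Call back {c['customer']}" for c in calls]}


@register_tool("staff_digest")
def staff_digest(request: dict) -> dict:
    # Simulated failure, to show that the router contains it.
    raise RuntimeError("digest backend unavailable")


def route_command(command: str, request: dict) -> dict:
    """Dispatch a command to its tool and convert failures into error results."""
    tool = TOOL_REGISTRY.get(command)
    if tool is None:
        return {"status": "error", "error": f"unknown command: {command}"}
    try:
        return {"status": "ok", "result": tool(request)}
    except Exception as exc:
        # An agent-layer failure is reported to the caller, never re-raised.
        return {"status": "error", "error": str(exc)}


if __name__ == "__main__":
    print(route_command("follow_up_checklist", {"calls": [{"customer": "Dana"}]}))
    print(route_command("staff_digest", {}))
```

Adding a tool under this scheme means registering one more function; the router itself does not change.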
4. Detailed Tool Flow in Future Architecture
sequenceDiagram
participant U as User/Manager
participant API as API Gateway
participant AO as AgentOrchestrator
participant DB as DynamoDB
participant CONFIG as Client Config
participant SR as Staff Registry
participant MEM as Memory System
participant BR as Bedrock AI
participant S3 as S3 Storage
U->>API: "Generate follow-up checklist"
API->>AO: Route Command
Note over AO: Load tenant configuration
AO->>CONFIG: Get industry/config
CONFIG-->>AO: Return industry type
AO->>SR: Get staff list/aliases
SR-->>AO: Staff data
AO->>DB: Query call-analysis<br/>(last 24h, follow_up_needed=yes)
DB-->>AO: Return matching records
AO->>MEM: Get customer context
MEM-->>AO: Previous interactions
Note over AO: Select tool based on industry
AO->>BR: Execute tool with<br/>Industry-specific prompt<br/>+ Pydantic schema
BR-->>AO: Validated JSON output
Note over AO: Orchestrator persists output
AO->>S3: Store checklist JSON
AO->>MEM: Save interaction context
AO-->>API: Return checklist
API-->>U: Display results
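Two steps of the sequence above, the call-analysis query (last 24h, `follow_up_needed=yes`) and the get/save of customer context, can be sketched with in-memory stand-ins. Everything except `follow_up_needed` is an assumption for illustration: the record keys `call_id`, `customer_id`, and `analyzed_at`, the dict used as the memory store, and both function names. The Bedrock tool call is deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def select_follow_ups(records: list[dict], now: datetime, window_hours: int = 24) -> list[dict]:
    """Keep records from the last `window_hours` that are flagged for follow-up."""
    cutoff = now - timedelta(hours=window_hours)
    return [
        r for r in records
        if r["follow_up_needed"] == "yes"
        and datetime.fromisoformat(r["analyzed_at"]) >= cutoff
    ]


def build_checklist(records: list[dict], memory: dict[str, list[str]]) -> list[dict]:
    """Combine each flagged call with prior context, then save the new interaction."""
    checklist = []
    for r in records:
        history = memory.setdefault(r["customer_id"], [])  # get customer context
        checklist.append({
            "customer_id": r["customer_id"],
            "prior_interactions": len(history),
            "action": "follow up",
        })
        history.append(r["call_id"])  # save interaction context
    return checklist


if __name__ == "__main__":
    now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
    records = [
        {"call_id": "c1", "customer_id": "u1", "follow_up_needed": "yes",
         "analyzed_at": "2025-01-02T09:00:00+00:00"},
        {"call_id": "c2", "customer_id": "u2", "follow_up_needed": "no",
         "analyzed_at": "2025-01-02T10:00:00+00:00"},
    ]
    print(build_checklist(select_follow_ups(records, now), memory={}))
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the filter reproducible in tests.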
5. Multi-Industry Tool Architecture (v2.0+)
graph LR
subgraph "Industry Detection"
CC[Client Config<br/>DynamoDB]
IND{Industry<br/>Router}
end
subgraph "Fitness Tools"
FT1[Sales Analysis Tool<br/>intro_type, package_offered]
FT2[Retention Tool<br/>cancel_reason, save_attempt]
FT3[Service Tool<br/>issue_type, resolved]
end
subgraph "Banking Tools"
BT1[Account Service Tool<br/>account_type, transaction]
BT2[Fraud Detection Tool<br/>risk_level, flags]
BT3[Loan Application Tool<br/>loan_type, status]
end
subgraph "Healthcare Tools"
HT1[Appointment Tool<br/>specialty, time_slot]
HT2[Insurance Tool<br/>coverage, authorization]
HT3[Prescription Tool<br/>medication, refills]
end
CC -->|franchise: orange-theory| IND
CC -->|franchise: wellsfargo| IND
CC -->|franchise: mercy-health| IND
IND -->|fitness| FT1 & FT2 & FT3
IND -->|banking| BT1 & BT2 & BT3
IND -->|healthcare| HT1 & HT2 & HT3
style FT1 fill:#ffcc99
style BT1 fill:#99ccff
style HT1 fill:#ccff99
Multi-Industry Benefits
- Clean Separation: Each industry has unique fields and tools
- No Cross-Pollution: Banking fields don't appear in fitness analyses
- AI Tool Selection: Bedrock intelligently chooses the right tool
- Single Pass: Still deterministic, not multi-turn conversations
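The routing in the diagram can be expressed as two lookup tables. The franchise-to-industry mapping and the field names come from the diagram's labels; the tool keys (such as `sales_analysis`) and the two helper functions are hypothetical names chosen for this sketch, and in the real design the mapping would be read from the Client Config table rather than hardcoded.

```python
# Franchise -> industry, taken from the edge labels in the diagram.
FRANCHISE_INDUSTRY = {
    "orange-theory": "fitness",
    "wellsfargo": "banking",
    "mercy-health": "healthcare",
}

# Industry -> tool -> output fields, taken from the tool boxes in the diagram.
INDUSTRY_TOOLS = {
    "fitness": {
        "sales_analysis": ("intro_type", "package_offered"),
        "retention": ("cancel_reason", "save_attempt"),
        "service": ("issue_type", "resolved"),
    },
    "banking": {
        "account_service": ("account_type", "transaction"),
        "fraud_detection": ("risk_level", "flags"),
        "loan_application": ("loan_type", "status"),
    },
    "healthcare": {
        "appointment": ("specialty", "time_slot"),
        "insurance": ("coverage", "authorization"),
        "prescription": ("medication", "refills"),
    },
}


def tools_for_franchise(franchise: str) -> dict[str, tuple[str, ...]]:
    """Return the tool catalog for the industry a franchise belongs to."""
    industry = FRANCHISE_INDUSTRY.get(franchise)
    if industry is None:
        raise KeyError(f"no industry configured for franchise {franchise!r}")
    return INDUSTRY_TOOLS[industry]


def allowed_fields(franchise: str) -> set[str]:
    """Collect every output field a franchise's tools may emit."""
    return {field for fields in tools_for_franchise(franchise).values() for field in fields}


if __name__ == "__main__":
    print(sorted(tools_for_franchise("orange-theory")))
    print("risk_level" in allowed_fields("orange-theory"))
```

Because a franchise only ever sees its own industry's catalog, banking fields such as `risk_level` cannot appear in a fitness analysis, which is the "No Cross-Pollution" property.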
6. Configuration Evolution Path
graph TD
subgraph "Current (Monolithic)"
M1[prompt-v5.txt<br/>800 lines<br/>All-in-one]
end
subgraph "v1.0 (Split)"
S1[01-franchise-core.txt]
S2[02-knowledge.txt]
S3[03-coaching.txt]
S4[05-studio-output.txt]
SR1[04-staff-list.txt]
end
subgraph "v1.1 (Dynamic)"
D1[Staff Registry<br/>DynamoDB Table]
D2[Memory Context<br/>DynamoDB Table]
end
subgraph "v2.0 (Industry Templates)"
T1[fitness/templates/]
T2[banking/templates/]
T3[healthcare/templates/]
end
subgraph "v3.0 (Marketplace)"
MP[Cross-Franchise<br/>Module Marketplace<br/>Shared Best Practices]
end
M1 -->|Split| S1 & S2 & S3 & S4
M1 -->|Extract| SR1
SR1 -->|Migrate| D1
S1 -->|Add Memory| D2
S1 & S2 & S3 & S4 -->|Templatize| T1 & T2 & T3
T1 & T2 & T3 -->|Share| MP
style M1 fill:#ffcccc
style MP fill:#ccffcc
Key Architecture Decisions
1. Why Keep Core Pipeline Deterministic
- Reliability: Every call processed identically
- Predictability: No surprises in production
- Performance: Consistent 25-second processing time
- Debugging: Clear, linear flow
2. Why Separate AgentOrchestrator
- Innovation Without Risk: New features don't affect core
- Flexible Deployment: Can disable agent features instantly
- Different SLAs: Core = 99.9% uptime, Agent = experimental
- Cost Control: Agent features are optional add-ons
3. Why Tool-Based Approach
- Schema Enforcement: Output is accepted only after it passes schema validation
- Industry Separation: Different tools for different verticals
- AI Intelligence: Let AI choose the appropriate tool
- Future Proof: Easy to add new tools without changing architecture
Summary
The architecture evolution from current state → v1.0 → future maintains clear principles:
Core Design Principles
- Core pipeline stays deterministic (reliable call processing)
- Innovation happens in separate layer (AgentOrchestrator)
- Tools provide structure (Pydantic validation)
- Configuration drives behavior (no code changes for business rules)
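As an illustration of "configuration drives behavior", the v1.0 prompt can be assembled from the six site-level files at run time, so editing a text file changes behavior without a code change. The file names and the `{franchise}/{siteId}/prompts/v6/` layout come from the v1.0 diagram; reading from a local directory, the function name `build_prompt`, and the blank-line separator are assumptions (the document does not say where the files live or how they are joined).

```python
from pathlib import Path

# File names as shown in the v1.0 diagram, in concatenation order.
PROMPT_FILES = (
    "01-franchise-core.txt",
    "02-knowledge.txt",
    "03-coaching.txt",
    "04-staff-list.txt",
    "05-studio-output.txt",
    "06-custom-requirements.txt",
)


def build_prompt(root: Path, franchise: str, site_id: str, version: str = "v6") -> str:
    """Concatenate all six prompt modules for one site, in numeric order."""
    prompt_dir = root / franchise / site_id / "prompts" / version
    missing = [name for name in PROMPT_FILES if not (prompt_dir / name).is_file()]
    if missing:
        # Fail loudly rather than sending a partial prompt to the model.
        raise FileNotFoundError(f"missing prompt files in {prompt_dir}: {', '.join(missing)}")
    parts = [(prompt_dir / name).read_text(encoding="utf-8").strip() for name in PROMPT_FILES]
    return "\n\n".join(parts)
```

Under this sketch, a staff change is an edit to `04-staff-list.txt` for the affected site; the assembly code is untouched.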
Key Architectural Decisions Visualized
| Aspect | Current | v1.0 | Future (v2.0+) |
|---|---|---|---|
| Prompt Structure | Monolithic 800+ lines | 6 split text files | Industry templates |
| Staff Management | Hardcoded in prompt | Text file (04-staff-list.txt) | DynamoDB table |
| Output Validation | ~85% success | >95% with Pydantic | Multi-schema tools |
| AI Mode | Text generation | Single tool | Multiple tools per industry |
| Memory | None | None (prepared) | Customer context DB |
| Orchestration | Linear pipeline | Linear pipeline | Separate orchestrator |
Data Flow Improvements
Current → v1.0:
- Tools return JSON directly to orchestrator (not to S3)
- Orchestrator persists validated outputs
- Bidirectional context memory (get/save)
- Staff and config lookups for tenant awareness
v1.0 → Future:
- Core pipeline remains unchanged
- Agent layer adds flexibility without risk
- Industry-specific tools prevent field pollution
- Human feedback loop for continuous improvement
This approach ensures CloudAudioAI can scale to multiple industries while maintaining the stability that clients depend on for their daily operations.