Voice & Chat Routing + Episode Metadata Fix
Document Version: 1.0
Last Updated: December 31, 2025
Status: Implementation Complete
Problem
A user reported that episodes and stories load correctly while logged in with a Google account, but:
- Backend routing for voice and chat needs to be fixed
- Episode metadata needs to include user and project information correctly
Root Cause
- Chat WebSocket: Used hardcoded dev_security instead of authenticating users from JWT token
- Voice persist_conversation_turn: Used hardcoded user IDs instead of authenticated user
- Episode Metadata: Not consistently including user identity metadata (email, display_name) in all session creation points
Solution
1. Fixed Chat WebSocket Authentication
File: backend/api/routers/chat.py
Changes:
- Extract JWT token from query parameter (like voice WebSocket does)
- Validate token and extract user identity
- Create SecurityContext with authenticated user
- Pass authenticated user to connection manager
Before:
dev_security = SecurityContext(user_id="ws-user", tenant_id="ws-tenant", roles=[Role.ANALYST], scopes=["*"])
await manager.connect(websocket, session_id, dev_security)
After:
# Extract token from query parameter
token_param = websocket.query_params.get("token")
# Validate token and create SecurityContext with authenticated user
# (user_id, tenant_id, roles, scopes, email, and display_name come from the validated token)
security = SecurityContext(
    user_id=user_id,
    tenant_id=tenant_id,
    roles=roles,
    scopes=scopes,
    session_id=session_id,
    email=email,
    display_name=display_name,
)
await manager.connect(websocket, session_id, security)
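The validation step is elided in the snippet above. A minimal, self-contained sketch of how the pieces might fit together follows. It assumes a caller-supplied validate_token callable that returns the token's claims or raises ValueError. The function name security_from_query and the claim names (sub, tid, email, name, roles, scopes) are illustrative assumptions, not the project's confirmed schema, and the SecurityContext here is a simplified stand-in for the real class in backend/core/context.py.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Mapping, Optional


@dataclass
class SecurityContext:
    """Simplified stand-in for the real SecurityContext (assumption)."""
    user_id: str
    tenant_id: str
    roles: List[str] = field(default_factory=list)
    scopes: List[str] = field(default_factory=list)
    session_id: Optional[str] = None
    email: Optional[str] = None
    display_name: Optional[str] = None


def security_from_query(
    query_params: Mapping[str, str],
    session_id: str,
    validate_token: Callable[[str], Mapping[str, Any]],
) -> Optional[SecurityContext]:
    """Return a SecurityContext for a valid ?token=... value, else None."""
    token = query_params.get("token")
    if not token:
        return None  # no token supplied: the caller should reject the connection
    try:
        claims = validate_token(token)  # hypothetical validator, injected by the caller
    except ValueError:
        return None  # invalid or expired token
    # Claim names below are assumptions, not the project's confirmed schema.
    user_id = claims.get("sub")
    tenant_id = claims.get("tid")
    if not user_id or not tenant_id:
        return None  # identity claims missing: treat as unauthenticated
    return SecurityContext(
        user_id=str(user_id),
        tenant_id=str(tenant_id),
        roles=list(claims.get("roles", [])),
        scopes=list(claims.get("scopes", [])),
        session_id=session_id,
        email=claims.get("email"),
        display_name=claims.get("name"),
    )
```

A handler built this way would close the WebSocket when the function returns None, and otherwise pass the result to manager.connect(websocket, session_id, security).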
2. Fixed Voice persist_conversation_turn Authentication
File: backend/api/routers/voice.py
Changes:
- Added user: SecurityContext = Depends(get_current_user) parameter
- Use authenticated user instead of hardcoded user IDs
- Include full user metadata in session creation
Before:
async def persist_conversation_turn(turn: ConversationTurn):
    # Hardcoded user IDs
    security = SecurityContext(user_id="voice-user", ...)
After:
async def persist_conversation_turn(turn: ConversationTurn, user: SecurityContext = Depends(get_current_user)):
    # Use authenticated user
    security = SecurityContext(
        user_id=user.user_id,
        tenant_id=user.tenant_id,
        email=user.email,
        display_name=user.display_name,
        ...
    )
3. Enhanced Episode Metadata
Files:
- backend/api/routers/voice.py - Voice WebSocket and persist_conversation_turn
- backend/memory/client.py - persist_conversation method
Changes:
- Ensure all session creation includes user identity metadata:
  - tenant_id (always included)
  - email (if available)
  - display_name (if available)
- Updated persist_conversation to include user metadata in episode metadata
Before:
session_metadata = {
    "turn_count": context.episodic.total_turns,
}
After:
session_metadata = {
    "turn_count": context.episodic.total_turns,
    "tenant_id": context.security.tenant_id,
}
if context.security.email:
    session_metadata["email"] = context.security.email
if context.security.display_name:
    session_metadata["display_name"] = context.security.display_name
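Because the same identity fields have to appear at every session creation point, one way to keep them from drifting apart is a single shared helper. The sketch below is only an illustration of that idea: build_session_metadata is a hypothetical name, not a function confirmed to exist in the codebase, and it takes plain values rather than the real context object.

```python
from typing import Any, Dict, Optional


def build_session_metadata(
    turn_count: int,
    tenant_id: str,
    email: Optional[str] = None,
    display_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Build episode metadata; optional identity fields are added only when set."""
    metadata: Dict[str, Any] = {"turn_count": turn_count, "tenant_id": tenant_id}
    optional_fields = {"email": email, "display_name": display_name}
    # Skip None and empty strings, mirroring the `if context.security.email:` checks.
    metadata.update({key: value for key, value in optional_fields.items() if value})
    return metadata
```

Each call site would then pass context.episodic.total_turns and the fields of context.security, so tenant_id is always present and email and display_name appear only when available.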
User Identity Flow
Chat Flow
- REST Endpoint (POST /api/v1/chat):
  - Uses get_current_user dependency → extracts JWT from Authorization header
  - Creates SecurityContext with authenticated user
  - Passes to get_or_create_session() → creates EnterpriseContext
  - enrich_context() ensures Zep session has full user metadata
  - persist_conversation() includes user metadata in episode metadata
- WebSocket Endpoint (WS /api/v1/chat/ws/{session_id}):
  - Extracts JWT token from query parameter (?token=...)
  - Validates token and creates SecurityContext with authenticated user
  - Passes to connection manager
  - Same flow as REST endpoint for memory operations
Voice Flow
- WebSocket Endpoint (WS /api/v1/voice/voicelive/{session_id}):
  - Extracts JWT token from query parameter (?token=...)
  - Validates token and creates SecurityContext with authenticated user
  - Creates EnterpriseContext with authenticated user
  - _ensure_memory_session() includes full user metadata
- Persist Turn Endpoint (POST /api/v1/voice/conversation/turn):
  - Uses get_current_user dependency → extracts JWT from Authorization header
  - Creates SecurityContext with authenticated user
  - Includes full user metadata in session creation
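The flows above carry the JWT in two different places: the Authorization header for REST endpoints and the token query parameter for WebSocket endpoints. A small sketch of both extraction paths is shown here. The function names are hypothetical and the real get_current_user dependency may work differently; the header lookup also assumes a plain, case-sensitive mapping, whereas web frameworks usually provide case-insensitive headers.

```python
from typing import Mapping, Optional


def token_from_header(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, if any."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None  # wrong scheme or empty credential
    return token


def token_from_query(query_params: Mapping[str, str]) -> Optional[str]:
    """Return the token from a '?token=...' query parameter, if any."""
    return query_params.get("token") or None
```

Either function yields the raw token only. Signature and expiry validation still has to happen before a SecurityContext is created.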
Episode Metadata Structure
Episodes now include the following metadata:
{
  "session_id": "session-xxx",
  "user_id": "d240186f-f80e-4369-9296-57fef571cd93",
  "metadata": {
    "tenant_id": "6684288a-b805-4161-bf41-ba2121e51c90",
    "email": "derek.brent.moore@engramai.onmicrosoft.com",
    "display_name": "derek brent moore",
    "agent_id": "elena",
    "channel": "chat|voice|voice-direct",
    "turn_count": 5,
    "summary": "Conversation summary...",
    "topics": ["topic1", "topic2"]
  }
}
Project/Department Metadata (Future)
For enterprise boundaries, project/department information can be added:
- Custom JWT Claims: Add project_id and department_id as custom claims in Entra ID
- Extract in Auth Middleware: Read custom claims from JWT token
- Add to SecurityContext: Extend SecurityContext to include project_id and department_id
- Include in Metadata: Add to session metadata when creating episodes
Example:
# In auth.py - extract custom claims
project_id = token.get("project_id")
department_id = token.get("department_id")
# In SecurityContext
project_id: Optional[str] = None
department_id: Optional[str] = None
# In session metadata
if security.project_id:
    session_metadata["project_id"] = security.project_id
if security.department_id:
    session_metadata["department_id"] = security.department_id
Verification
Test Chat Authentication
# Get JWT token from browser console or MSAL
TOKEN="<your-jwt-token>"
# Test REST endpoint
curl -X POST https://api.engram.work/api/v1/chat \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"content": "hi", "session_id": "test-session"}'
# Test WebSocket (use wscat or similar)
wscat -c "wss://api.engram.work/api/v1/chat/ws/test-session?token=$TOKEN"
Test Voice Authentication
# Test WebSocket
wscat -c "wss://api.engram.work/api/v1/voice/voicelive/test-session?token=$TOKEN"
# Test persist turn endpoint
curl -X POST https://api.engram.work/api/v1/voice/conversation/turn \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "test-session",
    "agent_id": "elena",
    "role": "user",
    "content": "Hello"
  }'
Verify Episode Metadata
# List episodes (should show user's episodes only)
curl -X GET https://api.engram.work/api/v1/memory/episodes \
-H "Authorization: Bearer $TOKEN"
# Get specific episode transcript
curl -X GET https://api.engram.work/api/v1/memory/episodes/{session_id} \
-H "Authorization: Bearer $TOKEN"
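To check the returned episodes programmatically rather than by eye, a short helper along these lines could be used. It assumes each episode in the response follows the shape shown under Episode Metadata Structure; the actual response format of the memory endpoints is not confirmed here, and missing_identity_fields is a hypothetical name. Only user_id and tenant_id are treated as required, since email and display_name are included only when available.

```python
from typing import Any, List, Mapping


def missing_identity_fields(episode: Mapping[str, Any]) -> List[str]:
    """List the required identity fields that are absent or empty in an episode."""
    missing: List[str] = []
    if not episode.get("user_id"):
        missing.append("user_id")
    # Assumed shape: identity metadata lives under the "metadata" key.
    metadata = episode.get("metadata") or {}
    if not metadata.get("tenant_id"):
        missing.append("metadata.tenant_id")
    return missing
```

An empty list means the episode carries the identity fields this fix is meant to guarantee.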
Related Documentation
- docs/architecture/user-identity-flow-comprehensive.md - Complete user identity flow
- docs/architecture/security-context-enterprise-architecture.md - SecurityContext architecture
- docs/troubleshooting/user-identity-consistency-fix.md - User identity consistency fixes
- backend/api/middleware/auth.py - Authentication middleware
- backend/core/context.py - SecurityContext definition
Summary
✅ Chat WebSocket: Now authenticates users from JWT token in query parameter
✅ Voice persist_conversation_turn: Now uses authenticated user from SecurityContext
✅ Episode Metadata: Now includes user_id, tenant_id, email, display_name consistently
✅ Project Metadata: Framework ready for project/department metadata (requires custom JWT claims)
All voice and chat routing now authenticates users and includes full user metadata in episodes, ensuring proper user attribution. Project/department boundaries are not yet enforced; they require the custom JWT claims described under Project/Department Metadata (Future).