VoiceLive Avatar Integration Complete ✅

Status: ✅ Integration Complete
Date: January 2026
Feature: Real-time Voice + Avatar Video Synchronization


Overview

Elena’s avatar is now integrated with VoiceLive, so voice conversations with Elena include synchronized avatar video. When users talk to Elena, they see her photorealistic avatar speaking in real time.


Implementation

Backend Changes

  1. VoiceLive Session Configuration (backend/api/routers/voice.py)
    • Added Modality.VIDEO to session modalities for Elena
    • Added avatar configuration matching Foundry settings:
      avatar = {
          "avatar_id": "en-US-JennyNeural",  # Match Elena's voice
          "style": "professional",
          "emotion": "neutral",
          "resolution": "1080p",
          "background": "transparent",
      }
      
  2. Avatar Video Event Handling
    • Added handling for RESPONSE_VIDEO_DELTA events (streaming video chunks)
    • Added handling for RESPONSE_VIDEO_DONE events (final video URL)
    • Sends avatar video data to frontend via WebSocket
  3. Agent Switching
    • Avatar enabled automatically when switching to Elena
    • Avatar disabled for other agents (Marcus, Sage)
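
The event handling in step 2 can be sketched as a small translation function. This is a minimal illustration, not the code in voice.py: the EventType enum and VideoEvent dataclass are hypothetical stand-ins for whatever event objects the VoiceLive SDK delivers, and the delta and url field names are assumptions. Only the event names and the outgoing message shapes come from this document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventType(Enum):
    # Local stand-ins for the event names used in this document;
    # the real SDK enum and its values may differ.
    RESPONSE_VIDEO_DELTA = "RESPONSE_VIDEO_DELTA"
    RESPONSE_VIDEO_DONE = "RESPONSE_VIDEO_DONE"


@dataclass
class VideoEvent:
    # Hypothetical event shape, for illustration only.
    type: EventType
    delta: Optional[str] = None  # assumed: one encoded video chunk
    url: Optional[str] = None    # assumed: URL of the finished video


def to_ws_message(event: VideoEvent) -> Optional[dict]:
    """Translate a video event into the WebSocket message sent to the frontend."""
    if event.type is EventType.RESPONSE_VIDEO_DELTA and event.delta:
        # Streaming chunk: forwarded as an avatar_video message.
        return {"type": "avatar_video", "data": event.delta}
    if event.type is EventType.RESPONSE_VIDEO_DONE and event.url:
        # Final video: forwarded as an avatar_video_url message.
        return {"type": "avatar_video_url", "url": event.url}
    # Events without a usable payload are dropped.
    return None
```

Returning None for events without a payload lets the caller skip the WebSocket send without special-casing each event type.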

Frontend Changes

  1. VoiceChat Component (frontend/src/components/VoiceChat/VoiceChat.tsx)
    • Added onAvatarVideo callback prop
    • Handles avatar_video and avatar_video_url message types
    • Manages avatar video URL state
    • Notifies parent component when avatar video is available
  2. VisualPanel Component (frontend/src/components/VisualPanel/VisualPanel.tsx)
    • Added voiceAvatarVideoUrl state
    • Passes avatar video URL to AvatarDisplay component
    • Connects VoiceChat avatar video callback
  3. AvatarDisplay Component (frontend/src/components/AvatarDisplay/AvatarDisplay.tsx)
    • Already supports avatarVideoUrl prop
    • Displays video when available, falls back to static image

How It Works

Flow

User speaks via microphone
    ↓
VoiceLive processes audio
    ↓
Elena responds with:
  - Real-time audio (streaming)
  - Avatar video (streaming chunks or final URL)
    ↓
Backend sends to frontend:
  - {"type": "audio", "data": "..."}
  - {"type": "avatar_video", "data": "..."}  (chunks)
  - {"type": "avatar_video_url", "url": "..."}  (final)
    ↓
Frontend displays:
  - Audio playback
  - Avatar video in AvatarDisplay component
    ↓
User sees Elena speaking with synchronized avatar
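
The receiving side of this flow can be modeled as a reducer over the three message types. The real frontend is a React component written in TypeScript; the Python below only illustrates the protocol. PlaybackState and apply_message are hypothetical names, and the payloads are treated as opaque strings because their encoding is not specified here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlaybackState:
    # Hypothetical container for what the client has received so far.
    audio_chunks: List[str] = field(default_factory=list)
    video_chunks: List[str] = field(default_factory=list)
    video_url: Optional[str] = None


def apply_message(state: PlaybackState, message: dict) -> PlaybackState:
    """Update playback state from one backend WebSocket message."""
    kind = message.get("type")
    if kind == "audio":
        state.audio_chunks.append(message["data"])
    elif kind == "avatar_video":
        state.video_chunks.append(message["data"])
    elif kind == "avatar_video_url":
        state.video_url = message["url"]
    # Any other message type leaves the state unchanged.
    return state


# Usage: feed messages in arrival order.
state = PlaybackState()
for msg in (
    {"type": "audio", "data": "a1"},
    {"type": "avatar_video", "data": "v1"},
    {"type": "avatar_video_url", "url": "https://example.com/elena.mp4"},
):
    apply_message(state, msg)
```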

Avatar Configuration

For Elena:

  • Avatar ID: en-US-JennyNeural (matches voice)
  • Style: Professional
  • Emotion: Neutral
  • Resolution: 1080p
  • Background: Transparent

For Other Agents:

  • Avatar disabled (audio only)
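
One way to express this per-agent rule is a lookup that returns the avatar settings for Elena and None for everyone else. This is a sketch only: avatar_config_for and ELENA_AVATAR are hypothetical names, while the setting values are the ones listed above.

```python
from typing import Optional

# Settings copied from the Elena configuration listed above.
ELENA_AVATAR = {
    "avatar_id": "en-US-JennyNeural",
    "style": "professional",
    "emotion": "neutral",
    "resolution": "1080p",
    "background": "transparent",
}


def avatar_config_for(agent_id: str) -> Optional[dict]:
    """Return avatar settings for the agent, or None for audio-only agents."""
    if agent_id == "elena":
        # Return a copy so callers cannot mutate the shared defaults.
        return dict(ELENA_AVATAR)
    return None
```

A None result means the session is configured for audio only, which is the case for Marcus and Sage.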

Features

Real-time Avatar Video

  • Avatar video streams during voice conversations
  • Synchronized with audio output
  • Natural lip-sync and expressions

Automatic Enable/Disable

  • Avatar enabled automatically for Elena
  • Disabled for other agents (Marcus, Sage)
  • No manual configuration needed

Graceful Fallback

  • Falls back to static image if video fails
  • No errors if avatar not available
  • Works with or without avatar

Seamless Integration

  • Works with existing VoiceLive infrastructure
  • No breaking changes
  • Backward compatible

Testing

Test VoiceLive Avatar

  1. Start Voice Conversation:
    • Navigate to Engram website
    • Click “Activate Voice” for Elena
    • Start speaking
  2. Verify Avatar:
    • Elena’s avatar should appear
    • Avatar should speak in sync with audio
    • Video should be smooth and natural
  3. Check Console:
    • Look for “VoiceLive avatar enabled for elena” log
    • Check for “Avatar video URL received” messages
    • Verify no errors

Test Agent Switching

  1. Switch to Elena:
    • Avatar should be enabled
    • Video should appear
  2. Switch to Marcus/Sage:
    • Avatar should be disabled
    • Only audio should play

Configuration

Backend

Avatar is automatically enabled when:

  • Agent is Elena (agent_id == "elena")
  • VoiceLive is configured
  • Session includes Modality.VIDEO
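
Taken together, the three conditions form a single predicate. The sketch below uses a local Modality enum as a stand-in for the SDK's own type (only Modality.VIDEO is named in this document; the AUDIO and TEXT members are assumptions), and avatar_enabled is a hypothetical helper name.

```python
from enum import Enum
from typing import Iterable


class Modality(Enum):
    # Stand-in for the SDK enum; only VIDEO is confirmed by this document.
    TEXT = "text"
    AUDIO = "audio"
    VIDEO = "video"


def avatar_enabled(
    agent_id: str,
    voicelive_configured: bool,
    modalities: Iterable[Modality],
) -> bool:
    """True only when all three conditions listed above hold."""
    return (
        agent_id == "elena"
        and voicelive_configured
        and Modality.VIDEO in set(modalities)
    )
```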

Frontend

No additional configuration needed. Avatar video is automatically:

  • Received from VoiceLive
  • Displayed in AvatarDisplay component
  • Synchronized with audio

Performance Considerations

Video Streaming

  • Chunked Streaming: Avatar video sent as chunks for real-time display
  • Final URL: Complete video URL sent when available
  • Bandwidth: Video is requested at 1080p, chosen as a balance between visual quality and stream size

Latency

  • Real-time: Avatar video streams in real-time with audio
  • Synchronization: Audio and video synchronized by VoiceLive
  • Optimization: Transparent background reduces file size

Troubleshooting

Avatar Not Appearing

  1. Check Backend Logs:
    # Look for "VoiceLive avatar enabled for elena"
    # Check for avatar video events
    
  2. Verify Agent:
    • Ensure agent is Elena (not Marcus/Sage)
    • Check session configuration includes VIDEO modality
  3. Check Frontend:
    • Verify the voiceAvatarVideoUrl state is set after an avatar_video_url message arrives
    • Check AvatarDisplay component receives URL
    • Look for console errors

Video Not Playing

  1. Check Video URL:
    • Verify URL is valid
    • Check URL accessibility
    • Test URL in browser
  2. Check Browser Support:
    • Ensure browser supports video playback
    • Check for CORS issues
    • Verify video format (MP4)
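
Some of these checks can be run offline before testing in a browser. The helper below is a heuristic sketch using only the Python standard library; video_url_problems is a hypothetical name. It checks the URL's shape and file extension only, so it cannot detect CORS or network problems, and a valid URL that lacks a .mp4 extension will be flagged.

```python
from typing import List
from urllib.parse import urlparse


def video_url_problems(url: str) -> List[str]:
    """Return a list of structural problems found in an avatar video URL."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("scheme is not http or https")
    if not parsed.netloc:
        problems.append("missing host")
    # The query string is not part of parsed.path, so signed URLs still pass.
    if not parsed.path.lower().endswith(".mp4"):
        problems.append("path does not end in .mp4")
    return problems
```

An empty list means the URL looks plausible; it still needs to be opened in a browser to confirm accessibility and playback.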

Summary

VoiceLive Avatar Integration Complete

  • Real-time avatar video for Elena
  • Synchronized with voice audio
  • Automatic enable/disable
  • Graceful fallback
  • No breaking changes

Elena’s avatar now works seamlessly with VoiceLive, providing users with an immersive real-time voice conversation experience.


Last Updated: January 2026