Chat Fix Status - January 1, 2026
Status: Code fix committed, deployment workflow updated, waiting for deployment
✅ Completed Fixes
1. Code Fix (Committed)
- File: `backend/agents/base.py`
- Changes:
  - Use `max_completion_tokens` instead of `max_tokens` for gpt-5.1-chat
  - Remove the `temperature` parameter for gpt-5.1-chat (only the default value of 1 is supported)
- Commit: Already in codebase
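The parameter change can be sketched as a small helper. This is illustrative only — `build_completion_kwargs` is a hypothetical function, not the actual code in `backend/agents/base.py`:

```python
def build_completion_kwargs(model: str, messages: list, limit: int = 1024) -> dict:
    """Build kwargs for a chat completion call, adjusting for model quirks.

    Sketch of the fix: gpt-5.1-chat accepts only `max_completion_tokens`
    (not `max_tokens`) and rejects any non-default `temperature`.
    """
    kwargs = {"model": model, "messages": messages}
    if model == "gpt-5.1-chat":
        kwargs["max_completion_tokens"] = limit  # replaces max_tokens
        # temperature omitted: only the default (1) is supported
    else:
        kwargs["max_tokens"] = limit
        kwargs["temperature"] = 0.7  # older models accept a custom value
    return kwargs
```

Passing the resulting dict to the chat completions call avoids the 400 Bad Request that the unsupported parameters triggered.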
2. Deployment Workflow Fix (Just Committed)
- File: `.github/workflows/deploy.yml`
- Change: Set `azureAiModelRouter=` (empty string) instead of `azureAiModelRouter=model-router`
- Commit: `b40491a18` - "fix: Disable Model Router in deployment workflow"
- Status: ✅ Committed and pushed
⏳ Next Steps
Step 1: Wait for Active Deployment to Complete
Current Status:
- There’s an active deployment that caused a conflict
- The failed deployment (00:01:39Z) was blocked by the active deployment (started 00:01:56Z)
- Wait for the active deployment to complete or fail
Check Status:
```shell
gh run list --workflow=deploy.yml --limit 1
```
Step 2: Wait for CI to Complete
The push will trigger:
- CI workflow - Builds and tests the code
- Deploy workflow - Deploys to Azure (only runs after CI succeeds)
Wait for: CI workflow to complete successfully
Step 3: Monitor Deployment
Once CI completes and Deploy workflow starts:
```shell
gh run watch
```
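If `gh run watch` isn't convenient, the wait can be scripted. A sketch, assuming the `gh` CLI is on `PATH`; the JSON field names follow `gh run list --json status,conclusion`:

```python
import json
import subprocess

def latest_run(workflow: str = "deploy.yml") -> dict:
    """Fetch the most recent run of a workflow as a dict via the gh CLI."""
    out = subprocess.check_output(
        ["gh", "run", "list", "--workflow", workflow,
         "--limit", "1", "--json", "status,conclusion"]
    )
    runs = json.loads(out)
    return runs[0] if runs else {}

def is_finished(run: dict) -> bool:
    """A run is finished once its status is 'completed'."""
    return run.get("status") == "completed"
```

Poll `latest_run()` in a loop (with a sleep between calls) until `is_finished()` returns True, then inspect `conclusion` for `success` or `failure`.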
Expected:
- ✅ Infrastructure deployment succeeds (Model Router disabled)
- ✅ Backend container deploys with the code fix
- ✅ Configuration: `AZURE_AI_MODEL_ROUTER` is empty/not set
Step 4: Verify Configuration
After deployment succeeds, verify Azure Container Apps:
```shell
az containerapp show \
  --name staging-env-api \
  --resource-group zimax-ai \
  --query "properties.template.containers[0].env[?name=='AZURE_AI_MODEL_ROUTER' || name=='AZURE_AI_DEPLOYMENT' || name=='AZURE_AI_API_VERSION']" \
  --output table
```
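The same check can be automated against the JSON form of that query (`--output json`). A sketch; the env entry shape `{"name": ..., "value": ...}` matches what Container Apps returns:

```python
from typing import Optional

def env_value(env: list, name: str) -> Optional[str]:
    """Return the value of an env var from a Container Apps env list, or None."""
    for entry in env:
        if entry.get("name") == name:
            return entry.get("value")
    return None

def config_is_fixed(env: list) -> bool:
    """True when Model Router is off and the direct deployment is configured."""
    return (
        not env_value(env, "AZURE_AI_MODEL_ROUTER")  # absent or empty string
        and env_value(env, "AZURE_AI_DEPLOYMENT") == "gpt-5.1-chat"
        and env_value(env, "AZURE_AI_API_VERSION") == "2024-12-01-preview"
    )
```

Feed it the parsed JSON array from the `az` query; a False result means the container is still running the old configuration.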
Expected:
- `AZURE_AI_MODEL_ROUTER` = (not present or empty) ✅
- `AZURE_AI_DEPLOYMENT` = `gpt-5.1-chat` ✅
- `AZURE_AI_API_VERSION` = `2024-12-01-preview` ✅
Step 5: Test Chat Endpoint
```shell
python3 scripts/test-chat-debug.py \
  --token "YOUR_JWT_TOKEN" \
  --message "hi"
```
Expected Result:
- HTTP 200 OK
- Actual agent response (not an error message)
- `tokens_used` > 0 (indicates the LLM call succeeded)
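These expectations can be asserted programmatically. A sketch — the `tokens_used` field comes from the expectations above, but the rest of the response shape (a `response` text field) is an assumption, not confirmed against the debug script:

```python
def chat_response_ok(status_code: int, body: dict) -> bool:
    """True when the chat endpoint returned a real LLM-backed response."""
    return (
        status_code == 200
        and bool(body.get("response"))       # non-empty agent text, not an error
        and body.get("tokens_used", 0) > 0   # the LLM call actually ran
    )
```

Wiring this into the test script gives a single pass/fail signal instead of eyeballing the output.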
📋 Checklist
- Code fix committed (`max_completion_tokens`, no `temperature` for gpt-5.1-chat)
- Deployment workflow updated (Model Router disabled)
- Changes pushed to main branch
- CI workflow completes successfully
- Active deployment completes (wait if needed)
- Deploy workflow triggers after CI
- Infrastructure deployment succeeds
- Backend container deploys with fix
- Verify `AZURE_AI_MODEL_ROUTER` is empty in Azure
- Test chat endpoint - should work! ✅
🔍 How the Fix Works
Before (Broken)
- Model Router enabled → Used wrong endpoint format
- `max_tokens` parameter → gpt-5.1-chat doesn't support this
- `temperature` parameter → gpt-5.1-chat doesn't support custom temperature
- Result: 400 Bad Request from LLM API
After (Fixed)
- Model Router disabled → Direct model deployment (`gpt-5.1-chat`)
- `max_completion_tokens` parameter → ✅ Supported by gpt-5.1-chat
- No `temperature` parameter → ✅ Uses default (1)
- Result: 200 OK with actual response ✅
⚠️ Important Notes
- 14-Minute Rule: Don’t trigger multiple deployments within 14 minutes
- Wait for Active Deployment: The previous deployment must complete before a new one starts
- Configuration Consistency: Ensure Azure Container Apps matches the deployment parameters
- Test After Deploy: Always test chat endpoint after deployment succeeds
📚 Related Documentation
- `docs/troubleshooting/gpt-5.1-chat-api-parameters-fix.md` - Code fix details
- `docs/troubleshooting/deployment-failure-chat-fix.md` - Deployment failure details
- `docs/troubleshooting/chat-deployment-issues-summary.md` - Complete summary
- `docs/configuration/config-alignment.md` - Configuration reference