Container Apps Revision Mode: Cost Implications
Overview
Azure Container Apps supports two revision modes:
- Single Revision Mode (Current): Only one active revision at a time
- Multiple Revision Mode: Multiple active revisions with traffic splitting
Current Configuration
Engram uses Single Revision Mode by default. This means:
- Only the latest revision is active
- Previous revisions are automatically deactivated on new deployments
- The same revision number is updated in place (e.g., 0000144 stays 0000144)
Cost Implications
Single Revision Mode (Current)
Advantages:
- ✅ Lower Cost: Only one set of replicas running at any time
- ✅ Simpler Management: No traffic splitting complexity
- ✅ Scale-to-Zero Compatible: Idle cost = $0 (when minReplicas = 0)
- ✅ Predictable Billing: Single revision = single resource consumption
Cost Structure:
- Resource Consumption: Billed on vCPU-seconds and GiB-seconds
- Free Tier: First 180,000 vCPU-seconds and 360,000 GiB-seconds/month per subscription
- Idle Cost: $0 when scaled to zero replicas
- Active Cost: Pay only for running replicas (based on actual usage)
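The billing model above can be sketched as a small calculation. This is a minimal sketch, not a pricing tool: the function name `monthly_cost` is illustrative, and the per-second rates are deliberately left as inputs because they vary by region and change over time. Take them from the current Azure pricing page.

```python
FREE_VCPU_SECONDS = 180_000  # monthly free grant per subscription (from the list above)
FREE_GIB_SECONDS = 360_000   # monthly free grant per subscription (from the list above)


def monthly_cost(vcpu_seconds, gib_seconds, vcpu_rate, gib_rate):
    """Estimate the monthly bill after subtracting the free grants.

    vcpu_rate and gib_rate are prices per vCPU-second and per GiB-second.
    They are parameters, not constants, because this sketch does not
    assume any particular price list.
    """
    billable_vcpu = max(0, vcpu_seconds - FREE_VCPU_SECONDS)
    billable_gib = max(0, gib_seconds - FREE_GIB_SECONDS)
    return billable_vcpu * vcpu_rate + billable_gib * gib_rate
```

Usage at or below the free grants yields a cost of zero, which is why an app that is mostly scaled to zero can stay inside the free tier.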
Example Cost (Single Mode):
Scenario: 1,000 conversations/day
- Average: 2 replicas active
- vCPU-seconds: ~2M/month
- GiB-seconds: ~4M/month
- Cost: ~$5-10/month (after free tier)
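The usage figures in this scenario can be roughly reproduced from the replica count. The sketch below assumes the replica size listed under Cost Optimization Tips (0.5 vCPU and 1 GiB per replica), a 30-day month, and an `active_fraction` of 0.77, which is a hypothetical value chosen only because it lands near the ~2M and ~4M figures above.

```python
SECONDS_PER_30_DAY_MONTH = 30 * 24 * 60 * 60  # 2,592,000


def monthly_usage(avg_replicas, vcpu_per_replica, gib_per_replica, active_fraction=1.0):
    """Return (vCPU-seconds, GiB-seconds) consumed in a 30-day month.

    active_fraction is the share of the month during which the average
    replica count applies; the rest of the time the app is scaled to zero.
    """
    active_seconds = SECONDS_PER_30_DAY_MONTH * active_fraction
    vcpu_seconds = avg_replicas * vcpu_per_replica * active_seconds
    gib_seconds = avg_replicas * gib_per_replica * active_seconds
    return vcpu_seconds, gib_seconds


# Assumed inputs: 2 replicas, 0.5 vCPU and 1 GiB each, active ~77% of the month.
vcpu_s, gib_s = monthly_usage(2, 0.5, 1.0, active_fraction=0.77)
```

With these assumed inputs the result is close to 2.0M vCPU-seconds and 4.0M GiB-seconds. The dollar figure then depends on the rates in effect for your region.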
Multiple Revision Mode
Disadvantages:
- ❌ Higher Cost: Multiple revisions = multiple sets of replicas
- ❌ Complex Traffic Management: Requires traffic splitting configuration
- ❌ Increased Resource Usage: All active revisions consume resources
- ❌ Operational Overhead: More monitoring and management needed
Cost Structure:
- Resource Consumption: Each active revision consumes resources independently
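- Traffic Splitting: 50/50 split = up to 2x replicas (assumes each active revision runs a full replica set)
- Gradual Rollout: 10/90 split = roughly 1.1x replicas (assumes the new revision runs about 10% of the main revision's replica count)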
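How many replicas a split actually runs depends on how each revision scales, so it can help to look at two bounding assumptions. The sketch below is a simplified model, not a description of the platform's autoscaler: `proportional` assumes each revision needs replicas in proportion to its traffic share (with at least one replica each), and `full_set` assumes every active revision runs the full replica count, which is the 2x case above. The function name and the `min_per_revision` parameter are hypothetical.

```python
def replicas_under_split(base_replicas, weight_percents, min_per_revision=1):
    """Return (proportional, full_set) total replica estimates.

    base_replicas: replicas needed to serve all traffic on one revision.
    weight_percents: integer traffic weights per revision, summing to 100.
    """
    proportional = 0
    for pct in weight_percents:
        # Integer ceiling of base_replicas * pct / 100, avoiding float rounding.
        needed = -(-base_replicas * pct // 100)
        proportional += max(min_per_revision, needed)
    full_set = base_replicas * len(weight_percents)
    return proportional, full_set


even_split = replicas_under_split(2, [50, 50])   # (2, 4)
canary = replicas_under_split(2, [10, 90])       # (3, 4)
```

Budgeting against the `full_set` figure is the conservative choice; real consumption during a split should fall somewhere between the two values.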
Example Cost (Multiple Mode - 50/50 Traffic Split):
Scenario: 1,000 conversations/day, 50/50 traffic split
- Revision A: 2 replicas active
- Revision B: 2 replicas active
- Total: 4 replicas active
- vCPU-seconds: ~4M/month
- GiB-seconds: ~8M/month
- Cost: ~$10-20/month (after free tier)
- **Cost Increase: 2x compared to Single Mode**
Cost Comparison
| Scenario | Single Mode | Multiple Mode (50/50) | Cost Difference |
|---|---|---|---|
| Idle (0 replicas) | $0 | $0 | Same |
| Light (1 replica) | ~$2-5/month | ~$4-10/month | 2x |
| Medium (2 replicas) | ~$5-10/month | ~$10-20/month | 2x |
| Heavy (5 replicas) | ~$15-30/month | ~$30-60/month | 2x |
When to Use Multiple Revision Mode
Use Multiple Mode Only If:
- ✅ You need blue-green deployments (zero-downtime rollbacks)
- ✅ You need gradual rollouts (A/B testing, canary deployments)
- ✅ You need traffic splitting (testing new features with subset of users)
- ✅ You have budget for 2x resource consumption
For Engram:
- ✅ Single Mode is Optimal: Deployments are quick, so rolling back by redeploying the previous image is fast
- ✅ Cost-Conscious: Single mode aligns with FinOps strategy
- ✅ Simple Deployments: FastAPI deployments are lightweight
- ✅ No Traffic Splitting Needed: Single active revision is sufficient
Recommendations
Current Setup (Single Mode) ✅
Keep Single Revision Mode because:
- Cost Efficiency: Only pay for one set of replicas
- Scale-to-Zero: Idle cost = $0 (minReplicas = 0)
- Fast Deployments: Container Apps deployments are quick (~2-5 minutes)
- Simple Rollbacks: Deploy previous image if needed (no traffic splitting)
If You Need Multiple Mode
Consider Multiple Mode Only If:
- You need zero-downtime rollbacks (blue-green)
- You need A/B testing capabilities
- You have budget for 2x resource consumption
- You need gradual feature rollouts
Cost Impact:
- 2x resource consumption during traffic splits
- Additional monitoring for multiple revisions
- Operational complexity for traffic management
Cost Optimization Tips
Single Mode (Current)
- ✅ Scale-to-Zero: minReplicas = 0 (already configured)
- ✅ Right-Sizing: 0.5 vCPU, 1 GiB memory per replica (already configured)
- ✅ Auto-Scaling: Max 10 replicas (prevents runaway costs)
- ✅ Idle Behavior: Scales to 0 after 30 minutes idle
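These settings can be checked automatically before a deployment. The sketch below represents a subset of the Container Apps resource definition as a Python dict and flags settings that would raise idle or peak cost. The container name `api`, the function `cost_guardrail_warnings`, and the `replica_cap` parameter are hypothetical; the 30-minute idle behavior is not modeled here.

```python
# Subset of a Container Apps resource definition, expressed as a Python dict.
# Values mirror the settings listed above; the container name is a placeholder.
app = {
    "configuration": {"activeRevisionsMode": "Single"},
    "template": {
        "containers": [
            {"name": "api", "resources": {"cpu": 0.5, "memory": "1Gi"}}
        ],
        "scale": {"minReplicas": 0, "maxReplicas": 10},
    },
}


def cost_guardrail_warnings(app, replica_cap=10):
    """Return a list of warnings for settings that raise idle or peak cost."""
    warnings = []
    scale = app["template"]["scale"]
    if app["configuration"]["activeRevisionsMode"] != "Single":
        warnings.append("multiple revisions may run at once")
    if scale["minReplicas"] > 0:
        warnings.append("minReplicas > 0 disables scale-to-zero")
    if scale["maxReplicas"] > replica_cap:
        warnings.append("maxReplicas exceeds the agreed cap")
    return warnings
```

An empty list means the definition matches the cost posture described in this section. A check like this could run in CI so that a change to `minReplicas` or the revision mode is noticed before it reaches the bill.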
Multiple Mode (If Needed)
- ⚠️ Minimize Traffic Splits: Keep splits short (minutes, not hours)
- ⚠️ Monitor Both Revisions: Track resource usage for both
- ⚠️ Quick Rollouts: Complete rollouts quickly to reduce overlap
- ⚠️ Budget Alerts: Set up alerts for 2x resource consumption
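The reason to keep splits short is that the extra consumption grows linearly with the overlap time. A minimal sketch of that arithmetic, assuming the extra revision runs a fixed number of replicas at 0.5 vCPU and 1 GiB each for the whole overlap (the function name and the example durations are illustrative):

```python
def split_overlap_usage(extra_replicas, vcpu_per_replica, gib_per_replica, overlap_minutes):
    """Return the extra (vCPU-seconds, GiB-seconds) consumed by the second
    revision while both revisions are active."""
    seconds = overlap_minutes * 60
    return (
        extra_replicas * vcpu_per_replica * seconds,
        extra_replicas * gib_per_replica * seconds,
    )


# Two extra replicas for a 15-minute rollout versus a 24-hour split.
short_split = split_overlap_usage(2, 0.5, 1.0, overlap_minutes=15)
long_split = split_overlap_usage(2, 0.5, 1.0, overlap_minutes=24 * 60)
```

Under these assumptions the 15-minute rollout adds 900 vCPU-seconds and 1,800 GiB-seconds, while the 24-hour split adds 86,400 vCPU-seconds and 172,800 GiB-seconds, nearly half of the monthly free vCPU grant.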
Monitoring
Key Metrics to Track
Single Mode:
- Active replicas (should be 0 when idle)
- vCPU-seconds per month
- GiB-seconds per month
- Deployment frequency
Multiple Mode:
- Active revisions count
- Traffic split percentages
- Resource consumption per revision
- Total resource consumption (sum of all revisions)
Cost Alerts
Set up alerts (Azure Cost Management for spend, Azure Monitor for revision and replica metrics) for:
- Container Apps resource consumption > threshold
- Multiple revisions active for > 1 hour
- Resource consumption spikes
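The "multiple revisions active for > 1 hour" condition can be expressed as a simple check. This sketch assumes you already have the activation time of each currently active revision from your own monitoring data; the function name `overlap_exceeded` and the input shape are hypothetical, and how the timestamps are collected is out of scope here.

```python
from datetime import datetime, timedelta, timezone


def overlap_exceeded(active_since, now, limit=timedelta(hours=1)):
    """Return True if two or more revisions have been active together
    for longer than `limit`.

    active_since: list of timezone-aware datetimes, one per revision that
    is active right now.
    """
    if len(active_since) < 2:
        return False
    # The overlap began when the second revision became active.
    overlap_start = sorted(active_since)[1]
    return now - overlap_start > limit


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
stale_split = overlap_exceeded([now - timedelta(hours=3), now - timedelta(hours=2)], now)
fresh_split = overlap_exceeded([now - timedelta(hours=3), now - timedelta(minutes=30)], now)
```

Here `stale_split` is `True` (both revisions have overlapped for two hours) and `fresh_split` is `False` (the overlap is only 30 minutes old).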
Conclusion
Single Revision Mode is the optimal choice for Engram:
- ✅ Cost-Efficient: Only one set of replicas
- ✅ Scale-to-Zero Compatible: $0 idle cost
- ✅ Simple Management: No traffic splitting complexity
- ✅ Fast Deployments: Quick rollouts and rollbacks
- ✅ FinOps Aligned: Matches our cost-conscious architecture
Cost Impact of Current Setup:
- Idle: $0 (scales to zero)
- Light Usage: ~$2-5/month
- Medium Usage: ~$5-10/month
- Heavy Usage: ~$15-30/month
If switching to Multiple Mode:
- Cost Increase: 2x resource consumption during traffic splits
- Use Case: Only if blue-green deployments or A/B testing is required
- Recommendation: Keep Single Mode unless specific need arises