Architecture & Limitations
System architecture, design decisions, and current limitations of Ohlala SmartOps
Understanding the system design, architectural decisions, and current limitations of Ohlala SmartOps.
Looking for deployment instructions? See the Getting Started Guide for step-by-step deployment with screenshots.
🏗️ System Architecture
High-Level Overview
Ohlala SmartOps follows a containerized, serverless architecture designed for high availability and cost efficiency.
Container Architecture
Multi-Container Design with dedicated responsibilities:
Main Bot Container
- Purpose: Teams integration, conversation orchestration, Bedrock AI
- Port: 8000
- Resources: 768 CPU units, 1536MB memory
- Key Features:
  - Microsoft Bot Framework integration
  - Amazon Bedrock (Claude) orchestration
  - Conversation state management
  - Multi-language support
MCP AWS API Container
- Purpose: Secure AWS operations via Model Context Protocol
- Port: 8080
- Resources: 256 CPU units, 512MB memory
- Key Features:
  - AWS service abstractions
  - Permission-aware operations
  - Rate limiting and retry logic
  - Security-first design
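The two container definitions above can be summarized as plain data. The sketch below is illustrative only: the `TaskContainer` type and the container names `main-bot` and `mcp-aws-api` are hypothetical, and only the ports and resource figures come from this page. Adding up the reservations gives 1024 CPU units and 2048 MB, which would fit a 1 vCPU / 2 GB Fargate task size, assuming no extra headroom is reserved at the task level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskContainer:
    """One container in the task (hypothetical type, for illustration)."""
    name: str
    port: int
    cpu_units: int   # 1024 CPU units = 1 vCPU
    memory_mb: int


# Names are assumptions; ports and resources are taken from the lists above.
CONTAINERS = [
    TaskContainer("main-bot", port=8000, cpu_units=768, memory_mb=1536),
    TaskContainer("mcp-aws-api", port=8080, cpu_units=256, memory_mb=512),
]


def task_totals(containers):
    """Return (total CPU units, total memory in MB) across all containers."""
    total_cpu = sum(c.cpu_units for c in containers)
    total_memory = sum(c.memory_mb for c in containers)
    return total_cpu, total_memory


cpu, memory = task_totals(CONTAINERS)
print(f"Task reservation: {cpu} CPU units, {memory} MB")
```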
🎯 Architecture Highlights
🚀 Fully Serverless
ECS Fargate + API Gateway eliminate infrastructure management overhead
- Zero server maintenance - AWS handles all patching and scaling
- Automatic scaling - Responds to demand without intervention
- Pay-per-use pricing - Only pay for actual compute time
- Note: ~30s cold start for new container instances
🔒 Security-First Design
Defense in depth with multiple security layers
- Private subnets - Containers have no direct internet exposure
- Isolated containers - Bot logic and AWS operations run separately
- JWT validation - Lambda authorizer validates all requests
- Secrets management - Credentials stored in AWS Secrets Manager
- Least privilege IAM - Each component has minimal required permissions
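To make the Lambda authorizer step more concrete, here is a minimal sketch of the response shape such an authorizer could return. It assumes a TOKEN-type authorizer on an API Gateway REST API, which this page does not confirm. The `validate_token` callable is a hypothetical stand-in for the real JWT validation (signature, issuer, audience, and expiry checks are not shown).

```python
def build_policy(principal_id: str, effect: str, method_arn: str) -> dict:
    """Build the IAM policy document a REST API Lambda authorizer returns."""
    if effect not in ("Allow", "Deny"):
        raise ValueError(f"effect must be 'Allow' or 'Deny', got {effect!r}")
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": method_arn,
                }
            ],
        },
    }


def authorize(event: dict, validate_token) -> dict:
    """Allow the request only if validate_token returns a principal id.

    validate_token is hypothetical: it takes the raw token string and
    returns a principal id, or None if the token is rejected.
    """
    token = event.get("authorizationToken", "")
    if token.startswith("Bearer "):
        token = token[len("Bearer "):]
    principal = validate_token(token) if token else None
    effect = "Allow" if principal else "Deny"
    return build_policy(principal or "anonymous", effect, event["methodArn"])
```

Returning an explicit `Deny` policy, rather than raising, keeps the rejection visible in the authorizer's own logs.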
📦 Microservices Architecture
Multi-container pattern for better maintainability
- Main bot container - Handles Teams interactions and AI orchestration
- MCP AWS container - Provides secure AWS API access
- Clear boundaries - Each container has a single responsibility
- Independent updates - Deploy changes without affecting other components
💾 Stateless by Design
No persistent storage keeps the architecture simple
- Reduced complexity - No database to manage or scale
- Lower costs - No database charges or backup requirements
- Horizontal scaling - Any container can handle any request
- Trade-off: Conversation context resets on container restart
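The trade-off above can be illustrated with a small sketch. It assumes conversation context is held only in process memory, which is an interpretation of "stateless" rather than a description of the actual implementation. The `InMemoryConversations` class is hypothetical.

```python
from collections import defaultdict


class InMemoryConversations:
    """Keeps conversation history in process memory only (hypothetical)."""

    def __init__(self):
        self._history = defaultdict(list)

    def append(self, conversation_id: str, message: str) -> None:
        self._history[conversation_id].append(message)

    def history(self, conversation_id: str) -> list:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._history.get(conversation_id, []))


# Before a restart: context accumulates for the conversation.
store = InMemoryConversations()
store.append("conv-1", "list my EC2 instances")
store.append("conv-1", "stop the second one")
print(store.history("conv-1"))

# A container restart creates a fresh process, modeled here as a new object.
store = InMemoryConversations()
print(store.history("conv-1"))  # the earlier context is gone
```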
🌍 Regional Flexibility
Deploy anywhere with single-region stacks
- Data sovereignty - Keep data in your required region
- Low latency - Deploy close to your EC2 instances
- Cost optimization - No cross-region data transfer fees
- Simple disaster recovery - Deploy multiple independent stacks
⚡ High-Performance Networking
Optimized for Teams integration with enterprise-grade networking
- Network Load Balancer - Layer 4 load balancing for minimal latency
- VPC Link - Secure private connection from API Gateway
- Auto-scaling - The Network Load Balancer absorbs traffic spikes automatically
- Health checks - Automatic failover for unhealthy containers
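As an illustration of the kind of endpoint a load balancer health check could target, the sketch below serves a JSON status response using only the Python standard library. The `/health` path and the response body are assumptions; this page does not specify the path or framework the containers use.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with 200 and a small JSON body."""

    def do_GET(self):
        if self.path == "/health":  # hypothetical path
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Suppress per-request access logs; health checks are frequent.
        pass


def make_server(host: str, port: int) -> HTTPServer:
    """Create the server; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), HealthHandler)
```

Inside a container this could be started with `make_server("0.0.0.0", 8000).serve_forever()`, using the port listed for the main bot container.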
📊 Performance Characteristics
Response Times
- Health Check: < 1 second
- Simple Commands: 2-5 seconds
- AI Analysis: 5-15 seconds
- SSM Operations: 10-60 seconds (depending on command)
Throughput Limits
- Concurrent Users: 1-20 (single task)
- Commands/Day: 10,00+ (with proper scaling)
- API Gateway: 10,000 requests/second (AWS limit)
- Bedrock: 20 requests/minute per model (AWS limit)
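One way to stay under a per-minute request limit like the Bedrock figure above is a client-side sliding-window limiter. The sketch below is a generic illustration, not the product's actual rate-limiting code. The `SlidingWindowLimiter` class is hypothetical, and the defaults of 20 calls per 60 seconds simply mirror the number quoted on this page.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allows at most max_calls within any window_seconds interval."""

    def __init__(self, max_calls=20, window_seconds=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the limiter can be tested
        self._calls = deque()    # timestamps of accepted calls, oldest first

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window_seconds:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            self._calls.append(now)
            return True
        return False
```

A caller would check `limiter.try_acquire()` before each model invocation and queue or delay the request when it returns `False`.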
Scaling Behavior
- Auto-scaling: The ECS service is set to auto-heal with 1 task, so a failed task is replaced automatically
- Cold start: ~30 seconds for new tasks
⚠️ Current Limitations
1. Session Management
- Issue: No persistent conversation history
- Impact: Context lost on container restart
- Workaround: Keep conversations short and focused
2. Multi-Region Support
- Issue: Single region deployment only
- Impact: No built-in disaster recovery
- Workaround: Deploy multiple stacks in different regions
3. Cold Start Latency
- Issue: 30+ second delay for new container starts
- Impact: First request after idle period is slow
- Workaround: Keep minimum 1 task running always
- Mitigation: ECS warmup targets available
🔒 Security Architecture
Network Security
- Private Subnets: Containers have no direct internet access
- Security Groups: Restrictive ingress/egress rules
- VPC Endpoints: Secure access to AWS services
Authentication & Authorization
- Teams Authentication: Microsoft Bot Framework JWT validation
- AWS Permissions: IAM roles with least-privilege access
- Inter-Container: Shared API key for MCP communication
- Secrets: AWS Secrets Manager for sensitive data
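For the shared API key between containers, a receiving service would typically compare the presented key in constant time. The sketch below shows that check under stated assumptions: the `x-api-key` header name and the `is_authorized` helper are hypothetical, headers are assumed to be lowercased already, and loading the expected key from Secrets Manager is not shown.

```python
import hmac


def is_authorized(headers: dict, expected_key: str,
                  header_name: str = "x-api-key") -> bool:
    """Return True only if the request carries the expected shared key.

    hmac.compare_digest avoids leaking how many leading characters matched
    through timing differences.
    """
    if not expected_key:
        # Refuse everything if the key was never configured.
        return False
    presented = headers.get(header_name, "")
    return hmac.compare_digest(
        presented.encode("utf-8"), expected_key.encode("utf-8")
    )
```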
Data Protection
- Encryption in Transit: TLS 1.2+ for all communication
- Encryption at Rest: EBS volumes encrypted by default
- Logging: CloudWatch Logs with retention policies
- Audit Trail: All AWS API calls logged via CloudTrail
📖 Technical References
Container Images
- Registry: Amazon ECR
- Repository: 709825985650.dkr.ecr.us-east-1.amazonaws.com/ohlala-automation-solutions/
- Tags: Version-based (v1.0.0, v1.1.0, etc.)
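Version-based tags such as these do not sort correctly as plain strings (for example, `v1.10.0` sorts before `v1.9.0`). The sketch below shows one way to pick the newest tag, assuming every tag follows the `vMAJOR.MINOR.PATCH` pattern shown above. The helper names are hypothetical.

```python
import re

_TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> tuple:
    """Turn 'v1.10.0' into (1, 10, 0); raise ValueError on other shapes."""
    match = _TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a version tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


def latest_tag(tags) -> str:
    """Return the highest version among the given tags."""
    return max(tags, key=parse_tag)
```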
Monitoring & Observability
- Metrics: CloudWatch Container Insights
- Logs: Structured JSON logging to CloudWatch
- Health Checks: HTTP endpoints on both containers
- Alarms: CPU, Memory, Error Rate monitoring
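Structured JSON logging can be set up with the standard `logging` module alone. The sketch below is a minimal example, not the product's actual logging configuration: the field names and the `smartops` logger name are assumptions. It writes one JSON object per line to a stream, on the assumption that the container's standard output is forwarded to CloudWatch Logs.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Formats each log record as a single-line JSON object."""

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def configure_logging(stream=None):
    """Attach a JSON handler to the (hypothetical) 'smartops' logger."""
    handler = logging.StreamHandler(stream or sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("smartops")
    logger.handlers = [handler]   # replace any handlers set up earlier
    logger.setLevel(logging.INFO)
    logger.propagate = False      # avoid duplicate lines via the root logger
    return logger
```

One JSON object per line keeps entries easy to filter by field, for example by `level` or `logger`.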
Backup & Recovery
- Container Images: Immutable, versioned in ECR
- Infrastructure: CloudFormation templates in version control
- Configuration: Environment variables and secrets
- No Persistent Data: Stateless design eliminates backup needs
📚 Additional Resources
- Deployment Reference - CloudFormation parameters and technical configuration
- Getting Started Guide - Step-by-step deployment
- Troubleshooting - Common issues and solutions
Need Help?
- 📧 Support: support@ohlala.cloud
- 📚 AWS Architecture Center: AWS Well-Architected Framework