Architecture & Limitations

System architecture, design decisions, and current limitations of Ohlala SmartOps

🏗️ System Architecture

High-Level Overview

Ohlala SmartOps follows a containerized, serverless architecture designed for high availability and cost efficiency:

[Figure: high-level architecture diagram showing user interaction with Teams, API Gateway, ECS Fargate, Bedrock, and AWS services]

Container Architecture

The service uses a multi-container design in which each container has a dedicated responsibility:

Main Bot Container

  • Purpose: Teams integration, conversation orchestration, Bedrock AI
  • Port: 8000
  • Resources: 768 CPU units, 1536 MB memory
  • Key Features:
    • Microsoft Bot Framework integration
    • Amazon Bedrock (Claude) orchestration
    • Conversation state management
    • Multi-language support

MCP AWS API Container

  • Purpose: Secure AWS operations via Model Context Protocol
  • Port: 8080
  • Resources: 256 CPU units, 512 MB memory
  • Key Features:
    • AWS service abstractions
    • Permission-aware operations
    • Rate limiting and retry logic
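
The two allocations listed above add up to 1024 CPU units and 2048 MB, which corresponds to a 1 vCPU / 2 GB Fargate task size. The following is a minimal sketch of that arithmetic, assuming both containers run in the same task. The container names, the `task_size` helper, and the partial size table are illustrative and are not taken from the deployment templates.

```python
# Container reservations as listed above (CPU units, memory in MB).
# The dictionary keys are hypothetical names used only for this sketch.
CONTAINERS = {
    "main-bot": {"cpu": 768, "memory_mb": 1536, "port": 8000},
    "mcp-aws-api": {"cpu": 256, "memory_mb": 512, "port": 8080},
}

# Subset of the task-level CPU/memory combinations that Fargate accepts.
FARGATE_SIZES = {
    256: {512, 1024, 2048},
    512: {1024, 2048, 3072, 4096},
    1024: {2048, 3072, 4096, 5120, 6144, 7168, 8192},
}


def task_size(containers):
    """Sum per-container reservations into a task-level (cpu, memory_mb) pair."""
    cpu = sum(c["cpu"] for c in containers.values())
    memory_mb = sum(c["memory_mb"] for c in containers.values())
    return cpu, memory_mb


def is_valid_fargate_size(cpu, memory_mb):
    """Return True if the pair is one of the combinations in FARGATE_SIZES."""
    return memory_mb in FARGATE_SIZES.get(cpu, set())


cpu, memory_mb = task_size(CONTAINERS)
print(cpu, memory_mb, is_valid_fargate_size(cpu, memory_mb))  # 1024 2048 True
```

A check like this can be useful when adjusting one container's reservation, since the sum has to stay on a supported task size.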
    • Security-first design

🎯 Architecture Highlights

🚀 Fully Serverless

ECS Fargate + API Gateway eliminate infrastructure management overhead

  • Zero server maintenance - AWS handles all patching and scaling
  • Automatic scaling - Responds to demand without intervention
  • Pay-per-use pricing - Only pay for actual compute time
  • Note: ~30s cold start for new container instances

🔒 Security-First Design

Defense in depth with multiple security layers

  • Private subnets - Containers have no direct internet exposure
  • Isolated containers - Bot logic and AWS operations run separately
  • JWT validation - Lambda authorizer validates all requests
  • Secrets management - Credentials stored in AWS Secrets Manager
  • Least privilege IAM - Each component has minimal required permissions
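
To illustrate the JWT validation step, here is a rough sketch of a token-based Lambda authorizer that returns the policy document API Gateway expects. It is not the product's actual authorizer: `verify_signature` is a hypothetical callable standing in for real signature verification against the issuer's public keys, and the audience check assumes the token's `aud` claim should equal the bot's application ID.

```python
import time


def build_policy(principal_id, effect, method_arn):
    """Build the IAM policy document API Gateway expects from an authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": method_arn,
                }
            ],
        },
    }


def claims_are_acceptable(claims, expected_audience, now=None):
    """Check expiry and audience on claims whose signature is already verified."""
    now = time.time() if now is None else now
    return claims.get("aud") == expected_audience and claims.get("exp", 0) > now


def handler(event, context, verify_signature, expected_audience):
    """TOKEN-type authorizer entry point (sketch).

    verify_signature is a hypothetical callable: it must validate the JWT
    signature and return the claims dict, or raise ValueError on failure.
    """
    token = event.get("authorizationToken", "").removeprefix("Bearer ").strip()
    try:
        claims = verify_signature(token)
    except ValueError:
        return build_policy("anonymous", "Deny", event["methodArn"])
    effect = "Allow" if claims_are_acceptable(claims, expected_audience) else "Deny"
    return build_policy(claims.get("sub", "unknown"), effect, event["methodArn"])
```

In a deployed function the two extra parameters would have to be bound ahead of time (for example with `functools.partial` or module-level configuration), because Lambda only passes `event` and `context`.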

📦 Microservices Architecture

Multi-container pattern for better maintainability

  • Main bot container - Handles Teams interactions and AI orchestration
  • MCP AWS container - Provides secure AWS API access
  • Clear boundaries - Each container has a single responsibility
  • Independent updates - Deploy changes without affecting other components

💾 Stateless by Design

No persistent storage keeps the architecture simple

  • Reduced complexity - No database to manage or scale
  • Lower costs - No database charges or backup requirements
  • Horizontal scaling - Any container can handle any request
  • Trade-off: Conversation context resets on container restart
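
The trade-off above follows directly from where the context lives. As a minimal sketch, assuming conversation turns are held in process memory (the class and method names here are hypothetical, not the bot's actual code), a store might look like this:

```python
from collections import defaultdict, deque


class InMemoryConversationStore:
    """Keeps recent turns per conversation in process memory only."""

    def __init__(self, max_turns=20):
        # A bounded deque per conversation caps memory use; old turns fall off.
        self._turns = defaultdict(lambda: deque(maxlen=max_turns))

    def append(self, conversation_id, role, text):
        self._turns[conversation_id].append({"role": role, "text": text})

    def history(self, conversation_id):
        # .get() avoids creating an empty entry for unknown conversations.
        return list(self._turns.get(conversation_id, ()))


store = InMemoryConversationStore()
store.append("conv-1", "user", "list my instances")
print(len(store.history("conv-1")))  # 1

# A container restart is equivalent to building a fresh store: history is gone.
restarted = InMemoryConversationStore()
print(restarted.history("conv-1"))  # []
```

Because nothing is written to disk or to a database, there is nothing to back up, and nothing to restore after a restart.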

🌍 Regional Flexibility

Deploy anywhere with single-region stacks

  • Data sovereignty - Keep data in your required region
  • Low latency - Deploy close to your EC2 instances
  • Cost optimization - No cross-region data transfer fees
  • Simple disaster recovery - Deploy multiple independent stacks

⚡ High-Performance Networking

Optimized for Teams integration with enterprise-grade networking

  • Network Load Balancer - Layer 4 load balancing for minimal latency
  • VPC Link - Secure private connection from API Gateway
  • Auto-scaling - Network automatically handles traffic spikes
  • Health checks - Automatic failover for unhealthy containers
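
For a load balancer health check to work, each container needs an endpoint that answers quickly without touching downstream services. The sketch below uses only the Python standard library; the `/health` path and the response body are assumptions, since the actual endpoint paths are not documented here.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with a small JSON body; everything else is 404."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence per-request logging so health probes do not flood the logs.
        pass


def make_server(host="0.0.0.0", port=8000):
    """Create (but do not start) a server with the health handler attached."""
    return ThreadingHTTPServer((host, port), HealthHandler)
```

Calling `make_server().serve_forever()` would start it on port 8000, the main bot container's port; the MCP container would use 8080 instead.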

📊 Performance Characteristics

Response Times

  • Health Check: < 1 second
  • Simple Commands: 2-5 seconds
  • AI Analysis: 5-15 seconds
  • SSM Operations: 10-60 seconds (depending on command)
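
The wide range for SSM operations reflects the fact that a command is sent and then polled until it finishes. A rough sketch of such a polling loop with capped exponential backoff follows; `get_status` is a hypothetical callable that returns the command's current status string, and the 60-second timeout simply mirrors the upper bound quoted above.

```python
import time

# Statuses after which an SSM command invocation will not change again.
TERMINAL_STATES = {"Success", "Failed", "Cancelled", "TimedOut"}


def wait_for_command(get_status, timeout=60.0, initial_delay=1.0, max_delay=8.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until a terminal state or until the local timeout.

    Returns the terminal status, or "LocalTimeout" if the deadline passed
    while the command was still pending or in progress.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        remaining = deadline - clock()
        if remaining <= 0:
            return "LocalTimeout"
        # Never sleep past the deadline; double the delay up to max_delay.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

The `sleep` and `clock` parameters are injectable so the loop can be exercised in tests without real waiting.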

Throughput Limits

  • Concurrent Users: 1-20 (single task)
  • Commands/Day: 10,00+ (with proper scaling)
  • API Gateway: 10,000 requests/second (AWS default account-level quota)
  • Bedrock: 20 requests/minute per model (AWS quota; varies by model and Region)
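
Of these limits, the Bedrock quota is the tightest: 20 requests per minute works out to at most 28,800 model calls per day. One way to stay under it on the client side is a sliding-window limiter. The sketch below is illustrative only; the class name is hypothetical, and the defaults simply copy the figure above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allows at most max_calls within any rolling window of `window` seconds."""

    def __init__(self, max_calls=20, window=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window
        self._clock = clock
        self._calls = deque()

    def try_acquire(self):
        """Record a call and return True if it fits in the window, else False."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            self._calls.append(now)
            return True
        return False


# Upper bound on daily model calls implied by a 20/minute quota.
MAX_CALLS_PER_DAY = 20 * 60 * 24
print(MAX_CALLS_PER_DAY)  # 28800
```

A caller that gets `False` back can queue the request or tell the user to retry shortly, rather than letting the call fail with a throttling error.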

Scaling Behavior

  • Auto-scaling: ECS service runs a single task and replaces it automatically if it becomes unhealthy (auto-heal)
  • Cold start: ~30 seconds for new tasks

⚠️ Current Limitations

1. Session Management

  • Issue: No persistent conversation history
  • Impact: Context lost on container restart
  • Workaround: Keep conversations short and focused

2. Multi-Region Support

  • Issue: Single-region deployment only
  • Impact: No built-in disaster recovery
  • Workaround: Deploy multiple stacks in different regions

3. Cold Start Latency

  • Issue: 30+ second delay for new container starts
  • Impact: First request after idle period is slow
  • Workaround: Keep at least one task running at all times
  • Mitigation: ECS warmup targets available

🔒 Security Architecture

Network Security

  • Private Subnets: Containers have no direct internet access
  • Security Groups: Restrictive ingress/egress rules
  • VPC Endpoints: Secure access to AWS services

Authentication & Authorization

  • Teams Authentication: Microsoft Bot Framework JWT validation
  • AWS Permissions: IAM roles with least-privilege access
  • Inter-Container: Shared API key for MCP communication
  • Secrets: AWS Secrets Manager for sensitive data
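
For the shared-key check between containers, the comparison itself is worth doing in constant time so that response timing does not leak how much of a guessed key was correct. A minimal sketch, assuming the key travels in a request header; the `x-api-key` header name and the function name are hypothetical:

```python
import hmac


def is_authorized(headers, expected_key, header_name="x-api-key"):
    """Compare the presented key with the expected one in constant time."""
    presented = headers.get(header_name, "")
    # Reject outright if either side is empty, so a missing secret never matches.
    if not expected_key or not presented:
        return False
    return hmac.compare_digest(presented.encode("utf-8"),
                               expected_key.encode("utf-8"))
```

In this design the expected key would be read from AWS Secrets Manager at startup by both containers rather than baked into the image.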

Data Protection

  • Encryption in Transit: TLS 1.2+ for all communication
  • Encryption at Rest: EBS volumes encrypted by default
  • Logging: CloudWatch Logs with retention policies
  • Audit Trail: All AWS API calls logged via CloudTrail

📖 Technical References

Container Images

  • Registry: Amazon ECR
  • Repository: 709825985650.dkr.ecr.us-east-1.amazonaws.com/ohlala-automation-solutions/
  • Tags: Version-based (v1.0.0, v1.1.0, etc.)

Monitoring & Observability

  • Metrics: CloudWatch Container Insights
  • Logs: Structured JSON logging to CloudWatch
  • Health Checks: HTTP endpoints on both containers
  • Alarms: CPU, Memory, Error Rate monitoring
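
Structured JSON logging can be achieved with the standard `logging` module alone. The following sketch shows one possible formatter; the field names are assumptions and may not match the fields the containers actually emit.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level=logging.INFO):
    """Route all records through JsonFormatter on the default stream (stderr)."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)
```

One JSON object per line keeps each event as a single log entry, which makes it straightforward to filter on fields such as `level` in CloudWatch Logs.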

Backup & Recovery

  • Container Images: Immutable, versioned in ECR
  • Infrastructure: CloudFormation templates in version control
  • Configuration: Environment variables and secrets
  • No Persistent Data: Stateless design eliminates backup needs
