Monitoring & Observability
Set up comprehensive monitoring, logging, and alerting for production deployments.
Key Metrics to Monitor
Application
- Request rate
- Response time
- Error rate
- Queue length
System
- CPU usage
- Memory usage
- Disk I/O
- Network I/O
Database
- Query performance
- Connection pool
- Storage usage
- Replication lag
Business
- API usage
- Token consumption
- Cost per request
- Cache hit rate
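Several of the application metrics above are usually derived from raw counters rather than exported directly. As a rough sketch, the Prometheus recording rules below precompute request rate and error ratio. They assume the application exposes the http_requests_total counter with a status label, as the alerting rules later on this page do; the file name, group name, and recorded series names are illustrative choices, not part of CognitiveX.

```yaml
# recording-rules.yml (hypothetical file name)
groups:
  - name: cognitivex_recording   # illustrative group name
    rules:
      # Request rate: requests per second, averaged over 5 minutes
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
      # Error ratio: share of requests answered with a 5xx status
      - record: job:http_requests_errors:ratio_rate5m
        expr: |
          sum by (job) (rate(http_requests_total{status=~"5.."}[5m]))
          /
          sum by (job) (rate(http_requests_total[5m]))
```

A ratio is often easier to alert on than an absolute count because it stays meaningful as traffic grows.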
Prometheus Setup
prometheus.yml
```yaml
scrape_configs:
  - job_name: 'cognitivex'
    static_configs:
      - targets: ['localhost:3000']
    metrics_path: '/metrics'
    scrape_interval: 15s
```
Grafana Dashboards
Import pre-built Grafana dashboards for CognitiveX monitoring:
- Application Performance Dashboard
- System Resources Dashboard
- Database Metrics Dashboard
- Business Metrics Dashboard
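Besides importing dashboards through the Grafana UI, they can be loaded from disk at startup through Grafana's file-based provisioning. A minimal sketch, assuming the dashboard JSON files have already been saved to /var/lib/grafana/dashboards; the provider name, folder name, and path are placeholders, not CognitiveX defaults.

```yaml
# Placed under Grafana's provisioning directory,
# e.g. provisioning/dashboards/cognitivex.yml (hypothetical file name)
apiVersion: 1
providers:
  - name: 'cognitivex'          # hypothetical provider name
    folder: 'CognitiveX'        # Grafana folder the dashboards appear in
    type: file
    disableDeletion: false
    updateIntervalSeconds: 30   # how often Grafana rescans the path
    options:
      path: /var/lib/grafana/dashboards   # assumed location of the JSON files
```

Provisioning keeps the dashboards in version control alongside the rest of the deployment configuration, which a manual import does not.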
Centralized Logging
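The Compose file that follows mounts ./promtail-config.yml into the Promtail container, but the contents of that file are not shown. Here is a minimal sketch of what it might contain, assuming Loki is reachable under the Compose service name loki and that every file matching /var/log/*.log should be shipped; the job and label names are illustrative.

```yaml
# promtail-config.yml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml   # where Promtail records how far it has read

clients:
  - url: http://loki:3100/loki/api/v1/push   # Loki push endpoint (service name assumed)

scrape_configs:
  - job_name: system
    static_configs:
      - targets: [localhost]
        labels:
          job: varlogs              # illustrative label
          __path__: /var/log/*.log  # files to tail inside the container
```

Because the positions file lives in /tmp here, it is lost when the container is recreated; mounting a volume for it avoids re-reading logs after a restart.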
docker-compose.logging.yml
```yaml
version: '3.8'
services:
  loki:
    image: grafana/loki:latest
    ports:
      - "3100:3100"
  promtail:
    image: grafana/promtail:latest
    volumes:
      - /var/log:/var/log
      - ./promtail-config.yml:/etc/promtail/config.yml
```
Alerting Rules
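Prometheus evaluates alerting rules only from files listed under rule_files in its main configuration, and it needs an Alertmanager target to deliver notifications. A sketch of the corresponding prometheus.yml additions, assuming the rules in this section are saved as alerts.yml alongside prometheus.yml and that Alertmanager listens on its default port 9093 on the same host (both are assumptions, not CognitiveX defaults):

```yaml
# Additions to prometheus.yml
rule_files:
  - 'alerts.yml'            # hypothetical file holding the rule groups

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']   # assumed Alertmanager address
```

Running promtool check rules alerts.yml before reloading Prometheus catches syntax errors early.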
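```yaml
groups:
  - name: cognitivex
    rules:
      # Fires when HTTP 500 responses exceed 0.05 per second per series
      # (an absolute rate, not a share of traffic) for 5 minutes
      - alert: HighErrorRate
        expr: rate(http_requests_total{status="500"}[5m]) > 0.05
        for: 5m
        annotations:
          summary: "High error rate detected"
      # Fires when memory usage stays above 90% for 5 minutes
      - alert: HighMemoryUsage
        expr: memory_usage_percent > 90
        for: 5m
        annotations:
          summary: "Memory usage above 90%"
```
Best Practices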
- Set up alerts for critical metrics
- Use structured logging (JSON format)
- Implement distributed tracing
- Monitor both technical and business metrics
- Set up dashboards for different teams
- Review metrics and alerts regularly
- Document runbooks for common issues
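Some of these practices can be encoded directly in the alert definitions. As an illustration, the HighMemoryUsage rule from the Alerting Rules section could carry a severity label, a team label for routing, and a link to its runbook. The label values and the URL are placeholders; runbook_url is a widely used annotation convention rather than something Prometheus requires.

```yaml
groups:
  - name: cognitivex
    rules:
      - alert: HighMemoryUsage
        expr: memory_usage_percent > 90
        for: 5m
        labels:
          severity: warning     # illustrative value, usable for Alertmanager routing
          team: platform        # hypothetical owning team
        annotations:
          summary: "Memory usage above 90%"
          runbook_url: "https://example.com/runbooks/high-memory-usage"   # placeholder URL
```

Keeping the runbook link next to the rule means the responder sees it in the notification itself, without having to search for it.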