Amazon CloudWatch

Overview

Amazon CloudWatch is the monitoring and observability service for AWS resources and applications. It collects metrics, logs, and events in near real time and exposes them through dashboards and alarms.

Real-life Use Cases

  • Cloud Architect: Design dashboards for multi-account monitoring.

  • DevOps Engineer: Set up alarms for auto-scaling and incident response.
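
As a rough illustration of the dashboard use case, the Terraform sketch below defines a dashboard with a single metric widget. The dashboard name, region, and instance ID are placeholders rather than values from this document. A multi-account setup would also need cross-account observability configured, which is not shown here.

```hcl
# Hypothetical single-widget dashboard; names and IDs below are placeholders.
resource "aws_cloudwatch_dashboard" "overview" {
  dashboard_name = "app-overview"

  # The dashboard body is a JSON document; jsonencode builds it from HCL.
  dashboard_body = jsonencode({
    widgets = [
      {
        type   = "metric"
        x      = 0
        y      = 0
        width  = 12
        height = 6
        properties = {
          title  = "EC2 CPU utilization"
          region = "us-east-1"
          stat   = "Average"
          period = 300
          metrics = [
            ["AWS/EC2", "CPUUtilization", "InstanceId", "i-0123456789abcdef0"]
          ]
        }
      }
    ]
  })
}
```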

Terraform Example

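resource "aws_cloudwatch_log_group" "app_logs" {
  name = "/aws/app/logs"

  # Assumed value; without retention_in_days, log events are kept indefinitely.
  retention_in_days = 30
}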

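resource "aws_cloudwatch_metric_alarm" "cpu_alarm" {
  alarm_name          = "high-cpu"
  alarm_description   = "Average CPU utilization above 80% for two consecutive 2-minute periods"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120 # seconds; assumes detailed (1-minute) monitoring on the instance
  statistic           = "Average"
  threshold           = 80 # percent

  # Placeholder instance ID; without a dimension the alarm matches no EC2 metric.
  dimensions = {
    InstanceId = "i-0123456789abcdef0"
  }

  # Same SNS topic as the CLI example; without an action the alarm notifies no one.
  actions_enabled = true
  alarm_actions   = ["arn:aws:sns:us-east-1:123456789012:NotifyMe"]
}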

AWS CLI Example

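# The instance ID is a placeholder; replace it with the instance to monitor.
aws cloudwatch put-metric-alarm \
  --alarm-name high-cpu \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Average \
  --period 120 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 2 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:NotifyMe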

Best Practices

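  • Centralize logs in consistently named log groups, for example one per application and environment.

  • Use metric filters to turn log patterns into custom metrics.

  • Set actionable alarms: each alarm should map to a clear response.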
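
The metric filter practice above can be sketched in Terraform as follows. This is a minimal example under assumptions: the filter name, the "ERROR" pattern, and the "App/Logs" namespace are hypothetical, and the filter attaches to the app_logs log group from the Terraform example.

```hcl
# Counts log events containing "ERROR" and publishes them as a custom metric.
# The filter name, pattern, and metric namespace are illustrative assumptions.
resource "aws_cloudwatch_log_metric_filter" "error_count" {
  name           = "app-error-count"
  pattern        = "ERROR"
  log_group_name = aws_cloudwatch_log_group.app_logs.name

  metric_transformation {
    name      = "ErrorCount"
    namespace = "App/Logs"
    value     = "1" # each matching log event adds 1 to the metric
  }
}
```

An alarm on ErrorCount in the App/Logs namespace can then be defined the same way as the CPU alarm.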

Common Pitfalls

  • Not setting a retention policy: log groups keep events indefinitely by default, so storage costs keep growing.

  • Creating too many alarms, which causes alert fatigue.
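
One possible way to reduce alert fatigue is a composite alarm that notifies only when several underlying alarms fire together. The sketch below assumes a second alarm named "high-error-rate" already exists; that name and the composite alarm's name are hypothetical, while "high-cpu" and the SNS topic come from the examples above.

```hcl
# Notifies only when both child alarms are in ALARM state at the same time.
# "high-error-rate" is a hypothetical alarm assumed to exist already.
resource "aws_cloudwatch_composite_alarm" "app_degraded" {
  alarm_name        = "app-degraded"
  alarm_description = "High CPU together with a high error rate"

  alarm_rule = "ALARM(\"high-cpu\") AND ALARM(\"high-error-rate\")"

  alarm_actions = ["arn:aws:sns:us-east-1:123456789012:NotifyMe"]
}
```

With this pattern, the child alarms would typically have their own notification actions removed so that only the composite alarm pages someone.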

Joke: Why did CloudWatch break up with EC2? Too many metrics, not enough commitment!
