Monitoring

Overview

Modern Azure monitoring combines multiple services to provide comprehensive observability across your cloud infrastructure. This guide covers best practices, implementation patterns, and real-world scenarios for different technology stacks.

Core Monitoring Services

1. Azure Monitor

Central service for collecting all monitoring data:

  • Metrics

  • Logs

  • Distributed traces

  • Changes

  • Security events

# Terraform example
resource "azurerm_monitor_action_group" "critical" {
  name                = "critical-alerts"
  resource_group_name = azurerm_resource_group.monitoring.name
  short_name          = "critical"

  email_receiver {
    name          = "ops-team"
    email_address = "ops@example.com"
  }
}
// Bicep example
resource actionGroup 'Microsoft.Insights/actionGroups@2023-01-01' = {
  name: 'critical-alerts'
  location: 'global'
  properties: {
    groupShortName: 'critical'
    emailReceivers: [
      {
        name: 'ops-team'
        emailAddress: 'ops@example.com'
      }
    ]
  }
}

2. Log Analytics Workspace

Central log repository with advanced query capabilities:

# Azure CLI example
az monitor log-analytics workspace create \
  --resource-group monitoring-rg \
  --workspace-name central-logs \
  --location westeurope \
  --sku PerGB2018

Technology Stack-Specific Monitoring

1. Containerized Applications (AKS)

  • Container Insights

  • Prometheus integration

  • Grafana dashboards

resource "azurerm_monitor_diagnostic_setting" "aks" {
  name                       = "aks-diagnostics"
  target_resource_id        = azurerm_kubernetes_cluster.main.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.main.id

  log {
    category = "kube-apiserver"
    enabled  = true
  }
  
  metric {
    category = "AllMetrics"
    enabled  = true
  }
}

2. Serverless Applications

  • Application Insights

  • Function App monitoring

  • Distributed tracing

resource appInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: 'serverless-ai'
  location: resourceGroup().location
  kind: 'web'
  properties: {
    Application_Type: 'web'
    WorkspaceResourceId: logAnalytics.id
  }
}

3. Traditional VM-based Applications

  • VM Insights

  • Dependency monitoring

  • Performance metrics

az vm extension set \
  --resource-group myResourceGroup \
  --vm-name myVM \
  --name AzureMonitorLinuxAgent \
  --publisher Microsoft.Azure.Monitor \
  --version 1.0

Monitoring Patterns by Use Case

1. High-Availability Applications

  • Multi-region health checks

  • Load balancer metrics

  • Failover monitoring

resource "azurerm_monitor_metric_alert" "latency" {
  name                = "high-latency"
  resource_group_name = azurerm_resource_group.main.name
  scopes              = [azurerm_application_gateway.main.id]
  description         = "Alert when latency exceeds threshold"

  criteria {
    metric_namespace = "Microsoft.Network/applicationGateways"
    metric_name      = "BackendResponseLatency"
    aggregation      = "Average"
    operator         = "GreaterThan"
    threshold        = 100
  }

  action {
    action_group_id = azurerm_monitor_action_group.critical.id
  }
}

2. Security and Compliance

  • Microsoft Defender for Cloud integration

  • Regulatory compliance monitoring

  • Security Center alerts

3. Cost Optimization

  • Budget alerts

  • Resource utilization tracking

  • Anomaly detection

resource budgetAlert 'Microsoft.Consumption/budgets@2021-10-01' = {
  name: 'monthly-budget'
  properties: {
    amount: 1000
    category: 'Cost'
    timeGrain: 'Monthly'
    notifications: {
      actual_gt_90: {
        enabled: true
        operator: 'GreaterThan'
        threshold: 90
        contactEmails: [
          'finance@example.com'
        ]
      }
    }
  }
}

Best Practices for 2025

  1. Unified Observability

    • Centralize all monitoring in Log Analytics

    • Enable cross-service correlation

    • Implement distributed tracing

  2. Automated Response

    • Use Logic Apps for automated remediation

    • Implement scaling based on metrics

    • Auto-heal configuration

  3. AI-Powered Monitoring

    • Smart anomaly detection

    • Predictive alerts

    • LLM-based log analysis

  4. Cost-Effective Monitoring

    • Data retention policies

    • Sampling for high-volume telemetry

    • Targeted verbose monitoring

Common Pitfalls

  • Over-collection of logs

  • Alert fatigue

  • Missing end-to-end tracing

  • Inadequate retention policies

References

Monitoring Joke: Why did the Azure Monitor go to therapy? Because it had too many unresolved issues with attachment!

Last updated