Service Mesh

A service mesh is an infrastructure layer that manages service-to-service communication in a microservices architecture. It provides traffic management, security, observability, and reliability features without requiring changes to application code.


Why Use a Service Mesh?

  • Traffic Management: Fine-grained control over routing, retries, timeouts, and circuit breaking

  • Security: mTLS encryption, service authentication, and policy enforcement

  • Observability: Distributed tracing, metrics, and logging for all service traffic

  • Reliability: Automatic retries, failover, and health checks

  • Zero-Trust Networking: Enforce least-privilege and secure-by-default communication
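
As an illustration of the traffic-management point, the sketch below shows how retries and timeouts might be declared in Istio, one of the meshes listed later on this page. It assumes the Bookinfo sample's `ratings` service; the timeout and retry values are placeholders, not recommendations.

```yaml
# Sketch only: assumes Istio is installed and a Service named "ratings" exists.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - route:
    - destination:
        host: ratings
    timeout: 10s                       # overall deadline for the request, retries included
    retries:
      attempts: 3                      # retry a failed call up to three times
      perTryTimeout: 2s                # deadline for each individual attempt
      retryOn: 5xx,connect-failure     # conditions that trigger a retry
```

The policy lives in the mesh configuration, so the application itself needs no retry logic.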


Pros and Cons

Pros

  • Enhanced security (mTLS, RBAC)

  • Consistent traffic policies

  • Deep observability and tracing

  • Platform-agnostic (multi-cloud)

  • Enables progressive delivery (canary, blue/green)

Cons

  • Added complexity and resource overhead

  • Steep learning curve for teams

  • May impact latency/performance

  • Debugging can be harder
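
To make the progressive-delivery point concrete, here is a minimal sketch of a weighted canary split in Istio. It assumes the Bookinfo `reviews` service with pods labeled `version: v1` and `version: v2`; the 90/10 split is an arbitrary starting point.

```yaml
# Sketch only: assumes "reviews" pods carry version: v1 / version: v2 labels.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:                 # named groups of pods, selected by label
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90           # stable version keeps most traffic
    - destination:
        host: reviews
        subset: v2
      weight: 10           # canary receives a small share
```

Shifting the weights step by step, and reverting them if error rates rise, is the usual way to promote or roll back the canary.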


Popular Service Meshes

  • Istio (open source, runs on any conformant Kubernetes cluster, including GKE, AKS, and EKS)

  • Linkerd (lightweight, easy to install, CNCF project)

  • Consul Connect (HashiCorp, integrates with VMs and Kubernetes)

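  • AWS App Mesh (managed for EKS, ECS, and EC2; AWS has announced end of support on September 30, 2026)

  • Istio-based service mesh add-on for AKS (Azure, managed Istio)

  • Cloud Service Mesh (GCP, managed Istio; formerly Anthos Service Mesh)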


Example: Installing Istio on Kubernetes (Cloud-Agnostic)

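# Download the latest Istio release and put istioctl on the PATH
curl -L https://istio.io/downloadIstio | sh -
cd istio-*
export PATH="$PWD/bin:$PATH"
# Install with the demo profile (intended for evaluation, not production)
istioctl install --set profile=demo -y
# Enable automatic sidecar injection for the default namespace
kubectl label namespace default istio-injection=enabled

  • For AKS: Use Azure CLI to create the cluster, then follow the above steps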

  • For EKS: Use AWS CLI and eksctl to create the cluster, then follow the above steps

  • For GKE: Use gcloud to create the cluster, then follow the above steps
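
The demo profile used above is meant for evaluation. For a longer-lived cluster, one option is to describe the installation declaratively and keep it in version control. The following is a rough sketch of such a file; the resource name `mesh-install` and the access-log setting are illustrative assumptions.

```yaml
# Sketch only: install with `istioctl install -f <this-file>`.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: mesh-install             # hypothetical name
  namespace: istio-system
spec:
  profile: default               # Istio's baseline profile, rather than demo
  meshConfig:
    accessLogFile: /dev/stdout   # optional: write Envoy access logs to stdout
```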


Example: Deploying a Sample App with Istio

kubectl apply -f samples/bookinfo/platform/kube/bookinfo.yaml
kubectl apply -f samples/bookinfo/networking/bookinfo-gateway.yaml

Access the app at /productpage through the Istio ingress gateway (see the Istio docs for cloud-specific instructions on finding the gateway address).


Example: Enabling mTLS for All Services

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
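
Strict mTLS gives every workload a verified identity, which authorization policies can then build on. As a hedged sketch, the policy below would allow only the Bookinfo productpage service account to send GET requests to `reviews`. It assumes the Bookinfo sample is deployed in the `default` namespace with its stock service accounts and the default `cluster.local` trust domain.

```yaml
# Sketch only: assumes Bookinfo in "default" and trust domain cluster.local.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: reviews-viewer
  namespace: default
spec:
  selector:
    matchLabels:
      app: reviews               # the policy applies to the reviews workloads
  action: ALLOW
  rules:
  - from:
    - source:
        # identity taken from the caller's mTLS certificate
        principals: ["cluster.local/ns/default/sa/bookinfo-productpage"]
    to:
    - operation:
        methods: ["GET"]
```

Once an ALLOW policy selects a workload, requests that match none of its rules are denied, so this also blocks other callers of `reviews`.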

Best Practices (2025)

  • Start small (e.g., Linkerd, or Istio's demo profile for evaluation only) and enable features incrementally

  • Use GitOps (ArgoCD, Flux) to manage mesh configuration and CRDs

  • Monitor mesh health with Prometheus, Grafana, and Jaeger

  • Use LLMs (Copilot, Claude) to draft and review mesh policies and manifests, and validate the output (for example, with istioctl analyze) before applying it

  • Document mesh usage and onboarding for your team
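
For the GitOps practice above, an Argo CD `Application` along these lines could keep mesh configuration in sync with Git. The repository URL, path, and application name are hypothetical placeholders, and Argo CD is assumed to be installed in the `argocd` namespace.

```yaml
# Sketch only: repoURL, path, and names are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: mesh-config
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/your-org/mesh-config.git   # hypothetical repository
    targetRevision: main
    path: istio                                             # hypothetical directory of mesh manifests
  destination:
    server: https://kubernetes.default.svc                  # the cluster Argo CD runs in
    namespace: istio-system
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual changes made directly in the cluster
```

With `selfHeal` enabled, changes made outside Git are reverted automatically, which limits configuration drift.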

Common Pitfalls

  • Overcomplicating the mesh with too many features at once

  • Not monitoring mesh resource usage (can impact cluster performance)

  • Failing to secure the mesh dashboard and control plane

  • Manual changes outside Git (causes drift in GitOps setups)
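
To keep sidecar resource usage visible and bounded, Istio lets you set proxy requests and limits per workload through pod annotations. The Deployment below is a hypothetical example: the app name, image, and resource values are placeholders to adjust after measuring real usage.

```yaml
# Sketch only: app name, image, and resource values are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
      annotations:
        sidecar.istio.io/proxyCPU: "100m"          # CPU request for the sidecar
        sidecar.istio.io/proxyMemory: "128Mi"      # memory request for the sidecar
        sidecar.istio.io/proxyCPULimit: "500m"     # CPU limit for the sidecar
        sidecar.istio.io/proxyMemoryLimit: "256Mi" # memory limit for the sidecar
    spec:
      containers:
      - name: app
        image: example.com/example-app:1.0         # hypothetical image
```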

