Securing Kubernetes at enterprise scale requires a comprehensive approach spanning infrastructure, workloads, data, and access controls. This guide outlines security best practices and compliance strategies for production Kubernetes environments.
Defense-in-Depth Security Model
Enterprise Kubernetes security follows a layered approach:
┌─────────────────────────────────────────────┐
│       Cluster Infrastructure Security       │
├─────────────────────────────────────────────┤
│      Kubernetes Control Plane Security      │
├─────────────────────────────────────────────┤
│       Network Security & Segmentation       │
├─────────────────────────────────────────────┤
│    Workload Security (Pods & Containers)    │
├─────────────────────────────────────────────┤
│      Data Security & Secrets Management     │
├─────────────────────────────────────────────┤
│    Authentication & Authorization (IAM)     │
├─────────────────────────────────────────────┤
│          Audit Logging & Monitoring         │
├─────────────────────────────────────────────┤
│           Compliance & Governance           │
└─────────────────────────────────────────────┘
Cluster Infrastructure Hardening
Private Cluster Architecture
Implement security best practices at the infrastructure level:
┌─────────────────────────────────────────────────────────┐
│ Private VPC/VNET │
│ │
│ ┌─────────────────┐ │
│ │ Bastion Host/ │ │
│ │ VPN Gateway │ │
│ └────────┬────────┘ │
│ │ │
│ ┌────────▼─────────────────────────────────────┐ │
│ │ Private Kubernetes Cluster │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ Control Plane│ │ Worker Nodes │ │ │
│ │ └──────────────┘ └──────────────┘ │ │
│ └──────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘
Cloud-Specific Recommendations
AWS EKS:
Enable envelope encryption of EKS secrets using AWS KMS
Use Security Groups to restrict traffic between nodes
Implement private endpoint access for the EKS API
Use EC2 instances with IMDSv2 for node groups
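Several of these controls can be declared up front in an eksctl cluster definition; a minimal sketch (the cluster name, region, and KMS key ARN below are placeholders, not values from this guide):
# Illustrative eksctl ClusterConfig: private networking, secret encryption, IMDSv2
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: secure-cluster
  region: us-east-1
privateCluster:
  enabled: true                      # private-only API endpoint
secretsEncryption:
  keyARN: arn:aws:kms:us-east-1:111122223333:key/example-key-id   # envelope encryption with KMS
managedNodeGroups:
  - name: workers
    instanceType: m5.large
    privateNetworking: true          # nodes in private subnets
    disableIMDSv1: true              # require IMDSv2 on node instances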
Azure AKS:
Deploy AKS with Azure Private Link
Implement Azure Service Endpoints for service connections
Use Azure Policy for AKS security controls
Enable Microsoft Defender for Containers (the successor to Azure Defender for Kubernetes)
Google GKE:
Deploy private GKE clusters
Use VPC Service Controls to restrict API access
Enable Shielded GKE Nodes
Implement Binary Authorization
Kubernetes-Native Security Controls
Pod Security Standards
Enforce pod security using the built-in Pod Security Standards:
apiVersion: v1
kind: Namespace
metadata:
  name: restricted-namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
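With the restricted profile enforced, the admission controller should reject privileged workloads in that namespace. One way to spot-check this is a server-side dry run, which evaluates admission without creating anything:
# Expect rejection: a privileged pod violates the "restricted" profile
cat <<EOF | kubectl apply --dry-run=server -f -
apiVersion: v1
kind: Pod
metadata:
  name: pss-test
  namespace: restricted-namespace
spec:
  containers:
    - name: test
      image: busybox
      securityContext:
        privileged: true
EOF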
Policy Enforcement with OPA Gatekeeper
Deploy policy guardrails with OPA Gatekeeper:
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-team-label
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["team", "environment", "application"]
Image Scanning and Admission Control
apiVersion: v1
kind: ConfigMap
metadata:
  name: trivy-operator-policies
  namespace: trivy-system
data:
  policy.yaml: |
    package trivy

    deny[msg] {
      input.vulnerabilities[_].Severity == "CRITICAL"
      msg := "Critical vulnerabilities are not allowed"
    }

    deny[msg] {
      input.securityIssues[_].Severity == "HIGH"
      msg := "Images with high security issues are not allowed"
    }
Network Security
Core Network Security Components
┌──────────────────────────────────────────┐
│ Egress Firewall │
└──────────────────┬───────────────────────┘
│
┌──────────────────▼───────────────────────┐
│ Ingress Controller │
│ with WAF/DDoS │
└──────────────────┬───────────────────────┘
│
┌──────────────────▼───────────────────────┐
│ Service Mesh │
│ (East-West Traffic Control) │
└──────────────────┬───────────────────────┘
│
┌──────────────────▼───────────────────────┐
│ Network Policies │
│ (Pod-level Firewalls) │
└──────────────────────────────────────────┘
Network Policy Implementation
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              purpose: database
          podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
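Targeted policies like this are most effective on top of a namespace-wide default-deny baseline, so that any traffic not explicitly allowed is dropped; a minimal sketch:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}        # selects every pod in the namespace
  policyTypes:
    - Ingress
    - Egress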
Service Mesh Security (Istio Example)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-specific-methods
  namespace: production
spec:
  selector:
    matchLabels:
      app: backend
  action: ALLOW
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/default/sa/frontend-service-account"]
      to:
        - operation:
            methods: ["GET", "POST"]
            paths: ["/api/v1/*"]
Secret Management
External Secret Management Integration
# Using External Secrets Operator with AWS Secrets Manager
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: "15m"
  secretStoreRef:
    name: aws-secretsmanager
    kind: ClusterSecretStore
  target:
    name: database-credentials
  data:
    - secretKey: username
      remoteRef:
        key: production/database
        property: username
    - secretKey: password
      remoteRef:
        key: production/database
        property: password
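The ExternalSecret above references a ClusterSecretStore named aws-secretsmanager, which must exist separately; a minimal sketch (the region and the IRSA-enabled service account name are assumptions):
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secretsmanager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa        # assumed IRSA service account
            namespace: external-secrets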
Sealed Secrets for GitOps
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: api-key
  namespace: production
spec:
  encryptedData:
    api-key: AgBy8hCNjjSa...truncated...P8kQ9H3mAyxF3A
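The encrypted payload is produced offline with kubeseal, so only the sealed form ever reaches Git; a typical workflow (the secret value is a placeholder):
# Create the Secret locally, seal it with the controller's public key,
# and commit only the resulting SealedSecret manifest
kubectl create secret generic api-key -n production \
  --from-literal=api-key=REPLACE_ME --dry-run=client -o yaml \
  | kubeseal --format yaml > sealed-api-key.yaml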
Authentication & Authorization
RBAC Implementation Best Practices
Principle of Least Privilege:
# Team-specific role with limited permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team-a
  name: team-a-developer
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "secrets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Bind role to team group from identity provider
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-developers
  namespace: team-a
subjects:
  - kind: Group
    name: "ad:team-a-developers"
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: team-a-developer
  apiGroup: rbac.authorization.k8s.io
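Bindings like this are easy to verify with impersonation before rollout; for example (the user name is illustrative, the group matches the binding above):
# Should be allowed inside the team namespace
kubectl auth can-i create deployments -n team-a \
  --as=dev-user --as-group=ad:team-a-developers

# Should be denied: no cluster-scoped permissions were granted
kubectl auth can-i create clusterrolebindings \
  --as=dev-user --as-group=ad:team-a-developers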
SSO Integration with OIDC
# Using OIDC with Azure AD for AKS
apiVersion: v1
kind: ConfigMap
metadata:
  name: azure-ad-oidc-config
  namespace: kube-system
data:
  oidc-client-id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
  oidc-issuer-url: "https://sts.windows.net/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/"
  oidc-username-claim: "email"
  oidc-groups-claim: "groups"
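On self-managed control planes these values correspond to the kube-apiserver OIDC flags; managed services such as AKS, EKS, and GKE expose equivalent settings through the provider instead. Roughly (tenant and client IDs are placeholders):
kube-apiserver \
  --oidc-issuer-url=https://sts.windows.net/<tenant-id>/ \
  --oidc-client-id=<client-id> \
  --oidc-username-claim=email \
  --oidc-groups-claim=groups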
Audit Logging & Monitoring
Enhanced Audit Policy
apiVersion: audit.k8s.io/v1
kind: Policy
# Rules are matched in order and the first match sets the audit level,
# so the specific rules must precede the catch-all rule.
rules:
  # Log pod changes at the RequestResponse level
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods"]
  # Log authentication resources at the RequestResponse level
  - level: RequestResponse
    resources:
      - group: "authentication.k8s.io"
        resources: ["*"]
  # Log all other requests at the Metadata level
  - level: Metadata
    # Skip the RequestReceived stage for long-running requests such as watches
    omitStages:
      - "RequestReceived"
Security-Focused Monitoring
# Falco security monitoring
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: falco
  namespace: security
spec:
  selector:
    matchLabels:
      app: falco
  template:
    metadata:
      labels:
        app: falco
    spec:
      containers:
        - name: falco
          image: falcosecurity/falco:latest   # pin a specific version in production
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /host/var/run/docker.sock
              name: docker-socket
            - mountPath: /host/dev
              name: dev-fs
            - mountPath: /host/proc
              name: proc-fs
              readOnly: true
            - mountPath: /host/boot
              name: boot-fs
              readOnly: true
            - mountPath: /host/lib/modules
              name: lib-modules
              readOnly: true
            - mountPath: /host/usr
              name: usr-fs
              readOnly: true
            - mountPath: /etc/falco
              name: falco-config
      volumes:
        - name: docker-socket
          hostPath:
            path: /var/run/docker.sock
        - name: dev-fs
          hostPath:
            path: /dev
        - name: proc-fs
          hostPath:
            path: /proc
        - name: boot-fs
          hostPath:
            path: /boot
        - name: lib-modules
          hostPath:
            path: /lib/modules
        - name: usr-fs
          hostPath:
            path: /usr
        - name: falco-config
          configMap:
            name: falco-config
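The DaemonSet mounts a falco-config ConfigMap that is not shown here. An illustrative sketch of a custom rules file it might carry follows; the rule is an example only, and a complete /etc/falco mount would also need falco.yaml and the default rule set:
apiVersion: v1
kind: ConfigMap
metadata:
  name: falco-config
  namespace: security
data:
  falco_rules.local.yaml: |
    # Illustrative custom rule: alert on interactive shells inside containers
    - rule: Shell spawned in container
      desc: An interactive shell was started inside a container
      condition: spawned_process and container and proc.name in (bash, sh, zsh)
      output: "Shell spawned in container (user=%user.name container=%container.name command=%proc.cmdline)"
      priority: WARNING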
Compliance Automation
Continuous Compliance Validation
# Kyverno policy for PCI-DSS compliance
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: pci-dss-restricted
spec:
  validationFailureAction: Enforce
  rules:
    - name: host-path-volumes-restricted
      match:
        resources:
          kinds:
            - Pod
      validate:
        message: "Host path volumes are not allowed for PCI-DSS compliance"
        pattern:
          spec:
            =(volumes):
              - X(hostPath): "null"
    - name: require-pod-probes
      match:
        resources:
          kinds:
            - Deployment
      validate:
        message: "Liveness and readiness probes are required"
        pattern:
          spec:
            template:
              spec:
                containers:
                  - livenessProbe:
                      =(httpGet):
                        path: "*"
                        port: "*"
                    readinessProbe:
                      =(httpGet):
                        path: "*"
                        port: "*"
Compliance Scanning and Reporting
# Trivy Operator configuration for vulnerability reporting
apiVersion: aquasecurity.github.io/v1alpha1
kind: VulnerabilityReport
metadata:
  name: app-vulnerability-report
  namespace: compliance
spec:
  target:
    resource:
      apiVersion: apps/v1
      kind: Deployment
      name: web-application
      namespace: production
  scanner:
    name: Trivy
    parameters:
      - name: "ignoreUnfixed"
        value: "true"
      - name: "severity"
        value: "CRITICAL,HIGH"
  schedule: "0 */6 * * *" # Every 6 hours
Disaster Recovery & Security Incident Response
Security Incident Response Plan
Detection: Monitor security alerts from:
Container runtime security tools (Falco)
Cloud provider security services
Containment:
# Isolate the compromised namespace
kubectl label namespace compromised security=isolated
kubectl apply -f isolate-network-policy.yaml

# Force restart suspicious pods
kubectl delete pod suspicious-pod-xyz -n compromised

# Revoke the compromised service account's tokens by deleting and recreating it
kubectl delete serviceaccount compromised-sa -n compromised
kubectl create serviceaccount compromised-sa -n compromised
Eradication & Recovery:
# Rotate credentials
kubectl create secret generic app-credentials \
  --from-literal=password=$(openssl rand -base64 32) \
  -n compromised --dry-run=client -o yaml | kubectl apply -f -

# Apply updated security policies
kubectl apply -f updated-pod-security-policies.yaml

# Restore from known good state
kubectl apply -f https://gitops-repo/known-good-state.yaml
Post-Incident Analysis:
Forensic analysis of compromised containers
Root cause identification and remediation
Cloud-Specific Compliance Controls
AWS EKS Compliance
Requirement | Implementation
Audit logging | AWS CloudTrail + EKS audit logs to CloudWatch
Encryption at rest | EBS encryption with KMS for PVs
Network segmentation | Security Groups, NACLs, and K8s NetworkPolicies
Vulnerability & threat detection | Amazon Inspector + ECR image scanning
Compliance monitoring | AWS Config Rules + AWS Security Hub
Azure AKS Compliance
Requirement | Implementation
Audit logging | Azure Monitor + AKS diagnostic settings
Encryption at rest | Azure Disk Encryption + Azure Key Vault
Network segmentation | NSGs, Azure Firewall, and K8s NetworkPolicies
Vulnerability & threat detection | Microsoft Defender for Containers
Compliance monitoring | Azure Policy for AKS + Azure Security Center
GCP GKE Compliance
Requirement | Implementation
Audit logging | Cloud Audit Logs + GKE audit logging
Encryption at rest | Application-layer encryption with Cloud KMS
Network segmentation | VPC Firewalls and K8s NetworkPolicies
Vulnerability & threat detection | GKE container threat detection + Binary Authorization
Compliance monitoring | Security Command Center + Compliance Reports