Troubleshooting Guide
Shell-Based Diagnostics
Process Investigation
# Advanced process tree analysis with resource usage
ps auxf | awk '{if($3>0.0 || $4>0.0) print $0}'
# Find processes causing high I/O
iotop -o -b -n 2
# Trace system calls with stack traces
perf trace -p $(pgrep process_name) -s
# Advanced strace filtering
strace -e trace=network,ipc -f -p $(pgrep process_name)System Performance
# Quick performance profile using perf
perf record -F 99 -a -g -- sleep 30
perf report --stdio
# Memory leak investigation
valgrind --leak-check=full --show-leak-kinds=all ./program
# System-wide performance snapshot
sudo sysdig -c spectrogram 'evt.type=switch and evt.dir=>>'Network Diagnostics
Advanced Network Troubleshooting
Container Networking
Cloud Platform Issues
AWS Troubleshooting
Azure Diagnostics
Container Orchestration
Kubernetes Deep Dive
Performance Analysis
Resource Profiling
AI/LLM Integration
Using AI for Troubleshooting
GitHub Copilot CLI
OpenAI API Integration
Using Ollama Locally
Quick Reference: One-Liners
System Analysis
Network Debugging
Container Management
Best Practices
Always maintain a local troubleshooting toolkit:
Set up persistent debug aliases:
Use structured logging:
Automation Tips
Create Self-Documenting Scripts
Monitoring Setup
Remember to regularly update this guide as new tools and techniques emerge.
Last updated