deploying-monitoring-stacks
About
This skill generates production-ready configurations for deploying monitoring stacks like Prometheus, Grafana, and Datadog. Use it when you need to set up metric collection, visualization dashboards, and alerting rules. It provides infrastructure-aware configurations for Kubernetes, Docker, or bare metal environments.
Quick Install
Claude Code
Recommended:
/plugin add https://github.com/jeremylongshore/claude-code-plugins-plus
Or clone manually:
git clone https://github.com/jeremylongshore/claude-code-plugins-plus.git ~/.claude/skills/deploying-monitoring-stacks
Copy and paste the recommended command into Claude Code to install this skill.
Documentation
Prerequisites
Before using this skill, ensure:
- Target infrastructure is identified (Kubernetes, Docker, bare metal)
- Metric endpoints are accessible from the monitoring platform
- Storage backend is configured for time-series data
- Alert notification channels are defined (email, Slack, PagerDuty)
- Resource requirements are calculated based on scale
Instructions
- Select Platform: Choose Prometheus/Grafana, Datadog, or a hybrid approach
- Deploy Collectors: Install exporters and agents on monitored systems
- Configure Scraping: Define metric collection endpoints and intervals
- Set Up Storage: Configure retention policies and data compaction
- Create Dashboards: Build visualization panels for key metrics
- Define Alerts: Create alerting rules with appropriate thresholds
- Test Monitoring: Verify metrics flow and alert triggering
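As an illustration of the "Define Alerts" step, a Prometheus rules file might look like the sketch below. The group name, alert names, thresholds, and `for` durations are assumptions chosen for the example, not values produced by this skill; tune them to your workload.

```yaml
# alerts.yml (hypothetical file name; reference it from rule_files in prometheus.yml)
groups:
  - name: example-alerts            # hypothetical group name
    rules:
      # Fires when a scrape target has been unreachable for 5 minutes.
      - alert: TargetDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'Target {{ $labels.instance }} is down'
          description: 'Job {{ $labels.job }} has failed its scrapes for 5 minutes.'
      # Fires when a pod sustains more than 0.9 CPU cores for 10 minutes
      # (assumed threshold; uses the same metric as the CPU dashboard panel).
      - alert: HighContainerCpu
        expr: sum by (namespace, pod) (rate(container_cpu_usage_seconds_total[5m])) > 0.9
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: 'Pod {{ $labels.pod }} is using more than 0.9 CPU cores'
```

Prometheus only loads rules listed under `rule_files` in `prometheus.yml`. Running `promtool check rules alerts.yml` before deploying catches syntax errors early.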
Output
Prometheus + Grafana (Kubernetes):
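# {baseDir}/monitoring/prometheus.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    scrape_configs:
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          # Keep only pods annotated with prometheus.io/scrape: "true"
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
spec:
  replicas: 1
  # apps/v1 requires a selector that matches the pod template labels
  # (the app: prometheus label is an illustrative choice)
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:latest  # pin a specific version for production
          args:
            - '--config.file=/etc/prometheus/prometheus.yml'
            - '--storage.tsdb.retention.time=30d'
          ports:
            - containerPort: 9090
          # Mount the ConfigMap so --config.file resolves to the config above
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus-config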
Grafana Dashboard Configuration:
{
  "dashboard": {
    "title": "Application Metrics",
    "panels": [
      {
        "title": "CPU Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(container_cpu_usage_seconds_total[5m])"
          }
        ]
      }
    ]
  }
}
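For Docker environments, a comparable stack could be sketched with Docker Compose. This is a minimal example under stated assumptions: the service names, volume names, and the local `./prometheus.yml` path are hypothetical, and the image tags should be pinned for production use.

```yaml
# docker-compose.yml (illustrative sketch, not generated output)
services:
  prometheus:
    image: prom/prometheus:latest
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=30d'
    volumes:
      # Assumes a prometheus.yml next to this file
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      # Named volume so time-series data survives container restarts
      - prometheus-data:/prometheus
    ports:
      - '9090:9090'
  grafana:
    image: grafana/grafana:latest
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - '3000:3000'
    depends_on:
      - prometheus

volumes:
  prometheus-data:
  grafana-data:
```

Within the Compose network, Grafana can reach Prometheus at `http://prometheus:9090`, since service names resolve as hostnames.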
Error Handling
Metrics Not Appearing
- Error: "No data points"
- Solution: Verify scrape targets are accessible and returning metrics
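With the Kubernetes configuration shown under Output, one likely cause is a missing annotation: the `kubernetes-pods` job keeps only pods whose `prometheus.io/scrape` annotation is `true`. A pod that opts in might look like the following sketch, where the pod name, image, and port are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                 # hypothetical name
  annotations:
    # Must match the keep rule in relabel_configs; quote it so it stays a string
    prometheus.io/scrape: "true"
spec:
  containers:
    - name: app
      image: example/app:1.0        # hypothetical image
      ports:
        # Pod discovery creates one target per declared container port
        - containerPort: 8080
```

The configuration as written has no relabel rules for a custom port or path, so Prometheus scrapes `/metrics` on each declared container port. Annotations such as `prometheus.io/port` or `prometheus.io/path` take effect only if matching relabel rules are added.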
High Cardinality
- Error: "Too many time series"
- Solution: Reduce label combinations or increase Prometheus resources
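One way to reduce label combinations is to drop unneeded series and labels at scrape time with `metric_relabel_configs`. In the sketch below, the metric name, the `request_id` label, and the sample limit are assumptions for illustration; identify the real offenders in your own setup before dropping anything.

```yaml
scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    # Fail the scrape if a target exposes more than this many samples (assumed limit)
    sample_limit: 50000
    metric_relabel_configs:
      # Drop an entire high-cardinality metric (hypothetical metric name)
      - source_labels: [__name__]
        regex: 'http_request_duration_seconds_bucket'
        action: drop
      # Remove a per-request label from all remaining series (hypothetical label)
      - regex: 'request_id'
        action: labeldrop
```

Note that `labeldrop` can make previously distinct series collide, so it is safest on labels that are not needed to tell series apart.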
Alert Not Firing
- Error: "Alert condition met but no notification"
- Solution: Check Alertmanager configuration and notification channels
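When checking the Alertmanager configuration, it helps to compare it against a minimal working route. The sketch below assumes a single Slack receiver; the receiver name, channel, webhook URL, and timing values are placeholders rather than values from this skill.

```yaml
# alertmanager.yml (illustrative sketch)
route:
  receiver: default-slack           # hypothetical receiver name
  group_by: ['alertname', 'namespace']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: default-slack
    slack_configs:
      # Placeholder webhook; substitute your own and keep it out of version control
      - api_url: 'https://hooks.slack.com/services/REPLACE/WITH/WEBHOOK'
        channel: '#alerts'
        send_resolved: true
```

Prometheus also needs an `alerting.alertmanagers` entry in `prometheus.yml` that points at the Alertmanager address. Without it, rules fire in Prometheus but no notification is ever sent.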
Dashboard Load Failure
- Error: "Failed to load dashboard"
- Solution: Verify Grafana datasource configuration and permissions
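A datasource can be declared through Grafana's provisioning files, which makes the configuration easier to verify than settings entered by hand. This sketch assumes Prometheus is reachable at `http://prometheus:9090`; adjust the URL to your service name or host.

```yaml
# Placed under /etc/grafana/provisioning/datasources/ (file name is arbitrary)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    # proxy: Grafana's backend issues the queries, so the URL must resolve from Grafana
    access: proxy
    url: http://prometheus:9090     # assumed address
    isDefault: true
```

If dashboards still fail to load, check that the datasource name or UID referenced in the dashboard JSON matches the provisioned datasource.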
Resources
- Prometheus documentation: https://prometheus.io/docs/
- Grafana documentation: https://grafana.com/docs/
- Example dashboards in {baseDir}/monitoring-examples/
