build-grafana-dashboards
About
This skill generates production-ready Grafana dashboards with reusable panels, template variables, and annotations for version-controlled deployment. Use it when creating operational dashboards for SRE teams, visualizing Prometheus/Loki metrics, or establishing SLO compliance reporting. It helps migrate from manual dashboard creation to automated, version-controlled provisioning.
Quick Install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
/plugin add https://github.com/pjt222/agent-almanac
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/build-grafana-dashboards
Copy and paste the command into Claude Code to install the skill.
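Skill Documentation
Build Grafana Dashboards
Design and provision Grafana dashboards following best practices for maintainability, reusability, and version control.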
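When to Use
- Creating visualizations for metrics from Prometheus, Loki, or other data sources
- Building operational dashboards for SRE teams and incident responders
- Establishing SLO compliance reporting dashboards for executives
- Migrating dashboards from manual creation to version-controlled provisioning
- Standardizing dashboard layout across teams with template variables
- Creating drill-downs from high-level overviews to detailed metrics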
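Inputs
- Required: data source configuration (Prometheus, Loki, Tempo, etc.)
- Required: the metrics or logs to visualize, with their query patterns
- Optional: template variables for multi-service or multi-environment views
- Optional: existing dashboard JSON to migrate or modify
- Optional: annotation queries for event correlation (deployments, incidents)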
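Procedure
See Extended Examples for complete configuration files and templates.
Step 1: Design the Dashboard Structure
Plan the dashboard layout and organization before building panels.
Create a dashboard specification document: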
# Service Overview Dashboard
## Purpose
Real-time operational view for on-call engineers monitoring the API service.
## Rows
1. High-Level Metrics (expanded by default)
   - Request rate, error rate, latency (RED metrics)
   - Service uptime, instance count
2. Detailed Metrics (expanded by default)
   - Per-endpoint latency breakdown
   - Error rate by status code
   - Database connection pool status
3. Resource Utilization
   - CPU, memory, disk usage per instance
   - Network I/O rates
4. Logs (collapsed by default)
   - Recent errors from Loki
   - Alert firing history
## Variables
- `environment`: production, staging, development
- `instance`: all instances or specific instance selection
- `interval`: aggregation window (5m, 15m, 1h)
## Annotations
- Deployment events from CI/CD system
- Alert firing/resolving events
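Key design principles:
- Most important metrics first: critical metrics at the top, details below
- Consistent time range: all panels share the same time range
- Drill-down paths: link from high-level dashboards to detailed ones
- Responsive layout: use rows and panel widths that adapt to different screens
Expected: The dashboard structure is documented, and stakeholders agree on metric priorities and layout.
On failure:
- Conduct a dashboard design review with end users (SREs, developers)
- Compare against industry standards (USE method, RED method, Four Golden Signals)
- Review the team's existing dashboards for consistent patterns
Step 2: Create the Dashboard with Template Variables
Build the dashboard foundation with reusable variables for filtering.
Create the dashboard JSON structure (or build it in the UI and export it):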
{
"dashboard": {
"title": "API Service Overview",
"uid": "api-service-overview",
"version": 1,
"timezone": "browser",
"editable": true,
"graphTooltip": 1,
"time": {
"from": "now-6h",
"to": "now"
},
"refresh": "30s",
"templating": {
"list": [
{
"name": "environment",
"type": "query",
"datasource": "Prometheus",
"query": "label_values(up{job=\"api-service\"}, environment)",
"multi": false,
"includeAll": false,
"refresh": 1,
"sort": 1,
"current": {
"selected": false,
"text": "production",
"value": "production"
}
},
{
"name": "instance",
"type": "query",
"datasource": "Prometheus",
"query": "label_values(up{job=\"api-service\",environment=\"$environment\"}, instance)",
"multi": true,
"includeAll": true,
"refresh": 1,
"allValue": ".*",
"current": {
"selected": true,
"text": "All",
"value": "$__all"
}
},
{
"name": "interval",
"type": "interval",
"options": [
{"text": "1m", "value": "1m"},
{"text": "5m", "value": "5m"},
{"text": "15m", "value": "15m"},
{"text": "1h", "value": "1h"}
],
"current": {
"text": "5m",
"value": "5m"
},
"auto": false
}
]
},
"annotations": {
"list": [
{
"name": "Deployments",
"datasource": "Prometheus",
"enable": true,
"expr": "changes(app_version{job=\"api-service\",environment=\"$environment\"}[5m]) > 0",
"step": "60s",
"iconColor": "rgba(0, 211, 255, 1)",
"tagKeys": "version"
}
]
}
}
}
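Variable types and use cases:
- Query variables: dynamic lists populated from the data source (`label_values()`, `query_result()`)
- Interval variables: aggregation windows for queries
- Custom variables: static lists for non-metric selections
- Constant variables: values shared across panels (data source names, thresholds)
- Text box variables: free-form input for filtering
Expected: Variables populate correctly from the data source, cascading filters work (the environment filters the instances), and default selections are sensible.
On failure:
- Test variable queries independently in the Prometheus UI
- Check for circular dependencies (A depends on B, which depends on A)
- Verify the `allValue` regex for multi-select variables
- Check the variable refresh setting (on dashboard load vs. on time range change)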
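The circular-dependency check above can be automated before a dashboard is committed. The following is a minimal sketch, assuming each variable's dependencies can be inferred from `$name` or `${name}` references in its `query` string; the function names `variable_dependencies` and `find_cycle` are hypothetical helpers, not part of Grafana.

```python
import re

# Matches $name and ${name...} references inside a variable query.
VAR_REF = re.compile(r"\$\{?([A-Za-z_]\w*)")


def variable_dependencies(dashboard):
    """Map each template variable to the other template variables it references."""
    variables = dashboard.get("templating", {}).get("list", [])
    names = {v["name"] for v in variables}
    deps = {}
    for v in variables:
        # Assumption: the query is a string; other shapes are stringified as-is.
        query = str(v.get("query", ""))
        refs = set(VAR_REF.findall(query))
        # Built-ins such as $__all are dropped because they are not in `names`.
        deps[v["name"]] = refs & names
    return deps


def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2
    state = {name: WHITE for name in deps}
    path = []

    def visit(name):
        state[name] = GREY
        path.append(name)
        for dep in sorted(deps[name]):
            if state[dep] == GREY:  # back edge: dep is already on the current path
                return path[path.index(dep):] + [dep]
            if state[dep] == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[name] = BLACK
        return None

    for name in deps:
        if state[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None
```

For the dashboard JSON in this step, pass the object under the `dashboard` key; `instance` depends on `environment` and no cycle is reported.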
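Step 3: Build Visualization Panels
Create a panel with the appropriate visualization type for each metric.
Time series panel (request rate):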
{
"type": "timeseries",
"title": "Request Rate",
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
"targets": [
{
"expr": "sum(rate(http_requests_total{job=\"api-service\",environment=\"$environment\",instance=~\"$instance\"}[$interval])) by (method)",
"legendFormat": "{{method}}",
"refId": "A"
}
],
"fieldConfig": {
"defaults": {
"unit": "reqps",
"color": {
"mode": "palette-classic"
},
"custom": {
"drawStyle": "line",
"lineInterpolation": "smooth",
"fillOpacity": 10,
"spanNulls": true
},
"thresholds": {
"mode": "absolute",
"steps": [
{"value": null, "color": "green"},
{"value": 1000, "color": "yellow"},
{"value": 5000, "color": "red"}
]
}
}
},
"options": {
"tooltip": {
"mode": "multi",
"sort": "desc"
},
"legend": {
"displayMode": "table",
"placement": "right",
"calcs": ["mean", "max", "last"]
}
}
}
Stat panel (error rate):
{
"type": "stat",
"title": "Error Rate",
"gridPos": {"h": 4, "w": 6, "x": 12, "y": 0},
"targets": [
{
# ... (see EXAMPLES.md for complete configuration)
Heatmap panel (latency distribution):
{
"type": "heatmap",
"title": "Request Duration Heatmap",
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 8},
"targets": [
{
# ... (see EXAMPLES.md for complete configuration)
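Panel selection guide:
- Time series: trends over time (rates, counts, durations)
- Stat: a single current value with threshold colors
- Gauge: percentage values (CPU, memory, disk)
- Bar gauge: comparison of multiple values at one point in time
- Heatmap: distribution of values over time (latency percentiles)
- Table: detailed breakdown of multiple metrics
- Logs: raw Loki logs with filtering
Expected: Panels render correctly with data, visualizations match the metric type, legends are descriptive, and thresholds highlight problems.
On failure:
- Test the query in the Explore view with the same time range and variables
- Check for errors in metric names or label filters
- Verify that the aggregation function matches the metric type (`rate` for counters, `avg` for gauges)
- Check the unit configuration (bytes, seconds, requests per second)
- Enable "Show query inspector" to debug empty results
Step 4: Configure Rows and Layout
Organize panels into collapsible rows for logical grouping.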
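When many panels share the same shape, generating them from a small helper keeps the JSON consistent. This is a sketch under stated assumptions, not a complete panel model: `timeseries_panel` and `SELECTOR` are hypothetical names, only the fields shown in the Request Rate panel above are emitted, and the `status` label in the second query is assumed to exist on `http_requests_total`.

```python
import json

# Label selector shared by every query, copied from the Request Rate panel above.
SELECTOR = 'job="api-service",environment="$environment",instance=~"$instance"'


def timeseries_panel(title, expr, legend, unit, x, y, w=12, h=8):
    """Build a minimal time series panel as a plain dict."""
    return {
        "type": "timeseries",
        "title": title,
        "gridPos": {"h": h, "w": w, "x": x, "y": y},
        "targets": [{"expr": expr, "legendFormat": legend, "refId": "A"}],
        "fieldConfig": {"defaults": {"unit": unit}},
    }


panels = [
    timeseries_panel(
        "Request Rate",
        f"sum(rate(http_requests_total{{{SELECTOR}}}[$interval])) by (method)",
        "{{method}}", "reqps", x=0, y=0,
    ),
    timeseries_panel(
        "Error Rate by Status Code",
        # Assumption: a `status` label is present on the metric.
        f'sum(rate(http_requests_total{{{SELECTOR},status=~"5.."}}[$interval])) by (status)',
        "{{status}}", "reqps", x=12, y=0,
    ),
]

# Serialize one panel to inspect it or paste it into a dashboard's "panels" array.
print(json.dumps(panels[0], indent=2))
```

Keeping the selector in one place means a change to the variable names only has to be made once.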
{
"panels": [
{
"type": "row",
"title": "High-Level Metrics",
"collapsed": false,
# ... (see EXAMPLES.md for complete configuration)
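Layout best practices:
- The grid is 24 units wide; each panel defines `w` (width) and `h` (height)
- Use rows to group related panels, and collapse less critical sections by default
- Place the most critical metrics in the first visible area (y=0-8)
- Keep panel heights consistent within a row (typically 4, 8, or 12 units)
- Use full width (24) for time series and half width (12) for comparisons
Expected: The dashboard layout is logically organized, rows collapse and expand correctly, and panels align visually without gaps.
On failure:
- Verify that gridPos coordinates do not overlap
- Check that a row's panels array contains panels (not null)
- Verify that y coordinates increase down the page
- Use "Edit JSON" in the Grafana UI to inspect grid positions
Step 5: Add Links and Drill-Downs
Create navigation paths between related dashboards.
Dashboard-level links in JSON: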
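The gridPos checks above are mechanical and can be scripted. The sketch below treats each panel as a rectangle on the 24-unit grid and reports panels that exceed the grid width or overlap one another; `check_layout` is a hypothetical helper, and it assumes a flat list of non-row panels that each carry `title` and `gridPos`.

```python
from itertools import combinations

GRID_WIDTH = 24  # grid width in units, as stated in the layout notes above


def overlaps(a, b):
    """Return True if two gridPos rectangles share any area."""
    return (
        a["x"] < b["x"] + b["w"] and b["x"] < a["x"] + a["w"]
        and a["y"] < b["y"] + b["h"] and b["y"] < a["y"] + a["h"]
    )


def check_layout(panels):
    """Return a list of human-readable layout problems (empty if none)."""
    problems = []
    for panel in panels:
        pos = panel["gridPos"]
        if pos["x"] < 0 or pos["x"] + pos["w"] > GRID_WIDTH:
            problems.append(f"{panel['title']}: exceeds the {GRID_WIDTH}-unit grid width")
    for first, second in combinations(panels, 2):
        if overlaps(first["gridPos"], second["gridPos"]):
            problems.append(f"{first['title']} overlaps {second['title']}")
    return problems
```

Panels that merely touch at an edge, such as the Request Rate and Error Rate panels from Step 3, are not counted as overlapping.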
{
"links": [
{
"title": "Service Details",
"type": "link",
"icon": "external link",
# ... (see EXAMPLES.md for complete configuration)
Panel-level data links:
{
"fieldConfig": {
"defaults": {
"links": [
{
"title": "View Logs for ${__field.labels.instance}",
# ... (see EXAMPLES.md for complete configuration)
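Link variables:
- `$service`, `$environment`: dashboard template variables
- `${__field.labels.instance}`: label value of the clicked data point
- `${__from}`, `${__to}`: current dashboard time range
- `$__url_time_range`: time range encoded for use in a URL
Expected: Clicking a panel element or dashboard link navigates to the related view with context preserved (time range, variables).
On failure:
- URL-encode special characters in query parameters
- Test links with different variable selections (All vs. specific values)
- Verify that the target dashboard UID exists and is accessible
- Check that `includeVars` and `keepTime` behave as expected
Step 6: Set Up Dashboard Provisioning
Version-control dashboards as code for reproducible deployments.
Create the provisioning directory structure: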
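One way to avoid encoding mistakes is to build link URLs with a helper that percent-encodes literal values but leaves Grafana placeholders untouched. The following is a sketch, assuming the `/d/<uid>` dashboard path and `var-<name>` query parameters; `dashboard_link`, the UID `api-service-details`, and the `path` variable are hypothetical.

```python
from urllib.parse import quote


def dashboard_link(uid, variables, keep_time=True):
    """Build a relative dashboard URL for a dashboard or data link.

    Literal values are percent-encoded. Values starting with "$" are treated
    as Grafana placeholders and passed through unchanged so that Grafana can
    substitute them when the link is clicked.
    """
    params = []
    for name, value in variables.items():
        encoded = value if value.startswith("$") else quote(value, safe="")
        params.append(f"var-{quote(name, safe='')}={encoded}")
    if keep_time:
        # Carry the current dashboard time range to the target dashboard.
        params.append("from=${__from}")
        params.append("to=${__to}")
    return f"/d/{uid}?" + "&".join(params)


url = dashboard_link(
    "api-service-details",  # hypothetical target UID
    {
        "environment": "$environment",
        "instance": "${__field.labels.instance}",
        "path": "/api/v1/users?id=1",  # literal value with special characters
    },
)
```

The resulting string can be used as the `url` of a link entry; it is worth clicking through with both "All" and a specific value selected, as the checklist above suggests.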
mkdir -p /etc/grafana/provisioning/{dashboards,datasources}
Data source provisioning (/etc/grafana/provisioning/datasources/prometheus.yml):
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    # ... (see EXAMPLES.md for complete configuration)
Dashboard provisioning (/etc/grafana/provisioning/dashboards/default.yml):
apiVersion: 1
providers:
  - name: 'default'
    orgId: 1
    folder: 'Services'
    type: file
    disableDeletion: false
    updateIntervalSeconds: 30
    allowUiUpdates: true
    options:
      path: /var/lib/grafana/dashboards
      foldersFromFilesStructure: true
Store the dashboard JSON files in /var/lib/grafana/dashboards/:
/var/lib/grafana/dashboards/
├── api-service/
│   ├── overview.json
│   └── details.json
├── database/
│   └── postgres.json
└── infrastructure/
    ├── nodes.json
    └── kubernetes.json
Using Docker Compose:
version: '3.8'
services:
  grafana:
    image: grafana/grafana:10.2.0
    ports:
      - "3000:3000"
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning
      - ./grafana/dashboards:/var/lib/grafana/dashboards
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Viewer
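Expected: Dashboards load automatically when Grafana starts, JSON changes are reflected within the update interval, and version control tracks dashboard changes.
On failure:
- Check the Grafana logs: `docker logs grafana | grep -i provisioning`
- Validate the JSON syntax: `python -m json.tool dashboard.json`
- Ensure file permissions allow Grafana to read the files: `chmod 644 *.json`
- Test with `allowUiUpdates: false` to prevent modifications through the UI
- Verify the provisioning configuration: `curl http://localhost:3000/api/admin/provisioning/dashboards/reload -X POST -H "Authorization: Bearer $GRAFANA_API_KEY"`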
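The JSON syntax check above can be extended into a pre-commit script that walks the whole dashboards directory. This is a sketch under stated assumptions: `validate_dashboards` is a hypothetical helper, it only checks for valid JSON, a `title`, and a unique `uid`, and it accepts the dashboard model either at the top level or under a `dashboard` key as in the Step 2 example. Which of the two forms your provisioning setup expects is worth confirming against the Grafana documentation.

```python
import json
from pathlib import Path


def validate_dashboards(root):
    """Return a list of problems found in the *.json files under `root`."""
    problems = []
    seen_uids = {}  # uid -> first file that used it
    for path in sorted(Path(root).rglob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{path}: invalid JSON ({exc.msg}, line {exc.lineno})")
            continue
        if not isinstance(data, dict):
            problems.append(f"{path}: top-level value is not a JSON object")
            continue
        # Assumption: unwrap a top-level "dashboard" key if one is present.
        model = data["dashboard"] if isinstance(data.get("dashboard"), dict) else data
        if not model.get("title"):
            problems.append(f"{path}: missing title")
        uid = model.get("uid")
        if not uid:
            problems.append(f"{path}: missing uid")
        elif uid in seen_uids:
            problems.append(f"{path}: duplicate uid {uid!r} (also in {seen_uids[uid]})")
        else:
            seen_uids[uid] = path
    return problems
```

Calling `validate_dashboards("./grafana/dashboards")` from a CI job and failing the build on a non-empty result catches these problems before Grafana loads the files.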
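Validation
- The dashboard loads without errors in the Grafana UI
- All template variables populate with the expected values
- Variable cascading works (selecting an environment filters the instances)
- Panels display data for the configured time range
- Panel queries use variables correctly (no hardcoded values)
- Thresholds appropriately highlight problem states
- Legend formats are descriptive and uncluttered
- Annotations appear at the relevant events
- Links navigate to the correct dashboards with context preserved
- The dashboard is provisioned from JSON (version-controlled)
- The responsive layout works across different screen sizes
- Tooltips and hover interactions provide useful context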
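The "no hardcoded values" item lends itself to a simple lint. The sketch below scans each panel's `expr` for label matchers whose label should be driven by a template variable but whose value contains no `$` reference; `hardcoded_labels` is a hypothetical helper, the regular expression is a simplification that does not parse PromQL fully, and panels nested inside collapsed rows are not visited.

```python
import re

# Matches label matchers such as environment="production" or instance=~"$instance".
LABEL_MATCH = re.compile(r'(\w+)\s*(=~|!~|!=|=)\s*"([^"]*)"')


def hardcoded_labels(dashboard, variable_labels):
    """Return (panel title, label, value) for matchers that bypass a variable.

    `variable_labels` is the set of label names that are expected to be
    filled from template variables, for example {"environment", "instance"}.
    """
    findings = []
    for panel in dashboard.get("panels", []):
        for target in panel.get("targets", []):
            expr = target.get("expr", "")
            for label, _operator, value in LABEL_MATCH.findall(expr):
                if label in variable_labels and "$" not in value:
                    findings.append((panel.get("title", ""), label, value))
    return findings
```

Labels that are intentionally fixed, such as `job="api-service"` in the examples above, are ignored as long as they are left out of `variable_labels`.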
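Common Pitfalls
- Variables don't update panels: ensure queries use the `$variable` syntax, not hardcoded values. Check the variable refresh settings.
- Empty panels despite a correct query: verify that the time range contains data points. Check the scrape interval against the aggregation window (a 5m rate needs more than 5m of data).
- Verbose legends: use `legendFormat` to show only the relevant labels, not the full metric name. Example: `{{method}} - {{status}}` instead of the default.
- Inconsistent time ranges: set the time range at the dashboard level so that all panels share the same window. Use "Sync cursor" for correlated investigation.
- Performance problems: avoid queries that return high-cardinality series (>1000). Use recording rules or pre-aggregation. Limit the time range of expensive queries.
- Dashboard drift: without provisioning, manual changes cause version-control conflicts. Use `allowUiUpdates: false` in production.
- Missing data links: data links require exact label names. Use `${__field.labels.labelname}` carefully, and verify that the label exists in the query result.
- Annotation overload: too many annotations clutter the view. Filter annotations by importance or use separate annotation tracks.
Related Skills
- `setup-prometheus-monitoring`: configure the Prometheus data source that Grafana consumes
- `configure-log-aggregation`: set up Loki for log panel queries and log annotations
- `define-slo-sli-sla`: visualize SLO compliance and error budgets with Grafana stat and gauge panels
- `instrument-distributed-tracing`: add trace ID links from metric panels to Tempo trace views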
