monitoring-error-rates
About
This skill enables Claude to monitor and analyze application error rates across components such as HTTP requests, databases, and external APIs. Use it when a developer needs to track errors, analyze error rates, or set up alerting based on defined thresholds. It automates error tracking to improve application reliability.
Quick Install
Claude Code
Recommended: /plugin add https://github.com/jeremylongshore/claude-code-plugins-plus
git clone https://github.com/jeremylongshore/claude-code-plugins-plus.git ~/.claude/skills/monitoring-error-rates
Copy and paste this command in Claude Code to install the skill.
Skill Documentation
Overview
This skill automates setting up error monitoring and alerting for the components of an application. It helps identify, track, and analyze different types of errors so that issues can be resolved before they impact users.
How It Works
- Analyze Error Sources: Identifies potential error sources within the application architecture, including HTTP endpoints, database queries, external APIs, background jobs, and client-side code.
- Define Monitoring Criteria: Establishes specific error types and thresholds for each source, such as HTTP status codes (4xx, 5xx), exception types, query timeouts, and API response failures.
- Configure Alerting: Sets up alerts to trigger when error rates exceed defined thresholds, notifying relevant teams or individuals for investigation and remediation.
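The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the `SourceStats` class, the `find_breaches` helper, the source names, and the threshold values are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SourceStats:
    """Operation and error counts for one error source over a fixed window."""
    name: str
    total: int
    errors: int

    @property
    def error_rate(self) -> float:
        # A source with no traffic is treated as having a 0% error rate.
        return self.errors / self.total if self.total else 0.0


# Hypothetical per-source thresholds, as a fraction of failed operations.
THRESHOLDS = {"http": 0.01, "database": 0.005, "external_api": 0.05}


def find_breaches(stats, thresholds):
    """Return (name, rate, threshold) for each source above its threshold."""
    breaches = []
    for s in stats:
        limit = thresholds.get(s.name)
        if limit is not None and s.error_rate > limit:
            breaches.append((s.name, s.error_rate, limit))
    return breaches


# Made-up counts for one monitoring window.
window = [
    SourceStats("http", total=2000, errors=50),
    SourceStats("database", total=800, errors=2),
    SourceStats("external_api", total=100, errors=3),
]

for name, rate, limit in find_breaches(window, THRESHOLDS):
    print(f"ALERT {name}: {rate:.2%} exceeds {limit:.2%}")
```

With these sample counts only the `http` source is above its threshold, so it is the only one reported.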
When to Use This Skill
This skill activates when you need to:
- Set up error monitoring for a new application.
- Analyze existing error rates and identify areas for improvement.
- Configure alerts to be notified of critical errors in real time.
- Establish error budgets and track progress towards reliability goals.
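As a rough illustration of the error-budget item, the arithmetic can be written out as below. The function name `error_budget_report`, the 99.9% target, and the request counts are assumptions made for this sketch, not values from the skill.

```python
def error_budget_report(slo_target, total_requests, failed_requests):
    """Summarize how much of an error budget has been used.

    slo_target is the fraction of requests that must succeed (e.g. 0.999).
    """
    # The budget is the number of failures the SLO tolerates in this period.
    allowed = (1.0 - slo_target) * total_requests
    consumed = failed_requests / allowed if allowed else float("inf")
    return {
        "allowed_failures": allowed,
        "consumed_fraction": consumed,
        "remaining_failures": allowed - failed_requests,
    }


# Hypothetical month: 1,000,000 requests, 400 failures, 99.9% SLO.
report = error_budget_report(0.999, 1_000_000, 400)
print(f"Budget consumed: {report['consumed_fraction']:.1%}")
print(f"Failures remaining: {report['remaining_failures']:.0f}")
```

A 99.9% target over one million requests tolerates about 1,000 failures, so 400 failures use roughly 40% of the budget.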
Examples
Example 1: Setting up Error Monitoring for a Web Application
User request: "Monitor errors in my web application, especially 500 errors and database connection issues."
The skill will:
- Analyze the web application's architecture to identify potential error sources (e.g., HTTP endpoints, database connections).
- Configure monitoring for 500 errors and database connection failures, setting appropriate thresholds and alerts.
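One way the 5xx monitoring in this example might look is a sliding-window error rate. The sketch below is an assumption about how such tracking could be done, not the skill's own code; `SlidingWindowErrorRate` and `is_server_error` are hypothetical names, and timestamps are passed in explicitly (and assumed to arrive in non-decreasing order) so the behavior is easy to follow.

```python
from collections import deque


class SlidingWindowErrorRate:
    """Track the fraction of failed events within the last `window_seconds`."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, is_error), oldest first

    def record(self, timestamp: float, is_error: bool) -> None:
        self._events.append((timestamp, is_error))

    def rate(self, now: float) -> float:
        # Drop events that have aged out of the window.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()
        if not self._events:
            return 0.0
        errors = sum(1 for _, is_error in self._events if is_error)
        return errors / len(self._events)


def is_server_error(status_code: int) -> bool:
    """True for HTTP 5xx responses."""
    return 500 <= status_code <= 599


monitor = SlidingWindowErrorRate(window_seconds=60)
for t, status in [(0, 200), (10, 500), (20, 200), (30, 503)]:
    monitor.record(t, is_server_error(status))

print(monitor.rate(now=30))  # all four requests are inside the window
print(monitor.rate(now=85))  # only the request at t=30 is still inside
```

The same class could track database connection attempts by recording `is_error=True` whenever a connection attempt raises.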
Example 2: Analyzing Error Rates in a Background Job Processor
User request: "Analyze error rates for my background job processor. I'm seeing a lot of failed jobs."
The skill will:
- Focus on the background job processor and identify the types of errors occurring (e.g., task failures, timeouts, resource exhaustion).
- Analyze the frequency and patterns of these errors to identify potential root causes.
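The frequency analysis described here could start with simple counting. The following sketch assumes failed jobs are available as records with `job` and `error` fields; that record shape, the sample data, and the `summarize_failures` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical failed-job records, e.g. parsed from a job queue's failure log.
failed_jobs = [
    {"job": "send_email", "error": "Timeout"},
    {"job": "send_email", "error": "Timeout"},
    {"job": "send_email", "error": "Timeout"},
    {"job": "resize_image", "error": "OutOfMemory"},
    {"job": "resize_image", "error": "OutOfMemory"},
    {"job": "sync_inventory", "error": "TaskFailure"},
]


def summarize_failures(records):
    """Count failures by error type and by (job, error) pair."""
    by_error = Counter(r["error"] for r in records)
    by_job_and_error = Counter((r["job"], r["error"]) for r in records)
    return by_error, by_job_and_error


by_error, by_job_and_error = summarize_failures(failed_jobs)

# Most frequent first, which points at where to look for a root cause.
for error, count in by_error.most_common():
    print(f"{error}: {count}")
for (job, error), count in by_job_and_error.most_common(2):
    print(f"{job} / {error}: {count}")
```

Grouping by job and error type together helps separate a problem in one job (here, timeouts in `send_email`) from a problem that affects the whole processor.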
Best Practices
- Granularity: Monitor errors at a granular level to identify specific problem areas.
- Thresholding: Set appropriate alert thresholds to avoid alert fatigue and focus on critical issues.
- Context: Include relevant context in error messages and alerts to facilitate troubleshooting.
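The thresholding and context practices can be combined in a small sketch. It assumes alerts fire only after several consecutive windows exceed the threshold, which is one common way to reduce alert fatigue, not necessarily what the skill does; `should_alert`, `format_alert`, and all sample values are hypothetical.

```python
def should_alert(window_rates, threshold, min_consecutive=3):
    """True if the most recent `min_consecutive` windows all exceed threshold.

    Requiring a sustained breach avoids paging on a single noisy window.
    """
    if len(window_rates) < min_consecutive:
        return False
    return all(rate > threshold for rate in window_rates[-min_consecutive:])


def format_alert(service, endpoint, rate, threshold, sample_error):
    """Build an alert message that carries enough context to start debugging."""
    return (
        f"[{service}] error rate {rate:.1%} on {endpoint} "
        f"(threshold {threshold:.1%}); sample error: {sample_error}"
    )


spike = [0.01, 0.08, 0.02]      # one bad window, then recovery
sustained = [0.06, 0.07, 0.09]  # three bad windows in a row

print(should_alert(spike, threshold=0.05))
print(should_alert(sustained, threshold=0.05))
print(format_alert("checkout", "POST /orders", sustained[-1], 0.05,
                   "ConnectionError: database unreachable"))
```

The single spike does not trigger an alert, while the sustained breach does, and the message names the service, endpoint, and a sample error.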
Integration
This skill can be integrated with other monitoring and alerting tools, such as Prometheus, Grafana, and PagerDuty, to provide a comprehensive view of application health and performance. It can also be used in conjunction with incident management tools to streamline incident response workflows.
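As one illustration of such an integration, the sketch below builds a trigger event in the shape used by the PagerDuty Events API v2. It only constructs and prints the JSON body and does not send anything; the `build_pagerduty_event` helper, the routing key placeholder, and the service name are assumptions, and how the skill itself talks to PagerDuty is not specified here.

```python
import json
from datetime import datetime, timezone


def build_pagerduty_event(routing_key, summary, source,
                          severity="error", details=None):
    """Build a trigger event body in the PagerDuty Events API v2 shape."""
    allowed = {"critical", "error", "warning", "info"}
    if severity not in allowed:
        raise ValueError(f"severity must be one of {sorted(allowed)}")
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "custom_details": details or {},
        },
    }


event = build_pagerduty_event(
    routing_key="YOUR_INTEGRATION_KEY",  # placeholder, not a real key
    summary="HTTP 5xx rate 2.5% exceeds 1.0% threshold",
    source="web-frontend",  # hypothetical service name
    severity="critical",
    details={"window": "5m", "errors": 50, "requests": 2000},
)
print(json.dumps(event, indent=2))
```

In a real setup this body would be sent as an HTTP POST to the Events API endpoint with a valid integration key.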
