performance-testing
About
This skill enables Claude to automate performance testing by designing and executing load, stress, spike, and endurance tests. It analyzes key metrics like response time and throughput to identify bottlenecks in CPU, memory, database, or network. The skill then provides comprehensive reports with graphs and optimization recommendations.
Documentation
Overview
This skill automates performance testing workflows: Claude designs and runs load, stress, spike, and endurance tests to assess system behavior under different conditions, identifies bottlenecks, and provides actionable optimization recommendations.
How It Works
- Test Design: Claude analyzes the user's request to determine the appropriate test type (load, stress, spike, or endurance) and configures test parameters such as target users, duration, and ramp-up time (a minimal sketch follows this list).
- Test Execution: The performance-test-suite plugin executes the designed test, collecting performance metrics like response times, throughput, and error rates.
- Metrics Analysis: Claude analyzes the collected metrics to identify performance bottlenecks and potential issues.
- Report Generation: Claude generates a comprehensive report summarizing the test results, highlighting key performance indicators, and providing recommendations for improvement.
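The plugin's internal configuration format is not documented on this page, so as a rough illustration of what a designed test looks like, here is a minimal sketch using Locust, a common Python load-testing tool. The endpoint, host, and pacing values are assumptions for illustration, not the plugin's actual output.

```python
# Minimal Locust sketch of a designed load test: a simulated user,
# a target endpoint, and think time between requests. Locust records
# response times, throughput, and error rates per request, matching
# the metrics listed above. Endpoint and timings are illustrative.
from locust import HttpUser, task, between

class ApiUser(HttpUser):
    # Each simulated user pauses 1-3 seconds between requests,
    # approximating real client think time.
    wait_time = between(1, 3)

    @task
    def list_users(self):
        self.client.get("/users")
```

Such a script can be run headless with, for example, `locust -f loadtest.py --host https://api.example.com --users 100 --spawn-rate 10 --run-time 5m --headless`, where the host and numbers are placeholders.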
When to Use This Skill
This skill activates when you need to:
- Create a load test for an API.
- Design a stress test to determine the breaking point of a system.
- Simulate a spike test to evaluate system behavior during sudden traffic surges (a spike profile is sketched after this list).
- Develop an endurance test to detect memory leaks or stability issues.
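For the spike scenario, a shaped load profile makes the surge explicit. The sketch below uses Locust's LoadTestShape as a stand-in for whatever profile the skill actually generates; all durations, user counts, and the endpoint are illustrative assumptions.

```python
# A spike profile: steady baseline load with one sudden surge.
# All durations and user counts are illustrative assumptions.
from locust import HttpUser, LoadTestShape, task, between

class SiteUser(HttpUser):
    wait_time = between(1, 3)

    @task
    def browse(self):
        self.client.get("/")  # placeholder endpoint

class SpikeShape(LoadTestShape):
    def tick(self):
        run_time = self.get_run_time()
        if run_time >= 600:
            return None              # stop after 10 minutes
        if 240 <= run_time < 300:
            return (500, 100)        # one-minute spike to 500 users
        return (50, 10)              # 50-user baseline otherwise
```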
Examples
Example 1: Load Testing an API
User request: "Create a load test for the /users API, ramping up to 200 concurrent users over 10 minutes."
The skill will:
- Design a load test configuration with a ramp-up stage to 200 users over 10 minutes.
- Execute the load test using the performance-test-suite plugin.
- Generate a report showing response times, throughput, and error rates for the /users API.
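A hedged sketch of how this ramp could be expressed, again using Locust as an illustrative stand-in for the plugin: a linear climb from 0 to 200 concurrent users over 10 minutes, then a hold at peak.

```python
# Linear ramp to 200 users over 600 seconds, then hold at peak.
from locust import HttpUser, LoadTestShape, task, between

class UsersApiUser(HttpUser):
    wait_time = between(1, 3)

    @task
    def get_users(self):
        self.client.get("/users")

class RampTo200(LoadTestShape):
    ramp_seconds = 600   # 10-minute ramp, per the request above
    peak_users = 200

    def tick(self):
        run_time = self.get_run_time()
        if run_time < self.ramp_seconds:
            users = int(self.peak_users * run_time / self.ramp_seconds)
            return (max(users, 1), 10)   # (target users, spawn rate)
        return (self.peak_users, 10)     # hold at peak until stopped
```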
Example 2: Stress Testing a Checkout Process
User request: "Design a stress test to find the breaking point of the checkout process."
The skill will:
- Design a stress test configuration with gradually increasing load on the checkout process.
- Execute the stress test, monitoring response times and error rates.
- Identify the point at which the checkout process fails and generate a report detailing the system's breaking point.
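One common way to realize "gradually increasing load" is a stepped profile: hold each load level briefly, then step up until error rates and latency climb, and read the breaking point off the collected metrics afterwards. A sketch under those assumptions, with stage sizes and the checkout route invented for illustration:

```python
# Stepped stress profile: each stage holds a higher user count.
# The breaking point is identified from the metrics afterwards.
from locust import HttpUser, LoadTestShape, task, between

class CheckoutUser(HttpUser):
    wait_time = between(1, 2)

    @task
    def checkout(self):
        # Hypothetical route and payload; substitute the real ones.
        self.client.post("/checkout", json={"cart_id": "demo"})

class SteppedStress(LoadTestShape):
    stages = [
        {"end": 120, "users": 50,  "spawn_rate": 10},
        {"end": 240, "users": 150, "spawn_rate": 10},
        {"end": 360, "users": 300, "spawn_rate": 20},
        {"end": 480, "users": 600, "spawn_rate": 20},
    ]

    def tick(self):
        run_time = self.get_run_time()
        for stage in self.stages:
            if run_time < stage["end"]:
                return (stage["users"], stage["spawn_rate"])
        return None  # stop after the final stage
```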
Best Practices
- Realistic Scenarios: Design tests that accurately reflect real-world usage patterns.
- Comprehensive Metrics: Monitor a wide range of performance metrics to gain a holistic view of system performance (a percentile sketch follows this list).
- Iterative Testing: Run multiple tests with different configurations to fine-tune performance and identify optimal settings.
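As a concrete example of comprehensive metrics, reports typically reduce raw response-time samples to percentiles rather than averages alone, since tail latency is where bottlenecks show first. A self-contained sketch follows; the field names are illustrative, not the plugin's actual output schema.

```python
# Reduce raw response-time samples to the percentiles most
# performance reports are built on, using the nearest-rank method.
import math
import statistics

def summarize(latencies_ms: list[float]) -> dict[str, float]:
    ordered = sorted(latencies_ms)
    n = len(ordered)

    def pct(p: float) -> float:
        rank = max(1, math.ceil(p / 100 * n))  # nearest-rank index
        return ordered[rank - 1]

    return {
        "mean_ms": statistics.fmean(ordered),
        "p50_ms": pct(50),
        "p95_ms": pct(95),
        "p99_ms": pct(99),
    }
```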
Integration
This skill integrates with other monitoring and alerting plugins to provide real-time feedback on system performance during testing. It can also be used in conjunction with deployment plugins to automatically validate performance after code changes.
Quick Install
/plugin add https://github.com/jeremylongshore/claude-code-plugins-plus/tree/main/performance-test-suite

Copy and paste this command in Claude Code to install this skill.
GitHub Repository
Related Skills
sglang
SGLang is a high-performance LLM serving framework that specializes in fast, structured generation for JSON, regex, and agentic workflows using its RadixAttention prefix caching. It delivers significantly faster inference, especially for tasks with repeated prefixes, making it ideal for complex, structured outputs and multi-turn conversations. Choose SGLang over alternatives like vLLM when you need constrained decoding or are building applications with extensive prefix sharing.
llamaguard
LlamaGuard is Meta's 7-8B parameter model for moderating LLM inputs and outputs across six safety categories like violence and hate speech. It offers 94-95% accuracy and can be deployed using vLLM, Hugging Face, or Amazon SageMaker. Use this skill to easily integrate content filtering and safety guardrails into your AI applications.
evaluating-llms-harness
This Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.
langchain
LangChain is a framework for building LLM applications using agents, chains, and RAG pipelines. It supports multiple LLM providers, offers 500+ integrations, and includes features like tool calling and memory management. Use it for rapid prototyping and deploying production systems like chatbots, autonomous agents, and question-answering services.
