
running-mutation-tests

jeremylongshore
Updated Today
Tags: Meta, ai, testing, design

About

This skill performs mutation testing to evaluate test suite quality: it introduces small code mutations and checks whether the tests detect them. It reports a mutation score and the mutants that survived, which reveals gaps in test coverage and effectiveness. It is triggered when a developer asks to assess test robustness using terms like "mutation testing" or "mutation score".

Documentation

Overview

This skill lets Claude run mutation testing to measure how effective a test suite is. It introduces small changes (mutations) into the code, runs the tests, and checks whether any test fails as a result. A mutation that no test catches points to a weakness in the test suite, which in turn shows where new or stronger tests would improve overall code quality.

How It Works

  1. Mutation Generation: The plugin automatically introduces mutations (e.g., changing + to -) into the code.
  2. Test Execution: The test suite is run against the mutated code.
  3. Result Analysis: The plugin analyzes which mutations were "killed" (detected by tests) and which "survived" (were not detected).
  4. Reporting: A mutation score is calculated, and surviving mutants are identified for further investigation.

When to Use This Skill

This skill activates when you need to:

  • Validate the effectiveness of a test suite.
  • Identify gaps in test coverage.
  • Improve the mutation score of a project.
  • Analyze surviving mutants to strengthen tests.

Examples

Example 1: Improving Test Coverage

User request: "Run mutation testing on the validator module and suggest improvements to the tests."

The skill will:

  1. Execute mutation tests on the validator module.
  2. Analyze the results and identify surviving mutants, indicating areas where tests are weak.
  3. Suggest specific improvements to the tests based on the surviving mutants, such as adding new test cases or modifying existing ones.
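
As a sketch of what step 3 might look like in practice, consider a hypothetical validator `is_adult` and a mutant that replaces `>=` with `>`. The function names and test suites below are assumptions made for illustration, not output from the skill.

```python
def is_adult(age):
    """Original validator (hypothetical)."""
    return age >= 18


def is_adult_mutant(age):
    """Mutant: `>=` replaced with `>`."""
    return age > 18


def weak_suite(validator):
    """Checks only values far from the boundary, so the mutant survives."""
    return validator(30) is True and validator(5) is False


def strengthened_suite(validator):
    """Adds the boundary case suggested by the surviving mutant."""
    return weak_suite(validator) and validator(18) is True


print(weak_suite(is_adult_mutant))          # mutant passes the weak suite
print(strengthened_suite(is_adult_mutant))  # mutant fails once 18 is tested
```

The surviving mutant points directly at the missing test: the boundary value 18 was never exercised.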

Example 2: Assessing Test Quality

User request: "What is the mutation score for the user authentication service?"

The skill will:

  1. Execute mutation tests on the user authentication service.
  2. Calculate the mutation score as the proportion of generated mutants that the tests killed.
  3. Report the mutation score to the user, providing a metric for test quality.
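
The score in step 2 is commonly defined as killed mutants divided by total mutants, expressed as a percentage. The sketch below assumes that definition; the source does not state the plugin's exact formula, and some tools additionally exclude equivalent or timed-out mutants. The function name `mutation_score` is hypothetical.

```python
def mutation_score(killed, survived):
    """Percentage of mutants detected by the tests (assumed definition)."""
    total = killed + survived
    if total == 0:
        raise ValueError("no mutants were generated")
    return 100.0 * killed / total


print(mutation_score(45, 5))  # 45 of 50 mutants killed -> 90.0
```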

Best Practices

  • Targeted Mutation: Focus mutation testing on critical modules or areas with high complexity.
  • Analyze Survivors: Prioritize the analysis of surviving mutants to identify the most impactful improvements to test coverage.
  • Iterative Improvement: Use mutation testing as part of an iterative process to continuously improve test suite quality.

Integration

This skill integrates well with other testing and code analysis tools. For example, it can be used in conjunction with code coverage tools to provide a more comprehensive view of test effectiveness.
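
One way to combine the two kinds of data, sketched under the assumption that both tools can report line numbers: lines that coverage marks as executed but that still host surviving mutants are run by the tests without being meaningfully checked. The function name and the sample data are hypothetical.

```python
def classify_survivors(covered_lines, surviving_mutant_lines):
    """Split surviving mutants by whether their line is executed by any test."""
    covered = set(covered_lines)
    survivors = set(surviving_mutant_lines)
    return {
        # Executed but not asserted on: tests need stronger checks.
        "covered_but_unchecked": sorted(survivors & covered),
        # Never executed: tests are missing entirely.
        "uncovered": sorted(survivors - covered),
    }


report = classify_survivors(
    covered_lines=[10, 11, 12, 20],
    surviving_mutant_lines=[11, 20, 31],
)
print(report)
```

The first group is what coverage reports alone cannot reveal, which is why the two tools complement each other.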

Quick Install

/plugin add https://github.com/jeremylongshore/claude-code-plugins-plus/tree/main/mutation-test-runner

Copy and paste this command into Claude Code to install this skill.

GitHub Repository

jeremylongshore/claude-code-plugins-plus
Path: backups/skills-migration-20251108-070147/plugins/testing/mutation-test-runner/skills/mutation-test-runner
Topics: ai, automation, claude-code, devops, marketplace, mcp
