usability-testing

Name: usability-testing
Author: rampstackco

Updated 1 month ago

Category: Other. Tags: testing, design.

About

This skill helps developers plan and run usability tests on mockups or prototypes to identify problems before launch. It covers test design, scenario writing, moderation, and synthesis of results from moderated and unmoderated tests. Use it to validate designs, improve task completion, and make sure real users can successfully interact with what you build.

Quick install

Claude Code (recommended)

Primary

npx skills add rampstackco/claude-skills -a claude-code

Plugin command (alternative)

/plugin add https://github.com/rampstackco/claude-skills

Git clone (alternative)

git clone https://github.com/rampstackco/claude-skills.git ~/.claude/skills/usability-testing

Copy and paste this command into Claude Code to install this skill.

Documentation

Usability Testing

Plan and run tests that find usability problems before users hit them in production. Stack-agnostic. Tool-agnostic.

This skill is for testing existing designs or prototypes. For broader discovery research, use ux-research. For conversion testing in production, use cro-optimization.

When to use

Before launching a new flow or major redesign
After a redesign to verify it doesn't introduce new problems
When analytics show drop-off but you don't know why
When customer support tickets cluster around specific UI areas
Pre-launch user validation
Comparing two design directions

When NOT to use

Discovery / generative research (use ux-research)
Live conversion optimization (use cro-optimization)
Mapping the broader experience (use journey-mapping)
Pure quantitative measurement (use analytics-strategy)

Required inputs

The design or prototype to test (functional or near-functional)
Specific tasks users would do
The audience (who should be tested)
Testing infrastructure (moderated tool, unmoderated tool, in-person setup)

The framework: 5 phases

1. Define what to test

Don't test the whole product. Test specific tasks.

Task selection criteria:

The task represents a real user goal (not "click around and explore")
The task has a clear start and end
The task is achievable in 2 to 10 minutes
The task is one of: most common, most strategic, most problematic

Examples of testable tasks:

"You want to find a contractor near you who can install a fence. Show me how you'd do that on this site."

"You're a first-time visitor. You want to understand if this product fits your needs. Walk me through how you'd evaluate it."

"Your team needs a new tool to manage projects. Use this site to figure out which plan is right for a 12-person team."

Task framing rules:

State the user goal, not the system action ("find a place to stay" not "click the search button")
Provide context (why are you doing this?)
Don't reveal the path
Don't use product terminology in the task framing
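The framing rules above can be partly checked before a session. The following is a minimal sketch, not a complete linter: the word list `PATH_REVEALING_WORDS` and the function `lint_task` are hypothetical names introduced here, and the set of product terms has to come from your own product glossary. It only flags wording that may reveal the path or leak product terminology; it cannot tell whether the task states a real user goal or gives enough context.

```python
import re

# Hypothetical word list: verbs and UI nouns that tend to reveal the path.
PATH_REVEALING_WORDS = {"click", "tap", "press", "button", "menu", "dropdown", "tab", "link"}


def lint_task(task: str, product_terms: set[str]) -> list[str]:
    """Return warnings for one task prompt; an empty list means nothing was flagged."""
    lowered = task.lower()
    words = set(re.findall(r"[a-z']+", lowered))
    warnings = []

    # Whole-word match, so "stay" does not trigger on "tab".
    revealed = sorted(words & PATH_REVEALING_WORDS)
    if revealed:
        warnings.append(f"may reveal the path: {', '.join(revealed)}")

    # Substring match, so multi-word product terms are caught too.
    jargon = sorted(term for term in product_terms if term.lower() in lowered)
    if jargon:
        warnings.append(f"uses product terminology: {', '.join(jargon)}")

    return warnings


if __name__ == "__main__":
    glossary = {"SmartSearch"}  # hypothetical product term
    print(lint_task("Click the search button to find a place to stay.", glossary))
    print(lint_task("You want to find a place to stay next weekend.", glossary))
```

A flagged task is a prompt to reread the wording, not proof that it is wrong; a task about choosing a restaurant menu would trip the same check.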

2. Choose moderated or unmoderated

Moderated (live, with researcher):

Researcher observes and probes in real time
Best for early-stage prototypes, complex tasks, novel concepts
Higher cost, smaller sample (5 to 8 participants typical)
Catches surprises and lets the researcher probe deeper

Unmoderated (recorded, asynchronous):

Participant completes alone, often via tool (UserTesting, Maze, Lookback)
Best for stable designs, simple tasks, larger sample
Lower cost, larger sample (15 to 30 participants typical)
Catches patterns at scale, less depth per session

For most teams: moderated for early/critical decisions, unmoderated for ongoing validation.

3. Recruit

Recruit from the target audience, not from whoever is convenient.

Recruit criteria:

Match real users (target audience, not just "anyone")
Mix of experience levels with the product (new and existing if applicable)
Mix of relevant device types (mobile, desktop, tablet if relevant)
Exclude friends, family, employees

Sample size:

Moderated: 5 to 8 participants (Nielsen's "5 users find 85% of usability issues" for the most common segment)
Unmoderated: 15 to 30 participants (more participants compensate for less probing)
Multi-segment testing: 5 to 8 per segment
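The "5 users find 85%" figure comes from a simple model attributed to Nielsen and Landauer: if each participant has probability L of running into a given problem, n participants are expected to surface a share of 1 - (1 - L)^n of the problems. The sketch below assumes Nielsen's reported average of L = 0.31; the function name and default are illustrative, and the real hit rate for your product and tasks may differ.

```python
def share_of_issues_found(participants: int, hit_rate: float = 0.31) -> float:
    """Expected share of usability problems seen at least once.

    hit_rate is the chance that one participant runs into a given problem.
    0.31 is the average Nielsen reports; treat it as an assumption, not a constant.
    """
    if participants < 0 or not 0 <= hit_rate <= 1:
        raise ValueError("participants must be >= 0 and hit_rate within [0, 1]")
    return 1 - (1 - hit_rate) ** participants


if __name__ == "__main__":
    for n in (1, 3, 5, 8, 15):
        print(f"{n:>2} participants: {share_of_issues_found(n):.0%}")
```

With the default hit rate, five participants come out at about 84%, which is where the rounded 85% comes from. The model assumes every problem is equally easy to hit: at a hit rate of 0.10, five participants surface only about 41%, which is one reason to test each segment separately and to re-test after fixes.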

4. Run the test

Pre-task setup:

Confirm recording works
Brief participant (purpose, anonymity, recording, "no wrong answers")
Get verbal consent
Have participant share screen if remote

Moderated session structure:

Warm-up (2 to 3 min). Easy questions to put participant at ease.
Pre-test questions (3 to 5 min). Background context, current behavior with similar products.
Task 1 (5 to 10 min). Describe task. Have participant attempt while thinking aloud.
Post-task questions (1 to 2 min). What was easy/hard? Anything confusing?
Repeat for tasks 2, 3, 4 (typically 3 to 5 tasks per 60-minute session).
Overall debrief (5 to 10 min). General reactions, comparisons to alternatives, anything else.
Close (2 min).
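The time ranges above can be added up to check whether a planned number of tasks fits the session length. This is a rough sketch that assumes every task is followed by its own post-task questions; the names `FIXED_BLOCKS`, `PER_TASK_BLOCKS`, and `session_minutes` are illustrative.

```python
# (label, min_minutes, max_minutes), taken from the session structure above.
FIXED_BLOCKS = [
    ("warm-up", 2, 3),
    ("pre-test questions", 3, 5),
    ("overall debrief", 5, 10),
    ("close", 2, 2),
]
PER_TASK_BLOCKS = [
    ("task", 5, 10),
    ("post-task questions", 1, 2),
]


def session_minutes(task_count: int) -> tuple[int, int]:
    """Return the (shortest, longest) expected session length in minutes."""
    shortest = sum(low for _, low, _ in FIXED_BLOCKS)
    longest = sum(high for _, _, high in FIXED_BLOCKS)
    shortest += task_count * sum(low for _, low, _ in PER_TASK_BLOCKS)
    longest += task_count * sum(high for _, _, high in PER_TASK_BLOCKS)
    return shortest, longest


if __name__ == "__main__":
    for tasks in (3, 4, 5):
        low, high = session_minutes(tasks)
        print(f"{tasks} tasks: {low} to {high} minutes")
```

Under these assumptions three tasks take 30 to 56 minutes, while five tasks take 42 to 80, so a five-task script only fits a 60-minute slot if the tasks sit near the short end of their range. A pilot session is the practical way to find out.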

Moderation principles:

Encourage think-aloud ("What's going through your mind?")
Don't help unless they're truly stuck (and even then, only after a long pause)
Don't lead ("Are you looking for the menu?" - bad)
Note where they hesitate, scroll, or backtrack
Note their language vs the product's language
Note emotional reactions

Anti-patterns:

Talking too much (researcher should talk maybe 20% of the time)
Defending the design when participants struggle
Helping prematurely
Asking participants to predict their future behavior
Treating participant suggestions as features ("Users want X" - test demand for X separately)

5. Synthesize and report

Patterns across participants are signal. Single-participant complaints are weaker (but worth investigating).

Synthesis steps:

Issue inventory. Every issue observed, with which participant, which task, severity.
Cluster. Issues that are the same root problem.
Severity.
- Critical: Blocks task completion. Most users hit this.
- Major: Significantly slows task. Many users hit this.
- Minor: Friction. Some users hit this. Workaround exists.
- Cosmetic: Polish. Doesn't affect task.
Recommendations. For each issue, propose specific fixes.
Prioritize. By severity and effort.
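The synthesis steps can be kept in a small data structure instead of a spreadsheet. The sketch below is one possible shape, not a prescribed format: the `Observation` fields, the `effort_days` estimate, and the rule of taking the worst severity seen in a cluster are all assumptions made for illustration. The researcher still assigns the root problem and severity by hand; the code only groups, counts, and sorts.

```python
from collections import defaultdict
from dataclasses import dataclass

# Lower rank sorts first. Labels follow the four severity levels above.
SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2, "cosmetic": 3}


@dataclass(frozen=True)
class Observation:
    participant: str
    task: str
    root_problem: str  # cluster label the researcher assigns (hypothetical field)
    severity: str      # one of the keys in SEVERITY_RANK


def prioritize(observations, effort_days, total_participants):
    """Cluster observations by root problem, then sort by severity and effort."""
    who_hit = defaultdict(set)
    worst = {}
    for obs in observations:
        who_hit[obs.root_problem].add(obs.participant)
        known = worst.get(obs.root_problem, obs.severity)
        # Keep the most severe rating seen for this cluster (an assumption).
        worst[obs.root_problem] = min(known, obs.severity, key=SEVERITY_RANK.__getitem__)

    rows = [
        {
            "problem": problem,
            "severity": worst[problem],
            "frequency": f"{len(people)}/{total_participants}",
            "effort_days": effort_days[problem],
        }
        for problem, people in who_hit.items()
    ]
    rows.sort(key=lambda row: (SEVERITY_RANK[row["severity"]], row["effort_days"]))
    return rows
```

Counting distinct participants per cluster keeps one vocal participant who hits the same problem in three tasks from looking like three independent findings.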

Report structure:

# Usability Test: [Design / flow]

## Summary
[2 to 3 paragraphs covering: what was tested, headline findings, top 3 priorities]

## Method
[Moderated/unmoderated, sample size, audience, dates, tasks]

## Critical findings
[Each with description, frequency, supporting evidence (quotes/clips), recommendation]

## Major findings
[Same structure]

## Minor findings
[Brief]

## Cosmetic findings
[Briefest]

## What worked well
[Calibration: capture successes too]

## Recommendations
[Prioritized list with effort estimates]

## Next steps
[Test re-run schedule, design iteration plan]

Workflow

Define the goals. What decisions hinge on this? What tasks matter most?
Design tasks. 3 to 5 specific, realistic, goal-framed tasks.
Choose moderated vs unmoderated. Match to stage and depth needed.
Recruit. Specific to audience.
Pilot. 1 to 2 sessions before main batch. Refine tasks if needed.
Run. Follow the protocol. Stay disciplined.
Synthesize during, not just after. Patterns emerge by session 4 or 5.
Report. Multiple formats - written report + highlight clips.
Track fixes. Every critical issue should have an owner and date.
Re-test after fixes. Verify the fix worked and didn't introduce new issues.

Failure patterns

Testing the whole product instead of specific tasks. Vague results.
Tasks that reveal the path. ("Click the menu and find...")
Friends and family as participants. Biased, not representative.
Researcher leading the participant. Findings reflect the researcher.
Defending the design when participants struggle. Misses real issues.
Helping too quickly. Participant doesn't experience the friction.
Treating participant suggestions as features. Users solve their problem; product team designs the solution.
Treating one participant as a finding. A single strong opinion is a data point, not a pattern.
Skipping severity scoring. All findings treated equally; team can't prioritize.
Reports no one reads. Highlight clips and live walkthroughs work better than 80-page decks.
Testing once, never re-testing. Fixes that introduce new problems go undetected.

Output format

Default outputs:

Test plan (before testing) - usability-test-plan-[topic].md
Task script (per session) - usability-tasks-[topic].md
Findings report (after synthesis) - usability-findings-[topic].md
Highlight clips (separately produced)
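The naming pattern above can be generated from a topic string so that all three files share one slug. This is a small sketch; `slugify` and `output_filenames` are hypothetical helpers, and the lowercase, hyphen-separated slug is an assumption, since the skill only specifies the `[topic]` placeholder.

```python
import re


def slugify(topic: str) -> str:
    """Lowercase the topic and collapse anything non-alphanumeric into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")


def output_filenames(topic: str) -> dict[str, str]:
    """Return the default output filenames for one test topic."""
    slug = slugify(topic)
    return {
        "test_plan": f"usability-test-plan-{slug}.md",
        "task_script": f"usability-tasks-{slug}.md",
        "findings_report": f"usability-findings-{slug}.md",
    }


if __name__ == "__main__":
    for kind, name in output_filenames("Checkout Flow v2").items():
        print(f"{kind}: {name}")
```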

Reference files

references/task-script-patterns.md - Task framing patterns by common product type, with good and bad examples.

GitHub repository

rampstackco/claude-skills

Path: skills/usability-testing

Tags: agent-skills, ai-agents, anthropic, claude, claude-ai, claude-code

FAQ

What is the usability-testing skill?

usability-testing is a Claude Skill by rampstackco. Skills package instructions and resources that Claude loads on demand, so Claude can perform usability-testing-related tasks without extra prompting.

How do I install usability-testing?

Use the install commands on this page: add usability-testing to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does usability-testing belong to?

usability-testing is in the research category, tagged testing and design.

Is usability-testing free to use?

Yes. usability-testing is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.
