
webapp-testing

bobmatnyc

About

This Claude Skill provides automated webapp testing using Playwright with a reconnaissance-first approach. It focuses on verifying server state and page load before testing, and includes capabilities for server management, UI testing, and visual debugging. Use it when testing web applications with Playwright for server verification, frontend debugging, and comprehensive UI testing.

Documentation

Webapp Testing

Overview

Core Principle: Reconnaissance Before Action

This skill automates webapp testing with Playwright and verifies system state (server status, page load, element presence) before taking any action. Tests written this way are reliable and debuggable, and they fail for clear reasons.

Key capabilities:

  • Automated browser testing with Playwright
  • Server lifecycle management
  • Visual reconnaissance (screenshots, DOM inspection)
  • Network monitoring and debugging

When to Use This Skill

  • Web application testing - UI behavior, forms, navigation, integration testing
  • Frontend debugging - Screenshots, DOM inspection, console monitoring
  • Regression testing - Ensure changes don't break existing functionality
  • Server verification - Check servers are running and responding

Not suitable for: Unit testing (use Jest/pytest), load testing, or API-only testing.

The Iron Law

RECONNAISSANCE BEFORE ACTION

Never execute test actions until you have completed these steps:

  1. Verify server state - lsof -i :PORT and curl checks
  2. Wait for page ready - page.wait_for_load_state('networkidle')
  3. Visual confirmation - Screenshot before actions
  4. Read complete output - Examine full results before claiming success

Why: Tests fail mysteriously when servers aren't ready, selectors break when DOM is still building, and 5 seconds of reconnaissance saves 30 minutes of debugging.

Quick Start

Step 1: Verify Server State

lsof -i :3000 -sTCP:LISTEN  # Check server listening
curl -f http://localhost:3000/health  # Test response
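
The same two checks can be run from inside a Python test script. The following is a minimal sketch using only the standard library; the helper names port_is_listening and http_ok are hypothetical and not part of this skill, and the /health path is simply the one used in the shell example above.

```python
import http.client
import socket
import urllib.error
import urllib.request


def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if GET url answers with a 2xx status (roughly `curl -f`)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, timeouts, and 4xx/5xx responses.
        return False
```

A test could call port_is_listening("localhost", 3000) and http_ok("http://localhost:3000/health") first and stop early with a clear message if either returns False.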

Step 2: Start Server (If Needed)

# Single server
python scripts/with_server.py --server "npm run dev" --port 5173 -- python test.py

# Multiple servers (backend + frontend)
python scripts/with_server.py \
  --server "cd backend && python server.py" --port 3000 \
  --server "cd frontend && npm run dev" --port 5173 \
  -- python test.py

Step 3: Write Test with Reconnaissance

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()

    # 1. Navigate and wait
    page.goto('http://localhost:5173')
    page.wait_for_load_state('networkidle')  # CRITICAL

    # 2. Reconnaissance
    page.screenshot(path='/tmp/before.png', full_page=True)
    buttons = page.locator('button').all()
    print(f"Found {len(buttons)} buttons")

    # 3. Execute
    page.click('button.submit')

    # 4. Verify
    page.wait_for_selector('.success-message')
    page.screenshot(path='/tmp/after.png', full_page=True)

    browser.close()

Step 4: Verify Results

Review console output, check for errors, verify state changes, examine screenshots.

Key Patterns

Server Management - Check → Start → Wait → Test → Cleanup

  • Use with_server.py for automatic lifecycle management
  • Check status with lsof, test with curl
  • Automatic cleanup on exit
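
The Check → Start → Wait → Test → Cleanup cycle can be illustrated with a small context manager. This is only a sketch of the idea and not the implementation of with_server.py, whose internals are not shown here; managed_server and wait_for_port are hypothetical names, and the 30-second timeout is an assumed default.

```python
import contextlib
import socket
import subprocess
import time


def wait_for_port(port: int, host: str = "localhost", timeout: float = 30.0) -> None:
    """Block until host:port accepts TCP connections, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return
        except OSError:
            time.sleep(0.2)  # Server not up yet; retry shortly.
    raise TimeoutError(f"nothing listening on {host}:{port} after {timeout}s")


@contextlib.contextmanager
def managed_server(command: list[str], port: int, timeout: float = 30.0):
    """Start `command`, wait for `port`, and always stop the process on exit."""
    process = subprocess.Popen(command)
    try:
        wait_for_port(port, timeout=timeout)
        yield process
    finally:
        # Cleanup runs even if startup or the test body raised.
        process.terminate()
        try:
            process.wait(timeout=5)
        except subprocess.TimeoutExpired:
            process.kill()
            process.wait()
```

Usage would look like `with managed_server(["npm", "run", "dev"], 5173): ...`, with the test body inside the block. The bundled with_server.py remains the supported path; the sketch only makes the lifecycle explicit.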

Reconnaissance - Inspect → Understand → Act → Verify

  • Screenshot current state
  • Inspect DOM for elements
  • Act on discovered selectors
  • Verify results visually
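
The inspect step can be sketched as a helper that takes a screenshot and counts common interactive elements before any action is attempted. The function name reconnoiter and the set of CSS selectors are assumptions chosen for illustration; page is expected to be a Playwright sync Page that has already been navigated.

```python
def reconnoiter(page, screenshot_path: str) -> dict[str, int]:
    """Screenshot the page and count common interactive elements."""
    page.screenshot(path=screenshot_path, full_page=True)
    selectors = {
        "buttons": "button",
        "links": "a[href]",
        "inputs": "input, textarea, select",
        "forms": "form",
    }
    # locator(...).count() reports how many elements currently match.
    return {name: page.locator(css).count() for name, css in selectors.items()}
```

Printing the returned dictionary next to the screenshot path gives a quick picture of what is actually on the page, which helps when choosing selectors for the act step.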

Wait Strategy - Load → Idle → Element → Action

  • Always wait for networkidle on dynamic apps
  • Wait for specific elements before interaction
  • Playwright auto-waits for elements to be actionable, but explicit waits guard against race conditions on content that loads later
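
The Load → Idle → Element → Action sequence might be wrapped in one helper, as in the sketch below. The name click_when_ready and the 10-second default timeout are assumptions; page is a Playwright sync Page.

```python
def click_when_ready(page, selector: str, timeout_ms: float = 10_000) -> None:
    """Wait for load, network idle, and a visible element, then click it."""
    page.wait_for_load_state("load")         # Load
    page.wait_for_load_state("networkidle")  # Idle
    target = page.locator(selector)
    target.wait_for(state="visible", timeout=timeout_ms)  # Element
    target.click()                           # Action
```

If the element never becomes visible, wait_for raises a timeout error that names the selector, which is easier to diagnose than a failure in a later step.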

Selector Priority - data-testid > role > text > CSS > XPath

  • [data-testid="submit"] - most stable
  • role=button[name="Submit"] - semantic
  • text=Submit - readable
  • button.submit - acceptable
  • XPath - last resort
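
One way to apply this priority is to build candidate selectors in order and use the first one that matches something on the page. The sketch below assumes hypothetical helper names (candidate_selectors, first_match) and leaves XPath out, since it is the last resort.

```python
from typing import Optional


def candidate_selectors(
    test_id: Optional[str] = None,
    role: Optional[str] = None,
    name: Optional[str] = None,
    text: Optional[str] = None,
    css: Optional[str] = None,
) -> list[str]:
    """Return selector strings ordered data-testid > role > text > CSS."""
    candidates = []
    if test_id:
        candidates.append(f'[data-testid="{test_id}"]')
    if role:
        suffix = f'[name="{name}"]' if name else ""
        candidates.append(f"role={role}{suffix}")
    if text:
        candidates.append(f"text={text}")
    if css:
        candidates.append(css)
    return candidates


def first_match(page, candidates: list[str]) -> Optional[str]:
    """Return the highest-priority selector that matches at least one element."""
    for selector in candidates:
        if page.locator(selector).count() > 0:
            return selector
    return None
```

Logging which selector won makes it visible when a test has silently fallen back from data-testid to a more brittle CSS selector.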

Common Pitfalls

  • ❌ Testing without server verification - Always check lsof and curl first
  • ❌ Ignoring timeout errors - TimeoutError means something is wrong, investigate
  • ❌ Not waiting for networkidle - Dynamic apps need full page load
  • ❌ Poor selector strategies - Use data-testid for stability
  • ❌ Missing network verification - Check API responses complete
  • ❌ Incomplete cleanup - Close browsers, stop servers properly

Reference Documentation

playwright-patterns.md - Complete Playwright reference: selectors, waits, interactions, assertions, test organization, network interception, screenshots, debugging

server-management.md - Server lifecycle and operations: with_server.py usage, manual management, port management, process control, environment config, health checks

reconnaissance-pattern.md - Philosophy and practice: why reconnaissance first, complete process, server checks, network diagnostics, DOM inspection, log analysis

decision-tree.md - Flowcharts for every scenario: new test decisions, server state paths, test failure diagnosis, debugging flows, selector/wait strategies

troubleshooting.md - Solutions to common problems: timeout issues, selector problems, server crashes, network errors, environment config, debugging workflow

Examples and Scripts

Examples (examples/ directory):

  • element_discovery.py - Discovering page elements
  • static_html_automation.py - Testing local HTML files
  • console_logging.py - Capturing console output
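
Console capture of the kind listed above could look roughly like the following. This is an illustrative sketch and not the contents of console_logging.py; the helper names are hypothetical, and it relies on Playwright's page.on("console", ...) event, whose message objects expose type and text.

```python
def capture_console(page) -> list[tuple[str, str]]:
    """Record (type, text) for every console message the page emits from now on."""
    messages: list[tuple[str, str]] = []
    page.on("console", lambda msg: messages.append((msg.type, msg.text)))
    return messages


def console_errors(messages: list[tuple[str, str]]) -> list[str]:
    """Return the text of messages logged at error level."""
    return [text for kind, text in messages if kind == "error"]
```

Attach the capture before page.goto so messages logged during page load are not missed, then check console_errors(messages) as part of the verify step.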

Scripts (scripts/ directory):

  • with_server.py - Server lifecycle management (run with --help first)

Integration with Other Skills

  • Mandatory: verification-before-completion
  • Recommended: systematic-debugging, test-driven-development
  • Related: playwright-testing, selenium-automation

Bottom Line

  1. Reconnaissance always comes first - Verify before acting
  2. Never skip server checks - 5 seconds saves 30 minutes
  3. Wait for networkidle - Dynamic apps need time
  4. Read complete output - Verify before claiming success
  5. Screenshot everything - Visual evidence is invaluable

The reconnaissance-then-action pattern is not optional - it's the foundation of reliable webapp testing.

Quick Install

/plugin add https://github.com/bobmatnyc/claude-mpm/tree/main/webapp-testing

Copy and paste this command in Claude Code to install this skill

GitHub Repository

bobmatnyc/claude-mpm
Path: src/claude_mpm/skills/bundled/testing/webapp-testing
