MCP Hub

testability-scoring

proffesor-for-testing
Updated yesterday
Category: Other · Tags: testability, scoring, playwright, vibium, assessment, 10-principles, intrinsic-testability, james-bach, michael-bolton

About

This skill provides AI-powered testability assessment for web applications using Playwright, with optional Vibium integration. It scores applications against 10 intrinsic testability principles, such as observability and controllability, to identify areas for improvement. Use it to evaluate software testability, gauge test readiness, or generate testability reports.

Quick Installation

Claude Code

Recommended (primary):
npx skills add proffesor-for-testing/agentic-qe

Plugin command (alternative):
/plugin add https://github.com/proffesor-for-testing/agentic-qe

Git clone (alternative):
git clone https://github.com/proffesor-for-testing/agentic-qe.git ~/.claude/skills/testability-scoring

Copy one of these commands and paste it into Claude Code to install this skill.

Documentation

Testability Scoring

<default_to_action> When assessing testability:

  1. RUN assessment against target URL
  2. ANALYZE all 10 principles automatically
  3. GENERATE HTML report with radar chart
  4. PRIORITIZE improvements by impact/effort
  5. INTEGRATE with QX Partner for holistic view

Quick Assessment:

# Run assessment on any URL
TEST_URL='https://example.com/' npx playwright test tests/testability-scoring/testability-scoring.spec.js --project=chromium --workers=1

# Or use shell script wrapper
.claude/skills/testability-scoring/scripts/run-assessment.sh https://example.com/

The 10 Principles at a Glance:

| Principle | Weight | Key Question |
|---|---|---|
| Observability | 15% | Can we see what's happening? |
| Controllability | 15% | Can we control the application? |
| Algorithmic Simplicity | 10% | Are behaviors predictable? |
| Algorithmic Transparency | 10% | Can we understand what it does? |
| Algorithmic Stability | 10% | Does behavior remain consistent? |
| Explainability | 10% | Is the interface understandable? |
| Unbugginess | 10% | How error-free is it? |
| Smallness | 10% | Are components appropriately sized? |
| Decomposability | 5% | Can we test parts in isolation? |
| Similarity | 5% | Is the tech stack familiar? |

Grade Scale:

  • A (90-100): Excellent testability
  • B (80-89): Good testability
  • C (70-79): Adequate testability
  • D (60-69): Below average
  • F (0-59): Poor testability </default_to_action>
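The weight table and grade scale above can be captured in a small scoring helper. This is an illustrative sketch, not the skill's actual implementation; the property names on the `scores` object are assumptions.

```javascript
// Illustrative sketch only -- not the skill's real scoring code.
// Weights mirror the principles table above; score keys are assumed names.
const WEIGHTS = {
  observability: 0.15,
  controllability: 0.15,
  algorithmicSimplicity: 0.10,
  algorithmicTransparency: 0.10,
  algorithmicStability: 0.10,
  explainability: 0.10,
  unbugginess: 0.10,
  smallness: 0.10,
  decomposability: 0.05,
  similarity: 0.05,
};

// Weighted sum of per-principle scores (each 0-100).
function overallScore(scores) {
  return Object.entries(WEIGHTS).reduce(
    (sum, [principle, weight]) => sum + (scores[principle] ?? 0) * weight,
    0
  );
}

// Map a 0-100 score onto the A-F grade scale above.
function toGrade(score) {
  if (score >= 90) return 'A';
  if (score >= 80) return 'B';
  if (score >= 70) return 'C';
  if (score >= 60) return 'D';
  return 'F';
}
```

Note that the weights sum to 1.0, so a perfect score on every principle yields exactly 100.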

Quick Reference Card

Running Assessments

| Method | Command | When to Use |
|---|---|---|
| Shell Script | `./scripts/run-assessment.sh URL` | One-time assessment |
| ENV Override | `TEST_URL='URL' npx playwright test...` | CI/CD integration |
| Config File | Update `tests/testability-scoring/config.js` | Repeated runs |

Principle Details

High Weight (15% each)

| Principle | Measures | Indicators |
|---|---|---|
| Observability | State visibility, logging, monitoring | Console output, network tracking, error visibility |
| Controllability | Input control, state manipulation | API access, test data injection, determinism |

Medium Weight (10% each)

| Principle | Measures | Indicators |
|---|---|---|
| Simplicity | Predictable behavior | Clear I/O relationships, low complexity |
| Transparency | Understanding what the system does | Visible processes, readable code |
| Stability | Consistent behavior | Change resilience, maintainability |
| Explainability | Interface understanding | Good docs, semantic structure, help text |
| Unbugginess | Error-free operation | Console errors, warnings, runtime issues |
| Smallness | Component size | Element count, script bloat, page complexity |

Low Weight (5% each)

| Principle | Measures | Indicators |
|---|---|---|
| Decomposability | Isolation testing | Component separation, modular design |
| Similarity | Technology familiarity | Standard frameworks, known patterns |

Assessment Workflow

1. Navigate to URL → 2. Collect Metrics → 3. Score Principles
                                              ↓
6. Generate JSON ← 5. Calculate Grades ← 4. Apply Weights
         ↓
7. Generate HTML Report with Radar Chart
         ↓
8. Open in Browser (auto-opens)

Output Files

tests/reports/
├── testability-results-<timestamp>.json  # Raw data
├── testability-report-<timestamp>.html   # Visual report
└── latest.json                           # Symlink
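A minimal sketch of consuming the saved JSON, assuming the payload carries the `overallScore`, `grade`, and `recommendations` fields shown in the Programmatic Usage example; adjust to the real schema:

```javascript
// Illustrative sketch: summarize a saved results payload.
// Field names (overallScore, grade, recommendations) are assumptions.
function summarize(rawJson) {
  const r = JSON.parse(rawJson);
  return `Overall: ${r.overallScore}/100 (${r.grade}), ` +
         `${r.recommendations.length} recommendation(s)`;
}

// Typical usage (Node):
// summarize(fs.readFileSync('tests/reports/latest.json', 'utf8'))
```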

Integration Examples

CI/CD Integration

# GitHub Actions
- name: Testability Assessment
  run: |
    timeout 180 .claude/skills/testability-scoring/scripts/run-assessment.sh ${{ env.APP_URL }}

- name: Upload Reports
  uses: actions/upload-artifact@v3
  with:
    name: testability-reports
    path: tests/reports/testability-*.html
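Beyond uploading reports, a pipeline can fail on low scores. The sketch below is one possible quality gate, not part of the skill; the shape of the results object is an assumption, and in a real pipeline you would feed it `latest.json`:

```javascript
// Illustrative CI gate sketch: fail the build below a minimum score.
// The results shape is an assumption; wire it to latest.json in a real pipeline.
function gate(results, minScore = 70) {
  const pass = results.overallScore >= minScore;
  return {
    pass,
    message: `Testability ${results.grade} (${results.overallScore}/100), ` +
             `threshold ${minScore}`,
  };
}
```

A runner script would call `process.exit(1)` when `gate(...).pass` is false.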

QX Partner Integration

// Combine testability with QX analysis
const qxAnalysis = await Task("QX Analysis", {
  target: 'https://example.com',
  integrateTestability: true
}, "qx-partner");

// Returns combined insights:
// - QX Score: 78/100
// - Testability Integration: Observability 72/100
// - Combined Insight: Low observability may mask UX issues

Programmatic Usage

import { runTestabilityAssessment } from './testability';

const results = await runTestabilityAssessment('https://example.com');
console.log(`Overall: ${results.overallScore}/100 (${results.grade})`);
console.log('Recommendations:', results.recommendations);

Agent Integration

// Run testability assessment
const assessment = await Task("Testability Assessment", {
  url: 'https://example.com',
  generateReport: true,
  openBrowser: true
}, "qe-quality-analyzer");

// Use with QX Partner for holistic analysis
const qxReport = await Task("Full QX Analysis", {
  target: 'https://example.com',
  integrateTestability: true,
  detectOracleProblems: true
}, "qx-partner");

Vibium Integration (Optional)

Overview

Vibium browser automation can be used alongside Playwright for enhanced testability assessment. While Playwright remains the primary engine, Vibium offers complementary capabilities for certain metrics.

Installation:

claude mcp add vibium -- npx -y vibium

Vibium-Enhanced Metrics

| Principle | Vibium Enhancement | Benefit |
|---|---|---|
| Observability | Auto-wait duration tracking | Measures DOM stability (30s timeout, 100ms polling) |
| Controllability | Element interaction success rate | Validates automation readiness via MCP |
| Stability | Screenshot consistency | Visual regression detection for layout stability |
| Explainability | Element attribute extraction | ARIA labels, semantic HTML validation |

When to Use Vibium

USE Vibium for:

  • Element stability metrics (auto-wait duration analysis)
  • Visual consistency checks (screenshot comparison)
  • MCP-native AI agent integration
  • Lightweight Docker images (400MB vs 1.2GB)

USE Playwright for:

  • Console error detection (Vibium V1 lacks console API)
  • Network performance metrics (BiDi network APIs coming in V2)
  • Comprehensive browser coverage (Firefox, Safari)
  • Production-proven stability (Vibium V1 released Dec 2024)

Hybrid Assessment Example

// Testability assessment using both engines
const assessment = {
  // Playwright: Comprehensive metrics
  playwright: await runPlaywrightAssessment(url),

  // Vibium: Stability metrics
  vibium: {
    elementStability: await measureAutoWaitDuration(url),
    visualConsistency: await compareScreenshots(url),
    accessibilityAttributes: await extractARIALabels(url)
  }
};

// Enhanced Observability Score
const observability =
  (assessment.playwright.consoleErrors * 0.6) +
  (assessment.vibium.elementStability * 0.4);

Vibium MCP Tools for Testability

// 1. Element Stability Measurement
const browser = await browser_launch();
await browser_navigate({ url });
const startTime = Date.now();
const element = await browser_find({ selector: ".critical-element" });
const autoWaitDuration = Date.now() - startTime;
// Lower duration = better stability

// 2. Visual Consistency Check
const screenshot1 = await browser_screenshot();
await browser_navigate({ url }); // Reload
const screenshot2 = await browser_screenshot();
const visualDiff = compareImages(screenshot1.png, screenshot2.png);
// Lower diff = better stability

// 3. Accessibility Attribute Extraction
const elements = await browser_find({ selector: "button, a, input" });
const ariaLabels = elements.map(el => el.attributes["aria-label"]);
const semanticScore = (ariaLabels.filter(Boolean).length / elements.length) * 100;
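The `compareImages` helper used in the visual consistency check above is not a Vibium API. A naive stand-in is sketched below, assuming two equal-sized raw byte arrays of pixel data; a real implementation would decode the PNGs and use a perceptual diff library:

```javascript
// Naive stand-in for compareImages -- not a Vibium API.
// Returns the fraction of differing bytes (0 = identical, 1 = fully different).
// Assumes raw, equal-sized pixel buffers rather than encoded PNGs.
function compareImages(bufA, bufB) {
  if (bufA.length !== bufB.length) return 1; // size change counts as a full diff
  if (bufA.length === 0) return 0;
  let differing = 0;
  for (let i = 0; i < bufA.length; i++) {
    if (bufA[i] !== bufB[i]) differing++;
  }
  return differing / bufA.length;
}
```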

Migration Strategy

Current (V2.2): Hybrid approach

  • Playwright: Primary engine for all 10 principles
  • Vibium: Optional enhancement for stability metrics

Future (V3.0): When Vibium V2 ships

  • Evaluate Vibium as primary engine if:
    • Console/Network APIs available
    • Production stability proven
    • Community adoption increases

Agent Coordination Hints

Memory Namespace

aqe/testability/
├── assessments/*       - Assessment results by URL
├── historical/*        - Historical scores for trend analysis
├── recommendations/*   - Improvement recommendations
├── integration/*       - QX integration data
└── vibium/*           - Vibium-specific metrics (optional)

Fleet Coordination

const testabilityFleet = await FleetManager.coordinate({
  strategy: 'testability-assessment',
  agents: [
    'qe-quality-analyzer',  // Primary assessment
    'qx-partner',           // UX integration
    'qe-visual-tester'      // Visual validation
  ],
  topology: 'sequential'
});

Common Issues & Solutions

| Issue | Solution |
|---|---|
| Tests timing out | Increase timeout: `timeout 300 ./scripts/run-assessment.sh URL` |
| Partial results | Check console errors, increase network timeout |
| Report not opening | Use `AUTO_OPEN=false`, open manually |
| Config not updating | Use the `TEST_URL` env var instead |
| Vibium not available | Install via `claude mcp add vibium -- npx -y vibium` (optional) |
| Hybrid mode errors | Vibium is optional; assessments work without it |

Related Skills


Credits & References

Framework Origin

Implementation

Vibium Resources


Remember

Testability is an investment, not an afterthought.

Good testability:

  • Reduces debugging time
  • Enables faster feedback loops
  • Makes defects easier to find
  • Supports continuous testing

Low scores = High risk. Prioritize improvements by weight × impact.
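The "weight × impact" rule can be sketched as ranking each principle by its weight times its improvement headroom (100 minus the current score). The function below is illustrative; the weights and score keys are assumed names:

```javascript
// Illustrative sketch of "weight x impact" prioritization:
// rank each principle by weight times remaining headroom.
function prioritize(scores, weights) {
  return Object.keys(weights)
    .map((p) => ({
      principle: p,
      priority: weights[p] * (100 - (scores[p] ?? 0)),
    }))
    .sort((a, b) => b.priority - a.priority);
}
```

With equal scores, higher-weight principles like observability surface first, which matches the intent of the rule.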

GitHub Repository

proffesor-for-testing/agentic-qe
Path: .claude/skills/testability-scoring
Topics: agentic-qe · agentics-foundation · agents · quality-engineering

Related Skills

compatibility-testing

Other

This skill performs automated cross-browser, cross-platform, and cross-device compatibility testing to ensure consistent user experiences. It validates browser support, tests responsive designs across breakpoints, and runs parallel tests using cloud services. Use it when you need to verify compatibility across a defined matrix covering the majority of your user base.


visual-testing-advanced

Other

This Claude Skill performs advanced visual regression testing with pixel-perfect comparisons and AI-powered diff analysis. It validates responsive designs and ensures cross-browser visual consistency, making it ideal for detecting UI regressions. Developers should use it when needing to validate designs or maintain visual quality across releases.


code-review-quality

Other

This skill conducts automated code reviews focused on quality, testability, and maintainability, prioritizing critical feedback like bugs and security issues. It's designed for use during code reviews, when providing feedback, or when establishing review practices. The tool categorizes feedback by severity and emphasizes asking questions over issuing commands.

