evaluation-metrics

Name: evaluation-metrics
Author: mattnigh

Updated 1 month ago

11 views

Other · ai · testing · automation · data

About

This Claude Skill activates automatically during LLM performance evaluation to ensure correct metrics and tests. It manages evaluation datasets, computes metrics, supports A/B testing, and implements LLM-as-judge patterns. Use it when you need structured experiment tracking and rigorous performance evaluation for your LLM applications.
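To make the metric and A/B-testing ideas above concrete, here is a minimal sketch of scoring two prompt variants against the same evaluation set with exact-match accuracy. It is not taken from the skill itself: the function names `exact_match_accuracy` and `compare_variants`, the normalization rule (trim and lowercase), and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions equal to their reference after trimming and lowercasing.

    The normalization rule is an assumption; a real evaluation may need a stricter
    or looser notion of a match.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("evaluation set is empty")
    hits = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return hits / len(references)


def compare_variants(
    variant_a: Sequence[str],
    variant_b: Sequence[str],
    references: Sequence[str],
) -> Dict[str, float]:
    """Score two variants on the same references and report the difference (b minus a)."""
    acc_a = exact_match_accuracy(variant_a, references)
    acc_b = exact_match_accuracy(variant_b, references)
    return {"a": acc_a, "b": acc_b, "delta": acc_b - acc_a}


# Hypothetical toy data: three reference answers and the outputs of two variants.
references = ["Paris", "4", "blue"]
outputs_a = ["paris", "5", "blue"]
outputs_b = ["Paris", "4", "Blue "]

result = compare_variants(outputs_a, outputs_b, references)
print(result)
```

With only three examples the difference between variants is not statistically meaningful; a real A/B comparison would use a larger evaluation set and a significance test before drawing conclusions.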

Quick install

Claude Code

GitHub Repository

mattnigh/skills_collection

Path: collection/ricardoroche__ricardos-claude-code__claude__skills__evaluation-metrics__SKILL.md

FAQ

What is the evaluation-metrics skill?

evaluation-metrics is a Claude Skill by mattnigh. Skills package instructions and resources that Claude loads on demand, so Claude can perform evaluation-metrics-related tasks without extra prompting.

How do I install evaluation-metrics?

Use the install commands on this page: add evaluation-metrics to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does evaluation-metrics belong to?

evaluation-metrics is in the ai-llm category, tagged ai, testing, automation and data.

Is evaluation-metrics free to use?

Yes. evaluation-metrics is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.