
evaluation-metrics

mattnigh
Updated 5 days ago
10 views
Other · ai · testing · automation · data

About

This Claude Skill automatically activates during LLM performance evaluation to ensure proper metrics and testing. It handles evaluation datasets, computes metrics, facilitates A/B testing, and implements LLM-as-judge patterns. Use it when you need structured experiment tracking and rigorous performance assessment for your LLM applications.
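As a rough illustration of the kind of workflow described above, here is a minimal Python sketch that scores two model variants on a small evaluation dataset and compares them A/B style. It is not taken from the skill itself: the names `Example`, `exact_match`, `evaluate`, and `ab_test`, the toy dataset, and the stand-in models are all assumptions made for this example. An LLM-as-judge setup would presumably replace `exact_match` with a call to a grading model, which is omitted here.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass
class Example:
    """One row of a hypothetical evaluation dataset."""
    prompt: str
    reference: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return 1.0 if normalize(prediction) == normalize(reference) else 0.0


def evaluate(model: Callable[[str], str], dataset: Sequence[Example]) -> float:
    """Mean exact-match score of `model` over `dataset`."""
    return mean(exact_match(model(ex.prompt), ex.reference) for ex in dataset)


def ab_test(model_a, model_b, dataset) -> dict:
    """Score both variants on the same dataset and report the difference."""
    score_a = evaluate(model_a, dataset)
    score_b = evaluate(model_b, dataset)
    return {"a": score_a, "b": score_b, "delta": score_b - score_a}


# Toy dataset (illustrative only).
dataset = [
    Example("Capital of France?", "Paris"),
    Example("2 + 2 = ?", "4"),
]

# Stand-in "models": dictionary lookups, so the sketch runs without any API.
answers_a = {"Capital of France?": "paris", "2 + 2 = ?": "5"}
answers_b = {"Capital of France?": "Paris", "2 + 2 = ?": "4"}


def variant_a(prompt: str) -> str:
    return answers_a[prompt]


def variant_b(prompt: str) -> str:
    return answers_b[prompt]


result = ab_test(variant_a, variant_b, dataset)
print(result)
```

With only two examples the difference carries no statistical weight. A real comparison would need a larger dataset and a significance test before preferring one variant.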

Quick install

Claude Code

Primary method (recommended)
npx skills add mattnigh/skills_collection -a claude-code
Plugin command (alternative method)
/plugin add https://github.com/mattnigh/skills_collection
Git clone (alternative method)
git clone https://github.com/mattnigh/skills_collection.git ~/.claude/skills/evaluation-metrics

Copy and paste this command into Claude Code to install the skill.

GitHub repository

mattnigh/skills_collection
Path: collection/ricardoroche__ricardos-claude-code__claude__skills__evaluation-metrics__SKILL.md

Related skills