
agenta-3-evaluation-metrics-and-testing

vamseeachanta
Updated 2 days ago
Other, testing

About

This skill enables automated evaluation of LLM outputs using customizable metrics such as exact match and semantic similarity. It provides a framework for testing prompts against expected outputs, with per-metric scoring and side-by-side comparison of results. Use it to systematically measure and improve prompt performance in your applications.
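To make the idea concrete, here is a minimal sketch of scoring model outputs against expected outputs with pluggable metrics. It does not reproduce the skill's own code or the Agenta SDK. Every name in it (`EvalCase`, `exact_match`, `similarity`, `evaluate`, `METRICS`) is hypothetical. The `similarity` metric uses character-level matching from Python's standard `difflib` as a stand-in for semantic similarity, which in practice would more likely be computed from embeddings.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict, List

# A metric takes (model output, expected output) and returns a score in [0, 1].
Metric = Callable[[str, str], float]


def _normalize(text: str) -> str:
    """Lowercase and trim so trivial formatting differences are ignored."""
    return text.strip().lower()


def exact_match(output: str, expected: str) -> float:
    """Return 1.0 when the normalized strings are identical, else 0.0."""
    return 1.0 if _normalize(output) == _normalize(expected) else 0.0


def similarity(output: str, expected: str) -> float:
    """Lexical similarity in [0, 1]. ASSUMPTION: a stand-in for semantic similarity."""
    return SequenceMatcher(None, _normalize(output), _normalize(expected)).ratio()


@dataclass
class EvalCase:
    """Hypothetical test case: a prompt and the output we expect for it."""
    prompt: str
    expected: str


# Hypothetical metric registry; add or swap metrics here.
METRICS: Dict[str, Metric] = {"exact_match": exact_match, "similarity": similarity}


def evaluate(
    generate: Callable[[str], str],
    cases: List[EvalCase],
    metrics: Dict[str, Metric],
) -> List[Dict[str, float]]:
    """Run `generate` on each prompt and score its output with every metric."""
    results = []
    for case in cases:
        output = generate(case.prompt)
        results.append({name: fn(output, case.expected) for name, fn in metrics.items()})
    return results


if __name__ == "__main__":
    # Canned answers stand in for a real LLM call.
    canned = {"Capital of France?": "paris", "2 + 2?": "five"}
    cases = [EvalCase("Capital of France?", "Paris"), EvalCase("2 + 2?", "4")]
    for case, scores in zip(cases, evaluate(lambda p: canned[p], cases, METRICS)):
        print(case.prompt, scores)
```

Keeping metrics as plain functions in a dictionary means a new metric can be added without touching the evaluation loop, and the `generate` callable can wrap any model or prompt variant under test.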

Quick Install

Claude Code

Primary (recommended):
npx skills add vamseeachanta/workspace-hub -a claude-code

Plugin command (alternative):
/plugin add https://github.com/vamseeachanta/workspace-hub

Git clone (alternative):
git clone https://github.com/vamseeachanta/workspace-hub.git ~/.claude/skills/agenta-3-evaluation-metrics-and-testing


GitHub Repository

vamseeachanta/workspace-hub
Path: .claude/skills/ai/prompting/agenta/3-evaluation-metrics-and-testing