agenta-1-prompt-versioning-strategy
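About
This skill provides best practices for versioning AI prompts using semantic versioning and structured metadata. It helps developers track prompt changes, maintain changelogs, and organize prompt versions systematically. Use it when implementing version control for production prompts in AI applications.
Quick install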
Claude Code
Recommended
npx skills add vamseeachanta/workspace-hub
/plugin add https://github.com/vamseeachanta/workspace-hub
git clone https://github.com/vamseeachanta/workspace-hub.git ~/.claude/skills/agenta-1-prompt-versioning-strategy
Copy and paste one of these commands into Claude Code to install the skill.
Documentation
1. Prompt Versioning Strategy
"""Best practices for prompt versioning."""

# DO: Use semantic versioning for prompts
version_naming = {
    "v1.0.0": "Initial production version",
    "v1.1.0": "Added context handling",
    "v1.1.1": "Fixed edge case in formatting",
    "v2.0.0": "Major rewrite with new approach"
}

# DO: Include metadata with versions
def create_versioned_prompt(name: str, template: str, metadata: dict) -> dict:
    return {
        "name": name,
        "template": template,
        "metadata": {
            "created_by": metadata.get("author"),
            "description": metadata.get("description"),
            "changelog": metadata.get("changelog"),
            "test_results": metadata.get("test_results")
        }
    }

# DO: Test before promoting to production
# NOTE: `run_evaluation` and `client` are not defined in this snippet; they are
# assumed to come from your evaluation harness and platform client.
def promote_to_production(variant_id: str, min_eval_score: float = 0.8) -> bool:
    # Run evaluation
    score = run_evaluation(variant_id)
    if score >= min_eval_score:
        client.set_default_variant(variant_id)
        return True
    return False
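
The versioning and promotion rules above can be sketched end to end without any external service. The following is a minimal illustration, not the platform's API: `bump_version` and `PromptRegistry` are hypothetical names introduced here, the registry is an in-memory stand-in for a real client, and the evaluation score is passed in directly instead of being computed.

```python
from dataclasses import dataclass, field
from typing import Optional


def bump_version(version: str, part: str) -> str:
    """Return the next semantic version for a 'vMAJOR.MINOR.PATCH' string."""
    major, minor, patch = (int(x) for x in version.lstrip("v").split("."))
    if part == "major":
        return f"v{major + 1}.0.0"
    if part == "minor":
        return f"v{major}.{minor + 1}.0"
    if part == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


@dataclass
class PromptRegistry:
    """Hypothetical in-memory store standing in for a real platform client."""
    versions: dict = field(default_factory=dict)  # version -> prompt record
    production: Optional[str] = None              # currently promoted version

    def register(self, version: str, template: str, changelog: str) -> None:
        self.versions[version] = {"template": template, "changelog": changelog}

    def promote(self, version: str, score: float, min_eval_score: float = 0.8) -> bool:
        """Promote only if the version exists and its score clears the threshold."""
        if version not in self.versions:
            raise KeyError(version)
        if score >= min_eval_score:
            self.production = version
            return True
        return False


registry = PromptRegistry()
registry.register("v1.0.0", "Summarize: {text}", "Initial production version")

# A backwards-compatible improvement is a minor bump.
next_version = bump_version("v1.0.0", "minor")
registry.register(next_version, "Summarize with context: {text}", "Added context handling")

promoted_first = registry.promote("v1.0.0", score=0.85)        # clears 0.8
promoted_second = registry.promote(next_version, score=0.72)   # below 0.8, stays unpromoted
```

Because the second promotion fails the gate, `registry.production` still points at `v1.0.0`, which is the behavior the "test before promoting" rule is meant to guarantee.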
2. Evaluation Strategy
"""Best practices for prompt evaluation."""

# DO: Define clear evaluation criteria
evaluation_criteria = {
    "accuracy": {"weight": 0.4, "threshold": 0.8},
    "relevance": {"weight": 0.3, "threshold": 0.7},
    "coherence": {"weight": 0.2, "threshold": 0.7},
    "safety": {"weight": 0.1, "threshold": 0.9}
}

# DO: Use diverse test sets
def create_evaluation_set() -> list:
    # The "..." values are placeholders for real inputs and expected outputs
    return [
        {"input": "...", "expected": "...", "category": "basic"},
        {"input": "...", "expected": "...", "category": "edge_case"},
        {"input": "...", "expected": "...", "category": "adversarial"}
    ]

# DO: Track evaluation over time
def track_evaluation_history(app_name: str, variant_id: str, results: dict) -> None:
    # Placeholder: store results with a timestamp for trend analysis
    pass
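
One way to apply the criteria and fill in the history stub above is sketched below. It assumes per-criterion scores in the range 0 to 1 are already available, and it adds a `history_path` parameter that the original signature does not have; `summarize_scores`, the JSON Lines file layout, and the `support-bot` app name are illustrative choices, not part of the skill.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

# Same weights and thresholds as the evaluation criteria above.
CRITERIA = {
    "accuracy": {"weight": 0.4, "threshold": 0.8},
    "relevance": {"weight": 0.3, "threshold": 0.7},
    "coherence": {"weight": 0.2, "threshold": 0.7},
    "safety": {"weight": 0.1, "threshold": 0.9},
}


def summarize_scores(scores: dict, criteria: dict = CRITERIA) -> dict:
    """Combine per-criterion scores into a weighted total and list threshold failures."""
    total = sum(criteria[name]["weight"] * scores[name] for name in criteria)
    failed = [name for name in criteria if scores[name] < criteria[name]["threshold"]]
    return {"weighted_score": round(total, 4), "failed": failed, "passed": not failed}


def track_evaluation_history(app_name: str, variant_id: str, results: dict,
                             history_path: Path) -> None:
    """Append one timestamped record per evaluation run (JSON Lines format)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "app_name": app_name,
        "variant_id": variant_id,
        "results": results,
    }
    with Path(history_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


scores = {"accuracy": 0.9, "relevance": 0.8, "coherence": 0.65, "safety": 0.95}
summary = summarize_scores(scores)

# Write to a temporary directory so the example leaves no files behind in the project.
history_file = Path(tempfile.mkdtemp()) / "eval_history.jsonl"
track_evaluation_history("support-bot", "v1.1.0", summary, history_file)
```

Here the weighted total is above 0.8, yet the variant still fails because `coherence` is under its own threshold. Checking thresholds separately from the weighted total keeps a strong score on one criterion from masking a weak one.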
3. A/B Testing Guidelines
"""Best practices for A/B testing prompts."""
from scipy import stats

# DO: Calculate required sample size
def calculate_sample_size(
    baseline_metric: float,
    minimum_detectable_effect: float,
    alpha: float = 0.05,
    power: float = 0.8
) -> int:
    # Placeholder: statistical calculation for required samples
    pass

# DO: Use proper statistical tests
def analyze_ab_test(control_results: list, treatment_results: list) -> dict:
    # Two-sample t-test for continuous metrics. By default this assumes equal
    # variances; pass equal_var=False to ttest_ind for Welch's t-test.
    t_stat, p_value = stats.ttest_ind(control_results, treatment_results)
    return {
        "significant": p_value < 0.05,
        "p_value": p_value,
        # Raw difference in means (treatment minus control)
        "effect_size": (sum(treatment_results) / len(treatment_results) -
                        sum(control_results) / len(control_results))
    }
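
The `calculate_sample_size` stub above could be filled in along the following lines. This is a sketch under stated assumptions, not the skill's own method: it treats the metric as a success rate strictly between 0 and 1, reads `minimum_detectable_effect` as an absolute change in that rate, and uses the standard normal-approximation formula for a two-sided comparison of two proportions. Continuous metrics would need a variance estimate instead.

```python
import math
from statistics import NormalDist


def calculate_sample_size(
    baseline_metric: float,
    minimum_detectable_effect: float,
    alpha: float = 0.05,
    power: float = 0.8,
) -> int:
    """Samples needed per arm to detect an absolute change in a success rate."""
    p1 = baseline_metric
    p2 = baseline_metric + minimum_detectable_effect
    if minimum_detectable_effect == 0:
        raise ValueError("minimum_detectable_effect must be non-zero")
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("baseline and baseline + effect must lie strictly between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of the two Bernoulli variances
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


# Detecting a lift from an 80% to an 85% success rate.
n_per_arm = calculate_sample_size(baseline_metric=0.80, minimum_detectable_effect=0.05)

# Halving the detectable effect roughly quadruples the required sample.
n_small_effect = calculate_sample_size(baseline_metric=0.80, minimum_detectable_effect=0.025)
```

With the defaults, the first call comes out at roughly 900 samples per arm. The required size grows with the inverse square of the effect, which is why small prompt improvements need far more traffic to confirm.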
