MCP HubMCP Hub
スキル一覧に戻る

agenta-1-prompt-versioning-strategy

vamseeachanta
更新日 Today
62 閲覧
3
2
3
GitHubで表示
その他general

について

このスキルは、セマンティックバージョニングと構造化メタデータを使用したAIプロンプトのバージョン管理におけるベストプラクティスを提供します。開発者がプロンプトの変更を追跡し、変更履歴を維持し、異なるプロンプトバージョンを体系的に整理することを支援します。AIアプリケーションにおける本番環境プロンプトのバージョン管理を実装する際にご活用ください。

クイックインストール

Claude Code

推奨
メイン
npx skills add vamseeachanta/workspace-hub
プラグインコマンド代替
/plugin add https://github.com/vamseeachanta/workspace-hub
Git クローン代替
git clone https://github.com/vamseeachanta/workspace-hub.git ~/.claude/skills/agenta-1-prompt-versioning-strategy

このコマンドをClaude Codeにコピー&ペーストしてスキルをインストールします

ドキュメント

1. Prompt Versioning Strategy (+2)

1. Prompt Versioning Strategy

"""Best practices for prompt versioning."""

# DO: Use semantic versioning for prompts
version_naming = {
    "v1.0.0": "Initial production version",
    "v1.1.0": "Added context handling",
    "v1.1.1": "Fixed edge case in formatting",
    "v2.0.0": "Major rewrite with new approach"
}

# DO: Include metadata with versions
def create_versioned_prompt(name: str, template: str, metadata: dict):
    return {
        "name": name,
        "template": template,
        "metadata": {
            "created_by": metadata.get("author"),
            "description": metadata.get("description"),
            "changelog": metadata.get("changelog"),
            "test_results": metadata.get("test_results")
        }
    }

# DO: Test before promoting to production
def promote_to_production(variant_id: str, min_eval_score: float = 0.8):
    # Run evaluation
    score = run_evaluation(variant_id)

    if score >= min_eval_score:
        client.set_default_variant(variant_id)
        return True
    return False

2. Evaluation Strategy

"""Best practices for prompt evaluation."""

# DO: Define clear evaluation criteria
evaluation_criteria = {
    "accuracy": {"weight": 0.4, "threshold": 0.8},
    "relevance": {"weight": 0.3, "threshold": 0.7},
    "coherence": {"weight": 0.2, "threshold": 0.7},
    "safety": {"weight": 0.1, "threshold": 0.9}
}

# DO: Use diverse test sets
def create_evaluation_set():
    return [
        {"input": "...", "expected": "...", "category": "basic"},
        {"input": "...", "expected": "...", "category": "edge_case"},
        {"input": "...", "expected": "...", "category": "adversarial"}
    ]

# DO: Track evaluation over time
def track_evaluation_history(app_name: str, variant_id: str, results: dict):
    # Store results with timestamp for trend analysis
    pass

3. A/B Testing Guidelines

"""Best practices for A/B testing prompts."""

# DO: Calculate required sample size
def calculate_sample_size(
    baseline_metric: float,
    minimum_detectable_effect: float,
    alpha: float = 0.05,
    power: float = 0.8
) -> int:
    # Statistical calculation for required samples
    pass

# DO: Use proper statistical tests
def analyze_ab_test(control_results: list, treatment_results: list):
    from scipy import stats

    # T-test for continuous metrics
    t_stat, p_value = stats.ttest_ind(control_results, treatment_results)

    return {
        "significant": p_value < 0.05,
        "p_value": p_value,
        "effect_size": (sum(treatment_results)/len(treatment_results) -
                       sum(control_results)/len(control_results))
    }

GitHub リポジトリ

vamseeachanta/workspace-hub
パス: .claude/skills/ai/prompting/agenta/1-prompt-versioning-strategy

関連スキル

algorithmic-art

メタ

This Claude Skill creates original algorithmic art using p5.js with seeded randomness and interactive parameters. It generates .md files for algorithmic philosophies, plus .html and .js files for interactive generative art implementations. Use it when developers need to create flow fields, particle systems, or other computational art while avoiding copyright issues.

スキルを見る

subagent-driven-development

開発

This skill executes implementation plans by dispatching a fresh subagent for each independent task, with code review between tasks. It enables fast iteration while maintaining quality gates through this review process. Use it when working on mostly independent tasks within the same session to ensure continuous progress with built-in quality checks.

スキルを見る

executing-plans

デザイン

Use the executing-plans skill when you have a complete implementation plan to execute in controlled batches with review checkpoints. It loads and critically reviews the plan, then executes tasks in small batches (default 3 tasks) while reporting progress between each batch for architect review. This ensures systematic implementation with built-in quality control checkpoints.

スキルを見る

cost-optimization

その他

This Claude Skill helps developers optimize cloud costs through resource rightsizing, tagging strategies, and spending analysis. It provides a framework for reducing cloud expenses and implementing cost governance across AWS, Azure, and GCP. Use it when you need to analyze infrastructure costs, right-size resources, or meet budget constraints.

スキルを見る