MCP Hub

QuantConnect Validation

derekcrosslu
Updated: Today
75 views
testing

About

This skill runs walk-forward validation on QuantConnect strategies, detecting overfitting and ensuring robustness before deployment. It trains on 80% of the data and tests on the remaining 20% so developers can evaluate out-of-sample performance. Use it when making deployment decisions with the `qc_validate.py` tool or when testing how well a strategy generalizes.

Quick Install

Claude Code

Plugin command (recommended)
/plugin add https://github.com/derekcrosslu/CLAUDE_CODE_EXPLORE

Git clone (alternative)
git clone https://github.com/derekcrosslu/CLAUDE_CODE_EXPLORE.git ~/.claude/skills/quantconnect-validation

Copy and paste the command into Claude Code to install the skill.

ドキュメント

QuantConnect Validation Skill (Phase 5)

Purpose: Walk-forward validation for Phase 5 robustness testing before deployment.

Progressive Disclosure: This primer contains only the essentials. Full details are available via python SCRIPTS/qc_validate.py --help (see Reference Documentation below).


When to Use This Skill

Load when:

  • Running /qc-validate command
  • Testing out-of-sample performance
  • Evaluating strategy robustness
  • Making deployment decisions

Tool: Use python SCRIPTS/qc_validate.py for walk-forward validation


Walk-Forward Validation Overview

Purpose: Detect overfitting and ensure strategy generalizes to new data.

Approach:

  1. Training (in-sample): Develop/optimize on 80% of data
  2. Testing (out-of-sample): Validate on remaining 20%
  3. Compare: Measure performance degradation

Example (5-year backtest 2019-2023):

  • In-sample: 2019-2022 (4 years) - Training period
  • Out-of-sample: 2023 (1 year) - Testing period
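
As a rough sketch of how such a split can be derived (illustrative only; the actual split logic lives inside qc_validate.py):

from datetime import date, timedelta

def split_dates(start: date, end: date, split: float = 0.80):
    """Split a backtest date range into in-sample and out-of-sample periods."""
    cutoff = start + timedelta(days=int((end - start).days * split))
    return (start, cutoff), (cutoff + timedelta(days=1), end)

# 5-year backtest, 80/20 split
is_range, oos_range = split_dates(date(2019, 1, 1), date(2023, 12, 31))
# -> in-sample 2019-01-01 to 2022-12-31, out-of-sample 2023-01-01 to 2023-12-31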

Key Metrics

1. Performance Degradation

Formula: (IS Sharpe - OOS Sharpe) / IS Sharpe

Degradation   Quality      Decision
< 15%         Excellent    Deploy with confidence
15-30%        Acceptable   Deploy but monitor
30-40%        Concerning   Escalate to human
> 40%         Severe       Abandon (overfit)

Key Insight: < 15% degradation indicates robust strategy that generalizes well.
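
A minimal sketch of the calculation (qc_validate.py computes this for you):

def degradation(is_sharpe: float, oos_sharpe: float) -> float:
    """Performance degradation: (IS Sharpe - OOS Sharpe) / IS Sharpe."""
    return (is_sharpe - oos_sharpe) / is_sharpe

print(f"{degradation(1.5, 0.6):.0%}")    # 60% -> severe (see Common Issues below)
print(f"{degradation(0.97, 0.89):.1%}")  # 8.2% -> excellent (see Example Decision)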


2. Robustness Score

Formula: OOS Sharpe / IS Sharpe

Score        Quality    Interpretation
> 0.75       High       Strategy robust across periods
0.60-0.75    Moderate   Acceptable but monitor
< 0.60       Low        Strategy unstable

Key Insight: > 0.75 indicates strategy maintains performance out-of-sample.
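
The robustness score is the complementary view of the same ratio (robustness = 1 - degradation), as this sketch shows:

def robustness(is_sharpe: float, oos_sharpe: float) -> float:
    """Robustness score: OOS Sharpe / IS Sharpe (higher is better)."""
    return oos_sharpe / is_sharpe

print(round(robustness(0.97, 0.89), 2))  # 0.92 -> high (> 0.75)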


Quick Usage

Run Walk-Forward Validation

# From hypothesis directory with iteration_state.json
python SCRIPTS/qc_validate.py run --strategy strategy.py

# Custom split ratio (default 80/20)
python SCRIPTS/qc_validate.py run --strategy strategy.py --split 0.70

What it does:

  1. Reads project_id from iteration_state.json
  2. Splits date range (80/20 default)
  3. Runs in-sample backtest
  4. Runs out-of-sample backtest
  5. Calculates degradation and robustness
  6. Saves results to PROJECT_LOGS/validation_result.json
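
Consuming the saved results might look like the sketch below. The field names here are assumptions for illustration, not the tool's documented schema; inspect the actual JSON written by qc_validate.py.

import json

with open("PROJECT_LOGS/validation_result.json") as f:
    result = json.load(f)

# Hypothetical field names -- check the real file for the exact schema
print(result.get("degradation"), result.get("robustness"))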

Analyze Results

python SCRIPTS/qc_validate.py analyze --results PROJECT_LOGS/validation_result.json

Output:

  • Performance comparison table
  • Degradation percentage
  • Robustness assessment
  • Deployment recommendation

Decision Integration

After validation, the decision framework evaluates:

DEPLOY_STRATEGY (Deploy with confidence):

  • Degradation < 15% AND
  • Robustness > 0.75 AND
  • OOS Sharpe > 0.7

PROCEED_WITH_CAUTION (Deploy but monitor):

  • Degradation < 30% AND
  • Robustness > 0.60 AND
  • OOS Sharpe > 0.5

ABANDON_HYPOTHESIS (Too unstable):

  • Degradation > 40% OR
  • Robustness < 0.5 OR
  • OOS Sharpe < 0

ESCALATE_TO_HUMAN (Borderline):

  • Results don't clearly fit the criteria above
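
The framework can be summarized as a small function. This is a sketch: the order in which overlapping criteria are resolved here (abandon checked before caution) is an assumption, not something the framework specifies.

def decide(degradation: float, robustness: float, oos_sharpe: float) -> str:
    """Map walk-forward metrics to a Phase 5 decision."""
    if degradation < 0.15 and robustness > 0.75 and oos_sharpe > 0.7:
        return "DEPLOY_STRATEGY"
    if degradation > 0.40 or robustness < 0.5 or oos_sharpe < 0:
        return "ABANDON_HYPOTHESIS"
    if degradation < 0.30 and robustness > 0.60 and oos_sharpe > 0.5:
        return "PROCEED_WITH_CAUTION"
    return "ESCALATE_TO_HUMAN"

print(decide(0.082, 0.92, 0.89))  # DEPLOY_STRATEGY (the Example Decision below)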

Best Practices

1. Time Splits

  • Standard: 80/20 (4 years training, 1 year testing)
  • Conservative: 70/30 (more OOS testing)
  • Very Conservative: 60/40 (extensive testing)

Minimum OOS period: 6 months (1 year preferred)


2. Never Peek at Out-of-Sample

CRITICAL RULE: Never adjust strategy based on OOS results.

  • OOS is for testing only
  • Adjusting based on OOS defeats validation purpose
  • If you adjust, OOS becomes in-sample

3. Check Trade Count

Both periods need sufficient trades:

  • In-sample: Minimum 30 trades (50+ preferred)
  • Out-of-sample: Minimum 10 trades (20+ preferred)

Too few trades = unreliable validation.
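
A simple guard using the minimums above:

def trade_counts_ok(is_trades: int, oos_trades: int) -> bool:
    """Check that both periods meet the minimum trade counts."""
    return is_trades >= 30 and oos_trades >= 10

print(trade_counts_ok(120, 3))   # False -> escalate (see Common Issues below)
print(trade_counts_ok(142, 38))  # True (the Example Decision counts)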


4. Compare Multiple Metrics

Don't just look at Sharpe:

  • Sharpe Ratio degradation
  • Max Drawdown increase
  • Win Rate change
  • Profit Factor degradation
  • Trade Count consistency

All metrics should degrade similarly for a robust strategy.
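
A sketch of a per-metric comparison (the numbers are illustrative; note that for drawdown, where higher is worse, the sign of the "degradation" flips):

def metric_degradations(is_metrics: dict, oos_metrics: dict) -> dict:
    """Per-metric degradation; similar values across metrics suggest robustness."""
    return {name: (is_metrics[name] - oos_metrics[name]) / is_metrics[name]
            for name in is_metrics if is_metrics[name]}

is_m  = {"sharpe": 0.97, "win_rate": 0.55, "profit_factor": 1.60}  # illustrative
oos_m = {"sharpe": 0.89, "win_rate": 0.52, "profit_factor": 1.45}
print(metric_degradations(is_m, oos_m))
# roughly {'sharpe': 0.08, 'win_rate': 0.05, 'profit_factor': 0.09} -- all similar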


Common Issues

Severe Degradation (> 40%)

Cause: Strategy overfit to in-sample period

Example:

  • IS Sharpe: 1.5 → OOS Sharpe: 0.6
  • Degradation: 60%

Decision: ABANDON_HYPOTHESIS

Fix for next hypothesis: Simplify (fewer parameters) and use a longer training period


Different Market Regimes

Cause: IS was bull market, OOS was bear market

Example:

  • 2019-2022 (bull): Sharpe 1.2
  • 2023 (bear): Sharpe -0.3

Decision: Not necessarily overfit, but not robust across regimes

Fix: Test across multiple regimes, add regime detection


Low Trade Count in OOS

Cause: Strategy stops trading in OOS period

Example:

  • IS: 120 trades → OOS: 3 trades

Decision: ESCALATE_TO_HUMAN (insufficient OOS data)


Integration with /qc-validate

The /qc-validate command workflow:

  1. Read iteration_state.json for project_id and parameters
  2. Load this skill for validation approach
  3. Modify strategy for time splits (80/20)
  4. Run in-sample and OOS backtests
  5. Calculate degradation and robustness
  6. Evaluate using decision framework
  7. Update iteration_state.json with results
  8. Git commit with validation summary

Reference Documentation

Need implementation details? The complete reference documentation is accessible only via --help:

python SCRIPTS/qc_validate.py --help

Topics covered in --help:

  • Walk-forward validation methodology
  • Performance degradation thresholds
  • Monte Carlo validation techniques
  • PSR/DSR statistical metrics
  • Common errors and fixes
  • Phase 5 decision criteria

The primer above covers 90% of use cases. Use --help for edge cases and detailed analysis.


Related Skills

  • quantconnect - Core strategy development
  • quantconnect-backtest - Phase 3 backtesting (qc_backtest.py)
  • quantconnect-optimization - Phase 4 optimization (qc_optimize.py)
  • decision-framework - Decision thresholds
  • backtesting-analysis - Metric interpretation

Key Principles

  1. OOS is sacred - Never adjust strategy based on OOS results
  2. Degradation < 15% is excellent - Strategy generalizes well
  3. Robustness > 0.75 is target - Maintains performance OOS
  4. Trade count matters - Need sufficient trades in both periods
  5. Multiple metrics - All should degrade similarly for robustness

Example Decision

In-Sample (2019-2022):
  Sharpe: 0.97, Drawdown: 18%, Trades: 142

Out-of-Sample (2023):
  Sharpe: 0.89, Drawdown: 22%, Trades: 38

Degradation: 8.2% (< 15%)
Robustness: 0.92 (> 0.75)

→ DEPLOY_STRATEGY (minimal degradation, high robustness)

Version: 2.0.0 (Progressive Disclosure) · Last Updated: November 13, 2025 · Lines: ~190 (was 463) · Context Reduction: 59%

GitHub Repository

derekcrosslu/CLAUDE_CODE_EXPLORE
Path: .claude/skills/quantconnect-validation

Related Skills

content-collections

Meta

This skill provides a production-tested setup for Content Collections, a TypeScript-first tool that transforms Markdown/MDX files into type-safe data collections with Zod validation. Use it when building blogs, documentation sites, or content-heavy Vite + React applications to ensure type safety and automatic content validation. It covers everything from Vite plugin configuration and MDX compilation to deployment optimization and schema validation.


evaluating-llms-harness

Testing

This Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.


cloudflare-turnstile

Meta

This skill provides comprehensive guidance for implementing Cloudflare Turnstile as a CAPTCHA-alternative bot protection system. It covers integration for forms, login pages, API endpoints, and frameworks like React/Next.js/Hono, while handling invisible challenges that maintain user experience. Use it when migrating from reCAPTCHA, debugging error codes, or implementing token validation and E2E tests.


cloudflare-cron-triggers

Testing

This skill provides comprehensive knowledge for implementing Cloudflare Cron Triggers to schedule Workers using cron expressions. It covers setting up periodic tasks, maintenance jobs, and automated workflows while handling common issues like invalid cron expressions and timezone problems. Developers can use it for configuring scheduled handlers, testing cron triggers, and integrating with Workflows and Green Compute.
