data-validation
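About
This skill provides a comprehensive data validation framework for testing schema compliance, data quality, and referential integrity across databases, APIs, and pipelines. Use it to validate data sources and to generate quality scorecards with anomaly detection covering completeness, accuracy, and consistency. It is best suited to developers who need to verify data integrity or run ETL tests.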
Quick Install
Claude Code
Recommended: /plugin add https://github.com/majiayu000/claude-skill-registry
git clone https://github.com/majiayu000/claude-skill-registry.git ~/.claude/skills/data-validation
Copy and paste this command into Claude Code to install the skill.
Documentation
Data Validation Framework
Purpose
Comprehensive data validation framework for testing schema compliance, data quality, and referential integrity. Validates databases, APIs, data pipelines, and file formats. Generates data quality scorecards with anomaly detection.
Triggers
Use this skill when a request contains phrases such as:
- "validate data quality"
- "check data integrity"
- "schema validation"
- "test data pipeline"
- "data quality report"
- "validate CSV"
- "check for data anomalies"
- "test ETL output"
When to Use
- Data pipeline deployment
- Database migration
- API response validation
- Report generation systems
- Data warehouse testing
- ML training data validation
When NOT to Use
- API endpoint testing (use api-contract-validator)
- Security testing (use security-test-suite)
- Performance testing (use performance-benchmark)
Core Instructions
Data Quality Dimensions
| Dimension | Description | Weight |
|---|---|---|
| Completeness | Missing values, required fields | 25% |
| Accuracy | Type conformance, format validation | 25% |
| Consistency | Cross-field rules, referential integrity | 20% |
| Uniqueness | Duplicate detection, key uniqueness | 15% |
| Freshness | Timestamp validation, staleness | 10% |
| Anomaly | Statistical outlier detection | 5% |
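The weights in the table suggest a weighted average for the overall score. The following is a minimal sketch under that assumption, with each dimension scored from 0 to 100. The function name `overall_score` and the renormalization of missing dimensions are hypothetical and are not specified by the skill.

```python
# Weights copied from the Data Quality Dimensions table (percent).
WEIGHTS = {
    "completeness": 25,
    "accuracy": 25,
    "consistency": 20,
    "uniqueness": 15,
    "freshness": 10,
    "anomaly": 5,
}


def overall_score(scores: dict) -> float:
    """Weighted average of per-dimension scores (each 0-100).

    Assumption: dimensions missing from `scores` are skipped and the
    remaining weights are renormalized; the skill does not define this.
    """
    used = {dim: w for dim, w in WEIGHTS.items() if dim in scores}
    if not used:
        raise ValueError("no known dimensions in scores")
    total_weight = sum(used.values())
    return sum(scores[dim] * w for dim, w in used.items()) / total_weight
```

For example, `overall_score({"completeness": 80, "accuracy": 100})` averages the two equally weighted dimensions.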
Validation Categories
| Category | Description | Severity |
|---|---|---|
| Schema | Structure and type compliance | Critical |
| Completeness | Missing/null value detection | High |
| Accuracy | Value correctness and format | High |
| Consistency | Cross-field/cross-table rules | Medium |
| Uniqueness | Duplicate detection | Medium |
| Freshness | Timeliness of data | Medium |
| Anomaly | Statistical outlier detection | Low |
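The Anomaly category names statistical outlier detection without fixing a method. One common choice is the interquartile-range rule (Tukey fences), sketched below. This is an assumption rather than the skill's documented approach; `iqr_outliers` is a hypothetical helper and `k = 1.5` is only the conventional default.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]
```

Raising `k` makes the check less sensitive, which is one way to tune anomaly detection when reviewing sample data for false positives.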
Schema Definition
schema:
  tables:
    transactions:
      columns:
        - name: transaction_id
          type: string
          required: true
          unique: true
          pattern: "^TXN-[A-Z0-9]{10}$"
        - name: amount
          type: float
          required: true
          min: 0.01
          max: 1000000
        - name: status
          type: string
          required: true
          enum: [pending, completed, failed]
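To illustrate how the rules in this schema might be enforced, here is a minimal sketch that checks row dictionaries against the `transactions` columns. The rules are mirrored by hand as a Python list rather than parsed from YAML. The names `validate_rows`, `TRANSACTION_COLUMNS`, and `PY_TYPES`, the message wording, and the choice to accept integers for `float` columns are all assumptions made for illustration.

```python
import re

# Column rules mirroring the `transactions` table in the schema above.
TRANSACTION_COLUMNS = [
    {"name": "transaction_id", "type": "string", "required": True,
     "unique": True, "pattern": r"^TXN-[A-Z0-9]{10}$"},
    {"name": "amount", "type": "float", "required": True,
     "min": 0.01, "max": 1000000},
    {"name": "status", "type": "string", "required": True,
     "enum": ["pending", "completed", "failed"]},
]

# Assumption: mapping of schema type names to Python types (ints pass as floats).
PY_TYPES = {"string": str, "float": (int, float)}


def validate_rows(rows, columns=TRANSACTION_COLUMNS):
    """Check each row dict against the column rules.

    Returns a list of (row_index, column_name, message) tuples.
    """
    issues = []
    seen = {c["name"]: set() for c in columns if c.get("unique")}
    for i, row in enumerate(rows):
        for col in columns:
            name, value = col["name"], row.get(col["name"])
            if value is None:
                if col.get("required"):
                    issues.append((i, name, "missing required value"))
                continue
            if not isinstance(value, PY_TYPES[col["type"]]):
                got = type(value).__name__
                issues.append((i, name, f"expected {col['type']}, got {got}"))
                continue  # the remaining checks need a correctly typed value
            if "pattern" in col and not re.fullmatch(col["pattern"], value):
                issues.append((i, name, "does not match pattern"))
            if "min" in col and value < col["min"]:
                issues.append((i, name, f"below minimum {col['min']}"))
            if "max" in col and value > col["max"]:
                issues.append((i, name, f"above maximum {col['max']}"))
            if "enum" in col and value not in col["enum"]:
                issues.append((i, name, "not an allowed value"))
            if name in seen:
                if value in seen[name]:
                    issues.append((i, name, "duplicate value"))
                seen[name].add(value)
    return issues
```

Calling `validate_rows` on rows read with `csv.DictReader` would need a type-coercion step first, because CSV fields arrive as strings.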
Templates
Data Quality Report
# Data Quality Report
**Source:** {source_type}
**Table:** {table_name}
**Generated:** {timestamp}
## Quality Scorecard
**Overall Score:** {score}/100 ({grade})
| Dimension | Score | Status |
| --------- | ----- | ------ |
| Completeness | {completeness} | {status_icon} |
| Accuracy | {accuracy} | {status_icon} |
| Consistency | {consistency} | {status_icon} |
| Uniqueness | {uniqueness} | {status_icon} |
| Freshness | {freshness} | {status_icon} |
## Data Summary
| Metric | Value |
| ------ | ----- |
| Total Rows | {total_rows} |
| Valid Rows | {valid_rows} ({valid_percent}%) |
| Invalid Rows | {invalid_rows} ({invalid_percent}%) |
## Issue Details
### {category} Issues
**{issue_id}:** {message}
- Column: `{column}`
- Affected rows: {row_count}
- Sample values: `{samples}`
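The placeholders in this template use the same `{name}` syntax as Python's `str.format`, so one issue block can be filled as sketched below. The helper `render_issue` and the quoting of sample values are assumptions made for illustration, not part of the skill.

```python
# Issue block copied from the "Issue Details" part of the template above.
ISSUE_TEMPLATE = (
    "**{issue_id}:** {message}\n"
    "- Column: `{column}`\n"
    "- Affected rows: {row_count}\n"
    "- Sample values: `{samples}`"
)


def render_issue(issue_id, message, column, row_count, samples):
    """Fill the issue block; sample values are quoted and comma-separated."""
    quoted = ", ".join(f'"{s}"' for s in samples)
    return ISSUE_TEMPLATE.format(
        issue_id=issue_id,
        message=message,
        column=column,
        row_count=row_count,
        samples=quoted,
    )
```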
Example
Input: Validate transactions CSV against schema
Output:
## Quality Scorecard
**Overall Score:** 87.3/100 (B)
| Dimension | Score | Status |
| --------- | ----- | ------ |
| Completeness | 95.0 | Pass |
| Accuracy | 88.5 | Pass |
| Consistency | 82.0 | Pass |
| Uniqueness | 100.0 | Pass |
| Freshness | 75.0 | Warn |
## Issue Details
### Accuracy Issues
**TYPE-amount:** Expected float, got string
- Column: `amount`
- Affected rows: 45
- Sample values: `"N/A", "pending", "TBD"`
Validation Checklist
- Schema definition matches expected structure
- All required columns validated
- Null thresholds appropriately set
- Foreign key references checked (if applicable)
- Anomaly detection parameters tuned
- Sample data reviewed for false positives
- Report includes actionable remediation
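The "null thresholds" item in the checklist can be made concrete with a per-column null-rate check. This is a minimal sketch that assumes rows are dictionaries and that a missing key counts as null. The function names and the threshold values in the usage note are hypothetical.

```python
def null_rates(rows, columns):
    """Fraction of rows in which each column is missing or None."""
    total = len(rows)
    if total == 0:
        return {c: 0.0 for c in columns}
    return {
        c: sum(1 for r in rows if r.get(c) is None) / total
        for c in columns
    }


def null_threshold_violations(rows, thresholds):
    """Return {column: null_rate} for columns whose rate exceeds its threshold."""
    rates = null_rates(rows, list(thresholds))
    return {c: rate for c, rate in rates.items() if rate > thresholds[c]}
```

For instance, `null_threshold_violations(rows, {"amount": 0.0, "status": 0.05})` would flag `amount` on any null and `status` only when more than 5% of rows lack it.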
Related Skills
- api-contract-validator: for API response validation
- unit-test-generator: for data processing function tests
- test-health-monitor: for tracking validation trends
