deeptools

Name: deeptools
Author: K-Dense-AI

Updated 1 month ago

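About

The deeptools skill enables NGS data processing and visualization: it converts BAM files to coverage tracks and runs QC analyses such as correlation and fingerprinting. It generates heatmaps and profile plots around genomic features for ChIP-seq, RNA-seq, and ATAC-seq data. Use this skill for comprehensive sequencing data analysis and publication-quality visualization directly within Claude.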

Documentation

deepTools: NGS Data Analysis Toolkit

Overview

deepTools is a comprehensive suite of Python command-line tools designed for processing and analyzing high-throughput sequencing data. Use deepTools to perform quality control, normalize data, compare samples, and generate publication-quality visualizations for ChIP-seq, RNA-seq, ATAC-seq, MNase-seq, and other NGS experiments.

Core capabilities:

Convert BAM alignments to normalized coverage tracks (bigWig/bedGraph)
Quality control assessment (fingerprint, correlation, coverage)
Sample comparison and correlation analysis
Heatmap and profile plot generation around genomic features
Enrichment analysis and peak region visualization

When to Use This Skill

This skill should be used when:

File conversion: "Convert BAM to bigWig", "generate coverage tracks", "normalize ChIP-seq data"
Quality control: "check ChIP quality", "compare replicates", "assess sequencing depth", "QC analysis"
Visualization: "create heatmap around TSS", "plot ChIP signal", "visualize enrichment", "generate profile plot"
Sample comparison: "compare treatment vs control", "correlate samples", "PCA analysis"
Analysis workflows: "analyze ChIP-seq data", "RNA-seq coverage", "ATAC-seq analysis", "complete workflow"
Working with specific file types: BAM files, bigWig files, BED region files in genomics context

Quick Start

For users new to deepTools, start with file validation and common workflows:

1. Validate Input Files

Before running any analysis, validate BAM, bigWig, and BED files using the validation script:

python scripts/validate_files.py --bam sample1.bam sample2.bam --bed regions.bed

This checks file existence, BAM indices, and format correctness.
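The index check described above can be sketched in a few lines of Python. This is an illustration only, not the contents of scripts/validate_files.py; the helper names `find_bam_index` and `check_bams` are hypothetical, and the sketch assumes the index sits next to the BAM as either `sample.bam.bai` or `sample.bai`.

```python
from pathlib import Path


def find_bam_index(bam_path):
    """Return the index file next to bam_path, or None if there is none.

    Assumption: the index is named `<name>.bam.bai` or `<name>.bai`.
    """
    bam = Path(bam_path)
    candidates = [Path(str(bam) + ".bai"), bam.with_suffix(".bai")]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def check_bams(bam_paths):
    """Map each BAM path to a short status string."""
    report = {}
    for path in bam_paths:
        if not Path(path).is_file():
            report[path] = "missing file"
        elif find_bam_index(path) is None:
            report[path] = "missing index (run: samtools index)"
        else:
            report[path] = "ok"
    return report
```

A status other than "ok" for any file is a reason to stop before running deepTools commands on it.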

2. Generate Workflow Template

For standard analyses, use the workflow generator to create customized scripts:

# List available workflows
python scripts/workflow_generator.py --list

# Generate ChIP-seq QC workflow
python scripts/workflow_generator.py chipseq_qc -o qc_workflow.sh \
    --input-bam Input.bam --chip-bams "ChIP1.bam ChIP2.bam" \
    --genome-size 2913022398

# Make executable and run
chmod +x qc_workflow.sh
./qc_workflow.sh

3. Most Common Operations

See assets/quick_reference.md for frequently used commands and parameters.

Installation

uv pip install deeptools

Core Workflows

deepTools workflows typically follow this pattern: QC → Normalization → Comparison/Visualization

ChIP-seq Quality Control Workflow

When users request ChIP-seq QC or quality assessment:

Generate workflow script using scripts/workflow_generator.py chipseq_qc
Key QC steps:
- Sample correlation (multiBamSummary + plotCorrelation)
- PCA analysis (plotPCA)
- Coverage assessment (plotCoverage)
- Fragment size validation (bamPEFragmentSize)
- ChIP enrichment strength (plotFingerprint)
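
The QC steps listed above map onto one command each. The sketch below only assembles the argument lists so they can be reviewed before anything runs; it is not the script produced by workflow_generator.py. The function name `chipseq_qc_commands` and all output file names are assumptions made for illustration.

```python
import shlex


def chipseq_qc_commands(input_bam, chip_bams, threads=8):
    """Build one argument list per QC step (nothing is executed here)."""
    bams = [input_bam, *chip_bams]
    procs = ["-p", str(threads)]
    return [
        # Read counts per genomic bin, shared by the correlation and PCA plots
        ["multiBamSummary", "bins", "--bamfiles", *bams, "-o", "counts.npz", *procs],
        ["plotCorrelation", "-in", "counts.npz", "--corMethod", "pearson",
         "--whatToPlot", "heatmap", "-o", "correlation.png"],
        ["plotPCA", "-in", "counts.npz", "-o", "pca.png"],
        ["plotCoverage", "-b", *bams, "-o", "coverage.png", *procs],
        # Fragment sizes are only informative for paired-end data
        ["bamPEFragmentSize", "-b", *bams, "-hist", "fragment_sizes.png", *procs],
        ["plotFingerprint", "-b", *bams, "-plot", "fingerprint.png", *procs],
    ]


for cmd in chipseq_qc_commands("Input.bam", ["ChIP1.bam", "ChIP2.bam"]):
    print(shlex.join(cmd))
```

To run a step, pass the corresponding list to `subprocess.run(cmd, check=True)`.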

Interpreting results:

Correlation: Replicates should cluster together with high correlation (>0.9)
Fingerprint: A strong ChIP shows a curve that stays low and rises steeply at the highest-ranked bins; a curve close to the diagonal indicates weak or no enrichment
Coverage: Assess if sequencing depth is adequate for analysis
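
To make the >0.9 guideline concrete, here is a minimal sketch of the Pearson coefficient computed on per-bin read counts from two replicates. The counts are made-up toy values and `pearson` is a hypothetical helper; in practice plotCorrelation computes the coefficient from the multiBamSummary output.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long count vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Toy per-bin counts for two replicates (illustrative values only)
rep1 = [10, 52, 8, 120, 33]
rep2 = [12, 49, 11, 110, 30]

r = pearson(rep1, rep2)
print(f"r = {r:.3f}; replicates agree: {r > 0.9}")
```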

Full workflow details in references/workflows.md → "ChIP-seq Quality Control Workflow"

ChIP-seq Complete Analysis Workflow

For full ChIP-seq analysis from BAM to visualizations:

Generate coverage tracks with normalization (bamCoverage)
Create comparison tracks (bamCompare for log2 ratio)
Compute signal matrices around features (computeMatrix)
Generate visualizations (plotHeatmap, plotProfile)
Enrichment analysis at peaks (plotEnrichment)

Use scripts/workflow_generator.py chipseq_analysis to generate template.

Complete command sequences in references/workflows.md → "ChIP-seq Analysis Workflow"

RNA-seq Coverage Workflow

For strand-specific RNA-seq coverage tracks:

Use bamCoverage with --filterRNAstrand to separate forward and reverse strands.

Important: Never use --extendReads for RNA-seq; extended reads would run across splice junctions and add coverage over skipped regions.

Use normalization: CPM for fixed bins, RPKM for gene-level analysis.

Template available: scripts/workflow_generator.py rnaseq_coverage

Details in references/workflows.md → "RNA-seq Coverage Workflow"
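
As a rough sketch of the strand-split step, the function below builds one bamCoverage call per strand with CPM normalization and deliberately leaves out --extendReads. The function name `stranded_coverage_commands` and the output naming scheme are assumptions. Which physical strand `forward` selects depends on the library protocol, so check the deepTools documentation for your version.

```python
import shlex


def stranded_coverage_commands(bam, prefix, bin_size=10, threads=8):
    """Return two bamCoverage argument lists, one per RNA strand."""
    commands = []
    for strand in ("forward", "reverse"):
        commands.append([
            "bamCoverage",
            "--bam", bam,
            "--outFileName", f"{prefix}.{strand}.bw",
            "--filterRNAstrand", strand,
            "--normalizeUsing", "CPM",
            "--binSize", str(bin_size),
            "--numberOfProcessors", str(threads),
            # No --extendReads here: see the note on splice junctions above
        ])
    return commands


for cmd in stranded_coverage_commands("rnaseq.bam", "rnaseq"):
    print(shlex.join(cmd))
```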

ATAC-seq Analysis Workflow

ATAC-seq requires Tn5 offset correction:

Shift reads using alignmentSieve with --ATACshift
Generate coverage with bamCoverage
Analyze fragment sizes (expect nucleosome ladder pattern)
Visualize at peaks if available

Template: scripts/workflow_generator.py atacseq

Full workflow in references/workflows.md → "ATAC-seq Workflow"
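
The ATAC-seq steps can be laid out as an ordered command list. This is a sketch under stated assumptions, not the generated template: the function name `atacseq_commands`, the intermediate file names, and the choice of CPM are illustrative, and the sort and index steps are included on the assumption that the shifted BAM is no longer coordinate-sorted.

```python
import shlex


def atacseq_commands(bam, prefix, threads=8):
    """Return the ATAC-seq steps as argument lists, in execution order."""
    shifted = f"{prefix}.shifted.bam"
    sorted_bam = f"{prefix}.shifted.sorted.bam"
    procs = ["-p", str(threads)]
    return [
        # 1. Tn5 offset correction
        ["alignmentSieve", "-b", bam, "-o", shifted, "--ATACshift", *procs],
        # 2. Re-sort and index the shifted alignments
        ["samtools", "sort", "-o", sorted_bam, shifted],
        ["samtools", "index", sorted_bam],
        # 3. Coverage track
        ["bamCoverage", "--bam", sorted_bam, "--outFileName", f"{prefix}.bw",
         "--normalizeUsing", "CPM", *procs],
        # 4. Fragment size histogram (look for the nucleosome ladder)
        ["bamPEFragmentSize", "-b", sorted_bam, "-hist", f"{prefix}.fragments.png", *procs],
    ]


for cmd in atacseq_commands("atac.bam", "atac"):
    print(shlex.join(cmd))
```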

Tool Categories and Common Tasks

BAM/bigWig Processing

Convert BAM to normalized coverage:

bamCoverage --bam input.bam --outFileName output.bw \
    --normalizeUsing RPGC --effectiveGenomeSize 2913022398 \
    --binSize 10 --numberOfProcessors 8

Compare two samples (log2 ratio):

bamCompare -b1 treatment.bam -b2 control.bam -o ratio.bw \
    --operation log2 --scaleFactorsMethod readCount

Key tools: bamCoverage, bamCompare, multiBamSummary, multiBigwigSummary, correctGCBias, alignmentSieve

Complete reference: references/tools_reference.md → "BAM and bigWig File Processing Tools"

Quality Control

Check ChIP enrichment:

plotFingerprint -b input.bam chip.bam -o fingerprint.png \
    --extendReads 200 --ignoreDuplicates

Sample correlation:

multiBamSummary bins --bamfiles *.bam -o counts.npz
plotCorrelation -in counts.npz --corMethod pearson \
    --whatToPlot heatmap -o correlation.png

Key tools: plotFingerprint, plotCoverage, plotCorrelation, plotPCA, bamPEFragmentSize

Complete reference: references/tools_reference.md → "Quality Control Tools"

Visualization

Create heatmap around TSS:

# Compute matrix
computeMatrix reference-point -S signal.bw -R genes.bed \
    -b 3000 -a 3000 --referencePoint TSS -o matrix.gz

# Generate heatmap
plotHeatmap -m matrix.gz -o heatmap.png \
    --colorMap RdBu --kmeans 3

Create profile plot:

plotProfile -m matrix.gz -o profile.png \
    --plotType lines --colors blue red

Key tools: computeMatrix, plotHeatmap, plotProfile, plotEnrichment

Complete reference: references/tools_reference.md → "Visualization Tools"

Normalization Methods

Choosing the correct normalization is critical for valid comparisons. Consult references/normalization_methods.md for comprehensive guidance.

Quick selection guide:

ChIP-seq coverage: Use RPGC or CPM
ChIP-seq comparison: Use bamCompare with log2 and readCount
RNA-seq bins: Use CPM
RNA-seq genes: Use RPKM (accounts for gene length)
ATAC-seq: Use RPGC or CPM

Normalization methods:

RPGC: 1× genome coverage (requires --effectiveGenomeSize)
CPM: Counts per million mapped reads
RPKM: Reads per kilobase per million mapped reads (accounts for region length)
BPM: Bins per million mapped reads (analogous to TPM in RNA-seq)
None: Raw counts (not recommended for comparisons)
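
The arithmetic behind these methods is simple enough to write out. The functions below follow the per-bin formulas given in the deepTools documentation; they are an illustration with hypothetical function names, not the tool's own code, and the numbers in the usage lines are toy values.

```python
def cpm(reads_in_bin, total_mapped_reads):
    """Counts per million mapped reads for one bin."""
    return reads_in_bin / (total_mapped_reads / 1e6)


def rpkm(reads_in_bin, total_mapped_reads, bin_length_bp):
    """Reads per kilobase of bin per million mapped reads."""
    return reads_in_bin / ((total_mapped_reads / 1e6) * (bin_length_bp / 1e3))


def bpm(reads_in_bin, all_bin_counts):
    """Bins per million: the bin count relative to the sum over all bins."""
    return reads_in_bin / (sum(all_bin_counts) / 1e6)


def rpgc_scale_factor(total_mapped_reads, fragment_length_bp, effective_genome_size):
    """Factor that rescales a coverage track to an average of 1x."""
    average_coverage = total_mapped_reads * fragment_length_bp / effective_genome_size
    return 1.0 / average_coverage


# Toy example: 50 reads in a 10 bp bin, 20 million mapped reads
print(cpm(50, 20_000_000))
print(rpkm(50, 20_000_000, 10))
print(rpgc_scale_factor(20_000_000, 200, 2_913_022_398))
```

Note that only RPGC depends on the effective genome size, which is why it is the only method that needs --effectiveGenomeSize.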

Full explanation: references/normalization_methods.md

Effective Genome Sizes

RPGC normalization requires effective genome size. Common values:

Organism	Assembly	Effective genome size (bp)	Usage
Human	GRCh38/hg38	2,913,022,398	`--effectiveGenomeSize 2913022398`
Mouse	GRCm38/mm10	2,652,783,500	`--effectiveGenomeSize 2652783500`
Zebrafish	GRCz11	1,368,780,147	`--effectiveGenomeSize 1368780147`
Drosophila	dm6	142,573,017	`--effectiveGenomeSize 142573017`
C. elegans	ce10/ce11	100,286,401	`--effectiveGenomeSize 100286401`

Complete table with read-length-specific values: references/effective_genome_sizes.md
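
A small lookup keeps these values out of hand-typed commands. The sketch below copies the numbers from the table above; the dictionary name `EFFECTIVE_GENOME_SIZES` and the helper `rpgc_flags` are hypothetical, and read-length-specific values are not covered.

```python
# Values copied from the table above (base pairs)
EFFECTIVE_GENOME_SIZES = {
    "GRCh38": 2913022398, "hg38": 2913022398,
    "GRCm38": 2652783500, "mm10": 2652783500,
    "GRCz11": 1368780147,
    "dm6": 142573017,
    "ce10": 100286401, "ce11": 100286401,
}


def rpgc_flags(assembly):
    """Return the bamCoverage flags for RPGC normalization on an assembly."""
    try:
        size = EFFECTIVE_GENOME_SIZES[assembly]
    except KeyError:
        known = ", ".join(sorted(EFFECTIVE_GENOME_SIZES))
        raise ValueError(f"unknown assembly {assembly!r}; known: {known}") from None
    return ["--normalizeUsing", "RPGC", "--effectiveGenomeSize", str(size)]


print(" ".join(rpgc_flags("mm10")))
```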

Common Parameters Across Tools

Many deepTools commands share these options:

Performance:

--numberOfProcessors, -p: Number of worker processes; accepts an integer, max, or max/2 (use as many cores as the machine can spare)
--region: Process specific regions for testing (e.g., chr1:1-1000000)

Read Filtering:

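--ignoreDuplicates: Count reads that share the same orientation and start position only once (for paired-end data the mate's position must also match); recommended for most analyses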
--minMappingQuality: Filter by alignment quality (e.g., --minMappingQuality 10)
--minFragmentLength / --maxFragmentLength: Fragment length bounds
--samFlagInclude / --samFlagExclude: SAM flag filtering
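
The values passed to --samFlagInclude and --samFlagExclude are bitwise combinations of the flag bits defined in the SAM specification. The sketch below only does that bit arithmetic; `SAM_FLAGS` and `sam_flag_value` are hypothetical names. How deepTools treats a multi-bit value (any bit versus all bits) should be checked against the documentation for your version.

```python
# Flag bits as defined in the SAM format specification
SAM_FLAGS = {
    "paired": 0x1,
    "proper_pair": 0x2,
    "unmapped": 0x4,
    "mate_unmapped": 0x8,
    "reverse": 0x10,
    "mate_reverse": 0x20,
    "read1": 0x40,
    "read2": 0x80,
    "secondary": 0x100,
    "qc_fail": 0x200,
    "duplicate": 0x400,
    "supplementary": 0x800,
}


def sam_flag_value(*names):
    """Combine named SAM flag bits into one integer."""
    value = 0
    for name in names:
        value |= SAM_FLAGS[name]
    return value


def decode_sam_flag(value):
    """List the names of the bits set in a SAM flag value."""
    return [name for name, bit in SAM_FLAGS.items() if value & bit]


print("--samFlagExclude", sam_flag_value("duplicate"))
print("--samFlagInclude", sam_flag_value("read1"))
```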

Read Processing:

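--extendReads: Extend reads to fragment length (ChIP-seq: yes, RNA-seq: no). Single-end data needs an explicit length, e.g. --extendReads 200; paired-end data can omit the value and use the fragment size defined by the mates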
--centerReads: Center at fragment midpoint for sharper signals

Best Practices

File Validation

Always validate files first using scripts/validate_files.py to check:

File existence and readability
BAM indices present (.bai files)
BED format correctness
File sizes reasonable

Analysis Strategy

Start with QC: Run correlation, coverage, and fingerprint analysis before proceeding
Test on small regions: Use --region chr1:1-10000000 for parameter testing
Document commands: Save full command lines for reproducibility
Use consistent normalization: Apply same method across samples in comparisons
Verify genome assembly: Ensure BAM and BED files use matching genome builds

ChIP-seq Specific

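Extend reads for ChIP-seq: --extendReads 200 for single-end data; for paired-end data, --extendReads without a value uses the fragment size defined by the mates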
Remove duplicates: Use --ignoreDuplicates in most cases
Check enrichment first: Run plotFingerprint before detailed analysis
GC correction: Only apply if significant bias detected; never use --ignoreDuplicates after GC correction

RNA-seq Specific

Never extend reads for RNA-seq (would span splice junctions)
Strand-specific: Use --filterRNAstrand forward/reverse for stranded libraries
Normalization: CPM for bins, RPKM for genes

ATAC-seq Specific

Apply Tn5 correction: Use alignmentSieve with --ATACshift
Fragment filtering: Set appropriate min/max fragment lengths
Check nucleosome pattern: Fragment size plot should show ladder pattern

Performance Optimization

Use multiple processors: --numberOfProcessors 8 (or available cores)
Increase bin size for faster processing and smaller files
Process chromosomes separately for memory-limited systems
Pre-filter BAM files using alignmentSieve to create reusable filtered files
Use bigWig over bedGraph: bigWig is a compressed, indexed binary format, so files are smaller and faster to query

Troubleshooting

Common Issues

BAM index missing:

samtools index input.bam

Out of memory: Process chromosomes individually using --region:

bamCoverage --bam input.bam -o chr1.bw --region chr1
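
To apply the same idea across a whole genome, the per-chromosome calls can be generated in a loop. A minimal sketch: the function name `per_chromosome_commands` and the chromosome list are assumptions, and the names must match those in the BAM header.

```python
import shlex


def per_chromosome_commands(bam, chromosomes, bin_size=50):
    """Return one bamCoverage argument list per chromosome."""
    return [
        ["bamCoverage", "--bam", bam, "-o", f"{chrom}.bw",
         "--region", chrom, "--binSize", str(bin_size)]
        for chrom in chromosomes
    ]


# Hypothetical chromosome names; take the real ones from the BAM header
for cmd in per_chromosome_commands("input.bam", ["chr1", "chr2", "chrX"]):
    print(shlex.join(cmd))
```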

Slow processing: Increase --numberOfProcessors and/or increase --binSize

bigWig files too large: Increase the bin size, e.g. --binSize 50 or larger

Validation Errors

Run validation script to identify issues:

python scripts/validate_files.py --bam *.bam --bed regions.bed

The script output explains common errors and their solutions.

Reference Documentation

This skill includes comprehensive reference documentation:

references/tools_reference.md

Complete documentation of all deepTools commands organized by category:

BAM and bigWig processing tools (9 tools)
Quality control tools (6 tools)
Visualization tools (3 tools)
Miscellaneous tools (2 tools)

Each tool includes:

Purpose and overview
Key parameters with explanations
Usage examples
Important notes and best practices

Use this reference when: Users ask about specific tools, parameters, or detailed usage.

references/workflows.md

Complete workflow examples for common analyses:

ChIP-seq quality control workflow
ChIP-seq complete analysis workflow
RNA-seq coverage workflow
ATAC-seq analysis workflow
Multi-sample comparison workflow
Peak region analysis workflow
Troubleshooting and performance tips

Use this reference when: Users need complete analysis pipelines or workflow examples.

references/normalization_methods.md

Comprehensive guide to normalization methods:

Detailed explanation of each method (RPGC, CPM, RPKM, BPM, etc.)
When to use each method
Formulas and interpretation
Selection guide by experiment type
Common pitfalls and solutions
Quick reference table

Use this reference when: Users ask about normalization, comparing samples, or which method to use.

references/effective_genome_sizes.md

Effective genome size values and usage:

Common organism values (human, mouse, fly, worm, zebrafish)
Read-length-specific values
Calculation methods
When and how to use in commands
Custom genome calculation instructions

Use this reference when: Users need genome size for RPGC normalization or GC bias correction.

Helper Scripts

scripts/validate_files.py

Validates BAM, bigWig, and BED files for deepTools analysis. Checks file existence, indices, and format.

Usage:

python scripts/validate_files.py --bam sample1.bam sample2.bam \
    --bed peaks.bed --bigwig signal.bw

When to use: Before starting any analysis, or when troubleshooting errors.

scripts/workflow_generator.py

Generates customizable bash script templates for common deepTools workflows.

Available workflows:

chipseq_qc: ChIP-seq quality control
chipseq_analysis: Complete ChIP-seq analysis
rnaseq_coverage: Strand-specific RNA-seq coverage
atacseq: ATAC-seq with Tn5 correction

Usage:

# List workflows
python scripts/workflow_generator.py --list

# Generate workflow
python scripts/workflow_generator.py chipseq_qc -o qc.sh \
    --input-bam Input.bam --chip-bams "ChIP1.bam ChIP2.bam" \
    --genome-size 2913022398 --threads 8

# Run generated workflow
chmod +x qc.sh
./qc.sh

When to use: Users request standard workflows or need template scripts to customize.

Assets

assets/quick_reference.md

Quick reference card with most common commands, effective genome sizes, and typical workflow pattern.

When to use: Users need quick command examples without detailed documentation.

Handling User Requests

For New Users

Start with installation verification
Validate input files using scripts/validate_files.py
Recommend appropriate workflow based on experiment type
Generate workflow template using scripts/workflow_generator.py
Guide through customization and execution
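
Installation verification, the first step above, can be as simple as checking that the executables are on PATH. A minimal sketch: the tool list and the helper name `missing_tools` are assumptions, so adjust the list to the commands your analysis uses.

```python
import shutil

# Executables used by the workflows in this document (illustrative subset)
REQUIRED_TOOLS = ("bamCoverage", "bamCompare", "computeMatrix",
                  "plotHeatmap", "plotFingerprint", "samtools")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required tools found.")
```

The `which` parameter exists only so the lookup can be replaced in tests.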

For Experienced Users

Provide specific tool commands for requested operations
Reference appropriate sections in references/tools_reference.md
Suggest optimizations and best practices
Offer troubleshooting for issues

For Specific Tasks

"Convert BAM to bigWig":

Use bamCoverage with appropriate normalization
Recommend RPGC or CPM based on use case
Provide effective genome size for organism
Suggest relevant parameters (extendReads, ignoreDuplicates, binSize)

"Check ChIP quality":

Run full QC workflow or use plotFingerprint specifically
Explain interpretation of results
Suggest follow-up actions based on results

"Create heatmap":

Guide through two-step process: computeMatrix → plotHeatmap
Help choose appropriate matrix mode (reference-point vs scale-regions)
Suggest visualization parameters and clustering options
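
The choice between the two matrix modes mostly changes one or two flags. The sketch below builds the computeMatrix call for either mode; the function name `compute_matrix_command` and the default flank and body lengths are assumptions chosen for illustration.

```python
import shlex


def compute_matrix_command(mode, bigwigs, regions_bed, out="matrix.gz",
                           flank=3000, body_length=5000):
    """Build a computeMatrix argument list for the given mode."""
    if mode == "reference-point":
        # Signal is anchored at a single position, here the TSS
        extra = ["--referencePoint", "TSS"]
    elif mode == "scale-regions":
        # Every region is stretched or shrunk to a common body length
        extra = ["--regionBodyLength", str(body_length)]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return ["computeMatrix", mode, "-S", *bigwigs, "-R", regions_bed,
            "-b", str(flank), "-a", str(flank), *extra, "-o", out]


print(shlex.join(compute_matrix_command("reference-point", ["signal.bw"], "genes.bed")))
print(shlex.join(compute_matrix_command("scale-regions", ["signal.bw"], "genes.bed")))
```

Use reference-point when the question is about signal at a fixed landmark such as the TSS, and scale-regions when signal across whole gene bodies matters.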

"Compare samples":

Recommend bamCompare for two-sample comparison
Suggest multiBamSummary + plotCorrelation for multiple samples
Guide normalization method selection

Referencing Documentation

When users need detailed information:

Tool details: Direct to specific sections in references/tools_reference.md
Workflows: Use references/workflows.md for complete analysis pipelines
Normalization: Consult references/normalization_methods.md for method selection
Genome sizes: Reference references/effective_genome_sizes.md

Search references using grep patterns:

# Find tool documentation
grep -A 20 "^### toolname" references/tools_reference.md

# Find workflow
grep -A 50 "^## Workflow Name" references/workflows.md

# Find normalization method
grep -A 15 "^### Method Name" references/normalization_methods.md

Example Interactions

User: "I need to analyze my ChIP-seq data"

Response approach:

Ask about files available (BAM files, peaks, genes)
Validate files using validation script
Generate chipseq_analysis workflow template
Customize for their specific files and organism
Explain each step as the script runs

User: "Which normalization should I use?"

Response approach:

Ask about experiment type (ChIP-seq, RNA-seq, etc.)
Ask about comparison goal (within-sample or between-sample)
Consult references/normalization_methods.md selection guide
Recommend appropriate method with justification
Provide command example with parameters

User: "Create a heatmap around TSS"

Response approach:

Verify bigWig and gene BED files available
Use computeMatrix with reference-point mode at TSS
Generate plotHeatmap with appropriate visualization parameters
Suggest clustering if dataset is large
Offer profile plot as complement

Key Reminders

File validation first: Always validate input files before analysis
Normalization matters: Choose appropriate method for comparison type
Extend reads carefully: YES for ChIP-seq, NO for RNA-seq
Use all cores: Set --numberOfProcessors to available cores
Test on regions: Use --region for parameter testing
Check QC first: Run quality control before detailed analysis
Document everything: Save commands for reproducibility
Reference documentation: Use comprehensive references for detailed guidance

GitHub repository

K-Dense-AI/claude-scientific-skills

Path: skills/deeptools

Tags: agent-skills, ai-scientist, bioinformatics, chemoinformatics, claude, claude-skills

FAQ

What is the deeptools skill?

deeptools is a Claude Skill by K-Dense-AI. Skills package instructions and resources that Claude loads on demand, so Claude can perform deeptools-related tasks without extra prompting.

How do I install deeptools?

Use the install commands on this page: add deeptools to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does deeptools belong to?

deeptools is in the Design category, tagged design.

Is deeptools free to use?

Yes. deeptools is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.
