scvelo

K-Dense-AI
Updated today

About

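The scvelo skill enables RNA velocity analysis, which infers cell state transitions from single-cell RNA-seq data by modeling unspliced/spliced mRNA dynamics. Use it when you need to analyze cell differentiation paths and fate decisions; it complements trajectory inference tools such as Scanpy by predicting trajectory direction, computing latent time, and identifying driver genes.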

Quick install

Claude Code

Recommended (default):
npx skills add K-Dense-AI/claude-scientific-skills -a claude-code

Alternative (plugin command):
/plugin add https://github.com/K-Dense-AI/claude-scientific-skills

Alternative (Git clone):
git clone https://github.com/K-Dense-AI/claude-scientific-skills.git ~/.claude/skills/scvelo

Copy and paste one of these commands into Claude Code to install the skill.

Documentation

scVelo — RNA Velocity Analysis

Overview

scVelo is a widely used Python package for RNA velocity analysis of single-cell RNA-seq data. It infers cell state transitions by modeling the kinetics of mRNA splicing: the balance between unspliced (pre-mRNA) and spliced (mature mRNA) abundances indicates whether a gene is being upregulated or downregulated in each cell. This allows reconstruction of developmental trajectories and identification of cell fate decisions without requiring time-course data.
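
To make the unspliced-versus-spliced idea concrete, here is a minimal NumPy sketch of the steady-state intuition for a single gene. It is not scVelo's implementation: the function name steady_state_velocity and the toy counts are illustrative, and the slope gamma is fitted over all cells by least squares through the origin, which is a simplifying assumption.

```python
import numpy as np

def steady_state_velocity(unspliced, spliced):
    """Return (gamma, velocity) for one gene under a steady-state assumption."""
    u = np.asarray(unspliced, dtype=float)
    s = np.asarray(spliced, dtype=float)
    # Least-squares slope of u on s through the origin (assumed steady-state ratio).
    gamma = float(u @ s / (s @ s))
    # Positive residual: more unspliced RNA than expected, so the gene is being induced.
    # Negative residual: less unspliced RNA than expected, so the gene is being repressed.
    velocity = u - gamma * s
    return gamma, velocity

# Toy counts for four cells: cells 0-1 are near equilibrium,
# cell 2 has excess unspliced RNA, cell 3 has a deficit.
spliced = np.array([2.0, 4.0, 2.0, 4.0])
unspliced = np.array([1.0, 2.0, 2.0, 1.0])
gamma, velocity = steady_state_velocity(unspliced, spliced)
print(gamma, velocity)
```

In this toy example the third cell gets a positive velocity (induction) and the fourth a negative one (repression).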

Installation: pip install scvelo

Key resources:

When to Use This Skill

Use scVelo when:

  • Trajectory inference from snapshot data: Determine which direction cells are differentiating
  • Cell fate prediction: Identify progenitor cells and their downstream fates
  • Driver gene identification: Find genes whose dynamics best explain observed trajectories
  • Developmental biology: Model hematopoiesis, neurogenesis, epithelial-to-mesenchymal transitions
  • Latent time estimation: Order cells along a pseudotime derived from splicing dynamics
  • Complement to Scanpy: Add directional information to UMAP embeddings

Prerequisites

scVelo requires count matrices for both unspliced and spliced RNA. These are generated by:

  1. STARsolo with --soloFeatures Gene Velocyto, or kallisto|bustools with the lamanno workflow
  2. velocyto CLI: velocyto run10x / velocyto run
  3. alevin-fry / simpleaf with spliced/unspliced output

Data is stored in an AnnData object with layers["spliced"] and layers["unspliced"].
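
Before starting, it can help to confirm that both layers are present. The sketch below uses a hypothetical helper, missing_velocity_layers, which only assumes that the object exposes a dict-like .layers attribute (as AnnData does). A SimpleNamespace stands in for a real AnnData so the example runs without extra dependencies.

```python
from types import SimpleNamespace
import numpy as np

REQUIRED_LAYERS = ("spliced", "unspliced")

def missing_velocity_layers(adata):
    """List the required layers that are absent from adata.layers."""
    return [name for name in REQUIRED_LAYERS if name not in adata.layers]

# Stand-in object for demonstration; it only carries a 'spliced' layer.
demo = SimpleNamespace(layers={"spliced": np.zeros((3, 2))})
missing = missing_velocity_layers(demo)
print(missing)
```

An empty list means both layers are available and the workflow can proceed.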

Standard RNA Velocity Workflow

1. Setup and Data Loading

import scvelo as scv
import scanpy as sc
import numpy as np
import matplotlib.pyplot as plt

# Configure settings
scv.settings.verbosity = 3       # Show computation steps
scv.settings.presenter_view = True
scv.settings.set_figure_params('scvelo')

# Load data (AnnData with spliced/unspliced layers)
# Option A: Load from loom (velocyto output)
adata = scv.read("cellranger_output.loom", cache=True)

# Option B: Merge velocyto loom with Scanpy-processed AnnData
adata_processed = sc.read_h5ad("processed.h5ad")  # Has UMAP, clusters
adata_velocity = scv.read("velocyto.loom")
adata = scv.utils.merge(adata_processed, adata_velocity)

# Verify layers
print(adata)
# obs × var: N × G
# layers: 'spliced', 'unspliced' (required)
# obsm['X_umap'] (required for visualization)

2. Preprocessing

# Filter and normalize (follows Scanpy conventions)
scv.pp.filter_and_normalize(
    adata,
    min_shared_counts=20,   # Minimum counts in spliced+unspliced
    n_top_genes=2000        # Top highly variable genes
)

# Compute first and second order moments (means and uncentered variances)
# Build the neighbor graph first; the moments are averages over each cell's neighbors
sc.pp.neighbors(adata, n_neighbors=30, n_pcs=30)
scv.pp.moments(
    adata,
    n_pcs=30,
    n_neighbors=30
)
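
As a rough illustration of what a first-order moment is, the following sketch averages each cell's expression over its nearest neighbors. It is a simplification under stated assumptions: neighbors are found by brute-force Euclidean distance, every neighbor gets equal weight, and the names knn_first_moments, coords, and Ms are illustrative rather than scVelo internals.

```python
import numpy as np

def knn_first_moments(layer, coords, n_neighbors=3):
    """Average each cell's expression over its nearest neighbors (self included)."""
    layer = np.asarray(layer, dtype=float)
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between cells in the reduced space.
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Indices of the closest cells; each cell is its own nearest neighbor (distance 0).
    nn_idx = np.argsort(dists, axis=1)[:, :n_neighbors]
    # layer[nn_idx] has shape (cells, neighbors, genes); average over neighbors.
    return layer[nn_idx].mean(axis=1)

coords = np.array([[0.0], [1.0], [2.5], [10.0]])     # hypothetical 1-D positions
spliced = np.array([[0.0], [3.0], [6.0], [100.0]])   # one gene, four cells
Ms = knn_first_moments(spliced, coords, n_neighbors=2)
print(Ms.ravel())
```

Larger n_neighbors values smooth more aggressively, which reduces noise but can blur real differences between neighboring cell states.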

3. Velocity Estimation — Stochastic Model

The stochastic model is fast and suitable for exploratory analysis:

# Stochastic velocity (faster, less accurate)
scv.tl.velocity(adata, mode='stochastic')
scv.tl.velocity_graph(adata)

# Visualize
scv.pl.velocity_embedding_stream(
    adata,
    basis='umap',
    color='leiden',
    title="RNA Velocity (Stochastic)"
)
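
The velocity graph is built from cosine similarities between each cell's velocity vector and the displacement towards other cells. The sketch below shows that scoring idea on dense toy arrays. It compares every pair of cells rather than only neighbors, and the function name velocity_cosines is illustrative; scVelo's conversion of these scores into transition probabilities is omitted.

```python
import numpy as np

def velocity_cosines(X, V):
    """cos[i, j] = cosine of the angle between V[i] and the displacement X[j] - X[i]."""
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    delta = X[None, :, :] - X[:, None, :]          # delta[i, j] = X[j] - X[i]
    dots = np.einsum("ijk,ik->ij", delta, V)       # dot product of delta[i, j] with V[i]
    norms = np.linalg.norm(delta, axis=-1) * np.linalg.norm(V, axis=-1)[:, None]
    # Zero-length displacements or velocities get a score of 0 instead of NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(norms > 0, dots / norms, 0.0)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])    # toy cell positions
V = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, -1.0]])   # toy velocity vectors
cos = velocity_cosines(X, V)
print(cos.round(3))
```

A score near 1 means cell j lies in the direction cell i is moving; scores near 0 or below mark unlikely transitions.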

4. Velocity Estimation — Dynamical Model (Recommended)

The dynamical model fits the full splicing kinetics and is more accurate:

# Recover dynamics (computationally intensive; ~10-30 min for 10K cells)
scv.tl.recover_dynamics(adata, n_jobs=4)

# Compute velocity from dynamical model
scv.tl.velocity(adata, mode='dynamical')
scv.tl.velocity_graph(adata)
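
The splicing kinetics referred to here are commonly written as du/dt = alpha - beta*u and ds/dt = beta*u - gamma*s, with transcription rate alpha, splicing rate beta, and degradation rate gamma. The sketch below integrates these equations with a simple Euler scheme for one gene with constant rates. The function name simulate_splicing, the rate values, and the step size are illustrative assumptions; this is not how scVelo fits the model.

```python
import numpy as np

def simulate_splicing(alpha, beta, gamma, t_max=50.0, dt=0.01, u0=0.0, s0=0.0):
    """Euler integration of du/dt = alpha - beta*u, ds/dt = beta*u - gamma*s."""
    n_steps = int(round(t_max / dt))
    u = np.empty(n_steps + 1)
    s = np.empty(n_steps + 1)
    u[0], s[0] = u0, s0
    for i in range(n_steps):
        du = alpha - beta * u[i]          # transcription minus splicing
        ds = beta * u[i] - gamma * s[i]   # splicing minus degradation
        u[i + 1] = u[i] + dt * du
        s[i + 1] = s[i] + dt * ds
    return u, s

# Induction from zero: the gene switches on at t = 0.
u, s = simulate_splicing(alpha=2.0, beta=1.0, gamma=0.5)
print(u[-1], s[-1])   # approaches alpha/beta and alpha/gamma
```

During induction the unspliced abundance runs ahead of its steady-state ratio to spliced abundance, which is the signal that a positive velocity captures.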

5. Latent Time

The dynamical model enables computation of a shared latent time (pseudotime):

# Compute latent time
scv.tl.latent_time(adata)

# Visualize latent time on UMAP
scv.pl.scatter(
    adata,
    color='latent_time',
    color_map='gnuplot',
    size=80,
    title='Latent time'
)

# Identify top genes ordered by latent time
top_genes = adata.var['fit_likelihood'].sort_values(ascending=False).index[:300]
scv.pl.heatmap(
    adata,
    var_names=top_genes,
    sortby='latent_time',
    col_color='leiden',
    n_convolve=100
)
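
The sortby and n_convolve arguments order cells along a time variable and smooth expression with a moving window. The sketch below shows that idea for one gene, assuming a plain moving average with zero padding at the edges; the function name smooth_along_time and the toy values are illustrative, and scVelo's exact smoothing may differ.

```python
import numpy as np

def smooth_along_time(expression, time, window=3):
    """Sort cells by time, then apply a moving average of the given window size."""
    order = np.argsort(time)
    kernel = np.ones(window) / window
    sorted_expr = np.asarray(expression, dtype=float)[order]
    # mode="same" keeps the length; values near the edges are damped by zero padding.
    smoothed = np.convolve(sorted_expr, kernel, mode="same")
    return order, smoothed

time = np.array([0.9, 0.1, 0.5, 0.3, 0.7])        # toy latent time per cell
expression = np.array([9.0, 1.0, 5.0, 3.0, 7.0])  # toy expression of one gene
order, smoothed = smooth_along_time(expression, time, window=3)
print(order, smoothed)
```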

6. Driver Gene Analysis

# Rank genes by cluster-specific differential velocity (needs a 'leiden' column in adata.obs)
scv.tl.rank_velocity_genes(adata, groupby='leiden', min_corr=0.3)
df = scv.DataFrame(adata.uns['rank_velocity_genes']['names'])
print(df.head(10))

# Speed and coherence
scv.tl.velocity_confidence(adata)
scv.pl.scatter(
    adata,
    c=['velocity_length', 'velocity_confidence'],
    cmap='coolwarm',
    perc=[5, 95]
)

# Phase portraits for specific genes (example gene symbols; replace with genes present in adata.var_names)
scv.pl.velocity(adata, ['Cpe', 'Gnao1', 'Ins2'],
               ncols=3, figsize=(16, 4))

7. Velocity Arrows and Pseudotime

# Arrow plot on UMAP
scv.pl.velocity_embedding(
    adata,
    arrow_length=3,
    arrow_size=2,
    color='leiden',
    basis='umap'
)

# Stream plot (cleaner visualization)
scv.pl.velocity_embedding_stream(
    adata,
    basis='umap',
    color='leiden',
    smooth=0.8,
    min_mass=4
)

# Velocity pseudotime (alternative to latent time)
scv.tl.velocity_pseudotime(adata)
scv.pl.scatter(adata, color='velocity_pseudotime', cmap='gnuplot')

8. PAGA Trajectory Graph

# PAGA graph with velocity-informed transitions
scv.tl.paga(adata, groups='leiden')
df = scv.get_df(adata, 'paga/transitions_confidence', precision=2).T
df.style.background_gradient(cmap='Blues').format('{:.2g}')

# Plot PAGA with velocity
scv.pl.paga(
    adata,
    basis='umap',
    size=50,
    alpha=0.1,
    min_edge_width=2,
    node_size_scale=1.5
)

Complete Workflow Script

import scvelo as scv
import scanpy as sc

def run_rna_velocity(adata, n_top_genes=2000, mode='dynamical', n_jobs=4, groupby='leiden'):
    """
    Complete RNA velocity workflow.

    Args:
        adata: AnnData with 'spliced' and 'unspliced' layers; UMAP in obsm for later plotting
        n_top_genes: Number of top HVGs for velocity
        mode: 'stochastic' (fast), 'deterministic', or 'dynamical' (accurate)
        n_jobs: Parallel jobs for dynamical model
        groupby: Column in adata.obs with cluster labels, used to rank velocity genes

    Returns:
        Processed AnnData with velocity information
    """
    scv.settings.verbosity = 2

    # 1. Preprocessing
    scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=n_top_genes)

    if 'neighbors' not in adata.uns:
        sc.pp.neighbors(adata, n_neighbors=30, n_pcs=30)

    scv.pp.moments(adata, n_pcs=30, n_neighbors=30)

    # 2. Velocity estimation
    if mode == 'dynamical':
        scv.tl.recover_dynamics(adata, n_jobs=n_jobs)

    scv.tl.velocity(adata, mode=mode)
    scv.tl.velocity_graph(adata)

    # 3. Downstream analyses
    if mode == 'dynamical':
        scv.tl.latent_time(adata)  # requires the fitted dynamics

    # Gene ranking works for any mode but needs cluster labels
    if groupby in adata.obs:
        scv.tl.rank_velocity_genes(adata, groupby=groupby, min_corr=0.3)

    scv.tl.velocity_confidence(adata)
    scv.tl.velocity_pseudotime(adata)

    return adata

Key Output Fields in AnnData

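After running the workflow, the following fields are added. The fit_* fields and latent_time require the dynamical model, and velocity_umap is written only once velocities are projected onto the UMAP embedding (for example by the embedding plots):

| Location | Key | Description |
|---|---|---|
| adata.layers | velocity | RNA velocity per gene per cell |
| adata.layers | fit_t | Fitted latent time per gene per cell |
| adata.obsm | velocity_umap | 2D velocity vectors on UMAP |
| adata.obs | velocity_pseudotime | Pseudotime from velocity |
| adata.obs | latent_time | Latent time from dynamical model |
| adata.obs | velocity_length | Speed of each cell |
| adata.obs | velocity_confidence | Confidence score per cell |
| adata.var | fit_likelihood | Gene-level model fit quality |
| adata.var | fit_alpha | Transcription rate |
| adata.var | fit_beta | Splicing rate |
| adata.var | fit_gamma | Degradation rate |
| adata.uns | velocity_graph | Cell-to-cell graph of cosine similarities between velocities and potential transitions; transition probabilities are derived from it |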

Velocity Models Comparison

| Model | Speed | Accuracy | When to Use |
|---|---|---|---|
| stochastic | Fast | Moderate | Exploratory; large datasets |
| deterministic | Fast | Lower | Simple linear (steady-state) kinetics |
| dynamical | Slow | High | Publication-quality; identifies driver genes |

Best Practices

  • Start with stochastic mode for exploration; switch to dynamical for the final analysis
  • Ensure good coverage of unspliced reads: short reads (< 100 bp) may miss intronic sequence
  • Aim for at least roughly 2,000 cells: RNA velocity estimates are noisy with fewer cells
  • Check that velocity is coherent: arrows should follow known biology; random-looking arrows indicate a problem
  • Tune the k-NN neighborhood size: too few neighbors → noisy velocity; too many → oversmoothed
  • Sanity check: root cells (progenitors) should show high unspliced/spliced ratios for marker genes that are being switched on
  • The dynamical model requires distinct kinetic states: it works best for clear differentiation processes
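
One way to check unspliced coverage is to compute, per cell, the fraction of counts that are unspliced. The sketch below does this for dense count arrays; the function name unspliced_fraction and the toy counts are illustrative, and no particular threshold is implied. With a real AnnData, scv.pl.proportions(adata) gives a related overview.

```python
import numpy as np

def unspliced_fraction(spliced, unspliced):
    """Per-cell fraction of counts that are unspliced (NaN for cells with no counts)."""
    s = np.asarray(spliced, dtype=float).sum(axis=1)
    u = np.asarray(unspliced, dtype=float).sum(axis=1)
    total = s + u
    # Only divide where a cell has counts; other entries keep the NaN fill value.
    return np.divide(u, total, out=np.full_like(total, np.nan), where=total > 0)

spliced = np.array([[8, 2], [0, 0]])     # toy counts: two cells, two genes
unspliced = np.array([[1, 1], [0, 0]])
frac = unspliced_fraction(spliced, unspliced)
print(frac)
```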

Troubleshooting

| Problem | Solution |
|---|---|
| Missing unspliced layer | Re-run velocyto or use STARsolo with --soloFeatures Gene Velocyto |
| Very few velocity genes | Lower min_shared_counts; check sequencing depth |
| Random-looking arrows | Try different n_neighbors or velocity model |
| Memory error with dynamical | Set n_jobs=1; reduce n_top_genes |
| Negative velocity everywhere | Check that spliced/unspliced layers are not swapped |

Additional Resources

GitHub repository

K-Dense-AI/claude-scientific-skills
Path: skills/scvelo
