SKILL·CF5602

experimental-design

Name: experimental-design
Author: K-Dense-AI

Updated 19 days ago


Category: Testing
Tags: testing, design, data

About

This skill helps developers design statistically robust experiments before data collection, covering methods such as randomization, blocking, and factorial designs. It triggers on questions about study setup, group assignment, and avoiding confounders to ensure interpretable results. Use it for planning; for power analysis or analysis of existing data, use the respective companion skills.

Quick install

Claude Code (recommended)

Primary

npx skills add K-Dense-AI/claude-scientific-skills -a claude-code

Plugin command (alternative)

/plugin add https://github.com/K-Dense-AI/claude-scientific-skills

Git clone (alternative)

git clone https://github.com/K-Dense-AI/claude-scientific-skills.git ~/.claude/skills/experimental-design

Copy and paste this command into Claude Code to install this skill.

Documentation

Experimental Design

Overview

The design of a study — how units are assigned to conditions, what is held constant, what is varied, and in what structure — determines what questions the data can answer. No analysis can rescue a confounded or pseudoreplicated design after the fact. This skill is about the decisions made before data collection: picking a design that isolates the effect of interest, randomizing to license causal claims, blocking to remove known nuisance variation, and structuring multi-factor experiments so effects are estimable rather than tangled together.

The three ideas behind almost every good design (Fisher's principles):

Randomization — assign treatments at random so that confounders, known and unknown, are balanced in expectation. This is what turns a comparison into a causal claim.
Replication — independent repetition at the right level, so you can estimate variability and your effects aren't artifacts of a single unit. The most common fatal error is pseudoreplication: counting repeated measurements on the same unit as independent replicates.
Blocking / local control — group similar units (by batch, day, site, litter) and randomize within blocks, removing that nuisance variation from the error term instead of letting it inflate noise.

This skill helps you choose among design types, generate the actual randomization or DOE layout (with reproducible scripts), and avoid the structural mistakes that make data uninterpretable.
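The three principles above can be sketched in a few lines of standard-library Python. This is a minimal illustration of a randomized complete block layout, not the bundled script: the function name `randomized_block_layout` and the block labels are hypothetical. Each block (here, a day) receives every treatment once (replication across blocks), and the order is shuffled independently inside each block (randomization within blocks).

```python
import random

def randomized_block_layout(blocks, treatments, seed):
    """Run every treatment once per block, in a freshly shuffled order per block."""
    rng = random.Random(seed)  # seeded so the layout can be regenerated
    layout = []
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # randomization happens inside each block
        for position, treatment in enumerate(order, start=1):
            layout.append(
                {"block": block, "position": position, "treatment": treatment}
            )
    return layout

# Hypothetical example: three days as blocks, three conditions.
layout = randomized_block_layout(["day1", "day2", "day3"], ["A", "B", "C"], seed=42)
for row in layout:
    print(row["block"], row["position"], row["treatment"])
```

Because day is balanced across treatments by construction, day-to-day variation can be separated from the treatment comparison instead of inflating the error term.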

When to Use This Skill

Planning any comparative experiment or trial and deciding how to assign units
Randomizing subjects/samples to arms (simple, blocked, stratified, or cluster)
Removing nuisance variation by blocking or stratification
Designing multi-factor experiments: full or fractional factorial, screening designs
Optimizing a response over continuous factors (response-surface designs)
Within-subject / repeated-measures, crossover, split-plot, or Latin-square designs
Cluster- or group-randomized designs (sites, clinics, classrooms, litters)
Deciding the number and level of replicates and avoiding pseudoreplication
Sequential, group-sequential, or adaptive designs with interim analyses
Laying out plates/batches and randomizing run order to defeat drift

Installation

uv pip install "numpy>=1.26" "pandas>=2.0" pyDOE3

pyDOE3 is the maintained successor to pyDOE/pyDOE2 and supplies factorial, fractional-factorial, Plackett-Burman, central-composite, Box-Behnken, and Latin-hypercube generators. The bundled scripts wrap it to return designs in real factor units with named columns and randomized run order.

Choosing a design

Start from the question and the structure of your units, not from a favorite design.

What are you trying to learn?
│
├─ Compare a few predefined conditions (A vs B vs C)?
│   ├─ Units independent, possibly with a known nuisance factor (day, batch, site)?
│   │     → Completely randomized (no nuisance) or RANDOMIZED BLOCK design.
│   ├─ Each unit can receive every condition in sequence (washout possible)?
│   │     → CROSSOVER / repeated-measures design (more power, watch carry-over).
│   └─ You can only randomize groups, not individuals (schools, clinics)?
│         → CLUSTER-randomized design (analyze at the cluster level; see pseudoreplication).
│
├─ Screen MANY factors (5+) to find the few that matter?
│     → FRACTIONAL FACTORIAL or PLACKETT-BURMAN screening design.
│
├─ Quantify main effects AND interactions among a handful of factors?
│     → FULL 2^k FACTORIAL design.
│
├─ Find the settings that OPTIMIZE a response (curvature matters)?
│     → RESPONSE-SURFACE design: central composite or Box-Behnken.
│
└─ Explore a simulation/computer model over a continuous space?
      → SPACE-FILLING design: Latin hypercube.
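Run count is often what decides between branches of the tree. The sketch below computes textbook run counts for two-level designs; the helper names are hypothetical and are not part of the bundled scripts. It assumes a central composite design built on a full factorial core with a user-chosen number of center points, and a Plackett-Burman design sized to the smallest multiple of 4 that exceeds the number of factors.

```python
def full_factorial_runs(k):
    """Two-level full factorial: every combination of k factors."""
    return 2 ** k

def fractional_factorial_runs(k, p):
    """Two-level 2^(k-p) fraction: p factors are defined by generators."""
    if not 0 <= p < k:
        raise ValueError("p must satisfy 0 <= p < k")
    return 2 ** (k - p)

def plackett_burman_runs(k):
    """Smallest multiple of 4 strictly greater than k (N runs screen up to N-1 factors)."""
    return 4 * (k // 4 + 1)

def central_composite_runs(k, n_center=1):
    """Factorial core (2^k) + axial points (2k) + center points (a design choice)."""
    return 2 ** k + 2 * k + n_center

for k in (3, 5, 7):
    print(
        f"k={k}: full={full_factorial_runs(k)}, "
        f"half={fractional_factorial_runs(k, 1)}, "
        f"PB={plackett_burman_runs(k)}, "
        f"CCD={central_composite_runs(k, n_center=4)}"
    )
```

Multiply these counts by the number of replicates per run to get the total experimental effort.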

Detailed guidance per branch:

Randomization, blocking, stratification, controls → references/randomization_and_blocking.md
Factorial, fractional-factorial, screening, response-surface, DOE concepts (aliasing, resolution) → references/factorial_and_doe.md
Crossover, repeated-measures, split-plot, Latin-square, cluster, nested designs → references/design_types.md
Sequential, group-sequential, and adaptive designs (interim analyses) → references/sequential_and_adaptive.md

Generating the design

Two scripts produce ready-to-use, reproducible layouts. Run them from the skill's scripts/ directory or add it to sys.path. Everything is seeded so the exact schedule can be archived and regenerated — a requirement for trial registration and good lab practice.

Randomization / allocation schedules — `scripts/randomization.py`

from randomization import (
    simple_randomization, block_randomization,
    stratified_block_randomization, cluster_randomization,
    assign_factorial_runs, arm_balance,
)

# Permuted blocks keep the arms balanced throughout enrollment (use for n < ~100
# or sequential intake — simple randomization can drift out of balance with small n)
sched = block_randomization(n=60, arms=["treatment", "control"], seed=42)

# Balance a prognostic variable across arms by randomizing within each stratum
sched = stratified_block_randomization({"siteA": 30, "siteB": 30},
                                       arms=["drug", "placebo"], ratio=(2, 1), seed=42)

# Randomize whole clusters, not individuals (the cluster is the unit)
sched = cluster_randomization(["clinic1", "clinic2", "clinic3", "clinic4"], seed=42)

arm_balance(sched)            # sanity-check the counts per arm
sched.to_csv("allocation_schedule.csv", index=False)

Choosing among them: simple is fine for large n but can produce imbalance with small n; block guarantees balance throughout; stratified block additionally balances a known prognostic factor; cluster is mandatory when the intervention is delivered at a group level. See references/randomization_and_blocking.md.
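To see why permuted blocks are preferred for small or sequential enrollment, the following standard-library sketch compares the worst imbalance reached during enrollment under simple and permuted-block allocation. It does not call the bundled `randomization.py`; the function names and the block size of 4 are assumptions made for illustration.

```python
import random

def simple_schedule(n, arms, rng):
    """Independent draw for every unit; arm counts are not constrained."""
    return [rng.choice(arms) for _ in range(n)]

def permuted_block_schedule(n, arms, block_size, rng):
    """Shuffle complete blocks that each contain every arm equally often."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    schedule = []
    while len(schedule) < n:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n]

def max_running_imbalance(schedule, arms):
    """Largest gap between the fullest and emptiest arm at any point in enrollment."""
    counts = {arm: 0 for arm in arms}
    worst = 0
    for arm in schedule:
        counts[arm] += 1
        worst = max(worst, max(counts.values()) - min(counts.values()))
    return worst

arms = ["treatment", "control"]
rng = random.Random(42)
simple = simple_schedule(60, arms, rng)
blocked = permuted_block_schedule(60, arms, block_size=4, rng=rng)

print("simple :", max_running_imbalance(simple, arms))
print("blocked:", max_running_imbalance(blocked, arms))
```

With two arms and blocks of 4, the running imbalance can never exceed 2, and the arms are exactly equal whenever a block completes. Simple randomization carries no such guarantee.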

DOE matrices — `scripts/doe_designs.py`

from doe_designs import (
    full_factorial, two_level_factorial, fractional_factorial,
    plackett_burman, central_composite, box_behnken, latin_hypercube,
)

# Factors as real-world (low, high) ranges -> design comes back in real units
factors = {"temp_C": (20, 60), "conc_mM": (1, 10), "pH": (6, 8)}

# Full 2^3: all main effects + all interactions (8 runs), run order randomized
design = two_level_factorial(factors, seed=42)

# Screen 7 factors cheaply (main effects only)
many = {f"factor_{i}": (0, 1) for i in range(7)}
design = plackett_burman(many, seed=42)

# Optimize over 2 factors with curvature (response-surface)
design = central_composite({"temp_C": (20, 60), "conc_mM": (1, 10)}, seed=42)

design.to_csv("experimental_runs.csv", index=False)

Run order is randomized by default so factors aren't confounded with time/drift (machine warm-up, reagent aging). See references/factorial_and_doe.md for picking generators, reading the alias structure, and choosing resolution.
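Alias structure is easier to grasp from a concrete case. The sketch below builds a 4-run half fraction of the three-factor example by hand, using the generator C = A·B, and checks that each main-effect column is identical to a two-factor interaction column. It uses only the standard library and does not reflect how `doe_designs.py` constructs its designs; the helper names are hypothetical.

```python
from itertools import product

# Factor ranges copied from the example above: (low, high) in real units.
factors = {"temp_C": (20, 60), "conc_mM": (1, 10), "pH": (6, 8)}

# Full 2^2 in coded units (-1/+1) for the first two factors, then set the
# third column with the generator C = A*B: a 2^(3-1) half fraction.
half_fraction = [(a, b, a * b) for a, b in product([-1, 1], repeat=2)]

def column(design, index):
    """Extract one coded factor column from the design."""
    return [run[index] for run in design]

def elementwise_product(x, y):
    """Coded column of a two-factor interaction."""
    return [xi * yi for xi, yi in zip(x, y)]

A, B, C = (column(half_fraction, i) for i in range(3))

# In this fraction every main effect shares its column with an interaction,
# so the two cannot be estimated separately (resolution III).
aliases = {
    "A = BC": A == elementwise_product(B, C),
    "B = AC": B == elementwise_product(A, C),
    "C = AB": C == elementwise_product(A, B),
}
print(aliases)

def to_real_units(design, factors):
    """Map coded -1/+1 levels to the low/high value of each named factor."""
    names = list(factors)
    return [
        {name: factors[name][0] if level == -1 else factors[name][1]
         for name, level in zip(names, run)}
        for run in design
    ]

runs = to_real_units(half_fraction, factors)
for run in runs:
    print(run)
```

An apparent pH effect in this design could equally be a temperature-by-concentration interaction; separating them requires the other half of the fraction or a higher-resolution design.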

The mistakes that ruin studies

These are structural — they can't be fixed in analysis, only in design.

Pseudoreplication. Treating repeated measurements of one unit as independent replicates: 3 mice with 100 cells each is n = 3 (mice), not n = 300 (cells), for any treatment applied to the mouse. The replicate must be at the level at which the treatment is randomized. Hurlbert (1984) documented how widespread this error was in published ecological field experiments. Randomize and replicate at the right level; analyze with the nesting respected (for example, a mixed model). See references/design_types.md.
Confounding by a nuisance variable. Running all treatment samples on Monday and all controls on Tuesday confounds treatment with day. Randomize across, or block on, every nuisance factor you can name (batch, day, plate, technician, instrument, position).
No or broken randomization. Convenience assignment (first-come → treatment) lets confounders sneak in. Use a seeded schedule and follow it.
No proper control. Without a concurrent control (and, where relevant, a vehicle/sham and blinding), you can't separate the treatment effect from time, placebo, or handling effects.
Batch effects mistaken for biology. In omics especially, process samples in a randomized/blocked order across batches; never let batch align with the condition.
Edge/position effects on plates. Evaporation and thermal gradients make plate edges differ. Randomize or block sample positions; don't put all controls in column 1.
Aliasing ignored in fractional designs. A low-resolution fractional factorial confounds main effects with interactions; know your alias structure before concluding a factor "has no effect."
Optimizing without curvature. A two-level factorial can't detect a curved response; you'll miss an interior optimum. Use a response-surface design.
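The pseudoreplication item in the list above can be made concrete with synthetic numbers. This sketch collapses simulated cell-level values to one mean per mouse before any comparison, which is a simpler alternative to the mixed model mentioned above, not a replacement for it. All values are generated from a seeded random number generator and the mouse labels are hypothetical; nothing here is real data.

```python
import random
from collections import Counter, defaultdict
from statistics import mean

rng = random.Random(0)  # synthetic values only; no real data

# Treatment is applied to the mouse, so the mouse is the experimental unit.
mouse_arm = {
    "m1": "treated", "m2": "treated", "m3": "treated",
    "m4": "control", "m5": "control", "m6": "control",
}

# 100 simulated cell measurements per mouse: subsamples, not replicates.
cells = [(mouse, rng.gauss(0.0, 1.0)) for mouse in mouse_arm for _ in range(100)]

# Collapse to one summary value per experimental unit.
by_mouse = defaultdict(list)
for mouse, value in cells:
    by_mouse[mouse].append(value)
mouse_means = {mouse: mean(values) for mouse, values in by_mouse.items()}

# The sample size for the treatment comparison is the number of mice per arm.
n_per_arm = Counter(mouse_arm[mouse] for mouse in mouse_means)
print(len(cells), "cell measurements")
print(dict(n_per_arm))
```

More cells per mouse sharpen each mouse's mean, but only more mice add replicates for the treatment comparison.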

Workflow

State the question, the unit, and the response. What is randomized? What is measured? At what level is a true independent replicate? This determines everything.
List nuisance factors (batch, day, site, operator, position) — plan to block, stratify, or randomize across each.
Pick the design using the decision tree and reference files.
Decide replication at the correct level (and get n from the statistical-power skill for the chosen design).
Generate the layout with randomization.py / doe_designs.py, seeded.
Randomize run/processing order and plate/batch positions.
Document the design, seed, and schedule (pre-register if possible) so the analysis is confirmatory and the layout is auditable.
Match the analysis to the design — blocks, strata, clusters, and nesting must appear in the model (hand off to statistical-analysis / statsmodels).
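The generate, randomize, and document steps above can be sketched for a plate layout as follows. This is an illustrative standard-library example, not the bundled script: the function names, the 8 x 12 well naming, and the sample IDs are all assumptions. It draws wells at random with a fixed seed and serializes the seed together with the layout so the assignment can be audited and regenerated.

```python
import json
import random
from string import ascii_uppercase

def plate_wells(n_rows=8, n_cols=12):
    """Well names A1..H12 for a plate with the given shape."""
    return [
        f"{ascii_uppercase[r]}{c}"
        for r in range(n_rows)
        for c in range(1, n_cols + 1)
    ]

def randomized_plate_layout(samples, seed, n_rows=8, n_cols=12):
    """Map each (unique) sample ID to a randomly drawn, distinct well."""
    wells = plate_wells(n_rows, n_cols)
    if len(samples) > len(wells):
        raise ValueError("more samples than wells")
    rng = random.Random(seed)
    chosen = rng.sample(wells, k=len(samples))  # sampling without replacement
    return dict(zip(samples, chosen))

# Hypothetical sample IDs: 12 treated and 12 control samples.
samples = [f"{cond}_{i}" for cond in ("treated", "control") for i in range(1, 13)]
layout = randomized_plate_layout(samples, seed=42)

# Archive the seed with the layout so the schedule is auditable.
record = json.dumps({"seed": 42, "layout": layout}, indent=2)
print(record)
```

If edge wells are known to behave differently, a blocked variant that treats edge and interior wells as separate blocks would be the natural next step.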

Resources

Scripts

scripts/randomization.py — seeded allocation schedules: simple_randomization, block_randomization, stratified_block_randomization, cluster_randomization, assign_factorial_runs, arm_balance.
scripts/doe_designs.py — DOE matrices in real units: full_factorial, two_level_factorial, fractional_factorial, plackett_burman, central_composite, box_behnken, latin_hypercube.

References

references/randomization_and_blocking.md — randomization methods, blocking, stratification, controls, blinding, batch/plate layout.
references/factorial_and_doe.md — factorial and fractional designs, resolution and aliasing, screening, and response-surface methodology.
references/design_types.md — completely randomized, randomized block, crossover, repeated-measures, split-plot, Latin-square, cluster, and nested designs; the pseudoreplication problem in depth.
references/sequential_and_adaptive.md — group-sequential designs, alpha spending, interim stopping, and adaptive sample-size re-estimation.

Related skills

statistical-power — required sample size / power for the design you've chosen.
statistical-analysis — running and reporting the analysis after collection.
statsmodels / pymc — fitting the models the design implies.

Key references

Fisher, R. A. (1935). The Design of Experiments.
Montgomery, D. C. (2019). Design and Analysis of Experiments (10th ed.).
Hurlbert, S. H. (1984). Pseudoreplication and the design of ecological field experiments. Ecological Monographs, 54(2), 187–211.
Lazic, S. E. (2016). Experimental Design for Laboratory Biologists.

GitHub repository

K-Dense-AI/claude-scientific-skills

Path: skills/experimental-design

Topics: agent-skills, ai-scientist, bioinformatics, chemoinformatics, claude, claude-skills

FAQ

Frequently asked questions

What is the experimental-design skill?

experimental-design is a Claude Skill by K-Dense-AI. Skills package instructions and resources that Claude loads on demand, so Claude can perform experimental-design-related tasks without extra prompting.

How do I install experimental-design?

Use the install commands on this page: add experimental-design to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does experimental-design belong to?

experimental-design is in the Testing category, tagged testing, design and data.

Is experimental-design free to use?

Yes. experimental-design is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.

Related skills

evaluating-llms-harness

Testing

This Claude skill runs the lm-evaluation-harness to evaluate language models on more than 60 standardized academic tasks such as MMLU and GSM8K. It is designed to let developers compare model quality, track training progress, or report academic results. The tool supports different backends, including HuggingFace and vLLM models.

cloudflare-cron-triggers

Testing

This skill provides comprehensive knowledge for implementing Cloudflare Cron Triggers to schedule Workers using cron expressions. It covers setting up periodic tasks, maintenance jobs, and automated workflows, while addressing common issues such as invalid cron expressions and time zone problems. Developers can use it to configure scheduled handlers, test cron triggers, and integrate with Workflows and Green Compute.

webapp-testing

Testing

This Claude skill provides a Playwright-based toolkit for testing local web applications via Python scripts. It enables frontend verification, UI debugging, screenshot capture, and log viewing, while managing server lifecycles. Use it for browser automation tasks, but run the scripts directly rather than reading their source code to avoid context pollution.

finishing-a-development-branch

Testing

This skill helps developers wrap up their work by verifying that tests pass, then presenting structured integration options. It guides the process of merging, creating PRs, or cleaning up branches once implementation is complete. Use it when your code is ready and tested to systematically finish the development cycle.
