experimental-design
About
This skill helps developers design statistically robust experiments before data collection, covering methods such as randomization, blocking, and factorial designs. It triggers on questions about study setup, group assignment, and avoiding confounders so that results remain interpretable. Use it for planning; for power analysis or for analyzing existing data, use the respective companion skills.
Quick install
Claude Code
Recommended:
npx skills add K-Dense-AI/claude-scientific-skills -a claude-code
As a plugin:
/plugin add https://github.com/K-Dense-AI/claude-scientific-skills
Manual clone:
git clone https://github.com/K-Dense-AI/claude-scientific-skills.git ~/.claude/skills/experimental-design
Documentation
Experimental Design
Overview
The design of a study — how units are assigned to conditions, what is held constant, what is varied, and in what structure — determines what questions the data can answer. No analysis can rescue a confounded or pseudoreplicated design after the fact. This skill is about the decisions made before data collection: picking a design that isolates the effect of interest, randomizing to license causal claims, blocking to remove known nuisance variation, and structuring multi-factor experiments so effects are estimable rather than tangled together.
The three ideas behind almost every good design (Fisher's principles):
- Randomization — assign treatments at random so that confounders, known and unknown, are balanced in expectation. This is what turns a comparison into a causal claim.
- Replication — independent repetition at the right level, so you can estimate variability and your effects aren't artifacts of a single unit. The most common fatal error is pseudoreplication: counting repeated measurements on the same unit as independent replicates.
- Blocking / local control — group similar units (by batch, day, site, litter) and randomize within blocks, removing that nuisance variation from the error term instead of letting it inflate noise.
This skill helps you choose among design types, generate the actual randomization or DOE layout (with reproducible scripts), and avoid the structural mistakes that make data uninterpretable.
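To make the three principles concrete, here is a minimal sketch of a randomized complete block layout using only the Python standard library. The function name `randomized_block_layout` and the block and treatment labels are illustrative assumptions, not part of the bundled scripts.

```python
import random


def randomized_block_layout(blocks, treatments, seed):
    """Assign every treatment once per block, in an independent random order.

    Hypothetical helper for illustration; not the skill's own implementation.
    """
    rng = random.Random(seed)  # seeded so the layout can be regenerated
    layout = []
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # randomization happens separately within each block
        for position, treatment in enumerate(order, start=1):
            layout.append(
                {"block": block, "position": position, "treatment": treatment}
            )
    return layout


layout = randomized_block_layout(
    blocks=["day1", "day2", "day3"],  # known nuisance factor to block on
    treatments=["A", "B", "C"],
    seed=42,
)
for row in layout:
    print(row)
```

Each treatment appears once in every block, so the blocks double as replicates and day-to-day variation can be separated from the treatment comparison.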
When to Use This Skill
- Planning any comparative experiment or trial and deciding how to assign units
- Randomizing subjects/samples to arms (simple, blocked, stratified, or cluster)
- Removing nuisance variation by blocking or stratification
- Designing multi-factor experiments: full or fractional factorial, screening designs
- Optimizing a response over continuous factors (response-surface designs)
- Within-subject / repeated-measures, crossover, split-plot, or Latin-square designs
- Cluster- or group-randomized designs (sites, clinics, classrooms, litters)
- Deciding the number and level of replicates and avoiding pseudoreplication
- Sequential, group-sequential, or adaptive designs with interim analyses
- Laying out plates/batches and randomizing run order to defeat drift
Installation
uv pip install "numpy>=1.26" "pandas>=2.0" pyDOE3
pyDOE3 is the maintained successor to pyDOE/pyDOE2 and supplies factorial,
fractional-factorial, Plackett-Burman, central-composite, Box-Behnken, and
Latin-hypercube generators. The bundled scripts wrap it to return designs in real
factor units with named columns and randomized run order.
Choosing a design
Start from the question and the structure of your units, not from a favorite design.
What are you trying to learn?
│
├─ Compare a few predefined conditions (A vs B vs C)?
│   ├─ Units independent, possibly with a known nuisance factor (day, batch, site)?
│   │     → Completely randomized (no nuisance) or RANDOMIZED BLOCK design.
│   ├─ Each unit can receive every condition in sequence (washout possible)?
│   │     → CROSSOVER / repeated-measures design (more power, watch carry-over).
│   └─ You can only randomize groups, not individuals (schools, clinics)?
│         → CLUSTER-randomized design (analyze at the cluster level; see pseudoreplication).
│
├─ Screen MANY factors (5+) to find the few that matter?
│     → FRACTIONAL FACTORIAL or PLACKETT-BURMAN screening design.
│
├─ Quantify main effects AND interactions among a handful of factors?
│     → FULL 2^k FACTORIAL design.
│
├─ Find the settings that OPTIMIZE a response (curvature matters)?
│     → RESPONSE-SURFACE design: central composite or Box-Behnken.
│
└─ Explore a simulation/computer model over a continuous space?
      → SPACE-FILLING design: Latin hypercube.
Detailed guidance per branch:
- Randomization, blocking, stratification, controls → references/randomization_and_blocking.md
- Factorial, fractional-factorial, screening, response-surface, DOE concepts (aliasing, resolution) → references/factorial_and_doe.md
- Crossover, repeated-measures, split-plot, Latin-square, cluster, nested designs → references/design_types.md
- Sequential, group-sequential, and adaptive designs (interim analyses) → references/sequential_and_adaptive.md
Generating the design
Two scripts produce ready-to-use, reproducible layouts. Run them from the skill's
scripts/ directory or add it to sys.path. Everything is seeded so the exact
schedule can be archived and regenerated — a requirement for trial registration
and good lab practice.
Randomization / allocation schedules — scripts/randomization.py
from randomization import (
    simple_randomization, block_randomization,
    stratified_block_randomization, cluster_randomization,
    assign_factorial_runs, arm_balance,
)
# Permuted blocks keep the arms balanced throughout enrollment (use for n < ~100
# or sequential intake — simple randomization can drift out of balance with small n)
sched = block_randomization(n=60, arms=["treatment", "control"], seed=42)
# Balance a prognostic variable across arms by randomizing within each stratum
sched = stratified_block_randomization({"siteA": 30, "siteB": 30},
                                       arms=["drug", "placebo"], ratio=(2, 1), seed=42)
# Randomize whole clusters, not individuals (the cluster is the unit)
sched = cluster_randomization(["clinic1", "clinic2", "clinic3", "clinic4"], seed=42)
arm_balance(sched) # sanity-check the counts per arm
sched.to_csv("allocation_schedule.csv", index=False)
Choosing among them: simple is fine for large n but can produce imbalance with
small n; block guarantees balance throughout; stratified block additionally
balances a known prognostic factor; cluster is mandatory when the intervention
is delivered at a group level. See references/randomization_and_blocking.md.
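The difference between simple and permuted-block randomization can be seen in a few lines. The sketch below uses only the standard library; `simple_alloc` and `permuted_block_alloc` are hypothetical stand-ins written for illustration and make no claim about how scripts/randomization.py is implemented.

```python
import random
from collections import Counter


def simple_alloc(n, arms, seed):
    """Independent random draw per unit: balanced only in expectation."""
    rng = random.Random(seed)
    return [rng.choice(arms) for _ in range(n)]


def permuted_block_alloc(n, arms, seed, block_size=4):
    """Shuffle balanced blocks so arms are exactly balanced after each full block."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n]  # a final partial block may be slightly unbalanced


arms = ["treatment", "control"]
simple = simple_alloc(20, arms, seed=1)
blocked = permuted_block_alloc(20, arms, seed=1)
print("simple :", Counter(simple))
print("blocked:", Counter(blocked))
```

With n = 20 and a block size of 4, the blocked schedule ends at exactly 10 per arm and never drifts by more than 2 during enrollment, whereas the simple schedule carries no such guarantee.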
DOE matrices — scripts/doe_designs.py
from doe_designs import (
    full_factorial, two_level_factorial, fractional_factorial,
    plackett_burman, central_composite, box_behnken, latin_hypercube,
)
# Factors as real-world (low, high) ranges -> design comes back in real units
factors = {"temp_C": (20, 60), "conc_mM": (1, 10), "pH": (6, 8)}
# Full 2^3: all main effects + all interactions (8 runs), run order randomized
design = two_level_factorial(factors, seed=42)
# Screen 7 factors cheaply (main effects only)
many = {f"factor_{i}": (0, 1) for i in range(7)}
design = plackett_burman(many, seed=42)
# Optimize over 2 factors with curvature (response-surface)
design = central_composite({"temp_C": (20, 60), "conc_mM": (1, 10)}, seed=42)
design.to_csv("experimental_runs.csv", index=False)
Run order is randomized by default so factors aren't confounded with time/drift
(machine warm-up, reagent aging). See references/factorial_and_doe.md for picking
generators, reading the alias structure, and choosing resolution.
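Aliasing is easiest to see by constructing a small fraction by hand. The sketch below builds the 2^(3-1) half fraction with generator C = AB in coded units, using only the standard library. It does not call pyDOE3 or the bundled scripts, and the variable names are illustrative. Because the C column is identical to the elementwise product of A and B, the main effect of C cannot be separated from the AB interaction in this resolution III design.

```python
import random
from itertools import product

# Full 2^2 factorial in A and B (coded -1/+1); C is defined by the generator C = AB.
runs = [{"A": a, "B": b, "C": a * b} for a, b in product((-1, 1), repeat=2)]

# The C column equals the AB interaction column in every run: they are aliased.
ab_interaction = [r["A"] * r["B"] for r in runs]
c_column = [r["C"] for r in runs]
print("C aliased with AB:", c_column == ab_interaction)

# Likewise A is aliased with BC (defining relation I = ABC).
a_column = [r["A"] for r in runs]
bc_interaction = [r["B"] * r["C"] for r in runs]
print("A aliased with BC:", a_column == bc_interaction)

# Randomize run order with a fixed seed so the sequence is reproducible.
rng = random.Random(42)
run_order = runs[:]
rng.shuffle(run_order)
for i, r in enumerate(run_order, start=1):
    print(i, r)
```

The half fraction needs 4 runs instead of 8, and the price is exactly this confounding: an apparent effect of C could equally be an AB interaction unless that interaction is assumed negligible.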
The mistakes that ruin studies
These are structural — they can't be fixed in analysis, only in design.
- Pseudoreplication. Treating repeated measurements of one unit as independent
replicates: 3 mice with 100 cells each is n = 3 (mice), not n = 300 (cells), for
any treatment applied to the mouse. The replicate must be at the level the
treatment is randomized. This single error invalidates a large share of published
experiments. Randomize and replicate at the right level; analyze with the nesting
respected (mixed model). See references/design_types.md.
- Confounding by a nuisance variable. Running all treatment samples on Monday and all controls on Tuesday confounds treatment with day. Randomize across, or block on, every nuisance factor you can name (batch, day, plate, technician, instrument, position).
- No or broken randomization. Convenience assignment (first-come → treatment) lets confounders sneak in. Use a seeded schedule and follow it.
- No proper control. Without a concurrent control (and, where relevant, a vehicle/sham and blinding), you can't separate the treatment effect from time, placebo, or handling effects.
- Batch effects mistaken for biology. In omics especially, process samples in a randomized/blocked order across batches; never let batch align with the condition.
- Edge/position effects on plates. Evaporation and thermal gradients make plate edges differ. Randomize or block sample positions; don't put all controls in column 1.
- Aliasing ignored in fractional designs. A low-resolution fractional factorial confounds main effects with interactions; know your alias structure before concluding a factor "has no effect."
- Optimizing without curvature. A two-level factorial can't detect a curved response; you'll miss an interior optimum. Use a response-surface design.
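For the pseudoreplication case, the practical consequence is that observations should be collapsed (or modeled) at the level where treatment was randomized. Below is a minimal sketch using only the standard library. The data values and the names `cells` and `per_mouse_means` are invented for illustration.

```python
from statistics import mean

# Invented example: treatment is applied per mouse; several cells are measured per mouse.
cells = [
    {"mouse": "m1", "arm": "treated", "value": 5.1},
    {"mouse": "m1", "arm": "treated", "value": 4.9},
    {"mouse": "m2", "arm": "treated", "value": 6.2},
    {"mouse": "m2", "arm": "treated", "value": 6.0},
    {"mouse": "m3", "arm": "control", "value": 4.0},
    {"mouse": "m3", "arm": "control", "value": 4.2},
    {"mouse": "m4", "arm": "control", "value": 3.7},
    {"mouse": "m4", "arm": "control", "value": 3.9},
]

# Collapse to one value per experimental unit (the mouse), keeping its arm.
by_mouse = {}
for row in cells:
    by_mouse.setdefault((row["mouse"], row["arm"]), []).append(row["value"])
per_mouse_means = {key: mean(values) for key, values in by_mouse.items()}

# The sample size for the treatment comparison is the number of mice, not cells.
n_per_arm = {}
for (_, arm) in per_mouse_means:
    n_per_arm[arm] = n_per_arm.get(arm, 0) + 1

print("observations:", len(cells))
print("independent replicates per arm:", n_per_arm)
```

Averaging per unit is the simplest way to respect the nesting. A mixed model with a random effect for the unit uses the within-unit measurements more efficiently, but the number of independent replicates is the same either way.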
Workflow
- State the question, the unit, and the response. What is randomized? What is measured? At what level is a true independent replicate? This determines everything.
- List nuisance factors (batch, day, site, operator, position) — plan to block, stratify, or randomize across each.
- Pick the design using the decision tree and reference files.
- Decide replication at the correct level (and get n from the statistical-power skill for the chosen design).
- Generate the layout with randomization.py/doe_designs.py, seeded.
- Randomize run/processing order and plate/batch positions.
- Document the design, seed, and schedule (pre-register if possible) so the analysis is confirmatory and the layout is auditable.
- Match the analysis to the design — blocks, strata, clusters, and nesting must appear in the model (hand off to statistical-analysis / statsmodels).
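The documentation step in the workflow above can be as simple as storing the seed alongside a checksum of the generated schedule. The following is one possible sketch using only the standard library. The manifest fields and the helpers `make_schedule` and `schedule_to_csv_text` are assumptions made for illustration, not conventions defined by this skill.

```python
import csv
import hashlib
import io
import json
import random


def make_schedule(n, arms, seed):
    """Hypothetical stand-in for a seeded schedule generator (equal allocation)."""
    rng = random.Random(seed)
    pool = [arms[i % len(arms)] for i in range(n)]
    rng.shuffle(pool)
    return [{"unit": i + 1, "arm": arm} for i, arm in enumerate(pool)]


def schedule_to_csv_text(schedule):
    """Serialize the schedule deterministically so its checksum is stable."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["unit", "arm"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(schedule)
    return buf.getvalue()


seed = 42
arms = ["treatment", "control"]
csv_text = schedule_to_csv_text(make_schedule(12, arms, seed))
manifest = {
    "seed": seed,
    "n_units": 12,
    "arms": arms,
    "schedule_sha256": hashlib.sha256(csv_text.encode("utf-8")).hexdigest(),
}
print(json.dumps(manifest, indent=2))

# Regenerating from the same seed should reproduce the same checksum.
regenerated = schedule_to_csv_text(make_schedule(12, arms, seed))
assert hashlib.sha256(regenerated.encode("utf-8")).hexdigest() == manifest["schedule_sha256"]
```

Archiving the manifest with the protocol lets a reviewer confirm that the schedule actually followed is the one that was generated. Recording the versions of the tools used is also worthwhile, since random-number output is not guaranteed to be identical across library versions.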
Resources
Scripts
- scripts/randomization.py: seeded allocation schedules: simple_randomization, block_randomization, stratified_block_randomization, cluster_randomization, assign_factorial_runs, arm_balance.
- scripts/doe_designs.py: DOE matrices in real units: full_factorial, two_level_factorial, fractional_factorial, plackett_burman, central_composite, box_behnken, latin_hypercube.
References
- references/randomization_and_blocking.md: randomization methods, blocking, stratification, controls, blinding, batch/plate layout.
- references/factorial_and_doe.md: factorial and fractional designs, resolution and aliasing, screening, and response-surface methodology.
- references/design_types.md: completely randomized, randomized block, crossover, repeated-measures, split-plot, Latin-square, cluster, and nested designs; the pseudoreplication problem in depth.
- references/sequential_and_adaptive.md: group-sequential designs, alpha spending, interim stopping, and adaptive sample-size re-estimation.
Related skills
- statistical-power — required sample size / power for the design you've chosen.
- statistical-analysis — running and reporting the analysis after collection.
- statsmodels / pymc — fitting the models the design implies.
Key references
- Fisher, R. A. (1935). The Design of Experiments.
- Montgomery, D. C. (2019). Design and Analysis of Experiments (10th ed.).
- Hurlbert, S. H. (1984). Pseudoreplication and the design of ecological field experiments. Ecological Monographs, 54(2), 187–211.
- Lazic, S. E. (2016). Experimental Design for Laboratory Biologists.
Frequently asked questions
What is the experimental-design skill?
experimental-design is a Claude Skill by K-Dense-AI. Skills package instructions and resources that Claude loads on demand, so Claude can perform experimental-design-related tasks without extra prompting.
How do I install experimental-design?
Use the install commands on this page: add experimental-design to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.
What category does experimental-design belong to?
experimental-design is in the Testing category, tagged testing, design and data.
Is experimental-design free to use?
Yes. experimental-design is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.
