Back to Skills

setup-gxp-r-project

pjt222
Updated 2 days ago
7 views
17
2
17
View on GitHub
Metaworddesign

About

This skill creates a compliant R project structure for regulated environments like pharmaceuticals, implementing requirements for validated systems, documentation, and electronic records under 21 CFR Part 11. It's used when starting clinical trial analyses or any R project needing GxP compliance. The setup covers qualification, change control, and audit trails for regulatory submissions.

Quick Install

Claude Code

Recommended
Primary
npx skills add pjt222/agent-almanac -a claude-code
Plugin CommandAlternative
/plugin add https://github.com/pjt222/agent-almanac
Git CloneAlternative
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/setup-gxp-r-project

Copy and paste this command in Claude Code to install this skill

Documentation

Set Up GxP R Project

Create an R project structure that meets GxP regulatory requirements for validated computing.

When to Use

  • Starting an R analysis project in a regulated environment (pharma, biotech, medical devices)
  • Setting up R for clinical trial analysis
  • Creating a validated computing environment for regulatory submissions
  • Implementing 21 CFR Part 11 or EU Annex 11 requirements

Inputs

  • Required: Project scope and regulatory framework (FDA, EMA, or both)
  • Required: R version and package versions to validate
  • Required: Validation strategy (risk-based approach)
  • Optional: Existing SOPs for computerized systems
  • Optional: Quality management system integration requirements

Procedure

Step 1: Create Validated Project Structure

gxp-project/
├── R/                          # Analysis scripts
│   ├── 01_data_import.R
│   ├── 02_data_processing.R
│   └── 03_analysis.R
├── validation/                 # Validation documentation
│   ├── validation_plan.md      # VP: scope, strategy, roles
│   ├── risk_assessment.md      # Risk categorization
│   ├── iq/                     # Installation Qualification
│   │   ├── iq_protocol.md
│   │   └── iq_report.md
│   ├── oq/                     # Operational Qualification
│   │   ├── oq_protocol.md
│   │   └── oq_report.md
│   ├── pq/                     # Performance Qualification
│   │   ├── pq_protocol.md
│   │   └── pq_report.md
│   └── traceability_matrix.md  # Requirements to tests mapping
├── tests/                      # Automated test suite
│   ├── testthat.R
│   └── testthat/
│       ├── test-data_import.R
│       └── test-analysis.R
├── data/                       # Input data (controlled)
│   ├── raw/                    # Immutable raw data
│   └── derived/                # Processed datasets
├── output/                     # Analysis outputs
├── docs/                       # Supporting documentation
│   ├── sop_references.md       # Links to relevant SOPs
│   └── change_log.md           # Manual change documentation
├── renv.lock                   # Locked dependencies
├── DESCRIPTION                 # Project metadata
├── .Rprofile                   # Session configuration
└── CLAUDE.md                   # AI assistant instructions

Got: The complete directory structure exists with R/, validation/ (including iq/, oq/, pq/ subdirectories), tests/testthat/, data/raw/, data/derived/, output/, and docs/ directories.

If fail: With missing directories, create them with mkdir -p. Verify you are in the correct project root. For existing projects, create only the missing directories rather than overwriting existing structure.

Step 2: Create Validation Plan

Create validation/validation_plan.md:

# Validation Plan

## 1. Purpose
This plan defines the validation strategy for [Project Name] using R [version].

## 2. Scope
- R version: 4.5.0
- Packages: [list with versions]
- Analysis: [description]
- Regulatory framework: 21 CFR Part 11 / EU Annex 11

## 3. Risk Assessment Approach
Using GAMP 5 risk-based categories:
- Category 3: Non-configured products (R base)
- Category 4: Configured products (R packages with default settings)
- Category 5: Custom applications (custom R scripts)

## 4. Validation Activities
| Activity | Category 3 | Category 4 | Category 5 |
|----------|-----------|-----------|-----------|
| IQ | Required | Required | Required |
| OQ | Reduced | Standard | Enhanced |
| PQ | N/A | Standard | Enhanced |

## 5. Roles and Responsibilities
- Validation Lead: [Name]
- Developer: [Name]
- QA Reviewer: [Name]
- Approver: [Name]

## 6. Acceptance Criteria
All tests must pass with documented evidence.

Got: validation/validation_plan.md is complete with scope, GAMP 5 risk categories, validation activities matrix, roles and responsibilities, and acceptance criteria. The plan references the specific R version and regulatory framework.

If fail: With unclear regulatory framework, consult the organization's QA department for applicable SOPs. Do not proceed with validation activities until the plan is reviewed and approved.

Step 3: Lock Dependencies with renv

# Initialize renv with exact versions
renv::init()

# Install specific validated versions
renv::install("[email protected]")
renv::install("[email protected]")

# Snapshot
renv::snapshot()

The renv.lock file serves as the controlled package inventory.

Got: renv.lock exists with exact version numbers for all required packages. renv::status() reports no issues. Every package version is pinned (e.g., [email protected]), not floating.

If fail: If renv::install() fails for a specific version, check that the version exists on CRAN archives. Use renv::install("package@version", repos = "https://packagemanager.posit.co/cran/latest") for archived versions.

Step 4: Implement Version Control

git init
git add .
git commit -m "Initial validated project structure"

# Use signed commits for traceability
git config user.signingkey YOUR_GPG_KEY
git config commit.gpgsign true

Got: The project is under git version control with signed commits enabled. The initial commit contains the validated project structure and renv.lock.

If fail: With GPG signing failing, verify the GPG key is configured with gpg --list-secret-keys. For environments without GPG, document the deviation and use unsigned commits with manual audit trail entries in docs/change_log.md.

Step 5: Create IQ Protocol

validation/iq/iq_protocol.md:

# Installation Qualification Protocol

## Objective
Verify that R and required packages are correctly installed.

## Test Cases

### IQ-001: R Version Verification
- **Requirement**: R 4.5.0 installed
- **Procedure**: Execute `R.version.string`
- **Expected:** "R version 4.5.0 (date)"
- **Result**: [ PASS / FAIL ]

### IQ-002: Package Installation Verification
- **Requirement**: All packages in renv.lock installed
- **Procedure**: Execute `renv::status()`
- **Expected:** "No issues found"
- **Result**: [ PASS / FAIL ]

### IQ-003: Package Version Verification
- **Procedure**: Execute `installed.packages()[, c("Package", "Version")]`
- **Expected:** Versions match renv.lock exactly
- **Result**: [ PASS / FAIL ]

Got: validation/iq/iq_protocol.md contains test cases for R version verification, package installation verification, and package version verification, each with clear expected results and pass/fail fields.

If fail: If the IQ protocol template does not match organizational SOP requirements, adapt the format while retaining the required fields (requirement, procedure, expected result, actual result, pass/fail). Consult QA for approved templates.

Step 6: Write Automated OQ/PQ Tests

# tests/testthat/test-analysis.R
test_that("primary analysis produces validated results", {
  # Known input -> known output (double programming validation)
  test_data <- read.csv(test_path("fixtures", "validation_dataset.csv"))

  result <- primary_analysis(test_data)

  # Compare against independently calculated expected values
  expect_equal(result$estimate, 2.345, tolerance = 1e-3)
  expect_equal(result$p_value, 0.012, tolerance = 1e-3)
  expect_equal(result$ci_lower, 1.234, tolerance = 1e-3)
})

Got: Automated test files exist in tests/testthat/ covering OQ (operational verification of each function) and PQ (end-to-end validation against independently calculated reference values). Tests use explicit numeric tolerances.

If fail: With reference values not yet available from independent calculation (e.g., SAS), create placeholder tests with skip("Awaiting independent reference values") and document in the traceability matrix.

Step 7: Create Traceability Matrix

# Traceability Matrix

| Req ID | Requirement | Test ID | Test Description | Status |
|--------|-------------|---------|------------------|--------|
| REQ-001 | Import CSV data correctly | OQ-001 | Verify data dimensions and types | PASS |
| REQ-002 | Calculate primary endpoint | PQ-001 | Compare against reference results | PASS |
| REQ-003 | Generate report output | PQ-002 | Verify report contains all sections | PASS |

Got: validation/traceability_matrix.md links every requirement to at least one test case, and every test case is linked to a requirement. No orphaned requirements or tests.

If fail: With untested requirements, create test cases for them or document a risk-based justification for exclusion. With tests lacking a linked requirement, either link them to an existing requirement or remove them as out-of-scope.

Validation

  • Project structure follows documented template
  • renv.lock contains all dependencies with exact versions
  • Validation plan is complete and approved
  • IQ protocol executes successfully
  • OQ test cases cover all configured functionality
  • PQ tests validate against independently computed results
  • Traceability matrix links requirements to tests
  • Change control process is documented

Pitfalls

  • Using install.packages() without version pinning: Always use renv with locked versions
  • Missing audit trail: Every change must be documented. Use git signed commits.
  • Over-validating: Apply risk-based approach. Not every CRAN package needs Category 5 validation.
  • Forgetting system-level qualification: The OS and R installation need IQ too
  • No independent verification: PQ should compare against results computed independently (SAS, manual calculation)

Related Skills

  • write-validation-documentation - detailed validation document creation
  • implement-audit-trail - electronic records and audit trails
  • validate-statistical-output - double programming and output validation
  • manage-renv-dependencies - dependency locking for validated environments

GitHub Repository

pjt222/agent-almanac
Path: i18n/caveman-lite/skills/setup-gxp-r-project
0
agentsagentskillsai-assisted-developmentclaude-codeskillsteams

Related Skills

content-collections

Meta

This skill provides a production-tested setup for Content Collections, a TypeScript-first tool that transforms Markdown/MDX files into type-safe data collections with Zod validation. Use it when building blogs, documentation sites, or content-heavy Vite + React applications to ensure type safety and automatic content validation. It covers everything from Vite plugin configuration and MDX compilation to deployment optimization and schema validation.

View skill

polymarket

Meta

This skill enables developers to build applications with the Polymarket prediction markets platform, including API integration for trading and market data. It also provides real-time data streaming via WebSocket to monitor live trades and market activity. Use it for implementing trading strategies or creating tools that process live market updates.

View skill

creating-opencode-plugins

Meta

This skill helps developers create OpenCode plugins that hook into 25+ event types like commands, files, and LSP operations. It provides the plugin structure, event API specifications, and implementation patterns for JavaScript/TypeScript modules. Use it when you need to intercept, monitor, or extend the OpenCode AI assistant's lifecycle with custom event-driven logic.

View skill

sglang

Meta

SGLang is a high-performance LLM serving framework that specializes in fast, structured generation for JSON, regex, and agentic workflows using its RadixAttention prefix caching. It delivers significantly faster inference, especially for tasks with repeated prefixes, making it ideal for complex, structured outputs and multi-turn conversations. Choose SGLang over alternatives like vLLM when you need constrained decoding or are building applications with extensive prefix sharing.

View skill