implement-audit-trail
About
This skill helps developers implement audit functionality in R projects for regulated environments such as healthcare and pharmaceuticals. It provides tools for logging, provenance tracking, electronic signatures, and data integrity checks in order to meet the requirements of 21 CFR Part 11. Use it when you need tamper-evident analysis logs, detailed tracking of who did what and when, or electronic records compliance for regulatory submissions.
Quick install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
Plugin command: /plugin add https://github.com/pjt222/agent-almanac
Manual clone: git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/implement-audit-trail
Documentation
Implement Audit Trail
Add audit trail capabilities to R projects for regulatory compliance.
When to Use
- R analysis needs electronic records compliance (21 CFR Part 11)
- Track who did what, when, why in analysis
- Implement data provenance tracking
- Create tamper-evident analysis logs
Inputs
- Required: R project with data processing or analysis scripts
- Required: Regulatory requirements (which audit trail elements mandatory)
- Optional: Existing logging infrastructure
- Optional: Electronic signature requirements
Steps
Step 1: Set Up Structured Logging
Create R/audit_log.R:
#' Initialize audit log for a session
#'
#' @param log_dir Directory for audit log files
#' @param analyst Name of the analyst
#' @return Path to the created log file
init_audit_log <- function(log_dir = "audit_logs", analyst = Sys.info()[["user"]]) {
  # [[ ]] returns the user name as a plain, unnamed string
  dir.create(log_dir, showWarnings = FALSE, recursive = TRUE)

  log_file <- file.path(log_dir, sprintf(
    "audit_%s_%s.jsonl",
    format(Sys.time(), "%Y%m%d_%H%M%S"),
    analyst
  ))

  entry <- list(
    timestamp = format(Sys.time(), "%Y-%m-%dT%H:%M:%S%z"),
    event = "SESSION_START",
    analyst = analyst,
    r_version = R.version.string,
    platform = .Platform$OS.type,
    working_directory = getwd(),
    session_id = paste0(Sys.getpid(), "-", format(Sys.time(), "%Y%m%d%H%M%S"))
  )

  # One JSON object per line (JSONL), always appended
  write(jsonlite::toJSON(entry, auto_unbox = TRUE), log_file, append = TRUE)

  # Later calls to log_audit_event() find the log through these options
  options(audit_log_file = log_file, audit_session_id = entry$session_id)
  log_file
}
#' Log an audit event
#'
#' @param event Event type (DATA_IMPORT, TRANSFORM, ANALYSIS, EXPORT, etc.)
#' @param description Human-readable description
#' @param details Named list of additional details
log_audit_event <- function(event, description, details = list()) {
  log_file <- getOption("audit_log_file")
  if (is.null(log_file)) stop("Audit log not initialized. Call init_audit_log() first.")

  entry <- list(
    timestamp = format(Sys.time(), "%Y-%m-%dT%H:%M:%S%z"),
    event = event,
    description = description,
    session_id = getOption("audit_session_id"),
    details = details
  )

  write(jsonlite::toJSON(entry, auto_unbox = TRUE), log_file, append = TRUE)
}
Got: R/audit_log.R created with init_audit_log() + log_audit_event() functions. Calling init_audit_log() creates audit_logs/ directory + timestamped JSONL file. Each log entry = single JSON line with timestamp, event, session_id fields. Analyst recorded once, in SESSION_START entry; later entries link back to it via session_id.
If fail: jsonlite::toJSON() fails? Ensure jsonlite package installed. Log directory can't be created? Check file system permissions. Timestamps lack timezone? Verify %z supported on platform.
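To confirm the log is machine-readable, it can be read back line by line. The sketch below is illustrative only: read_audit_log() and the two sample entries are hypothetical, written in the same shape the functions above produce.

```r
library(jsonlite)

# Hypothetical sample log with two entries shaped like the ones written above.
log_file <- tempfile(fileext = ".jsonl")
writeLines(c(
  '{"timestamp":"2024-01-15T09:30:00+0000","event":"SESSION_START","analyst":"jdoe","session_id":"1234-20240115093000"}',
  '{"timestamp":"2024-01-15T09:31:12+0000","event":"DATA_IMPORT","description":"Loaded raw data","session_id":"1234-20240115093000","details":{"rows":120}}'
), log_file)

# Hypothetical helper: parse every non-empty line as its own JSON object.
read_audit_log <- function(path) {
  lines <- readLines(path, warn = FALSE)
  lines <- lines[nzchar(lines)]
  lapply(lines, jsonlite::fromJSON, simplifyVector = FALSE)
}

entries <- read_audit_log(log_file)

# Pull out the event type of each entry, in file order.
events <- vapply(entries, function(e) e[["event"]], character(1))
print(events)
```

Parsing each line separately keeps one damaged line from hiding the rest of the log.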
Step 2: Add Data Integrity Checks
#' Compute and log data hash for integrity verification
#'
#' @param data Data frame to hash
#' @param label Descriptive label for the dataset
#' @return SHA-256 hash string
hash_data <- function(data, label = "dataset") {
  hash_value <- digest::digest(data, algo = "sha256")

  log_audit_event("DATA_HASH", sprintf("Hash computed for %s", label), list(
    hash_algorithm = "sha256",
    hash_value = hash_value,
    nrow = nrow(data),
    ncol = ncol(data),
    columns = names(data)
  ))

  hash_value
}

#' Verify data integrity against a recorded hash
#'
#' @param data Data frame to verify
#' @param expected_hash Previously recorded hash
#' @return Logical indicating whether data matches
verify_data_integrity <- function(data, expected_hash) {
  current_hash <- digest::digest(data, algo = "sha256")
  match <- identical(current_hash, expected_hash)

  log_audit_event("DATA_VERIFY",
    sprintf("Data integrity check: %s", ifelse(match, "PASS", "FAIL")),
    list(expected = expected_hash, actual = current_hash))

  if (!match) warning("Data integrity check FAILED")
  match
}
Got: hash_data() returns SHA-256 hash string + logs DATA_HASH event. verify_data_integrity() compares current data vs stored hash + logs DATA_VERIFY event with PASS or FAIL status.
If fail: digest::digest() not found? Install digest package. Hashes don't match for identical data? Check column order + data types consistent between hashing + verification.
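The column-order and data-type caveat can be seen directly. A minimal sketch, assuming the digest package is installed: hash_canonical() is a hypothetical helper, not part of the functions above, and it hashes only the sorted columns, so row names and other attributes are ignored.

```r
library(digest)

df_a <- data.frame(id = 1:3, value = c(0.5, 1.5, 2.5))
df_b <- df_a[, c("value", "id")]  # same values, different column order

# The serialized objects differ, so the hashes differ.
hash_a <- digest::digest(df_a, algo = "sha256")
hash_b <- digest::digest(df_b, algo = "sha256")
same_raw <- identical(hash_a, hash_b)

# Integer vs double storage also changes the hash, even for equal values.
same_type <- identical(
  digest::digest(c(1L, 2L, 3L), algo = "sha256"),
  digest::digest(c(1, 2, 3), algo = "sha256")
)

# Hypothetical helper: hash the columns in a fixed (alphabetical) order.
hash_canonical <- function(data) {
  cols <- as.list(data)[sort(names(data))]
  digest::digest(cols, algo = "sha256")
}
same_canonical <- identical(hash_canonical(df_a), hash_canonical(df_b))

print(c(raw = same_raw, type = same_type, canonical = same_canonical))
```

Whether to canonicalize before hashing is a design choice: it tolerates harmless reordering, but then a reordering is no longer recorded as a change.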
Step 3: Track Data Transformations
#' Wrap a data transformation with audit logging
#'
#' @param data Input data frame
#' @param transform_fn Function to apply
#' @param description Description of the transformation
#' @return Transformed data frame
audited_transform <- function(data, transform_fn, description) {
  input_hash <- digest::digest(data, algo = "sha256")
  input_dim <- dim(data)

  result <- transform_fn(data)

  output_hash <- digest::digest(result, algo = "sha256")
  output_dim <- dim(result)

  log_audit_event("DATA_TRANSFORM", description, list(
    input_hash = input_hash,
    input_rows = input_dim[1],
    input_cols = input_dim[2],
    output_hash = output_hash,
    output_rows = output_dim[1],
    output_cols = output_dim[2]
  ))

  result
}
Got: audited_transform() wraps any transformation function, logs input dimensions + hash, output dimensions + hash, transformation description as DATA_TRANSFORM event.
If fail: Transform function errors? Audit event not logged. Wrap transform in tryCatch() to log both successes + failures. Ensure transform function accepts + returns data frame.
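The tryCatch() suggestion could look roughly like this. It is a sketch, not the skill's own implementation: audited_transform_safe() and the in-memory record_event() logger are hypothetical stand-ins so the example runs without an initialized audit log. In a real script, log_fn would be log_audit_event().

```r
# Hypothetical in-memory logger standing in for log_audit_event().
audit_store <- new.env()
audit_store$events <- list()
record_event <- function(event, description, details = list()) {
  audit_store$events[[length(audit_store$events) + 1]] <- list(
    event = event, description = description, details = details
  )
  invisible(NULL)
}

# Hypothetical variant of audited_transform() that logs failures too.
audited_transform_safe <- function(data, transform_fn, description,
                                   log_fn = record_event) {
  tryCatch({
    result <- transform_fn(data)
    log_fn("DATA_TRANSFORM", description, list(
      status = "SUCCESS",
      input_rows = nrow(data),
      output_rows = nrow(result)
    ))
    result
  }, error = function(e) {
    log_fn("DATA_TRANSFORM_FAILED", description, list(
      status = "FAILURE",
      error_message = conditionMessage(e)
    ))
    stop(e)  # re-raise so the script does not continue with bad data
  })
}

# A transform that works, then one that fails.
ok <- audited_transform_safe(
  data.frame(x = 1:3),
  function(d) d[d$x > 1, , drop = FALSE],
  "Keep rows with x > 1"
)
failed <- try(
  audited_transform_safe(
    data.frame(x = 1:3),
    function(d) stop("column missing"),
    "Broken step"
  ),
  silent = TRUE
)
```

Re-raising the error after logging keeps the failure visible to the calling script while still leaving a record in the trail.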
Step 4: Log Session Environment
#' Log complete session information for reproducibility
log_session_info <- function() {
  si <- sessionInfo()

  log_audit_event("SESSION_INFO", "Complete session environment recorded", list(
    r_version = si$R.version$version.string,
    platform = si$platform,
    locale = Sys.getlocale(),
    base_packages = si$basePkgs,
    attached_packages = sapply(si$otherPkgs, function(p) paste(p$Package, p$Version)),
    renv_lockfile_hash = if (file.exists("renv.lock")) {
      digest::digest(file = "renv.lock", algo = "sha256")
    } else NA
  ))
}
Got: SESSION_INFO event logged with R version, platform, locale, attached packages + versions, renv lockfile hash (if applicable).
If fail: sessionInfo() returns incomplete package info? Ensure all packages loaded via library() before calling log_session_info(). renv lockfile hash = NA if project doesn't use renv.
Step 5: Implement in Analysis Scripts
# 01_analysis.R
library(jsonlite)
library(digest)

# Load the audit helpers (assumes the functions from Steps 1-4 live in R/audit_log.R)
source("R/audit_log.R")

# Start audit trail
log_file <- init_audit_log(analyst = "Philipp Thoss")

# Import data with audit
raw_data <- read.csv("data/raw/study_data.csv")
raw_hash <- hash_data(raw_data, "raw study data")

# Transform with audit
clean_data <- audited_transform(raw_data, function(d) {
  d |>
    dplyr::filter(!is.na(primary_endpoint)) |>
    dplyr::mutate(bmi = weight / (height/100)^2)
}, "Remove missing endpoints, calculate BMI")

# Run analysis
log_audit_event("ANALYSIS_START", "Primary efficacy analysis")
model <- lm(primary_endpoint ~ treatment + age + sex, data = clean_data)
log_audit_event("ANALYSIS_COMPLETE", "Primary efficacy analysis", list(
  model_class = class(model),
  formula = deparse(formula(model)),
  n_observations = nobs(model)
))

# Log session
log_session_info()
Got: Analysis scripts init audit log at start, log each data import, transformation, analysis step, record session info at end. JSONL log file captures complete provenance chain.
If fail: init_audit_log() missing? Ensure R/audit_log.R sourced or package loaded. Events missing from log? Verify log_audit_event() called after every significant operation.
Step 6: Git-Based Change Control
Complement application-level audit trail with git:
# Use signed commits for non-repudiation
git config commit.gpgsign true

# Descriptive commit messages referencing change control
git commit -m "CHG-042: Add BMI calculation to data processing

Per change request CHG-042, approved by [Name] on [Date].
Validation impact assessment: Low risk - additional derived variable."
Got: Git commits signed (GPG) + use descriptive messages referencing change control IDs. Combination of application-level JSONL audit trail + git history provides complete change control record.
If fail: GPG signing fails? Configure signing key with git config --global user.signingkey KEY_ID. Key not set up? Follow gpg --gen-key to create one.
Checks
- Audit log captures all required events (start, data access, transforms, analysis, export)
- Timestamps use ISO 8601 format with timezone
- Data hashes enable integrity verification
- Session information recorded
- Logs append-only (no deletion or modification)
- Analyst identity captured for each session
- Log format machine-readable (JSONL)
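Some of these checks can be automated against the log file itself. The sketch below is hypothetical (check_audit_log() is not part of the skill). It covers three checks only: valid JSON on every line, presence of timestamp, event, and session_id, and a timestamp ending in a UTC offset or Z.

```r
library(jsonlite)

# Hypothetical checker: returns a character vector of problems (empty if none).
check_audit_log <- function(path,
                            required = c("timestamp", "event", "session_id")) {
  lines <- readLines(path, warn = FALSE)
  lines <- lines[nzchar(lines)]
  # Date, "T", time, then "Z" or a numeric offset such as +0100 or +01:00
  iso_pattern <- paste0(
    "^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}",
    "(Z|[+-][0-9]{2}:?[0-9]{2})$"
  )
  problems <- character(0)
  for (i in seq_along(lines)) {
    entry <- tryCatch(
      jsonlite::fromJSON(lines[i], simplifyVector = FALSE),
      error = function(e) NULL
    )
    if (is.null(entry) || !is.list(entry)) {
      problems <- c(problems, sprintf("line %d: not a valid JSON object", i))
      next
    }
    missing_fields <- setdiff(required, names(entry))
    if (length(missing_fields) > 0) {
      problems <- c(problems, sprintf(
        "line %d: missing %s", i, paste(missing_fields, collapse = ", ")
      ))
    }
    ts <- entry[["timestamp"]]
    if (is.character(ts) && !grepl(iso_pattern, ts)) {
      problems <- c(problems, sprintf("line %d: timestamp lacks timezone offset", i))
    }
  }
  problems
}

# Hypothetical sample: one good line, one without an offset, one malformed.
sample_log <- tempfile(fileext = ".jsonl")
writeLines(c(
  '{"timestamp":"2024-01-15T09:30:00+0000","event":"SESSION_START","session_id":"1-a"}',
  '{"timestamp":"2024-01-15T09:31:12","event":"DATA_IMPORT","session_id":"1-a"}',
  '{broken'
), sample_log)

problems <- check_audit_log(sample_log)
print(problems)
```

Append-only behavior and completeness of events cannot be verified this way; those still need review of the scripts and file permissions.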
Pitfalls
- Logging too much: Focus on regulated events. Don't log every variable assignment.
- Mutable logs: Audit logs must be append-only. Use JSONL (one JSON object per line) + only ever append. JSONL format alone does not prevent edits.
- Missing timestamps: Every event needs timestamp with timezone.
- No session context: Each log entry should reference session for correlation.
- Forgetting to initialize: Scripts must call init_audit_log() before any analysis.
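Appending to a file does not by itself reveal later edits. One common way to make a log tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. This is a swapped-in technique, not something the functions above implement; chain_append() and chain_verify() are hypothetical, and the entries are reduced to event + description for brevity.

```r
library(digest)

# Hypothetical helper: add an entry whose hash covers the previous entry's hash.
chain_append <- function(chain, event, description) {
  prev_hash <- if (length(chain) == 0) {
    "GENESIS"
  } else {
    chain[[length(chain)]]$entry_hash
  }
  payload <- paste(prev_hash, event, description, sep = "|")
  entry <- list(
    event = event,
    description = description,
    prev_hash = prev_hash,
    entry_hash = digest::digest(payload, algo = "sha256", serialize = FALSE)
  )
  c(chain, list(entry))
}

# Hypothetical helper: recompute every hash and compare with the stored one.
chain_verify <- function(chain) {
  prev_hash <- "GENESIS"
  for (entry in chain) {
    payload <- paste(prev_hash, entry$event, entry$description, sep = "|")
    expected <- digest::digest(payload, algo = "sha256", serialize = FALSE)
    if (!identical(entry$prev_hash, prev_hash) ||
        !identical(entry$entry_hash, expected)) {
      return(FALSE)
    }
    prev_hash <- entry$entry_hash
  }
  TRUE
}

chain <- list()
chain <- chain_append(chain, "SESSION_START", "Session opened")
chain <- chain_append(chain, "DATA_IMPORT", "Loaded raw data")
chain <- chain_append(chain, "ANALYSIS_COMPLETE", "Primary analysis done")

# Simulate someone editing an earlier entry after the fact.
tampered <- chain
tampered[[1]]$description <- "Session opened (edited)"

intact_ok <- chain_verify(chain)
tampered_ok <- chain_verify(tampered)
print(c(intact = intact_ok, tampered = tampered_ok))
```

A chain like this only helps if the latest hash is also stored somewhere the log writer cannot change (for example a signed git commit), since anyone able to rewrite the whole file could recompute every hash.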
See Also
- setup-gxp-r-project - project structure for validated environments
- write-validation-documentation - validation protocols + reports
- validate-statistical-output - output verification methodology
- configure-git-repository - version control as part of change control
