
analyze-codebase-workflow

Name: analyze-codebase-workflow
Author: pjt222


Updated 1 month ago


Category: Design. Tags: word, automation, data

About

This skill automatically analyzes codebases to detect workflows, data pipelines, and file dependencies using putior's `put_auto()` engine. It generates an annotation plan that maps I/O patterns for more than 30 languages, which suits onboarding new team members or starting a putior integration. Use it to understand data flow in unfamiliar projects or to prepare source files for annotation.

Quick Install

Claude Code (recommended)

Primary

npx skills add pjt222/agent-almanac -a claude-code

Plugin command (alternative)

/plugin add https://github.com/pjt222/agent-almanac

Git clone (alternative)

git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/analyze-codebase-workflow

Copy and paste one of these commands into Claude Code to install this skill.

Documentation

Analyze Codebase Workflow

Survey an arbitrary repository to auto-detect data flows, file I/O, and script dependencies, then produce a structured annotation plan for manual refinement.

When to Use

Onboarding onto an unfamiliar codebase and need to understand data flow
Starting putior integration in a project that has no PUT annotations yet
Auditing an existing project's data pipeline before documentation
Preparing an annotation plan before running annotate-source-files

Inputs

Required: Path to the repository or source directory to analyze
Optional: Specific subdirectories to focus on (default: entire repo)
Optional: Languages to include or exclude (default: all detected)
Optional: Detection scope: inputs only, outputs only, or both (default: both + dependencies)

Procedure

Step 1: Survey Repository Structure

Identify source files and their languages to understand what putior can analyze.

library(putior)

# List all supported languages and their extensions
list_supported_languages()
list_supported_languages(detection_only = TRUE)  # Only languages with auto-detection

# Get supported extensions
exts <- get_supported_extensions()

Use file listing to understand repo composition:

# Count files by extension in the target directory
# (skips .git internals and files whose names contain no dot)
find /path/to/repo -type f -name '*.*' ! -path '*/.git/*' | sed 's/.*\.//' | sort | uniq -c | sort -rn | head -20

Got: A list of file extensions present in the repo, with counts. Map these against get_supported_extensions() to know coverage.

If fail: If the repo has no files matching supported extensions, putior cannot auto-detect workflows. Consider whether the language is supported but files use non-standard extensions.
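
The mapping step described above can be sketched in base R. This is a minimal illustration, not part of putior: `extension_coverage()` and the toy file names are hypothetical, and the supported set is passed in as a plain character vector. With putior loaded, that vector could come from `get_supported_extensions()`, assuming it returns extensions without a leading dot.

```r
# Tally files by extension and flag which extensions are in a supported set.
# `supported` is a plain character vector so the sketch runs without putior.
extension_coverage <- function(repo_path, supported) {
  files <- list.files(repo_path, recursive = TRUE)
  exts <- tolower(tools::file_ext(files))
  exts <- exts[nzchar(exts)]                      # drop extensionless files
  counts <- sort(table(exts), decreasing = TRUE)
  data.frame(
    extension = as.character(names(counts)),
    files = as.integer(counts),
    supported = as.character(names(counts)) %in% tolower(supported),
    stringsAsFactors = FALSE
  )
}

# Toy repository in a temporary directory (hypothetical file names)
demo_repo <- file.path(tempdir(), "coverage-demo")
dir.create(file.path(demo_repo, "src"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(
  demo_repo, "src", c("extract.R", "load.R", "transform.py", "notes.xyz")
))

coverage <- extension_coverage(demo_repo, supported = c("r", "py", "sql"))
print(coverage)
```

Rows with `supported = FALSE` are the extensions to check for non-standard naming before concluding that a language is not covered.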

Step 2: Check Language Detection Coverage

For each detected language, verify auto-detection pattern availability.

# Check which languages have auto-detection patterns (18 languages, 902 patterns)
detection_langs <- list_supported_languages(detection_only = TRUE)
cat("Languages with auto-detection:\n")
print(detection_langs)

# Get pattern counts for specific languages found in the repo
for (lang in c("r", "python", "javascript", "sql", "dockerfile", "makefile")) {
  patterns <- get_detection_patterns(lang)
  cat(sprintf("%s: %d input, %d output, %d dependency patterns\n",
    lang,
    length(patterns$input),
    length(patterns$output),
    length(patterns$dependency)
  ))
}

Got: Pattern counts printed for each language. R has 124 patterns, Python 159, JavaScript 71, etc.

If fail: If a language returns no patterns, it supports manual annotations but not auto-detection. Plan to annotate those files manually.

Step 3: Run Auto-Detection

Execute put_auto() on the target directory to discover workflow elements.

# Full auto-detection
workflow <- put_auto("./src/",
  detect_inputs = TRUE,
  detect_outputs = TRUE,
  detect_dependencies = TRUE
)

# Exclude build scripts and test helpers from scanning
workflow <- put_auto("./src/",
  detect_inputs = TRUE,
  detect_outputs = TRUE,
  detect_dependencies = TRUE,
  exclude = c("build-", "test_helper")
)

# View detected workflow nodes
print(workflow)

# Check node count
cat(sprintf("Detected %d workflow nodes\n", nrow(workflow)))

For large repos, analyze subdirectories incrementally:

# Analyze specific subdirectories
etl_workflow <- put_auto("./src/etl/")
api_workflow <- put_auto("./src/api/")

Got: A data frame with columns including id, label, input, output, source_file. Each row represents a detected workflow step.

If fail: If the result is empty, the source files may not contain recognizable I/O patterns. Try enabling debug logging: workflow <- put_auto("./src/", log_level = "DEBUG") to see which files are scanned and which patterns match.

Step 4: Generate Initial Diagram

Visualize the auto-detected workflow to assess coverage and identify gaps.

# Generate diagram from auto-detected workflow
cat(put_diagram(workflow, theme = "github"))

# With source file info for traceability
cat(put_diagram(workflow, show_source_info = TRUE))

# Save to file for review
writeLines(put_diagram(workflow, theme = "github"), "workflow-auto.md")

Got: A Mermaid flowchart showing detected nodes connected by data flow edges. Nodes should be labeled with meaningful function/file names.

If fail: If the diagram shows disconnected nodes, the auto-detection found I/O patterns but couldn't infer connections. This is normal — connections are derived from matching output filenames to input filenames. The annotation plan (next step) will address gaps.
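
To make the filename-matching idea concrete, here is a small base R sketch that derives edges from a toy workflow table. It is not putior's implementation. It assumes the `input` and `output` columns hold comma-separated file names, and `split_files()`, `derive_edges()`, and the node names are hypothetical.

```r
# Split a comma-separated file list into a character vector.
# Assumption: missing or empty values mean "no files".
split_files <- function(x) {
  if (is.na(x) || !nzchar(x)) return(character(0))
  trimws(strsplit(x, ",", fixed = TRUE)[[1]])
}

# Derive edges by matching each node's outputs to other nodes' inputs.
derive_edges <- function(workflow) {
  edges <- data.frame(from = character(0), to = character(0),
                      file = character(0), stringsAsFactors = FALSE)
  for (i in seq_len(nrow(workflow))) {
    outs <- split_files(workflow$output[i])
    for (j in seq_len(nrow(workflow))) {
      if (i == j) next
      shared <- intersect(outs, split_files(workflow$input[j]))
      if (length(shared) > 0) {
        edges <- rbind(edges, data.frame(
          from = workflow$id[i], to = workflow$id[j],
          file = shared, stringsAsFactors = FALSE
        ))
      }
    }
  }
  edges
}

# Toy workflow (hypothetical nodes using the columns listed in Step 3)
toy <- data.frame(
  id = c("extract", "transform", "load"),
  input = c("raw.csv", "data.csv", "clean.rds"),
  output = c("data.csv", "clean.parquet", NA),
  stringsAsFactors = FALSE
)

edges <- derive_edges(toy)
print(edges)

# Nodes that appear in no edge are the ones a diagram shows as disconnected
disconnected <- setdiff(toy$id, c(edges$from, edges$to))
print(disconnected)
```

In this toy data `load` stays disconnected because its input name does not match any output name, which is the kind of gap a manual annotation would close.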

Step 5: Produce Annotation Plan

Generate a structured plan documenting what was found and what needs manual annotation.

# Generate annotation suggestions
put_generate("./src/", style = "single")

# For multiline style (more readable for complex workflows)
put_generate("./src/", style = "multiline")

# Copy suggestions to clipboard for easy pasting
put_generate("./src/", output = "clipboard")

Document the plan with coverage assessment:

## Annotation Plan

### Auto-Detected (no manual work needed)
- `src/etl/extract.R` — 3 inputs, 2 outputs detected
- `src/etl/transform.py` — 1 input, 1 output detected

### Needs Manual Annotation
- `src/api/handler.js` — Language supported but no I/O patterns matched
- `src/config/setup.sh` — Only 12 shell patterns; complex logic missed

### Not Supported
- `src/legacy/process.f90` — Fortran not in detection languages

### Recommended Connections
- extract.R output `data.csv` → transform.py input `data.csv` (auto-linked)
- transform.py output `clean.parquet` → load.R input (needs annotation)

Got: A clear plan separating auto-detected files from those needing manual annotation, with specific recommendations for each file.

If fail: If put_generate() produces no output, ensure the directory path is correct and contains source files in supported languages.
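
The three plan sections above can be filled in mechanically once per-file detection counts are known. The sketch below uses base R and toy values only. `classify_files()` and the `files` table are hypothetical, and `detection_langs` stands in for whatever `list_supported_languages(detection_only = TRUE)` reports.

```r
# Classify files into the three plan sections from per-file detection counts.
classify_files <- function(files, detection_langs) {
  status <- ifelse(
    !(files$language %in% detection_langs), "Not Supported",
    ifelse(files$n_inputs + files$n_outputs > 0,
           "Auto-Detected", "Needs Manual Annotation")
  )
  sections <- c("Auto-Detected", "Needs Manual Annotation", "Not Supported")
  split(files$path, factor(status, levels = sections))
}

# Toy per-file summary (hypothetical counts)
files <- data.frame(
  path = c("src/etl/extract.R", "src/etl/transform.py",
           "src/api/handler.js", "src/legacy/process.f90"),
  language = c("r", "python", "javascript", "fortran"),
  n_inputs = c(3, 1, 0, 0),
  n_outputs = c(2, 1, 0, 0),
  stringsAsFactors = FALSE
)

plan <- classify_files(files, detection_langs = c("r", "python", "javascript"))

# Render the plan as Markdown sections
for (section in names(plan)) {
  cat("### ", section, "\n", sep = "")
  for (path in plan[[section]]) cat("- `", path, "`\n", sep = "")
  cat("\n")
}

# Share of files that need no manual work
coverage_rate <- length(plan[["Auto-Detected"]]) / nrow(files)
print(coverage_rate)
```

The rendered sections can be pasted into the plan document and then extended by hand with the recommended connections.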

Validation

put_auto() executes without errors on the target directory
Detected workflow has at least one node (unless repo has no recognizable I/O)
put_diagram() produces valid Mermaid code from the auto-detected workflow
put_generate() produces annotation suggestions for files with detected patterns
Annotation plan document created with coverage assessment

Pitfalls

Scanning too broadly: Running put_auto(".") on a repo root may include node_modules/, .git/, venv/, etc. Target specific source directories.
Expecting full coverage: Auto-detection finds file I/O and library calls, not business logic. A 40-60% coverage rate is typical; the rest needs manual annotation.
Ignoring dependencies: The detect_dependencies = TRUE flag catches source(), import, require() calls that link scripts together. Disabling it loses cross-file connections.
Language mismatch: Files with non-standard extensions (e.g., .R vs .r, .jsx vs .js) may not be detected. Use get_comment_prefix() to check if an extension is recognized. Note that extensionless files like Dockerfile and Makefile are supported via exact filename matching.
Large repos: For repos with 100+ source files, analyze by module/directory to keep diagrams readable.
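
One way to avoid the first and last pitfalls is to count source files per top-level directory before choosing what to scan. The following base R sketch is an illustration only: `files_per_directory()`, the skip list, and the extension set are assumptions, not putior features.

```r
# Count source files per top-level directory, skipping vendored or generated
# directories, to decide which paths are worth scanning separately.
files_per_directory <- function(repo_path, extensions,
                                skip = c("node_modules", ".git", "venv")) {
  files <- list.files(repo_path, recursive = TRUE, all.files = TRUE)
  parts <- strsplit(files, "/", fixed = TRUE)
  vendored <- vapply(parts, function(p) any(p %in% skip), logical(1))
  is_source <- tolower(tools::file_ext(files)) %in% tolower(extensions)
  keep <- files[!vendored & is_source]
  # Files directly in the repo root are grouped under "."
  top_dir <- ifelse(grepl("/", keep, fixed = TRUE), sub("/.*$", "", keep), ".")
  sort(table(top_dir), decreasing = TRUE)
}

# Toy repository in a temporary directory (hypothetical layout)
demo_repo <- file.path(tempdir(), "scope-demo")
for (d in c("src", "scripts", "node_modules/pkg")) {
  dir.create(file.path(demo_repo, d), recursive = TRUE, showWarnings = FALSE)
}
file.create(file.path(demo_repo, c(
  "src/extract.R", "src/load.R", "scripts/run.py", "node_modules/pkg/index.js"
)))

counts <- files_per_directory(demo_repo, extensions = c("r", "py", "js"))
print(counts)
```

Directories with the highest counts are natural candidates for separate `put_auto()` runs, as in the incremental analysis shown in Step 3.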

Related Skills

install-putior — prerequisite: putior must be installed first
annotate-source-files — next step: add manual annotations based on the plan
generate-workflow-diagram — generate final diagram after annotation is complete
configure-putior-mcp — use MCP tools for interactive analysis sessions

GitHub Repository

pjt222/agent-almanac

Path: i18n/caveman-lite/skills/analyze-codebase-workflow

Topics: agents, agentskills, ai-assisted-development, claude-code, skills, teams

FAQ


What is the analyze-codebase-workflow skill?

analyze-codebase-workflow is a Claude Skill by pjt222. Skills package instructions and resources that Claude loads on demand, so Claude can carry out codebase workflow analysis without extra prompting.

How do I install analyze-codebase-workflow?

Use the install commands on this page: add analyze-codebase-workflow to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does analyze-codebase-workflow belong to?

analyze-codebase-workflow is in the Design category, tagged word, automation and data.

Is analyze-codebase-workflow free to use?

Yes. analyze-codebase-workflow is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.
