analyze-codebase-workflow

Name: analyze-codebase-workflow
Author: pjt222

Updated 1 month ago

Category: Design. Tags: word, automation, data

About

This skill automatically analyzes codebases to detect workflows, data pipelines, and file dependencies using putior's `put_auto()` engine. It generates an annotation plan mapping I/O patterns across 30+ languages, ideal for onboarding or starting putior integration. Use it to understand data flow in unfamiliar projects or to prepare for source file annotation.

Skill documentation

Analyze Codebase Workflow

Survey repo → auto-detect data flows, file I/O, script deps → structured annotation plan for manual refinement.

Use When

Onboard unfamiliar codebase → understand data flow
Start putior integration in a repo w/ no PUT annotations yet
Audit existing data pipeline before documenting it
Prep annotation plan before running annotate-source-files

Inputs

Required: Path to repo/src dir
Optional: Subdirs to focus on (default: entire repo)
Optional: Langs to include/exclude (default: all detected)
Optional: Detection scope: inputs only, outputs only, or both (default: both + deps)

Do

Step 1: Survey Repo Structure

Identify src files + langs → what putior can analyze.

library(putior)

# List all supported languages and their extensions
list_supported_languages()
list_supported_languages(detection_only = TRUE)  # Only languages with auto-detection

# Get supported extensions
exts <- get_supported_extensions()

File listing → repo composition:

# Count files by extension in the target directory (skips .git and files without an extension)
find /path/to/repo -type f ! -path '*/.git/*' | sed -n 's/.*\.\([^./]*\)$/\1/p' | sort | uniq -c | sort -rn | head -20

→ File extensions in repo + counts. Map against get_supported_extensions() → coverage.

If err: No files match supported extensions → putior can't auto-detect. Check whether the lang is supported but the files use a non-standard ext.
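The coverage mapping above can be sketched in base R. This is a minimal sketch, not putior's own method: `extension_coverage()` is a hypothetical helper, and `supported_exts` stands in for whatever `get_supported_extensions()` returns (assumed here to be lowercase extensions without a leading dot; check the actual format before passing it in).

```r
# Tabulate file extensions under `root` and flag which ones are covered.
# `supported_exts` is an assumed input: lowercase extensions, no leading dot.
extension_coverage <- function(root, supported_exts) {
  files <- list.files(root, recursive = TRUE)
  exts <- tolower(tools::file_ext(files))   # "" for Dockerfile, Makefile, ...
  exts[exts == ""] <- "(none)"
  counts <- sort(table(exts), decreasing = TRUE)
  data.frame(
    ext = as.character(names(counts)),
    n_files = as.integer(counts),
    supported = as.character(names(counts)) %in% tolower(supported_exts),
    stringsAsFactors = FALSE
  )
}

# Example call (path and extension vector are placeholders):
# extension_coverage("/path/to/repo/src", c("r", "py", "sql"))
```

Extensions are lowercased before comparison, so `.R` and `.r` land in the same row. Rows with `supported = FALSE` and a high file count are the first candidates for manual annotation.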

Step 2: Check Detection Coverage

Per detected lang → verify auto-detect pattern available.

# Check which languages have auto-detection patterns (18 languages, 902 patterns)
detection_langs <- list_supported_languages(detection_only = TRUE)
cat("Languages with auto-detection:\n")
print(detection_langs)

# Get pattern counts for specific languages found in the repo
for (lang in c("r", "python", "javascript", "sql", "dockerfile", "makefile")) {
  patterns <- get_detection_patterns(lang)
  cat(sprintf("%s: %d input, %d output, %d dependency patterns\n",
    lang,
    length(patterns$input),
    length(patterns$output),
    length(patterns$dependency)
  ))
}

→ Pattern counts printed. R 124, Python 159, JS 71, etc.

If err: No patterns for a lang → putior supports it for manual annotation only, not auto-detection. Plan manual annotations for those files.
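Splitting the repo's languages into "auto" and "manual only" is a set operation. A small sketch follows, assuming both inputs are plain character vectors of language names; `split_by_detection()` is a hypothetical helper, and the example vectors are illustrative rather than putior's real language list.

```r
# Split the languages found in a repo by whether auto-detection covers them.
# Both arguments are assumed to be character vectors of language names.
split_by_detection <- function(repo_langs, detection_langs) {
  repo_langs <- unique(tolower(repo_langs))
  detection_langs <- tolower(detection_langs)
  list(
    auto = intersect(repo_langs, detection_langs),   # hand to auto-detection
    manual = setdiff(repo_langs, detection_langs)    # annotate by hand
  )
}

# Illustrative values only
plan <- split_by_detection(
  repo_langs = c("r", "python", "fortran"),
  detection_langs = c("r", "python", "javascript", "sql")
)
plan$auto
plan$manual
```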

Step 3: Run Auto-Detection

Execute put_auto() → discover workflow elements.

# Full auto-detection
workflow <- put_auto("./src/",
  detect_inputs = TRUE,
  detect_outputs = TRUE,
  detect_dependencies = TRUE
)

# Exclude build scripts and test helpers from scanning
workflow <- put_auto("./src/",
  detect_inputs = TRUE,
  detect_outputs = TRUE,
  detect_dependencies = TRUE,
  exclude = c("build-", "test_helper")
)

# View detected workflow nodes
print(workflow)

# Check node count
cat(sprintf("Detected %d workflow nodes\n", nrow(workflow)))

Large repos → analyze subdirs incrementally:

# Analyze specific subdirectories
etl_workflow <- put_auto("./src/etl/")
api_workflow <- put_auto("./src/api/")

→ Data frame w/ id, label, input, output, source_file cols. One row per detected step.

If err: Empty result → src may lack recognizable I/O patterns. Rerun with workflow <- put_auto("./src/", log_level = "DEBUG") → see which files were scanned and which patterns matched.
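The incremental, per-subdirectory analysis can be wrapped in a loop. The sketch below is one possible shape, not part of putior: `analyze_by_subdir()` is a hypothetical helper, and the analyzer is passed in as a function so the code runs without putior installed. In practice `analyze_fn` would be something like `function(p) put_auto(p)`, assuming every call returns a data frame with the same columns.

```r
# Run an analyzer over each immediate subdirectory and stack the results.
# `analyze_fn` is injected (assumption: it returns a data frame, possibly empty).
analyze_by_subdir <- function(root, analyze_fn) {
  subdirs <- list.dirs(root, recursive = FALSE, full.names = TRUE)
  results <- lapply(subdirs, function(path) {
    res <- analyze_fn(path)
    if (is.null(res) || nrow(res) == 0) return(NULL)  # nothing detected here
    res$module <- basename(path)                       # remember the origin
    res
  })
  results <- Filter(Negate(is.null), results)
  if (length(results) == 0) return(data.frame())
  do.call(rbind, results)
}

# Example call (requires putior; path is a placeholder):
# workflow <- analyze_by_subdir("./src", function(p) put_auto(p))
```

The added `module` column makes it easy to see which subdirectories produced no rows at all, which are the ones worth rerunning with debug logging.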

Step 4: Initial Diagram

Visualize auto-detected → assess coverage + gaps.

# Generate diagram from auto-detected workflow
cat(put_diagram(workflow, theme = "github"))

# With source file info for traceability
cat(put_diagram(workflow, show_source_info = TRUE))

# Save to file for review
writeLines(put_diagram(workflow, theme = "github"), "workflow-auto.md")

→ Mermaid flowchart, detected nodes + data flow edges. Meaningful fn/file labels.

If err: Disconnected nodes → auto-detect found I/O but couldn't infer connections. This is normal: edges come from matching output filenames to input filenames, so a mismatch leaves nodes unlinked. The annotation plan in the next step fills the gaps.
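To see why nodes end up disconnected, the filename-matching idea can be reproduced in a few lines. This is an illustrative sketch, not putior's actual implementation: `split_files()` and `infer_edges()` are hypothetical helpers, and the `input`/`output` columns are assumed to hold comma-separated file names.

```r
# Split a comma-separated file list such as "a.csv, b.csv" into clean names.
split_files <- function(x) {
  if (is.na(x) || !nzchar(x)) return(character(0))
  trimws(strsplit(x, ",", fixed = TRUE)[[1]])
}

# Link node A to node B whenever a file A writes is a file B reads.
infer_edges <- function(nodes) {
  edges <- list()
  for (i in seq_len(nrow(nodes))) {
    outs <- split_files(nodes$output[i])
    for (j in seq_len(nrow(nodes))) {
      if (i == j) next
      shared <- intersect(outs, split_files(nodes$input[j]))
      if (length(shared) > 0) {
        edges[[length(edges) + 1]] <- data.frame(
          from = nodes$id[i], to = nodes$id[j],
          via = paste(shared, collapse = ", "),
          stringsAsFactors = FALSE
        )
      }
    }
  }
  if (length(edges) == 0) {
    return(data.frame(from = character(0), to = character(0),
                      via = character(0), stringsAsFactors = FALSE))
  }
  do.call(rbind, edges)
}

# Made-up nodes: "load" reads a file nobody writes under that name
nodes <- data.frame(
  id = c("extract", "transform", "load"),
  input = c("raw.csv", "data.csv", "final.parquet"),
  output = c("data.csv", "clean.parquet", NA),
  stringsAsFactors = FALSE
)
infer_edges(nodes)
# one edge: extract -> transform via data.csv; "load" stays disconnected
```

Here `transform` writes `clean.parquet` but `load` reads `final.parquet`, so no edge is inferred between them. That is the kind of gap a manual annotation closes.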

Step 5: Annotation Plan

Generate plan → what found + what needs manual.

# Generate annotation suggestions
put_generate("./src/", style = "single")

# For multiline style (more readable for complex workflows)
put_generate("./src/", style = "multiline")

# Copy suggestions to clipboard for easy pasting
put_generate("./src/", output = "clipboard")

Doc plan w/ coverage assessment:

## Annotation Plan

### Auto-Detected (no manual work needed)
- `src/etl/extract.R` — 3 inputs, 2 outputs detected
- `src/etl/transform.py` — 1 input, 1 output detected

### Needs Manual Annotation
- `src/api/handler.js` — Language supported but no I/O patterns matched
- `src/config/setup.sh` — Only 12 shell patterns; complex logic missed

### Not Supported
- `src/legacy/process.f90` — Fortran not in detection languages

### Recommended Connections
- extract.R output `data.csv` → transform.py input `data.csv` (auto-linked)
- transform.py output `clean.parquet` → load.R input (needs annotation)

→ Clear plan: auto-detected vs manual, specific recs per file.

If err: put_generate() produces no output → verify the path is correct and contains supported src files.
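The three buckets of the plan (auto-detected, needs manual annotation, not supported) can be derived mechanically once the detected files are known. A minimal sketch, assuming `detected_files` uses the same path form as `files` and that `supported_exts` is a lowercase vector without dots; `classify_files()` and all values below are hypothetical.

```r
# Classify each source file for the annotation plan.
#   files           source file paths in the repo (assumed input)
#   detected_files  paths that appear in the auto-detected workflow
#   supported_exts  extensions with auto-detection (lowercase, no dot)
classify_files <- function(files, detected_files, supported_exts) {
  ext <- tolower(tools::file_ext(files))
  status <- ifelse(
    files %in% detected_files, "auto-detected",
    ifelse(ext %in% tolower(supported_exts),
           "needs manual annotation", "not supported")
  )
  levels <- c("auto-detected", "needs manual annotation", "not supported")
  split(files, factor(status, levels = levels))
}

# Illustrative values mirroring the plan template above
files <- c("src/etl/extract.R", "src/etl/transform.py",
           "src/api/handler.js", "src/legacy/process.f90")
plan <- classify_files(
  files,
  detected_files = c("src/etl/extract.R", "src/etl/transform.py"),
  supported_exts = c("r", "py", "js")
)
plan
```

Each list element maps to one heading of the plan document, so the result can be pasted into the template with little editing.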

Check

put_auto() no err on target
Detected workflow has ≥1 node (unless no recognizable I/O)
put_diagram() produces valid Mermaid
put_generate() produces suggestions for detected files
Annotation plan doc created w/ coverage assessment
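Part of this checklist can be automated. The sketch below is a rough heuristic under stated assumptions: `check_results()` is a hypothetical helper, the workflow is assumed to be a data frame with an `id` column, and "valid Mermaid" is approximated by looking for a line that starts with the `flowchart` or `graph` keyword. It does not parse the diagram.

```r
# Rough sanity checks on a detected workflow and its Mermaid text.
# Heuristic only: a keyword match is not a full Mermaid validation.
check_results <- function(workflow, diagram_text) {
  lines <- strsplit(diagram_text, "\n", fixed = TRUE)[[1]]
  c(
    has_nodes = is.data.frame(workflow) && nrow(workflow) >= 1,
    has_ids = "id" %in% names(workflow),
    looks_like_mermaid =
      any(grepl("^[[:space:]]*(flowchart|graph)[[:space:]]", lines))
  )
}

# Made-up inputs for illustration
checks <- check_results(
  data.frame(id = c("extract", "transform")),
  "flowchart TD\n  extract --> transform"
)
checks
```

Any `FALSE` entry points back to the matching step: no nodes means revisiting auto-detection, and a failed diagram check means inspecting the generated text by hand.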

Traps

Scan too broad: put_auto(".") → includes node_modules/, .git/, venv/. Target specific src dirs.
Expect full coverage: Auto-detect finds I/O + lib calls, not business logic. 40-60% typical; rest manual.
Ignore deps: detect_dependencies = TRUE catches source(), import, require() → links scripts. Disable → lose cross-file connections.
Lang mismatch: Non-standard ext (.R vs .r, .jsx vs .js) may not be detected. Use get_comment_prefix() to check whether an ext is recognized. Extensionless Dockerfile, Makefile supported via filename match.
Large repos: 100+ src files → analyze by module/dir → diagrams readable.
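Two of these traps (scanning too broadly, and unreadable diagrams on large repos) can be handled before any analysis runs. A minimal base R sketch follows; `list_source_files()` and `group_by_module()` are hypothetical helpers, and the default `skip_dirs` simply repeats the directories named above.

```r
# List files under `root`, dropping anything inside vendored or VCS dirs.
list_source_files <- function(root,
                              skip_dirs = c("node_modules", ".git", "venv")) {
  files <- list.files(root, recursive = TRUE, all.files = TRUE)
  parts <- strsplit(files, "/", fixed = TRUE)
  keep <- !vapply(parts, function(p) any(p %in% skip_dirs), logical(1))
  files[keep]
}

# Group the surviving files by their first path component (the "module").
# Files directly under `root` are grouped under ".".
group_by_module <- function(files) {
  module <- ifelse(grepl("/", files, fixed = TRUE),
                   sub("/.*$", "", files), ".")
  split(files, module)
}

# Example call (path is a placeholder):
# modules <- group_by_module(list_source_files("/path/to/repo"))
# lengths(modules)   # files per module, to decide what to analyze separately
```

Modules with many files are the ones worth analyzing and diagramming on their own.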

Related

install-putior — prereq
annotate-source-files — next: add manual
generate-workflow-diagram — final after annotation
configure-putior-mcp — MCP tools for interactive

GitHub repository

pjt222/agent-almanac

Path: i18n/caveman-ultra/skills/analyze-codebase-workflow

Topics: agents, agentskills, ai-assisted-development, claude-code, skills, teams

FAQ

What is the analyze-codebase-workflow skill?

analyze-codebase-workflow is a Claude Skill by pjt222. Skills package instructions and resources that Claude loads on demand, so Claude can perform analyze-codebase-workflow-related tasks without extra prompting.

How do I install analyze-codebase-workflow?

Use the install commands on this page: add analyze-codebase-workflow to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does analyze-codebase-workflow belong to?

analyze-codebase-workflow is in the Design category, tagged word, automation and data.

Is analyze-codebase-workflow free to use?

Yes. analyze-codebase-workflow is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.