analyze-codebase-workflow
About
This skill automatically analyzes codebases to detect workflows, data pipelines, and file dependencies using putior's `put_auto()` engine. It generates an annotation plan mapping I/O patterns across 30+ languages, ideal for onboarding or starting putior integration. Use it to understand data flow in unfamiliar projects or to prepare for source file annotation.
Quick Install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
/plugin add https://github.com/pjt222/agent-almanac
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/analyze-codebase-workflow
Copy and paste this command into Claude Code to install the skill.
Skill Documentation
Analyze Codebase Workflow
Survey repo → auto-detect data flows, file I/O, script deps → structured annotation plan for manual refinement.
Use When
- Onboard unfamiliar codebase → understand data flow
- Start putior integration, no PUT annotations
- Audit existing data pipeline pre-doc
- Prep annotation plan before `annotate-source-files`
In
- Required: Path to repo/src dir
- Optional: Subdirs focus (default: entire repo)
- Optional: Langs include/exclude (default: all detected)
- Optional: Scope: inputs only, outputs only, both (default: both + deps)
Do
Step 1: Survey Repo Structure
Identify src files + langs → what putior can analyze.
library(putior)
# List all supported languages and their extensions
list_supported_languages()
list_supported_languages(detection_only = TRUE) # Only languages with auto-detection
# Get supported extensions
exts <- get_supported_extensions()
File listing → repo composition:
# Count files by extension in the target directory
find /path/to/repo -type f | sed 's/.*\.//' | sort | uniq -c | sort -rn | head -20
→ File extensions in repo + counts. Map against get_supported_extensions() → coverage.
If err: No files match supported → putior can't auto-detect. Check if lang supported but non-standard ext.
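A minimal base-R sketch of the coverage mapping, as an alternative to the shell one-liner. The `supported` vector is a hand-written stand-in (an assumption); in a putior session it would come from `get_supported_extensions()`, whose return shape should be checked first. The temp-directory files are illustrative only.

```r
# Build a throwaway repo so the example is self-contained (illustrative files).
repo <- file.path(tempdir(), "coverage-demo")
dir.create(file.path(repo, "src"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(repo, "src",
                      c("extract.R", "transform.py", "load.R", "legacy.f90")))

# Assumption: stand-in for get_supported_extensions(); not the real list.
supported <- c("r", "py", "js", "sql", "sh")

# Count files by lower-cased extension, so .R and .r land in one bucket.
files <- list.files(repo, recursive = TRUE)
ext <- tolower(tools::file_ext(files))
counts <- sort(table(ext), decreasing = TRUE)

# Share of files whose extension is in the supported set.
coverage <- mean(ext %in% supported)
unsupported <- sort(unique(ext[!(ext %in% supported)]))

print(counts)
cat(sprintf("Coverage: %.0f%% of %d files\n", 100 * coverage, length(files)))
cat("Unsupported extensions:", paste(unsupported, collapse = ", "), "\n")
```

Extensionless files such as `Dockerfile` get an empty extension here, so they would need a separate filename check.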
Step 2: Check Detection Coverage
Per detected lang → verify auto-detect pattern available.
# Check which languages have auto-detection patterns (18 languages, 902 patterns)
detection_langs <- list_supported_languages(detection_only = TRUE)
cat("Languages with auto-detection:\n")
print(detection_langs)
# Get pattern counts for specific languages found in the repo
for (lang in c("r", "python", "javascript", "sql", "dockerfile", "makefile")) {
  patterns <- get_detection_patterns(lang)
  cat(sprintf("%s: %d input, %d output, %d dependency patterns\n",
    lang,
    length(patterns$input),
    length(patterns$output),
    length(patterns$dependency)
  ))
}
→ Pattern counts printed. R 124, Python 159, JS 71, etc.
If err: No patterns → supports manual only, not auto. Plan manual annotations.
Step 3: Run Auto-Detection
Execute put_auto() → discover workflow elements.
# Full auto-detection
workflow <- put_auto("./src/",
  detect_inputs = TRUE,
  detect_outputs = TRUE,
  detect_dependencies = TRUE
)
# Exclude build scripts and test helpers from scanning
workflow <- put_auto("./src/",
  detect_inputs = TRUE,
  detect_outputs = TRUE,
  detect_dependencies = TRUE,
  exclude = c("build-", "test_helper")
)
# View detected workflow nodes
print(workflow)
# Check node count
cat(sprintf("Detected %d workflow nodes\n", nrow(workflow)))
Large repos → analyze subdirs incrementally:
# Analyze specific subdirectories
etl_workflow <- put_auto("./src/etl/")
api_workflow <- put_auto("./src/api/")
→ Df w/ id, label, input, output, source_file cols. Row = detected step.
If err: Empty → src may lack recognizable I/O patterns. Try workflow <- put_auto("./src/", log_level = "DEBUG") → see scanned + matched.
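The incremental approach can be sketched as a loop that runs an analyzer per subdirectory and stacks the results. To stay runnable without putior, the analyzer is passed in as a function. `analyze_by_dir` and `fake_analyzer` are hypothetical names, and the sketch assumes that results from different directories share the same columns and can be row-bound.

```r
# Run an analyzer over each subdirectory and stack the results.
# `analyzer` is any function(path) returning a data frame; in a putior
# session that would be put_auto (assumption: results share columns).
analyze_by_dir <- function(dirs, analyzer) {
  results <- lapply(dirs, function(d) {
    res <- tryCatch(analyzer(d), error = function(e) NULL)
    if (is.null(res) || nrow(res) == 0) return(NULL)  # skip failed/empty dirs
    res$module <- basename(d)                          # remember the origin
    res
  })
  results <- Filter(Negate(is.null), results)
  do.call(rbind, results)
}

# Hypothetical stub standing in for put_auto(), for illustration only.
fake_analyzer <- function(path) {
  if (basename(path) == "docs") return(data.frame())   # nothing detected
  data.frame(
    id = paste0(basename(path), "_step"),
    label = paste("Step in", basename(path)),
    input = "in.csv",
    output = "out.csv",
    source_file = file.path(path, "main.R"),
    stringsAsFactors = FALSE
  )
}

combined <- analyze_by_dir(c("./src/etl", "./src/api", "./src/docs"),
                           fake_analyzer)
print(table(combined$module))
```

The added `module` column keeps track of which directory each node came from, which helps when diagrams are later drawn per module.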
Step 4: Initial Diagram
Visualize auto-detected → assess coverage + gaps.
# Generate diagram from auto-detected workflow
cat(put_diagram(workflow, theme = "github"))
# With source file info for traceability
cat(put_diagram(workflow, show_source_info = TRUE))
# Save to file for review
writeLines(put_diagram(workflow, theme = "github"), "workflow-auto.md")
→ Mermaid flowchart, detected nodes + data flow edges. Meaningful fn/file labels.
If err: Disconnected nodes → auto-detect found I/O but couldn't infer connections. Normal: edges come from matching output filenames to input filenames, so mismatched names leave nodes unlinked. The annotation plan in the next step fills the gaps.
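The filename-matching idea behind edges can be sketched in base R. This is an illustration of the principle, not putior's implementation. It assumes `input` and `output` hold comma-separated filenames, and the node data is made up.

```r
# Illustrative nodes; columns follow the shape described in Step 3.
nodes <- data.frame(
  id = c("extract", "transform", "load"),
  input = c("raw.csv", "data.csv", "clean_final.parquet"),
  output = c("data.csv", "clean.parquet", NA),
  stringsAsFactors = FALSE
)

# Split a comma-separated field into trimmed filenames (NA -> none).
split_files <- function(x) {
  if (is.na(x) || !nzchar(x)) return(character(0))
  trimws(strsplit(x, ",", fixed = TRUE)[[1]])
}

# An edge exists when one node's output filename equals another's input.
edges <- do.call(rbind, lapply(seq_len(nrow(nodes)), function(i) {
  outs <- split_files(nodes$output[i])
  hits <- vapply(seq_len(nrow(nodes)), function(j) {
    j != i && any(outs %in% split_files(nodes$input[j]))
  }, logical(1))
  if (!any(hits)) return(NULL)
  data.frame(from = nodes$id[i], to = nodes$id[hits], stringsAsFactors = FALSE)
}))

# Nodes that appear in no edge are the ones a diagram shows disconnected.
disconnected <- setdiff(nodes$id, c(edges$from, edges$to))
print(edges)
print(disconnected)
```

Here `load` stays disconnected because `clean_final.parquet` does not match `clean.parquet`. That is the kind of gap a manual annotation resolves.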
Step 5: Annotation Plan
Generate plan → what found + what needs manual.
# Generate annotation suggestions
put_generate("./src/", style = "single")
# For multiline style (more readable for complex workflows)
put_generate("./src/", style = "multiline")
# Copy suggestions to clipboard for easy pasting
put_generate("./src/", output = "clipboard")
Doc plan w/ coverage assessment:
## Annotation Plan
### Auto-Detected (no manual work needed)
- `src/etl/extract.R` — 3 inputs, 2 outputs detected
- `src/etl/transform.py` — 1 input, 1 output detected
### Needs Manual Annotation
- `src/api/handler.js` — Language supported but no I/O patterns matched
- `src/config/setup.sh` — Only 12 shell patterns; complex logic missed
### Not Supported
- `src/legacy/process.f90` — Fortran not in detection languages
### Recommended Connections
- extract.R output `data.csv` → transform.py input `data.csv` (auto-linked)
- transform.py output `clean.parquet` → load.R input (needs annotation)
→ Clear plan: auto-detected vs manual, specific recs per file.
If err: put_generate() no out → verify path correct + has supported src files.
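The three buckets of the plan can be derived mechanically once the file inventory and detection results are known. A minimal sketch, assuming hand-written stand-ins: `detectable_ext` for the extensions that have detection patterns and `detected_files` for the `source_file` column of a `put_auto()` result. `classify` is a hypothetical helper.

```r
# Illustrative inventory of source files.
files <- c("src/etl/extract.R", "src/etl/transform.py",
           "src/api/handler.js", "src/legacy/process.f90")

# Assumptions: both vectors are hand-written for the example.
detectable_ext <- c("r", "py", "js")
detected_files <- c("src/etl/extract.R", "src/etl/transform.py")

# Assign each file to one section of the plan.
classify <- function(f) {
  ext <- tolower(tools::file_ext(f))
  if (f %in% detected_files) {
    "Auto-Detected"
  } else if (ext %in% detectable_ext) {
    "Needs Manual Annotation"
  } else {
    "Not Supported"
  }
}

plan <- split(files, vapply(files, classify, character(1)))

# Emit the plan in the same markdown layout as the template above.
plan_lines <- c("## Annotation Plan")
for (section in names(plan)) {
  plan_lines <- c(plan_lines,
                  paste("###", section),
                  paste0("- `", plan[[section]], "`"))
}
writeLines(plan_lines)
```

Per-file counts of inputs and outputs, and the recommended connections, still have to be added by hand or taken from the detection results.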
Check
- `put_auto()` no err on target
- Detected workflow has ≥1 node (unless no recognizable I/O)
- `put_diagram()` produces valid Mermaid
- `put_generate()` produces suggestions for detected files
- Annotation plan doc created w/ coverage assessment
Traps
- Scan too broad: `put_auto(".")` → includes `node_modules/`, `.git/`, `venv/`. Target specific src dirs.
- Expect full coverage: Auto-detect finds I/O + lib calls, not business logic. 40-60% typical; rest manual.
- Ignore deps: `detect_dependencies = TRUE` catches `source()`, `import`, `require()` → links scripts. Disable → lose cross-file connections.
- Lang mismatch: Non-standard ext (`.R` vs `.r`, `.jsx` vs `.js`) may not detect. Use `get_comment_prefix()`. Extensionless `Dockerfile`, `Makefile` supported via filename match.
- Large repos: 100+ src files → analyze by module/dir → diagrams readable.
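To avoid the overly broad scan, vendored and tooling directories can be filtered out before choosing what to analyze. A base-R sketch: the directory layout and the `skip_dirs` list are illustrative assumptions, not putior defaults.

```r
# Throwaway repo with vendored directories (illustrative layout).
repo <- file.path(tempdir(), "scan-demo")
for (d in c("src/etl", "node_modules/pkg", "venv/lib")) {
  dir.create(file.path(repo, d), recursive = TRUE, showWarnings = FALSE)
}
file.create(file.path(repo, c("src/etl/extract.R",
                              "node_modules/pkg/index.js",
                              "venv/lib/site.py")))

# Directories that should never be scanned (extend as needed).
skip_dirs <- c("node_modules", ".git", "venv")

# Keep a path only if none of its components is a skipped directory.
all_files <- list.files(repo, recursive = TRUE, all.files = TRUE)
is_skipped <- vapply(strsplit(all_files, "/", fixed = TRUE),
                     function(parts) any(parts %in% skip_dirs), logical(1))
scan_files <- all_files[!is_skipped]

# Directories that still contain source files after filtering.
scan_dirs <- unique(dirname(scan_files))
print(scan_files)
print(scan_dirs)
```

Each entry in `scan_dirs` can then be analyzed on its own, which also keeps diagrams small for large repos.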
→
- `install-putior`: prereq
- `annotate-source-files`: next, add manual
- `generate-workflow-diagram`: final after annotation
- `configure-putior-mcp`: MCP tools for interactive
