analyze-codebase-workflow

pjt222
Updated 2 days ago
Tags: design, word, automation, data

About

This skill automatically analyzes codebases to detect workflows, data pipelines, and file dependencies using putior's `put_auto()` engine. It generates an annotation plan mapping I/O patterns across 30+ languages, ideal for onboarding or starting putior integration. Use it to understand data flow in unfamiliar projects or to prepare for source file annotation.

Quick Install

Claude Code

Primary method (recommended)
npx skills add pjt222/agent-almanac -a claude-code
Plugin command (alternative)
/plugin add https://github.com/pjt222/agent-almanac
Git clone (alternative)
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/analyze-codebase-workflow

Copy and paste this command into Claude Code to install the skill

Skill Documentation

Analyze Codebase Workflow

Survey repo → auto-detect data flows, file I/O, script deps → structured annotation plan for manual refinement.

Use When

  • Onboard unfamiliar codebase → understand data flow
  • Start putior integration, no PUT annotations
  • Audit existing data pipeline pre-doc
  • Prep annotation plan before annotate-source-files

In

  • Required: Path to repo/src dir
  • Optional: Subdirs focus (default: entire repo)
  • Optional: Langs include/exclude (default: all detected)
  • Optional: Scope: inputs only, outputs only, both (default: both + deps)

Do

Step 1: Survey Repo Structure

Identify src files + langs → what putior can analyze.

library(putior)

# List all supported languages and their extensions
list_supported_languages()
list_supported_languages(detection_only = TRUE)  # Only languages with auto-detection

# Get supported extensions
exts <- get_supported_extensions()

File listing → repo composition:

# Count files by extension in the target directory
find /path/to/repo -type f -name '*.*' ! -path '*/.git/*' | sed 's/.*\.//' | sort | uniq -c | sort -rn | head -20

Expect: file extensions in repo + counts. Map against get_supported_extensions() → coverage.

If err: No files match supported → putior can't auto-detect. Check if lang supported but non-standard ext.
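
The coverage mapping above can be sketched in base R. This is only an illustration: `extension_coverage` is a hypothetical helper, not part of putior, and the `supported` vector is hard-coded here on the assumption that get_supported_extensions() returns lowercase extensions without a leading dot.

```r
# Sketch: compare extensions found in a directory with a supported set.
# `extension_coverage` is a hypothetical helper, not a putior function.
extension_coverage <- function(path, supported) {
  files <- list.files(path, recursive = TRUE)
  exts <- tolower(tools::file_ext(files))
  exts <- exts[nzchar(exts)]  # drop extensionless files such as Makefile
  counts <- as.data.frame(table(ext = exts),
                          responseName = "n", stringsAsFactors = FALSE)
  counts$supported <- counts$ext %in% tolower(supported)
  counts[order(-counts$n, counts$ext), ]
}

# Demo on a throwaway directory
demo <- file.path(tempdir(), "coverage-demo")
dir.create(demo, showWarnings = FALSE)
file.create(file.path(demo, c("a.R", "b.R", "c.py", "d.f90", "Makefile")))

# Assumption: in real use, `supported` would come from get_supported_extensions()
coverage <- extension_coverage(demo, supported = c("r", "py"))
print(coverage)
```

Rows with supported = FALSE are the first candidates for the non-standard-extension check. Extensionless files are dropped by this sketch, so check those by filename separately.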

Step 2: Check Detection Coverage

Per detected lang → verify auto-detect pattern available.

# Check which languages have auto-detection patterns (18 languages, 902 patterns)
detection_langs <- list_supported_languages(detection_only = TRUE)
cat("Languages with auto-detection:\n")
print(detection_langs)

# Get pattern counts for specific languages found in the repo
for (lang in c("r", "python", "javascript", "sql", "dockerfile", "makefile")) {
  patterns <- get_detection_patterns(lang)
  cat(sprintf("%s: %d input, %d output, %d dependency patterns\n",
    lang,
    length(patterns$input),
    length(patterns$output),
    length(patterns$dependency)
  ))
}

Expect: pattern counts printed per lang, e.g. R 124, Python 159, JS 71.

If err: No patterns → supports manual only, not auto. Plan manual annotations.

Step 3: Run Auto-Detection

Execute put_auto() → discover workflow elements.

# Full auto-detection
workflow <- put_auto("./src/",
  detect_inputs = TRUE,
  detect_outputs = TRUE,
  detect_dependencies = TRUE
)

# Exclude build scripts and test helpers from scanning
workflow <- put_auto("./src/",
  detect_inputs = TRUE,
  detect_outputs = TRUE,
  detect_dependencies = TRUE,
  exclude = c("build-", "test_helper")
)

# View detected workflow nodes
print(workflow)

# Check node count
cat(sprintf("Detected %d workflow nodes\n", nrow(workflow)))

Large repos → analyze subdirs incrementally:

# Analyze specific subdirectories
etl_workflow <- put_auto("./src/etl/")
api_workflow <- put_auto("./src/api/")

Expect: data frame w/ id, label, input, output, source_file cols. One row per detected step.

If err: Empty → src may lack recognizable I/O patterns. Try workflow <- put_auto("./src/", log_level = "DEBUG") → see scanned + matched.
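
Per-subdirectory results can be stacked back into one table for review. A minimal sketch, assuming the results behave like plain data frames with the columns listed above; the mock frames below stand in for real put_auto() output and their values are invented for illustration.

```r
# Mock results standing in for put_auto("./src/etl/") and put_auto("./src/api/").
# Column names follow the text above; all values are illustrative only.
etl_workflow <- data.frame(
  id = c("extract", "transform"),
  label = c("Extract", "Transform"),
  input = c("raw.csv", "data.csv"),
  output = c("data.csv", "clean.parquet"),
  source_file = c("src/etl/extract.R", "src/etl/transform.py"),
  stringsAsFactors = FALSE
)
api_workflow <- data.frame(
  id = "handler", label = "Handler",
  input = "clean.parquet", output = NA_character_,
  source_file = "src/api/handler.js",
  stringsAsFactors = FALSE
)

# Stack the pieces and keep the first row for any repeated id
combined <- rbind(etl_workflow, api_workflow)
combined <- combined[!duplicated(combined$id), ]

# Node count per module (second path component in this mock layout)
module <- vapply(strsplit(combined$source_file, "/"), `[`, character(1), 2)
print(table(module))
```

A module with a node count of zero or far below its file count is a hint that manual annotation will be needed there.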

Step 4: Initial Diagram

Visualize auto-detected → assess coverage + gaps.

# Generate diagram from auto-detected workflow
cat(put_diagram(workflow, theme = "github"))

# With source file info for traceability
cat(put_diagram(workflow, show_source_info = TRUE))

# Save to file for review
writeLines(put_diagram(workflow, theme = "github"), "workflow-auto.md")

Expect: Mermaid flowchart w/ detected nodes + data flow edges. Meaningful fn/file labels.

If err: Disconnected nodes → auto-detect found I/O but couldn't infer connections. Normal: edges come only from output filenames matching input filenames. Annotation plan (next step) fills gaps.
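
To see why nodes end up disconnected, the filename-matching idea can be reproduced in a few lines of base R. This is a sketch of the general technique, not putior's actual implementation; `split_files` and `infer_edges` are hypothetical helpers, and treating the input/output fields as comma-separated strings is an assumption.

```r
# Hypothetical helper: split a comma-separated file field into a clean vector
split_files <- function(x) {
  if (is.na(x) || !nzchar(x)) return(character(0))
  trimws(strsplit(x, ",", fixed = TRUE)[[1]])
}

# Hypothetical helper: one edge per output file that another node reads
infer_edges <- function(workflow) {
  edges <- list()
  for (i in seq_len(nrow(workflow))) {
    outs <- split_files(workflow$output[i])
    for (j in seq_len(nrow(workflow))) {
      if (i == j) next
      shared <- intersect(outs, split_files(workflow$input[j]))
      for (f in shared) {
        edges[[length(edges) + 1]] <- data.frame(
          from = workflow$id[i], to = workflow$id[j], file = f,
          stringsAsFactors = FALSE
        )
      }
    }
  }
  if (length(edges) == 0) {
    return(data.frame(from = character(0), to = character(0),
                      file = character(0), stringsAsFactors = FALSE))
  }
  do.call(rbind, edges)
}

# Mock workflow: "load" has no detected input, so nothing can link to it
mock_workflow <- data.frame(
  id = c("extract", "transform", "load"),
  input = c("raw.csv", "data.csv", NA),
  output = c("data.csv", "clean.parquet", "report.html"),
  stringsAsFactors = FALSE
)

edges <- infer_edges(mock_workflow)
disconnected <- setdiff(mock_workflow$id, c(edges$from, edges$to))
print(edges)
print(disconnected)
```

Nodes listed in `disconnected` are the ones whose inputs or outputs need a manual annotation before an edge can appear.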

Step 5: Annotation Plan

Generate plan → what found + what needs manual.

# Generate annotation suggestions
put_generate("./src/", style = "single")

# For multiline style (more readable for complex workflows)
put_generate("./src/", style = "multiline")

# Copy suggestions to clipboard for easy pasting
put_generate("./src/", output = "clipboard")

Doc plan w/ coverage assessment:

## Annotation Plan

### Auto-Detected (no manual work needed)
- `src/etl/extract.R` — 3 inputs, 2 outputs detected
- `src/etl/transform.py` — 1 input, 1 output detected

### Needs Manual Annotation
- `src/api/handler.js` — Language supported but no I/O patterns matched
- `src/config/setup.sh` — Only 12 shell patterns; complex logic missed

### Not Supported
- `src/legacy/process.f90` — Fortran not in detection languages

### Recommended Connections
- extract.R output `data.csv` → transform.py input `data.csv` (auto-linked)
- transform.py output `clean.parquet` → load.R input (needs annotation)

Expect: clear plan separating auto-detected vs manual, w/ specific recs per file.

If err: put_generate() no out → verify path correct + has supported src files.
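
The three file sections of the plan can be drafted mechanically before hand-editing. A rough sketch in base R, assuming you already have the repo file list, the files that auto-detection covered, and a supported-extension set; `classify_files` and `render_plan` are hypothetical helpers and the inputs are hard-coded for illustration.

```r
# Hypothetical helper: sort files into the three plan sections
classify_files <- function(files, detected_files, supported_exts) {
  ext <- tolower(tools::file_ext(files))
  status <- ifelse(
    files %in% detected_files, "Auto-Detected",
    ifelse(ext %in% supported_exts, "Needs Manual Annotation", "Not Supported")
  )
  split(files, factor(status, levels = c(
    "Auto-Detected", "Needs Manual Annotation", "Not Supported"
  )))
}

# Hypothetical helper: render the sections as Markdown lines
render_plan <- function(sections) {
  lines <- "## Annotation Plan"
  for (name in names(sections)) {
    files <- sections[[name]]
    items <- if (length(files) > 0) paste0("- `", files, "`") else "- (none)"
    lines <- c(lines, "", paste("###", name), items)
  }
  lines
}

files <- c("src/etl/extract.R", "src/api/handler.js", "src/legacy/process.f90")
sections <- classify_files(
  files,
  detected_files = "src/etl/extract.R",  # assumption: e.g. unique(workflow$source_file)
  supported_exts = c("r", "py", "js")    # assumption: e.g. get_supported_extensions()
)
plan <- render_plan(sections)
writeLines(plan)
```

The Recommended Connections section still has to be written by hand, since it depends on reading the detected inputs and outputs.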

Check

  • put_auto() no err on target
  • Detected workflow has ≥1 node (unless no recognizable I/O)
  • put_diagram() produces valid Mermaid
  • put_generate() produces suggestions for detected files
  • Annotation plan doc created w/ coverage assessment
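
Part of the checklist can be automated. A small sketch, assuming the workflow is a data frame with the columns named in Step 3 and the diagram is Mermaid text; `check_results` is a hypothetical helper, and the Mermaid test is only a heuristic that looks for a line starting with `flowchart` or `graph`.

```r
# Hypothetical helper mirroring the checklist; returns a named logical vector
check_results <- function(workflow, diagram) {
  required <- c("id", "label", "input", "output", "source_file")
  diagram_lines <- unlist(strsplit(diagram, "\n", fixed = TRUE))
  c(
    has_nodes = nrow(workflow) >= 1,
    has_columns = all(required %in% names(workflow)),
    # Heuristic only: a Mermaid flowchart declares itself with one of these keywords
    looks_like_mermaid = any(grepl("^\\s*(flowchart|graph)\\s", diagram_lines))
  )
}

# Mock inputs standing in for put_auto() and put_diagram() results
mock_workflow <- data.frame(
  id = "extract", label = "Extract",
  input = "raw.csv", output = "data.csv",
  source_file = "extract.R",
  stringsAsFactors = FALSE
)
mock_diagram <- "flowchart TD\n    extract[Extract]"

print(check_results(mock_workflow, mock_diagram))
```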

Traps

  • Scan too broad: put_auto(".") → includes node_modules/, .git/, venv/. Target specific src dirs.
  • Expect full coverage: Auto-detect finds I/O + lib calls, not business logic. 40-60% typical; rest manual.
  • Ignore deps: detect_dependencies = TRUE catches source(), import, require() → links scripts. Disable → lose cross-file connections.
  • Lang mismatch: Non-standard ext (.R vs .r, .jsx vs .js) may not detect. Use get_comment_prefix() to check whether an ext is recognized. Extensionless Dockerfile, Makefile supported via filename match.
  • Large repos: 100+ src files → analyze by module/dir → diagrams readable.
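
For the broad-scan and large-repo traps, one option is to list candidate directories first and drop vendored ones before scanning anything. A minimal base R sketch; `candidate_dirs` is a hypothetical helper and the skip list is just the three directories named above.

```r
# Hypothetical helper: list subdirectories, skipping vendored/tooling dirs
candidate_dirs <- function(root, skip = c("node_modules", ".git", "venv")) {
  dirs <- list.dirs(root, recursive = TRUE, full.names = FALSE)
  dirs <- dirs[nzchar(dirs)]  # drop the root itself, returned as ""
  parts <- strsplit(dirs, "/", fixed = TRUE)
  keep <- !vapply(parts, function(p) any(p %in% skip), logical(1))
  dirs[keep]
}

# Demo on a throwaway directory tree
root <- file.path(tempdir(), "scan-demo")
for (d in c("src/etl", "src/api", "node_modules/pkg", "venv/lib")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
}

targets <- candidate_dirs(root)
print(targets)
# Each remaining directory could then be analyzed on its own,
# e.g. put_auto(file.path(root, "src/etl"))
```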

Related

  • install-putior — prereq
  • annotate-source-files — next: add manual
  • generate-workflow-diagram — final after annotation
  • configure-putior-mcp — MCP tools for interactive

GitHub Repository

pjt222/agent-almanac
Path: i18n/caveman-ultra/skills/analyze-codebase-workflow
Topics: agents, agentskills, ai-assisted-development, claude-code, skills, teams
