analyze-codebase-for-mcp
关于
This skill analyzes a codebase to identify functions, APIs, and data sources that can be exposed as Model Context Protocol (MCP) tools, generating a structured specification document. It's used when planning an MCP server, auditing a codebase for AI-accessible tool surfaces, or comparing existing capabilities with current MCP exposure. The analysis focuses on functions, REST endpoints, CLI commands, and data access points suitable for tool conversion.
快速安装
Claude Code
推荐npx skills add pjt222/agent-almanac -a claude-code/plugin add https://github.com/pjt222/agent-almanacgit clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/analyze-codebase-for-mcp在 Claude Code 中复制并粘贴此命令以安装该技能
技能文档
Analyze Codebase for MCP
Scan codebase → fns, REST endpoints, CLI cmds, data access candidates → MCP tool exposure → structured tool spec doc.
Use When
- Plan MCP server existing project → know what to expose
- Audit codebase pre-AI-tool-surface wrap
- Compare codebase capability vs MCP exposed
- Generate tool spec → hand to
scaffold-mcp-server - Evaluate 3rd-party lib worth wrapping
In
- Required: Path to codebase root
- Required: Target lang(s) (TS, Python, R, Go)
- Optional: Existing MCP server code → gap analysis
- Optional: Domain focus ("data analysis", "file ops", "API integration")
- Optional: Max tools to recommend (default: 20)
Do
Step 1: Scan Structure
1.1. Glob → dir tree, src dirs:
src/**/*.{ts,js,py,R,go,rs}→ src files**/routes/**,**/api/**,**/controllers/**→ endpoints**/cli/**,**/commands/**→ CLI entries**/package.json,**/setup.py,**/DESCRIPTION→ dep metadata
1.2. Categorize by role:
- Entry: main files, route handlers, CLI cmds
- Core logic: business fns, algos, data transformers
- Data access: DB queries, file I/O, API clients
- Utilities: helpers, formatters, validators
1.3. Count files, LOC, exported symbols → gauge size.
→ Categorized inventory w/ role annotations.
If err: Too large (>10K files) → narrow via domain focus. No src found → verify root path + lang params.
Step 2: Identify Fns + Endpoints
2.1. Grep exported fns + public APIs:
- TS/JS:
export (async )?function,export default,module.exports - Python: fns no
_prefix,@app.route,@router - R: NAMESPACE or
#' @exportroxygen - Go: capitalized fn names (exported by convention)
2.2. Per candidate extract:
- Name: fn/endpoint
- Signature: params w/ types + defaults
- Return type
- Docs: docstrings, JSDoc, roxygen, godoc
- Location: file path + line
2.3. REST APIs, also extract:
- HTTP method + route pattern
- Req body schema
- Res shape
- Auth reqs
2.4. Sort by potential utility (public, documented, well-typed first).
→ 20-100 candidates w/ extracted metadata.
If err: Few candidates → broaden → include internal that could be public. Sparse docs → flag as risk.
Step 3: Evaluate MCP Suitability
3.1. Per candidate → MCP tool criteria:
- In contract clarity: params well-typed + documented? JSON Schema describable?
- Out predictability: structured (JSON-serializable)? Consistent shape?
- Side effects: modifies state (files, DB, external)? Must be labeled.
- Idempotency: safe to retry? Non-idempotent → explicit warn.
- Exec time: completes <30s? Long-running → async patterns.
- Err handling: structured errs or silent fail?
3.2. Score 1-5:
- 5: Pure fn, typed I/O, documented, fast, no side effects
- 4: Well-typed, documented, minor side effects (logging)
- 3: Reasonable I/O, needs wrapping (raw objects)
- 2: Significant side effects or unclear, substantial adaptation
- 1: Not suitable no major refactor
3.3. Filter ≥3. Flag score-2 → "future candidates" needing refactor.
→ Scored + filtered list w/ suitability rationale.
If err: Most <3 → codebase needs refactor pre-MCP. Doc gaps → recommend (add types, extract pure fns, wrap side effects).
Step 4: Design Tool Specs
4.1. Per selected (≥3) draft spec:
- name: tool_name
description: >
One-line description of what the tool does.
source_function: module.function_name
source_file: src/path/to/file.ts:42
parameters:
param_name:
type: string | number | boolean | object | array
description: What this parameter controls
required: true | false
default: value_if_optional
returns:
type: string | object | array
description: What the tool returns
side_effects:
- description of any side effect
estimated_latency: fast | medium | slow
suitability_score: 5
4.2. Group logical categories ("Data Queries", "File Ops", "Analysis", "Config").
4.3. Identify deps between tools ("list_datasets" before "query_dataset").
4.4. Need wrappers?
- Simplify complex param objects → flat in
- Convert raw returns → structured text/JSON
- Safety guards (read-only wrappers for DB fns)
→ Complete YAML spec w/ categories, deps, wrapper notes.
If err: Ambiguous → Step 2 → more src detail. Param types uninferable → flag manual review.
Step 5: Generate Tool Spec Doc
5.1. Final doc sections:
- Summary: Codebase overview, lang, size, date
- Recommended Tools: Full specs from Step 4, grouped
- Future Candidates: Score-2 + refactor recs
- Excluded: Score-1 + rationale
- Dependencies: Tool dep graph
- Impl Notes: Wrappers, auth, transport
5.2. Save mcp-tool-spec.yml (machine) + mcp-tool-spec.md (human).
5.3. Existing MCP server provided → gap analysis:
- In spec, not impl
- Impl, not in spec (stale)
- Spec drift (impl diverges)
→ Complete doc → consumable by scaffold-mcp-server.
If err: >200 tools → split modules w/ cross-refs. No candidates → "readiness assessment" doc w/ refactor recs.
Check
- All src files scanned
- Candidates have names, signatures, returns
- Each candidate scored + rationale
- Tool specs complete param schemas w/ types
- Side effects explicit per tool
- Doc valid YAML (parseable)
- Tool names follow MCP (snake_case, unique)
- Categories + deps coherent
- Gap analysis if existing MCP provided
- Future candidates list refactor steps
Traps
- Too many tools: AI works best 10-30 focused. Breadth > depth. Resist every public fn.
- Ignore side effects: "Just reads" + logs/cache = still side effects. Audit
Grepfile writes, network, DB. - Assume type safety: Dynamic langs (Py, R, JS) may lack type annotations. Infer from usage, flag uncertainty.
- Missing auth ctx: Fns working in authed web req may fail via MCP no session. Check implicit session cookies, JWT, env creds.
- Over-engineer wrappers: 50-line wrapper → not good candidate. Prefer natural mapping.
- Neglect err paths: MCP must return structured errs. Untyped exceptions → err-handling wrappers.
- Conflate internal + external APIs: Internal helpers poor candidates. Focus external-consumption or boundary APIs.
- Skip gap analysis: Existing MCP provided → always compare. No gap analysis → duplicate work or stale tools.
→
scaffold-mcp-server— use out spec → working MCPbuild-custom-mcp-server— manual impl refconfigure-mcp-server— connect to Claude Code/Desktoptroubleshoot-mcp-connection— debug after deployreview-software-architecture— arch review for tool surfacesecurity-audit-codebase— audit pre-external exposure
GitHub 仓库
相关推荐技能
content-collections
元Content Collections 是一个 TypeScript 优先的构建工具,可将本地 Markdown/MDX 文件转换为类型安全的数据集合。它专为构建博客、文档站和内容密集型 Vite+React 应用而设计,提供基于 Zod 的自动模式验证。该工具涵盖从 Vite 插件配置、MDX 编译到生产环境部署的完整工作流。
polymarket
元这个Claude Skill为开发者提供完整的Polymarket预测市场开发支持,涵盖API调用、交易执行和市场数据分析。关键特性包括实时WebSocket数据流,可监控实时交易、订单和市场动态。开发者可用它构建预测市场应用、实施交易策略并集成实时市场预测功能。
creating-opencode-plugins
元该Skill帮助开发者创建OpenCode插件,用于接入命令、文件、LSP等25+种事件。它提供了插件结构、事件API规范和JavaScript/TypeScript实现模式,适合需要拦截操作、扩展功能或自定义事件处理的场景。开发者可通过它快速构建响应式模块来增强OpenCode AI助手的能力。
sglang
元SGLang是一个专为LLM设计的高性能推理框架,特别适用于需要结构化输出的场景。它通过RadixAttention前缀缓存技术,在处理JSON、正则表达式、工具调用等具有重复前缀的复杂工作流时,能实现极速生成。如果你正在构建智能体或多轮对话系统,并追求远超vLLM的推理性能,SGLang是理想选择。
