analyze-codebase-for-mcp
About
This skill analyzes codebases to identify functions, APIs, and data sources suitable for exposure as MCP tools, generating a specification document. Use it when planning an MCP server, auditing a codebase for AI tool wrapping, or comparing existing capabilities with current MCP exposure. It helps developers systematically discover tool candidates and create specs for scaffold-mcp-server.
Quick install
Claude Code
Recommended: npx skills add pjt222/agent-almanac -a claude-code
/plugin add https://github.com/pjt222/agent-almanac
git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/analyze-codebase-for-mcp
Copy and paste this command into Claude Code to install the skill.
Skill documentation
Analyze Codebase for MCP
Scan a codebase to discover functions, REST endpoints, CLI commands, and data access patterns that are good candidates for MCP tool exposure, then produce a structured tool specification document.
When to use
- Planning an MCP server for an existing project and need to know what to expose
- Auditing a codebase before wrapping it as an AI-accessible tool surface
- Comparing what a codebase can do versus what is already exposed via MCP
- Generating a tool specification document to hand off to scaffold-mcp-server
- Evaluating whether a third-party library is worth wrapping as MCP tools
Inputs
- Required: Path to the codebase root directory
- Required: Target language(s) of the codebase (e.g., TypeScript, Python, R, Go)
- Optional: Existing MCP server code to compare against (gap analysis)
- Optional: Domain focus (e.g., "data analysis", "file operations", "API integration")
- Optional: Maximum number of tools to recommend (default: 20)
Procedure
Step 1: Scan Codebase Structure
1.1. Use Glob to map the directory tree, focusing on source directories:
- src/**/*.{ts,js,py,R,go,rs} for source files
- **/routes/**, **/api/**, **/controllers/** for endpoint definitions
- **/cli/**, **/commands/** for CLI entry points
- **/package.json, **/setup.py, **/DESCRIPTION for dependency metadata
1.2. Categorize files by role:
- Entry points: main files, route handlers, CLI commands
- Core logic: business logic functions, algorithms, data transformers
- Data access: database queries, file I/O, API clients
- Utilities: helpers, formatters, validators
1.3. Count total files, lines of code, and exported symbols to gauge project size.
Expected: A categorized file inventory with role annotations.
On failure: If the codebase is too large (>10,000 files), narrow the scan to specific directories or modules using the domain focus input. If no source files are found, verify the root path and language parameters.
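The categorization in step 1.2 can be sketched in a few lines of Python. This is a minimal sketch, not the skill's actual implementation: the directory names in ROLE_DIRS and the helper names classify and build_inventory are hypothetical and would need tuning for each codebase.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical role rules, loosely based on the glob patterns in step 1.1.
ROLE_DIRS = {
    "entry_point": {"routes", "api", "controllers", "cli", "commands"},
    "data_access": {"db", "models", "repositories", "clients"},
    "utility": {"utils", "helpers", "lib"},
}
SOURCE_SUFFIXES = {".ts", ".js", ".py", ".R", ".go", ".rs"}


def classify(path: str) -> Optional[str]:
    """Return a role for a source file, or None for non-source files."""
    p = PurePosixPath(path)
    if p.suffix not in SOURCE_SUFFIXES:
        return None
    parent_dirs = set(p.parts[:-1])
    for role, dirs in ROLE_DIRS.items():
        if parent_dirs & dirs:
            return role
    # Anything not matched by a directory rule is treated as core logic.
    return "core_logic"


def build_inventory(paths):
    """Group source file paths by role, skipping non-source files."""
    inventory = defaultdict(list)
    for path in paths:
        role = classify(path)
        if role is not None:
            inventory[role].append(path)
    return dict(inventory)
```

For a real scan, the paths could come from something like Path(root).rglob("*") converted to POSIX-style strings. Directory-name rules are only a first pass; files that land in core_logic by default usually deserve a manual look.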
Step 2: Identify Exposed Functions and Endpoints
2.1. Use Grep to find exported functions and public APIs:
- TypeScript/JavaScript: export (async )?function, export default, module.exports
- Python: functions not prefixed with _, @app.route, @router
- R: functions listed in NAMESPACE or #' @export roxygen tags
- Go: capitalized function names (exported by convention)
2.2. For each candidate function, extract:
- Name: function or endpoint name
- Signature: parameters with types and defaults
- Return type: what the function produces
- Documentation: docstrings, JSDoc, roxygen, godoc
- Location: file path and line number
2.3. For REST APIs, additionally extract:
- HTTP method and route pattern
- Request body schema
- Response shape
- Authentication requirements
2.4. Create a candidate list sorted by potential utility (public, documented, well-typed functions first).
Expected: A list of 20-100 candidate functions/endpoints with extracted metadata.
On failure: If few candidates are found, broaden the search to include internal functions that could be made public. If documentation is sparse, flag this as a risk in the output.
Step 3: Evaluate MCP Suitability
3.1. For each candidate, assess it against MCP tool criteria:
- Input contract clarity: Are parameters well-typed and documented? Can they be described in a JSON Schema?
- Output predictability: Does the function return structured data (JSON-serializable)? Is the return shape consistent?
- Side effects: Does the function modify state (files, database, external services)? Side effects must be clearly labeled.
- Idempotency: Is the operation safe to retry? Non-idempotent tools need explicit warnings.
- Execution time: Will it complete within a reasonable timeout (< 30 seconds)? Long-running operations need async patterns.
- Error handling: Does it throw structured errors or fail silently?
3.2. Score each candidate on a 1-5 scale:
- 5: Pure function, typed I/O, documented, fast, no side effects
- 4: Well-typed, documented, minor side effects (e.g., logging)
- 3: Reasonable I/O contract but needs wrapping (e.g., returns raw objects)
- 2: Significant side effects or unclear contract, needs substantial adaptation
- 1: Not suitable without major refactoring
3.3. Filter candidates to those scoring 3 or above. Flag score-2 items as "future candidates" requiring refactoring.
Expected: A scored and filtered candidate list with suitability rationale for each.
On failure: If most candidates score below 3, the codebase may need refactoring before MCP exposure. Document the gaps and recommend specific improvements (add types, extract pure functions, wrap side effects).
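One way to make the 1-5 rubric repeatable is to encode it as a small function. The sketch below is one possible reading of the scale, not a definitive scoring rule: the Candidate fields and the exact thresholds are assumptions, and execution time and idempotency are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    typed: bool
    documented: bool
    side_effects: str  # "none", "minor", or "significant"
    returns_structured: bool


def score(c: Candidate) -> int:
    """Map candidate attributes onto the 1-5 suitability scale."""
    if not c.typed and not c.documented:
        return 1  # no usable contract without major refactoring
    if c.side_effects == "significant" or not c.typed:
        return 2  # significant side effects or unclear contract
    if not c.returns_structured or not c.documented:
        return 3  # reasonable contract, but needs wrapping
    if c.side_effects == "minor":
        return 4
    return 5


def triage(candidates):
    """Split candidates into selected (score >= 3) and future (score == 2)."""
    selected = [c.name for c in candidates if score(c) >= 3]
    future = [c.name for c in candidates if score(c) == 2]
    return selected, future
```

A function like this does not replace the written rationale the step asks for; it only keeps scores consistent across a long candidate list.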
Step 4: Design Tool Specifications
4.1. For each selected candidate (score >= 3), draft a tool specification:
- name: tool_name
  description: >
    One-line description of what the tool does.
  source_function: module.function_name
  source_file: src/path/to/file.ts:42
  parameters:
    param_name:
      type: string | number | boolean | object | array
      description: What this parameter controls
      required: true | false
      default: value_if_optional
  returns:
    type: string | object | array
    description: What the tool returns
  side_effects:
    - description of any side effect
  estimated_latency: fast | medium | slow
  suitability_score: 5
4.2. Group tools into logical categories (e.g., "Data Queries", "File Operations", "Analysis", "Configuration").
4.3. Identify dependencies between tools (e.g., "list_datasets" should be called before "query_dataset").
4.4. Determine if any tools need wrappers to:
- Simplify complex parameter objects into flat inputs
- Convert raw return values to structured text or JSON
- Add safety guards (e.g., read-only wrappers for database functions)
Expected: A complete YAML tool specification with categories, dependencies, and wrapper notes.
On failure: If tool specifications are ambiguous, revisit Step 2 to extract more detail from the source code. If parameter types cannot be inferred, flag for manual review.
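Since step 3.1 asks whether parameters can be described in a JSON Schema, it can help to check that each drafted parameters block actually converts to one. The sketch below assumes the spec has already been loaded into a Python dictionary with the field names from the template in 4.1, and that each parameter carries a single concrete type; the helper name to_input_schema is hypothetical.

```python
def to_input_schema(parameters: dict) -> dict:
    """Convert a spec 'parameters' mapping into a JSON Schema object."""
    properties = {}
    required = []
    for name, param in parameters.items():
        prop = {
            "type": param["type"],
            "description": param.get("description", ""),
        }
        if "default" in param:
            prop["default"] = param["default"]
        properties[name] = prop
        if param.get("required", False):
            required.append(name)
    schema = {"type": "object", "properties": properties}
    if required:
        schema["required"] = required
    return schema
```

A parameter whose type cannot be written as a single JSON Schema type is a good signal that the tool needs one of the wrappers listed in 4.4, or manual review.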
Step 5: Generate Tool Spec Document
5.1. Write the final specification document with these sections:
- Summary: Codebase overview, language, size, and analysis date
- Recommended Tools: Full specifications from Step 4, grouped by category
- Future Candidates: Score-2 items with refactoring recommendations
- Excluded Items: Score-1 items with exclusion rationale
- Dependencies: Tool dependency graph
- Implementation Notes: Wrapper requirements, authentication needs, transport recommendations
5.2. Save as mcp-tool-spec.yml (machine-readable) and optionally mcp-tool-spec.md (human-readable summary).
5.3. If an existing MCP server was provided, include a gap analysis section:
- Tools in the spec but not yet implemented
- Implemented tools not in the spec (possibly stale)
- Tools with specification drift (implementation diverges from spec)
Expected: A complete tool specification document ready for consumption by scaffold-mcp-server.
On failure: If the document exceeds reasonable size (>200 tools), split into modules with cross-references. If the codebase has no suitable candidates, produce a "readiness assessment" document with refactoring recommendations instead.
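The three gap analysis buckets in 5.3 reduce to set operations once both sides are expressed as tool names with their parameter names. The sketch below assumes that representation; the function name gap_analysis and the use of parameter names as the only drift signal are simplifications for illustration.

```python
def gap_analysis(spec: dict, implemented: dict) -> dict:
    """Compare two mappings of tool name -> list of parameter names."""
    spec_names = set(spec)
    impl_names = set(implemented)
    return {
        # In the spec but not yet implemented.
        "missing": sorted(spec_names - impl_names),
        # Implemented but absent from the spec (possibly stale).
        "stale": sorted(impl_names - spec_names),
        # Present on both sides, but the parameter sets differ.
        "drifted": sorted(
            name
            for name in spec_names & impl_names
            if set(spec[name]) != set(implemented[name])
        ),
    }
```

Real drift can also hide in types, defaults, or return shapes, so a parameter-name comparison should be treated as a lower bound.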
Validation
- All source files in the target codebase were scanned
- Candidate functions have extracted names, signatures, and return types
- Each candidate has a suitability score with written rationale
- Tool specifications include complete parameter schemas with types
- Side effects are explicitly documented for every tool
- The output document is valid YAML (parseable by any YAML library)
- Tool names follow MCP conventions (snake_case, descriptive, unique)
- Categories and dependencies form a coherent tool surface
- Gap analysis is included when an existing MCP server was provided
- Future candidates section lists refactoring steps needed for score-2 items
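Several of these checks are mechanical and can be scripted. The sketch below covers three of them (snake_case names, unique names, documented side effects) on a list of tool dictionaries shaped like the template in Step 4; the function name check_tools and the message wording are hypothetical.

```python
import re
from collections import Counter

SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def check_tools(tools: list) -> list:
    """Return a list of human-readable problems found in the tool specs."""
    problems = []
    for tool in tools:
        name = tool.get("name", "")
        if not SNAKE_CASE.match(name):
            problems.append(f"{name!r}: not snake_case")
        if "side_effects" not in tool:
            problems.append(f"{name!r}: side_effects not documented")
    # Report each duplicated name once, after the per-tool checks.
    counts = Counter(tool.get("name", "") for tool in tools)
    for name, count in counts.items():
        if count > 1:
            problems.append(f"{name!r}: duplicate name")
    return problems
```

An empty side_effects list counts as documented here, since it states explicitly that the tool has none.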
Common pitfalls
- Exposing too many tools: AI assistants work best with 10-30 focused tools. Prioritize breadth of capability over depth. Resist exposing every public function.
- Ignoring side effects: A function that "just reads" but also writes to a log or cache still has side effects. Audit carefully with Grep for file writes, network calls, and database mutations.
- Assuming type safety: Dynamic languages (Python, R, JavaScript) may have functions with no type annotations. Infer types from usage patterns and tests, but flag uncertainty in the spec.
- Missing authentication context: Functions that work in an authenticated web request may fail when called via MCP without session context. Check for implicit auth dependencies such as session cookies, JWT tokens, or environment-injected credentials.
- Over-engineering wrappers: If a function needs a 50-line wrapper to be MCP-compatible, it may not be a good candidate. Prefer functions that map naturally to tool interfaces.
- Neglecting error paths: MCP tools must return structured errors. Functions that throw untyped exceptions need error-handling wrappers.
- Conflating internal and external APIs: Internal helper functions called by other internal code are poor MCP candidates. Focus on functions designed for external consumption or clear boundary APIs.
- Skipping the gap analysis: If an existing MCP server is provided, always compare the spec against the current implementation. Without gap analysis, you risk duplicating work or missing stale tools.
Related skills
- scaffold-mcp-server - use the output spec to generate a working MCP server
- build-custom-mcp-server - manual server implementation reference
- configure-mcp-server - connect the resulting server to Claude Code/Desktop
- troubleshoot-mcp-connection - debug connectivity after deploying the server
- review-software-architecture - architecture review for tool surface design
- security-audit-codebase - security audit before exposing functions externally
