mcp-builder
About
mcp-builder helps developers create high-quality MCP servers for integrating external APIs and services with LLMs. Use it when building MCP integrations in Python (FastMCP) or Node/TypeScript (MCP SDK), prioritizing agent workflows over simple API wrappers. It guides you through research-driven design, implementation with validation, and evaluation-based iteration for context-efficient tools.
Documentation
MCP Server Development Guide
Overview
Build high-quality MCP (Model Context Protocol) servers that enable LLMs to accomplish real-world tasks through well-designed tools. Quality is measured not by API coverage, but by how effectively agents can use your tools to complete realistic workflows.
Core insight: MCP servers expose tools for AI agents, not human users. Design for agent constraints (limited context, no visual UI, workflow-oriented) rather than human convenience.
When to Use This Skill
Activate when:
- Building MCP servers for external API integration
- Adding tools to existing MCP servers
- Improving MCP server tool design for better agent usability
- Creating evaluations to test MCP server effectiveness
- Debugging why agents struggle with your MCP tools
Language Support:
- Python: FastMCP framework (recommended for rapid development)
- Node/TypeScript: MCP SDK (recommended for production services)
The Iron Law
DESIGN FOR AGENTS, NOT HUMANS
Every tool must optimize for:
- Context efficiency (agents have limited tokens)
- Workflow completion (not just API calls)
- Actionable errors (guide agents to success)
- Natural task subdivision (how agents think)
If your tools are just thin API wrappers, you're violating the Iron Law.
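To make the contrast concrete, here is a minimal sketch using a hypothetical ticketing API (all function and field names are illustrative, not from a real SDK). The thin-wrapper approach forces the agent to chain several calls and sift raw JSON; the workflow tool answers the agent's actual question in one call.

```python
# Thin wrapper (violates the Iron Law): one tool per endpoint.
# The agent must chain three calls and parse raw JSON itself.
def get_user(user_id: str) -> dict: ...
def get_user_tickets(user_id: str) -> list: ...
def get_ticket_comments(ticket_id: str) -> list: ...

# Workflow tool: one call completes the task, returning only
# the fields the agent needs, in a compact text format.
def summarize_open_tickets(user_email: str, limit: int = 5) -> str:
    """Return a concise summary of a user's open tickets."""
    tickets = [  # placeholder for real API calls
        {"id": "T-1", "title": "Login fails", "status": "open"},
        {"id": "T-2", "title": "Billing bug", "status": "open"},
    ]
    lines = [f"{t['id']}: {t['title']} ({t['status']})" for t in tickets[:limit]]
    return "\n".join(lines) or "No open tickets."
```

The workflow tool matches how the agent subdivides the task ("what are this user's open tickets?") rather than how the API subdivides its endpoints.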
Core Principles
- Agent-Centric Design First: Study design principles before coding. Tools should enable workflows, not mirror APIs.
- Research-Driven Planning: Load MCP docs, SDK docs, and exhaustive API documentation before writing code.
- Evaluation-Based Iteration: Create realistic evaluations early. Let agent feedback drive improvements.
- Context Optimization: Every response token matters. Default to concise, offer detailed when needed.
- Actionable Errors: Error messages should teach agents correct usage patterns.
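As a small sketch of the "actionable errors" principle (the tool and parameter names here are hypothetical): an error should state what failed, what was expected, and show a corrected example, so the agent can self-correct on the next call.

```python
# Build an error message that teaches the correct usage pattern,
# rather than only reporting the failure.
def actionable_error(param: str, got: str, expected: str, example: str) -> str:
    return (
        f"Invalid value for '{param}': {got!r}. "
        f"Expected {expected}. "
        f"Example: {example}"
    )

# Hypothetical failure in a search tool's date filter:
msg = actionable_error(
    param="since",
    got="last week",
    expected="an ISO 8601 date (YYYY-MM-DD)",
    example='since="2024-05-01"',
)
# The agent now knows the exact format to retry with.
```

Compare this with an error that just says "invalid date" — the agent would have to guess the format, burning retries and context.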
Quick Start
Phase 1: Research and Planning (40% of effort)
- Study Design Principles: Load design_principles.md to understand agent-centric design
- Load Protocol Docs: Fetch https://modelcontextprotocol.io/llms-full.txt for the MCP specification
- Study SDK Docs: Load Python or TypeScript SDK documentation from GitHub
- Study API Exhaustively: Read ALL API documentation, endpoints, authentication, rate limits
- Create Implementation Plan: Define tools, shared utilities, pagination strategy, error handling
See workflow.md for complete Phase 1 steps.
Phase 2: Implementation (30% of effort)
- Setup Project: Create structure following language-specific guide
- Build Shared Utilities: API helpers, error handlers, formatters BEFORE tools
- Implement Tools: Use Pydantic (Python) or Zod (TypeScript) for validation
- Follow Best Practices: Load language-specific guide for patterns
See workflow.md for complete Phase 2 steps and language guides.
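One shared utility worth building before any tool is a response-size cap. A minimal sketch (the limit and message wording are illustrative assumptions, not values from the MCP spec):

```python
# Shared response-size utility: built once, reused by every tool.
MAX_CHARS = 25_000  # assumed character budget; tune per server

def cap_response(text: str, limit: int = MAX_CHARS) -> str:
    """Truncate oversized output and tell the agent how to get the rest."""
    if len(text) <= limit:
        return text
    return (
        text[:limit]
        + "\n[Output truncated. Narrow your query or request the next "
        "page to see more.]"
    )
```

Note that the truncation notice is itself an actionable message: it tells the agent what to do next instead of silently cutting the output.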
Phase 3: Review and Refine (15% of effort)
- Code Quality Review: Check DRY, composability, consistency, type safety
- Test Build: Verify syntax, imports, build process
- Quality Checklist: Use language-specific checklist
See workflow.md for complete Phase 3 steps.
Phase 4: Create Evaluations (15% of effort)
- Understand Purpose: Evaluations test if agents can answer realistic questions using your tools
- Create 10 Questions: Complex, read-only, independent, verifiable questions
- Verify Answers: Solve yourself to ensure stability and correctness
- Run Evaluation: Use provided scripts to test agent effectiveness
See evaluation.md for complete evaluation guidelines.
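For orientation, one evaluation entry might look like the following sketch. The exact schema lives in evaluation.md; the tag names, repository, and answer shown here are purely illustrative.

```xml
<evaluation>
  <question>
    Which open issue in the acme/widgets repository has the most
    comments, and who opened it?
  </question>
  <answer>Issue #42, opened by jdoe</answer>
</evaluation>
```

A good question is read-only, independent of other questions, and has a single stable answer you have verified yourself.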
Navigation
Core Design and Workflow
- 🎯 Design Principles - Agent-centric design philosophy: workflows over APIs, context optimization, actionable errors, natural task subdivision. Read FIRST before implementation.
- 🔄 Complete Workflow - Detailed 4-phase development process with step-by-step instructions, decision trees, and when to load each reference file.
Universal MCP Guidelines
- 📋 MCP Best Practices - Naming conventions, response formats, pagination, character limits, security, tool annotations, error handling. Applies to all MCP servers.
Language-Specific Implementation
- 🐍 Python Implementation - FastMCP patterns, Pydantic validation, async/await, complete examples, quality checklist. Load during Phase 2 for Python servers.
- ⚡ TypeScript Implementation - MCP SDK patterns, Zod validation, project structure, complete examples, quality checklist. Load during Phase 2 for TypeScript servers.
Evaluation and Testing
- ✅ Evaluation Guide - Creating realistic questions, answer verification, XML format, running evaluations, interpreting results. Load during Phase 4.
Key Reminders
- Research First: Spend 40% of time researching before coding
- Agent-Centric: Design for AI workflows, not API completeness
- Context Efficient: Every token counts - default concise, offer detailed
- Actionable Errors: Guide agents to correct usage
- Shared Utilities: Extract common code - avoid duplication
- Evaluation-Driven: Create evals early, iterate based on feedback
- MCP Servers Block: MCP servers block the terminal when run - never run them directly; use the evaluation harness or tmux
Red Flags - STOP
If you catch yourself:
- "Just wrapping these API endpoints directly"
- "Returning all available data fields"
- "Error message just says what failed" (not how to fix)
- Starting implementation without reading design principles
- Coding before loading MCP protocol documentation
- Creating tools without knowing agent use cases
- Skipping evaluation creation
- Running python server.py directly (it will hang forever)
ALL of these mean: STOP. Return to design principles and workflow.
Integration with Other Skills
- systematic-debugging: Debug MCP server issues methodically
- test-driven-development: Create failing tests before implementation
- verification-before-completion: Verify build succeeds before claiming completion
- defense-in-depth: Add input validation at multiple layers
Real-World Impact
From MCP server development experience:
- Well-designed servers: 80-90% task completion rate by agents
- API wrapper approach: 30-40% task completion rate
- Context-optimized responses: 3x more information in same token budget
- Actionable errors: 60% reduction in agent retry attempts
- Evaluation-driven iteration: 2-3x improvement in agent success rate
Remember: The quality of an MCP server is measured by how well it enables LLMs to accomplish realistic tasks, not by how comprehensively it wraps an API.
Quick Install
/plugin add https://github.com/bobmatnyc/claude-mpm/tree/main/mcp-builder

Copy and paste this command in Claude Code to install this skill.
GitHub Repository
