review-codebase
About
This skill performs a comprehensive, multi-phase review of an entire codebase, analyzing architecture, security, code quality, and UX/accessibility in a single pass. It outputs a prioritized table of findings with severity ratings, which is formatted for direct conversion into GitHub issues. Use it for a deep, coordinated audit rather than for reviewing isolated changes like pull requests.
Quick Install
Claude Code
- npx skills add pjt222/agent-almanac -a claude-code (recommended)
- /plugin add https://github.com/pjt222/agent-almanac
- git clone https://github.com/pjt222/agent-almanac.git ~/.claude/skills/review-codebase
Copy and paste one of these commands to install the skill for Claude Code.
Skill Documentation
Review Codebase
Multi-phase deep codebase review producing severity-rated findings with fix-order recommendations. Unlike review-pull-request (scoped to a diff) or single-domain reviews (security-audit-codebase, review-software-architecture), this skill covers an entire project or subproject across all quality dimensions in one pass.
When to Use
- Whole-project or subproject review (not PR-scoped)
- New codebase onboarding — building a mental model of what exists and what needs attention
- Periodic health checks after sustained development
- Pre-release quality gate across architecture, security, code quality, and UX
- When the output should feed directly into issue creation or sprint planning
Inputs
- Required:
  - target_path: root directory of the codebase or subproject to review
- Optional:
  - scope: which phases to run: full (default), security, architecture, quality, ux
  - output_format: findings (table only), report (narrative), both (default)
  - severity_threshold: minimum severity to include: LOW (default), MEDIUM, HIGH, CRITICAL
Procedure
Step 1: Census
Inventory the codebase to establish scope and identify review targets.
- Count files by language/type: find target_path -type f | sort by extension
- Measure total line counts per language
- Identify test directories and estimate test coverage (files with tests vs files without)
- Check dependency state: lockfiles present, outdated dependencies, known vulnerabilities
- Note build system, CI/CD configuration, and documentation state
- Record the census as the opening section of the report
Expected: A factual inventory: file counts, languages, test presence, dependency health. No judgments yet.
On failure: If the target path is empty or inaccessible, stop and report. If specific subdirectories are inaccessible, note them and continue with what is available.
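The census bullets above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: the function name census and the shape of its return value are assumptions, and it does not exclude vendored or version-control directories such as .git, which a real run would probably want to skip.

```python
from collections import Counter
from pathlib import Path


def census(target_path: str) -> dict:
    """Count files and lines per extension under target_path (hypothetical helper)."""
    root = Path(target_path)
    if not root.is_dir():
        # Mirrors the failure rule: stop and report if the target is unusable.
        raise FileNotFoundError(f"target path is missing or not a directory: {target_path}")

    files = Counter()
    lines = Counter()
    unreadable = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        ext = path.suffix.lower() or "(no extension)"
        files[ext] += 1
        try:
            # errors="replace" lets binary files be counted without crashing.
            with path.open("r", encoding="utf-8", errors="replace") as handle:
                lines[ext] += sum(1 for _ in handle)
        except OSError:
            # Note inaccessible files and continue with what is available.
            unreadable.append(str(path))
    return {"files": dict(files), "lines": dict(lines), "unreadable": unreadable}
```

Calling census("path/to/project") returns per-extension file and line counts plus a list of files that could not be read, which can be pasted into the opening section of the report.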
Step 2: Architecture Review
Assess structural health: coupling, duplication, data flow, and separation of concerns.
- Map the module/directory structure and identify the primary architectural pattern
- Check for code duplication — repeated logic across files, copy-paste patterns
- Assess coupling — how many files must change for a single feature modification
- Evaluate data flow — are there clear boundaries between layers (UI, logic, data)?
- Identify dead code, unused exports, and orphaned files
- Check for consistent patterns — does the codebase follow its own conventions?
- Rate each finding: CRITICAL, HIGH, MEDIUM, or LOW
Expected: A list of architectural findings with severity ratings and file references. Common findings: mode dispatch duplication, missing abstraction layers, circular dependencies.
On failure: If the codebase is too small for a meaningful architecture review (fewer than 5 files), note this and skip to Step 3. Architecture review requires enough code to have structure.
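One way to approach the duplication check is to hash sliding windows of normalized lines and report windows that occur in more than one place. The sketch below assumes sources are already loaded into a dict of file name to text. The function name find_duplicate_blocks and the default window size of 4 are illustrative choices, not something the skill prescribes.

```python
import hashlib
from collections import defaultdict


def find_duplicate_blocks(sources: dict[str, str], window: int = 4) -> list[list[tuple[str, int]]]:
    """Return groups of (file, start_line) whose `window` consecutive
    non-blank, whitespace-stripped lines are identical."""
    seen = defaultdict(list)
    for name, text in sources.items():
        # Keep original 1-based line numbers while dropping blank lines.
        numbered = [
            (lineno, line.strip())
            for lineno, line in enumerate(text.splitlines(), 1)
            if line.strip()
        ]
        for start in range(len(numbered) - window + 1):
            chunk = "\n".join(line for _, line in numbered[start:start + window])
            digest = hashlib.sha1(chunk.encode("utf-8")).hexdigest()
            seen[digest].append((name, numbered[start][0]))
    # Only windows seen at more than one location count as duplication.
    return [locations for locations in seen.values() if len(locations) > 1]
```

Each returned group is a candidate copy-paste pattern. A reviewer still has to decide whether it is one systemic finding or harmless repetition, and what severity it deserves.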
Step 3: Security Audit
Identify security vulnerabilities and defensive coding gaps.
- Scan for injection vectors: HTML injection (innerHTML), SQL injection, command injection
- Check authentication and authorization patterns (if applicable)
- Review error handling — are errors silently swallowed? Do error messages leak internals?
- Audit dependency versions against known CVEs
- Check for hardcoded secrets, API keys, or credentials
- Review Docker/container security: root user, exposed ports, build secrets
- Check localStorage/sessionStorage for sensitive data storage
- Rate each finding: CRITICAL, HIGH, MEDIUM, or LOW
Expected: A list of security findings with severity, affected files, and remediation guidance. CRITICAL findings include injection vulnerabilities and exposed secrets.
On failure: If no security-relevant code exists (pure documentation project), note this and skip to Step 4.
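A first pass over two of the checks above (innerHTML assignments and hardcoded secrets) can be approximated with regular expressions. The patterns below are illustrative heuristics chosen for this sketch, not rules defined by the skill. They will produce both false positives and false negatives, so each hit is only a candidate that still needs a human severity rating.

```python
import re

# Illustrative heuristics only; a real audit needs a broader, tuned rule set.
PATTERNS = {
    # Assignment or append to innerHTML, but not an equality comparison.
    "innerHTML assignment": re.compile(r"\.innerHTML\s*\+?=(?!=)"),
    # A secret-like name assigned a quoted literal of 8+ characters.
    "hardcoded secret": re.compile(
        r"(?i)\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_text(filename: str, text: str) -> list[dict]:
    """Return candidate findings as dicts with file, line, and rule name."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), 1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append({"file": filename, "line": lineno, "rule": rule})
    return findings
```

Dependency CVE checks and container review are better served by dedicated tooling. This sketch only covers pattern matching in source text.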
Step 4: Code Quality
Evaluate maintainability, readability, and defensive coding.
- Identify magic numbers and hardcoded values that should be named constants
- Check for consistent naming conventions across the codebase
- Find missing input validation at system boundaries
- Assess error handling patterns — are they consistent? Do they provide useful messages?
- Check for commented-out code, TODO/FIXME markers, and incomplete implementations
- Review test quality — are tests testing behavior or implementation details?
- Rate each finding: CRITICAL, HIGH, MEDIUM, or LOW
Expected: A list of quality findings focused on maintainability. Common findings: magic numbers, inconsistent patterns, missing guards.
On failure: If the codebase is generated or minified, note this and adjust expectations. Generated code has different quality criteria than hand-written code.
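For a Python target, the magic-number check can be sketched with the standard ast module. The assumptions here are deliberate simplifications: the allow-list of 0, 1, and 2 is arbitrary, and any UPPER_CASE = number assignment is treated as a named constant. Other languages would need their own parser.

```python
import ast

ALLOWED = {0, 1, 2}  # hypothetical allow-list; tune per project


def find_magic_numbers(source: str) -> list[tuple[int, float]]:
    """Return sorted (line, value) pairs for numeric literals that are
    neither allow-listed nor the value of an UPPER_CASE assignment."""
    tree = ast.parse(source)
    named = set()
    for node in ast.walk(tree):
        # Treat `UPPER_CASE = <number>` as a named constant definition.
        if (
            isinstance(node, ast.Assign)
            and all(isinstance(t, ast.Name) and t.id.isupper() for t in node.targets)
            and isinstance(node.value, ast.Constant)
        ):
            named.add(id(node.value))
    hits = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)  # True/False are ints in Python
            and node.value not in ALLOWED
            and id(node) not in named
        ):
            hits.append((node.lineno, node.value))
    return sorted(hits)
```

If the same literal shows up across many files, the Pitfalls section suggests reporting it as one systemic finding rather than one finding per occurrence.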
Step 5: UX and Accessibility (if frontend exists)
Evaluate user experience and accessibility compliance.
- Check ARIA roles, labels, and landmarks on interactive elements
- Verify keyboard navigation — can all interactive elements be reached via Tab?
- Test focus management — does focus move logically when panels open/close?
- Check responsive design — test at common breakpoints (320px, 768px, 1024px)
- Verify color contrast ratios meet WCAG 2.1 AA standards
- Check screen reader compatibility — are dynamic content changes announced?
- Rate each finding: CRITICAL, HIGH, MEDIUM, or LOW
Expected: A list of UX/a11y findings with WCAG references where applicable. If no frontend exists, this step produces "N/A — no frontend code detected."
On failure: If frontend code exists but cannot be rendered (missing build step), audit the source code statically and note that runtime testing was not possible.
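The color contrast check can be done statically from color values in the source. The sketch below follows the WCAG 2.1 definitions of relative luminance and contrast ratio, with the AA thresholds of 4.5:1 for normal text and 3:1 for large text. The function names are illustrative, and the input is assumed to be a six-digit hex color.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.1 formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#rrggbb' or 'rrggbb'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio (lighter + 0.05) / (darker + 0.05), from 1.0 to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """True if the pair meets WCAG 2.1 AA for the given text size class."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)
```

For example, meets_aa("#767676", "#ffffff") passes for normal text while meets_aa("#777777", "#ffffff") does not, which is the kind of near-miss worth citing with its WCAG reference in the findings table.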
Step 6: Findings Synthesis
Compile all findings into a prioritized summary.
- Merge findings from all phases into a single table
- Sort by severity (CRITICAL first, then HIGH, MEDIUM, LOW)
- Within each severity level, group by theme (security, architecture, quality, UX)
- For each finding, include: severity, phase, file(s), one-line description, suggested fix
- Produce a recommended fix order that considers dependencies between fixes
- Summarize: total findings by severity, top 3 priorities, estimated effort level
Expected: A findings table with columns: #, Severity, Phase, File(s), Finding, Fix. A fix-order recommendation that accounts for dependencies (e.g., "refactor architecture before adding tests").
On failure: If no findings were produced, this is itself a finding: either the codebase is exceptionally clean or the review was too shallow. Re-examine at least one phase with deeper inspection.
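The synthesis steps above can be sketched as a filter, a sort, and a topological ordering. The dict keys (severity, phase, files, finding, fix), the pipe-delimited table layout, and the within-severity theme order are assumptions made for this illustration. The skill only specifies the columns and the sort by severity.

```python
from collections import Counter
from graphlib import TopologicalSorter

SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]
PHASE_ORDER = ["security", "architecture", "quality", "ux"]  # assumed theme order


def synthesize(findings: list[dict], severity_threshold: str = "LOW") -> tuple[str, dict]:
    """Filter by threshold, sort by severity then phase, and render a table."""
    cutoff = SEVERITY_ORDER.index(severity_threshold)
    kept = [f for f in findings if SEVERITY_ORDER.index(f["severity"]) <= cutoff]
    kept.sort(
        key=lambda f: (SEVERITY_ORDER.index(f["severity"]), PHASE_ORDER.index(f["phase"]))
    )
    rows = ["| # | Severity | Phase | File(s) | Finding | Fix |", "|---|---|---|---|---|---|"]
    for n, f in enumerate(kept, 1):
        rows.append(
            f"| {n} | {f['severity']} | {f['phase']} | {f['files']} | {f['finding']} | {f['fix']} |"
        )
    counts = Counter(f["severity"] for f in kept)
    return "\n".join(rows), dict(counts)


def fix_order(depends_on: dict[str, set[str]]) -> list[str]:
    """depends_on maps a finding id to the ids that must be fixed first."""
    return list(TopologicalSorter(depends_on).static_order())
```

With depends_on = {"add-tests": {"refactor-dispatch"}, "refactor-dispatch": set()}, fix_order returns the refactor before the tests, matching the dependency example in the expected output. TopologicalSorter raises CycleError if two fixes depend on each other, which is itself worth surfacing to the reviewer.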
Validation
- All requested phases were completed (or explicitly skipped with justification)
- Every finding has a severity rating (CRITICAL/HIGH/MEDIUM/LOW)
- Every finding references at least one file or directory
- The findings table is sorted by severity
- Fix-order recommendations account for dependencies between findings
- The summary includes total counts by severity
- If output_format includes report, narrative sections accompany the table
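Several items on this checklist are mechanical enough to verify in code. The sketch below assumes findings are dicts with severity and files keys, in the order they appear in the table. The function name validate_findings is hypothetical, and the checks on phase completion and narrative sections are left to the reviewer.

```python
SEVERITIES = ("CRITICAL", "HIGH", "MEDIUM", "LOW")


def validate_findings(findings: list[dict]) -> list[str]:
    """Return a list of human-readable problems; empty means the checks passed."""
    problems = []
    for n, finding in enumerate(findings, 1):
        if finding.get("severity") not in SEVERITIES:
            problems.append(f"finding {n}: missing or unknown severity")
        if not finding.get("files"):
            problems.append(f"finding {n}: no file or directory referenced")
    # Compare the table order against severity rank (CRITICAL first).
    ranks = [
        SEVERITIES.index(f["severity"]) for f in findings if f.get("severity") in SEVERITIES
    ]
    if ranks != sorted(ranks):
        problems.append("table is not sorted by severity")
    return problems
```

An empty result covers only the severity, file-reference, and sort-order items. It says nothing about whether the fix order respects dependencies.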
Scaling with Rest
Between review phases, use /rest as a checkpoint — especially between phases 2-5, which require different analytical perspectives. A checkpoint rest (brief, transitional) prevents the momentum of one phase from biasing the next. See the rest skill's "Scaling Rest" section for guidance on checkpoint vs full rest.
Pitfalls
- Boiling the ocean: Reviewing every line of a large codebase produces noise. Focus on high-impact areas: entry points, security boundaries, and architectural seams
- Severity inflation: Not every finding is CRITICAL. Reserve CRITICAL for exploitable vulnerabilities and data-loss risks. Most architectural issues are MEDIUM
- Missing the forest for the trees: Individual code quality issues matter less than systemic patterns. If magic numbers appear in 20 files, that is one architectural finding, not 20 quality findings
- Skipping the census: The census (Step 1) seems bureaucratic but prevents reviewing code that does not exist or missing entire directories
- Phase bleed: Security findings during architecture review, or quality findings during security audit. Note them for the correct phase rather than mixing concerns — it produces a cleaner findings table
Related Skills
- security-audit-codebase: deep-dive security audit when the review-codebase security phase reveals complex vulnerabilities
- review-software-architecture: detailed architecture review for specific subsystems
- review-ux-ui: comprehensive UX/accessibility audit beyond what phase 5 covers
- review-pull-request: diff-scoped review for individual changes
- clean-codebase: implements the code quality fixes identified by this review
- create-github-issues: converts findings table into tracked GitHub issues
