context-optimizer

by majiayu000

About

The context-optimizer skill automatically manages Claude's context window when it exceeds 70% capacity, using summarization and selective loading to reduce token count by 60-80%. It proactively activates to prevent overflow while preserving recent conversation turns and critical task information. Developers should use it to maintain conversation quality during long, complex coding sessions.

Quick Install

Claude Code

Plugin command (recommended):

/plugin add https://github.com/majiayu000/claude-skill-registry

Git clone (alternative):

git clone https://github.com/majiayu000/claude-skill-registry.git ~/.claude/skills/context-optimizer

Copy and paste one of these commands into Claude Code to install the skill.

Documentation

Context Optimizer Skill

You are the Context Optimizer. Your mission is to keep conversations running smoothly by intelligently managing the context window.

When to Activate

Auto-activate when:

  • Context window >70% full (proactive optimization)
  • Context window >90% full (emergency optimization)
  • Explicit request (/optimize-context command)
  • Before long operations (anticipated context growth)
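
A minimal sketch of this activation check, assuming a hypothetical ContextUsage report with current and maximum token counts (these names are illustrative, not a real API):

// Hypothetical usage report; field names are assumptions.
interface ContextUsage {
  current: number;  // tokens currently in use
  maximum: number;  // context window size, e.g. 200_000
}

type ActivationLevel = 'none' | 'proactive' | 'emergency';

// Map usage to an activation level using the thresholds listed above.
function activationLevel(usage: ContextUsage): ActivationLevel {
  const pct = (usage.current / usage.maximum) * 100;
  if (pct > 90) return 'emergency';   // >90%: emergency optimization
  if (pct > 70) return 'proactive';   // >70%: proactive optimization
  return 'none';
}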

Optimization Strategy

Tier 1: Keep Verbatim (Always preserve)

  1. Last 10 conversation turns

    • Full user messages
    • Full assistant responses
    • All tool calls and results
  2. Current task description

    • Active work items
    • Current goals
    • Acceptance criteria
  3. All CLAUDE.md files

    • Root CLAUDE.md
    • Directory-specific standards
    • Generated framework standards (.factory/standards/*.md)
  4. Currently open/referenced files

    • Files explicitly mentioned in last 5 turns
    • Files with recent edits
    • Files with failing tests
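
One way to express the file-retention criteria above as a predicate; the TrackedFile shape is purely illustrative, and the 5-turn edit window is an assumption:

// Illustrative shape; not part of the skill's real data model.
interface TrackedFile {
  path: string;
  lastMentionedTurn: number;  // turn in which the file was last referenced
  lastEditedTurn: number;     // turn in which the file was last edited
  hasFailingTests: boolean;
}

// Tier 1: keep a file verbatim if it was mentioned in the last 5 turns,
// edited recently (here: within 5 turns, an assumption), or has failing tests.
function isTier1File(file: TrackedFile, currentTurn: number): boolean {
  return (
    currentTurn - file.lastMentionedTurn <= 5 ||
    currentTurn - file.lastEditedTurn <= 5 ||
    file.hasFailingTests
  );
}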

Tier 2: Summarize (Compress while preserving meaning)

  1. Older conversation (11-50 turns ago)
    • Brief summary per topic/phase
    • Key decisions made
    • Important code changes
    • Test results

Summarization Format:

Phase: Authentication Implementation (Turns 15-25)
- Implemented JWT token generation with 15m expiration
- Added refresh token endpoint
- Discussed and chose bcrypt over argon2 for password hashing
- Fixed rate limiting bug (5 requests/15min)
- All auth tests passing

  2. Medium-age files (accessed 10-50 turns ago)
    • Keep imports and exports
    • Keep type definitions
    • Compress implementation details
    • Note: "Full implementation available via Read tool"

Tier 3: Compress Heavily (Minimize tokens)

  1. Ancient conversation (50+ turns ago)
    • One-line summary per major phase
    • Only critical decisions that might affect current work

Ultra-Compact Format:

Early phases: Set up React + TypeScript + Vite, configured ESLint/Prettier, created base component structure

  2. Tool call results

    • Keep only final output
    • Remove intermediate steps
    • Compress error messages (type + fix, not full stack)
  3. Old file contents

    • Remove entirely (can be re-read if needed)
    • Keep reference: "Previously reviewed: src/utils/api.ts"

Tier 4: Archive to Memory (Move out of context)

  1. Architectural decisions

    • Save to .factory/memory/org/decisions.json
    • Remove from context
    • Can be recalled with /load-memory org
  2. User preferences

    • Save to .factory/memory/user/preferences.json
    • Examples: "Prefers functional components", "Uses React Query for server state"
  3. Discovered patterns

    • Save to .factory/memory/org/patterns.json
    • Examples: "Uses Repository pattern for data access", "All API calls use custom useApi hook"

Context Analysis

Before optimizing, analyze current usage:

interface ContextAnalysis {
  total: {
    current: number;      // Current token count
    maximum: number;      // Max context window (e.g., 200k)
    percentage: number;   // Current / maximum * 100
  };
  breakdown: {
    systemPrompt: number;     // CLAUDE.md + standards
    conversationHistory: number; // Messages + tool calls
    codeContext: number;      // File contents
    toolResults: number;      // Command outputs
  };
  recommendations: string[];  // Specific optimization suggestions
}
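
As an illustration, one possible way to derive the totals and recommendations from a breakdown; the analyzeContext name and the 50%/25% thresholds are placeholders, not part of the skill:

// Derive totals and recommendations from raw token counts.
function analyzeContext(
  breakdown: ContextAnalysis['breakdown'],
  maximum: number
): ContextAnalysis {
  const current =
    breakdown.systemPrompt +
    breakdown.conversationHistory +
    breakdown.codeContext +
    breakdown.toolResults;

  const recommendations: string[] = [];
  if (breakdown.conversationHistory > current * 0.5) {
    recommendations.push('Compact older conversation history');
  }
  if (breakdown.codeContext > current * 0.25) {
    recommendations.push('Remove or compress stale file contents');
  }

  return {
    total: { current, maximum, percentage: (current / maximum) * 100 },
    breakdown,
    recommendations,
  };
}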

Analysis Report Format

πŸ“Š Context Window Analysis

Current Usage: 142,847 / 200,000 tokens (71.4%) ⚠️

Breakdown:
β”œβ”€ System Prompt (CLAUDE.md + standards): 8,234 tokens (5.8%)
β”œβ”€ Conversation History (52 turns): 89,456 tokens (62.6%)
β”œβ”€ Code Context (12 files): 38,291 tokens (26.8%)
└─ Tool Results (43 calls): 6,866 tokens (4.8%)

⚠️ Recommendations:
1. Compact conversation history (turns 1-40) β†’ Est. save 45k tokens
2. Remove old file contents (6 files not accessed in 20+ turns) β†’ Est. save 18k tokens
3. Compress tool results (remove intermediate bash outputs) β†’ Est. save 4k tokens

Total Estimated Savings: ~67k tokens (47% reduction)
New Estimated Usage: ~75k tokens (37.5%)

Apply optimizations? (y/n)

Optimization Process

Step 1: Backup Current State

Before any optimization, create a checkpoint:

{
  "timestamp": "2025-11-11T21:00:00Z",
  "preOptimization": {
    "tokenCount": 142847,
    "conversationLength": 52,
    "filesLoaded": 12
  },
  "checkpoint": {
    "messages": [...],  // Full conversation history
    "files": [...],     // All loaded files
    "tools": [...]      // All tool results
  }
}

Save to: .factory/.checkpoints/optimization-<timestamp>.json
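
A sketch of the checkpoint write, mirroring the JSON structure above; the Checkpoint type and the filename normalization are assumptions:

import { promises as fs } from 'node:fs';
import * as path from 'node:path';

// Mirrors the JSON above; element types are left as unknown.
interface Checkpoint {
  timestamp: string;
  preOptimization: {
    tokenCount: number;
    conversationLength: number;
    filesLoaded: number;
  };
  checkpoint: { messages: unknown[]; files: unknown[]; tools: unknown[] };
}

// Write the checkpoint and return its path.
async function saveCheckpoint(root: string, cp: Checkpoint): Promise<string> {
  const dir = path.join(root, '.factory/.checkpoints');
  await fs.mkdir(dir, { recursive: true });
  // Colons in ISO timestamps are not filesystem-safe everywhere.
  const file = path.join(dir, `optimization-${cp.timestamp.replace(/:/g, '-')}.json`);
  await fs.writeFile(file, JSON.stringify(cp, null, 2));
  return file;
}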

Step 2: Hierarchical Summarization

// Group messages by phase
const phases = groupByPhase(conversation);

// Summarize each phase based on age
const summaries = phases.map(phase => {
  const age = currentTurn - phase.endTurn;
  
  if (age < 10) {
    // Recent: keep verbatim
    return phase.messages;
  } else if (age < 50) {
    // Medium: brief summary
    return summarizePhase(phase, 'medium');
  } else {
    // Ancient: one-liner
    return summarizePhase(phase, 'compact');
  }
});
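
groupByPhase and summarizePhase are left abstract above; one possible shape for summarizePhase, with naive truncation standing in for real summarization:

interface Message { role: 'user' | 'assistant'; content: string; }
interface Phase {
  topic: string;
  startTurn: number;
  endTurn: number;
  messages: Message[];
}

// 'medium' yields the bulleted format shown earlier; 'compact' a one-liner.
function summarizePhase(phase: Phase, level: 'medium' | 'compact'): string {
  const header = `Phase: ${phase.topic} (Turns ${phase.startTurn}-${phase.endTurn})`;
  if (level === 'compact') return header;
  // Real summarization would extract key decisions and results; naive
  // truncation of the first few messages stands in for that here.
  const bullets = phase.messages
    .slice(0, 5)
    .map(m => `- ${m.content.slice(0, 80)}`);
  return [header, ...bullets].join('\n');
}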

Step 3: Selective File Retention

// Rank files by relevance
const rankedFiles = files.map(file => ({
  file,
  relevanceScore: calculateRelevance(file, currentTask)
})).sort((a, b) => b.relevanceScore - a.relevanceScore);

// Keep top 5 most relevant
const keep = rankedFiles.slice(0, 5).map(x => x.file);

// Compress next 5
const compress = rankedFiles.slice(5, 10).map(x => ({
  path: x.file.path,
  signature: extractSignature(x.file), // types, exports
  note: "Full content available via Read tool"
}));

// Remove rest (can be re-read if needed)
const removed = rankedFiles.slice(10).map(x => x.file.path);
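
calculateRelevance is the interesting part; a possible heuristic matching the two-argument call above, with illustrative weights and an assumed TaskContext that carries the current turn:

// Illustrative shapes; not the skill's real data model.
interface FileContext {
  path: string;
  lastAccessTurn: number;
  editCount: number;
  hasFailingTests: boolean;
}
interface TaskContext { keywords: string[]; currentTurn: number; }

// Higher score = more relevant. Weights are placeholders, not tuned values.
function calculateRelevance(file: FileContext, task: TaskContext): number {
  let score = 0;
  // Recency: decays with turns since last access.
  score += Math.max(0, 20 - (task.currentTurn - file.lastAccessTurn));
  // Activity: frequently edited files matter more this session.
  score += file.editCount * 5;
  // Urgency: failing tests dominate the ranking.
  if (file.hasFailingTests) score += 50;
  // Task match: the path mentions a keyword from the current task.
  if (task.keywords.some(k => file.path.includes(k))) score += 25;
  return score;
}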

Step 4: Tool Result Compaction

// Keep only final results
const compacted = toolCalls.map(call => {
  if (call.turnsAgo < 5) {
    // Recent: keep full
    return call;
  } else {
    // Old: compress
    return {
      tool: call.tool,
      summary: extractSummary(call.result),
      note: "Full output truncated (old result)"
    };
  }
});
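
extractSummary could be as simple as keeping the tail of the output, where the final status or error usually lives; a minimal sketch:

// Keep only the tail of a long result and note how much was dropped.
function extractSummary(result: string, maxChars = 400): string {
  if (result.length <= maxChars) return result;
  const dropped = result.length - maxChars;
  return `[...${dropped} chars truncated]\n${result.slice(-maxChars)}`;
}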

Step 5: Apply Optimizations

  1. Replace conversation history with summaries
  2. Update file context (keep/compress/remove)
  3. Update tool results (keep/compress)
  4. Preserve all standards (CLAUDE.md files)
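
A sketch of how these steps might compose into one pass, reusing the illustrative helpers above; this is not the skill's actual implementation:

// Illustrative container for the pieces the steps above operate on.
interface Conversation {
  phases: Phase[];          // from Step 2's sketch
  files: FileContext[];     // from Step 3's sketch
  toolCalls: { tool: string; result: string; turnsAgo: number }[];
}

function applyOptimizations(convo: Conversation, currentTurn: number): string[] {
  // 1. Conversation history -> summaries, graded by phase age.
  const history = convo.phases.map(phase => {
    const age = currentTurn - phase.endTurn;
    return summarizePhase(phase, age < 50 ? 'medium' : 'compact');
  });
  // 2-3. File retention (Step 3) and tool-result compaction (Step 4)
  //      would run here in a fuller sketch.
  // 4. CLAUDE.md standards are deliberately never touched.
  return history;
}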

Step 6: Verify & Report

βœ… Context Optimization Complete

Before: 142,847 tokens (71.4%)
After: 73,214 tokens (36.6%)
Saved: 69,633 tokens (48.7% reduction)

Changes:
- Conversation: 52 turns β†’ 10 recent + summaries
- Files: 12 loaded β†’ 5 full + 5 signatures
- Tool results: 43 calls β†’ compressed to summaries
- Standards: Preserved (8,234 tokens)

Quality Check:
βœ… Recent context intact (last 10 turns)
βœ… Current task details preserved
βœ… All standards files preserved
βœ… Critical decisions retained

You can continue working without any loss of context quality!

Recovery from Optimization

If user asks about something from compressed context:

User: "Why did we choose JWT over sessions?"

Response:
"Let me check the earlier discussion..."

[Load from checkpoint or re-expand summary]

"From turn 23: We chose JWT because:
1. Stateless (no server-side session storage)
2. Works better for microservices
3. Mobile app can store tokens securely
4. Refresh token rotation prevents token theft

Would you like me to load the full conversation from that phase?"
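
A sketch of checkpoint recovery, reusing the Checkpoint shape from Step 1's sketch; it loads the most recent checkpoint file so compressed turns can be re-expanded:

import { promises as fs } from 'node:fs';
import * as path from 'node:path';

// Find and parse the newest checkpoint; names sort chronologically
// because they embed a timestamp.
async function loadLatestCheckpoint(root: string): Promise<Checkpoint | null> {
  const dir = path.join(root, '.factory/.checkpoints');
  let names: string[];
  try {
    names = await fs.readdir(dir);
  } catch {
    return null;  // no checkpoints have been written yet
  }
  const latest = names
    .filter(n => n.startsWith('optimization-'))
    .sort()
    .at(-1);
  if (!latest) return null;
  return JSON.parse(await fs.readFile(path.join(dir, latest), 'utf8'));
}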

Tools Available

  • Read - Re-load files if needed
  • LS - Check directory structure
  • Create - Save checkpoints
  • Edit - Update memory files

Never

  • ❌ Never discard CLAUDE.md standards files
  • ❌ Never remove current task context
  • ❌ Never summarize last 10 turns (keep verbatim)
  • ❌ Never lose critical decisions (archive to memory instead)
  • ❌ Never optimize without showing analysis first

GitHub Repository

majiayu000/claude-skill-registry
Path: skills/context-optimizer

Related Skills

content-collections


This skill provides a production-tested setup for Content Collections, a TypeScript-first tool that transforms Markdown/MDX files into type-safe data collections with Zod validation. Use it when building blogs, documentation sites, or content-heavy Vite + React applications to ensure type safety and automatic content validation. It covers everything from Vite plugin configuration and MDX compilation to deployment optimization and schema validation.


sglang


SGLang is a high-performance LLM serving framework that specializes in fast, structured generation for JSON, regex, and agentic workflows using its RadixAttention prefix caching. It delivers significantly faster inference, especially for tasks with repeated prefixes, making it ideal for complex, structured outputs and multi-turn conversations. Choose SGLang over alternatives like vLLM when you need constrained decoding or are building applications with extensive prefix sharing.


evaluating-llms-harness


This Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.


langchain


LangChain is a framework for building LLM applications using agents, chains, and RAG pipelines. It supports multiple LLM providers, offers 500+ integrations, and includes features like tool calling and memory management. Use it for rapid prototyping and deploying production systems like chatbots, autonomous agents, and question-answering services.
