
AgentDB Performance Optimization

proffesor-for-testing

About

This skill optimizes AgentDB performance by implementing quantization for 4-32x memory reduction and HNSW indexing for up to 150x faster vector searches. It also provides caching strategies and batch operations to scale databases to millions of vectors. Use it when you need to improve search speed, reduce memory usage, or handle large-scale vector data efficiently.

Quick Install

Claude Code (recommended)

Plugin command:
/plugin add https://github.com/proffesor-for-testing/agentic-qe

Git clone (alternative):
git clone https://github.com/proffesor-for-testing/agentic-qe.git ~/.claude/skills/"AgentDB Performance Optimization"

Copy and paste the command into Claude Code to install this skill.

Skill Documentation

AgentDB Performance Optimization

What This Skill Does

Provides comprehensive performance optimization techniques for AgentDB vector databases. Achieve 150x-12,500x performance improvements through quantization, HNSW indexing, caching strategies, and batch operations. Reduce memory usage by 4-32x while maintaining accuracy.

Performance: <100µs vector search, <1ms pattern retrieval, 2ms batch insert for 100 vectors.

Prerequisites

  • Node.js 18+
  • AgentDB v1.0.7+ (via agentic-flow)
  • Existing AgentDB database or application
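
A quick way to verify the prerequisites from a shell. The npm package name is inferred from the import used in the examples below, so adjust it if your setup differs:

# Check Node.js version and install the adapter package (package name inferred from the examples below)
node --version            # expect v18.x or newer
npm install agentic-flow  # provides the createAgentDBAdapter API used throughout this guide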

Quick Start

Run Performance Benchmarks

# Comprehensive performance benchmarking
npx agentdb@latest benchmark

# Results show:
# ✅ Pattern Search: 150x faster (100µs vs 15ms)
# ✅ Batch Insert: 500x faster (2ms vs 1s for 100 vectors)
# ✅ Large-scale Query: 12,500x faster (8ms vs 100s at 1M vectors)
# ✅ Memory Efficiency: 4-32x reduction with quantization

Enable Optimizations

import { createAgentDBAdapter } from 'agentic-flow/reasoningbank';

// Optimized configuration
const adapter = await createAgentDBAdapter({
  dbPath: '.agentdb/optimized.db',
  quantizationType: 'binary',   // 32x memory reduction
  cacheSize: 1000,               // In-memory cache
  enableLearning: true,
  enableReasoning: true,
});

Quantization Strategies

1. Binary Quantization (32x Reduction)

Best For: Large-scale deployments (1M+ vectors), memory-constrained environments
Trade-off: ~2-5% accuracy loss, 32x memory reduction, 10x faster search

const adapter = await createAgentDBAdapter({
  quantizationType: 'binary',
  // 768-dim float32 (3072 bytes) → 96 bytes binary
  // 1M vectors: 3GB → 96MB
});

Use Cases:

  • Mobile/edge deployment
  • Large-scale vector storage (millions of vectors)
  • Real-time search with memory constraints

Performance:

  • Memory: 32x smaller
  • Search Speed: 10x faster (bit operations)
  • Accuracy: 95-98% of original
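
The memory figures above follow directly from the vector layout; a quick back-of-the-envelope check (illustrative arithmetic only, not an AgentDB API call):

// Illustrative arithmetic behind the 32x figure for 1M 768-dim vectors.
const dims = 768;
const vectorCount = 1_000_000;
const float32Bytes = dims * 4 * vectorCount;   // 4 bytes per dimension ≈ 3.07 GB
const binaryBytes = (dims / 8) * vectorCount;  // 1 bit per dimension  ≈ 96 MB
console.log(`float32: ${(float32Bytes / 1e9).toFixed(2)} GB`);
console.log(`binary:  ${(binaryBytes / 1e6).toFixed(0)} MB`);
console.log(`reduction: ${float32Bytes / binaryBytes}x`);      // 32x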

2. Scalar Quantization (4x Reduction)

Best For: Balanced performance/accuracy, moderate datasets
Trade-off: ~1-2% accuracy loss, 4x memory reduction, 3x faster search

const adapter = await createAgentDBAdapter({
  quantizationType: 'scalar',
  // 768-dim float32 (3072 bytes) → 768 bytes (uint8)
  // 1M vectors: 3GB → 768MB
});

Use Cases:

  • Production applications requiring high accuracy
  • Medium-scale deployments (10K-1M vectors)
  • General-purpose optimization

Performance:

  • Memory: 4x smaller
  • Search Speed: 3x faster
  • Accuracy: 98-99% of original

3. Product Quantization (8-16x Reduction)

Best For: High-dimensional vectors, balanced compression
Trade-off: ~3-7% accuracy loss, 8-16x memory reduction, 5x faster search

const adapter = await createAgentDBAdapter({
  quantizationType: 'product',
  // 768-dim float32 (3072 bytes) → 192-384 bytes
  // 1M vectors: 3GB → 192-384MB
});

Use Cases:

  • High-dimensional embeddings (>512 dims)
  • Image/video embeddings
  • Large-scale similarity search

Performance:

  • Memory: 8-16x smaller
  • Search Speed: 5x faster
  • Accuracy: 93-97% of original

4. No Quantization (Full Precision)

Best For: Maximum accuracy, small datasets
Trade-off: No accuracy loss, full memory usage

const adapter = await createAgentDBAdapter({
  quantizationType: 'none',
  // Full float32 precision
});

HNSW Indexing

Hierarchical Navigable Small World - O(log n) search complexity

Automatic HNSW

AgentDB automatically builds HNSW indices:

const adapter = await createAgentDBAdapter({
  dbPath: '.agentdb/vectors.db',
  // HNSW automatically enabled
});

// Search with HNSW (100µs vs 15ms linear scan)
const results = await adapter.retrieveWithReasoning(queryEmbedding, {
  k: 10,
});

HNSW Parameters

// Advanced HNSW configuration
const adapter = await createAgentDBAdapter({
  dbPath: '.agentdb/vectors.db',
  hnswM: 16,              // Connections per layer (default: 16)
  hnswEfConstruction: 200, // Build quality (default: 200)
  hnswEfSearch: 100,       // Search quality (default: 100)
});

Parameter Tuning:

  • M (connections): Higher = better recall, more memory
    • Small datasets (<10K): M = 8
    • Medium datasets (10K-100K): M = 16
    • Large datasets (>100K): M = 32
  • efConstruction: Higher = better index quality, slower build
    • Fast build: 100
    • Balanced: 200 (default)
    • High quality: 400
  • efSearch: Higher = better recall, slower search
    • Fast search: 50
    • Balanced: 100 (default)
    • High recall: 200
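
One way to apply this guidance is a small helper that picks parameters from the dataset size. A sketch assuming the option names shown above and the createAgentDBAdapter import from the Quick Start:

// Illustrative helper that encodes the tuning guidance above (thresholds from this guide).
function hnswParamsFor(vectorCount: number) {
  if (vectorCount < 10_000) {
    return { hnswM: 8, hnswEfConstruction: 100, hnswEfSearch: 50 };    // small dataset, fast build
  }
  if (vectorCount < 100_000) {
    return { hnswM: 16, hnswEfConstruction: 200, hnswEfSearch: 100 };  // balanced defaults
  }
  return { hnswM: 32, hnswEfConstruction: 400, hnswEfSearch: 200 };    // large dataset, high recall
}

const adapter = await createAgentDBAdapter({
  dbPath: '.agentdb/vectors.db',
  ...hnswParamsFor(250_000),
});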

Caching Strategies

In-Memory Pattern Cache

const adapter = await createAgentDBAdapter({
  cacheSize: 1000,  // Cache 1000 most-used patterns
});

// First retrieval: ~2ms (database)
// Subsequent: <1ms (cache hit)
const result = await adapter.retrieveWithReasoning(queryEmbedding, {
  k: 10,
});

Cache Tuning:

  • Small applications: 100-500 patterns
  • Medium applications: 500-2000 patterns
  • Large applications: 2000-5000 patterns

LRU Cache Behavior

// Cache automatically evicts least-recently-used patterns
// Most frequently accessed patterns stay in cache

// Monitor cache performance
const stats = await adapter.getStats();
console.log('Cache Hit Rate:', stats.cacheHitRate);
// Aim for >80% hit rate
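
For intuition, the eviction behavior described above can be sketched in a few lines. This is a generic LRU illustration, not AgentDB's internal cache implementation:

// Generic LRU sketch for intuition only; AgentDB's internal cache may differ.
class SimpleLRU<K, V> {
  private map = new Map<K, V>();
  constructor(private capacity: number) {}

  get(key: K): V | undefined {
    const value = this.map.get(key);
    if (value === undefined) return undefined;
    this.map.delete(key);        // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // Map preserves insertion order, so the first key is the least recently used.
      this.map.delete(this.map.keys().next().value as K);
    }
  }
}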

Batch Operations

Batch Insert (500x Faster)

// ❌ SLOW: Individual inserts
for (const doc of documents) {
  await adapter.insertPattern({ /* ... */ });  // 1s for 100 docs
}

// ✅ FAST: Batch insert
const patterns = documents.map(doc => ({
  id: '',
  type: 'document',
  domain: 'knowledge',
  pattern_data: JSON.stringify({
    embedding: doc.embedding,
    text: doc.text,
  }),
  confidence: 1.0,
  usage_count: 0,
  success_count: 0,
  created_at: Date.now(),
  last_used: Date.now(),
}));

// Issue the inserts concurrently instead of awaiting each one (2ms for 100 docs)
await Promise.all(patterns.map(pattern => adapter.insertPattern(pattern)));

Batch Retrieval

// Retrieve multiple queries efficiently
const queries = [queryEmbedding1, queryEmbedding2, queryEmbedding3];

// Parallel retrieval
const results = await Promise.all(
  queries.map(q => adapter.retrieveWithReasoning(q, { k: 5 }))
);
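
For very large query sets, an unbounded Promise.all may issue more concurrent retrievals than you want. A sketch of chunked retrieval using only the retrieveWithReasoning call shown above (the chunk size is an arbitrary assumption):

// Bound concurrency by retrieving in chunks (chunk size is an assumption; tune for your workload).
async function retrieveInChunks(queries: number[][], chunkSize = 16) {
  const results: any[] = [];
  for (let i = 0; i < queries.length; i += chunkSize) {
    const chunk = queries.slice(i, i + chunkSize);
    results.push(...await Promise.all(
      chunk.map(q => adapter.retrieveWithReasoning(q, { k: 5 }))
    ));
  }
  return results;
}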

Memory Optimization

Automatic Consolidation

// Enable automatic pattern consolidation
const result = await adapter.retrieveWithReasoning(queryEmbedding, {
  domain: 'documents',
  optimizeMemory: true,  // Consolidate similar patterns
  k: 10,
});

console.log('Optimizations:', result.optimizations);
// {
//   consolidated: 15,  // Merged 15 similar patterns
//   pruned: 3,         // Removed 3 low-quality patterns
//   improved_quality: 0.12  // 12% quality improvement
// }

Manual Optimization

// Manually trigger optimization and compare statistics
const before = await adapter.getStats();
await adapter.optimize();
const after = await adapter.getStats();

console.log('Before:', before.totalPatterns);
console.log('After:', after.totalPatterns);  // Typically reduced by ~10-30%

Pruning Strategies

// Prune low-confidence patterns
await adapter.prune({
  minConfidence: 0.5,     // Remove confidence < 0.5
  minUsageCount: 2,       // Remove usage_count < 2
  maxAge: 30 * 24 * 3600, // Remove >30 days old
});
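
The pruning, optimization, and stats calls above combine naturally into a periodic maintenance pass. A sketch using only the calls already shown in this guide (the thresholds mirror the example above):

// Periodic maintenance sketch built from the calls shown above.
async function runMaintenance() {
  const before = await adapter.getStats();

  await adapter.prune({
    minConfidence: 0.5,
    minUsageCount: 2,
    maxAge: 30 * 24 * 3600,
  });
  await adapter.optimize();

  const after = await adapter.getStats();
  console.log(`Patterns: ${before.totalPatterns} → ${after.totalPatterns}`);
}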

Performance Monitoring

Database Statistics

# Get comprehensive stats
npx agentdb@latest stats .agentdb/vectors.db

# Output:
# Total Patterns: 125,430
# Database Size: 47.2 MB (with binary quantization)
# Avg Confidence: 0.87
# Domains: 15
# Cache Hit Rate: 84%
# Index Type: HNSW

Runtime Metrics

const stats = await adapter.getStats();

console.log('Performance Metrics:');
console.log('Total Patterns:', stats.totalPatterns);
console.log('Database Size:', stats.dbSize);
console.log('Avg Confidence:', stats.avgConfidence);
console.log('Cache Hit Rate:', stats.cacheHitRate);
console.log('Search Latency (avg):', stats.avgSearchLatency);
console.log('Insert Latency (avg):', stats.avgInsertLatency);
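
These metrics can feed a simple periodic health check. A sketch using only getStats(), with the interval and the 80% threshold (from the caching section) as assumptions:

// Warn when the cache hit rate drops below the ~80% target mentioned earlier.
// Interval and threshold are assumptions; adjust for your deployment.
setInterval(async () => {
  const stats = await adapter.getStats();
  if (stats.cacheHitRate < 0.8) {
    console.warn(`Cache hit rate ${stats.cacheHitRate}; consider increasing cacheSize.`);
  }
}, 60_000);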

Optimization Recipes

Recipe 1: Maximum Speed (Sacrifice Accuracy)

const adapter = await createAgentDBAdapter({
  quantizationType: 'binary',  // 32x memory reduction
  cacheSize: 5000,             // Large cache
  hnswM: 8,                    // Fewer connections = faster
  hnswEfSearch: 50,            // Low search quality = faster
});

// Expected: <50µs search, 90-95% accuracy

Recipe 2: Balanced Performance

const adapter = await createAgentDBAdapter({
  quantizationType: 'scalar',  // 4x memory reduction
  cacheSize: 1000,             // Standard cache
  hnswM: 16,                   // Balanced connections
  hnswEfSearch: 100,           // Balanced quality
});

// Expected: <100µs search, 98-99% accuracy

Recipe 3: Maximum Accuracy

const adapter = await createAgentDBAdapter({
  quantizationType: 'none',    // No quantization
  cacheSize: 2000,             // Large cache
  hnswM: 32,                   // Many connections
  hnswEfSearch: 200,           // High search quality
});

// Expected: <200µs search, 100% accuracy

Recipe 4: Memory-Constrained (Mobile/Edge)

const adapter = await createAgentDBAdapter({
  quantizationType: 'binary',  // 32x memory reduction
  cacheSize: 100,              // Small cache
  hnswM: 8,                    // Minimal connections
});

// Expected: <100µs search, ~10MB for 100K vectors

Scaling Strategies

Small Scale (<10K vectors)

const adapter = await createAgentDBAdapter({
  quantizationType: 'none',    // Full precision
  cacheSize: 500,
  hnswM: 8,
});

Medium Scale (10K-100K vectors)

const adapter = await createAgentDBAdapter({
  quantizationType: 'scalar',  // 4x reduction
  cacheSize: 1000,
  hnswM: 16,
});

Large Scale (100K-1M vectors)

const adapter = await createAgentDBAdapter({
  quantizationType: 'binary',  // 32x reduction
  cacheSize: 2000,
  hnswM: 32,
});

Massive Scale (>1M vectors)

const adapter = await createAgentDBAdapter({
  quantizationType: 'product',  // 8-16x reduction
  cacheSize: 5000,
  hnswM: 48,
  hnswEfConstruction: 400,
});
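
The four tiers above can be folded into one factory helper. A sketch that reuses the exact settings listed in this section (the tier boundaries follow the headings):

// Illustrative: pick a configuration tier from the scaling guidance above.
function configForScale(vectorCount: number) {
  if (vectorCount < 10_000)
    return { quantizationType: 'none', cacheSize: 500, hnswM: 8 };
  if (vectorCount < 100_000)
    return { quantizationType: 'scalar', cacheSize: 1000, hnswM: 16 };
  if (vectorCount < 1_000_000)
    return { quantizationType: 'binary', cacheSize: 2000, hnswM: 32 };
  return { quantizationType: 'product', cacheSize: 5000, hnswM: 48, hnswEfConstruction: 400 };
}

const adapter = await createAgentDBAdapter({
  dbPath: '.agentdb/vectors.db',
  ...configForScale(2_000_000),
});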

Troubleshooting

Issue: High memory usage

# Check database size
npx agentdb@latest stats .agentdb/vectors.db

# Enable quantization
# Use 'binary' for 32x reduction

Issue: Slow search performance

// Increase cache size
const adapter = await createAgentDBAdapter({
  cacheSize: 2000,  // Increase from 1000
});

// Reduce search quality (faster)
const result = await adapter.retrieveWithReasoning(queryEmbedding, {
  k: 5,  // Reduce from 10
});

Issue: Low accuracy

// Disable or use lighter quantization
const adapter = await createAgentDBAdapter({
  quantizationType: 'scalar',  // Instead of 'binary'
  hnswEfSearch: 200,           // Higher search quality
});

Performance Benchmarks

Test System: AMD Ryzen 9 5950X, 64GB RAM

Operation            Vector Count   No Optimization   Optimized   Improvement
Search               10K            15ms              100µs       150x
Search               100K           150ms             120µs       1,250x
Search               1M             100s              8ms         12,500x
Batch Insert (100)   -              1s                2ms         500x
Memory Usage         1M             3GB               96MB        32x (binary)

Category: Performance / Optimization
Difficulty: Intermediate
Estimated Time: 20-30 minutes

GitHub Repository

proffesor-for-testing/agentic-qe
Path: .claude/skills/agentdb-optimization
Tags: agentic-qe, agentics-foundation, agents, quality-engineering
