database-query-optimization

aj-geddes

Updated Today

15 views

Otherdata

About

This skill helps developers optimize database queries by analyzing execution plans, creating proper indexes, and rewriting inefficient queries. Use it when facing slow response times, high database load, or performance regressions. It provides actionable guidance to reduce query times and improve overall application performance.

Documentation

Database Query Optimization

Overview

Slow database queries are a common performance bottleneck. Optimization through indexing, efficient queries, and caching dramatically improves application performance.

When to Use

Slow response times
High database CPU usage
Performance regression
New feature deployment
Regular maintenance

Instructions

1. Query Analysis

-- Analyze query performance

EXPLAIN ANALYZE
SELECT users.id, users.name, COUNT(orders.id) as order_count
FROM users
LEFT JOIN orders ON users.id = orders.user_id
WHERE users.created_at > '2024-01-01'
GROUP BY users.id, users.name
ORDER BY order_count DESC;

-- Results show:
-- - Seq Scan (slow) vs Index Scan (fast)
-- - Rows: actual vs planned (high variance = bad)
-- - Execution time (milliseconds)

-- Key metrics:
-- - Sequential Scan: Full table read (slow)
-- - Index Scan: Uses index (fast)
-- - Nested Loop: Joins with loops
-- - Sort: In-memory or disk sort

2. Indexing Strategy

Index Types:

Single Column:
  CREATE INDEX idx_users_email ON users(email);
  Use: WHERE email = ?
  Size: Small, quick to create

Composite Index:
  CREATE INDEX idx_orders_user_date
    ON orders(user_id, created_at);
  Use: WHERE user_id = ? AND created_at > ?
  Order: Most selective first

Covering Index:
  CREATE INDEX idx_orders_covering
    ON orders(user_id) INCLUDE (total_amount);
  Benefit: No table lookup needed

Partial Index:
  CREATE INDEX idx_active_users
    ON users(id) WHERE status = 'active';
  Benefit: Smaller, faster

Full Text:
  CREATE FULLTEXT INDEX idx_search
    ON articles(title, content);
  Use: Text search queries

---

Index Rules:

- Create indexes for WHERE conditions
- Create indexes for JOIN columns
- Create indexes for ORDER BY
- Don't over-index (slows writes)
- Monitor index usage
- Remove unused indexes
- Update statistics regularly
- Partial indexes for filtered queries

Missing Index Query:
SELECT object_name, equality_columns
FROM sys.dm_db_missing_index_details
ORDER BY equality_columns;

3. Query Optimization Techniques

# Common optimization patterns

# BEFORE (N+1 queries)
for user in users:
    orders = db.query("SELECT * FROM orders WHERE user_id = ?", user.id)
    # 1 + N queries

# AFTER (single query with JOIN)
orders = db.query("""
  SELECT u.*, o.* FROM users u
  LEFT JOIN orders o ON u.id = o.user_id
  WHERE u.created_at > ?
""", date_threshold)

# BEFORE (inefficient WHERE)
SELECT * FROM users
WHERE LOWER(email) = LOWER('[email protected]')
# Can't use index (function used)

# AFTER (index-friendly)
SELECT * FROM users
WHERE email = '[email protected]'
# Case-insensitive constraint + index

# BEFORE (wildcard at start)
SELECT * FROM users WHERE email LIKE '%example.com'
# Can't use index (wildcard at start)

# AFTER (wildcard at end)
SELECT * FROM users WHERE email LIKE 'user%'
# Can use index

# BEFORE (slow aggregation)
SELECT user_id, COUNT(*) as cnt
FROM orders
GROUP BY user_id
ORDER BY cnt DESC
LIMIT 10

# AFTER (pre-aggregated)
SELECT user_id, order_count
FROM user_order_stats
WHERE order_count IS NOT NULL
ORDER BY order_count DESC
LIMIT 10

4. Optimization Checklist

Analysis:
  [ ] Run EXPLAIN ANALYZE on slow queries
  [ ] Check actual vs estimated rows
  [ ] Look for sequential scans
  [ ] Identify expensive operations
  [ ] Compare execution plans

Indexing:
  [ ] Index WHERE columns
  [ ] Index JOIN columns
  [ ] Index ORDER BY columns
  [ ] Check unused indexes
  [ ] Remove duplicate indexes
  [ ] Create composite indexes strategically
  [ ] Analyze index statistics

Query Optimization:
  [ ] Remove unnecessary columns (SELECT *)
  [ ] Use JOINs instead of subqueries
  [ ] Avoid functions in WHERE
  [ ] Use wildcards carefully (avoid %)
  [ ] Batch operations
  [ ] Use LIMIT for result sets
  [ ] Archive old data

Caching:
  [ ] Implement query caching
  [ ] Cache aggregations
  [ ] Use Redis for hot data
  [ ] Invalidate strategically

Monitoring:
  [ ] Track slow queries
  [ ] Monitor index usage
  [ ] Set up alerts
  [ ] Regular statistics update
  [ ] Measure improvements

---

Expected Improvements:

With Proper Indexing:
  - Sequential Scan → Index Scan
  - Response time: 5 seconds → 50ms (100x faster)
  - CPU usage: 80% → 20%
  - Concurrent users: 100 → 1000

Quick Wins:
  - Add index to frequently filtered column
  - Fix N+1 queries
  - Use LIMIT for large results
  - Archive old data
  - Expected: 20-50% improvement

Key Points

EXPLAIN ANALYZE shows query execution
Indexes must match WHERE/JOIN/ORDER BY
Avoid functions in WHERE clauses
Fix N+1 queries (join instead of loop)
Monitor slow query log regularly
Stats updates needed for accuracy
Pre-calculate aggregations
Archive historical data
Use explain plans before/after
Measure and monitor continuously

Quick Install

/plugin add https://github.com/aj-geddes/useful-ai-prompts/tree/main/database-query-optimization

Copy and paste this command in Claude Code to install this skill

GitHub 仓库

aj-geddes/useful-ai-prompts

Path: skills/database-query-optimization

Related Skills

llamaindex

Meta

LlamaIndex is a data framework for building RAG-powered LLM applications, specializing in document ingestion, indexing, and querying. It provides key features like vector indices, query engines, and agents, and supports over 300 data connectors. Use it for document Q&A, chatbots, and knowledge retrieval when building data-centric applications.

View skill

csv-data-summarizer

Meta

This skill automatically analyzes CSV files to generate comprehensive statistical summaries and visualizations using Python's pandas and matplotlib/seaborn. It should be triggered whenever a user uploads or references CSV data without prompting for analysis preferences. The tool provides immediate insights into data structure, quality, and patterns through automated analysis and visualization.

View skill

hybrid-cloud-networking

Meta

This skill configures secure hybrid cloud networking between on-premises infrastructure and cloud platforms like AWS, Azure, and GCP. Use it when connecting data centers to the cloud, building hybrid architectures, or implementing secure cross-premises connectivity. It supports key capabilities such as VPNs and dedicated connections like AWS Direct Connect for high-performance, reliable setups.

View skill

Excel Analysis

Meta

This skill enables developers to analyze Excel files and perform data operations using pandas. It can read spreadsheets, create pivot tables, generate charts, and conduct data analysis on .xlsx files and tabular data. Use it when working with Excel files, spreadsheets, or any structured tabular data within Claude Code.

View skill