Excel Analysis

lifangda

About

This skill enables developers to analyze Excel files and perform data operations using pandas. It can read spreadsheets, create pivot tables, generate charts, and conduct data analysis on .xlsx files and tabular data. Use it when working with Excel files, spreadsheets, or any structured tabular data within Claude Code.

Documentation

Excel Analysis

Quick start

Read Excel files with pandas:

import pandas as pd

# Read Excel file
df = pd.read_excel("data.xlsx", sheet_name="Sheet1")

# Display first few rows
print(df.head())

# Basic statistics
print(df.describe())

Reading multiple sheets

Process all sheets in a workbook:

import pandas as pd

# Read all sheets
excel_file = pd.ExcelFile("workbook.xlsx")

for sheet_name in excel_file.sheet_names:
    df = pd.read_excel(excel_file, sheet_name=sheet_name)
    print(f"\n{sheet_name}:")
    print(df.head())

Data analysis

Perform common analysis tasks:

import pandas as pd

df = pd.read_excel("sales.xlsx")

# Group by and aggregate
sales_by_region = df.groupby("region")["sales"].sum()
print(sales_by_region)

# Filter data
high_sales = df[df["sales"] > 10000]

# Calculate metrics
df["profit_margin"] = (df["revenue"] - df["cost"]) / df["revenue"]

# Sort by column
df_sorted = df.sort_values("sales", ascending=False)

Creating Excel files

Write data to Excel with formatting:

import pandas as pd

df = pd.DataFrame({
    "Product": ["A", "B", "C"],
    "Sales": [100, 200, 150],
    "Profit": [20, 40, 30]
})

# Write to Excel; the context manager saves and closes the file automatically
with pd.ExcelWriter("output.xlsx", engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="Sales", index=False)

    # Get worksheet for formatting
    worksheet = writer.sheets["Sales"]

    # Auto-adjust column widths
    for column in worksheet.columns:
        max_length = 0
        column_letter = column[0].column_letter
        for cell in column:
            if len(str(cell.value)) > max_length:
                max_length = len(str(cell.value))
        worksheet.column_dimensions[column_letter].width = max_length + 2

Pivot tables

Create pivot tables programmatically:

import pandas as pd

df = pd.read_excel("sales_data.xlsx")

# Create pivot table
pivot = pd.pivot_table(
    df,
    values="sales",
    index="region",
    columns="product",
    aggfunc="sum",
    fill_value=0
)

print(pivot)

# Save pivot table
pivot.to_excel("pivot_report.xlsx")

Charts and visualization

Generate charts from Excel data:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_excel("data.xlsx")

# Create bar chart
df.plot(x="category", y="value", kind="bar")
plt.title("Sales by Category")
plt.xlabel("Category")
plt.ylabel("Sales")
plt.tight_layout()
plt.savefig("chart.png")
plt.close()  # close the bar chart so the pie chart is drawn on a fresh figure

# Create pie chart
df.set_index("category")["value"].plot(kind="pie", autopct="%1.1f%%")
plt.title("Market Share")
plt.ylabel("")
plt.savefig("pie_chart.png")

Data cleaning

Clean and prepare Excel data:

import pandas as pd

df = pd.read_excel("messy_data.xlsx")

# Remove duplicates
df = df.drop_duplicates()

# Handle missing values
df = df.fillna(0)  # or df.dropna()

# Remove whitespace
df["name"] = df["name"].str.strip()

# Convert data types
df["date"] = pd.to_datetime(df["date"])
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")

# Save cleaned data
df.to_excel("cleaned_data.xlsx", index=False)

Merging and joining

Combine multiple Excel files:

import pandas as pd

# Read multiple files
df1 = pd.read_excel("sales_q1.xlsx")
df2 = pd.read_excel("sales_q2.xlsx")

# Concatenate vertically
combined = pd.concat([df1, df2], ignore_index=True)

# Merge on common column
customers = pd.read_excel("customers.xlsx")
sales = pd.read_excel("sales.xlsx")

merged = pd.merge(sales, customers, on="customer_id", how="left")

merged.to_excel("merged_data.xlsx", index=False)

Advanced formatting

Apply conditional formatting and styles:

import pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import PatternFill, Font

# Create Excel file
df = pd.DataFrame({
    "Product": ["A", "B", "C"],
    "Sales": [100, 200, 150]
})

df.to_excel("formatted.xlsx", index=False)

# Load workbook for formatting
wb = load_workbook("formatted.xlsx")
ws = wb.active

# Apply conditional formatting
red_fill = PatternFill(start_color="FF0000", end_color="FF0000", fill_type="solid")
green_fill = PatternFill(start_color="00FF00", end_color="00FF00", fill_type="solid")

for row in range(2, len(df) + 2):
    cell = ws[f"B{row}"]
    if cell.value < 150:
        cell.fill = red_fill
    else:
        cell.fill = green_fill

# Bold headers
for cell in ws[1]:
    cell.font = Font(bold=True)

wb.save("formatted.xlsx")

Performance tips

  • Use read_excel with usecols to read only the columns you need (see the sketch below)
  • read_excel does not support chunked reading; for very large files, limit rows with nrows or convert to CSV and stream it with read_csv(chunksize=...)
  • Choose the engine by file type: engine='openpyxl' for .xlsx, engine='xlrd' for legacy .xls
  • Use the dtype parameter to specify column types for faster reading
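
A minimal sketch combining usecols, dtype, and an explicit engine; the file name and column names below are placeholders for illustration, not part of this skill:

import pandas as pd

# Hypothetical workbook and columns, shown only to illustrate the options above
df = pd.read_excel(
    "big_report.xlsx",
    sheet_name="Sheet1",
    usecols=["order_id", "region", "sales"],          # read only these columns
    dtype={"order_id": "int64", "sales": "float64"},  # skip type inference for these
    engine="openpyxl",                                # .xlsx files
)
print(df.dtypes)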

Available packages

  • pandas - Data analysis and manipulation (primary)
  • openpyxl - Excel file creation and formatting
  • xlrd - Reading older .xls files
  • xlsxwriter - Advanced Excel writing capabilities, such as native charts (see the sketch below)
  • matplotlib - Chart generation
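
As a rough sketch of what xlsxwriter adds beyond plain writing, the snippet below embeds a native Excel column chart; the file name and cell layout are assumptions for illustration, not part of this skill:

import pandas as pd

df = pd.DataFrame({"Product": ["A", "B", "C"], "Sales": [100, 200, 150]})

# xlsxwriter can embed native Excel charts while writing
with pd.ExcelWriter("report.xlsx", engine="xlsxwriter") as writer:
    df.to_excel(writer, sheet_name="Sales", index=False)

    workbook = writer.book
    worksheet = writer.sheets["Sales"]

    # Column chart: categories from A2:A4, values from B2:B4 (row 1 holds the headers)
    chart = workbook.add_chart({"type": "column"})
    chart.add_series({
        "name": "Sales",
        "categories": ["Sales", 1, 0, 3, 0],
        "values": ["Sales", 1, 1, 3, 1],
    })
    worksheet.insert_chart("D2", chart)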

Quick Install

/plugin add https://github.com/lifangda/claude-plugins/tree/main/excel-analysis

Copy and paste this command into Claude Code to install this skill

GitHub Repository

lifangda/claude-plugins
Path: cli-tool/skills-library/data-analysis/excel-analysis

Related Skills

llamaindex

Meta

LlamaIndex is a data framework for building RAG-powered LLM applications, specializing in document ingestion, indexing, and querying. It provides key features like vector indices, query engines, and agents, and supports over 300 data connectors. Use it for document Q&A, chatbots, and knowledge retrieval when building data-centric applications.

csv-data-summarizer

Meta

This skill automatically analyzes CSV files to generate comprehensive statistical summaries and visualizations using Python's pandas and matplotlib/seaborn. It should be triggered whenever a user uploads or references CSV data without prompting for analysis preferences. The tool provides immediate insights into data structure, quality, and patterns through automated analysis and visualization.

adk-deployment-specialist

Documentation

This skill manages A2A (Agent-to-Agent) communication for orchestrating Vertex AI ADK agents, handling AgentCard discovery, task submission, and status polling. It enables developers to build and deploy multi-agent systems on Google's Agent Engine with capabilities for agent orchestration and session management. Key features include an observability dashboard, Cloud Trace integration, and BigQuery analytics export for production monitoring.

nestjs

Meta

This skill provides NestJS development standards and architectural patterns for building domain-centric applications. It covers modular design, dependency injection, decorator patterns, and key framework features like controllers, services, middleware, and interceptors. Use it when developing NestJS applications, implementing APIs, configuring microservices, or integrating with databases.
