cur-data

majiayu000

About

This skill provides knowledge about AWS Cost and Usage Report (CUR) data structure, column formats, and analysis patterns. It helps developers understand CUR file formats (CSV, CSV.GZ, Parquet) and automatically handles both old and new column naming conventions. Use it when building or analyzing AWS cost data pipelines to correctly interpret CUR data fields and optimize processing.

Quick Install

Plugin command for Claude Code (recommended):

/plugin add https://github.com/majiayu000/claude-skill-registry

Git clone (alternative):

git clone https://github.com/majiayu000/claude-skill-registry.git ~/.claude/skills/cur-data

AWS CUR Data Skill

CUR File Formats

The project supports three CUR file formats:

CSV: Plain text, largest file size
CSV.GZ: Gzip compressed CSV, smaller
Parquet: Columnar format, fastest and smallest (recommended)
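
As a minimal sketch of how the three formats could be read with pandas: the helper name `load_cur` is hypothetical and not taken from the project, and reading Parquet assumes an engine such as pyarrow is installed.

```python
from pathlib import Path

import pandas as pd


def load_cur(path):
    """Load one CUR file into a DataFrame, choosing the reader by extension.

    Hypothetical helper for illustration; the project's own loader may differ.
    """
    name = Path(path).name.lower()
    if name.endswith(".parquet"):
        # Requires a Parquet engine (for example pyarrow) to be installed.
        return pd.read_parquet(path)
    if name.endswith(".csv.gz"):
        # pandas can also infer gzip from the extension; being explicit is clearer.
        return pd.read_csv(path, compression="gzip", low_memory=False)
    if name.endswith(".csv"):
        return pd.read_csv(path, low_memory=False)
    raise ValueError(f"Unsupported CUR file format: {path}")
```

Dispatching on the file name keeps the calling code identical regardless of which format a given export uses.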

Column Name Variants

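AWS CUR has two column naming conventions: the older `category/ColumnName` style found in CSV reports and the newer snake_case style found in Parquet reports. The data processor handles both and maps them to canonical names: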

Canonical Name	Old Format	New Format
cost	`lineItem/UnblendedCost`	`line_item_unblended_cost`
account_id	`lineItem/UsageAccountId`	`line_item_usage_account_id`
service	`product/ProductName`	`product_product_name`
date	`lineItem/UsageStartDate`	`line_item_usage_start_date`
region	`product/Region`	`product_region`
line_item_type	`lineItem/LineItemType`	`line_item_line_item_type`
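
One way to picture the mapping in the table above is a rename step that accepts either variant. This is a sketch only: `COLUMN_VARIANTS` and `normalize_columns` are hypothetical names, and the project's data processor may implement this differently.

```python
import pandas as pd

# Canonical name -> (old format, new format), copied from the table above.
COLUMN_VARIANTS = {
    "cost": ("lineItem/UnblendedCost", "line_item_unblended_cost"),
    "account_id": ("lineItem/UsageAccountId", "line_item_usage_account_id"),
    "service": ("product/ProductName", "product_product_name"),
    "date": ("lineItem/UsageStartDate", "line_item_usage_start_date"),
    "region": ("product/Region", "product_region"),
    "line_item_type": ("lineItem/LineItemType", "line_item_line_item_type"),
}


def normalize_columns(df):
    """Rename whichever variant is present to its canonical name.

    Columns that match neither variant are left untouched.
    """
    rename = {}
    for canonical, variants in COLUMN_VARIANTS.items():
        for variant in variants:
            if variant in df.columns:
                rename[variant] = canonical
                break  # use the first variant found for this canonical name
    return df.rename(columns=rename)
```

After this step, downstream code can refer to `cost`, `service`, `date`, and the other canonical names without caring which report format produced the data.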

Key Cost Columns

# Unblended cost - actual cost before discounts
line_item_unblended_cost

# Blended cost - averaged across organization
line_item_blended_cost

# Net cost - after discounts applied (column is present only when the account has discounts)
line_item_net_unblended_cost

# Usage amount
line_item_usage_amount

Line Item Types

LINE_ITEM_TYPES = {
    'Usage': 'Normal usage charges',
    'Tax': 'Tax charges',
    'Fee': 'AWS fees',
    'Refund': 'Refunds/credits',
    'Credit': 'Applied credits',
    'RIFee': 'Reserved Instance fees',
    'DiscountedUsage': 'RI discounted usage',
    'SavingsPlanCoveredUsage': 'Savings Plan usage',
    'SavingsPlanNegation': 'SP cost adjustment',
    'SavingsPlanUpfrontFee': 'SP upfront payment',
    'SavingsPlanRecurringFee': 'SP recurring fee',
    'BundledDiscount': 'Usage-based bundled discount',
    'EdpDiscount': 'Enterprise discount',
}

Discount Analysis

To identify discounts and credits:

discount_types = ['Credit', 'Refund', 'EdpDiscount', 'BundledDiscount']
discounts = df[df['line_item_type'].isin(discount_types)]
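
To show what the filter above produces, here is a small sketch on made-up rows (the figures are illustrative, not from the project's fixtures). It assumes the columns have already been renamed to the canonical `line_item_type` and `cost`, and that discount rows carry negative cost values.

```python
import pandas as pd

# Illustrative rows only; discount-type line items carry negative costs.
df = pd.DataFrame({
    "line_item_type": ["Usage", "Credit", "EdpDiscount", "Usage", "Refund"],
    "cost": [100.0, -10.0, -5.0, 50.0, -2.0],
})

discount_types = ["Credit", "Refund", "EdpDiscount", "BundledDiscount"]
discounts = df[df["line_item_type"].isin(discount_types)]

# Break the discounts down by type and compute the overall total.
discount_by_type = discounts.groupby("line_item_type")["cost"].sum()
total_discount = discounts["cost"].sum()
```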

Savings Plan Analysis

Key columns for savings plans:

savings_plan_columns = [
    'savings_plan_savings_plan_arn',  # raw Parquet/Athena exports typically spell this savings_plan_savings_plan_a_r_n
    'savings_plan_savings_plan_rate',
    'savings_plan_used_commitment',
    'savings_plan_total_commitment_to_date',
]
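
A common use of these columns is estimating how much of each plan's commitment was consumed. The sketch below is an assumption about how that could be computed, not the project's method: the function name is hypothetical, and it assumes the commitment columns are populated on `SavingsPlanRecurringFee` rows and that the line item type column has been renamed to the canonical `line_item_type`.

```python
import pandas as pd


def savings_plan_utilization(df):
    """Return used and total commitment per Savings Plan with a utilization ratio.

    Hypothetical helper; assumes commitment figures live on
    SavingsPlanRecurringFee rows.
    """
    fees = df[df["line_item_type"] == "SavingsPlanRecurringFee"]
    totals = fees.groupby("savings_plan_savings_plan_arn")[
        ["savings_plan_used_commitment", "savings_plan_total_commitment_to_date"]
    ].sum()
    totals["utilization"] = (
        totals["savings_plan_used_commitment"]
        / totals["savings_plan_total_commitment_to_date"]
    )
    return totals
```

A utilization well below 1.0 suggests the plan's commitment is larger than the usage it covers.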

Common Aggregations

# Cost by service
df.groupby('service').agg({'cost': 'sum'}).sort_values('cost', ascending=False)

# Cost by account and service
df.groupby(['account_id', 'service']).agg({'cost': 'sum'})

# Daily trends
df.groupby(df['date'].dt.date).agg({'cost': 'sum'})

# Monthly summary
df.groupby(df['date'].dt.to_period('M')).agg({'cost': 'sum'})
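
The date-based aggregations above rely on `date` being a datetime column, so string timestamps need converting first. A small end-to-end sketch on made-up rows (values are illustrative only):

```python
import pandas as pd

# Illustrative rows using the canonical column names.
df = pd.DataFrame({
    "date": ["2024-01-01 00:00:00", "2024-01-01 00:00:00",
             "2024-01-02 00:00:00", "2024-02-01 00:00:00"],
    "service": ["EC2", "S3", "EC2", "S3"],
    "cost": [10.0, 2.0, 12.0, 3.0],
})

# The .dt accessor only works on datetime columns, so parse the strings first.
df["date"] = pd.to_datetime(df["date"])

by_service = df.groupby("service").agg({"cost": "sum"}).sort_values("cost", ascending=False)
daily = df.groupby(df["date"].dt.date).agg({"cost": "sum"})
monthly = df.groupby(df["date"].dt.to_period("M")).agg({"cost": "sum"})
```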

Anomaly Detection

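The project uses z-score-based detection on a Series of daily cost totals (`daily_costs`), flagging days more than 2 standard deviations from the mean: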

mean = daily_costs.mean()
std = daily_costs.std()
z_scores = (daily_costs - mean) / std
anomalies = daily_costs[abs(z_scores) > 2]  # 2 std deviations
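
The same calculation can be wrapped in a function that also copes with a flat series, where the standard deviation is zero and the division would otherwise produce NaN or infinite scores. The function name and the `threshold` parameter are hypothetical, added for illustration.

```python
import pandas as pd


def find_anomalies(daily_costs, threshold=2.0):
    """Return the entries of daily_costs whose absolute z-score exceeds threshold.

    Hypothetical wrapper around the z-score snippet above.
    """
    std = daily_costs.std()
    if pd.isna(std) or std == 0:
        # No variation (or too few points): nothing can be flagged.
        return daily_costs.iloc[0:0]
    z_scores = (daily_costs - daily_costs.mean()) / std
    return daily_costs[z_scores.abs() > threshold]
```

For example, `find_anomalies(pd.Series([100.0] * 9 + [500.0]))` flags only the final 500.0 entry, since the other nine values sit close to the mean.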

Mock Data Reference

Test fixtures provide 6 months of data:

Production (111111111111): 87% of costs, steady growth
Development (210987654321): 13% of costs, spiky (load testing)
Services: EC2, RDS, S3, CloudFront, DynamoDB, Lambda
Regions: us-east-1, us-west-2, eu-west-1, ap-northeast-1, etc.
Total: ~$6.2M over 182 days

GitHub Repository

majiayu000/claude-skill-registry

Path: skills/cur-data
