Back to Skills

data-seeding

sgcarstrends
Updated Today
28 views
9
1
9
View on GitHub
Metatestingdata

About

The data-seeding skill creates and updates database seed scripts for development and testing environments. It helps developers initialize databases with test data, reset to known states, and generate realistic sample data. This skill can read, edit, and write seed files while also executing them via Bash.

Quick Install

Claude Code

Recommended
Plugin CommandRecommended
/plugin add https://github.com/sgcarstrends/sgcarstrends
Git CloneAlternative
git clone https://github.com/sgcarstrends/sgcarstrends.git ~/.claude/skills/data-seeding

Copy and paste this command in Claude Code to install this skill

Documentation

Data Seeding Skill

This skill helps you create and manage database seed scripts in packages/database/.

When to Use This Skill

  • Setting up development database with test data
  • Creating seed data for testing
  • Initializing demo environments
  • Resetting database to known state
  • Generating realistic sample data
  • Testing data migration scripts

Seed Script Structure

packages/database/
├── src/
│   └── seed/
│       ├── index.ts          # Main seed runner
│       ├── cars.ts           # Car data seeds
│       ├── coe.ts            # COE data seeds
│       ├── posts.ts          # Blog posts seeds
│       └── users.ts          # User seeds
├── scripts/
│   └── seed.ts               # Seed execution script
└── package.json

Basic Seed Pattern

Simple Seed Script

// packages/database/src/seed/cars.ts
import { db } from "../index";
import { cars } from "../db/schema";
import { nanoid } from "nanoid";

export async function seedCars() {
  console.log("Seeding cars...");

  const carData = [
    {
      id: nanoid(),
      make: "Toyota",
      model: "Camry",
      vehicleClass: "Sedan",
      fuelType: "Petrol",
      month: "2024-01",
      number: 150,
    },
    {
      id: nanoid(),
      make: "Honda",
      model: "Civic",
      vehicleClass: "Sedan",
      fuelType: "Petrol",
      month: "2024-01",
      number: 120,
    },
    {
      id: nanoid(),
      make: "BMW",
      model: "3 Series",
      vehicleClass: "Sedan",
      fuelType: "Petrol",
      month: "2024-01",
      number: 80,
    },
  ];

  await db.insert(cars).values(carData);

  console.log(`✓ Seeded ${carData.length} cars`);
}

Main Seed Runner

// packages/database/src/seed/index.ts
import { db } from "../index";
import { seedCars } from "./cars";
import { seedCOE } from "./coe";
import { seedPosts } from "./posts";

export async function seed() {
  try {
    console.log("🌱 Starting database seed...\n");

    // Clear existing data (optional)
    await clearDatabase();

    // Run seed functions
    await seedCars();
    await seedCOE();
    await seedPosts();

    console.log("\n✅ Database seeded successfully!");
  } catch (error) {
    console.error("❌ Seed failed:", error);
    throw error;
  }
}

async function clearDatabase() {
  console.log("Clearing existing data...");

  // Delete in reverse order of dependencies
  await db.delete(posts);
  await db.delete(coe);
  await db.delete(cars);

  console.log("✓ Database cleared\n");
}

// Run if called directly
if (require.main === module) {
  seed()
    .then(() => process.exit(0))
    .catch(() => process.exit(1));
}

Execution Script

// packages/database/scripts/seed.ts
import { seed } from "../src/seed";

seed()
  .then(() => {
    console.log("Seed completed");
    process.exit(0);
  })
  .catch((error) => {
    console.error("Seed failed:", error);
    process.exit(1);
  });

Add to package.json:

{
  "scripts": {
    "db:seed": "tsx scripts/seed.ts"
  }
}

Run seed:

pnpm -F @sgcarstrends/database db:seed

Advanced Seed Patterns

Seed with Relationships

// packages/database/src/seed/posts.ts
import { db } from "../index";
import { users, posts } from "../db/schema";
import { nanoid } from "nanoid";

export async function seedPosts() {
  console.log("Seeding users and posts...");

  // First, create users
  const userData = [
    {
      id: nanoid(),
      name: "John Doe",
      email: "[email protected]",
    },
    {
      id: nanoid(),
      name: "Jane Smith",
      email: "[email protected]",
    },
  ];

  const createdUsers = await db.insert(users).values(userData).returning();

  // Then, create posts with user references
  const postData = [
    {
      id: nanoid(),
      title: "First Blog Post",
      content: "This is the first blog post content.",
      authorId: createdUsers[0].id,
      published: true,
      publishedAt: new Date(),
    },
    {
      id: nanoid(),
      title: "Second Blog Post",
      content: "This is the second blog post content.",
      authorId: createdUsers[1].id,
      published: true,
      publishedAt: new Date(),
    },
  ];

  await db.insert(posts).values(postData);

  console.log(`✓ Seeded ${createdUsers.length} users and ${postData.length} posts`);
}

Seed with Faker.js

pnpm -F @sgcarstrends/database add -D @faker-js/faker
// packages/database/src/seed/realistic-data.ts
import { faker } from "@faker-js/faker";
import { db } from "../index";
import { cars } from "../db/schema";
import { nanoid } from "nanoid";

export async function seedRealisticCars(count: number = 100) {
  console.log(`Seeding ${count} realistic cars...`);

  const makes = ["Toyota", "Honda", "BMW", "Mercedes", "Audi", "Nissan", "Mazda"];
  const fuelTypes = ["Petrol", "Diesel", "Electric", "Hybrid"];
  const vehicleClasses = ["Sedan", "SUV", "Hatchback", "MPV"];

  const carData = Array.from({ length: count }, () => ({
    id: nanoid(),
    make: faker.helpers.arrayElement(makes),
    model: faker.vehicle.model(),
    vehicleClass: faker.helpers.arrayElement(vehicleClasses),
    fuelType: faker.helpers.arrayElement(fuelTypes),
    month: faker.date.between({ from: "2020-01-01", to: "2024-12-31" })
      .toISOString()
      .slice(0, 7),
    number: faker.number.int({ min: 10, max: 500 }),
  }));

  // Batch insert for performance
  const batchSize = 50;
  for (let i = 0; i < carData.length; i += batchSize) {
    const batch = carData.slice(i, i + batchSize);
    await db.insert(cars).values(batch);
  }

  console.log(`✓ Seeded ${count} realistic cars`);
}

Seed from JSON File

// packages/database/src/seed/from-file.ts
import { db } from "../index";
import { cars } from "../db/schema";
import { readFileSync } from "fs";
import { join } from "path";

export async function seedFromJSON() {
  console.log("Seeding from JSON file...");

  // Read data from JSON file
  const filePath = join(__dirname, "../../data/cars.json");
  const rawData = readFileSync(filePath, "utf-8");
  const carData = JSON.parse(rawData);

  await db.insert(cars).values(carData);

  console.log(`✓ Seeded ${carData.length} cars from JSON`);
}

Example JSON file:

// packages/database/data/cars.json
[
  {
    "id": "car-1",
    "make": "Toyota",
    "model": "Camry",
    "vehicleClass": "Sedan",
    "fuelType": "Petrol",
    "month": "2024-01",
    "number": 150
  },
  {
    "id": "car-2",
    "make": "Honda",
    "model": "Civic",
    "vehicleClass": "Sedan",
    "fuelType": "Petrol",
    "month": "2024-01",
    "number": 120
  }
]

Seed from CSV

pnpm -F @sgcarstrends/database add -D csv-parse
// packages/database/src/seed/from-csv.ts
import { db } from "../index";
import { cars } from "../db/schema";
import { parse } from "csv-parse/sync";
import { readFileSync } from "fs";
import { join } from "path";
import { nanoid } from "nanoid";

export async function seedFromCSV() {
  console.log("Seeding from CSV file...");

  const filePath = join(__dirname, "../../data/cars.csv");
  const csvContent = readFileSync(filePath, "utf-8");

  const records = parse(csvContent, {
    columns: true,
    skip_empty_lines: true,
  });

  const carData = records.map((record: any) => ({
    id: nanoid(),
    make: record.make,
    model: record.model,
    vehicleClass: record.vehicle_class,
    fuelType: record.fuel_type,
    month: record.month,
    number: parseInt(record.number, 10),
  }));

  // Batch insert
  const batchSize = 100;
  for (let i = 0; i < carData.length; i += batchSize) {
    const batch = carData.slice(i, i + batchSize);
    await db.insert(cars).values(batch);
  }

  console.log(`✓ Seeded ${carData.length} cars from CSV`);
}

Environment-Specific Seeds

Development Seeds

// packages/database/src/seed/dev.ts
import { seedCars } from "./cars";
import { seedCOE } from "./coe";
import { seedRealisticCars } from "./realistic-data";

export async function seedDevelopment() {
  console.log("🔧 Seeding development environment...\n");

  // Small, predictable dataset
  await seedCars();
  await seedCOE();

  console.log("\n✅ Development seed complete!");
}

Testing Seeds

// packages/database/src/seed/test.ts
export async function seedTesting() {
  console.log("🧪 Seeding test environment...\n");

  // Minimal, deterministic data for tests
  const testCar = {
    id: "test-car-1",
    make: "Toyota",
    model: "Camry",
    vehicleClass: "Sedan",
    fuelType: "Petrol",
    month: "2024-01",
    number: 100,
  };

  await db.insert(cars).values([testCar]);

  console.log("\n✅ Test seed complete!");
}

Staging Seeds

// packages/database/src/seed/staging.ts
export async function seedStaging() {
  console.log("🎭 Seeding staging environment...\n");

  // Larger, realistic dataset similar to production
  await seedRealisticCars(1000);
  await seedRealisticCOE(500);
  await seedRealisticPosts(50);

  console.log("\n✅ Staging seed complete!");
}

Conditional Seeding

// packages/database/src/seed/index.ts
import { seedDevelopment } from "./dev";
import { seedTesting } from "./test";
import { seedStaging } from "./staging";

export async function seed() {
  const env = process.env.NODE_ENV || "development";

  switch (env) {
    case "development":
      await seedDevelopment();
      break;
    case "test":
      await seedTesting();
      break;
    case "staging":
      await seedStaging();
      break;
    default:
      throw new Error(`No seed strategy for environment: ${env}`);
  }
}

Idempotent Seeds

Upsert Pattern

// packages/database/src/seed/idempotent.ts
import { db } from "../index";
import { cars } from "../db/schema";

export async function seedIdempotent() {
  console.log("Seeding with upsert...");

  const carData = [
    {
      id: "fixed-id-1",
      make: "Toyota",
      model: "Camry",
      month: "2024-01",
      number: 150,
    },
  ];

  // Use onConflictDoUpdate for upsert
  await db
    .insert(cars)
    .values(carData)
    .onConflictDoUpdate({
      target: cars.id,
      set: {
        make: carData[0].make,
        model: carData[0].model,
        number: carData[0].number,
        updatedAt: new Date(),
      },
    });

  console.log("✓ Upserted cars");
}

Check Before Insert

export async function seedIfEmpty() {
  console.log("Checking if database needs seeding...");

  const existingCars = await db.select().from(cars).limit(1);

  if (existingCars.length > 0) {
    console.log("Database already has data, skipping seed");
    return;
  }

  console.log("Database is empty, proceeding with seed");
  await seedCars();
}

Performance Optimization

Batch Inserts

export async function seedLargeDataset() {
  const totalRecords = 10000;
  const batchSize = 1000;

  console.log(`Seeding ${totalRecords} records in batches of ${batchSize}...`);

  for (let i = 0; i < totalRecords; i += batchSize) {
    const batch = generateBatch(batchSize);

    await db.insert(cars).values(batch);

    console.log(`Progress: ${Math.min(i + batchSize, totalRecords)}/${totalRecords}`);
  }

  console.log("✓ Large dataset seeded");
}

function generateBatch(size: number) {
  return Array.from({ length: size }, () => ({
    id: nanoid(),
    make: "Toyota",
    model: "Camry",
    month: "2024-01",
    number: 100,
  }));
}

Use Transactions

import { db } from "../index";

export async function seedWithTransaction() {
  console.log("Seeding with transaction...");

  await db.transaction(async (tx) => {
    // All or nothing
    await tx.insert(cars).values([...carData]);
    await tx.insert(coe).values([...coeData]);
    await tx.insert(posts).values([...postData]);
  });

  console.log("✓ Transaction seed complete");
}

Utility Functions

Generate Realistic Dates

function generateMonthRange(start: string, end: string): string[] {
  const months = [];
  let current = new Date(start);
  const endDate = new Date(end);

  while (current <= endDate) {
    months.push(current.toISOString().slice(0, 7));
    current.setMonth(current.getMonth() + 1);
  }

  return months;
}

// Usage
const months = generateMonthRange("2020-01", "2024-12");

Generate Sequential IDs

function generateSequentialId(prefix: string, index: number): string {
  return `${prefix}-${String(index).padStart(5, "0")}`;
}

// Usage
const id = generateSequentialId("car", 42); // "car-00042"

Testing Seeds

// packages/database/src/seed/__tests__/seed.test.ts
import { describe, it, expect, beforeEach } from "vitest";
import { db } from "../../index";
import { cars } from "../../db/schema";
import { seedCars } from "../cars";

describe("Seed Scripts", () => {
  beforeEach(async () => {
    // Clear data before each test
    await db.delete(cars);
  });

  it("seeds cars successfully", async () => {
    await seedCars();

    const result = await db.select().from(cars);

    expect(result.length).toBeGreaterThan(0);
    expect(result[0].make).toBeDefined();
  });

  it("creates valid data", async () => {
    await seedCars();

    const [car] = await db.select().from(cars).limit(1);

    expect(car.id).toBeTruthy();
    expect(car.make).toBeTruthy();
    expect(car.number).toBeGreaterThanOrEqual(0);
  });
});

CLI for Selective Seeding

// packages/database/scripts/seed-cli.ts
import { seedCars } from "../src/seed/cars";
import { seedCOE } from "../src/seed/coe";
import { seedPosts } from "../src/seed/posts";

const seeders = {
  cars: seedCars,
  coe: seedCOE,
  posts: seedPosts,
  all: async () => {
    await seedCars();
    await seedCOE();
    await seedPosts();
  },
};

const target = process.argv[2] as keyof typeof seeders;

if (!target || !seeders[target]) {
  console.error("Usage: pnpm db:seed [cars|coe|posts|all]");
  process.exit(1);
}

seeders[target]()
  .then(() => {
    console.log(`✅ Seeded ${target} successfully`);
    process.exit(0);
  })
  .catch((error) => {
    console.error(`❌ Seed failed:`, error);
    process.exit(1);
  });

Add to package.json:

{
  "scripts": {
    "db:seed": "tsx scripts/seed-cli.ts all",
    "db:seed:cars": "tsx scripts/seed-cli.ts cars",
    "db:seed:coe": "tsx scripts/seed-cli.ts coe",
    "db:seed:posts": "tsx scripts/seed-cli.ts posts"
  }
}

Common Seed Data

COE Seed Data

// packages/database/src/seed/coe.ts
import { db } from "../index";
import { coe } from "../db/schema";
import { nanoid } from "nanoid";

export async function seedCOE() {
  console.log("Seeding COE data...");

  const coeData = [
    {
      id: nanoid(),
      biddingNo: 1,
      month: "2024-01",
      vehicleClass: "A",
      quota: 1000,
      bidsReceived: 1200,
      premium: "95000.00",
    },
    {
      id: nanoid(),
      biddingNo: 1,
      month: "2024-01",
      vehicleClass: "B",
      quota: 800,
      bidsReceived: 900,
      premium: "105000.00",
    },
  ];

  await db.insert(coe).values(coeData);

  console.log(`✓ Seeded ${coeData.length} COE records`);
}

References

  • Related files:
    • packages/database/src/seed/ - Seed scripts
    • packages/database/scripts/seed.ts - Seed runner
    • packages/database/CLAUDE.md - Database package documentation

Best Practices

  1. Idempotent: Seeds should be safe to run multiple times
  2. Environment-Specific: Different seed data for dev/test/staging
  3. Realistic Data: Use Faker for realistic test data
  4. Batch Inserts: Use batching for large datasets
  5. Relationships: Seed in correct order (parent tables first)
  6. Transactions: Use transactions for atomic seeding
  7. Clear First: Option to clear existing data
  8. Logging: Provide clear progress feedback

GitHub Repository

sgcarstrends/sgcarstrends
Path: .claude/skills/data-seeding
apiaws-lambdabackendhonojob-schedulerneon-postgres

Related Skills

content-collections

Meta

This skill provides a production-tested setup for Content Collections, a TypeScript-first tool that transforms Markdown/MDX files into type-safe data collections with Zod validation. Use it when building blogs, documentation sites, or content-heavy Vite + React applications to ensure type safety and automatic content validation. It covers everything from Vite plugin configuration and MDX compilation to deployment optimization and schema validation.

View skill

evaluating-llms-harness

Testing

This Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.

View skill

webapp-testing

Testing

This Claude Skill provides a Playwright-based toolkit for testing local web applications through Python scripts. It enables frontend verification, UI debugging, screenshot capture, and log viewing while managing server lifecycles. Use it for browser automation tasks but run scripts directly rather than reading their source code to avoid context pollution.

View skill

hybrid-cloud-networking

Meta

This skill configures secure hybrid cloud networking between on-premises infrastructure and cloud platforms like AWS, Azure, and GCP. Use it when connecting data centers to the cloud, building hybrid architectures, or implementing secure cross-premises connectivity. It supports key capabilities such as VPNs and dedicated connections like AWS Direct Connect for high-performance, reliable setups.

View skill