data-seeding
About
The data-seeding skill creates and updates database seed scripts for development and testing environments. It helps developers initialize databases with test data, reset to known states, and generate realistic sample data. This skill can read, edit, and write seed files while also executing them via Bash.
Documentation
Data Seeding Skill
This skill helps you create and manage database seed scripts in packages/database/.
When to Use This Skill
- Setting up development database with test data
- Creating seed data for testing
- Initializing demo environments
- Resetting database to known state
- Generating realistic sample data
- Testing data migration scripts
Seed Script Structure
packages/database/
├── src/
│ └── seed/
│ ├── index.ts # Main seed runner
│ ├── cars.ts # Car data seeds
│ ├── coe.ts # COE data seeds
│ ├── posts.ts # Blog posts seeds
│ └── users.ts # User seeds
├── scripts/
│ └── seed.ts # Seed execution script
└── package.json
Basic Seed Pattern
Simple Seed Script
// packages/database/src/seed/cars.ts
import { db } from "../index";
import { cars } from "../db/schema";
import { nanoid } from "nanoid";
export async function seedCars() {
console.log("Seeding cars...");
const carData = [
{
id: nanoid(),
make: "Toyota",
model: "Camry",
vehicleClass: "Sedan",
fuelType: "Petrol",
month: "2024-01",
number: 150,
},
{
id: nanoid(),
make: "Honda",
model: "Civic",
vehicleClass: "Sedan",
fuelType: "Petrol",
month: "2024-01",
number: 120,
},
{
id: nanoid(),
make: "BMW",
model: "3 Series",
vehicleClass: "Sedan",
fuelType: "Petrol",
month: "2024-01",
number: 80,
},
];
await db.insert(cars).values(carData);
console.log(`✓ Seeded ${carData.length} cars`);
}
Main Seed Runner
// packages/database/src/seed/index.ts
import { db } from "../index";
import { seedCars } from "./cars";
import { seedCOE } from "./coe";
import { seedPosts } from "./posts";
export async function seed() {
try {
console.log("🌱 Starting database seed...\n");
// Clear existing data (optional)
await clearDatabase();
// Run seed functions
await seedCars();
await seedCOE();
await seedPosts();
console.log("\n✅ Database seeded successfully!");
} catch (error) {
console.error("❌ Seed failed:", error);
throw error;
}
}
async function clearDatabase() {
console.log("Clearing existing data...");
// Delete in reverse order of dependencies
await db.delete(posts);
await db.delete(coe);
await db.delete(cars);
console.log("✓ Database cleared\n");
}
// Run if called directly
if (require.main === module) {
seed()
.then(() => process.exit(0))
.catch(() => process.exit(1));
}
Execution Script
// packages/database/scripts/seed.ts
import { seed } from "../src/seed";
seed()
.then(() => {
console.log("Seed completed");
process.exit(0);
})
.catch((error) => {
console.error("Seed failed:", error);
process.exit(1);
});
Add to package.json:
{
"scripts": {
"db:seed": "tsx scripts/seed.ts"
}
}
Run seed:
pnpm -F @sgcarstrends/database db:seed
Advanced Seed Patterns
Seed with Relationships
// packages/database/src/seed/posts.ts
import { db } from "../index";
import { users, posts } from "../db/schema";
import { nanoid } from "nanoid";
export async function seedPosts() {
console.log("Seeding users and posts...");
// First, create users
const userData = [
{
id: nanoid(),
name: "John Doe",
email: "[email protected]",
},
{
id: nanoid(),
name: "Jane Smith",
email: "[email protected]",
},
];
const createdUsers = await db.insert(users).values(userData).returning();
// Then, create posts with user references
const postData = [
{
id: nanoid(),
title: "First Blog Post",
content: "This is the first blog post content.",
authorId: createdUsers[0].id,
published: true,
publishedAt: new Date(),
},
{
id: nanoid(),
title: "Second Blog Post",
content: "This is the second blog post content.",
authorId: createdUsers[1].id,
published: true,
publishedAt: new Date(),
},
];
await db.insert(posts).values(postData);
console.log(`✓ Seeded ${createdUsers.length} users and ${postData.length} posts`);
}
Seed with Faker.js
pnpm -F @sgcarstrends/database add -D @faker-js/faker
// packages/database/src/seed/realistic-data.ts
import { faker } from "@faker-js/faker";
import { db } from "../index";
import { cars } from "../db/schema";
import { nanoid } from "nanoid";
export async function seedRealisticCars(count: number = 100) {
console.log(`Seeding ${count} realistic cars...`);
const makes = ["Toyota", "Honda", "BMW", "Mercedes", "Audi", "Nissan", "Mazda"];
const fuelTypes = ["Petrol", "Diesel", "Electric", "Hybrid"];
const vehicleClasses = ["Sedan", "SUV", "Hatchback", "MPV"];
const carData = Array.from({ length: count }, () => ({
id: nanoid(),
make: faker.helpers.arrayElement(makes),
model: faker.vehicle.model(),
vehicleClass: faker.helpers.arrayElement(vehicleClasses),
fuelType: faker.helpers.arrayElement(fuelTypes),
month: faker.date.between({ from: "2020-01-01", to: "2024-12-31" })
.toISOString()
.slice(0, 7),
number: faker.number.int({ min: 10, max: 500 }),
}));
// Batch insert for performance
const batchSize = 50;
for (let i = 0; i < carData.length; i += batchSize) {
const batch = carData.slice(i, i + batchSize);
await db.insert(cars).values(batch);
}
console.log(`✓ Seeded ${count} realistic cars`);
}
Seed from JSON File
// packages/database/src/seed/from-file.ts
import { db } from "../index";
import { cars } from "../db/schema";
import { readFileSync } from "fs";
import { join } from "path";
export async function seedFromJSON() {
console.log("Seeding from JSON file...");
// Read data from JSON file
const filePath = join(__dirname, "../../data/cars.json");
const rawData = readFileSync(filePath, "utf-8");
const carData = JSON.parse(rawData);
await db.insert(cars).values(carData);
console.log(`✓ Seeded ${carData.length} cars from JSON`);
}
Example JSON file:
// packages/database/data/cars.json
[
{
"id": "car-1",
"make": "Toyota",
"model": "Camry",
"vehicleClass": "Sedan",
"fuelType": "Petrol",
"month": "2024-01",
"number": 150
},
{
"id": "car-2",
"make": "Honda",
"model": "Civic",
"vehicleClass": "Sedan",
"fuelType": "Petrol",
"month": "2024-01",
"number": 120
}
]
Seed from CSV
pnpm -F @sgcarstrends/database add -D csv-parse
// packages/database/src/seed/from-csv.ts
import { db } from "../index";
import { cars } from "../db/schema";
import { parse } from "csv-parse/sync";
import { readFileSync } from "fs";
import { join } from "path";
import { nanoid } from "nanoid";
export async function seedFromCSV() {
console.log("Seeding from CSV file...");
const filePath = join(__dirname, "../../data/cars.csv");
const csvContent = readFileSync(filePath, "utf-8");
const records = parse(csvContent, {
columns: true,
skip_empty_lines: true,
});
const carData = records.map((record: any) => ({
id: nanoid(),
make: record.make,
model: record.model,
vehicleClass: record.vehicle_class,
fuelType: record.fuel_type,
month: record.month,
number: parseInt(record.number, 10),
}));
// Batch insert
const batchSize = 100;
for (let i = 0; i < carData.length; i += batchSize) {
const batch = carData.slice(i, i + batchSize);
await db.insert(cars).values(batch);
}
console.log(`✓ Seeded ${carData.length} cars from CSV`);
}
Environment-Specific Seeds
Development Seeds
// packages/database/src/seed/dev.ts
import { seedCars } from "./cars";
import { seedCOE } from "./coe";
import { seedRealisticCars } from "./realistic-data";
export async function seedDevelopment() {
console.log("🔧 Seeding development environment...\n");
// Small, predictable dataset
await seedCars();
await seedCOE();
console.log("\n✅ Development seed complete!");
}
Testing Seeds
// packages/database/src/seed/test.ts
export async function seedTesting() {
console.log("🧪 Seeding test environment...\n");
// Minimal, deterministic data for tests
const testCar = {
id: "test-car-1",
make: "Toyota",
model: "Camry",
vehicleClass: "Sedan",
fuelType: "Petrol",
month: "2024-01",
number: 100,
};
await db.insert(cars).values([testCar]);
console.log("\n✅ Test seed complete!");
}
Staging Seeds
// packages/database/src/seed/staging.ts
export async function seedStaging() {
console.log("🎭 Seeding staging environment...\n");
// Larger, realistic dataset similar to production
await seedRealisticCars(1000);
await seedRealisticCOE(500);
await seedRealisticPosts(50);
console.log("\n✅ Staging seed complete!");
}
Conditional Seeding
// packages/database/src/seed/index.ts
import { seedDevelopment } from "./dev";
import { seedTesting } from "./test";
import { seedStaging } from "./staging";
export async function seed() {
const env = process.env.NODE_ENV || "development";
switch (env) {
case "development":
await seedDevelopment();
break;
case "test":
await seedTesting();
break;
case "staging":
await seedStaging();
break;
default:
throw new Error(`No seed strategy for environment: ${env}`);
}
}
Idempotent Seeds
Upsert Pattern
// packages/database/src/seed/idempotent.ts
import { db } from "../index";
import { cars } from "../db/schema";
export async function seedIdempotent() {
console.log("Seeding with upsert...");
const carData = [
{
id: "fixed-id-1",
make: "Toyota",
model: "Camry",
month: "2024-01",
number: 150,
},
];
// Use onConflictDoUpdate for upsert
await db
.insert(cars)
.values(carData)
.onConflictDoUpdate({
target: cars.id,
set: {
make: carData[0].make,
model: carData[0].model,
number: carData[0].number,
updatedAt: new Date(),
},
});
console.log("✓ Upserted cars");
}
Check Before Insert
export async function seedIfEmpty() {
console.log("Checking if database needs seeding...");
const existingCars = await db.select().from(cars).limit(1);
if (existingCars.length > 0) {
console.log("Database already has data, skipping seed");
return;
}
console.log("Database is empty, proceeding with seed");
await seedCars();
}
Performance Optimization
Batch Inserts
export async function seedLargeDataset() {
const totalRecords = 10000;
const batchSize = 1000;
console.log(`Seeding ${totalRecords} records in batches of ${batchSize}...`);
for (let i = 0; i < totalRecords; i += batchSize) {
const batch = generateBatch(batchSize);
await db.insert(cars).values(batch);
console.log(`Progress: ${Math.min(i + batchSize, totalRecords)}/${totalRecords}`);
}
console.log("✓ Large dataset seeded");
}
function generateBatch(size: number) {
return Array.from({ length: size }, () => ({
id: nanoid(),
make: "Toyota",
model: "Camry",
month: "2024-01",
number: 100,
}));
}
Use Transactions
import { db } from "../index";
export async function seedWithTransaction() {
console.log("Seeding with transaction...");
await db.transaction(async (tx) => {
// All or nothing
await tx.insert(cars).values([...carData]);
await tx.insert(coe).values([...coeData]);
await tx.insert(posts).values([...postData]);
});
console.log("✓ Transaction seed complete");
}
Utility Functions
Generate Realistic Dates
function generateMonthRange(start: string, end: string): string[] {
const months = [];
let current = new Date(start);
const endDate = new Date(end);
while (current <= endDate) {
months.push(current.toISOString().slice(0, 7));
current.setMonth(current.getMonth() + 1);
}
return months;
}
// Usage
const months = generateMonthRange("2020-01", "2024-12");
Generate Sequential IDs
function generateSequentialId(prefix: string, index: number): string {
return `${prefix}-${String(index).padStart(5, "0")}`;
}
// Usage
const id = generateSequentialId("car", 42); // "car-00042"
Testing Seeds
// packages/database/src/seed/__tests__/seed.test.ts
import { describe, it, expect, beforeEach } from "vitest";
import { db } from "../../index";
import { cars } from "../../db/schema";
import { seedCars } from "../cars";
describe("Seed Scripts", () => {
beforeEach(async () => {
// Clear data before each test
await db.delete(cars);
});
it("seeds cars successfully", async () => {
await seedCars();
const result = await db.select().from(cars);
expect(result.length).toBeGreaterThan(0);
expect(result[0].make).toBeDefined();
});
it("creates valid data", async () => {
await seedCars();
const [car] = await db.select().from(cars).limit(1);
expect(car.id).toBeTruthy();
expect(car.make).toBeTruthy();
expect(car.number).toBeGreaterThanOrEqual(0);
});
});
CLI for Selective Seeding
// packages/database/scripts/seed-cli.ts
import { seedCars } from "../src/seed/cars";
import { seedCOE } from "../src/seed/coe";
import { seedPosts } from "../src/seed/posts";
const seeders = {
cars: seedCars,
coe: seedCOE,
posts: seedPosts,
all: async () => {
await seedCars();
await seedCOE();
await seedPosts();
},
};
const target = process.argv[2] as keyof typeof seeders;
if (!target || !seeders[target]) {
console.error("Usage: pnpm db:seed [cars|coe|posts|all]");
process.exit(1);
}
seeders[target]()
.then(() => {
console.log(`✅ Seeded ${target} successfully`);
process.exit(0);
})
.catch((error) => {
console.error(`❌ Seed failed:`, error);
process.exit(1);
});
Add to package.json:
{
"scripts": {
"db:seed": "tsx scripts/seed-cli.ts all",
"db:seed:cars": "tsx scripts/seed-cli.ts cars",
"db:seed:coe": "tsx scripts/seed-cli.ts coe",
"db:seed:posts": "tsx scripts/seed-cli.ts posts"
}
}
Common Seed Data
COE Seed Data
// packages/database/src/seed/coe.ts
import { db } from "../index";
import { coe } from "../db/schema";
import { nanoid } from "nanoid";
export async function seedCOE() {
console.log("Seeding COE data...");
const coeData = [
{
id: nanoid(),
biddingNo: 1,
month: "2024-01",
vehicleClass: "A",
quota: 1000,
bidsReceived: 1200,
premium: "95000.00",
},
{
id: nanoid(),
biddingNo: 1,
month: "2024-01",
vehicleClass: "B",
quota: 800,
bidsReceived: 900,
premium: "105000.00",
},
];
await db.insert(coe).values(coeData);
console.log(`✓ Seeded ${coeData.length} COE records`);
}
References
- Related files:
packages/database/src/seed/- Seed scriptspackages/database/scripts/seed.ts- Seed runnerpackages/database/CLAUDE.md- Database package documentation
Best Practices
- Idempotent: Seeds should be safe to run multiple times
- Environment-Specific: Different seed data for dev/test/staging
- Realistic Data: Use Faker for realistic test data
- Batch Inserts: Use batching for large datasets
- Relationships: Seed in correct order (parent tables first)
- Transactions: Use transactions for atomic seeding
- Clear First: Option to clear existing data
- Logging: Provide clear progress feedback
Quick Install
/plugin add https://github.com/sgcarstrends/sgcarstrends/tree/main/data-seedingCopy and paste this command in Claude Code to install this skill
GitHub 仓库
Related Skills
evaluating-llms-harness
TestingThis Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.
webapp-testing
TestingThis Claude Skill provides a Playwright-based toolkit for testing local web applications through Python scripts. It enables frontend verification, UI debugging, screenshot capture, and log viewing while managing server lifecycles. Use it for browser automation tasks but run scripts directly rather than reading their source code to avoid context pollution.
finishing-a-development-branch
TestingThis skill helps developers complete finished work by verifying tests pass and then presenting structured integration options. It guides the workflow for merging, creating PRs, or cleaning up branches after implementation is done. Use it when your code is ready and tested to systematically finalize the development process.
llamaindex
MetaLlamaIndex is a data framework for building RAG-powered LLM applications, specializing in document ingestion, indexing, and querying. It provides key features like vector indices, query engines, and agents, and supports over 300 data connectors. Use it for document Q&A, chatbots, and knowledge retrieval when building data-centric applications.
