design-serialization-schema

Name: design-serialization-schema
Author: pjt222

Updated 1 month ago

Category: Testing. Tags: word, api, automation, design, data

About

This skill helps developers design and evolve serialization schemas using JSON Schema, Protocol Buffers, or Apache Avro. It covers versioning, backward compatibility, validation rules, and evolution strategies for long-lived data formats. Use it when defining new API contracts, modifying existing schemas without breaking consumers, or choosing between schema systems.

Skill documentation

Design Serialization Schema

Make versioned serialization schemas. Evolve gracefully without breaking consumers.

When Use

Define new API contract or data interchange format
Add fields to existing schema without breaking consumers
Migrate between schema versions
Pick between schema systems (JSON Schema, Protobuf, Avro)
Document data validation rules for auto-enforcement

Inputs

Required: Data model (entity relations, field types, constraints)
Required: Compat needs (who consumes, how long must old formats read)
Optional: Existing schema to evolve
Optional: Perf needs (validation speed, schema registry integration)
Optional: Target serialization format (JSON, binary, columnar)

Steps

Step 1: Pick Schema System

| System | Format | Strengths | Best For |
|--------|--------|-----------|----------|
| JSON Schema | JSON | Widely supported, flexible validation | REST APIs, config validation |
| Protocol Buffers | Binary | Compact, fast, strong typing, built-in evolution | gRPC, microservices |
| Apache Avro | Binary/JSON | Schema in data, excellent evolution support | Kafka, data pipelines |
| XML Schema (XSD) | XML | Comprehensive typing, namespace support | Enterprise/legacy SOAP |
| TypeBox/Zod | TypeScript | Type inference, runtime validation | TypeScript APIs |

Got: Schema system picked by ecosystem, perf, evolution needs.

If fail: Unsure? Start with JSON Schema — broadest tooling, layers onto existing JSON APIs.

Step 2: Design Core Schema

JSON Schema example:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/schemas/measurement/v1",
  "title": "Measurement",
  "description": "A sensor measurement reading",
  "type": "object",
  "required": ["sensor_id", "value", "unit", "timestamp"],
  "properties": {
    "sensor_id": {
      "type": "string",
      "pattern": "^[a-z]+-[0-9]+$",
      "description": "Unique sensor identifier (lowercase-digits format)"
    },
    "value": {
      "type": "number",
      "description": "Measured value"
    },
    "unit": {
      "type": "string",
      "enum": ["celsius", "fahrenheit", "kelvin", "percent", "ppm"],
      "description": "Unit of measurement"
    },
    "timestamp": {
      "type": "string",
      "format": "date-time",
      "description": "ISO 8601 timestamp with timezone"
    },
    "metadata": {
      "type": "object",
      "additionalProperties": {"type": "string"},
      "description": "Optional key-value metadata"
    }
  },
  "additionalProperties": false
}

Protocol Buffers example:

syntax = "proto3";
package sensors.v1;

import "google/protobuf/timestamp.proto";

// Measurement represents a single sensor reading.
message Measurement {
  string sensor_id = 1;         // Unique sensor identifier
  double value = 2;             // Measured value
  Unit unit = 3;                // Unit of measurement
  google.protobuf.Timestamp timestamp = 4;
  map<string, string> metadata = 5; // Optional key-value metadata
}

enum Unit {
  UNIT_UNSPECIFIED = 0;
  UNIT_CELSIUS = 1;
  UNIT_FAHRENHEIT = 2;
  UNIT_KELVIN = 3;
  UNIT_PERCENT = 4;
  UNIT_PPM = 5;
}

Apache Avro example:

{
  "type": "record",
  "name": "Measurement",
  "namespace": "com.example.sensors",
  "doc": "A sensor measurement reading",
  "fields": [
    {"name": "sensor_id", "type": "string", "doc": "Unique sensor identifier"},
    {"name": "value", "type": "double", "doc": "Measured value"},
    {"name": "unit", "type": {"type": "enum", "name": "Unit", "symbols": ["CELSIUS", "FAHRENHEIT", "KELVIN", "PERCENT", "PPM"]}},
    {"name": "timestamp", "type": {"type": "long", "logicalType": "timestamp-millis"}},
    {"name": "metadata", "type": ["null", {"type": "map", "values": "string"}], "default": null}
  ]
}

Got: Schema self-documenting. Descriptions, constraints, clear types.

If fail: Data model not stable? Mark schema draft, don't publish to registry.
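
The constraints in the schemas above can also be mirrored in application code. A minimal sketch, assuming a plain Python dataclass is an acceptable in-memory model; `Measurement`, `Unit`, and `measurement_from_dict` are hypothetical names for illustration, not part of any schema library:

```python
import re
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum

# Same pattern as the JSON Schema example; fullmatch() avoids the
# trailing-newline quirk of "$" in Python regexes.
SENSOR_ID_PATTERN = re.compile(r"[a-z]+-[0-9]+")


class Unit(Enum):
    CELSIUS = "celsius"
    FAHRENHEIT = "fahrenheit"
    KELVIN = "kelvin"
    PERCENT = "percent"
    PPM = "ppm"


@dataclass(frozen=True)
class Measurement:
    sensor_id: str
    value: float
    unit: Unit
    timestamp: datetime
    metadata: dict[str, str] = field(default_factory=dict)  # optional field

    def __post_init__(self) -> None:
        if not SENSOR_ID_PATTERN.fullmatch(self.sensor_id):
            raise ValueError(f"bad sensor_id: {self.sensor_id!r}")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(self.value, bool) or not isinstance(self.value, (int, float)):
            raise ValueError(f"value must be a number, got {self.value!r}")
        if self.timestamp.tzinfo is None:
            raise ValueError("timestamp must carry a timezone")


def measurement_from_dict(raw: dict) -> Measurement:
    """Hypothetical helper: build a Measurement from a decoded JSON object."""
    return Measurement(
        sensor_id=raw["sensor_id"],
        value=raw["value"],
        unit=Unit(raw["unit"]),  # raises ValueError on an unknown unit
        timestamp=datetime.fromisoformat(raw["timestamp"]),
        metadata=dict(raw.get("metadata", {})),
    )
```

Missing required keys surface as `KeyError`, constraint violations as `ValueError`. Timestamps in the usage below use an explicit `+00:00` offset because `datetime.fromisoformat` only accepts a trailing `Z` from Python 3.11 on.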

Step 3: Plan Schema Evolution

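Compat rules (backward: readers on the new schema read old data; forward: readers on the old schema read new data):

| Change | Backwards Compatible? | Forwards Compatible? | Safe? |
|--------|-----------------------|----------------------|-------|
| Add optional field | Yes | Yes | Yes |
| Add required field | No | Yes | No (existing data and producers lack the field) |
| Remove optional field | Yes | Yes | Careful (producers may still send) |
| Remove required field | Yes | No | Careful (old consumers still expect it) |
| Rename a field | No | No | No (use alias + deprecation) |
| Change field type | No | No | No (add new field, deprecate old) |
| Add enum value | Yes | No (unless consumers ignore unknown) | Depends on implementation |
| Remove enum value | No | Yes | No |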
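
The field-presence rules can be checked mechanically. A rough sketch, assuming each schema is reduced to a `{field_name: is_required}` mapping; `check_compat` is a hypothetical helper and ignores types, renames, and enums:

```python
def check_compat(old: dict[str, bool], new: dict[str, bool]) -> dict[str, bool]:
    """Compare two schemas given as {field_name: is_required}.

    backward: readers on `new` can read data written with `old`.
    forward:  readers on `old` can read data written with `new`.
    """
    # New readers fail if they require a field old writers did not have to send.
    backward = all(old.get(name, False) for name, required in new.items() if required)
    # Old readers fail if a field they require is no longer guaranteed.
    forward = all(new.get(name, False) for name, required in old.items() if required)
    return {"backward": backward, "forward": forward}


v1 = {"sensor_id": True, "value": True}
print(check_compat(v1, {**v1, "location": False}))  # {'backward': True, 'forward': True}
print(check_compat(v1, {**v1, "location": True}))   # {'backward': False, 'forward': True}
```

A registry's own compatibility check should stay the source of truth; this only illustrates the reasoning behind the required/optional rows.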

Safe evolution:

Only add optional fields with sensible defaults
Never remove or rename — deprecate instead
Version the schema in the identifier (v1, v2)
Use a schema registry for binary formats (Confluent Schema Registry for Avro/Protobuf)

Protobuf evolution rules:

// v1 — original
message Measurement {
  string sensor_id = 1;
  double value = 2;
  Unit unit = 3;
}

// v2 — safe evolution
message Measurement {
  string sensor_id = 1;
  double value = 2;
  Unit unit = 3;
  // NEW: added in v2 — old clients ignore this field
  google.protobuf.Timestamp timestamp = 4;
  // REMOVED: old_sensor_name (field 6), replaced by sensor_id.
  // Reserve its number and name so neither gets reused.
  reserved 6;
  reserved "old_sensor_name";
}
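
Field numbers are the wire identity in Protobuf, so reuse is the main hazard. A small sketch of a pre-merge check, assuming field names and numbers have already been extracted from the two `.proto` versions into dicts; `find_number_conflicts` is a hypothetical helper, not a protoc feature:

```python
def find_number_conflicts(
    old_fields: dict[str, int],
    new_fields: dict[str, int],
    reserved_numbers: set[int],
    reserved_names: set[str],
) -> list[str]:
    """Return human-readable problems; an empty list means no conflict found."""
    problems = []
    for name, number in new_fields.items():
        if number in reserved_numbers:
            problems.append(f"{name} uses reserved number {number}")
        if name in reserved_names:
            problems.append(f"{name} is a reserved name")
        if name in old_fields and old_fields[name] != number:
            problems.append(f"{name} changed number {old_fields[name]} -> {number}")
    return problems


v1 = {"sensor_id": 1, "value": 2, "unit": 3}
v2 = {**v1, "timestamp": 4}
print(find_number_conflicts(v1, v2, {6}, {"old_sensor_name"}))  # []
```

The Protobuf compiler itself rejects use of reserved numbers and names inside one file; a check like this is only useful for comparing across versions.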

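JSON Schema versioning (allOf can only narrow v1, never widen it: if v1 sets "additionalProperties": false, it rejects the new location property, so leave v1 open and, if needed, close the combined schema with "unevaluatedProperties": false):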

{
  "$id": "https://example.com/schemas/measurement/v2",
  "allOf": [
    {"$ref": "https://example.com/schemas/measurement/v1"},
    {
      "properties": {
        "location": {
          "type": "string",
          "description": "Added in v2: GPS coordinates"
        }
      }
    }
  ]
}

Got: Evolution plan documented. Safe changes vs new versions clear.

If fail: Breaking change unavoidable? Version schema (v1 → v2), keep parallel support during migration.
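
Parallel support during migration often means upgrading old payloads on read. A minimal sketch, assuming each payload carries a hypothetical `schema_version` field and that unversioned payloads are v1; neither assumption comes from the schemas above:

```python
import copy

CURRENT_VERSION = 2


def _v1_to_v2(payload: dict) -> dict:
    upgraded = copy.deepcopy(payload)      # never mutate the caller's data
    upgraded.setdefault("location", None)  # optional field added in v2
    upgraded["schema_version"] = 2
    return upgraded


# One upgrader per version step; add {2: _v2_to_v3} when v3 arrives.
UPGRADERS = {1: _v1_to_v2}


def upgrade(payload: dict) -> dict:
    """Upgrade a payload one version at a time until it is current."""
    version = payload.get("schema_version", 1)  # assumption: unversioned means v1
    while version < CURRENT_VERSION:
        payload = UPGRADERS[version](payload)
        version = payload["schema_version"]
    if version != CURRENT_VERSION:
        raise ValueError(f"unsupported schema_version: {version}")
    return payload
```

Chaining single-step upgraders keeps each migration small, and readers only ever deal with the current shape.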

Step 4: Impl Schema Validation

# JSON Schema validation (Python, jsonschema >= 4.0 for json_path)
import json
from jsonschema import Draft202012Validator

with open("measurement_v1.json") as f:
    schema = json.load(f)

validator = Draft202012Validator(schema)

def validate_measurement(data: dict) -> list[str]:
    """Validate a measurement against the schema. Returns list of errors."""
    # iter_errors() reports every violation; validate() raises only one.
    errors = sorted(validator.iter_errors(data), key=lambda e: e.json_path)
    return [f"{e.json_path}: {e.message}" for e in errors]

# Usage
errors = validate_measurement({
    "sensor_id": "s-01", "value": "not_a_number",
    "unit": "celsius", "timestamp": "2025-03-01T12:00:00Z",
})
# → ["$.value: 'not_a_number' is not of type 'number'"]

// TypeScript with Zod (runtime + compile-time)
import { z } from 'zod';

const MeasurementSchema = z.object({
  sensor_id: z.string().regex(/^[a-z]+-[0-9]+$/),
  value: z.number(),
  unit: z.enum(['celsius', 'fahrenheit', 'kelvin', 'percent', 'ppm']),
  // Without { offset: true }, only UTC timestamps ending in "Z" pass
  timestamp: z.string().datetime({ offset: true }),
  metadata: z.record(z.string(), z.string()).optional(),
}).strict(); // reject unknown keys, like additionalProperties: false

type Measurement = z.infer<typeof MeasurementSchema>;

// Validation (inputData: the untrusted payload received at the boundary)
const result = MeasurementSchema.safeParse(inputData);
if (!result.success) {
  console.error(result.error.issues);
}

Got: Validation runs on all incoming data at system boundaries (API endpoints, file ingestion).

If fail: Log validation errors with full payload (redact sensitive fields) for debugging.
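
Redaction before logging can be sketched as follows. The key names in `SENSITIVE_KEYS` and the logger name are hypothetical; adjust them to the fields the payloads actually carry:

```python
import json
import logging

logger = logging.getLogger("schema.validation")  # hypothetical logger name
SENSITIVE_KEYS = {"api_key", "token", "password"}  # hypothetical field names


def redact(value):
    """Return a copy of value with sensitive keys masked at any depth."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def log_validation_failure(payload: dict, errors: list[str]) -> str:
    """Log the errors with the redacted payload; return the logged record."""
    record = json.dumps({"errors": errors, "payload": redact(payload)}, sort_keys=True)
    logger.warning("schema validation failed: %s", record)
    return record
```

Masking by key name misses secrets stored under unexpected keys, so an allow-list of loggable fields may be the safer choice for sensitive systems.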

Step 5: Document Schema

Make schema doc page:

# Measurement Schema (v1)

## Overview
Represents a single sensor reading with metadata.

## Fields
| Field | Type | Required | Description | Constraints |
|-------|------|----------|-------------|-------------|
| sensor_id | string | Yes | Unique sensor ID | Pattern: `^[a-z]+-[0-9]+$` |
| value | number | Yes | Measured value | Any valid IEEE 754 double |
| unit | enum | Yes | Unit of measurement | One of: celsius, fahrenheit, kelvin, percent, ppm |
| timestamp | string | Yes | Reading time | ISO 8601 with timezone |
| metadata | object | No | Key-value pairs | String keys and values |

## Changelog
| Version | Date | Changes |
|---------|------|---------|
| v1 | 2025-03-01 | Initial schema |

## Compatibility
- **Backwards**: Consumers on later versions can still read v1 data
- **Forwards**: Consumers of v1 will continue to work with data from later versions
- **Policy**: Only additive, optional field changes between minor versions

Got: Docs auto-generated or stay in sync with schema definition.

If fail: Docs drift from schema? Add CI check validating docs against schema source.
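
A CI drift check can be as simple as rendering the expected table cells from the schema and looking for them in the docs. A sketch, assuming the docs use a field table whose first three columns are name, type, and required; both function names are hypothetical:

```python
def render_field_rows(schema: dict) -> list[str]:
    """Render the leading '| name | type | required |' cells for each property."""
    required = set(schema.get("required", []))
    rows = []
    for name, spec in schema.get("properties", {}).items():
        kind = "enum" if "enum" in spec else spec.get("type", "any")
        flag = "Yes" if name in required else "No"
        rows.append(f"| {name} | {kind} | {flag} |")
    return rows


def find_undocumented_fields(schema: dict, doc_text: str) -> list[str]:
    """Return property names whose rendered cells do not appear in doc_text."""
    names = list(schema.get("properties", {}))
    rows = render_field_rows(schema)
    return [name for name, row in zip(names, rows) if row not in doc_text]
```

In CI, a non-empty result from `find_undocumented_fields` would fail the build. The check is whitespace-sensitive, so it suits generated or consistently formatted tables.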

Checks

Schema uses right system (JSON Schema, Protobuf, Avro)
All fields have types, descriptions, constraints
Required vs optional fields explicit
Evolution strategy documented (safe changes, versioning policy)
Validation at system boundaries
Schema versioned with changelog
Round-trip test: serialize → deserialize → compare, no data loss
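
The round-trip check can be sketched with the standard `json` module; `round_trips` is a hypothetical helper, and binary formats would swap in their own encode and decode calls:

```python
import json


def round_trips(record: dict) -> bool:
    """Serialize, deserialize, and compare; True means no data was lost."""
    encoded = json.dumps(record, sort_keys=True)
    decoded = json.loads(encoded)
    return decoded == record and json.dumps(decoded, sort_keys=True) == encoded


print(round_trips({"sensor_id": "s-01", "value": 21.5}))  # True
print(round_trips({"sensor_id": "s-01", "tags": ("a", "b")}))  # False: tuple comes back as list
```

The second case shows the kind of silent loss this test exists to catch: JSON has no tuple type, so the decoded value differs from the input.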

Pitfalls

Over-constraining too early: Strict validation on new schema blocks iteration. Start permissive (additionalProperties: true), tighten later.
No default values: New required field without default breaks existing data. Always provide defaults for new fields.
Ignoring null: Many schemas don't handle null/missing cleanly. Be explicit: nullable vs optional.
Version in payload, not URL: Long-lived data (storage, events) → embed schema version in data itself, not just endpoint URL.
Enum exhaustiveness: New enum value can crash consumers using exhaustive switch. Document: unknown values handled gracefully.
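
Graceful handling of unknown enum values can look like this on the consumer side. A sketch, assuming a fallback member is acceptable for the application; `UNKNOWN` is a hypothetical addition, similar in spirit to `UNIT_UNSPECIFIED` in the Protobuf example:

```python
from enum import Enum


class Unit(Enum):
    UNKNOWN = "unknown"  # hypothetical fallback for values added after this reader shipped
    CELSIUS = "celsius"
    FAHRENHEIT = "fahrenheit"
    KELVIN = "kelvin"
    PERCENT = "percent"
    PPM = "ppm"

    @classmethod
    def _missing_(cls, value):
        # Called by Enum when no member matches; fall back instead of raising.
        return cls.UNKNOWN


print(Unit("kelvin"))   # Unit.KELVIN
print(Unit("rankine"))  # Unit.UNKNOWN
```

Consumers can then branch on `Unit.UNKNOWN` explicitly (skip, log, or pass through) instead of crashing on a value a newer producer introduced.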

GitHub repository

pjt222/agent-almanac

Path: i18n/caveman/skills/design-serialization-schema

agents, agentskills, ai-assisted-development, claude-code, skills, teams

FAQ

What is the design-serialization-schema skill?

design-serialization-schema is a Claude Skill by pjt222. Skills package instructions and resources that Claude loads on demand, so Claude can perform design-serialization-schema-related tasks without extra prompting.

How do I install design-serialization-schema?

Use the install commands on this page: add design-serialization-schema to Claude Code as a plugin, or clone its repository into your skills directory, then restart Claude so it picks up the skill.

What category does design-serialization-schema belong to?

design-serialization-schema is in the Testing category, tagged word, api, automation, design and data.

Is design-serialization-schema free to use?

Yes. design-serialization-schema is listed on AIMCP and free to install. It runs inside Claude, so no separate service account is required to use the skill itself.