Schema Evolution and Compatibility#

Every service contract changes over time. New fields get added, old fields get removed, types change. In a monolith, you update the schema and redeploy. In microservices, producers and consumers deploy independently. A schema change that breaks consumers causes production failures. Schema evolution rules and tooling exist to prevent this.

Compatibility Modes#

There are four basic compatibility modes; understanding them is essential for operating any schema registry. (Confluent's registry also offers transitive variants such as BACKWARD_TRANSITIVE, which check a new schema against every registered version rather than only the latest.)

Backward compatible: New schema can read data written by the old schema. You can deploy a new consumer before the producer updates. This is the most common mode. Adding an optional field with a default value is backward compatible.

Forward compatible: Old schema can read data written by the new schema. You can deploy a new producer before consumers update. Removing an optional field is forward compatible (old readers ignore the missing field).

Fully compatible: Both backward and forward compatible simultaneously. New and old schemas can read each other’s data. This is the safest mode but the most restrictive.

None: No compatibility checking. Use this only for development or when you can coordinate simultaneous deployment of all producers and consumers.

Avro Schema Evolution Rules#

Avro uses a schema-on-read approach: the reader schema and writer schema can differ, and Avro resolves the differences.

Safe Changes (Backward Compatible)#

Adding a field with a default value:

// v1
{
  "type": "record",
  "name": "OrderEvent",
  "fields": [
    {"name": "order_id", "type": "string"},
    {"name": "amount", "type": "double"}
  ]
}

// v2 - added optional field with default
{
  "type": "record",
  "name": "OrderEvent",
  "fields": [
    {"name": "order_id", "type": "string"},
    {"name": "amount", "type": "double"},
    {"name": "currency", "type": "string", "default": "USD"}
  ]
}

A v2 reader consuming v1 data fills in "USD" for the missing currency field.
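
This resolution is easy to demonstrate. The sketch below uses the fastavro library (an assumption; any Avro implementation that accepts separate writer and reader schemas behaves the same way): it serializes a record with the v1 schema, then deserializes it with the v2 schema.

import io

import fastavro

V1 = fastavro.parse_schema({
    "type": "record", "name": "OrderEvent",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "amount", "type": "double"},
    ],
})
V2 = fastavro.parse_schema({
    "type": "record", "name": "OrderEvent",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "amount", "type": "double"},
        {"name": "currency", "type": "string", "default": "USD"},
    ],
})

buf = io.BytesIO()
# Write with the v1 (writer) schema: no currency field on the wire
fastavro.schemaless_writer(buf, V1, {"order_id": "o-1", "amount": 9.99})
buf.seek(0)

# Read with the v2 (reader) schema: Avro fills in the declared default
record = fastavro.schemaless_reader(buf, writer_schema=V1, reader_schema=V2)
assert record["currency"] == "USD"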

Dangerous Changes#

Adding a field without a default (breaks backward compatibility): A new reader expects a field that does not exist in old data and has no default to fall back on. Conversely, removing a field that lacks a default breaks forward compatibility: old readers expect the field, but new data no longer carries it.

Changing a field type: Avro allows a handful of promotions (int to long, float, or double; long to float or double; float to double; string to bytes and back), but anything else, such as changing string to int, breaks compatibility.

Renaming a field: Avro matches fields by name, so a rename looks like a remove plus an add, which is a breaking change. Instead, add an alias to the reader schema so it can match data written under the old name:

{"name": "total_amount", "type": "double", "aliases": ["amount"]}

Protobuf Schema Evolution Rules#

Protobuf uses field numbers for wire format, not names. This changes the evolution rules significantly.

Safe Changes#

// v1
message OrderEvent {
  string order_id = 1;
  double amount = 2;
}

// v2 - added new field (safe: old readers ignore unknown fields)
message OrderEvent {
  string order_id = 1;
  double amount = 2;
  string currency = 3;  // New field, new number
}

Adding fields with new field numbers is always safe. Old readers skip unknown field numbers.
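
You can see this directly with protoc's raw decode mode, which parses wire data without any schema and therefore reports fields only by number (assuming a serialized OrderEvent saved to order_event.bin):

# Names never travel on the wire; decode_raw can only print field numbers
protoc --decode_raw < order_event.bin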

Breaking Changes#

Reusing a field number: If you remove field 3 and later add a new field 3 with a different type, old data with the original field 3 will be misinterpreted. Always use reserved to prevent reuse:

message OrderEvent {
  string order_id = 1;
  double amount = 2;
  reserved 3;                // Previously "currency": never reuse the number
  reserved "currency";       // Reserve the name too, so it cannot be redefined
  string currency_code = 4;  // New field, new number
}

Changing field types: Changing int32 to string on the same field number is a wire-format break.
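
The reason is visible in the encoding. Every field is prefixed with a tag computed as (field_number << 3) | wire_type, so changing the type changes the tag, and old readers no longer recognize the field they knew. A quick illustration of the arithmetic:

# Protobuf tag: (field_number << 3) | wire_type
FIELD_NUMBER = 2
VARINT, LEN = 0, 2  # wire types: int32 is a varint, string is length-delimited

print((FIELD_NUMBER << 3) | VARINT)  # 16: tag for an int32 in field 2
print((FIELD_NUMBER << 3) | LEN)     # 18: tag for a string in field 2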

Changing field cardinality: Changing between optional and repeated is risky. For scalar numeric fields in proto3, repeated values default to packed encoding, a different wire format from a singular value; string, bytes, and message fields are more forgiving, but relying on that is fragile.

JSON Schema Evolution#

JSON Schema has no wire format, so evolution is purely about validation. The rules are simpler but less enforceable.

Safe Changes#

  • Adding an optional property (not in required)
  • Relaxing a constraint (changing maxLength: 50 to maxLength: 100)
  • Adding a value to an enum
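
For example, this v1-to-v2 change combines all three safe moves (the property names are illustrative):

// v1
{
  "type": "object",
  "required": ["order_id", "amount"],
  "properties": {
    "order_id": {"type": "string", "maxLength": 50},
    "amount": {"type": "number"},
    "status": {"enum": ["pending", "paid"]}
  }
}

// v2 - adds an optional property, relaxes a constraint, extends an enum
{
  "type": "object",
  "required": ["order_id", "amount"],
  "properties": {
    "order_id": {"type": "string", "maxLength": 100},
    "amount": {"type": "number"},
    "status": {"enum": ["pending", "paid", "refunded"]},
    "currency": {"type": "string"}
  }
}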

Breaking Changes#

  • Adding a property to required
  • Tightening a constraint
  • Removing a property that consumers depend on
  • Changing a property type

JSON Schema lacks built-in evolution support. You enforce compatibility through CI checks and versioned schemas.
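
One lightweight CI check: validate a corpus of sample messages produced under the old schema against the candidate schema; if previously valid data now fails, the change is not backward compatible. A minimal sketch using the Python jsonschema package, with illustrative paths:

import json
import pathlib

import jsonschema

new_schema = json.loads(pathlib.Path("schemas/order_event.schema.json").read_text())

# Data that was valid under the old schema must remain valid under the new one
for sample in sorted(pathlib.Path("testdata/samples").glob("*.json")):
    instance = json.loads(sample.read_text())
    jsonschema.validate(instance=instance, schema=new_schema)  # raises ValidationError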

Schema Registries#

Confluent Schema Registry#

The de facto standard for Kafka-based systems. It stores Avro, Protobuf, and JSON Schema definitions and rejects new versions that violate the subject's configured compatibility mode.

# Register a schema
curl -X POST http://schema-registry:8081/subjects/order-events-value/versions \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"schemaType": "AVRO", "schema": "{\"type\":\"record\",\"name\":\"OrderEvent\",\"fields\":[{\"name\":\"order_id\",\"type\":\"string\"},{\"name\":\"amount\",\"type\":\"double\"}]}"}'
# Returns: {"id": 1}

# Set compatibility for a subject
curl -X PUT http://schema-registry:8081/config/order-events-value \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"compatibility": "BACKWARD"}'

# Check compatibility before registering a new version
curl -X POST http://schema-registry:8081/compatibility/subjects/order-events-value/versions/latest \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"schemaType": "AVRO", "schema": "<new schema json>"}'

Producer and consumer clients fetch schemas by ID from the registry at runtime. In Confluent's wire format, every message payload begins with a magic byte (0) and a 4-byte big-endian schema ID, followed by the serialized data.
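
A minimal sketch of parsing that framing from a raw payload (assuming the wire format described above):

import struct

def confluent_schema_id(payload: bytes) -> int:
    """Extract the schema ID from a Confluent-framed Kafka message."""
    magic, schema_id = struct.unpack(">bI", payload[:5])
    if magic != 0:
        raise ValueError("not Confluent wire format")
    # Fetch schema `schema_id` from the registry, then decode payload[5:]
    return schema_id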

Apicurio Registry#

Open-source alternative that supports the same Confluent API plus additional features: schema groups, custom rules, and a broader set of formats (GraphQL, OpenAPI, AsyncAPI).

# Register with Apicurio (Confluent-compatible API)
curl -X POST http://apicurio:8080/apis/ccompat/v7/subjects/order-events-value/versions \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"schemaType": "AVRO", "schema": "<schema json>"}'

Breaking Change Detection in CI#

Protobuf with buf#

# .github/workflows/schema-check.yml
name: schema-check
on: pull_request

jobs:
  schema-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full clone so main is available for --against
      - uses: bufbuild/buf-setup-action@v1
      - run: buf lint
      - run: buf breaking --against '.git#branch=main'

Avro with Schema Registry#

# In CI, test compatibility against the registry before merge
COMPAT=$(curl -s -X POST \
  http://schema-registry:8081/compatibility/subjects/${SUBJECT}/versions/latest \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d "{\"schemaType\": \"AVRO\", \"schema\": $(jq -Rs . < schema.avsc)}")

if echo "$COMPAT" | jq -e '.is_compatible == false' > /dev/null; then
  echo "BREAKING CHANGE DETECTED"
  exit 1
fi

Migration Strategies#

Additive evolution (preferred): Only add optional fields. Never remove or rename. Old and new schemas coexist indefinitely. This is the simplest strategy but leads to schema bloat over time.

Dual-write migration: When a breaking change is unavoidable: (1) deploy a new topic/API version with the new schema, (2) have producers write to both old and new, (3) migrate consumers to the new version, (4) stop writing to the old version, (5) decommission old topic/endpoint.
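
A sketch of step 2, the dual-write phase, using the confluent-kafka Python client (topic names are illustrative and serialization is elided; in practice each topic carries its own registered schema):

from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "kafka:9092"})

def publish_order(order_v1: bytes, order_v2: bytes) -> None:
    # During migration, every event is written to both the old and new topic
    producer.produce("orders.v1", value=order_v1)
    producer.produce("orders.v2", value=order_v2)
    producer.flush()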

Schema versioning in topics: Use versioned topic names (orders.v1, orders.v2) for breaking changes. Consumers migrate at their own pace. Run both topics in parallel until all consumers have migrated.

API versioning for HTTP: Use URL path versioning (/api/v1/orders, /api/v2/orders) or content negotiation (Accept: application/vnd.company.order.v2+json). Run both versions simultaneously with a deprecation timeline.
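
From the client side the two styles look like this (host and media type are illustrative):

# URL path versioning: the version is part of the resource path
curl http://api.example.com/api/v2/orders

# Content negotiation: same path, version selected via the Accept header
curl http://api.example.com/api/orders \
  -H "Accept: application/vnd.company.order.v2+json"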

Practical Rules#

  1. Default to backward compatibility. It is the safest mode for independent deployments.
  2. Never reuse field numbers in Protobuf. Mark removed fields as reserved.
  3. Always add defaults to new Avro fields. Fields without defaults break backward compatibility.
  4. Run compatibility checks in CI. Do not rely on humans to catch breaking changes during code review.
  5. Version your schemas in source control. Track schema files alongside service code so changes are reviewed together.
  6. Plan for deprecation. When adding a new field that replaces an old one, document the timeline for removing the old field and communicate it to consuming teams.