Guides

Schemas

Define and evolve message schemas

Overview

The Schema Registry provides a centralized repository for managing and validating schemas for your Kafka data. Schemas define the structure of your messages and ensure that producers and consumers agree on the data format.

Using schemas helps prevent data corruption, enables safe schema evolution, and provides documentation for your event structures.

Schema Types

Avro

Apache Avro is a compact, fast, binary data format. It provides rich data structures and excellent schema evolution support. Recommended for most use cases.
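As a concrete illustration, an Avro record schema for a hypothetical orders event might look like the following. The topic, namespace, and field names are made-up examples, not part of this API; schemas are registered as JSON text.

```python
import json

# Hypothetical Avro record schema for a "crm-orders" value (illustrative only).
order_schema = {
    "type": "record",
    "name": "Order",
    "namespace": "com.example.sales",
    "doc": "An order event produced by the CRM system",
    "fields": [
        {"name": "order_id", "type": "string", "doc": "Unique order identifier"},
        {"name": "amount", "type": "double", "doc": "Order total"},
        # Optional field: a union with null plus a default, so older readers
        # can still resolve records that lack it.
        {"name": "currency", "type": ["null", "string"], "default": None},
    ],
}

# The registry stores the schema as JSON text.
schema_text = json.dumps(order_schema)
```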

JSON Schema

JSON Schema validates JSON documents. It's human-readable and works well when you need to inspect messages directly, though message sizes are larger than with Avro.
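For comparison, a JSON Schema describing the same kind of hypothetical orders payload might look like this (property names are illustrative):

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Order",
  "type": "object",
  "properties": {
    "order_id": { "type": "string" },
    "amount": { "type": "number" },
    "currency": { "type": ["string", "null"] }
  },
  "required": ["order_id", "amount"]
}
```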

Protobuf (Coming Soon)

Protocol Buffers is Google's language-neutral serialization format. It's very compact and efficient, with excellent multi-language support. Not yet supported by this API.

Subject Naming Convention

Schema subjects follow the TopicNameStrategy, where the subject is derived from the topic name with a suffix indicating whether it's for keys or values:

{topic_name}-{key|value}

Since topic names follow the landing zone naming convention, schema subjects are structured as:

Dedicated Clusters

{business_unit}-{stage}-{topic_identifier}-{key|value}

Example: sales-dev-crm-orders-value

Shared Clusters

{landing_zone}-{business_unit}-{stage}-{topic_identifier}-{key|value}

Example: edh-shared-scada-dev-crm-orders-value

  • landing_zone: Shared cluster landing zone identifier (e.g., "edh-shared"); applies to shared clusters only
  • business_unit: Your organization's identifier (e.g., "scada", "sales")
  • stage: Environment stage (dev, qas, or run)
  • topic_identifier: The topic's identifier (e.g., "crm-orders")
  • key|value: Whether this schema is for message keys or values
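Under this convention, building a subject name is a simple concatenation of the topic name and the subject type. A minimal sketch (the helper name is illustrative, not part of the API, which constructs the subject for you):

```python
def build_subject(topic_name: str, subject_type: str) -> str:
    """Append the key/value suffix to a topic name, per TopicNameStrategy."""
    if subject_type not in ("key", "value"):
        raise ValueError("subject_type must be 'key' or 'value'")
    return f"{topic_name}-{subject_type}"

# Dedicated cluster: the topic name already encodes business unit, stage,
# and topic identifier.
build_subject("sales-dev-crm-orders", "value")
# Shared cluster: the topic name is additionally prefixed with the landing zone.
build_subject("edh-shared-scada-dev-crm-orders", "value")
```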

Note

When creating a schema, you provide the topic name and subject type (key or value). The system automatically constructs the full subject name by appending the subject type suffix.

Schema Compatibility

Compatibility rules determine what schema changes are allowed:

Mode     | Description                           | Allowed Changes
---------|---------------------------------------|------------------------------------
BACKWARD | New schema can read old data          | Delete fields, add optional fields
FORWARD  | Old schema can read new data          | Add fields, delete optional fields
FULL     | Both backward and forward compatible  | Add/delete optional fields only
NONE     | No compatibility checking             | Any change allowed
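To make the BACKWARD rule concrete, here is a deliberately simplified check for flat record schemas: a new (reader) schema can read old data as long as every field it adds carries a default. Real registries also validate type promotions and nested types, which this sketch skips; the function name is illustrative.

```python
def is_backward_compatible(old_fields, new_fields):
    """Simplified BACKWARD check for flat Avro-style record schemas.

    Deleting a field is safe for the new reader; adding a field is safe
    only if the new schema supplies a default for it.
    """
    old_names = {f["name"] for f in old_fields}
    for field in new_fields:
        if field["name"] not in old_names and "default" not in field:
            return False  # added field without a default breaks old data
    return True

old = [{"name": "order_id", "type": "string"}]
ok = [{"name": "order_id", "type": "string"},
      {"name": "currency", "type": ["null", "string"], "default": None}]
bad = [{"name": "order_id", "type": "string"},
       {"name": "currency", "type": "string"}]  # no default: incompatible

is_backward_compatible(old, ok)   # True
is_backward_compatible(old, bad)  # False
```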

Schema Evolution

As your data requirements change, you'll need to evolve your schemas. Safe evolution practices:

  • Always provide default values for new fields
  • Mark fields as optional when they might not always be present
  • Never rename fields; add new ones and deprecate old ones
  • Don't change field types; add new fields with the new type
  • Test schema changes against existing data before deploying
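The first two practices can be illustrated by simulating how a reader resolves an old record against a newer schema: fields the old record lacks are filled from the reader schema's defaults. This is a simplified sketch of Avro-style schema resolution (real resolution also handles type promotion and unions); all names are illustrative.

```python
def read_with_reader_schema(record, reader_fields):
    """Resolve a flat record against a newer reader schema.

    Fields present in the record are passed through; fields missing from
    the record are filled from the reader schema's defaults.
    """
    out = {}
    for field in reader_fields:
        if field["name"] in record:
            out[field["name"]] = record[field["name"]]
        else:
            out[field["name"]] = field["default"]  # why defaults are required
    return out

# A record written with the old schema, before "currency" existed.
old_record = {"order_id": "A-1", "amount": 9.5}

# The evolved schema adds "currency" as an optional field with a default.
new_fields = [
    {"name": "order_id", "type": "string"},
    {"name": "amount", "type": "double"},
    {"name": "currency", "type": ["null", "string"], "default": None},
]

read_with_reader_schema(old_record, new_fields)
# {"order_id": "A-1", "amount": 9.5, "currency": None}
```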

Best Practices

  • Always use schemas for production topics
  • Choose Avro for most use cases due to its compact size and evolution support
  • Use BACKWARD compatibility for consumer-first development
  • Document your schemas with descriptions and field documentation
  • Version your schemas and track changes over time
  • Test schema compatibility before registering new versions
  • Keep schemas simple and focused on a single business entity