# Schemas

Define and evolve message schemas.

## Overview
The Schema Registry provides a centralized repository for managing and validating schemas for your Kafka data. Schemas define the structure of your messages and ensure that producers and consumers agree on the data format.
Using schemas helps prevent data corruption, enables safe schema evolution, and provides documentation for your event structures.
## Schema Types
### Avro
Apache Avro is a compact, fast, binary data format. It provides rich data structures and excellent schema evolution support. Recommended for most use cases.
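To make the idea concrete, here is a minimal Avro schema sketch for a hypothetical "crm-orders" value record. The record name, namespace, and fields are illustrative, not part of this platform's conventions:

```python
import json

# Illustrative Avro schema for a hypothetical order record.
order_schema = {
    "type": "record",
    "name": "Order",
    "namespace": "com.example.sales",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "amount", "type": "double"},
        # Optional field: a union with "null" plus a default lets readers
        # fill the value in when it is absent -- the basis of safe evolution.
        {"name": "coupon_code", "type": ["null", "string"], "default": None},
    ],
}

# Schemas are registered and exchanged as JSON strings.
schema_json = json.dumps(order_schema)
```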
### JSON Schema

JSON Schema validates JSON documents. It's human-readable and works well when you need to inspect messages directly, though message sizes are larger than with Avro.
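For comparison, the same hypothetical order record expressed as a JSON Schema. In practice you would validate records with a full validator such as the `jsonschema` package; the minimal required-field check below is only a stand-in to show the shape of the schema:

```python
# Illustrative JSON Schema for the same hypothetical order record.
order_json_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "amount": {"type": "number"},
        "coupon_code": {"type": ["string", "null"]},
    },
    "required": ["order_id", "amount"],
}

def has_required_fields(record: dict, schema: dict) -> bool:
    """Toy check: does the record carry every required property?"""
    return all(key in record for key in schema.get("required", []))

ok = has_required_fields({"order_id": "A-1", "amount": 9.99}, order_json_schema)
missing = has_required_fields({"order_id": "A-1"}, order_json_schema)
```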
### Protobuf (Coming Soon)
Protocol Buffers is Google's language-neutral serialization format. It's very compact and efficient, with excellent multi-language support. Not yet supported by this API.
## Subject Naming Convention
Schema subjects follow the TopicNameStrategy, where the subject is derived from the topic name with a suffix indicating whether it's for keys or values:
`{topic_name}-{key|value}`
Since topic names follow the landing zone naming convention, schema subjects are structured as:
### Dedicated Clusters

`{business_unit}-{stage}-{topic_identifier}-{key|value}`

Example: `sales-dev-crm-orders-value`
### Shared Clusters

`{landing_zone}-{business_unit}-{stage}-{topic_identifier}-{key|value}`

Example: `edh-shared-scada-dev-crm-orders-value`
- `landing_zone`: Shared cluster landing zone identifier (e.g., `edh-shared`)
- `business_unit`: Your organization's identifier (e.g., `scada`, `sales`)
- `stage`: Environment stage (`dev`, `qas`, or `run`)
- `topic_identifier`: The topic's identifier (e.g., `crm-orders`)
- `key|value`: Whether this schema is for message keys or values
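Because the subject is always the topic name plus a `-key` or `-value` suffix, the derivation can be sketched as a small helper. The function name is illustrative:

```python
# Sketch of TopicNameStrategy subject derivation (helper name is illustrative).
def subject_name(topic_name: str, part: str) -> str:
    """Derive the Schema Registry subject for a topic's key or value schema."""
    if part not in ("key", "value"):
        raise ValueError("part must be 'key' or 'value'")
    return f"{topic_name}-{part}"

# Dedicated cluster: the topic name already encodes business unit,
# stage, and topic identifier.
dedicated = subject_name("sales-dev-crm-orders", "value")
# → "sales-dev-crm-orders-value"

# Shared cluster: the topic name additionally carries the landing zone prefix.
shared = subject_name("edh-shared-scada-dev-crm-orders", "value")
# → "edh-shared-scada-dev-crm-orders-value"
```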
## Schema Compatibility
Compatibility rules determine what schema changes are allowed:
| Mode | Description | Allowed Changes |
|---|---|---|
| BACKWARD | New schema can read old data | Delete fields, add optional fields |
| FORWARD | Old schema can read new data | Add fields, delete optional fields |
| FULL | Both backward and forward compatible | Add/delete optional fields only |
| NONE | No compatibility checking | Any change allowed |
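The intuition behind BACKWARD compatibility can be sketched in a few lines: a new schema can read old data only if every field it adds has a default the reader can fall back on. A real registry performs full type-resolution checks; this toy version only inspects field names and defaults:

```python
# Simplified sketch of a BACKWARD compatibility check for record schemas.
# Real registries also resolve type promotions, unions, aliases, etc.
def is_backward_compatible(old_schema: dict, new_schema: dict) -> bool:
    old_fields = {f["name"] for f in old_schema["fields"]}
    for field in new_schema["fields"]:
        if field["name"] not in old_fields and "default" not in field:
            return False  # new required field: old data cannot supply it
    return True

old = {"fields": [{"name": "order_id", "type": "string"}]}

# Adding an optional field with a default is BACKWARD compatible...
ok = is_backward_compatible(old, {"fields": old["fields"] + [
    {"name": "note", "type": ["null", "string"], "default": None}]})

# ...while adding a required field (no default) is not.
bad = is_backward_compatible(old, {"fields": old["fields"] + [
    {"name": "note", "type": "string"}]})
```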
## Schema Evolution
As your data requirements change, you'll need to evolve your schemas. Safe evolution practices:
- Always provide default values for new fields
- Mark fields as optional when they might not always be present
- Never rename fields; add new ones and deprecate old ones
- Don't change field types; add new fields with the new type
- Test schema changes against existing data before deploying
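The practices above boil down to: evolve by adding optional, defaulted fields, never by renaming or retyping existing ones. A sketch with illustrative field names:

```python
import copy

# Version 1 of a hypothetical record schema.
v1 = {
    "type": "record",
    "name": "Order",
    "fields": [{"name": "amount", "type": "double"}],
}

# Safe evolution: copy the schema and append a new optional field
# with a default. Existing fields are left untouched.
v2 = copy.deepcopy(v1)
v2["fields"].append(
    {"name": "currency", "type": ["null", "string"], "default": None}
)
# Unsafe alternatives to avoid: renaming "amount", or changing its
# type in place -- both break existing readers or writers.
```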
## Best Practices
- Always use schemas for production topics
- Choose Avro for most use cases due to its compact size and evolution support
- Use BACKWARD compatibility for consumer-first development
- Document your schemas with descriptions and field documentation
- Version your schemas and track changes over time
- Test schema compatibility before registering new versions
- Keep schemas simple and focused on a single business entity
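When you do register a new version, a Confluent-style Schema Registry accepts a POST to `/subjects/{subject}/versions` whose body wraps the schema as a JSON string. The sketch below only builds the request payload; the registry URL and HTTP client are assumptions for your environment:

```python
import json

# Build the registration payload for POST /subjects/{subject}/versions.
# (Registry URL and any auth are environment-specific assumptions.)
subject = "sales-dev-crm-orders-value"
schema = {
    "type": "record",
    "name": "Order",
    "fields": [{"name": "order_id", "type": "string"}],
}
payload = json.dumps({"schema": json.dumps(schema), "schemaType": "AVRO"})

# e.g., with the `requests` package:
# requests.post(
#     f"{registry_url}/subjects/{subject}/versions",
#     headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
#     data=payload,
# )
```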