Data Formats (Serialization)
Protocol Buffers and Descriptors
A descriptor defines the serialization format for a record’s native data. Descriptors are defined per entity within a store, so records in the same store can use different serialization formats while each stream of records through the store still has a clearly defined schema. For details on importing and creating descriptors in DeltaStream, see Working with ProtoBuf Serialized Data and DeltaStream Descriptors and CREATE DESCRIPTOR_SOURCE. Currently, DeltaStream uses descriptors to support data in ProtoBuf format.
JSON
DeltaStream assumes the JSON serialization format for an entity if no schema registry is defined for the corresponding store and no descriptor is defined for the entity. If a schema registry is present in the store and a descriptor is also defined for an entity, DeltaStream uses the descriptor to serialize that entity.
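The selection rules above can be read as a small decision function. The following Python sketch is illustrative only and is not DeltaStream code: the names `Format` and `resolve_format` are hypothetical, and two of the four cases (descriptor without a schema registry, and schema registry without a descriptor) are not stated in the text above, so their results here are assumptions.

```python
from enum import Enum


class Format(Enum):
    """Hypothetical labels for how an entity's data is (de)serialized."""
    JSON = "json"
    DESCRIPTOR = "descriptor"            # currently ProtoBuf, per the text above
    SCHEMA_REGISTRY = "schema_registry"  # e.g. Avro


def resolve_format(store_has_registry: bool, entity_has_descriptor: bool) -> Format:
    """Pick a serialization source for an entity.

    Stated in the documentation:
      - no registry and no descriptor -> JSON
      - registry and descriptor       -> descriptor
    Assumed for illustration only:
      - descriptor without a registry -> descriptor
      - registry without a descriptor -> schema registry
    """
    if entity_has_descriptor:
        # A descriptor takes precedence, even when a registry is present.
        return Format.DESCRIPTOR
    if store_has_registry:
        # Assumption: without a descriptor, the registry supplies the schema.
        return Format.SCHEMA_REGISTRY
    # Neither is defined, so JSON is assumed.
    return Format.JSON
```

For example, `resolve_format(False, False)` yields `Format.JSON`, while `resolve_format(True, True)` yields `Format.DESCRIPTOR`.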
Avro and Schema Registry
A schema registry is a service that manages message schemas in streaming stores such as Apache Kafka. Message schemas are used to serialize and deserialize the messages stored in entities. In DeltaStream, a schema registry object represents a schema registry service and can be used to fetch and store schemas for entities in that service. A store can use only one schema registry at a time, but a single schema registry can be used by multiple stores. When a schema registry is attached to a store, any entity in that store whose serialization format requires a schema registry uses the store’s attached schema registry to fetch the schemas and metadata needed to marshal and unmarshal data events. The schema for an event, fetched from the schema registry, should not be confused with a database schema, which is used to organize relations.
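The relationship described above (at most one schema registry per store, with one registry shared by many stores) can be modeled in a few lines. This Python sketch is a conceptual model under stated assumptions, not the DeltaStream API: the classes `SchemaRegistry` and `Store`, their methods, and the sample names `pageviews`, `kafka_a`, and `kafka_b` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SchemaRegistry:
    """Hypothetical stand-in for a schema registry service."""
    name: str
    schemas: Dict[str, str] = field(default_factory=dict)  # entity name -> schema text

    def fetch(self, entity: str) -> str:
        # Raises KeyError if no schema is registered for the entity.
        return self.schemas[entity]


@dataclass
class Store:
    """Hypothetical store that holds at most one registry at a time."""
    name: str
    schema_registry: Optional[SchemaRegistry] = None

    def attach(self, registry: SchemaRegistry) -> None:
        # Attaching replaces any previous registry: one registry per store.
        self.schema_registry = registry

    def schema_for(self, entity: str) -> str:
        if self.schema_registry is None:
            raise LookupError(f"store {self.name!r} has no schema registry attached")
        return self.schema_registry.fetch(entity)


# One registry shared by two stores.
registry = SchemaRegistry("shared_sr", {"pageviews": '{"type": "record", "name": "Pageview", "fields": []}'})
store_a = Store("kafka_a")
store_b = Store("kafka_b")
store_a.attach(registry)
store_b.attach(registry)
```

Both stores resolve the schema for `pageviews` through the same registry object, while each store still references only a single registry.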