Implement support for Protocol Buffer references (imports) #135
Tested it out; this seems to be the only solution for imports at the moment. A few issues here:
A.proto
B.proto
C.proto
D.proto
Error would be something like this:
What:
Add support for Protocol Buffer imports and schema references.
I believe this is a feature change without any breaking changes.
Why:
Imports are very useful when writing schema files using Protocol Buffers. To register a schema that uses imports, we need to use Schema Registry's `references` feature to let it know where to find those imported types. Otherwise, it fails to register the schema.
Previous work:
I believe this relates to #82
There are two other related PRs open in #92 and #121. As far as I can tell, the first extends `getSchema` to retrieve references from Schema Registry. The second extends `register()` by asking the user to provide the reference schema ids in `userOpts`. This PR attempts to extend `register()` to automatically register needed references when registering Protocol Buffer schemas, and tries to make `getSchema()` create a Protocol Buffer schema instance that includes these references.
How:
When registering a new schema, the `register` method now makes use of a new user-provided `fetchSchema` helper to fetch definitions of imported schemas, as well as a new `referencedSchemas` SchemaHelper that this PR implements for Protocol Buffers, but which could also be implemented for at least JSON schemas.
The `fetchSchema` helper tells us how to fetch schema definitions, so we can get the definitions of any imported files. The user of the library decides how to implement this helper, but I guess a common pattern would be to read the proto from a location on disk.
The `referencedSchemas` SchemaHelper tells us which other schemas are referenced. This PR implements it for Protos and leaves it returning `[]` for AVRO and JSON schemas. It could be implemented for other schema formats now or later.
The `register` method is then extended to use these helpers to find the schema references needed, get their schema definitions, make sure those are all registered recursively, and then include them as the `references` of the newly registered schema.
Schema registry references are a list of `[(Subject, Version)]` where the `Subject` matches the imported name. For protocol buffers, the Subject is the name of the imported protocol buffer.
https://docs.confluent.io/platform/current/schema-registry/serdes-develop/index.html#referenced-schemas
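The registration flow described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `FakeRegistry`, `registerWithReferences`, the `Reference` and `Registered` shapes, and the regex-based import scan are all hypothetical stand-ins, and the in-memory `files` map plays the role of a user-provided `fetchSchema` that would more likely read from disk. The regex is naive and would also match an `import` line inside a block comment.

```typescript
// Hypothetical shapes for illustration; not the library's real types.
interface Reference { name: string; subject: string; version: number }
interface Registered { id: number; version: number; schema: string; references: Reference[] }

// In-memory stand-in for Schema Registry: subject -> registered versions.
class FakeRegistry {
  private nextId = 1
  readonly subjects = new Map<string, Registered[]>()

  register(subject: string, schema: string, references: Reference[]): Registered {
    const versions = this.subjects.get(subject) ?? []
    const existing = versions.find(v => v.schema === schema)
    if (existing) return existing // registering the same schema twice is a no-op
    const entry: Registered = { id: this.nextId++, version: versions.length + 1, schema, references }
    this.subjects.set(subject, [...versions, entry])
    return entry
  }
}

// Plays the role of the referencedSchemas helper: list the files a proto imports.
function referencedSchemas(proto: string): string[] {
  const importPattern = /^\s*import\s+(?:public\s+|weak\s+)?"([^"]+)"\s*;/gm
  const names: string[] = []
  let match: RegExpExecArray | null
  while ((match = importPattern.exec(proto)) !== null) names.push(match[1])
  return names
}

// Register every import first (depth-first), then the schema itself,
// passing the registered imports along as its references.
function registerWithReferences(
  registry: FakeRegistry,
  name: string,
  fetchSchema: (name: string) => string,
  inProgress: Set<string> = new Set<string>(),
): Registered {
  if (inProgress.has(name)) throw new Error(`Circular import involving ${name}`)
  inProgress.add(name)
  const schema = fetchSchema(name)
  const references = referencedSchemas(schema).map(importName => {
    const dep = registerWithReferences(registry, importName, fetchSchema, inProgress)
    // Assumption: the subject is simply the imported file name.
    return { name: importName, subject: importName, version: dep.version }
  })
  inProgress.delete(name)
  return registry.register(name, schema, references)
}

// Example: A imports B and C, and B imports C.
const files: Record<string, string> = {
  'A.proto': 'syntax = "proto3";\nimport "B.proto";\nimport "C.proto";\nmessage A { B b = 1; C c = 2; }',
  'B.proto': 'syntax = "proto3";\nimport "C.proto";\nmessage B { C c = 1; }',
  'C.proto': 'syntax = "proto3";\nmessage C { string id = 1; }',
}
const fetchSchema = (name: string): string => {
  const source = files[name]
  if (source === undefined) throw new Error(`Unknown schema ${name}`)
  return source
}
const registry = new FakeRegistry()
const registered = registerWithReferences(registry, 'A.proto', fetchSchema)
```

In this sketch `C.proto` is reached twice (once through `B.proto`, once directly) but only registered once, because the fake registry returns the existing entry for an identical schema. A real registry call would be asynchronous, so the recursion would need to be `async` as well.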
To get `define` and `encode` to work, I extended the ProtoSchema constructor so it can take references into account when building the schema instance. It fetches the Schema instances for the references and adds them to the Root context for the new Schema instance, so it has all the types it needs.
I then extended `schemaFromConfluentSchema` to pass through the referenced schema.
Testing
I have added tests to validate behaviour in a few cases:
TODO before merging
[...] `this.root.add()` or whether it assumes ownership over its input argument and mutates it.
[...] `protobufjs` library: https://github.com/protobufjs/protobuf.js/blob/master/tests/data/common.proto
Future work
In order to use this new functionality, we should implement support for Message Indexes. Since Protocol Buffers can contain multiple type definitions in a single schema file, each encoded Kafka message includes a Message Index that tells the consumer which Protocol Buffer `Message` type to use for parsing the message. I think this feature will either need to be feature flagged, or a breaking change, since it changes the binary wire format.
https://docs.confluent.io/platform/current/schema-registry/serdes-develop/index.html#wire-format
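To make the wire format change concrete, here is a minimal sketch of how the Message Indexes could be framed, based on my reading of the Confluent wire format page linked above: a magic byte `0`, a 4-byte big-endian schema id, then the index array as zigzag varints prefixed by its length, with the common case `[0]` shortened to a single `0` byte. The function names `writeZigzagVarint`, `encodeMessageIndexes` and `frame` are hypothetical and not part of this library.

```typescript
// Zigzag-encode a signed 32-bit integer and append it as a base-128 varint.
function writeZigzagVarint(value: number, out: number[]): void {
  let n = ((value << 1) ^ (value >> 31)) >>> 0
  while (n > 0x7f) {
    out.push((n & 0x7f) | 0x80) // low 7 bits, continuation bit set
    n >>>= 7
  }
  out.push(n)
}

// Encode the message index path: array length first, then each index.
function encodeMessageIndexes(indexes: number[]): number[] {
  // Shortcut for the first message type in the file: [0] becomes a single 0 byte.
  if (indexes.length === 1 && indexes[0] === 0) return [0]
  const out: number[] = []
  writeZigzagVarint(indexes.length, out)
  for (const index of indexes) writeZigzagVarint(index, out)
  return out
}

// Assemble magic byte + schema id + message indexes + protobuf payload.
function frame(schemaId: number, indexes: number[], payload: Uint8Array): Uint8Array {
  const encodedIndexes = encodeMessageIndexes(indexes)
  const out = new Uint8Array(1 + 4 + encodedIndexes.length + payload.length)
  out[0] = 0 // magic byte
  new DataView(out.buffer).setUint32(1, schemaId, false) // big-endian schema id
  out.set(encodedIndexes, 5)
  out.set(payload, 5 + encodedIndexes.length)
  return out
}
```

A consumer that does not expect these extra bytes would treat them as the start of the protobuf payload, which is why I think this has to be either feature flagged or shipped as a breaking change.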