There is a discrepancy between the way the Java KafkaProtobufSerializer works and the current @kafkajs/confluent-schema-registry. Confluent's documentation tells us that a message serialized to Protobuf looks like:

Byte 0: [Magic_Byte]
Bytes 1-4: [RegistryId]
Bytes 5-end: [Serialized-Protobuf]
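For reference, here is a minimal sketch of that documented layout (the function name `encodeDocumentedFrame` is hypothetical, and it assumes the registry id is a big-endian unsigned 32-bit integer, which is how Confluent documents it):

```ts
// Sketch of the wire format as documented: no message indexes between
// the header and the protobuf payload.
function encodeDocumentedFrame(registryId: number, serializedProtobuf: Buffer): Buffer {
  const header = Buffer.alloc(5);
  header.writeUInt8(0, 0);             // byte 0: magic byte
  header.writeUInt32BE(registryId, 1); // bytes 1-4: schema registry id
  return Buffer.concat([header, serializedProtobuf]);
}
```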
However, this is not how the Java serializer/deserializer actually behaves: between the 5th byte and the start of the serialized protobuf it writes the MessageIndexes. If the .proto file contains just a single message, the 6th byte will be 0. If it contains several messages, the 6th byte represents the size of the index collection and the 7th byte represents the index of the message, both apparently written as zig-zag varints. E.g. for a .proto file containing 3 messages: if we were serializing the first message, the 6th and 7th bytes respectively would be 0x02, 0x00; for the 2nd message, 0x02, 0x02; for the 3rd message, 0x02, 0x04. It gets a bit more complex when dealing with nested types: each level of nesting adds an entry to the collection of message indexes, so more bytes are used to denote the type hierarchy, and the byte immediately before the serialized object identifies the specific type, as above.
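A hedged sketch of that encoding, matching the byte values observed above: a zig-zag varint for the count, then one zig-zag varint per index. The function names are hypothetical; note that the Java MessageIndexes implementation also appears to collapse the common single-index [0] case to one 0x00 byte, which matches the single-message observation:

```ts
// Encode a signed number as a zig-zag varint (the same scheme Avro uses).
function zigZagVarint(n: number): Buffer {
  let v = (n << 1) ^ (n >> 31); // zig-zag: 0 -> 0, 1 -> 2, 2 -> 4, ...
  const bytes: number[] = [];
  do {
    let b = v & 0x7f;
    v >>>= 7;
    if (v !== 0) b |= 0x80; // continuation bit for values over 7 bits
    bytes.push(b);
  } while (v !== 0);
  return Buffer.from(bytes);
}

// Encode the message-index path: count first, then each index.
function encodeMessageIndexes(indexes: number[]): Buffer {
  if (indexes.length === 1 && indexes[0] === 0) {
    return Buffer.from([0x00]); // shortcut for the common first-message case
  }
  return Buffer.concat([zigZagVarint(indexes.length), ...indexes.map(zigZagVarint)]);
}

// encodeMessageIndexes([1])    -> <0x02 0x02>       (2nd top-level message)
// encodeMessageIndexes([2])    -> <0x02 0x04>       (3rd top-level message)
// encodeMessageIndexes([0, 1]) -> <0x04 0x00 0x02>  (2nd nested type of the 1st message)
```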
The difficulty here is that the Java KafkaProtobuf(De)Serializer doesn't recognize messages encoded by confluent-schema-registry. The reason we chose Protobuf as a serialization technology was speed and size, but most importantly cross-platform compatibility, and we lose that with the missing bytes representing the MessageIndexes. See io.confluent:kafka-protobuf-serializer:7.4.0, MessageIndexes L40 and ProtoSchema L2157.
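For illustration, a minimal sketch of a decoder that would tolerate the Java framing, assuming the zig-zag varint encoding described above (the name `splitFrame` is hypothetical, not part of either library):

```ts
// Decode a zig-zag varint value back to a signed number.
function zigZagDecode(v: number): number {
  return (v >>> 1) ^ -(v & 1);
}

// Read one varint from buf starting at offset; return [value, nextOffset].
function readVarint(buf: Buffer, offset: number): [number, number] {
  let result = 0;
  let shift = 0;
  for (;;) {
    const b = buf[offset++];
    result |= (b & 0x7f) << shift;
    if ((b & 0x80) === 0) return [result, offset];
    shift += 7;
  }
}

// Split a Java-framed message into registry id and protobuf payload,
// consuming the message-index block. The single 0x00 shortcut decodes as
// a count of 0, so the loop below handles it for free.
function splitFrame(frame: Buffer): { registryId: number; payload: Buffer } {
  if (frame.readUInt8(0) !== 0) throw new Error('missing magic byte');
  const registryId = frame.readUInt32BE(1);
  let [raw, offset] = readVarint(frame, 5);
  const count = zigZagDecode(raw);
  for (let i = 0; i < count; i++) {
    [, offset] = readVarint(frame, offset); // skip each message index
  }
  return { registryId, payload: frame.subarray(offset) };
}
```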
Thanks