-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[substrait] Add support for ExtensionTable #13772
base: main
Are you sure you want to change the base?
Conversation
a65e926
to
bac38cc
Compare
"Deserializing user defined logical plan node `{name}` is not supported" | ||
) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SerializerRegistry
trait now has two more methods for handling tables (with default implementations for backwards compatibility), so it makes sense for the existing methods to have default implementations as well.
This will allow implementors to conveniently implement the trait for user-defined logical nodes only or for tables only.
Since the implementations here are perfect as trait defaults, this PR just moves them into the trait itself.
8db92b8
to
a12007b
Compare
@@ -994,8 +994,34 @@ pub async fn from_substrait_rel( | |||
) | |||
.await | |||
} | |||
_ => { | |||
not_impl_err!("Unsupported ReadType: {:?}", &read.as_ref().read_type) | |||
Some(ReadType::ExtensionTable(ext)) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't actually get to handling ExtensionTables in #13803, but I think I've got most what we would need in place for it.
The way I would approach it, based on the work in that PR, is to add a method to the SubstraitConsumer
trait like:
async fn consume_extension_table(&self, extension_table: &ExtensionTable) -> Result<LogicalPlan>
and wire it in here. Then, as a user you would be able to provide your own implementation of the decoder. This might user the SerializerRegistry, but it doesn't necessarily need to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's great! I like where that is going.
My goal here was to add the missing support for reading & writing extension tables leveraging only what's available (to keep the patch as small as possible).
But I do agree that uniform handling of Substrait extensions would make more sense and would overcome some of the current limitations.
I think the code here is really easy to migrate to the new SubstraitConsumer
(and SubstraitProducer
?) interfaces once they're available, by just replacing the SerializerRegistry
calls with the new dedicated read/write methods. But the rest would be mostly unchanged by this migration.
FWIW, once your refactoring is complete, I think there would be no place for SerializerRegistry
anymore, and it should be removed (or at least deprecated).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and SubstraitProducer?)
I haven't gotten around to the producer yet, but if the SubstraitConsumer pattern makes sense to folks it should be easy enough to hammer it out.
I think the code here is really easy to migrate to the new SubstraitConsumer
Would you be open to having #13803 merged first, and then porting your code over?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can do that.
|
Which issue does this PR close?
Closes #13771.
Addresses the first bullet in #13318.
Rationale for this change
Adds support for encoding/decoding custom table providers as
ExtensionTable
s in Substrait.What changes are included in this PR?
Two more methods for
SerializerRegistry
to handleTableSource
s and the necessary changes infrom/to _substrait_plan
to transparently map custom tables to ExtensionTable nodes.Are these changes tested?
A round trip test is included.
Are there any user-facing changes?
No breaking changes, only a couple of convenience changes, such as default implementations for trait methods.