-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for lambda/higher order functions #14205
Comments
Hello! I'm interested in working on this. Agreed that tweaking the existing
A more prohibitive interface certainly seems like it would help implementation here, albeit at the expense of consumer flexibility. Any thoughts/feedback would be much appreciated. Thanks! |
Hi @rkrishn7, have you already started any work? Since creating this issue, I’ve been working on a PoC to refine the proposal, and I managed to get it working. I’d like to continue with it if you’re OK with that—I probably should have assigned the issue to myself earlier, my bad 🤦🏻♂️! Anyway, many thanks for your interest and for your questionings, and I would really appreciate your input in futures questions (I expect many to arise!)
I added a new ScalarUDF method invoke_with_lambda_args that receives ScalarFunctionArgs instead of ScalarFunctionArgs, and has a default implementation calling invoke_with_args, returning an error if any arg is a lambda
The lambda implementation should build a RecordBatch, likely using some other concrete argument, and pass it to the lambda PhysicalExpr evaluate method That said, if you have any thoughts on the current interface, I’d love to hear your feedback. It's a |
Lambda syntax only works with databricks dialect, proposed fix here apache/datafusion-sqlparser-rs#1686 |
Hi @gstvg! No worries at all, please continue working on it since you've started! Happy to take a look at the work you've done already and give feedback where it makes sense 👍🏾 |
Is your feature request related to a problem or challenge?
Some engines, like DuckDB and Clickhouse, supports lambda functions, like:
There's already support for the syntax in sqlparser-rs
From #12206 (comment) by @jayzhan211
Describe the solution you'd like
One of
Expr::Lambda{arg_names: Vec<String>, expr: Expr}
variantScalarFunction.args
:Create a
LambdaPhysicalExpr
that holds the lambda physical expr and returns ScalarValue::Null inPhysicalExpr::evaluate
or an error(similar toNoOp
andUnkownColumn
)Add
ScalarUDFImpl::lambdas_schema
to return the types of the args of all lambda arguments it receives, so a input schema can be built and used to generated theLambdaPhysicalExpr
And then, one of:
ScalarFunctionArgs
generic over the arg type, withColumnarValue
as default, and add a newScalarUDF
methodinvoke_with_lambda_args
/invoke_higher_order_with_args
that receivesScalarFunctionArgs<ColumnarValueOrLambda>
instead ofScalarFunctionArgs
, and has a default implementation callinginvoke_with_args
, returning an error if any arg is an lambda. The lambda arg is created if any childrenPhysicalExpr
is aLambdaPhysicalExpr
physical_exprs: Vec<Arc<dyn PhysicalExpr>>
toScalarFunctionArgs
, so lambda expressions can be extracted fromLambdaScalarUDF
trait (Duplication withScalarUDF
, more work and more code to mantain, less flexible than 1, one more trait to document and to users to reason about)ScalarFunctionArgs.args
toVec<ColumnarValueOrLambda>
instead ofVec<ColumnarValue>
(A lot of breakage, including public)Lambda
variant toColumnarValue
(Even more breakge than 2, and a lambda doesn't quite fit the concept of a columnar value)Describe alternatives you've considered
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: