Dynamic Filters in Trino Rest #292

Open
ihsanciftci opened this issue Aug 1, 2024 · 3 comments

Comments

@ihsanciftci

I have a REST catalog and a query like:

SELECT * FROM mycat.public.comments where media IN (SELECT id FROM mycat.public.media)

or

SELECT * FROM mycat.public.comments where media = (SELECT id FROM mycat.public.media LIMIT 1)

Suppose the comments table is backed by a REST API like GET /{media}/comments, and the media table returns all media.

No constraint reaches applyFilter or getRows.
While debugging the application, I found that RecordPageSourceProvider receives a DynamicFilter parameter, but it isn't passed on to RecordPageSource.

I have a couple of questions:

  1. Is there a built-in Trino feature to call the comments API N times (once for each media, assuming N media are returned)? Or do I need to call the comments API N times manually in getRows?
  2. As described above, I couldn't get the media id in the domain constraint (N media ids, or 1 media id as in the second SQL query). How can I get this constraint?

@Rohlend1

I have the same questions

@Rohlend1

@nineinchnick can you help us with this?

@nineinchnick
Owner

There are a few things to unpack here.

Predicate pushdown support

The API endpoints may only support filters with an equality operator (key=value). If a predicate in the SQL is correctly pushed down, the connector can make one API call per value and return the union of the results.

If an API supported multi-value filters, the pushdown implementation in the connector would have to be more advanced.
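
To illustrate the "one call per value, then union" idea, here is a minimal sketch in plain Java. It does not use the Trino SPI or this connector's actual code: `EqualityFanOut`, `fetchAll`, and the `fetchComments` function (a stand-in for something like GET /{media}/comments) are all hypothetical names, and the list of media ids is assumed to have already been extracted from the pushed-down predicate.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.Function;

public class EqualityFanOut {
    // Hypothetical helper: the endpoint only accepts a single value per request,
    // so we issue one request per distinct value and concatenate the results.
    public static List<String> fetchAll(
            List<String> mediaIds,
            Function<String, List<String>> fetchComments) {
        List<String> rows = new ArrayList<>();
        // LinkedHashSet drops duplicate ids while keeping the original order,
        // so the same endpoint is never called twice.
        for (String id : new LinkedHashSet<>(mediaIds)) {
            rows.addAll(fetchComments.apply(id));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Fake API used only for demonstration: two comments per media id.
        Function<String, List<String>> fakeApi =
                id -> List.of(id + ":first", id + ":second");
        System.out.println(fetchAll(List.of("m1", "m2", "m1"), fakeApi));
    }
}
```

In a real connector the ids would come from the constraint passed to the connector, and the requests might be spread over several splits instead of a single loop; that part is left out here.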

Dynamic filter support

Dynamic filters might be used when a JOIN is not pushed down. They are predicates generated by the engine after it fetches data from one side of the join. Note that predicates with subqueries can be turned into a join, so a query like SELECT * FROM tableA WHERE column IN (SELECT column FROM tableB) could be equivalent to SELECT tableA.* FROM tableA JOIN tableB USING (column), provided the column values in tableB are unique. The tricky part of implementing this is that you need not only dynamic filters but also table statistics, so the engine can assign the right tables to the build and probe sides of the join.
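
As a rough illustration of what a dynamic filter does conceptually, here is a small plain-Java simulation. It is not Trino's implementation and does not use the SPI: `DynamicFilterSketch`, `buildFilter`, and `probe` are hypothetical names, rows are modelled as simple maps, and the "media" key is assumed to be the join column.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class DynamicFilterSketch {
    // Build side: after reading the smaller table, collect the distinct join keys.
    // This set plays the role of the predicate generated at runtime.
    public static Set<String> buildFilter(List<String> buildSideKeys) {
        return new LinkedHashSet<>(buildSideKeys);
    }

    // Probe side: keep only rows whose join key appears in the filter.
    // A connector that receives the filter early enough could instead
    // avoid fetching the non-matching rows at all.
    public static List<Map<String, String>> probe(
            List<Map<String, String>> probeRows,
            Set<String> filter) {
        return probeRows.stream()
                .filter(row -> filter.contains(row.get("media")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<String> filter = buildFilter(List.of("m1", "m3", "m1"));
        List<Map<String, String>> comments = List.of(
                Map.of("media", "m1", "text", "a"),
                Map.of("media", "m2", "text", "b"),
                Map.of("media", "m3", "text", "c"));
        System.out.println(probe(comments, filter).size());
    }
}
```

Which table ends up on the build side is the engine's decision, which is why the table statistics mentioned above matter: without them the engine may pick the side that makes the filter useless for the REST-backed table.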

Also check out a newer connector that's more generic: https://github.com/nineinchnick/trino-openapi
