When running SHOW SCHEMAS IN iceberg, Trino makes one "list namespaces" API request to the REST catalog to get the top-level namespaces, and then one more "list namespaces" request for each namespace returned by the first request. With several thousand namespaces, this results in thousands of API requests being sent to the Iceberg REST catalog, and the results take a long time to come back. For our use case, we're planning on having thousands to millions of namespaces, so running this command would put a lot of load on our REST catalog.
As an example, with 10000 schemas, Trino made ~10000 API requests to the REST catalog and took 54 seconds to return the list of schemas.
I think this was introduced in #22916 as a way to support nested namespaces. Since the Iceberg REST API can only list the direct children of a given namespace, listing all of them requires recursive "list namespaces" requests.
I checked what Spark does in this case. If I run SHOW SCHEMAS, it makes a single "list namespaces" request and returns only the top-level namespaces.
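To make the request count concrete, here is a minimal sketch of the recursive listing pattern described above. It is not Trino's actual code: FakeRestCatalog, list_all_namespaces, and build_flat_catalog are hypothetical names, and the catalog is an in-memory stand-in that only counts how many "list namespaces" calls it receives.

```python
from typing import Dict, List, Tuple

# A namespace is a tuple of levels, e.g. ("a", "b") for a.b; () is the root.
Namespace = Tuple[str, ...]


class FakeRestCatalog:
    """Hypothetical in-memory stand-in for a REST catalog (not a real client).

    It can only return the direct children of a namespace, and it counts
    every "list namespaces" call it serves.
    """

    def __init__(self, children: Dict[Namespace, List[Namespace]]):
        self.children = children
        self.requests = 0

    def list_namespaces(self, parent: Namespace = ()) -> List[Namespace]:
        self.requests += 1
        return self.children.get(parent, [])


def list_all_namespaces(catalog: FakeRestCatalog, parent: Namespace = ()) -> List[Namespace]:
    """List every namespace under `parent` by recursing into each child."""
    result: List[Namespace] = []
    for ns in catalog.list_namespaces(parent):
        result.append(ns)
        # One extra request per namespace, even if it has no children.
        result.extend(list_all_namespaces(catalog, ns))
    return result


def build_flat_catalog(n: int) -> FakeRestCatalog:
    """Build a catalog with n top-level namespaces and no nested ones."""
    return FakeRestCatalog({(): [(f"ns{i}",) for i in range(n)]})


if __name__ == "__main__":
    recursive = build_flat_catalog(10000)
    list_all_namespaces(recursive)
    print("recursive listing requests:", recursive.requests)

    top_level_only = build_flat_catalog(10000)
    top_level_only.list_namespaces()
    print("top-level listing requests:", top_level_only.requests)
```

With 10000 flat namespaces the recursive walk issues 10001 calls (one for the root plus one per namespace), while the top-level listing issues one, which matches the ~10000 requests observed.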
spark-sql ()> SHOW SCHEMAS;
<top level results...>
Time taken: 0.223 seconds, Fetched 10025 row(s)
If I want to show nested namespaces, I need to explicitly request the list of nested namespaces in the parent:
spark-sql ()> SHOW SCHEMAS in a;
SHOW SCHEMAS in a
a.a
a.b
a.c
a.d
Time taken: 0.174 seconds, Fetched 4 row(s)
Effectively, Trino doesn't support the SHOW SCHEMAS in "catalog.schema"; syntax (see https://trino.io/docs/current/sql/show-schemas.html).
Recursively calling "list namespaces" seems to be the only way to support nested namespaces. @ebyhr does it make sense to control nested namespace support via a config, so that we avoid the recursive calls when a catalog is not configured to query nested namespaces?
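A rough sketch of what such a switch could look like, purely for discussion. The flag name nested_namespace_enabled and the function list_schemas are hypothetical and do not correspond to an existing Trino config property; list_namespaces stands in for whatever call fetches the direct children of a namespace from the catalog.

```python
from typing import Callable, List, Tuple

Namespace = Tuple[str, ...]


def list_schemas(
    list_namespaces: Callable[[Namespace], List[Namespace]],
    nested_namespace_enabled: bool,  # hypothetical config flag
    parent: Namespace = (),
) -> List[Namespace]:
    """Return schemas under `parent`, recursing only when the flag is on."""
    children = list_namespaces(parent)
    if not nested_namespace_enabled:
        # Single request: only the direct children are returned.
        return list(children)
    result: List[Namespace] = []
    for ns in children:
        result.append(ns)
        result.extend(list_schemas(list_namespaces, True, ns))
    return result
```

With the flag off, the cost is one request regardless of how many namespaces exist; with it on, the behaviour is the recursive listing described in the issue.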
Description
Trino version: 463