Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for case-insensitive object names in Iceberg REST catalog #23867

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

mayankvadariya
Copy link
Contributor

@mayankvadariya mayankvadariya commented Oct 22, 2024

Certain Rest catalog implementation(eg. Polaris) supports case-sensitive object(namespace, table, view etc.) names. This change allows querying mixed/upper case letter objects in rest catalog from Trino.
iceberg.rest-catalog.case-insensitive-name-matching controls whether to match lowercase object names in Trino with different case object names in rest catalog with the limitations that only single object name with the same name is supported.

Fixes #23715

Description

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Section
* Add support for case-insensitive object names in Iceberg REST catalog. ({issue}`23715`)

@cla-bot cla-bot bot added the cla-signed label Oct 22, 2024
@github-actions github-actions bot added docs iceberg Iceberg connector labels Oct 22, 2024
@@ -56,6 +60,10 @@ public class TrinoIcebergRestCatalogFactory
private final SecurityProperties securityProperties;
private final boolean uniqueTableLocation;
private final TypeManager typeManager;
private final boolean caseInsensitiveNameMatching;
private final Cache<Namespace, Namespace> remoteNamespaceMappingCache;
private final Cache<TableIdentifier, TableIdentifier> remoteTableMappingCache;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remoteTableMappingCache looks having a cache of views. Could you confirm that? You can check by creating a view, setting the breakpoint listTables and executing SHOW TABLES statement.

Copy link
Contributor Author

@mayankvadariya mayankvadariya Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find 🙌 , thanks for debugging cache states. Apparently some methods require live mapped object names without those being cached. With 23f0b4b, now table/view cache has correct entries. I'll explore if there are possibility of sharing table/view cache now given methods interfering caches now uses live mapped entries instead of loading other object's cache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a commit to use single cache for table/view. Please review.

@mayankvadariya mayankvadariya force-pushed the mayank/add-nested-namespace-support-in-rest-catalog branch 3 times, most recently from a5c27dc to 83ca21d Compare October 29, 2024 03:44
* - `iceberg.rest-catalog.case-insensitive-name-matching`
- Match namespace, table, and view names case insensitively. Defaults to `false`.
* - `iceberg.rest-catalog.case-insensitive-name-matching.cache-ttl`
- [Duration](prop-type-duration) for which case-insensitive namespace, table, and view
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move and view to the next line?

Comment on lines +64 to +65
private final Optional<Cache<Namespace, Namespace>> remoteNamespaceMappingCache;
private final Optional<Cache<TableIdentifier, TableIdentifier>> remoteTableMappingCache;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change to optional type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because cache is not always required unless caseinsenstive is enabled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache provides shareNothingWhenDisabled option. No need to use optional type in my opinion. It reduces readability and also causes warning (e.g. L197) unless we confirm the presence in all places.

@@ -533,10 +581,11 @@ public Optional<ConnectorViewDefinition> getView(ConnectorSession session, Schem
});
}

private Optional<View> getIcebergView(ConnectorSession session, SchemaTableName viewName)
private Optional<View> getIcebergView(ConnectorSession session, SchemaTableName viewName, boolean findLiveView)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new logic is little hard to understand than before. I don't think code readers can understand the meaning of "Live" objects easily.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about using same method and decide based on getCached parameter as done in 6a5a081

Certain Rest catalog implementation(eg. Polaris) supports
case-sensitive object(namespace, table, view etc.) names. This
change allows querying mixed/upper case letter objects in rest
catalog from Trino.
`iceberg.rest-catalog.case-insensitive-name-matching` controls
whether to match lowercase object names in Trino with different case
object names in rest catalog with the limitations that only single
object name with the same name is supported.
@mayankvadariya mayankvadariya force-pushed the mayank/add-nested-namespace-support-in-rest-catalog branch from 83ca21d to 181d065 Compare November 4, 2024 04:00
@mayankvadariya mayankvadariya force-pushed the mayank/add-nested-namespace-support-in-rest-catalog branch from 181d065 to 6a5a081 Compare November 4, 2024 04:04
@mayankvadariya
Copy link
Contributor Author

ran newly added tests by modifying ci.yml. All of the trino-iceberg jobs have passed.

one of the trino-iceberg job was cancelled for some unrelated reason https://github.com/trinodb/trino/actions/runs/11674050614/job/32506029684?pr=23867

@mayankvadariya mayankvadariya force-pushed the mayank/add-nested-namespace-support-in-rest-catalog branch from 6e7a8ee to 4af6c9d Compare November 4, 2024 23:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

Snowflake created objects in external Polaris catalog are not accessible from Trino
3 participants