Description
Is your feature request related to a problem or challenge?
iceberg-rust should implement a HadoopCatalog to read file-based catalogs through existing FileIO stores like file://, s3://, s3a://, etc.
I see there was a community PR for this, #1313, which was effectively declined due to concerns over deprecating file-based catalogs.
However, my opinion is that there are surely many users with large Hadoop Catalogs who could benefit from this crate implementing a Hadoop Catalog. As the user from the PR mentions, StaticTable can be ineffective for large catalogs.
Describe the solution you'd like
My proposal is that instead of implementing a Hadoop Catalog with full read-write functionality, the catalog implements read-only functionality. This may fare better with the community because it avoids creating more file-based Hadoop Catalogs, while still giving users who already have these catalogs a way to read them (and potentially migrate to other catalogs).
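To make the read-only idea concrete, here is a rough sketch of the one lookup a read-only Hadoop Catalog needs in order to load a table: finding the current metadata file. It assumes the layout written by the Java Hadoop catalog, where each table directory has a metadata/ folder containing version-hint.text and files named v<N>.metadata.json. The function names parse_version_hint and current_metadata_path are hypothetical and are not part of iceberg-rust. The sketch only uses std::fs against a local path, whereas a real implementation would go through FileIO so that s3:// and s3a:// work too. It also ignores compressed metadata files and the case where the version hint is missing.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Parse the contents of `version-hint.text` into a version number.
/// Returns `None` if the hint is not a non-negative integer.
/// (Hypothetical helper, not an iceberg-rust API.)
fn parse_version_hint(contents: &str) -> Option<u32> {
    contents.trim().parse().ok()
}

/// Resolve the current metadata file for a Hadoop-style table stored on the
/// local filesystem. Assumes the `metadata/version-hint.text` and
/// `metadata/v<N>.metadata.json` naming convention; nothing is written.
/// (Hypothetical helper, not an iceberg-rust API.)
fn current_metadata_path(table_dir: &Path) -> io::Result<PathBuf> {
    let metadata_dir = table_dir.join("metadata");

    // The hint file holds the latest committed version as plain text.
    let hint = fs::read_to_string(metadata_dir.join("version-hint.text"))?;
    let version = parse_version_hint(&hint).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidData, "version hint is not an integer")
    })?;

    // This only builds the path; it does not check that the file exists.
    Ok(metadata_dir.join(format!("v{version}.metadata.json")))
}
```

Given a table directory, current_metadata_path returns the path of the metadata JSON that could then be handed to the existing table-loading code. Because every step is a read, the catalog never has to implement the commit protocol that makes file-based catalogs a concern in the first place.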
As a disclaimer, I am already working on an implementation for this solution. I would love to contribute it back to the crate, so I am creating this issue early for discussion. I will be primarily focusing on file://, s3:// and/or s3a:// FileIO connections.
Willingness to contribute
I can contribute to this feature independently