Can't iceberg_scan specific manifest version from S3: No such file or directory #62

ssanchozz · 2024-08-02T15:47:43Z

I'm trying to use the iceberg extension to read Iceberg data from S3.
As test data I'm using the example attached to the iceberg extension doc page.
For S3 storage I'm using MinIO in Docker.
I execute my query from Java code, using org.duckdb:duckdb_jdbc:1.0.0.

When I query the whole table with allow_moved_paths = true, it works fine.
When I query a specific metadata version (e.g. v1), following the example from the docs, I get an error:

java.sql.SQLException: IO Error: Cannot open file "lineitem_iceberg/metadata/snap-3776207205136740581-1-cf3d0be5-cf70-453d-ad8f-48fdc412e608.avro": No such file or directory
java.sql.SQLException: java.sql.SQLException: IO Error: Cannot open file "lineitem_iceberg/metadata/snap-3776207205136740581-1-cf3d0be5-cf70-453d-ad8f-48fdc412e608.avro": No such file or directory

Query looks like this:
SELECT * FROM iceberg_scan('s3://bucketname/lineitem_iceberg/metadata/v1.metadata.json');

If I try to use another example:
SELECT * FROM iceberg_scan('s3://bucketname/lineitem_iceberg/metadata/02701-1e474dc7-4723-4f8d-a8b3-b5f0454eb7ce.metadata.json', allow_moved_paths = true);

I get the exception "Enabling allow_moved_paths is not enabled for directly scanning metadata files.", because of this line.

The questions are:

Why is allow_moved_paths prohibited when querying a specific metadata version? Could this check be removed so that allow_moved_paths is allowed there too?
Any other ideas about what's wrong here and how it can be fixed?


mike-luabase · 2024-08-03T12:49:16Z

does this work for you locally?

ssanchozz · 2024-08-03T18:50:44Z

> does this work for you locally?

What do you mean by locally? I'm running this locally, but querying data that lives in MinIO in Docker on my local machine.

However, I also tried the same thing with the data stored on the local filesystem and got the same error:

java.sql.SQLException: IO Error: Cannot open file "lineitem_iceberg/metadata/snap-3776207205136740581-1-cf3d0be5-cf70-453d-ad8f-48fdc412e608.avro": No such file or directory

The query I used for this is:
SELECT count(*) FROM iceberg_scan('/<absolute_path_on_my_local_machine>/lineitem_iceberg/metadata/v1.metadata.json');

And if I query like this, it works fine:
SELECT count(*) FROM iceberg_scan('/<absolute_path_on_my_local_machine>/lineitem_iceberg', allow_moved_paths = true);
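
The two cases reported so far can be summarized in a small Java sketch. It only builds the SQL string that would be passed to the JDBC driver; it does not talk to DuckDB and it does not fix the underlying problem. The class name IcebergScanSql and the methods isMetadataFile and buildScan are hypothetical helpers made up for illustration, and the rule that allow_moved_paths is only valid for table-root paths is taken from the error message quoted in this thread, not from the extension's documentation.

```java
// Hypothetical helper, not part of DuckDB or the iceberg extension.
class IcebergScanSql {

    // Assumption: a path ending in ".metadata.json" points at one specific
    // metadata version; anything else is treated as the table root.
    static boolean isMetadataFile(String path) {
        return path.endsWith(".metadata.json");
    }

    // Builds an iceberg_scan query. allow_moved_paths is appended only for
    // table-root paths, because (per the error reported in this thread) the
    // extension rejects the option when a metadata file is scanned directly.
    static String buildScan(String path, boolean allowMovedPaths) {
        // Double single quotes so the path is a valid SQL string literal.
        String escaped = path.replace("'", "''");
        StringBuilder sql = new StringBuilder("SELECT * FROM iceberg_scan('")
                .append(escaped)
                .append("'");
        if (allowMovedPaths && !isMetadataFile(path)) {
            sql.append(", allow_moved_paths = true");
        }
        return sql.append(");").toString();
    }

    public static void main(String[] args) {
        // Table root: the option is kept (this is the query that works).
        System.out.println(buildScan("s3://bucketname/lineitem_iceberg", true));
        // Specific metadata version: the option is dropped, so this is the
        // query that currently fails with "No such file or directory".
        System.out.println(buildScan(
                "s3://bucketname/lineitem_iceberg/metadata/v1.metadata.json", true));
    }
}
```

Silently dropping the option for metadata files is a design choice of this sketch; throwing an exception instead would mirror the extension's behaviour more closely.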

fabito · 2024-11-28T10:23:09Z

I was facing the same issue.
Creating a secret solved the issue:

CREATE SECRET secret3 (
    TYPE S3,
    PROVIDER CREDENTIAL_CHAIN,
    CHAIN 'env;config'
);
