
Iceberg planning time significantly longer than Hive #26789

@yingsu00

Description


Example: TPC-H Q7, Hive vs. Iceberg

[Image: query detail comparison]

https://grafana.ibm.prestodb.dev/d/cc6a4ccf-2fc7-4220-8f4e-397b03061acf/query-detail-comparisons?orgId=1&var-query1=20251204_084902_00055_xhkuu&var-query2=20251204_084725_00052_xhkuu

TPC-H Q8, Hive vs. Iceberg

[Image: query detail comparison]

https://grafana.ibm.prestodb.dev/d/cc6a4ccf-2fc7-4220-8f4e-397b03061acf/query-detail-comparisons?orgId=1&var-query1=20251204_084213_00049_xhkuu&var-query2=20251204_084227_00050_xhkuu

I reran TPC-DS Q60 on a medium cluster; the query elapsed time was stable at 16s, with planning time at 13s. I took a JFR profile on the coordinator and saw:

  • The planning thread(s) repeatedly block in socket reads to the remote metastore service.
  • The total blocking time on those reads is ~11–12s.
  • The end-to-end planning time (~13s) lines up almost exactly with the cumulative socket-read time.

The hotspot was in IcebergHiveMetadata.getTableStatistics, going through ThriftHiveMetastore. This confirms that network I/O to the metastore, not local CPU, is the dominant cause of slow planning.

[Image: JFR profile of the coordinator]

By manually enabling the metastore cache in com/facebook/presto/hive/metastore/InMemoryCachingHiveMetastore.java,

```java
private InMemoryCachingHiveMetastore(...)
{
    ...
//        switch (metastoreCacheScope) {
//            case PARTITION:
//                partitionCacheExpiresAfterWriteMillis = expiresAfterWriteMillis;
//                partitionCacheRefreshMills = refreshMills;
//                partitionCacheMaxSize = maximumSize;
//                cacheExpiresAfterWriteMillis = OptionalLong.of(0);
//                cacheRefreshMills = OptionalLong.of(0);
//                cacheMaxSize = 0;
//                break;
//
//            case ALL:
//                partitionCacheExpiresAfterWriteMillis = expiresAfterWriteMillis;
//                partitionCacheRefreshMills = refreshMills;
//                partitionCacheMaxSize = maximumSize;
//                cacheExpiresAfterWriteMillis = expiresAfterWriteMillis;
//                cacheRefreshMills = refreshMills;
//                cacheMaxSize = maximumSize;
//                break;
//
//            default:
//                throw new IllegalArgumentException("Unknown metastore-cache-scope: " + metastoreCacheScope);
//        }

    partitionCacheExpiresAfterWriteMillis = OptionalLong.of(999999999999L);
    partitionCacheRefreshMills = OptionalLong.of(999999999999L);
    partitionCacheMaxSize = 999999999999L;
    cacheExpiresAfterWriteMillis = OptionalLong.of(999999999999L);
    cacheRefreshMills = OptionalLong.of(999999999999L);
    cacheMaxSize = 999999999999L;
    ...
}
```

The query planning time for TPC-DS Q65 and Q56 dropped from >13s to 1.6s. However, the corresponding config property is effectively disabled for Iceberg, because the cache TTL is hard-coded to 0.
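For context, this cache is normally driven by catalog configuration rather than a code change. The property names below are a sketch based on my recollection of the Presto Hive connector and should be verified against your version; with the TTL hard-coded to 0 for Iceberg, they currently have no effect there:

```properties
# Hypothetical catalog config sketch (e.g. etc/catalog/iceberg.properties).
# Property names assumed from the Presto Hive connector; ineffective for
# Iceberg today because the cache TTL is forced to 0 in code.
hive.metastore-cache-ttl=10m
hive.metastore-refresh-interval=5m
hive.metastore-cache-maximum-size=10000
```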

@agrawalreetika pointed out that Iceberg metadata caching was disabled in #24326. In that issue, @imjalpreet mentioned: "The ideal long-term solution would involve allowing caching of specific metadata, such as table statistics or the list of table names, which do not violate the ACID contract. However, as an immediate fix, we should disallow enabling the metastore cache when using the Iceberg connector." However, those specific metadata caches have not been implemented yet. @imjalpreet's #24376 adds caching for getTable(), but it doesn't cover getTableStatistics().
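The gap could in principle be closed by memoizing getTableStatistics() the same way #24376 memoizes getTable(). A minimal sketch of the idea, with hypothetical names (StatsLoader, CachingStats, and the long return type are illustrative stand-ins, not Presto APIs):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: memoize per-table statistics so repeated planner lookups
// hit the metastore once. Not Presto code; names are hypothetical.
public class CachingStatsExample
{
    interface StatsLoader
    {
        long load(String table); // stands in for a Thrift round-trip
    }

    static class CachingStats
    {
        private final Map<String, Long> cache = new ConcurrentHashMap<>();
        private final StatsLoader loader;

        CachingStats(StatsLoader loader)
        {
            this.loader = loader;
        }

        long getTableStatistics(String table)
        {
            // computeIfAbsent makes at most one remote call per table key
            return cache.computeIfAbsent(table, loader::load);
        }
    }

    public static void main(String[] args)
    {
        AtomicInteger remoteCalls = new AtomicInteger();
        CachingStats stats = new CachingStats(t -> {
            remoteCalls.incrementAndGet();
            return 42L;
        });
        stats.getTableStatistics("lineitem");
        stats.getTableStatistics("lineitem"); // served from cache
        System.out.println(remoteCalls.get()); // 1
    }
}
```

A real fix would also need a bounded size and TTL (as in the InMemoryCachingHiveMetastore hack above) so stale statistics eventually expire.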

@zacw7: When you did the runs earlier this year that showed similar performance between Iceberg and Hive, was that before #24326 was merged? I'm wondering why you didn't see this before.
