Description
Trino-created Delta Lake tables can be left in an unreadable state if recreated with a different schema after a checkpoint has been written.
To reproduce,
- Create a table
- Run actions on it until a checkpoint is written (default every 10 actions)
- Run CREATE OR REPLACE with a different schema
- Result: any read of the table now fails with the error message:
Failed to generate splits
Reproduction (using the S3DeltaLakeQueryRunnerMain from DeltaLakeQueryRunner and the Trino CLI):
trino> CREATE TABLE delta.tpch.orders2 (orderkey bigint, orderdate timestamp);
CREATE TABLE
trino> INSERT INTO delta.tpch.orders2 (SELECT orderkey, orderdate from tpch.sf1000.orders limit 100);
INSERT: 100 rows
Query 20251231_103237_00026_3ra56, FINISHED, 3 nodes
Splits: 62 total, 62 done (100.00%)
0.68 [2.02M rows, 20.2KiB] [2.97M rows/s, 29.7KiB/s]
trino> INSERT INTO delta.tpch.orders2 (SELECT orderkey, orderdate from tpch.sf1000.orders limit 100);
INSERT: 100 rows
Query 20251231_103238_00027_3ra56, FINISHED, 3 nodes
Splits: 62 total, 62 done (100.00%)
0.40 [1.72M rows, 29.9KiB] [4.35M rows/s, 75.6KiB/s]
trino> INSERT INTO delta.tpch.orders2 (SELECT orderkey, orderdate from tpch.sf1000.orders limit 100);
INSERT: 100 rows
Query 20251231_103239_00028_3ra56, FINISHED, 3 nodes
Splits: 62 total, 62 done (100.00%)
0.39 [1.72M rows, 12.8KiB] [4.41M rows/s, 32.8KiB/s]
// ... Repeat the INSERT INTO 10 times so that Trino writes a checkpoint
trino> CREATE OR REPLACE TABLE delta.tpch.orders2 (orderkey bigint, orderdate date);
CREATE TABLE
trino> SELECT * FROM delta.tpch.orders2;
Query 20251231_103256_00038_3ra56 failed: Failed to generate splits for tpch.orders2
trino> SELECT * FROM delta.tpch.orders2;
Query 20251231_103339_00039_3ra56 failed: Failed to generate splits for tpch.orders2
Full stacktrace (via debugging in IntelliJ):
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr java.util.concurrent.CompletionException: io.trino.spi.TrinoException: Unsupported Trino column type (timestamp(6)) for Parquet column ([add, stats_parsed, minvalues, orderdate] optional int32 orderdate (DATE))
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.concurrent.CompletableFuture.wrapInCompletionException(CompletableFuture.java:323)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:359)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:364)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run$$$capture(CompletableFuture.java:1828)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at --- Async.Stack.Trace --- (captured by IntelliJ IDEA debugger)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.concurrent.CompletableFuture$AsyncRun.<init>(CompletableFuture.java:1811)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.concurrent.CompletableFuture.asyncRunStage(CompletableFuture.java:1839)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.concurrent.CompletableFuture.runAsync(CompletableFuture.java:2054)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.plugin.deltalake.DeltaLakeSplitSource.queueSplits(DeltaLakeSplitSource.java:191)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.plugin.deltalake.DeltaLakeSplitSource.<init>(DeltaLakeSplitSource.java:91)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.plugin.deltalake.DeltaLakeSplitManager.getSplits(DeltaLakeSplitManager.java:136)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorSplitManager.getSplits(ClassLoaderSafeConnectorSplitManager.java:51)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.split.SplitManager.getSplits(SplitManager.java:89)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.sql.planner.SplitSourceFactory$Visitor.createSplitSource(SplitSourceFactory.java:191)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.sql.planner.SplitSourceFactory$Visitor.visitTableScan(SplitSourceFactory.java:158)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.sql.planner.SplitSourceFactory$Visitor.visitTableScan(SplitSourceFactory.java:132)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.sql.planner.plan.TableScanNode.accept(TableScanNode.java:219)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.sql.planner.SplitSourceFactory$Visitor.visitOutput(SplitSourceFactory.java:368)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.sql.planner.SplitSourceFactory$Visitor.visitOutput(SplitSourceFactory.java:132)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.sql.planner.plan.OutputNode.accept(OutputNode.java:82)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.sql.planner.SplitSourceFactory.createSplitSources(SplitSourceFactory.java:112)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.execution.scheduler.PipelinedQueryScheduler$DistributedStagesScheduler.createStageScheduler(PipelinedQueryScheduler.java:1075)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.execution.scheduler.PipelinedQueryScheduler$DistributedStagesScheduler.create(PipelinedQueryScheduler.java:949)
2025-12-31T14:52:56.831+0100 INFO delta-split-source-delta-12 stderr at io.trino.execution.scheduler.PipelinedQueryScheduler.createDistributedStagesScheduler(PipelinedQueryScheduler.java:328)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.execution.scheduler.PipelinedQueryScheduler.start(PipelinedQueryScheduler.java:311)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.execution.SqlQueryExecution.start(SqlQueryExecution.java:442)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.execution.QueryManager.createQuery(QueryManager.java:317)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.dispatcher.LocalDispatchQuery.startExecution(LocalDispatchQuery.java:151)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.dispatcher.LocalDispatchQuery.lambda$waitForMinimumWorkers$1(LocalDispatchQuery.java:135)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.airlift.concurrent.MoreFutures.lambda$addSuccessCallback$0(MoreFutures.java:570)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.airlift.concurrent.MoreFutures$3.onSuccess(MoreFutures.java:545)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1132)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.$gen.Trino_testversion____20251231_101237_1.run(Unknown Source)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1090)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:614)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.lang.Thread.run(Thread.java:1474)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr Caused by: io.trino.spi.TrinoException: Unsupported Trino column type (timestamp(6)) for Parquet column ([add, stats_parsed, minvalues, orderdate] optional int32 orderdate (DATE))
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.parquet.reader.ColumnReaderFactory.unsupportedException(ColumnReaderFactory.java:367)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.parquet.reader.ColumnReaderFactory.create(ColumnReaderFactory.java:282)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.parquet.reader.ParquetReader.initializeColumnReaders(ParquetReader.java:718)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.parquet.reader.ParquetReader.advanceToNextRowGroup(ParquetReader.java:500)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.parquet.reader.ParquetReader.nextBatch(ParquetReader.java:453)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.parquet.reader.ParquetReader.nextPage(ParquetReader.java:270)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.plugin.hive.parquet.ParquetPageSource.getNextSourcePage(ParquetPageSource.java:82)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.plugin.deltalake.transactionlog.checkpoint.CheckpointEntryIterator.tryAdvancePage(CheckpointEntryIterator.java:721)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.plugin.deltalake.transactionlog.checkpoint.CheckpointEntryIterator.fillNextEntries(CheckpointEntryIterator.java:753)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.plugin.deltalake.transactionlog.checkpoint.CheckpointEntryIterator.computeNext(CheckpointEntryIterator.java:700)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at io.trino.plugin.deltalake.transactionlog.checkpoint.CheckpointEntryIterator.computeNext(CheckpointEntryIterator.java:107)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.Iterator.forEachRemaining(Iterator.java:132)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1939)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:803)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.ReferencePipeline$7$1FlatMap.accept(ReferencePipeline.java:293)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:214)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.Collections$2.tryAdvance(Collections.java:5182)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.Collections$2.forEachRemaining(Collections.java:5190)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:570)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:560)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:315)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:734)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:570)
2025-12-31T14:52:56.832+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:560)
2025-12-31T14:52:56.833+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:153)
2025-12-31T14:52:56.833+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:176)
2025-12-31T14:52:56.833+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
2025-12-31T14:52:56.833+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.stream.ReferencePipeline.forEachOrdered(ReferencePipeline.java:637)
2025-12-31T14:52:56.833+0100 INFO delta-split-source-delta-12 stderr at io.trino.plugin.deltalake.DeltaLakeSplitSource.lambda$queueSplits$0(DeltaLakeSplitSource.java:193)
2025-12-31T14:52:56.833+0100 INFO delta-split-source-delta-12 stderr at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run$$$capture(CompletableFuture.java:1825)
2025-12-31T14:52:56.833+0100 INFO delta-split-source-delta-12 stderr ... 33 more
A checkpoint is necessary to trigger this bug. When the schema is replaced before a checkpoint has been written, everything works:
trino> CREATE OR REPLACE TABLE delta.tpch.orders3 (orderkey bigint, orderdate timestamp);
CREATE TABLE
trino> INSERT INTO delta.tpch.orders3 (SELECT orderkey, orderdate from tpch.sf1000.orders limit 100);
INSERT: 100 rows
Query 20251231_122146_00101_3ra56, FINISHED, 1 node
Splits: 40 total, 40 done (100.00%)
0.35 [599K rows, 15.3KiB] [1.72M rows/s, 43.8KiB/s]
trino> CREATE OR REPLACE TABLE delta.tpch.orders3 (orderkey bigint, orderdate date);
CREATE TABLE
trino> SELECT * FROM delta.tpch.orders3;
orderkey | orderdate
----------+-----------
(0 rows)
Query 20251231_122206_00104_3ra56, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.08 [0 rows, 0B] [0 rows/s, 0B/s]
I assume this happens because Trino tries to use statistics from the checkpoint, but those statistics contain value types that no longer match the new schema, leading to the error above.
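As a rough illustration of that hypothesis, here is a toy model in plain Python (none of this is Trino code; `ColumnStat`, `read_stats`, and the type strings are made up): the checkpoint keeps per-column statistics typed with the old schema, and decoding them against the replaced schema fails.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ColumnStat:
    physical_type: str  # type the statistic was physically written with
    min_value: object


def read_stats(checkpoint_stats, table_schema):
    """Decode checkpoint statistics against the current table schema,
    loosely mimicking a reader of stats_parsed in the checkpoint Parquet."""
    for column, expected_type in table_schema.items():
        stat = checkpoint_stats.get(column)
        if stat is not None and stat.physical_type != expected_type:
            raise TypeError(
                f"unsupported column type ({expected_type}) for checkpoint "
                f"column {column} written as {stat.physical_type}"
            )
    return {c: s.min_value for c, s in checkpoint_stats.items()}


# Checkpoint written while orderdate was still a timestamp column...
checkpoint = {"orderdate": ColumnStat("timestamp", datetime(2025, 1, 1))}
# ...but the table was then replaced with orderdate as a date column.
new_schema = {"orderkey": "bigint", "orderdate": "date"}

try:
    read_stats(checkpoint, new_schema)
except TypeError as e:
    print(e)  # the toy equivalent of the "Unsupported ... column type" failure
```

This is only a simplification of the mechanism; the real reader fails deeper down, in the Parquet column reader while decoding minvalues from the checkpoint file.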
Amusingly, writes still work (I assume because they do not rely on statistics?), and if we write enough times for a new checkpoint to be created, the table becomes queryable again.
I also thought it would be interesting to try changing the column type directly, but that isn't supported by the Trino Delta Lake connector.
(Edit) Affected versions: we noticed the bug in Trino 467, and the repro above was run against a local build of the master branch. I don't know whether this is an (old!) regression or an edge case that has flown under the radar for a while.
A potential solution could be to force writing a checkpoint as part of CREATE OR REPLACE TABLE, so that subsequent reads never consult stale statistics?
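To make that suggestion concrete, here is a toy transaction log in plain Python (again, not Trino or Delta Lake code; `ToyDeltaLog` and its methods are invented): a checkpoint records statistics typed with the schema current at checkpoint time, and forcing a fresh checkpoint during a replace keeps it consistent with the new schema.

```python
class ToyDeltaLog:
    CHECKPOINT_INTERVAL = 10  # default: checkpoint every 10 actions

    def __init__(self, schema):
        self.schema = schema            # latest table schema
        self.checkpoint_schema = None   # schema the last checkpoint's stats use
        self.actions = 0

    def write(self):
        self.actions += 1
        if self.actions % self.CHECKPOINT_INTERVAL == 0:
            self.checkpoint_schema = self.schema  # checkpoint carries stats

    def replace(self, new_schema, force_checkpoint=False):
        self.schema = new_schema
        self.actions += 1
        if force_checkpoint:
            # the suggested mitigation: a fresh, schema-consistent checkpoint
            self.checkpoint_schema = new_schema

    def read(self):
        # stale checkpoint stats paired with a newer schema break reads
        if self.checkpoint_schema is not None and self.checkpoint_schema != self.schema:
            raise RuntimeError("Failed to generate splits")
        return self.schema


broken = ToyDeltaLog({"orderdate": "timestamp"})
for _ in range(10):
    broken.write()                 # the 10th action writes a checkpoint
broken.replace({"orderdate": "date"})
try:
    broken.read()
except RuntimeError as e:
    print(e)                       # Failed to generate splits

fixed = ToyDeltaLog({"orderdate": "timestamp"})
for _ in range(10):
    fixed.write()
fixed.replace({"orderdate": "date"}, force_checkpoint=True)
print(fixed.read())                # {'orderdate': 'date'}
```

The toy model also matches the observation above that enough further writes "heal" the table: once a new checkpoint is written under the new schema, the mismatch disappears.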