Build: Bump Apache Parquet 1.14.4 #11502

Fokko · 2024-11-08T20:56:13Z

No description provided.

…)" (apache#11462)" This reverts commit 7cc16fa.

singhpk234 · 2024-11-09T05:23:48Z

...18/flink/src/test/java/org/apache/iceberg/flink/source/TestMetadataTableReadableMetrics.java

+    Row booleanCol = Row.of(36L, 4L, 0L, null, false, true);
+    Row decimalCol = Row.of(91L, 4L, 1L, null, new BigDecimal("1.00"), new BigDecimal("2.00"));
+    Row doubleCol = Row.of(91L, 4L, 0L, 1L, 1.0D, 2.0D);


[optional] Should we refactor this to pick file_size from the data files themselves, like we did in the JDK 17 upgrade PR #7391 (comment)?

Nevertheless, it looks like the size in bytes is increasing in this version. Is that because the sizes are more accurate now?

Hey @singhpk234, that's an excellent suggestion. I've copied your approach here as well. Parquet now also tracks how large the data is in memory after decompression (this is handy for strings, where you don't know that upfront), so you can allocate buffers directly to the right size.

how large the data is in memory after decompression (this is handy for strings, where you don't know that upfront), so you can allocate buffers directly to the right size.

This is precisely what we needed in Redshift as well; our CBO was falling behind with variable-length data types. I will give them a heads-up! Thank you @Fokko
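For readers following the thread, here is a minimal sketch of the "look up sizes instead of hardcoding them" idea discussed above. It is not the code from this PR: `ColumnSizeLookup`, `FileMetadata`, and `expectedColumnSize` are hypothetical names, and the field ids are made up. The only assumption is that each written file carries a map from column field id to that column's size in bytes, which the test can read back.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ColumnSizeLookup {

  // Hypothetical stand-in for the per-file metadata recorded at write time:
  // a map from column field id to that column's size in bytes.
  static final class FileMetadata {
    private final Map<Integer, Long> columnSizes;

    FileMetadata(Map<Integer, Long> columnSizes) {
      this.columnSizes = columnSizes;
    }

    Map<Integer, Long> columnSizes() {
      return columnSizes;
    }
  }

  // Sums the recorded size of one column across all files; a column missing
  // from a file's map contributes zero.
  static long expectedColumnSize(List<FileMetadata> files, int fieldId) {
    long total = 0L;
    for (FileMetadata file : files) {
      total += file.columnSizes().getOrDefault(fieldId, 0L);
    }
    return total;
  }

  public static void main(String[] args) {
    // Illustrative values only: the field ids are assumptions, and 55 and 47
    // are the binary and fixed column sizes that appear in this PR's diff.
    Map<Integer, Long> sizes = new HashMap<>();
    sizes.put(1, 55L);
    sizes.put(2, 47L);
    List<FileMetadata> files = List.of(new FileMetadata(sizes));

    // The expected value follows whatever the writer recorded, so a format
    // upgrade that changes the sizes does not require editing constants.
    System.out.println("binaryCol size: " + expectedColumnSize(files, 1));
    System.out.println("fixedCol size: " + expectedColumnSize(files, 2));
  }
}
```

With this shape, the expected row in the test is built from the looked-up value rather than a literal such as `55L`, so the assertion stays valid when a Parquet upgrade changes the bytes written per column.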

jbonofre

The Parquet update looks good; I'm just wondering about the row size increase in the test. I would at least add a comment in the test to explain the reason.

jbonofre · 2024-11-09T09:39:09Z

build.gradle

@@ -119,6 +119,9 @@ allprojects {
  repositories {
    mavenCentral()
    mavenLocal()
+    maven {
+      url = uri("https://repository.apache.org/content/repositories/orgapacheparquet-1065")


Just a note: this is temporary during the Parquet release vote (just so we don't forget to remove this once the Parquet release is out 😄)

jbonofre · 2024-11-09T09:40:11Z

...18/flink/src/test/java/org/apache/iceberg/flink/source/TestMetadataTableReadableMetrics.java

@@ -217,27 +217,27 @@ public void testPrimitiveColumns() throws Exception {

    Row binaryCol =
        Row.of(
-            52L,
+            55L,


Why is the size growing here (I mean in the test)?
Should we have two tests?

See #11502 (comment) for the reason. What's the suggestion for the second test?

jbonofre · 2024-11-09T09:40:24Z

...18/flink/src/test/java/org/apache/iceberg/flink/source/TestMetadataTableReadableMetrics.java

    Row fixedCol =
        Row.of(
-            44L,
+            47L,


Same question here about the size.

Fokko added 2 commits November 8, 2024 21:45

Revert "Revert "Build: Bump parquet from 1.13.1 to 1.14.3 (apache#11264…

d5f6087

…)" (apache#11462)" This reverts commit 7cc16fa.

Bump to Parquet 1.14.4

665487a

github-actions bot added flink build labels Nov 8, 2024

Fokko changed the title ~~Test out Apache Parquet 1.14.4~~ Test out Apache Parquet 1.14.4 RC2 Nov 8, 2024

singhpk234 reviewed Nov 9, 2024

jbonofre self-requested a review November 9, 2024 09:38

jbonofre reviewed Nov 9, 2024

Lookup sizes instead

68645be

Fokko force-pushed the fd-parq branch from 057a067 to 68645be Compare November 9, 2024 21:55

Fokko mentioned this pull request Nov 11, 2024

Build: Bump parquet from 1.13.1 to 1.14.3 #11507

Closed

Fokko changed the title ~~Test out Apache Parquet 1.14.4 RC2~~ Build: Bump Apache Parquet 1.14.4 Nov 12, 2024

Fokko marked this pull request as ready for review November 12, 2024 12:45

Fokko commented Nov 12, 2024

Update build.gradle

48e9524

Fokko commented Nov 8, 2024

singhpk234 Nov 9, 2024

Fokko Nov 9, 2024

singhpk234 Nov 11, 2024

jbonofre left a comment

jbonofre Nov 9, 2024

jbonofre Nov 9, 2024

Fokko Nov 9, 2024

jbonofre Nov 9, 2024
