Discrepancies reading values less than 1582-10-15 of type DATE between Trino, Hive and Spark #23904

Open
marcinsbd opened this issue Oct 24, 2024 · 0 comments

Comments

@marcinsbd
Contributor

marcinsbd commented Oct 24, 2024

It seems that Trino has no support for the proleptic Gregorian calendar when reading values of type DATE from Parquet files.
The Parquet spec (https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#date) says nothing about dates before 1970, but there are discrepancies between Hive, Trino and Spark.
1. Hive introduced support for the proleptic Gregorian calendar (https://issues.apache.org/jira/browse/HIVE-22405) in 3.1.3 (https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346277&styleName=Html&projectId=12310843).
2. Spark supports it in version 3.0+ and adds a property that decides how to handle dates before 1582-10-15 (https://issues.apache.org/jira/browse/SPARK-31408).

Spark in CORRECTED mode behaves as Trino does now: it makes no adjustments (no rebasing of days before 1582-10-15).
However, in LEGACY mode there are discrepancies:

spark-sql (default)> create table t1 (s date, e date ) using PARQUET;
spark-sql (default)> set spark.sql.parquet.datetimeRebaseModeInWrite=LEGACY;
spark-sql (default)> insert into t1 values (cast ('0001-01-01' as date), cast('1000-01-01' as date) );
spark-sql (default)> select * from t1;
0001-01-01	1000-01-01

Whereas:

trino:default> show create table t1;
     Create Table
--------------------------------
 CREATE TABLE hive.default.t1 (
  s date,
  e date
 )
 WITH (
  format = 'PARQUET'
 )
(1 row)
trino:default> select * from t1;
     s      |     e
------------+------------
 0000-12-30 | 1000-01-06
(1 row)
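
The shifted values above are consistent with a missing calendar rebase. Below is a minimal Python sketch of that arithmetic, assuming the LEGACY writer stores day counts computed in the hybrid Julian/Gregorian calendar and the reader interprets the same integer as a proleptic Gregorian day count. The helper names (`legacy_epoch_days`, `read_without_rebase`) are hypothetical and only illustrate the idea; they are not Spark or Trino APIs. The conversions use the standard integer Julian Day Number formulas.

```python
# Sketch: write a date as a hybrid-calendar day count, read it back
# as proleptic Gregorian without rebasing. Years use astronomical
# numbering (year 0 exists), matching the Trino output above.

EPOCH_JDN = 2440588  # Julian Day Number of 1970-01-01 (Gregorian)


def _shift(year, month):
    # Move January/February to the end of the previous year so the
    # leap day falls at the end of the counting year.
    a = (14 - month) // 12
    return year + 4800 - a, month + 12 * a - 3


def julian_to_jdn(year, month, day):
    y, m = _shift(year, month)
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def gregorian_to_jdn(year, month, day):
    y, m = _shift(year, month)
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


def jdn_to_gregorian(jdn):
    # Richards' algorithm, valid for non-negative Julian Day Numbers.
    f = jdn + 1401 + (((4 * jdn + 274277) // 146097) * 3) // 4 - 38
    e = 4 * f + 3
    h = 5 * ((e % 1461) // 4) + 2
    day = (h % 153) // 5 + 1
    month = (h // 153 + 2) % 12 + 1
    year = e // 1461 - 4716 + (14 - month) // 12
    return year, month, day


def legacy_epoch_days(year, month, day):
    # Assumption: a hybrid-calendar writer uses the Julian calendar
    # before the 1582-10-15 cutover and the Gregorian one afterwards.
    if (year, month, day) < (1582, 10, 15):
        jdn = julian_to_jdn(year, month, day)
    else:
        jdn = gregorian_to_jdn(year, month, day)
    return jdn - EPOCH_JDN


def read_without_rebase(epoch_days):
    # Interpret the stored integer as proleptic Gregorian days.
    return jdn_to_gregorian(epoch_days + EPOCH_JDN)


for written in [(1, 1, 1), (1000, 1, 1)]:
    y, m, d = read_without_rebase(legacy_epoch_days(*written))
    print(f"{y:04d}-{m:02d}-{d:02d}")
# prints:
# 0000-12-30
# 1000-01-06
```

Under these assumptions the sketch reproduces both values that Trino returns, which suggests Trino reads the stored day count as-is, the same way Spark does in CORRECTED mode.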