In CORRECTED mode, Spark now behaves the same way as Trino: it makes no adjustments (no rebasing of dates before 1582-10-15).
In LEGACY mode, however, the two engines disagree:
spark-sql (default)> create table t1 (s date, e date ) using PARQUET;
spark-sql (default)> set spark.sql.parquet.datetimeRebaseModeInWrite=LEGACY;
spark-sql (default)> insert into t1 values (cast ('0001-01-01' as date), cast('1000-01-01' as date) );
spark-sql (default)> select * from t1;
0001-01-01 1000-01-01
Whereas:
trino:default> show create table t1;
Create Table
--------------------------------
CREATE TABLE hive.default.t1 (
s date,
e date
)
WITH (
format = 'PARQUET'
)
(1 row)
trino:default> select * from t1;
s | e
------------+------------
0000-12-30 | 1000-01-06
(1 row)
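The two-day and five-day shifts above are what you would expect if the LEGACY writer counts days since 1970-01-01 on the Julian calendar and the reader decodes that count on the proleptic Gregorian calendar without rebasing. The sketch below reproduces the arithmetic with standard Julian Day Number formulas. It is only an illustration of the calendar mismatch, not Spark or Trino code, and the function names are hypothetical.

```python
UNIX_EPOCH_JDN = 2440588  # Julian Day Number of 1970-01-01 (Gregorian)


def to_jdn(year, month, day, gregorian):
    """Julian Day Number of a civil date in the Julian or Gregorian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4
    if gregorian:
        return jdn - y // 100 + y // 400 - 32045
    return jdn - 32083


def julian_to_epoch_days(year, month, day):
    """Days since 1970-01-01 for a date interpreted on the Julian calendar
    (assumed here to be what a LEGACY write stores for pre-1582 dates)."""
    return to_jdn(year, month, day, gregorian=False) - UNIX_EPOCH_JDN


def epoch_days_to_proleptic_gregorian(days):
    """Decode days since 1970-01-01 on the proleptic Gregorian calendar."""
    a = days + UNIX_EPOCH_JDN + 32044
    b = (4 * a + 3) // 146097
    c = a - (146097 * b) // 4
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return year, month, day


def read_without_rebase(year, month, day):
    """Write on the Julian calendar, read back as proleptic Gregorian."""
    y, m, d = epoch_days_to_proleptic_gregorian(
        julian_to_epoch_days(year, month, day)
    )
    return f"{y:04d}-{m:02d}-{d:02d}"


print(read_without_rebase(1, 1, 1))     # 0000-12-30
print(read_without_rebase(1000, 1, 1))  # 1000-01-06
```

Both results match the values Trino returns above, which suggests that the stored day counts are read back unchanged rather than rebased.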
It seems that Trino does not support the proleptic Gregorian calendar when reading values of type Date from Parquet files.
The Parquet spec (https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#date) does not mention dates before 1970, and Hive, Trino and Spark handle them differently:
1. Hive 3.1.3 (https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346277&styleName=Html&projectId=12310843) introduced support for the proleptic Gregorian calendar (https://issues.apache.org/jira/browse/HIVE-22405).
2. Spark supports it from version 3.0 onward and adds a property that controls how dates before 1582-10-15 are handled (https://issues.apache.org/jira/browse/SPARK-31408).