
Feature list while using ArrowFileFormat read or write parquet #1171

Open · 4 of 15 tasks
jackylee-ch opened this issue Nov 23, 2022 · 0 comments
Labels: enhancement (New feature or request)

Comments

jackylee-ch (Contributor) commented Nov 23, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
In #1161, we are trying to use ArrowFileFormat to read and write Parquet files, but several test suites failed, some of them due to missing ArrowFileFormat functionality. The feature list is below:

Highest priority

  • Support Parquet schema merge in infer_schema (see the sketch after this list). Related tests: read from parquet files with changing schema, Enabling/disabling merging partfiles when merging parquet schema, SPARK-10005 Schema merging for nested struct, alter datasource table add columns - parquet, alter datasource table add columns - partitioned - parquet, SPARK-10301 requested schema clipping - requested schema contains physical schema, schema mismatch failure error message for parquet vectorized reader
  • Support writing with other compression codecs. Related test: compression codec
  • Fix: timestamps are read and written with wrong values. Related tests: store and retrieve column stats in different time zones, analyze column command, writing with aggregation, Migration from INT96 to TIMESTAMP_MICROS timestamp type
  • Fix: a RuntimeException is thrown when reading duplicate fields in case-insensitive mode
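For reference, a minimal Spark SQL sketch of how these code paths are typically exercised (the paths and schemas below are made up for illustration; the options and configs are standard Spark/Parquet ones, not ArrowFileFormat-specific APIs):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("arrow-parquet-repro").getOrCreate()
import spark.implicits._

// Schema merge: two part-directories with evolving schemas; reading with
// mergeSchema=true must union both schemas during schema inference.
Seq((1, "a")).toDF("id", "name")
  .write.parquet("/tmp/t/part=1") // hypothetical path
Seq((2, "b", 3.0)).toDF("id", "name", "score")
  .write.parquet("/tmp/t/part=2")
val merged = spark.read.option("mergeSchema", "true").parquet("/tmp/t")
merged.printSchema() // expected: id, name, score (score nullable)

// Compression codec: writes must honor codecs other than the default.
merged.write
  .option("compression", "gzip") // also snappy, zstd, ...
  .parquet("/tmp/t_gzip")

// Timestamps: values must round-trip correctly across session time zones
// and across INT96 vs TIMESTAMP_MICROS physical types.
spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")
spark.conf.set("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MICROS")
```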

Low priority

  • Support writing empty records to a metadata-only file. Related test: SPARK-26709: OptimizeMetadataOnlyQuery does not handle empty records correctly
  • Support invalid characters in HDFS paths. Related tests: SPARK-33593: Vector reader got incorrect data with binary partition value, SPARK-21167: encode and decode path correctly, special characters in output path
  • Support writing out metadata to the Parquet file. Related tests: SPARK-15804: write out the metadata to parquet file, SPARK-15895 summary files in non-leaf partition directories, SPARK-11044 Parquet writer version fixed as version1
  • Support reading struct data from Parquet. Related tests: SPARK-10005 Schema merging for nested struct, SPARK-10301 requested schema clipping - requested schema contains physical schema, SPARK-10301 requested schema clipping - physical schema contains requested schema, SPARK-10301 requested schema clipping - schemas overlap but don't contain each other, SPARK-10301 requested schema clipping - deeply nested struct, SPARK-10301 requested schema clipping - out of order, SPARK-10301 requested schema clipping - schema merging, Standard mode - SPARK-10301 requested schema clipping - UDT, Legacy mode - SPARK-10301 requested schema clipping - UDT
  • Fix: reading fails when a column is filtered with EqualTo or GreaterThanOrEqual against NaN. Related test: cases when literal is max
  • Support filter pushdown for timestamp and decimal filters (see the sketch after this list). Related tests: filter pushdown - timestamp, filter pushdown - decimal, filter pushdown - date
  • Support ignoreMissingFiles. Related test: Enabling/disabling ignoreMissingFiles using parquet
  • Support NullType. Related test: SPARK-24204 error handling for unsupported Null data types - csv, parquet, orc
  • Fix: "writing data out" metrics differ from vanilla Spark. Related test: writing data out metrics: parquet
  • Pass Parquet options through to the reader. Related test: Read row group containing both dictionary and plain encoded pages
  • Support extra committer class configuration. Related tests: SPARK-8121: spark.sql.parquet.output.committer.class shouldn't be overridden, SPARK-7837 Do not close output writer twice when commitTask() fails
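Likewise, a rough sketch of the knobs several low-priority items hinge on (the column name `ts` and the paths are hypothetical; the config keys are standard Spark SQL options):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("arrow-parquet-options").getOrCreate()
import spark.implicits._

// ignoreMissingFiles: files deleted between planning and execution should
// be skipped instead of failing the whole query.
spark.conf.set("spark.sql.files.ignoreMissingFiles", "true")

// Filter pushdown: timestamp/decimal/date predicates should appear as
// PushedFilters in the scan rather than being evaluated row by row.
val df = spark.read.parquet("/tmp/t") // hypothetical path
df.filter($"ts" >= java.sql.Timestamp.valueOf("2022-11-23 00:00:00"))
  .explain() // look for the predicate under PushedFilters

// Committer class: a user-supplied committer must not be overridden
// (SPARK-8121); ArrowFileFormat should respect this config as well.
spark.conf.set(
  "spark.sql.parquet.output.committer.class",
  "org.apache.parquet.hadoop.ParquetOutputCommitter")
```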