[Parquet] Support page level cache for reading #8246

**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
Previously, the Parquet reader had to read a whole RowGroup into memory and then extract the data it needed, which is obviously wasteful.
I would therefore like to read only the pages we need and cache those pages for future reads.
The first part is solved thanks to #7850, and I am starting on the caching part following that PR's release.

**Describe the solution you'd like**
I propose adding a cache mechanism to `decode_page` in `impl RowGroupReader for SerializedRowGroupReader`. This way, repeated reads of the same page can skip some of the decoding and decompression cost.
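
To make the idea concrete, here is a minimal, standalone sketch of what such a cache could look like. It is not the parquet crate's API: `PageKey`, `DecodedPage`, `PageCache`, and `get_or_decode` are hypothetical names invented for illustration, the decoded page is modelled as a plain byte buffer, and eviction is simple FIFO rather than anything tuned.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Arc;

/// Hypothetical cache key: identifies one page within a file.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct PageKey {
    row_group: usize,
    column: usize,
    page_offset: u64,
}

/// Stand-in for a decompressed, decoded page (assumption: plain bytes).
#[derive(Debug)]
struct DecodedPage {
    bytes: Vec<u8>,
}

/// Bounded cache of decoded pages with FIFO eviction.
struct PageCache {
    capacity: usize,
    pages: HashMap<PageKey, Arc<DecodedPage>>,
    order: VecDeque<PageKey>, // insertion order, oldest first
    hits: u64,
    misses: u64,
}

impl PageCache {
    fn new(capacity: usize) -> Self {
        PageCache {
            capacity,
            pages: HashMap::new(),
            order: VecDeque::new(),
            hits: 0,
            misses: 0,
        }
    }

    /// Return the cached page for `key`, or run `decode` (the expensive
    /// decompress + decode step) and remember its result.
    fn get_or_decode<F>(&mut self, key: PageKey, decode: F) -> Arc<DecodedPage>
    where
        F: FnOnce() -> DecodedPage,
    {
        if let Some(page) = self.pages.get(&key) {
            self.hits += 1;
            return Arc::clone(page);
        }
        self.misses += 1;
        let page = Arc::new(decode());
        if self.capacity == 0 {
            // A zero-capacity cache never stores anything.
            return page;
        }
        if self.pages.len() >= self.capacity {
            // Evict the oldest inserted page to stay within capacity.
            if let Some(oldest) = self.order.pop_front() {
                self.pages.remove(&oldest);
            }
        }
        self.order.push_back(key);
        self.pages.insert(key, Arc::clone(&page));
        page
    }
}
```

In this sketch the closure passed to `get_or_decode` stands in for the existing decode path, so a cache hit never runs it. Pages are handed out as `Arc` so several readers can share one decoded buffer without copying.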

**Describe alternatives you've considered**
I considered also adding a cache to the filter stage, but that part is already implemented.
I also considered page-level prefetching, but I do not think it would be very beneficial.

