`PartitionDataset` Caching Support #974

lordsoffallen · 2025-01-03T19:02:22Z

Description

I have a node that returns a `dict[str, Callable]` so that Kedro can save my partitioned data. I have often had cases where the run failed midway because of an edge case I didn't cover, and execution then starts all over again from the beginning.
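To make the situation concrete, here is a minimal sketch of a node of this shape. The names `make_partitions` and `expensive_compute` are hypothetical, and the failure on `part_c` is contrived purely to show how one uncovered edge case interrupts the run after earlier partitions were already computed.

```python
from typing import Callable


def expensive_compute(key: str) -> str:
    # Hypothetical stand-in for a costly step (for example a paid API call).
    if key == "part_c":
        raise ValueError(f"unhandled edge case in {key}")
    return f"data for {key}"


def make_partitions(keys: list[str]) -> dict[str, Callable[[], str]]:
    # Return callables so that each partition is computed only when it is saved.
    # The default argument binds the current key to each lambda.
    return {key: (lambda key=key: expensive_compute(key)) for key in keys}


partitions = make_partitions(["part_a", "part_b", "part_c"])
```

If the partitions are saved in order, `part_a` and `part_b` succeed, `part_c` raises, and a re-run recomputes all three.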

Context

I need this to speed up experimentation in Kedro and to avoid the unnecessary costs that can come from re-running the node.

Possible Implementation

Add a new parameter to `PartitionedDataset` that skips partitions whose files already exist. Something like `use_cache: True`.
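The behaviour I have in mind can be sketched in plain Python. This is not Kedro's API: `save_missing_partitions`, its `suffix` argument, and the use of `pathlib` against a local directory are all assumptions made for illustration, and only the `use_cache` name comes from the proposal above.

```python
from pathlib import Path
from typing import Callable


def save_missing_partitions(
    partitions: dict[str, Callable[[], str]],
    target_dir: Path,
    suffix: str = ".txt",
    use_cache: bool = True,
) -> list[str]:
    """Write each partition to target_dir, skipping files that already exist.

    Returns the keys that were actually computed and written.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for key, load in partitions.items():
        path = target_dir / f"{key}{suffix}"
        if use_cache and path.exists():
            # Saved by an earlier run: leave the file alone and
            # do not invoke the (potentially expensive) callable.
            continue
        path.write_text(load())
        written.append(key)
    return written
```

Because the callable runs before the file is opened, a partition that raises leaves no file behind, so the next run retries only that partition and the ones after it.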

Possible Alternatives

I can definitely inherit from the class and implement this myself, but I thought it would be a useful feature to have in the core code.

merelcht added the Community label (Issue/PR opened by the open-source community) on Jan 3, 2025
