How do I specify multiple data folders within the catalog.yml file? #1894
praneeth5222
started this conversation in
Idea
-
So there isn't a native way of doing this. The idea of a nested ("partitioned, partitioned") dataset has come up before, so it's always good to see this sort of request, as it can influence our next set of priorities. This does somewhat break one of Kedro's core tenets, reproducibility, but it may be possible to do by subclassing PartitionedDataSet and introducing some custom logic that handles the date window you need.
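As a rough illustration of what that custom logic might look like, here is a minimal sketch that filters partitions by their date folder. It assumes the PartitionedDataSet points at the root folder, and that partition IDs therefore come back as relative paths such as `2022-09-30/part-0001` mapped to load callables. The helper names `date_window` and `load_window` are hypothetical and not part of Kedro; the same filtering could live in a node or in an overridden method of a subclass.

```python
from datetime import date, timedelta
from typing import Any, Callable, Dict, Set


def date_window(end: date, days: int = 7) -> Set[str]:
    """Return ISO folder names (YYYY-MM-DD) for the `days` dates ending at `end`, inclusive."""
    return {(end - timedelta(days=offset)).isoformat() for offset in range(days)}


def load_window(
    partitions: Dict[str, Callable[[], Any]],
    end: date,
    days: int = 7,
) -> Dict[str, Any]:
    """Load only the partitions whose top-level folder falls inside the date window.

    Assumption: each partition ID starts with its date folder, e.g.
    "2022-09-30/part-0001", and maps to a zero-argument load function.
    """
    wanted = date_window(end, days)
    return {
        partition_id: load()
        for partition_id, load in sorted(partitions.items())
        if partition_id.split("/", 1)[0] in wanted
    }
```

Passing `end` in explicitly (for example from a run parameter) instead of calling `date.today()` inside the function keeps a given run repeatable, which goes some way towards addressing the reproducibility concern.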
-
Hello,
I wanted to know if there is a way to specify multiple data folders in the catalog.yml file.
At the moment my data storage has a root folder containing one folder per date (2022-09-30, 2022-10-01, etc.), and each date folder contains many parquet files.
I was able to load the multiple parquet files for a single date folder using the PartitionedDataSet.
Can anyone suggest a way to load a specified set of folders instead of a single folder (e.g. all the folders from 2022-09-26 to 2022-10-03)?
Ideally, I want to load the folders for the last 7 days, based on the current date.
Thank you!!