Add support for multi-node for %load_node or %load_pipeline #4170

Open
noklam opened this issue Sep 16, 2024 · 0 comments
Labels
Issue: Feature Request (New feature or improvement to existing feature)

Comments

@noklam
Contributor

noklam commented Sep 16, 2024

Description

%load_node assumes the error comes from the node it loads, but this is not always the case.

This is more common in Data Engineering pipelines, where you apply a series of transformations and aggregations to a set of tables and pass the result to the next node. For example, you may get a "Column not found" error. Loading the node that raised the error is only the first step in inspecting the data; there are still a couple of manual steps to figure out where the error originates.

The process roughly works like a binary search over the upstream nodes.
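The binary search described above can be sketched roughly as follows. This is only an illustration, not part of Kedro: `first_bad_node`, the node names, and the `columns_after` lookup are all hypothetical, and the sketch assumes a linear chain of upstream nodes where the data stays broken once one node has broken it.

```python
from typing import Callable, Optional, Sequence


def first_bad_node(
    nodes: Sequence[str],
    output_is_valid: Callable[[str], bool],
) -> Optional[str]:
    """Return the first node (in execution order) whose output fails the check.

    Assumes that once an output is invalid, every downstream output is
    invalid too, which is what makes bisection applicable.
    """
    lo, hi = 0, len(nodes)
    while lo < hi:
        mid = (lo + hi) // 2
        if output_is_valid(nodes[mid]):
            lo = mid + 1  # the problem is introduced later in the chain
        else:
            hi = mid  # the problem is introduced here or earlier
    return nodes[lo] if lo < len(nodes) else None


# Hypothetical upstream chain and the columns present after each node runs.
upstream = ["load_raw", "clean", "aggregate", "join", "report"]
columns_after = {
    "load_raw": {"customer_id", "amount"},
    "clean": {"customer_id", "amount"},
    "aggregate": {"amount"},  # "customer_id" is dropped here
    "join": {"amount"},
    "report": {"amount"},
}

culprit = first_bad_node(upstream, lambda name: "customer_id" in columns_after[name])
print(culprit)
```

In a real session the check would load and inspect each node's output by hand, which is exactly the manual work this issue is about.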

Context

This augments Kedro's existing debugging features and makes debugging much easier for DS & DE.

The runner is a powerful abstraction, but it is not beginner friendly. Bringing the execution explicitly into a notebook cell is helpful.

It's not a trivial task to work out the correct execution order when translating a Kedro pipeline into an imperative form (i.e. cells run sequentially in a notebook). During debugging, the abstraction is mostly a distraction.
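To make the ordering problem concrete, here is a minimal sketch that derives a sequential order for one target node and everything upstream of it. It deliberately avoids Kedro's own API: the `nodes` mapping, the dataset names, and `execution_order` are hypothetical stand-ins, and the ordering comes from the standard library's `graphlib.TopologicalSorter` (Python 3.9+).

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical pipeline: node name -> (input datasets, output datasets).
NodeSpec = Tuple[Sequence[str], Sequence[str]]

nodes: Dict[str, NodeSpec] = {
    "report": (["joined"], ["report_table"]),
    "join": (["aggregated"], ["joined"]),
    "aggregate": (["cleaned"], ["aggregated"]),
    "clean": (["raw_orders"], ["cleaned"]),
}


def execution_order(nodes: Dict[str, NodeSpec], target: str) -> List[str]:
    """Return the target node and its upstream nodes in a runnable order."""
    # Map each dataset to the node that produces it.
    producer = {out: name for name, (_, outs) in nodes.items() for out in outs}

    # Walk upstream from the target, recording which nodes each node depends on.
    deps: Dict[str, Set[str]] = {}
    stack = [target]
    while stack:
        name = stack.pop()
        if name in deps:
            continue
        inputs, _ = nodes[name]
        # Inputs with no producer (e.g. raw data) come from outside the pipeline.
        deps[name] = {producer[i] for i in inputs if i in producer}
        stack.extend(deps[name])

    # static_order() yields every node after all of its dependencies.
    return list(TopologicalSorter(deps).static_order())


order = execution_order(nodes, "join")
print(order)
```

Each name in the resulting list would correspond to one notebook cell, run top to bottom.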

Possible Implementation

Limitation: Creating multiple cells is not easy. I tried it when first implementing %load_node, but settled on the current solution because IPython has limitations. We may be able to do this in Jupyter Notebook (not VS Code notebooks) because it has better support.
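One way to sidestep the multiple-cell limitation, offered only as an assumption and not as the planned design, is to render every node into a single cell with a comment banner per node. The sketch below only builds the cell text; `render_single_cell` and the per-node source snippets are hypothetical.

```python
from typing import Mapping, Sequence


def render_single_cell(order: Sequence[str], sources: Mapping[str, str]) -> str:
    """Join per-node code snippets into one cell body, in execution order."""
    parts = []
    for name in order:
        # A banner comment marks where each node's code starts.
        parts.append(f"# --- node: {name} ---\n{sources[name].strip()}")
    return "\n\n".join(parts) + "\n"


# Hypothetical snippets, one per node, as a loader might generate them.
snippets = {
    "clean": "cleaned = clean(raw_orders)\n",
    "aggregate": "aggregated = aggregate(cleaned)",
}

cell = render_single_cell(["clean", "aggregate"], snippets)
print(cell)
```

Inside IPython, the resulting string could then be passed to `get_ipython().set_next_input(cell)` to pre-fill the next cell. Whether a single combined cell is acceptable for debugging is a design question this sketch does not answer.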

Possible Alternatives

noklam added the "Issue: Feature Request" label Sep 16, 2024