feat: generated columns #3123

Merged
merged 7 commits on Jan 15, 2025

@ion-elgreco (Collaborator) commented Jan 12, 2025

Description

Adds generated columns.

Related Issue(s)

@github-actions github-actions bot added the binding/rust Issues for the Rust crate label Jan 12, 2025
@ion-elgreco ion-elgreco force-pushed the feat/generated-columns branch 2 times, most recently from be67624 to e3f55b7 Compare January 12, 2025 16:47
@ion-elgreco ion-elgreco changed the title feat: generated columns feat: generated columns [WIP] Jan 12, 2025
@ion-elgreco ion-elgreco force-pushed the feat/generated-columns branch from 6efe64b to 18a3db8 Compare January 12, 2025 18:49
@rtyler rtyler marked this pull request as draft January 12, 2025 18:52
@ion-elgreco ion-elgreco force-pushed the feat/generated-columns branch 2 times, most recently from 4378bd9 to 173659c Compare January 12, 2025 19:22

codecov bot commented Jan 12, 2025

Codecov Report

Attention: Patch coverage is 60.70588% with 167 lines in your changes missing coverage. Please review.

Project coverage is 72.14%. Comparing base (0cfde96) to head (45ae708).
Report is 10 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/core/src/operations/write.rs | 34.48% | 51 Missing and 6 partials ⚠️ |
| crates/core/src/operations/merge/mod.rs | 53.84% | 30 Missing and 12 partials ⚠️ |
| crates/core/src/kernel/models/schema.rs | 72.50% | 20 Missing and 2 partials ⚠️ |
| crates/core/src/operations/add_column.rs | 0.00% | 16 Missing ⚠️ |
| crates/core/src/kernel/models/actions.rs | 83.58% | 3 Missing and 8 partials ⚠️ |
| crates/core/src/table/mod.rs | 50.00% | 9 Missing ⚠️ |
| crates/core/src/operations/cast/merge_schema.rs | 70.00% | 6 Missing ⚠️ |
| crates/core/src/delta_datafusion/mod.rs | 85.00% | 2 Missing and 1 partial ⚠️ |
| crates/core/src/operations/create.rs | 94.44% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3123      +/-   ##
==========================================
- Coverage   72.27%   72.14%   -0.13%     
==========================================
  Files         134      134              
  Lines       42973    43262     +289     
  Branches    42973    43262     +289     
==========================================
+ Hits        31059    31212     +153     
- Misses       9926    10043     +117     
- Partials     1988     2007      +19     


@ion-elgreco ion-elgreco force-pushed the feat/generated-columns branch 2 times, most recently from 91633d3 to 5fd0b6d Compare January 13, 2025 10:13
@@ -439,7 +438,11 @@ async fn write_execution_plan_with_predicate(
let checker = if let Some(snapshot) = snapshot {
DeltaDataChecker::new(snapshot)
} else {
DeltaDataChecker::empty()
debug!("Using plan schema to derive generated columns, since no snapshot was provided. Implies first write.");
Collaborator Author

A cleaner way might be to check upstream whether a metadata action exists, then convert its schema into a StructType and grab the generated columns from there.
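
To make the idea of grabbing the generated columns from a schema concrete, here is a minimal, self-contained sketch. It is not the code in this PR: `Field`, `GeneratedColumn`, `generated_columns`, and `example_schema` are hypothetical stand-ins rather than delta-rs types. The only thing taken from the Delta protocol is that a column's generation expression lives in its metadata under the `delta.generationExpression` key.

```rust
use std::collections::HashMap;

/// Column-metadata key the Delta protocol uses for a generation expression.
const GENERATION_EXPRESSION_KEY: &str = "delta.generationExpression";

/// Hypothetical, simplified stand-in for a schema field
/// (assumption: this is NOT the delta-rs `StructField`).
#[derive(Debug, Clone)]
struct Field {
    name: String,
    metadata: HashMap<String, String>,
}

/// Hypothetical stand-in for a generated column: its name and its expression.
#[derive(Debug, Clone, PartialEq)]
struct GeneratedColumn {
    name: String,
    expression: String,
}

/// Collect every field whose metadata carries a generation expression.
fn generated_columns(fields: &[Field]) -> Vec<GeneratedColumn> {
    fields
        .iter()
        .filter_map(|f| {
            f.metadata
                .get(GENERATION_EXPRESSION_KEY)
                .map(|expr| GeneratedColumn {
                    name: f.name.clone(),
                    expression: expr.clone(),
                })
        })
        .collect()
}

/// Build a two-column example schema: `id` is plain, `id_plus_one` is generated.
fn example_schema() -> Vec<Field> {
    let mut meta = HashMap::new();
    meta.insert(GENERATION_EXPRESSION_KEY.to_string(), "id + 1".to_string());
    vec![
        Field { name: "id".to_string(), metadata: HashMap::new() },
        Field { name: "id_plus_one".to_string(), metadata: meta },
    ]
}
```

Calling `generated_columns(&example_schema())` yields a single entry for `id_plus_one`. The same lookup would work whether the schema comes from a snapshot or from a metadata action on first write, which is the appeal of deriving it upstream.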

@github-actions github-actions bot added the binding/python Issues for the Python package label Jan 13, 2025
@ion-elgreco ion-elgreco changed the title feat: generated columns [WIP] feat: generated columns Jan 13, 2025
@ion-elgreco ion-elgreco marked this pull request as ready for review January 13, 2025 16:06
@ion-elgreco ion-elgreco requested a review from fvaleye as a code owner January 13, 2025 16:06
@ion-elgreco ion-elgreco force-pushed the feat/generated-columns branch from c12d5e1 to d541160 Compare January 13, 2025 16:06
Member

@rtyler rtyler left a comment

This looks good to me. It does touch a lot of areas of the write path that my performance work has been touching 😬, so I have some brutal conflict resolutions in my future.

I will release this in the Python 0.24 release and let it marinate a couple more days before I cut the 🦀 crates.

@rtyler rtyler enabled auto-merge January 15, 2025 03:00
@rtyler rtyler added this pull request to the merge queue Jan 15, 2025
auto-merge was automatically disabled January 15, 2025 03:24

Pull Request is not mergeable

Merged via the queue into delta-io:main with commit b7f75dd Jan 15, 2025
26 checks passed
Labels
binding/python Issues for the Python package binding/rust Issues for the Rust crate
Development

Successfully merging this pull request may close these issues.

Generated Columns
2 participants