-
Notifications
You must be signed in to change notification settings - Fork 139
feat: Add CreateTable API with simplified single-stage flow #1629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1629 +/- ##
==========================================
- Coverage 84.15% 84.12% -0.04%
==========================================
Files 123 124 +1
Lines 34180 34377 +197
Branches 34180 34377 +197
==========================================
+ Hits 28764 28919 +155
- Misses 4021 4051 +30
- Partials 1395 1407 +12 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
1861faa to
c41a7cc
Compare
| use serde_json::Value; | ||
| use tempfile::tempdir; | ||
| use test_utils::create_default_engine; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a test for CTAS. Ensure that CREATE table transaction cannot add remove actions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ill add that in write integration tests. Issue filed #1647.
c41a7cc to
13ec117
Compare
13ec117 to
53702a3
Compare
| /// * `storage` - The storage handler to use for listing | ||
| /// * `delta_log_url` - URL to the `_delta_log` directory | ||
| /// * `table_path` - Original table path (for error messages) | ||
| fn ensure_table_does_not_exist( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style question: @nicklan or @OussamaSaoudi -- Do we have a style guide on where/when pub functions should be vs private associated functions?
e.g. I (a "java" guy) would imagine a reader would expect/want pub fn create_table to be the top fn in this class, not a private function
(Not a blocker for this review -- can figure out offline)
| /// | ||
| /// This function checks the `_delta_log` directory to determine if a table already exists. | ||
| /// It handles various storage backend behaviors gracefully: | ||
| /// - If the directory doesn't exist (FileNotFound), returns Ok (new table can be created) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome. Perhaps in a future PR, we add tests for each of these cases, to be thorough?
scottsand-db
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks! Two minor questions (one on code organization, one on test followup) -- non blockers
Implement create-table functionality where CreateTableTransactionBuilder::build()
returns a Transaction with stored actions to be used for a commit.
API Usage:
let result = create_table(path, schema, engine_info)
.build(engine, committer)?
.commit(engine)?;
This specific change doesn't allow table properties and features to be set and
has validations in the transaction module which error if unsupported features or
properties such as row tracking and ICT are set in the table configuration being
pushed down.
Key Changes:
- CreateTableTransactionBuilder::build() takes committer and returns Transaction
with commit info, protocol and metadata actions.
- Transaction struct holds optional protocol/metadata actions for
create-table
- Adds try_new_create_table() constructor alongside
try_new_existing_table()
- commit() handles both existing-table and create-table flows
- get_write_context() now returns DeltaResult<WriteContext> for
proper error handling
This aligns the Rust Kernel's create-table flow with the Java Kernel's
approach where Transaction is the single unit for all commit operations.
Testing:
Unit tests
Add comprehensive tests validating the basic create_table() functionality introduced. Test coverage includes: - test_create_simple_table: Verifies basic table creation with a multi-column schema, snapshot version (0), and field preservation - test_create_table_already_exists: Validates error handling when attempting to create a table at an existing path - test_create_table_empty_schema: Ensures empty schema validation fails at build time with appropriate error message - test_create_table_log_actions: Verifies delta log structure with correct action ordering (CommitInfo, Protocol, Metadata) for ICT compliance, and validates action contents including engineInfo, operation type, protocol versions, and kernelVersion
53702a3 to
15afd12
Compare
🥞 Stacked PR
Use this link to review incremental changes.
Implement create-table functionality where CreateTableTransactionBuilder::build()
returns a Transaction with stored actions to be used for a commit.
This specific change doesn't allow table properties and features to be set and
has validations in the transaction module which error if unsupported features or
properties such as row tracking and ICT are set in the table configuration being
pushed down.
Key Changes:
with commit info, protocol and metadata actions.
This aligns the Rust Kernel's create-table flow with the Java Kernel's
approach where Transaction is the single unit for all commit operations.
Testing:
Unit and functional tests