@sanujbasu sanujbasu commented Jan 20, 2026

🥞 Stacked PR

Implement create-table functionality where CreateTableTransactionBuilder::build()
returns a Transaction with stored actions to be used for a commit.

API Usage:
  let result = create_table(path, schema, engine_info)
      .build(engine, committer)?
      .commit(engine)?;

This change does not yet allow table properties or features to be set.
Validations in the transaction module return an error if unsupported features
or properties, such as row tracking and in-commit timestamps (ICT), are set in
the table configuration being pushed down.
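
As a rough illustration of the validation described above, here is a minimal sketch. It is not the kernel's actual code: the function name, the error type, and the choice to reject a property whenever its key is present are all assumptions made for this example. The two keys are the standard Delta table properties for row tracking and in-commit timestamps.

```rust
use std::collections::HashMap;

/// Hypothetical error type for this sketch; the kernel defines its own errors.
#[derive(Debug, PartialEq)]
pub enum CreateTableError {
    UnsupportedProperty(String),
}

/// Table properties this sketch refuses at create time.
const UNSUPPORTED_PROPERTIES: [&str; 2] = [
    "delta.enableRowTracking",
    "delta.enableInCommitTimestamps",
];

/// Returns an error naming the first unsupported property found in the
/// configuration, or Ok(()) if none is present. Rejecting on key presence
/// (regardless of value) is an assumption of this sketch.
pub fn validate_create_table_properties(
    properties: &HashMap<String, String>,
) -> Result<(), CreateTableError> {
    for key in UNSUPPORTED_PROPERTIES.iter() {
        if properties.contains_key(*key) {
            return Err(CreateTableError::UnsupportedProperty(key.to_string()));
        }
    }
    Ok(())
}
```

Failing early like this keeps an unsupported configuration from ever reaching the commit path.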

Key Changes:

  1. CreateTableTransactionBuilder::build() takes a committer and returns a Transaction
     with commit info, protocol, and metadata actions.
  2. The Transaction struct holds optional protocol/metadata actions for create-table.
  3. Adds a try_new_create_table() constructor alongside try_new_existing_table().
  4. commit() handles both the existing-table and create-table flows.
  5. get_write_context() now returns DeltaResult<WriteContext> for proper error handling.

This aligns the Rust Kernel's create-table flow with the Java Kernel's
approach where Transaction is the single unit for all commit operations.
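
The shape of the key changes can be sketched in a few lines. This is a simplified model, not the kernel's implementation: `Protocol`, `Metadata`, `TxnError`, and `CommitPlan` are hypothetical stand-ins, the constructors take far fewer arguments than the real ones would, and the empty-schema check is placed in the constructor only to keep the example short (the PR performs it at build time).

```rust
/// Hypothetical stand-ins for the kernel's Protocol and Metadata actions.
#[derive(Debug, Clone, PartialEq)]
pub struct Protocol {
    pub min_reader_version: i32,
    pub min_writer_version: i32,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Metadata {
    pub schema_string: String,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum TxnError {
    EmptySchema,
    InconsistentState,
}

/// What a commit would write: the target version and the ordered action kinds.
#[derive(Debug, PartialEq)]
pub struct CommitPlan {
    pub version: u64,
    pub actions: Vec<&'static str>,
}

/// One transaction type for both flows; protocol and metadata are only
/// populated when the transaction creates a table.
pub struct Transaction {
    read_version: Option<u64>,
    protocol: Option<Protocol>,
    metadata: Option<Metadata>,
}

impl Transaction {
    pub fn try_new_existing_table(read_version: u64) -> Result<Self, TxnError> {
        Ok(Self { read_version: Some(read_version), protocol: None, metadata: None })
    }

    pub fn try_new_create_table(protocol: Protocol, metadata: Metadata) -> Result<Self, TxnError> {
        if metadata.schema_string.is_empty() {
            return Err(TxnError::EmptySchema);
        }
        Ok(Self { read_version: None, protocol: Some(protocol), metadata: Some(metadata) })
    }

    /// Existing table: commit at read_version + 1 with commit info only.
    /// Create-table: commit version 0 with commit info, protocol, metadata.
    pub fn commit(self) -> Result<CommitPlan, TxnError> {
        match (self.read_version, self.protocol, self.metadata) {
            (Some(v), None, None) => Ok(CommitPlan { version: v + 1, actions: vec!["commitInfo"] }),
            (None, Some(_), Some(_)) => Ok(CommitPlan {
                version: 0,
                actions: vec!["commitInfo", "protocol", "metaData"],
            }),
            _ => Err(TxnError::InconsistentState),
        }
    }
}
```

Keeping both flows behind one `commit()` means callers do not need a separate code path for table creation.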

Testing:
Unit and functional tests

@sanujbasu sanujbasu linked an issue Jan 20, 2026 that may be closed by this pull request
codecov bot commented Jan 20, 2026

Codecov Report

❌ Patch coverage is 80.08850% with 45 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.12%. Comparing base (435b94e) to head (15afd12).

Files with missing lines                 Patch %   Lines
kernel/src/transaction/mod.rs            72.41%    23 Missing and 9 partials ⚠️
kernel/src/transaction/create_table.rs   85.39%    8 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1629      +/-   ##
==========================================
- Coverage   84.15%   84.12%   -0.04%     
==========================================
  Files         123      124       +1     
  Lines       34180    34377     +197     
  Branches    34180    34377     +197     
==========================================
+ Hits        28764    28919     +155     
- Misses       4021     4051      +30     
- Partials     1395     1407      +12     

@sanujbasu sanujbasu force-pushed the stack/create_table_1 branch 2 times, most recently from 1861faa to c41a7cc Compare January 20, 2026 09:21
use serde_json::Value;
use tempfile::tempdir;
use test_utils::create_default_engine;

Collaborator

Add a test for CTAS (CREATE TABLE AS SELECT). Ensure that a create-table transaction cannot add remove actions.

Collaborator Author

I'll add that in the write integration tests. Filed issue #1647.

@sanujbasu sanujbasu force-pushed the stack/create_table_1 branch from c41a7cc to 13ec117 Compare January 21, 2026 03:29
/// * `storage` - The storage handler to use for listing
/// * `delta_log_url` - URL to the `_delta_log` directory
/// * `table_path` - Original table path (for error messages)
fn ensure_table_does_not_exist(

Collaborator

Style question: @nicklan or @OussamaSaoudi -- Do we have a style guide on where/when pub functions should be vs private associated functions?

e.g. I (a "java" guy) would imagine a reader would expect/want pub fn create_table to be the top fn in this class, not a private function

(Not a blocker for this review -- can figure out offline)

///
/// This function checks the `_delta_log` directory to determine if a table already exists.
/// It handles various storage backend behaviors gracefully:
/// - If the directory doesn't exist (FileNotFound), returns Ok (new table can be created)

Collaborator

Awesome. Perhaps in a future PR, we add tests for each of these cases, to be thorough?
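
For reference while writing those tests, here is a minimal local-filesystem sketch of an existence check of this kind. It is not the function under review, which takes a storage handler and a `_delta_log` URL; `local_table_does_not_exist` and `ExistsCheckError` are hypothetical names. Only the not-found case is taken from the doc comment above. Treating an empty `_delta_log` directory as "no table" and surfacing other I/O errors are assumptions of this sketch.

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ExistsCheckError {
    TableAlreadyExists(String),
    Io(String),
}

/// Returns Ok(()) when no table appears to exist at `table_path`.
pub fn local_table_does_not_exist(table_path: &Path) -> Result<(), ExistsCheckError> {
    let log_dir = table_path.join("_delta_log");
    match fs::read_dir(&log_dir) {
        // Directory is missing: a new table can be created here.
        Err(e) if e.kind() == ErrorKind::NotFound => Ok(()),
        // Any other I/O failure is surfaced (assumption of this sketch).
        Err(e) => Err(ExistsCheckError::Io(e.to_string())),
        Ok(mut entries) => {
            if entries.next().is_some() {
                // At least one entry in the log directory: treat as an existing table.
                Err(ExistsCheckError::TableAlreadyExists(
                    table_path.display().to_string(),
                ))
            } else {
                // Empty log directory: treated as "no table" (assumption).
                Ok(())
            }
        }
    }
}
```

Each match arm corresponds to one case a follow-up test could cover: missing directory, empty directory, populated directory, and an unrelated I/O error.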

@scottsand-db scottsand-db left a comment

LGTM! Thanks! Two minor questions (one on code organization, one on test followup) -- non blockers

Sanuj Basu and others added 2 commits January 21, 2026 21:42
Implement create-table functionality where CreateTableTransactionBuilder::build()
returns a Transaction with stored actions to be used for a commit.

API Usage:
  let result = create_table(path, schema, engine_info)
      .build(engine, committer)?
      .commit(engine)?;

This change does not yet allow table properties or features to be set.
Validations in the transaction module return an error if unsupported features
or properties, such as row tracking and in-commit timestamps (ICT), are set in
the table configuration being pushed down.

Key Changes:
- CreateTableTransactionBuilder::build() takes committer and returns Transaction
  with commit info, protocol and metadata actions.
- Transaction struct holds optional protocol/metadata actions for
  create-table
- Adds try_new_create_table() constructor alongside
  try_new_existing_table()
- commit() handles both existing-table and create-table flows
- get_write_context() now returns DeltaResult<WriteContext> for
  proper error handling

This aligns the Rust Kernel's create-table flow with the Java Kernel's
approach where Transaction is the single unit for all commit operations.

Testing:
Unit tests
Add comprehensive tests validating the basic create_table()
functionality introduced.

Test coverage includes:
- test_create_simple_table: Verifies basic table creation with a
  multi-column schema, snapshot version (0), and field preservation
- test_create_table_already_exists: Validates error handling when
  attempting to create a table at an existing path
- test_create_table_empty_schema: Ensures empty schema validation
  fails at build time with an appropriate error message
- test_create_table_log_actions: Verifies delta log structure with
  correct action ordering (CommitInfo, Protocol, Metadata) for ICT
  compliance, and validates action contents including engineInfo,
  operation type, protocol versions, and kernelVersion
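
The ordering check in test_create_table_log_actions can be approximated as follows. This is a sketch, not the PR's test, which parses the log with serde_json. The helper names are hypothetical, and the key extraction is deliberately naive: it assumes each line is a JSON object beginning with `{"<actionKind>":`. The Delta protocol requires commitInfo to be the first action in a commit when in-commit timestamps are enabled, which is why the order matters.

```rust
/// Returns the top-level key of a single-action JSON line, e.g. `commitInfo`.
/// Naive by design: assumes the line starts with `{"<key>"`.
fn action_kind(line: &str) -> Option<&str> {
    let rest = line.trim_start().strip_prefix("{\"")?;
    let end = rest.find('"')?;
    Some(&rest[..end])
}

/// Lists the action kinds in a commit file, in file order, skipping blank lines.
pub fn action_kinds(commit_file: &str) -> Vec<&str> {
    commit_file
        .lines()
        .filter(|line| !line.trim().is_empty())
        .filter_map(action_kind)
        .collect()
}

/// True when the commit contains exactly CommitInfo, Protocol, Metadata, in that order.
pub fn has_create_table_ordering(commit_file: &str) -> bool {
    action_kinds(commit_file) == ["commitInfo", "protocol", "metaData"]
}
```

A real test would deserialize each line and also assert on the action contents, as the commit message describes.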
@sanujbasu sanujbasu force-pushed the stack/create_table_1 branch from 53702a3 to 15afd12 Compare January 21, 2026 21:43
@github-actions github-actions bot removed the breaking-change label (a change that requires a major version bump) Jan 21, 2026
Development

Successfully merging this pull request may close these issues.

[EPIC] Create table API support for Kernel Rust

3 participants