Skip to content

Conversation

@DrakeLin
Copy link
Collaborator

What changes are proposed in this pull request?

How was this change tested?

@github-actions github-actions bot added the breaking-change Change that require a major version bump label Jan 18, 2026
@DrakeLin DrakeLin force-pushed the drake-lin_data/stats-all branch from 73fa938 to e3b6498 Compare January 19, 2026 02:15
})
.collect();
let data = StructArray::try_new(output_fields.into(), output_cols, None)?;
let data = StructArray::try_new(output_fields.into(), output_cols, source_null_buffer)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch! Should we split this out into a separate PR? This is a pretty serious bug

@Fokko
Copy link
Collaborator

Fokko commented Jan 19, 2026

I'm still learning, but I would also expect that we extend the Add action with parsed_stats, or is there a reason not to do this?


StructTypeBuilder::from_schema(mandatory_add_file_schema())
.add_field(StructField::nullable(
"stats",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we re-use this name? I think this might cause confusion since it will also contain the blob

stats_schema: Option<&StructType>,
) -> DeltaResult<(
impl Iterator<Item = DeltaResult<ActionsBatch>> + Send,
Option<bool>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to wrap the bool in an option?

checkpoint_read_schema: SchemaRef,
meta_predicate: Option<PredicateRef>,
) -> DeltaResult<impl Iterator<Item = DeltaResult<ActionsBatch>> + Send> {
stats_schema: Option<&StructType>,
Copy link
Collaborator

@Fokko Fokko Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a SchemaRef as well? In the case of wide tables, it can be pretty big (a multiple of the columns in the table itself). And should we add it the docstring?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change Change that require a major version bump

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants