Extend synchronization primitives to allow cross-chunk sync #11

Imagine we want to do type inference on the columns of a CSV file. We'd start with an initial guess for each column and then process the file in chunks, as we always do. Now one of the workers discovers that a column that was until now considered an `Int` has to be a `Float64`. How should we spread this information to the other chunks? Should this be a responsibility of the `consume` context? Should we define a specific callback that would work as a barrier: we'd stop parsing, wait for all workers to reach the barrier, sync their schemas, and release them? Doing this in `sync_tasks` is problematic, because it only synchronizes chunks belonging to one of the two buffers.
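
To make the barrier idea concrete, here is a rough sketch. It is written in Python purely for illustration and is not tied to this package's API: `run`, `promote`, `infer`, and the `RANK` table are all hypothetical names, column types are modeled as plain strings, and it assumes one chunk per worker. Each worker widens its own copy of the schema while parsing, then waits at a barrier; the barrier's action merges all local schemas once, and every worker adopts the merged result before continuing.

```python
import threading

# Hypothetical promotion order: a wider type has a higher rank.
RANK = {"Int": 0, "Float64": 1, "String": 2}


def promote(a, b):
    """Return the wider of two column type names."""
    return a if RANK[a] >= RANK[b] else b


def infer(field):
    """Guess the narrowest type name that can hold a raw CSV field."""
    for name, cast in (("Int", int), ("Float64", float)):
        try:
            cast(field)
            return name
        except ValueError:
            pass
    return "String"


def run(chunks, initial_schema):
    """Parse one chunk per worker, then sync schemas at a barrier."""
    n = len(chunks)
    local = [list(initial_schema) for _ in range(n)]  # per-worker schemas
    merged = list(initial_schema)                     # shared result

    def merge_all():
        # Called once, by a single thread, after every worker has arrived
        # and before any of them is released.
        for schema in local:
            for col, typ in enumerate(schema):
                merged[col] = promote(merged[col], typ)

    barrier = threading.Barrier(n, action=merge_all)

    def worker(i):
        for row in chunks[i]:
            for col, field in enumerate(row):
                local[i][col] = promote(local[i][col], infer(field))
        barrier.wait()           # stop parsing until all workers arrive
        local[i] = list(merged)  # adopt the synchronized schema

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return merged, local
```

In this sketch, if only one of three workers sees a value like `6.5` in the second column, all three still leave the barrier with that column typed as `Float64`. The open question from above remains: a real implementation would have to decide where such a barrier lives relative to `consume` and `sync_tasks`, and how it interacts with the two buffers.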

More design work is needed on this issue.
