Dataset design

Creating this issue to discuss the detailed design of `Dataset[A]` (and potentially `DataStream[A]`).

As discussed in the meeting with John, I think it's worth thinking through what this API would look like for both batch and streaming. We can go several ways with this:

The first would be a unified API like Spark's Dataset (and structured streaming).
https://spark.apache.org/docs/2.3.1/structured-streaming-programming-guide.html#programming-model

Another would be Flink's DataSet/DataStream APIs (which are built on top of their stateful streams abstraction).
https://ci.apache.org/projects/flink/flink-docs-release-1.5/concepts/programming-model.html
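To make the two directions easier to compare, here is a rough Scala sketch of what each shape might look like. Every name in it (`UnifiedApi`, `SplitApi`, `Source`, `Bounded`, `Unbounded`, and the `map` signatures) is a placeholder I made up for discussion. None of it is taken from Spark, Flink, or any existing code in this project, and it only models `map` over in-memory data.

```scala
// Hypothetical sketch only: all names here are assumptions for discussion,
// not an existing API in this project, Spark, or Flink.

// Direction 1: a single Dataset[A]; batch vs. stream is a property of the source.
object UnifiedApi {
  sealed trait Source[A]
  // A finite, fully known collection (batch).
  final case class Bounded[A](items: List[A]) extends Source[A]
  // A possibly infinite sequence, produced on demand (stream).
  final case class Unbounded[A](items: () => Iterator[A]) extends Source[A]

  final case class Dataset[A](source: Source[A]) {
    // The same transformation works for both kinds of source.
    def map[B](f: A => B): Dataset[B] = source match {
      case Bounded(items)   => Dataset[B](Bounded(items.map(f)))
      case Unbounded(items) => Dataset[B](Unbounded(() => items().map(f)))
    }
  }
}

// Direction 2: two separate types, one per execution mode.
object SplitApi {
  final case class Dataset[A](items: List[A]) {
    def map[B](f: A => B): Dataset[B] = Dataset(items.map(f))
  }

  final case class DataStream[A](items: () => Iterator[A]) {
    // Lazy: nothing is evaluated until the iterator is consumed.
    def map[B](f: A => B): DataStream[B] = DataStream(() => items().map(f))
  }
}
```

In the unified shape, user code is written once against `Dataset[A]` and the source decides how it runs. In the split shape, batch-only operations (a full sort, for example) can live on `Dataset` alone, at the cost of duplicating the shared operations on both types.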

I'd like to flesh out what this API should look like and how it should function in more detail here.

Dataset design #4