add protocol methods to protocol source #17
base: main
Hey there and thank you for opening this pull request! 👋🏼 We require pull request titles to follow the Conventional Commits specification and it looks like your proposed title needs to be adjusted. Details:
**interface of both source and destination**
```
spec() -> Stream<AirbyteConnectorSpecification>
```
Is it worth making these objects links into the protocol.yaml file so it is easy to look up `AirbyteConnectorSpecification`, etc., or having an appendix at the bottom with links to them?
I like this, but I think the miss in the current implementation is that you can't determine the actual --flag names from the information provided. Some notes on that below.
**source only**
```
discover(Config) -> AirbyteCatalog
read(Config, ConfiguredAirbyteCatalog, State) -> Stream<AirbyteRecordMessage | AirbyteStateMessage | AirbyteControlMessage>
```
I think this method-style representation is a bit misleading. The flag argument isn't `--ConfiguredAirbyteCatalog`, it is `--catalog`. We also don't explain whether we are passing the object itself (e.g. stringified JSON) or a file path.
Perhaps a way to keep this method-like signature could be:
read(config -> File<Config>, catalog -> File<ConfiguredAirbyteCatalog>, state -> File<State>) -> Stream<...>
... but now I'm just making things up.
I think JSONSchema might work better for this:
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "additionalProperties": true, // this is important now
  "required": ["config", "catalog"], // showing that state is optional
  "arguments": {
    "config": { "type": "file_path", "$ref": "Config.yaml" },
    "catalog": { "type": "file_path", "$ref": "ConfiguredAirbyteCatalog.yaml" },
    "state": { "type": "file_path", "$ref": "State.yaml" }
  }
}
Here's a real READ command for reference:
docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/integration_tests:/integration_tests airbyte/source-faker:dev read --config /secrets/config.json --catalog /integration_tests/configured_catalog.json
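To make the flag-name point concrete, here is a minimal Python sketch of how the three `read` inputs might map onto command-line arguments. The helper name `build_read_argv` is hypothetical, and the `--state` flag is an assumption, since the command above only shows `--config` and `--catalog`.

```python
from typing import List, Optional


def build_read_argv(
    config_path: str,
    catalog_path: str,
    state_path: Optional[str] = None,
) -> List[str]:
    """Return the connector arguments for `read` (everything after the image name)."""
    # Each input is passed as a path to a JSON file, not as inline JSON.
    argv = ["read", "--config", config_path, "--catalog", catalog_path]
    # Assumption: state is optional and, when present, is passed as `--state <path>`.
    if state_path is not None:
        argv += ["--state", state_path]
    return argv


print(build_read_argv("/secrets/config.json", "/integration_tests/configured_catalog.json"))
```

Note that the flag names (`config`, `catalog`) differ from the type names (`Config`, `ConfiguredAirbyteCatalog`), which is the information the method-style signature leaves out.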
I do think the JSON schema version is a little clearer. Maybe there's a way to distinguish STDIN/STDOUT parameters from arguments to the method call?
In addition to the return types mentioned below, all methods can return the following message types: `AirbyteLogMessage | AirbyteTraceMessage`.
👍 I missed this note at first
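Since this note is easy to miss, a small sketch may help show what it implies for a consumer of any method's output. It assumes each message arrives as one JSON object per line with an uppercase `type` field (`LOG` and `TRACE` for the two common types); the function name `partition_messages` and the sample payloads are illustrative only.

```python
import json
from typing import Dict, Iterable, List

# Assumption: log and trace messages are tagged with these "type" values.
COMMON_TYPES = {"LOG", "TRACE"}


def partition_messages(lines: Iterable[str]) -> Dict[str, List[dict]]:
    """Split output lines into common (log/trace) and method-specific messages."""
    out: Dict[str, List[dict]] = {"common": [], "method": []}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between messages
        message = json.loads(line)
        bucket = "common" if message.get("type") in COMMON_TYPES else "method"
        out[bucket].append(message)
    return out


sample = [
    '{"type": "LOG", "log": {"level": "INFO", "message": "starting"}}',
    '{"type": "RECORD", "record": {"stream": "users", "data": {"id": 1}}}',
]
result = partition_messages(sample)
print(len(result["common"]), len(result["method"]))
```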
* If an input parameter has no name, then it is passed via STDIN.

In addition to the return types mentioned below, all methods can return the following message types: `AirbyteLogMessage | AirbyteTraceMessage`.
A note stating that additional arguments should be ignored and not validated should be here somewhere.
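One way a connector entrypoint could honor that rule is sketched below with Python's standard `argparse`, whose `parse_known_args` returns unrecognized arguments instead of failing on them. The function name `parse_read_args`, the `--state` flag, and the extra `--debug-level` flag are assumptions made for illustration.

```python
import argparse
from typing import List, Tuple


def parse_read_args(argv: List[str]) -> Tuple[argparse.Namespace, List[str]]:
    """Parse the arguments of `read`, tolerating flags this connector does not know."""
    # allow_abbrev=False keeps unknown flags from being matched as prefixes of known ones.
    parser = argparse.ArgumentParser(prog="read", allow_abbrev=False)
    parser.add_argument("--config", required=True)
    parser.add_argument("--catalog", required=True)
    parser.add_argument("--state", default=None)  # assumption: state is optional
    # Unknown arguments are returned in the second element rather than raising an error.
    return parser.parse_known_args(argv)


known, ignored = parse_read_args(
    ["--config", "/secrets/config.json", "--catalog", "/catalog.json", "--debug-level", "3"]
)
print(known.config, known.catalog, known.state)
print(ignored)
```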
## Method Interfaces

We describe these interfaces in pseudocode for clarity. Clarifications on the pseudocode semantics:
* Any `Stream~ that is mentioned as input arg, is passed to the docker contained via STDIN.
I feel like the ~ is throwing me off here, I was expecting it to mean something, maybe just empty brackets or filled with a generic Type would be clearer?
Stream<>
Stream<T>
Stream<...>
* Any `Stream~ that is mentioned as input arg, is passed to the docker contained via STDIN.
* All other parameters are passed in as command line args (e.g. --config <path to config file>).
* Each input parameter is described as its type (as defined in airbyte_protocol.yml and the name of the parameter).
* If an input parameter has no name, then it is passed via STDIN.
I don't think this holds up, or I am misunderstanding it.
Reading this I expected all signatures to be like
methodName(argName: ArgType)
// or
methodName(ArgType argName)
And then there would be a distinction for args passed via stdin, which would have only a type.
We describe these interfaces in pseudocode for clarity. Clarifications on the pseudocode semantics:
* Any `Stream~ that is mentioned as input arg, is passed to the docker contained via STDIN.
* All other parameters are passed in as command line args (e.g. --config <path to config file>).
Maybe a controversial opinion: I think trying to represent both the stdin/stdout values and the method parameters in the same step is confusing.
Maybe state that the method returns an I/O stream, and then define what that stream accepts and returns as a secondary step?
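As a rough illustration of that two-step idea, the sketch below describes the command-line arguments and the I/O stream separately. Everything here (`IoSpec`, `MethodSpec`, `describe`, and the choice of fields) is hypothetical and not part of the proposed document.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class IoSpec:
    stdin: Optional[str]  # type read from STDIN, or None if nothing is read
    stdout: List[str]     # message types written to STDOUT


@dataclass
class MethodSpec:
    name: str
    args: Dict[str, str]  # flag name -> type of the file that flag points to
    io: IoSpec            # step two: what the I/O stream accepts and returns


READ = MethodSpec(
    name="read",
    args={"config": "Config", "catalog": "ConfiguredAirbyteCatalog", "state": "State"},
    io=IoSpec(
        stdin=None,
        stdout=["AirbyteRecordMessage", "AirbyteStateMessage", "AirbyteControlMessage"],
    ),
)


def describe(spec: MethodSpec) -> str:
    """Render the two steps side by side: flags first, then the I/O stream."""
    flags = " ".join(f"--{flag} <path to {type_name} file>" for flag, type_name in spec.args.items())
    stdin = spec.io.stdin or "nothing"
    return f"{spec.name} {flags} | stdin: {stdin} | stdout: {' | '.join(spec.io.stdout)}"


print(describe(READ))
```

Keeping `args` and `io` apart means the flag names stay visible while the stream types are documented on their own.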
FWIW, I also think this is a problem in our other docs describing these methods
Here's a first pass at adding the protocol methods to the protocol repo. I could not find a way to make JSONSchema work nicely. Definitely open to other ways of expressing this. If we like this approach, we can clean it up and move forward.