shunping
Collaborator

@shunping shunping commented Sep 2, 2025

Managed JDBCIO (Part 1) - Postgres

This is the first PR toward making JDBCIO a managed IO.

Note that this still uses the GCP IO expansion service jar. I will move it to the designated expansion service jar once all four supported databases are done.

@shunping shunping changed the title from "Add JDBCIO to managed io." to "Support managed jdbc io (Postgres)" Sep 2, 2025
@shunping shunping requested a review from ahmedabu98 September 3, 2025 02:40
@shunping shunping marked this pull request as ready for review September 3, 2025 02:40
@shunping shunping self-assigned this Sep 3, 2025
@shunping shunping requested a review from chamikaramj September 3, 2025 02:40
Contributor

github-actions bot commented Sep 3, 2025

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment `assign set of reviewers`.

@ahmedabu98
Contributor

Will take a look later, but from a quick skim, it looks like we're missing translation logic in this PR

Contributor

github-actions bot commented Sep 3, 2025

Assigning reviewers:

R: @liferoad for label python.
R: @Abacn for label java.

Note: If you would like to opt out of this review, comment `assign to next reviewer`.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@shunping
Collaborator Author

shunping commented Sep 3, 2025

> Will take a look later, but from a quick skim, it looks like we're missing translation logic in this PR

Thanks for taking a look. I've been able to call both the managed and unmanaged versions of these transforms from Java and Python. Could you provide some pointers on where this translation logic is used?

@shunping
Collaborator Author

shunping commented Sep 3, 2025

> > Will take a look later, but from a quick skim, it looks like we're missing translation logic in this PR
>
> Thanks for taking a look. I've been able to call both the managed and unmanaged versions of these transforms from Java and Python. Could you provide some pointers on where this translation logic is used?

Added translation logic. For my own education, though, where is this translation called?

@ahmedabu98
Contributor

> where is this translation called

when converting the transform to its proto representation:

FunctionSpec spec = payloadTranslator.translate(appliedPTransform, components);

@shunping
Collaborator Author

shunping commented Sep 3, 2025

> > where is this translation called
>
> when converting the transform to its proto representation:
>
> FunctionSpec spec = payloadTranslator.translate(appliedPTransform, components);

Then why could my previous integration tests and my manual call in python work without it?

@ahmedabu98
Contributor

> Then why could my previous integration tests and my manual call in python work without it?

Because it can still build a viable transform proto just fine, and run in a normal pipeline.

The extra translation logic is just needed to add more information that the Dataflow managed service can make use of to do upgrades.
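
To make the distinction concrete, here is a rough, self-contained sketch of the idea. It is not Beam's code: `TranslationSketch`, `Spec`, the sketch's own `PayloadTranslator` interface, the URN, and the config key are all hypothetical stand-ins. It only illustrates that a transform can still be turned into a usable spec without a translator, while a registered translator adds the recorded configuration that an upgrade service would presumably need.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types are Beam classes.
final class TranslationSketch {

  /** Stand-in for a transform's proto spec: a URN plus an optional serialized config. */
  static final class Spec {
    final String urn;
    final String configPayload; // null when no translator supplied one

    Spec(String urn, String configPayload) {
      this.urn = urn;
      this.configPayload = configPayload;
    }

    /** Assumption: a service can only rebuild the transform if its config was recorded. */
    boolean isUpgradable() {
      return configPayload != null;
    }
  }

  /** Stand-in for a payload translator: records the configuration in the spec. */
  interface PayloadTranslator {
    Spec translate(String urn, Map<String, String> config);
  }

  private final Map<String, PayloadTranslator> translators = new HashMap<>();

  void register(String urn, PayloadTranslator translator) {
    translators.put(urn, translator);
  }

  /** Converts a transform to its spec, with or without a registered translator. */
  Spec toSpec(String urn, Map<String, String> config) {
    PayloadTranslator translator = translators.get(urn);
    if (translator == null) {
      // Still a viable spec for a normal pipeline run, but it carries no config.
      return new Spec(urn, null);
    }
    return translator.translate(urn, config);
  }

  public static void main(String[] args) {
    Map<String, String> config = new HashMap<>();
    config.put("table", "example_table"); // hypothetical config key and value

    TranslationSketch sketch = new TranslationSketch();
    String urn = "sketch:postgres_read:v1"; // made-up URN

    // Without a translator the spec is usable but not upgradable.
    System.out.println(sketch.toSpec(urn, config).isUpgradable());

    // With a translator registered, the config travels with the spec.
    sketch.register(urn, (u, c) -> new Spec(u, c.toString()));
    System.out.println(sketch.toSpec(urn, config).isUpgradable());
  }
}
```

In this toy model both calls to `toSpec` succeed, which mirrors why the integration tests and the manual Python call worked before the translation logic was added.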

Contributor

@ahmedabu98 ahmedabu98 left a comment

Left a few comments, but looks good overall! Thanks @shunping

@shunping
Collaborator Author

shunping commented Sep 4, 2025

> > Then why could my previous integration tests and my manual call in python work without it?
>
> Because it can still build a viable transform proto just fine, and run in a normal pipeline.
>
> The extra translation logic is just needed to add more information that the Dataflow managed service can make use of to do upgrades.

Thanks for clarifying this. cc'ed @liferoad

@liferoad
Contributor

liferoad commented Sep 4, 2025

> > > Then why could my previous integration tests and my manual call in python work without it?
>
> > Because it can still build a viable transform proto just fine, and run in a normal pipeline.
> > The extra translation logic is just needed to add more information that the Dataflow managed service can make use of to do upgrades.
>
> Thanks for clarifying this. cc'ed @liferoad

We should have a README file that documents all these requirements.

@shunping shunping requested a review from ahmedabu98 September 5, 2025 11:09
@shunping shunping merged commit be4fb97 into apache:master Sep 8, 2025
138 of 142 checks passed