-
Notifications
You must be signed in to change notification settings - Fork 4.4k
Fix jdbc logical type not found when python sdk worker running in docker env #36014
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Can we confirm it is caused by #34695? This suggests apache_beam.typehints.schemas.LogicalType.register_logical_type is fundamentally broken in the case of "python sdk worker running in docker env". We should fix the root cause |
cc'ed @claudevdm |
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
Assigning reviewers: R: @claudevdm for label python. Note: If you would like to opt out of this review, comment Available commands:
The PR bot will only process comments in the main thread (not review comments). |
@@ -690,7 +690,16 @@ def add(self, urn, logical_type): | |||
self.by_language_type[logical_type.language_type()] = logical_type | |||
|
|||
def get_logical_type_by_urn(self, urn): | |||
return self.by_urn.get(urn, None) | |||
logical_type = self.by_urn.get(urn, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we just register the beam:logical_type:javasdk_date:v1 and beam:logical_type:javasdk_time:v1 logical types in shcemas.py? Or do they they conflict with something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The import here is for importing classes and logical type registration. We need those classes to be imported first before registration.
Also, we cannot put the import at the beginning of schemas.py due to circular dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I meant we also define them in schemas.py, like other logical types in schemas.py
class MillisInstant(NoArgumentLogicalType[Timestamp, np.int64]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I discussed with @Abacn who initially implemented these JDBC logical types. We think we can move them in schemas.py
.
Note that we will inevitably introduce this non-portable piece into schemas.py
, either by the import method in the current PR or by moving those logical types directly into the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand we can revert "import apache_beam.io.jdbc" here now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. Fixed.
@@ -96,10 +95,11 @@ | |||
from apache_beam.transforms.external import BeamJarExpansionService | |||
from apache_beam.transforms.external import ExternalTransform | |||
from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder | |||
from apache_beam.typehints.schemas import JdbcDateType # pylint: disable=unused-import |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need to be imported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To keep the backward compatibility when user uses from apache_beam.io.jdbc import JdbcDateType
in their own scripts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To maintain backward compatibility. We don't want to introduce breaking changes for users who were previously importing these logical types directly from apache_beam.io.jdbc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably want to add this comments to justify "# pylint: disable=unused-import" as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. We will eventually revert/remove this temporary fix, so let's get it in and unblock the release first. Thanks both!
fixes #36013