[FLINK-TODO] [pyflink] Replace deprecated avro-python3 with avro
- `avro-python3` was deprecated and replaced by `avro`. The former has not had a release since March 17, 2021, while the latter has received multiple fixes and updates, most recently on August 5, 2024.
- The two libraries are the same project (`avro-python3` was renamed to `avro` and has been updated several times since), but both install the same `avro` package name. As a result, using PyFlink alongside any up-to-date library that depends on `avro` fails when the pipeline starts, even if the pipeline does no Avro encoding or decoding.
- This change switches the dependency to `avro` (`>=1.12.0`) and fixes a few imports. Beyond that, the library's functionality is unchanged.
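
The package-name overlap described above can be checked in an existing environment. The following is a minimal sketch, not part of this commit: `find_avro_conflict` and `installed_distribution_names` are hypothetical helper names, and the name normalization is deliberately simplified (lowercase, underscores to hyphens).

```python
from importlib import metadata

# Both distributions install a top-level package named ``avro``,
# so having the two side by side means one shadows the other.
OVERLAPPING = {"avro", "avro-python3"}


def normalize(name):
    """Simplified distribution-name normalization (assumption: good enough here)."""
    return name.lower().replace("_", "-")


def find_avro_conflict(installed_names):
    """Return the overlapping distributions if more than one is installed."""
    found = sorted({normalize(n) for n in installed_names} & OVERLAPPING)
    return found if len(found) > 1 else []


def installed_distribution_names():
    """Collect the names of all distributions visible to this interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            names.append(name)
    return names
```

Calling `find_avro_conflict(installed_distribution_names())` returns an empty list when at most one of the two distributions is present.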
mina-asham committed Jan 17, 2025
1 parent 33de4ea commit 7c11f44
Showing 5 changed files with 7 additions and 8 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
-# Apache Flink
+# Apache Flink

 Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities.

2 changes: 1 addition & 1 deletion flink-python/README.md
@@ -1,4 +1,4 @@
-# Apache Flink
+# Apache Flink

 Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale.

2 changes: 1 addition & 1 deletion flink-python/dev/dev-requirements.txt
@@ -20,7 +20,7 @@ cython>=0.29.24
 py4j==0.10.9.7
 python-dateutil>=2.8.0,<3
 cloudpickle~=2.2.0
-avro-python3>=1.8.1,!=1.9.2
+avro>=1.12.0
 pandas>=1.3.0
 pyarrow>=5.0.0
 pytz>=2018.3
7 changes: 3 additions & 4 deletions flink-python/pyflink/fn_execution/formats/avro.py
@@ -17,14 +17,13 @@
 ################################################################################
 import struct

+from avro.errors import AvroTypeException, SchemaResolutionException
 from avro.io import (
-    AvroTypeException,
     BinaryDecoder,
     BinaryEncoder,
     DatumReader,
     DatumWriter,
-    SchemaResolutionException,
-    Validate,
+    validate,
 )

 STRUCT_FLOAT = struct.Struct('>f') # big-endian float
@@ -224,7 +223,7 @@ def write_union(self, writer_schema, datum, encoder):
         # resolve union
         index_of_schema = -1
         for i, candidate_schema in enumerate(writer_schema.schemas):
-            if Validate(candidate_schema, datum):
+            if validate(candidate_schema, datum):
                 index_of_schema = i
         if index_of_schema < 0:
             raise AvroTypeException(writer_schema, datum)
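
The hunks above show the two API differences this commit adapts to: `Validate` became `validate`, and the exception classes moved from `avro.io` to `avro.errors`. The commit handles this by requiring `avro>=1.12.0`. Code that had to run against either distribution could instead resolve the names at import time. A minimal sketch of that alternative, not part of this commit; `resolve_attr` is a hypothetical helper:

```python
import importlib


def resolve_attr(candidates):
    """Return the first attribute found among (module_name, attr_name) pairs.

    Candidates are tried in order; modules that cannot be imported and
    modules lacking the attribute are skipped.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    raise ImportError("none of the candidates could be resolved: %r" % (candidates,))
```

With it, `resolve_attr([("avro.io", "validate"), ("avro.io", "Validate")])` would pick whichever spelling the installed package provides, at the cost of supporting two code paths instead of one pinned dependency.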
2 changes: 1 addition & 1 deletion flink-python/setup.py
@@ -318,7 +318,7 @@ def extracted_output_files(base_dir, file_path, output_directory):

 install_requires = ['py4j==0.10.9.7', 'python-dateutil>=2.8.0,<3',
 'apache-beam>=2.43.0,<2.49.0',
-'cloudpickle>=2.2.0', 'avro-python3>=1.8.1,!=1.9.2',
+'cloudpickle>=2.2.0', 'avro>=1.12.0',
 'pytz>=2018.3', 'fastavro>=1.1.0,!=1.8.0', 'requests>=2.26.0',
 'protobuf>=3.19.0',
 'numpy>=1.22.4',