
feat(merge): merge apache kafka 3.9 398b4c4fa1a5678be28a3fb9196ba4553f356291 #2105

Merged: 124 commits, Oct 31, 2024

Conversation

superhx (Collaborator) commented Oct 31, 2024

Conflict files:

        both modified:   build.gradle
        both modified:   core/src/main/scala/kafka/log/UnifiedLog.scala
        both modified:   core/src/main/scala/kafka/server/BrokerServer.scala
        both modified:   core/src/main/scala/kafka/server/ControllerServer.scala
        both modified:   core/src/main/scala/kafka/server/KafkaRequestHandler.scala
        both modified:   core/src/main/scala/kafka/tools/StorageTool.scala
        both modified:   core/src/test/scala/unit/kafka/server/KafkaApisTest.scala
        both modified:   gradle/dependencies.gradle
        both modified:   licenses/APL
        both modified:   metadata/src/main/java/org/apache/kafka/controller/ClusterControlManager.java
        both modified:   metadata/src/main/java/org/apache/kafka/controller/ConfigurationControlManager.java
        both modified:   metadata/src/main/java/org/apache/kafka/controller/QuorumController.java
        both modified:   metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java
        both modified:   storage/src/main/java/org/apache/kafka/storage/internals/log/LogConfig.java
        both modified:   tests/docker/Dockerfile
        both modified:   tests/kafkatest/services/kafka/kafka.py
        both modified:   tests/kafkatest/version.py
        both modified:   tests/setup.py

cmccabe and others added 30 commits July 29, 2024 15:58
…(#16433)

This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR exposes the new ErrorHandlerContext as a parameter to the production exception handler and deprecates the previous handle signature.

Co-authored-by: Dabz <[email protected]>
Co-authored-by: loicgreffier <[email protected]>

Reviewers: Bruno Cadonna <[email protected]>, Matthias J. Sax <[email protected]>
…dler (#16432)

This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR exposes the new ErrorHandlerContext as a parameter to the deserialization exception handlers and deprecates the previous handle signature.

Co-authored-by: Dabz <[email protected]>
Co-authored-by: loicgreffier <[email protected]>

Reviewers: Bruno Cadonna <[email protected]>, Matthias J. Sax <[email protected]>
* KAFKA-17214: Add 3.8.0 version to streams system tests

Reviewers: Bill Bejeck <[email protected]>
…675)

This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR catches exceptions thrown while handling a processing exception.

Co-authored-by: Dabz <[email protected]>
Co-authored-by: loicgreffier <[email protected]>

Reviewers: Bruno Cadonna <[email protected]>, Matthias J. Sax <[email protected]>
… exceptions (#16450)" (#16738)

This reverts commit 15a4501.

We consider this change backward incompatible and will fix forward for 4.0
release via KIP-1065, but need to revert for 3.9 release.

Reviewers: Josep Prat <[email protected]>, Bill Bejeck <[email protected]>
…ue (#16736)

Part of KIP-1033. Minor code cleanup.

Reviewers: Matthias J. Sax <[email protected]>
This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR actually catches processing exceptions from punctuate.

Co-authored-by: Dabz <[email protected]>
Co-authored-by: loicgreffier <[email protected]>

Reviewers: Bruno Cadonna <[email protected]>, Matthias J. Sax <[email protected]>
…53 (#16759)

describe --status now includes directory id and endpoint information for voters and observers.
describe --replication now includes directory id.

Reviewers: Colin P. McCabe <[email protected]>, José Armando García Sancio <[email protected]>
…16748)

We need to migrate GroupMetadataMessageFormatter from Scala to Java, and make the message format JSON.

Reviewers: Chia-Ping Tsai <[email protected]>
Updated docs for KIP-1033.

Reviewers: Matthias J. Sax <[email protected]>
…mer` to broker configs when system tests need the new coordinator (#16715)

Fixes an issue that caused system tests to fail when using AsyncKafkaConsumer.
A configuration option, group.coordinator.rebalance.protocols, was introduced to specify the rebalance protocols used by the group coordinator. By default, the rebalance protocol is set to classic. When the new group coordinator is enabled, the rebalance protocols are set to classic,consumer.

Reviewers: Chia-Ping Tsai <[email protected]>, David Jacot <[email protected]>, Lianet Magrans <[email protected]>, Kirk True <[email protected]>, Justine Olshan <[email protected]>
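The default selection described above can be sketched in a few lines of Python; the function name and boolean flag are illustrative, not Kafka API:

```python
def rebalance_protocols(new_group_coordinator_enabled: bool) -> str:
    """Illustrative default for group.coordinator.rebalance.protocols:
    classic by default; classic,consumer when the new group coordinator
    is enabled."""
    return "classic,consumer" if new_group_coordinator_enabled else "classic"
```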
Revert KAFKA-16257 changes because KIP-950 doesn't need it anymore.

Reviewers: Luke Chen <[email protected]>
…sion update (#16781)

1. Use oldestAllowedVersion as 9 when using ListOffsetsRequest#EARLIEST_LOCAL_TIMESTAMP or ListOffsetsRequest#LATEST_TIERED_TIMESTAMP.
2. Add test cases to ListOffsetsRequestTest#testListOffsetsRequestOldestVersion to make sure requireTieredStorageTimestamp returns 9 as minVersion.
3. Add EarliestLocalSpec and LatestTierSpec to OffsetSpec.
4. Add more cases to KafkaAdminClient#getOffsetFromSpec.
5. Add testListOffsetsEarliestLocalSpecMinVersion and testListOffsetsLatestTierSpecSpecMinVersion to KafkaAdminClientTest to make sure the request builder has oldestAllowedVersion as 9.

Signed-off-by: PoAn Yang <[email protected]>

Reviewers: Luke Chen <[email protected]>
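The version bump in point 1 can be sketched as follows; the sentinel values -4/-5 are my assumption of the protocol constants named above, so treat them as illustrative:

```python
# Assumed sentinel values for the ListOffsets timestamp constants above.
EARLIEST_LOCAL_TIMESTAMP = -4   # ListOffsetsRequest#EARLIEST_LOCAL_TIMESTAMP
LATEST_TIERED_TIMESTAMP = -5    # ListOffsetsRequest#LATEST_TIERED_TIMESTAMP

def oldest_allowed_version(timestamp: int, default_min: int = 0) -> int:
    """Tiered-storage timestamps are only understood from ListOffsets v9 on,
    so they raise the oldest allowed request version to 9."""
    if timestamp in (EARLIEST_LOCAL_TIMESTAMP, LATEST_TIERED_TIMESTAMP):
        return 9
    return default_min
```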
…ft mode (#16693)

Reviewers: Kamal Chandraprakash <[email protected]>, Christo Lolov <[email protected]>
Follow up code cleanup for KIP-1033.

This PR unifies the handling of both error cases for exception handlers:
 - handler throws an exception
 - handler returns null

The unification happens for all 5 handler cases:
 - deserialization
 - production / serialization
 - production / send
 - processing
 - punctuation

Reviewers:  Sebastien Viale <[email protected]>, Loic Greffier <[email protected]>, Bill Bejeck <[email protected]>
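The unification of the two error cases can be sketched like this; the exception type and function names are illustrative stand-ins, not the Streams API:

```python
class FailedProcessingException(Exception):
    """Illustrative stand-in for the fatal error Streams surfaces."""

def invoke_handler(handler, record, error):
    """Unified invocation: a handler that throws and a handler that
    returns None both surface as the same fatal error."""
    try:
        response = handler(record, error)
    except Exception as exc:
        raise FailedProcessingException("handler threw an exception") from exc
    if response is None:
        raise FailedProcessingException("handler returned null")
    return response
```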
This PR supports EarliestLocalSpec and LatestTierSpec in GetOffsetShell, and adds integration tests.

Reviewers: Luke Chen <[email protected]>, Chia-Ping Tsai <[email protected]>, PoAn Yang <[email protected]>
* KAFKA-17227: Update zstd-jni lib
* Add note in upgrade docs
* Change zstd-jni version in docker native file and add warning in dependencies.gradle file
* Add reference to snappy in upgrade

Reviewers:  Chia-Ping Tsai <[email protected]>,  Mickael Maison <[email protected]>
As part of KIP-853, storage-tool.sh now has two new flags: --standalone, and --initial-voters. This PR implements these two flags in storage-tool.sh.

There are currently two valid ways to format a cluster:

The pre-KIP-853 way, where you use a statically configured controller quorum. In this case, neither --standalone nor --initial-voters may be specified, and kraft.version must be set to 0.

The KIP-853 way, where one of --standalone and --initial-voters must be specified with the initial value of the dynamic controller quorum. In this case, kraft.version must be set to 1.

This PR moves the formatting logic out of StorageTool.scala and into Formatter.java. The tool file was never intended to get so huge, or to implement complex logic like generating metadata records. Those things should be done by code in the metadata or raft gradle modules. This is also useful for junit tests, which often need to do formatting. (The 'info' and 'random-uuid' commands remain in StorageTool.scala, for now.)

Reviewers: José Armando García Sancio <[email protected]>
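The two valid formatting modes can be sketched as a small validation routine; the function and parameter names are illustrative, not the Formatter.java API:

```python
def validate_format_flags(kraft_version: int, standalone: bool,
                          initial_voters: bool) -> None:
    """Illustrative check mirroring the two formatting modes above."""
    if kraft_version == 0:
        # Pre-KIP-853: static quorum only; dynamic-quorum flags are rejected.
        if standalone or initial_voters:
            raise ValueError("--standalone/--initial-voters require kraft.version 1")
    elif kraft_version == 1:
        # KIP-853: exactly one of the two flags supplies the initial quorum.
        if standalone == initial_voters:
            raise ValueError("specify exactly one of --standalone or --initial-voters")
    else:
        raise ValueError("unknown kraft.version")
```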
Add support for handling the update voter RPC. The update voter RPC is used to automatically update
the voters supported kraft versions and available endpoints as the operator upgrades and
reconfigures the KRaft controllers.

The update voter RPC is handled as follows:

1. Check that the leader has fenced the previous leader(s) by checking that the HWM is known;
   otherwise, return the REQUEST_TIMED_OUT error.

2. Check that the cluster supports kraft.version 1; otherwise, return the UNSUPPORTED_VERSION error.

3. Check that there are no uncommitted voter changes, otherwise return the REQUEST_TIMED_OUT error.

4. Check that the updated voter still supports the currently finalized kraft.version; otherwise
   return the INVALID_REQUEST error.

5. Check that the updated voter is still listening on the default listener.

6. Append the updated VotersRecord to the log. The KRaft internal listener will read this
   uncommitted record from the log and update the voter in the set of voters.

7. Wait for the VotersRecord to commit using the majority of the voters. Return a REQUEST_TIMED_OUT
   error if it doesn't commit in time.

8. Send the UpdateVoter successful response to the voter.

This change also implements the ability for the leader to update its own entry in the voter
set when it becomes leader for an epoch. This is done by updating the voter set and writing a
control batch as the first batch in a new leader epoch.

Finally, fix a bug in KafkaAdminClient's handling of removeRaftVoterResponse where we tried to cast
the response to the wrong type.

Reviewers: Alyssa Huang <[email protected]>, Colin P. McCabe <[email protected]>
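Validation steps 1 through 5 above can be sketched as a check sequence; the error-code strings stand in for Kafka's protocol errors, the function and parameters are illustrative, and the error code for step 5 is not stated above, so INVALID_REQUEST there is an assumption:

```python
def handle_update_voter(hwm_known: bool, kraft_version: int,
                        uncommitted_voter_change: bool,
                        supports_finalized_version: bool,
                        on_default_listener: bool) -> str:
    """Sketch of steps 1-5; returns a protocol error name."""
    if not hwm_known:                   # 1. leader must have fenced predecessors
        return "REQUEST_TIMED_OUT"
    if kraft_version < 1:               # 2. cluster must support kraft.version 1
        return "UNSUPPORTED_VERSION"
    if uncommitted_voter_change:        # 3. no uncommitted voter changes
        return "REQUEST_TIMED_OUT"
    if not supports_finalized_version:  # 4. voter must keep supporting kraft.version
        return "INVALID_REQUEST"
    if not on_default_listener:         # 5. must listen on the default listener
        return "INVALID_REQUEST"
    return "NONE"                       # 6-8: append VotersRecord, commit, respond
```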
related to https://issues.apache.org/jira/browse/KAFKA-17235

The root cause of this issue is a change we introduced in KAFKA-16879, where we modified the PushHttpMetricsReporter constructor to use Time.System [1]. However, Time.System doesn't exist in Kafka versions 0.8.2 and 0.9.

In test_performance_services.py, we have system tests for Kafka versions 0.8.2 and 0.9 [2]. These tests always use the tools JAR from the trunk branch, regardless of the Kafka version being tested [3], while the client JAR aligns with the Kafka version specified in the test suite [4]. This discrepancy is what causes the issue to arise.

To resolve this issue, we have a few options:

1) Add Time.System to Kafka 0.8.2 and 0.9: This isn't practical, as we no longer maintain these versions.
2) Modify the PushHttpMetricsReporter constructor to use new SystemTime() instead of Time.System: This would contradict the intent of KAFKA-16879, which aims to make SystemTime a singleton.
3) Have PushHttpMetricsReporter take a Time instance and use it to get the current time
4) Remove system tests for Kafka 0.8.2 and 0.9 from test_performance_services.py

Given that we no longer maintain Kafka 0.8.2 and 0.9, and altering the constructor goes against the design goals of KAFKA-16879, option 4 appears to be the most feasible solution. However, I'm not sure whether it's acceptable to remove these old version tests; maybe someone else has a better solution.

"We'll proceed with option 3 since support for versions 0.8 and 0.9 is still required, meaning we can't remove those Kafka versions from the system tests."

Reviewers: Chia-Ping Tsai <[email protected]>
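Option 3 (the one chosen) amounts to injecting the clock instead of referencing a static singleton that old client jars lack. A minimal Python sketch, with all names illustrative rather than the real PushHttpMetricsReporter API:

```python
import time

class SystemClock:
    """Stand-in for Kafka's Time interface (milliseconds only)."""
    def milliseconds(self) -> int:
        return int(time.time() * 1000)

class PushMetricsReporter:
    """Sketch of option 3: the reporter receives its clock as a constructor
    dependency rather than touching a static Time.System field that does not
    exist in 0.8.2/0.9 client jars."""
    def __init__(self, clock=None):
        self._clock = clock if clock is not None else SystemClock()

    def now_ms(self) -> int:
        return self._clock.milliseconds()
```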
ahuang98 and others added 21 commits October 1, 2024 17:21
When a replica restarts in the follower state it is possible for the set of leader endpoints to not match the latest set of leader endpoints. Voters will discover the latest set of leader endpoints through the BEGIN_QUORUM_EPOCH request. This means that KRaft needs to allow for the replica to transition from Follower to Follower when only the set of leader endpoints has changed.

Reviewers: Colin P. McCabe <[email protected]>, Alyssa Huang <[email protected]>
When reverting the ZK migration, we must also remove the /migration ZNode in order to allow the migration to be re-attempted in the future.

Reviewers: Colin P. McCabe <[email protected]>, Chia-Ping Tsai <[email protected]>
Previously, Apache Kafka was uploading release candidate (RC) artifacts
to users' home directories on home.apache.org. However, since this
resource has been decommissioned, we need to follow the standard
approach of putting release candidate artifacts into the appropriate
subversion directory, at https://dist.apache.org/repos/dist/dev/kafka/.

Reviewers: Justine Olshan <[email protected]>
KAFKA-16534 introduced a change to send UpdateVoterRequest every "3 * fetchTimeoutMs" if the voter's configured endpoints are different from the endpoints persisted in the KRaft log. It also introduced a regression: if the voter nodes did not need an update, updateVoterTimer wasn't reset. This resulted in a busy loop in the KafkaRaftClient#poll method and high CPU usage.

This PR modifies the conditions in pollFollowerAsVoter to reset updateVoterTimer appropriately.

Reviewers: José Armando García Sancio <[email protected]>
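The fix can be sketched as follows; the Timer class and function names are illustrative, not KRaft's actual classes:

```python
class Timer:
    """Minimal countdown timer; illustrative, not KRaft's Timer."""
    def __init__(self, timeout_ms: int, now_ms: int = 0):
        self._timeout = timeout_ms
        self._deadline = now_ms + timeout_ms

    def expired(self, now_ms: int) -> bool:
        return now_ms >= self._deadline

    def reset(self, now_ms: int) -> None:
        self._deadline = now_ms + self._timeout

def poll_follower_as_voter(endpoints_up_to_date: bool, timer: Timer,
                           now_ms: int) -> bool:
    """Returns True when an UpdateVoterRequest should be sent. The fix:
    the timer is reset in *both* branches, so poll() no longer spins when
    the endpoints already match."""
    if not timer.expired(now_ms):
        return False
    timer.reset(now_ms)  # previously skipped when no update was needed
    return not endpoints_up_to_date
```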
… (#16960) (#17461)

Co-authored-by: Mickael Maison <[email protected]>

Reviewers: Chia-Ping Tsai <[email protected]>, Colin P. McCabe <[email protected]>
…er.name in advertisedBrokerListeners

During ZK migration, always include control.plane.listener.name in advertisedBrokerListeners, to be
bug-compatible with earlier Apache Kafka versions that ignored this misconfiguration. (Just as
before, control.plane.listener.name is not supported in KRaft mode itself.)

Reviewers: Luke Chen <[email protected]>
…efore ZK migration is finished (#17501)

Reviewers: Luke Chen <[email protected]>
…lt handling (#17499)

According to KIP-950, remote.log.manager.thread.pool.size should be marked as deprecated and replaced by two new configurations: remote.log.manager.copier.thread.pool.size and remote.log.manager.expiration.thread.pool.size. Fix default handling so that -1 works as expected.

Reviewers: Luke Chen <[email protected]>, Gaurav Narula <[email protected]>, Satish Duggana <[email protected]>, Colin P. McCabe <[email protected]>
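The fixed default handling amounts to a fallback from the new configs to the deprecated one; a minimal sketch, with the function name illustrative:

```python
def resolve_pool_size(configured: int, deprecated_fallback: int) -> int:
    """Sketch of the described default handling: -1 means 'not set', so the
    new copier/expiration pool sizes inherit the deprecated
    remote.log.manager.thread.pool.size value."""
    return deprecated_fallback if configured == -1 else configured
```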
KIP-853 adds support for dynamic KRaft quorums. This means that the quorum topology is
no longer statically determined by the controller.quorum.voters configuration. Instead, it
is contained in the storage directories of each controller and broker.

Users of dynamic quorums must format at least one controller storage directory with either
the --initial-controllers or --standalone flags.  If they fail to do this, no quorum can be
established. This PR changes the storage tool to warn about the case where a KIP-853 flag has
not been supplied to format a KIP-853 controller. (Note that broker storage directories
can continue to be formatted without a KIP-853 flag.)

There are cases where we don't want to specify initial voters when formatting a controller. One
example is where we format a single controller with --standalone, and then dynamically add 4
more controllers with no initial topology. In this case, we want the 4 later controllers to grab
the quorum topology from the initial one. To support this case, this PR adds the
--no-initial-controllers flag.

Reviewers: José Armando García Sancio <[email protected]>, Federico Valeri <[email protected]>
Signed-off-by: Robin Han <[email protected]>
CLAassistant commented Oct 31, 2024

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 13 committers has signed the CLA.

✅ superhx
❌ mumrah
❌ bbejeck
❌ ahuang98
❌ gaurav-narula
❌ cmccabe
❌ apoorvmittal10
❌ jsancio
❌ jlprat
❌ fvaleri
❌ m1a2st
❌ mimaison
❌ FrankYang0529
You have signed the CLA already but the status is still pending? Let us recheck it.

Signed-off-by: Robin Han <[email protected]>
superhx merged commit 72dfbc3 into 1.3 on Oct 31, 2024
5 of 6 checks passed
superhx deleted the merge_3_9 branch on October 31, 2024, 09:26