Upgrade style guide #157

Merged
merged 4 commits into from
Jan 6, 2025
1 change: 0 additions & 1 deletion modules/install-upgrade/nav.adoc
@@ -4,5 +4,4 @@
* xref:supported-versions.adoc[]
* xref:production-cluster-sizing.adoc[]
* xref:cluster-sizing-reference.adoc[]
* xref:migration.adoc[]
* xref:upgrade-guide.adoc[]
3 changes: 0 additions & 3 deletions modules/install-upgrade/pages/migration.adoc

This file was deleted.

203 changes: 78 additions & 125 deletions modules/install-upgrade/pages/upgrade-guide.adoc
@@ -1,25 +1,27 @@
= Upgrade Luna Streaming from 2.10 to 3.1
:navtitle: Upgrade from 2.10 to 3.1

Upgrading to Luna Streaming 3.1 should only be done from Luna Streaming 2.10.
This guide provides instructions and recommendations for upgrading Luna Streaming from version 2.10 to 3.1.

Upgrade is fully supported for all the components including connectors.
Upgrades to Luna Streaming 3.1 should only be performed from Luna Streaming 2.10.

All Luna Streaming 2.10 components support the upgrade to 3.1.

== Functional impacts

This section describes changes in Luna Streaming 3.1 that may impact how your deployment functions.

=== Default system topics

In Pulsar 3.1, system topics are enabled by default.
In Pulsar 3.1, system topics are now enabled by default.

=== Prometheus metrics
=== Prometheus metrics changes

Changes in Prometheus Metrics:
Prometheus metrics have been updated in Luna Streaming 3.1.

* Prometheus Client version has changed from 0.5.0 to 0.16.0
* Prometheus Metric type UNTYPED is renamed to UNKNOWN
* Metrics have been renamed because OpenMetrics's counter name needs a _total suffix
* Prometheus Client version has changed from `0.5.0` to `0.16.0`
* Prometheus Metric type `UNTYPED` is renamed to `UNKNOWN`
* Counter metrics have been renamed because OpenMetrics requires counter names to end with the `_total` suffix

.Renamed metrics
[cols="2,2"]
@@ -111,7 +113,7 @@ Changes in Prometheus Metrics:
|pulsar_txn_append_log_total
|===

The following PRs have been merged to update metrics:
The following PRs were merged to update metrics:

* https://github.com/apache/pulsar/pull/13785[#13785 - Bump prometheus client version from 0.5.0 to 0.15.0]
* https://github.com/apache/pulsar/pull/16581[#16581 - Rename Pulsar txn metrics to specify OpenMetrics]
@@ -120,199 +122,156 @@ The following PRs have been merged to update metrics:
* https://github.com/apache/pulsar/pull/16591[#16591 - Bump prometheus client version from 0.15.0 to 0.16.0]
* https://github.com/apache/pulsar/pull/17419[#17419 - Removed timestamp from all prometheus metrics.]

=== Pulsar-SQL

If you're upgrading Pulsar SQL from 2.11 or earlier, you should copy the config files from `conf/presto` to `trino/conf`.

If you're downgrading Pulsar SQL to 2.11 or earlier from newer versions, copy the config files from `trino/conf` to `conf/presto`.

=== Other functional impacts

The following PRs were merged in Luna Streaming 3.1 that may impact your deployment's functionality.

[cols="1,2,3"]
|===
|PR Link |Title |Functional Impact

|https://github.com/apache/pulsar/pull/19180[#19180]
|[cleanup][broker] Deprecate blocking AuthorizationService, AuthorizationProvider methods
|Deprecate blocking AuthorizationService, AuthorizationProvider methods
|This will affect the public API for the AuthorizationService and the AuthorizationProvider, which only impacts users that are running custom code inside the Pulsar Broker

|https://github.com/apache/pulsar/pull/19182[#19182]
|[cleanup][broker] Remove AuthorizationProvider methods deprecated in 2.7 and 2.9
|Remove AuthorizationProvider methods deprecated in 2.7 and 2.9
|Removing deprecated methods allowTenantOperationAsync, allowTenantOperation, allowNamespaceOperationAsync, allowNamespaceOperation, allowNamespacePolicyOperationAsync, allowNamespacePolicyOperation, allowTopicOperationAsync, allowTopicOperation. These methods could be used by third party extensions

|https://github.com/apache/pulsar/pull/19197[#19197]
|[feat][broker] Update AuthenticationProvider to simplify HTTP Authn
|Update AuthenticationProvider to simplify HTTP Authn
|This changes the public API within the broker as some methods are marked as @Deprecated

|https://github.com/apache/pulsar/pull/19295[#19295]
|[feat][broker] OneStageAuth State: move authn out of constructor
|OneStageAuth State: move authn out of constructor
|This could break 3rd party plugins in the broker if they were relying on authentication to happen in the constructor. In order to make those implementations fail fast, this PR includes a change to throw an exception when the getAuthRole is called without first calling authenticateAsync or authenticate. That makes these changes semi-backwards compatible.

|https://github.com/apache/pulsar/pull/19314[#19314]
|[fix][broker] TokenAuthenticationState: authenticate token only once
|TokenAuthenticationState: authenticate token only once
|In a sense, this breaks an implicit contract that the class had. However, because the getAuthRole() method will throw an exception if called incorrectly, it is likely that misuse of this class will result in a fail fast behavior.

|https://github.com/apache/pulsar/pull/19455[#19455]
|[improve][broker] Require authRole is proxyRole to set originalPrincipal
|Require authRole is proxyRole to set originalPrincipal
|This change affects how the binary protocol is used without changing the binary protocol itself. Existing proxies will not work after the upgrade if `proxyRoles` is not correctly configured in `broker.conf`.

|https://github.com/apache/pulsar/pull/19486[#19486]
|[improve][client] Remove default 30s ackTimeout when setting DLQ policy on java consumer
|Remove default 30s ackTimeout when setting DLQ policy on java consumer
|The Java `ConsumerBuilder` no longer sets a default `ackTimeoutMillis` when a `deadLetterPolicy` is set. To use an ack timeout, you must set it explicitly.
|===
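Because PR #19455 makes a correct `proxyRoles` setting a precondition for upgraded proxies, it may help to check the broker configuration before you upgrade. The following is a minimal sketch, not an official tool: the function name `has_proxy_roles` and the `conf/broker.conf` path are assumptions, and it only confirms that `proxyRoles` has some non-empty value, not that the value matches the role your proxy actually authenticates with.

```shell
# Sketch: succeed only if proxyRoles has a non-empty value in the given file.
# has_proxy_roles is a hypothetical helper name, not part of Pulsar.
has_proxy_roles() {
  local conf_file="$1"
  # Match an uncommented "proxyRoles=<something>" line.
  grep -Eq '^[[:space:]]*proxyRoles[[:space:]]*=[[:space:]]*[^[:space:]]+' "$conf_file"
}

# Example usage (the path is an assumption about your layout):
# has_proxy_roles conf/broker.conf || echo "proxyRoles is not set in broker.conf"
```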

== Configuration impacts

=== Removed in 3.1
This section describes changes in Luna Streaming 3.1 that may impact your deployment's configuration.

* https://github.com/apache/pulsar/pull/14506[#14506] removes `managedLedgerNumWorkerThreads`. The `MetadataStore` instance is passed from the `PulsarService` directly to the `ManagedLedgerFactory`.
=== Configuration values removed in 3.1

* The `conf/presto` directory has been removed.
* https://github.com/apache/pulsar/pull/14506[PR #14506] removes `managedLedgerNumWorkerThreads`.
The `MetadataStore` instance is now passed from the `PulsarService` directly to the `ManagedLedgerFactory`.

=== Deprecated and default values changed in 3.1
* The Pulsar SQL `conf/presto` directory has been removed.
** If you're upgrading Pulsar SQL from 2.11 or earlier, copy the Pulsar SQL config files from `conf/presto` to `trino/conf`.
** If you're downgrading Pulsar SQL to 2.11 or earlier from newer versions, copy the Pulsar SQL config files from `trino/conf` to `conf/presto`.
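The copy described above can be sketched as a small shell helper. The function name `copy_sql_conf` is hypothetical, and the sketch assumes that you run it from the Pulsar installation directory so that the relative paths `conf/presto` and `trino/conf` resolve.

```shell
# Sketch: copy every file from one Pulsar SQL config directory to another.
# copy_sql_conf is a hypothetical helper name, not part of Pulsar.
copy_sql_conf() {
  local src="$1"
  local dst="$2"
  if [ ! -d "$src" ]; then
    echo "source directory not found: $src" >&2
    return 1
  fi
  mkdir -p "$dst"
  # "src/." copies the directory contents, including hidden files.
  cp -R "$src"/. "$dst"/
}

# Upgrade direction (run from the Pulsar installation directory):
# copy_sql_conf conf/presto trino/conf
# Downgrade direction:
# copy_sql_conf trino/conf conf/presto
```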

.`broker.conf` and `standalone.conf` values
[cols="1,1,1"]
|===
|Configuration |Luna Streaming 2.10 Default | Luna Streaming 3.1 Default

|Managed ledger cache eviction frequency
|`managedLedgerCacheEvictionFrequency=100.0`
|`managedLedgerCacheEvictionFrequency=0`

|Max unacked ranges to persist in ZooKeeper
|`managedLedgerMaxUnackedRangesToPersistInZooKeeper=1000`
|`managedLedgerMaxUnackedRangesToPersistInZooKeeper=-1`
|===
=== Default values changed or deprecated in 3.1

=== Changed in 3.1
The following default values in `broker.conf` and `standalone.conf` have changed or been deprecated in Luna Streaming 3.1.

.`broker.conf` and `standalone.conf` values
[cols="1,1,1"]
|===
|Configuration |Luna Streaming 2.10 Default | Luna Streaming 3.1 Default

|`managedLedgerCacheEvictionFrequency`
|`100.0`
|`0`

|`managedLedgerMaxUnackedRangesToPersistInZooKeeper`
|`1000`
|`-1`

|`systemTopicEnabled`
Enable or disable system topic
|false
|true
|`false`
|`true`

|`topicLevelPoliciesEnabled`
Enable or disable topic level policies (depends on system topic)
|false
|true
|`false`
|`true`

|`supportedNamespaceBundleSplitAlgorithms`
Supported algorithms for namespace bundle split
|`range_equally_divide`,`topic_count_equally_divide`,`specified_positions_divide`
|`range_equally_divide`,`topic_count_equally_divide`,`specified_positions_divide`,`flow_or_qps_equally_divide`

|`loadBalancerDirectMemoryResourceWeight`
Direct memory usage weight for calculating resource usage in `ThresholdShedder` strategy
|1.0
|0
|`1.0`
|`0`

|`fileSystemProfilePath`
File System Storage profile path
|`../conf/filesystem_offload_core_site.xml`
|`conf/filesystem_offload_core_site.xml`

|`gcsManagedLedgerOffloadMaxBlockSizeInBytes`
Max block size in bytes for Google Cloud Storage ledger offload
|67108864
|134217728
|`67108864`
|`134217728`
|===
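A changed default only takes effect if your configuration does not already set the key explicitly. The following sketch reports, for each key in the table above, whether a given configuration file overrides it. The function name `report_overrides` and the example path are assumptions, and the sketch only inspects plain `key=value` files. It does not cover values injected through Helm or environment variables.

```shell
# Keys whose defaults changed between Luna Streaming 2.10 and 3.1 (from the table above).
changed_keys="managedLedgerCacheEvictionFrequency
managedLedgerMaxUnackedRangesToPersistInZooKeeper
systemTopicEnabled
topicLevelPoliciesEnabled
supportedNamespaceBundleSplitAlgorithms
loadBalancerDirectMemoryResourceWeight
fileSystemProfilePath
gcsManagedLedgerOffloadMaxBlockSizeInBytes"

# Sketch: print one line per key, showing an explicit setting if the file has one.
# report_overrides is a hypothetical helper name, not part of Pulsar.
report_overrides() {
  local conf_file="$1"
  local key line
  for key in $changed_keys; do
    # Only uncommented "key=value" lines count; the last one wins.
    line=$(grep -E "^[[:space:]]*${key}[[:space:]]*=" "$conf_file" | tail -n 1) || true
    if [ -n "$line" ]; then
      echo "explicit: $line"
    else
      echo "default:  $key"
    fi
  done
}

# Example usage (the path is an assumption about your layout):
# report_overrides conf/broker.conf
```

Keys reported as `default:` pick up the new 3.1 value after the upgrade, so those are the ones to review.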

== Operational impacts

This section describes changes in Luna Streaming 3.1 that may impact how your deployment operates.

=== Upgrade to JDK 17
=== JDK 17 upgrade

Luna Streaming 3.1 uses JDK 17. This changes the Pulsar server module's javac release version to 17.
Luna Streaming 3.1 uses JDK 17.

Client and client-server shared modules will remain at the target Java 8 release.
The Pulsar server module's `javac` release version is `17`.

The modification is described in detail in PIP-156 in https://github.com/apache/pulsar/pull/15207[#15207].
Client and client-server shared modules remain at the target Java 8 release.

=== Removed Python 2 support
This modification is described in detail in https://github.com/apache/pulsar/pull/15207[PIP-156].
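If you run server components on hosts or images that you manage yourself, you might want to confirm the Java runtime version before upgrading. The following sketch assumes `java` is on the `PATH` and that `java -version` prints a quoted version string such as `"17.0.9"`. Both function names are hypothetical.

```shell
# Sketch: extract the major version from a Java version string.
# Handles both the legacy "1.8.0_392" form and the modern "17.0.9" form.
java_major() {
  local version="$1"
  case "$version" in
    1.*) echo "$version" | cut -d. -f2 ;;
    *)   echo "$version" | cut -d. -f1 ;;
  esac
}

# Reads the version of the java binary on PATH (assumes the usual quoted output).
current_java_major() {
  local v
  v=$(java -version 2>&1 | grep -m 1 'version "' | sed -E 's/.*version "([^"]+)".*/\1/')
  java_major "$v"
}

# Example usage:
# [ "$(current_java_major)" -ge 17 ] || echo "Server components need JDK 17"
```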

Luna Streaming 3.1 removes Python 2 from build scripts.

Python3 is used in the build image.
=== Python 2 support removed

The build image has been updated to ubuntu:20.04 as there is no Python 3.7 support in the old Ubuntu.
Luna Streaming 3.1 removes Python 2 from build scripts.

Executable scripts have been updated to use python3 instead of python.
Python 3 is used in the build image.

The modification is described in detail in PIP-155 in https://github.com/apache/pulsar/pull/15376[#15376]
The build image is updated to use `ubuntu:20.04`, as there is no Python 3.7 support in the previous Ubuntu image.

=== Updated Prometheus metrics
Executable scripts have been updated to invoke `python3` instead of `python`.

Prometheus metrics have been updated in Luna Streaming 3.1.

See <<Prometheus metrics>> for details.
This modification is described in detail in https://github.com/apache/pulsar/pull/15376[PIP-155].

== Known issues

This section describes known issues encountered when upgrading to Luna Streaming 3.1.

=== Bookkeeper / RocksDB format

Pulsar 3.1 uses RocksDB 7.x, which writes in a format that is not compatible with RocksDB 6.x, which is used by Luna Streaming 2.10 via Bookkeeper 4.14.

**Downgrading to 2.10 from 3.1 is not supported for Bookies and ZooKeeper**. All other components such as Broker, Proxy and Functions Worker can be downgraded at any time.
**Downgrading to Luna Streaming 2.10 from Luna Streaming 3.1 is not supported for Bookies and ZooKeeper**.

For more information, see https://github.com/apache/pulsar/issues/22051[(Bug) Downgrade issue #22051 - apache/pulsar · GitHub].
Pulsar 3.1 uses RocksDB `7.x`, which writes in a format that is not compatible with RocksDB `6.x`.

To reproduce the issue where Bookkeeper instances fail to downgrade:
Luna Streaming 2.10 uses Bookkeeper 4.14, which uses RocksDB `6.x`.

. Install Luna Streaming 2.10.
. Upgrade to Luna Streaming 3.1.
. Downgrade to Luna Streaming 2.10.
All other components such as Broker, Proxy, and Functions Worker can be downgraded at any time.

Stack trace for the downgrade failure:

[%collapsible]
=====
[source,java]
----
2024-02-23T11:42:13,993+0000 [main] INFO org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage - Creating single directory db ledger storage on data/bookkeeper/ledgers/current
2024-02-23T11:42:14,146+0000 [main] INFO org.apache.bookkeeper.proto.BookieNettyServer - Shutting down BookieNettyServer
2024-02-23T11:42:14,155+0000 [main] ERROR org.apache.bookkeeper.server.Main - Failed to build bookie server
java.io.IOException: Error open RocksDB database
at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:200) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:89) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.lambda$static$0(KeyValueStorageRocksDB.java:63) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.bookie.storage.ldb.LedgerMetadataIndex.<init>(LedgerMetadataIndex.java:68) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage.<init>(SingleDirectoryDbLedgerStorage.java:170) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.newSingleDirectoryDbLedgerStorage(DbLedgerStorage.java:150) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.initialize(DbLedgerStorage.java:129) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:818) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:152) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:120) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.server.service.BookieService.<init>(BookieService.java:52) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:304) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.server.Main.doMain(Main.java:226) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
at org.apache.bookkeeper.server.Main.main(Main.java:208) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
Caused by: org.rocksdb.RocksDBException: unknown checksum type 4 in data/bookkeeper/ledgers/current/ledgers/000006.sst offset 1020 size 33
at org.rocksdb.RocksDB.open(Native Method) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
at org.rocksdb.RocksDB.open(RocksDB.java:239) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:197) ~[com.datastax.oss-bookkeeper-server-4.14.5.1.0.2.jar:4.14.5.1.0.2]
... 13 more
----
=====
For more information, see https://github.com/apache/pulsar/issues/22051[Issue 22051].

== Upgrade procedure

Luna Streaming can be deployed on Bare metal, Docker, and Kubernetes.
Luna Streaming can be deployed on bare metal, Docker, and Kubernetes.

This guide will only address Kubernetes deployment.
This guide only addresses Kubernetes deployment.

For more information on upgrading bare metal and Docker Pulsar deployments, see the https://pulsar.apache.org/docs/3.3.x/administration-upgrade/[Pulsar documentation].

=== Kubernetes deployment using KAAP Operator
=== Upgrade Kubernetes deployment with KAAP Operator

Deploying Luna Streaming on Kubernetes with KAAP (Kubernetes Autoscaling for Apache Pulsar) Operator is a common method for running Pulsar in a cloud-native environment.
Upgrade to Luna Streaming 3.1 on Kubernetes with the KAAP (Kubernetes Autoscaling for Apache Pulsar) operator.

For more information, see the xref:kaap-operator::index.adoc[KAAP documentation].

. Back up your existing Pulsar data and configurations to prevent data loss.
. To prevent data loss, back up your existing Pulsar data and configuration files.
. To save your current Helm release configuration, run the following command:
+
[source,bash,subs="+quotes"]
@@ -327,7 +286,7 @@ helm get values *RELEASE-NAME* > pulsar-backup-values.yaml
helm repo update
----
+
. Open `helm/kaap-stack/values.yaml` and update the image tag to 3.1.0 (or the specific tag you wish to use).
. Open `helm/kaap-stack/values.yaml`, and then update the image tag to `3.1.0` (or the specific tag you wish to use).
+
[source,yaml]
----
@@ -343,7 +302,7 @@ kaap:
datastax/lunastreaming-all: 3.1_4.5
----
+
. Review and modify any other configuration parameters that may have changed between versions, such as resource limits, storage classes, and additional components. To modify other configurations, update `values.yaml` as needed. For example, to modify the broker's namespace shedding and splitting configurations, update the following fields:
. To modify other configurations, update `values.yaml` as needed. For example, to modify the broker's namespace shedding and splitting configurations, update the following fields:
+
[source,yaml]
----
@@ -387,18 +346,13 @@ kubectl get pods --namespace *NAMESPACE*
kubectl logs *POD-NAME* -n *NAMESPACE*
----

. After upgrading, check if any additional configurations are required for new features in version 3.1. Adjust settings related to multi-tenancy, security, and observability as needed. Ensure all necessary configurations are in place and correct after the upgrade.
. Test the functionality of your Pulsar cluster by sending messages and ensuring that consumers can read them without issues. Conduct functional tests to ensure that the upgrade did not impact existing applications and that new features work as expected.

// known issues
. After the upgrade, ensure all necessary configurations are in place and correct.

=== Kubernetes deployment using Helm chart
=== Upgrade Kubernetes deployment with Helm chart

The Helm chart for Luna Streaming is available in the https://github.com/datastax/pulsar-helm-chart/blob/master/helm-chart-sources/pulsar/values.yaml[Helm chart sources] repository.

Deploying Luna Streaming on Kubernetes using the DataStax Helm chart is another common method for running Pulsar in a cloud-native environment.

. Back up your existing Pulsar data and configurations to prevent data loss.
. To prevent data loss, back up your existing Pulsar data and configuration files.
. To save your current Helm release configuration, run the following command:
+
[source,bash,subs="+quotes"]
@@ -413,7 +367,7 @@ helm get values *RELEASE-NAME* > pulsar-backup-values.yaml
helm repo update
----
+
. Open `helm-chart-sources/pulsar/values.yaml` and update the image tag to 3.1.0 (or the specific tag you wish to use).
. Open `helm-chart-sources/pulsar/values.yaml` and update the image tag to `3.1.0` (or the specific tag you wish to use).
+
[source,yaml]
----
@@ -487,7 +441,6 @@ kubectl get pods --namespace *NAMESPACE*
kubectl logs *POD-NAME* -n *NAMESPACE*
----

. After upgrading, check if any additional configurations are required for new features in version 3.1. Adjust settings related to multi-tenancy, security, and observability as needed. Ensure all necessary configurations are in place and correct after the upgrade.
. Test the functionality of your Pulsar cluster by sending messages and ensuring that consumers can read them without issues. Conduct functional tests to ensure that the upgrade did not impact existing applications and that new features work as expected.
. After the upgrade, ensure all necessary configurations are in place and correct.

