
Script to automate the deployment and execution of the existing demo #17

Open · wants to merge 18 commits into base: main

Conversation

@SamBarker (Collaborator) commented Sep 18, 2024

The demo.sh script can deploy Apache Flink and dependent infrastructure using either Red Hat operators via subscription or upstream operators.

It's a bit rough around the edges yet; the waiting definitely needs work, but it's idempotent and reapplying works well.

I'm happy to clean up the git history if required.

@katheris (Collaborator)

Thanks for working on this, it will definitely be great to have a script to easily deploy everything. I tried out your branch on both minikube and against an instance of OpenShift, and in both cases the script failed with:

failed calling webhook "validationwebhook.flink.apache.org": failed to call webhook: Post "https://flink-operator-webhook-service.flink.svc:443/validate?timeout=10s": no endpoints available for service "flink-operator-webhook-service"

I also noticed that the Strimzi operator is installed into the default namespace rather than a named namespace, and it fails to bring up a Kafka cluster. It looks like it's failing due to permission errors:

User "system:serviceaccount:default:strimzi-cluster-operator" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "default".

More generally in terms of the content of the PR I have a few thoughts:

  • The README.md is empty apart from a TODO
  • The Kubernetes files for the recommendation app seem to be copied into kubernetes-samples/recommendations-demo/base. I think if possible it would be better if we only had a single instance of these files; to me the changes seem to be just around the names of the resources, so I wonder if we could align the naming and reuse the existing files.
  • Similar to the previous point, there is another instance of the productInventory.csv which, if possible, I think we shouldn't add.
  • A lot of the files added are either Kubernetes yaml files for a deployment, or in some cases the complete set of installation files. Is it possible to have the script without needing to check those in? In the demo today we are tied to a specific version for Flink and cert-manager, but not for Strimzi or Apicurio Registry. I'm worried that we are setting ourselves up to either have a demo that gets out of date very fast, or have lots of extra work to keep updating these dependencies.
  • If possible I think it would be useful to have a script for deleting everything

@kornys (Member) commented Sep 18, 2024

There has to be a wait for cert-manager to be ready, otherwise the Flink operator deployment fails on the validation webhook.
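A minimal sketch of such a wait, reusing the script's ${KUBE_COMMAND} convention; the helper name is hypothetical, and the namespace and deployment name assume a stock cert-manager install:

```shell
# Hypothetical helper: block until cert-manager's webhook deployment is
# Available before installing the Flink Kubernetes operator, so its
# validating webhook has endpoints to call.
waitForCertManager() {
  "${KUBE_COMMAND:-kubectl}" -n cert-manager wait deployment/cert-manager-webhook \
    --for=condition=Available --timeout=300s
}
```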

@kornys (Member) commented Sep 18, 2024

And as Kate mentioned, if the Strimzi cluster operator is running in a different namespace and has to operate in all namespaces, there are missing cluster roles and cluster role bindings.
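A sketch of the missing bindings; the helper name and operator namespace are assumptions, while the ClusterRole names follow the upstream Strimzi installation files for watching all namespaces:

```shell
# Hypothetical helper: grant the cluster-wide bindings the Strimzi cluster
# operator needs when it watches all namespaces. The ClusterRoles themselves
# ship with the Strimzi install files; only the bindings are missing.
grantStrimziClusterRoles() {
  local ns="${1:-strimzi}"   # namespace the operator is installed into (assumed)
  local kube="${KUBE_COMMAND:-kubectl}"
  "${kube}" create clusterrolebinding strimzi-cluster-operator-namespaced \
    --clusterrole=strimzi-cluster-operator-namespaced \
    --serviceaccount="${ns}:strimzi-cluster-operator"
  "${kube}" create clusterrolebinding strimzi-cluster-operator-watched \
    --clusterrole=strimzi-cluster-operator-watched \
    --serviceaccount="${ns}:strimzi-cluster-operator"
  "${kube}" create clusterrolebinding strimzi-cluster-operator-entity-operator-delegation \
    --clusterrole=strimzi-entity-operator \
    --serviceaccount="${ns}:strimzi-cluster-operator"
}
```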

@Frawless

A lot of the files added are either Kubernetes yaml files for a deployment, or in some cases the complete set of installation files. Is it possible to have the script without needing to check those in? In the demo today we are tied to a specific version for Flink and cert-manager, but not for Strimzi or Apicurio Registry. I'm worried that we are setting ourselves up to either have a demo that gets out of date very fast, or have lots of extra work to keep updating these dependencies

I like the approach of specifying just the versions of the other projects used within the demo and then running the script in CI. It should then be fairly easy to update to new versions and find issues in such cases, at least in the Strimzi case. I guess it might also be better to use Helm or OLM to install all of the operators.


set -eo pipefail

STRIMZI_KAFKA=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0


It would be better to align it with the Strimzi version used for managing the Kafka cluster.

@SamBarker (Author)


It would, but postinstall.sh isn't actually used; I copied it from Kroxylicious and should remove it from this PR.

@SamBarker (Author)

Thanks all!

I agree it's a bit all over the place :D

I tried out your branch on both minikube and against an instance of OpenShift and in both cases the script failed with

Yeah, I think as @kornys says that's just a missing wait, but it wasn't an issue I encountered (probably because I was using the Red Hat build, which installed it via OLM).

The Kubernetes files for the recommendation app seem to be copied into kubernetes-samples/recommendations-demo/base. I think if possible it would be better if we only had a single instance of these files, to me the changes seem to be just around the names of the resources so I wonder if we could align the naming and reuse the existing files.
&
Similar to the previous point there is another instance of the productInventory.csv which if possible I think we shouldn't add.

I forked things so I could hack my demo together without affecting the existing script, but we should really pick a single source of truth. We should definitely avoid having two copies. The other thing I found was that the kustomize configMap generator was running but the pod wasn't finding the config map. I need to revisit that.

also noticed that the Strimzi operator is installed into the default namespace, rather than a named namespace and it fails to bring up a Kafka cluster. It looks like it's failing due to permission errors:

I was only really testing the Red Hat builds branch. It might be better to drop the upstream operators from this PR and pick them up again separately.

A lot of the files added are either Kubernetes yaml files for a deployment, or in some cases the complete set of installation files. Is it possible to have the script without needing to check those in? In the demo today we are tied to a specific version for Flink and cert-manager, but not for Strimzi or Apicurio Registry. I'm worried that we are setting ourselves up to either have a demo that gets out of date very fast, or have lots of extra work to keep updating these dependencies.

I think the versioning you're concerned about is on the upstream side of things, and given that was me trying to be overly complete without testing it, I'm tempted to split that into a separate PR and come back to it. In general I don't use floating tags/version ranges for any dependency; I default to explicit versions so that I know what has been tested together. That said, the maintenance of the demo definitely needs to be considered. We might be able to get Dependabot to do the heavy lifting.

If possible I think it would be useful to have a script for deleting everything

It would be a short script (oc delete -k <dir> for each of the kustomize sets), but yes, we should have one.
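A sketch of what that teardown script could look like; the directory names, ${KUBE_COMMAND} convention, and ${OPERATORS} variable are assumptions based on the script in this PR:

```shell
#!/usr/bin/env bash
# Sketch of a teardown script: delete each kustomize set in reverse install
# order, tolerating resources that are already gone.
set -eo pipefail

KUBE_COMMAND="${KUBE_COMMAND:-kubectl}"
EXAMPLES_DIR="${EXAMPLES_DIR:-.}"
OPERATORS="${OPERATORS:-upstream}"

teardown() {
  local dir
  for dir in "$@"; do
    echo "Deleting resources from ${dir}"
    "${KUBE_COMMAND}" delete -k "${dir}" --ignore-not-found=true || true
  done
}

teardown \
  "${EXAMPLES_DIR}/kubernetes-samples/recommendations-demo/base" \
  "${EXAMPLES_DIR}/kubernetes-samples/supporting-infrastructure/overlays/${OPERATORS}_operators"
```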

And as Kate mentioned, if the Strimzi cluster operator is running in a different namespace and has to operate in all namespaces, there are missing cluster roles and cluster role bindings.

Oops! I should have tested the "upstream" option better.

I like the approach of specifying just the versions of the other projects used within the demo and then running the script in CI. It should then be fairly easy to update to new versions and find issues in such cases, at least in the Strimzi case. I guess it might also be better to use Helm or OLM to install all of the operators.

Using OLM for both Red Hat and upstream is probably a better option as it gives more consistency.
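For reference, an OLM install of the upstream Strimzi operator boils down to a single Subscription manifest; the channel, catalog source, and namespace below assume a vanilla cluster with OLM and the OperatorHub.io catalog installed (on OpenShift the source and package names would differ):

```yaml
# Sketch: subscribe to the upstream Strimzi operator via OLM.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: strimzi-kafka-operator
  namespace: operators
spec:
  channel: stable
  name: strimzi-kafka-operator
  source: operatorhubio-catalog
  sourceNamespace: olm
```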

buildDataGenerator() {
  echo -e "${GREEN}Building data generator${NO_COLOUR}"
  ${MAVEN_COMMAND} -f "${EXAMPLES_DIR}/pom.xml" clean package
  ${CONTAINER_ENGINE} build -f "${EXAMPLES_DIR}/data-generator/Dockerfile" -t flink-examples-data-generator:latest data-generator
}
Contributor


Do we need to build the data generator image? The yaml for it pulls the image from quay.io.

@SamBarker (Author)


I'd like to keep it in there for iterating on things, but we should maybe make it optional (though we should pull it in when we build it locally).
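A sketch of gating the build behind a flag so the default path uses the published quay.io image; the flag name and wrapper function are hypothetical:

```shell
# Hypothetical flag: only build the data generator image locally when asked.
BUILD_DATA_GENERATOR="${BUILD_DATA_GENERATOR:-false}"

maybeBuildDataGenerator() {
  if [[ "${BUILD_DATA_GENERATOR}" == "true" ]]; then
    buildDataGenerator   # builds and tags flink-examples-data-generator:latest locally
  else
    echo "Skipping data generator build; using published image"
  fi
}
```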

${KUBE_COMMAND} apply -k "${EXAMPLES_DIR}/kubernetes-samples/supporting-infrastructure/overlays/${OPERATORS}_operators"

if [[ ${OPERATORS} == "Red_Hat" ]]; then
local AMQ_STREAMS_VERSION=""
@tinaselenge (Contributor) commented Sep 19, 2024


Should we use the new name instead of AMQ Streams for the variables too?

@SamBarker (Author)


Fair point, we should use the proper name.

@kornys (Member) commented Oct 10, 2024

@SamBarker it would be nice if you created a KafkaTopic for the Apicurio kafkasql storage, as described in the documentation, and maybe also other topics for the applications.
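For example, the Apicurio Registry kafkasql storage topic could be declared as a Strimzi KafkaTopic; the topic name follows Apicurio's kafkasql default, while the namespace and strimzi.io/cluster label below are assumptions that would need to match this demo's Kafka cluster:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: kafkasql-journal          # Apicurio Registry's default kafkasql topic
  namespace: kafka                # assumed namespace of the Kafka cluster
  labels:
    strimzi.io/cluster: my-cluster  # assumed Kafka cluster name
spec:
  partitions: 1
  replicas: 1
```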

@robobario (Contributor)

This will likely need an update since I've just updated main to point at a v0.0.1 tag of the SQL runner instead of latest.

6 participants