Production testing Linkerd2's discovery & caching.
The goal of this test suite is to run an outbound proxy for a prolonged amount of time in a dynamically-scheduled environment in order to exercise:
- Route resource lifecycle (i.e. routes are properly evicted).
- Telemetry resource lifecycle (i.e. Prometheus can run steadily for a long time, and the proxy doesn't leak memory in its exporter).
- Service discovery lifecycle (i.e. updates are honored correctly and the proxy doesn't get out of sync).
This environment creates a ClusterRole, which requires your user to be bound to the cluster-admin role:
kubectl create clusterrolebinding cluster-admin-binding-$USER \
--clusterrole=cluster-admin --user=$(gcloud config get-value account)
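Before creating the binding, it may be worth checking whether your user can already create ClusterRoles. The following is a minimal sketch built on `kubectl auth can-i`; the helper name `can_create_clusterroles` is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of this repo): succeeds when the current
# kubectl user is allowed to create ClusterRole resources.
can_create_clusterroles() {
  # `kubectl auth can-i` prints "yes" or "no" on stdout.
  [ "$(kubectl auth can-i create clusterroles 2>/dev/null)" = "yes" ]
}

# Usage (requires access to a cluster):
#   can_create_clusterroles || echo "cluster-admin binding is still needed"
```

If the check fails, the `kubectl create clusterrolebinding` command above grants the required access.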
Deploy 3 lifecycle environments:
linkerd install --linkerd-namespace linkerd-lifecycle | kubectl apply -f -
bin/deploy 3
Scale 3 lifecycle environments to 3 replicas of bb-broadcast, bb-p2p, and bb-terminus:
bin/scale 3 3
Total mesh-enabled pod count == 1 linkerd ns * (3*replicas+2)
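As a worked example, the expression above can be evaluated for a given replica count. This sketch computes it exactly as written; the function name `mesh_pod_count` is hypothetical, and whether the result should also be multiplied by the number of environments is not stated here.

```shell
# Hypothetical helper: evaluates 1 * (3*replicas + 2) for the replica
# count passed as the first argument.
mesh_pod_count() {
  echo $(( 1 * (3 * $1 + 2) ))
}

mesh_pod_count 3   # prints 11
```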
Teardown 3 lifecycle environments:
bin/teardown 3
kubectl delete ns linkerd-lifecycle
Install Linkerd service mesh:
linkerd install --linkerd-namespace linkerd-lifecycle | kubectl apply -f -
linkerd dashboard --linkerd-namespace linkerd-lifecycle
Deploy the test framework to the lifecycle namespace:
export LIFECYCLE_NS=lifecycle
kubectl create ns $LIFECYCLE_NS
cat lifecycle.yml | linkerd inject --linkerd-namespace linkerd-lifecycle - | kubectl -n $LIFECYCLE_NS apply -f -
Scale bb-broadcast, bb-p2p, and bb-terminus:
kubectl -n $LIFECYCLE_NS scale --replicas=3 deploy/bb-broadcast deploy/bb-p2p deploy/bb-terminus
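After scaling, it can help to wait until the new replicas are ready before reading dashboards or logs. A minimal sketch using `kubectl rollout status`; the helper name `wait_for_rollouts` and the 120s timeout are assumptions, not part of this repository.

```shell
# Hypothetical helper: block until every named deployment in the given
# namespace has finished rolling out, or fail on the first one that does not.
wait_for_rollouts() {
  local ns=$1
  shift
  local d
  for d in "$@"; do
    # The timeout value is an arbitrary choice for illustration.
    kubectl -n "$ns" rollout status "deploy/$d" --timeout=120s || return 1
  done
}

# Usage (requires access to a cluster):
#   wait_for_rollouts "$LIFECYCLE_NS" bb-broadcast bb-p2p bb-terminus
```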
Browse to Grafana:
linkerd dashboard --linkerd-namespace linkerd-lifecycle --show grafana
Tail slow-cooker logs:
kubectl -n $LIFECYCLE_NS logs -f $(
  kubectl -n $LIFECYCLE_NS get po --selector=app=slow-cooker -o jsonpath='{.items[*].metadata.name}'
) slow-cooker
Relevant Grafana dashboards to observe:
- Linkerd Deployment, for route lifecycle and service discovery lifecycle
- Prometheus 2.0 Stats, for telemetry resource lifecycle
Teardown:
kubectl delete ns $LIFECYCLE_NS
kubectl delete ns linkerd-lifecycle