Application failure injection is a form of chaos engineering where we artificially increase the error rate of certain services in a microservice application to see what impact that has on the system as a whole. Traditionally, you would need to add some kind of failure injection library into your service code in order to do application failure injection. Thankfully, the service mesh gives us a way to inject application failures without needing to modify or rebuild our services at all.
We can easily inject application failures by using the Traffic Split API of the Service Mesh Interface. This allows us to do failure injection in a way that is implementation agnostic and works across service meshes.
We will do this first by deploying a new service which only return errored responses. We will be using a simple NGINX service which has configured to only return HTTP 500 responses.
We will hten create a traffic split which would redirect the service mesh to send a sample percentage of traffic to the error service instead, let's say 20% of service's traffic to error, then we would have injected an artificial 20% error rate in service.
We will be deploying Linkerd Books application for this part of the demo
Use meshery to deploy the bookinfo application :
- In Meshery, navigate to the Linkerd adapter's management page from the left nav menu.
- On the Linkerd adapter's management page, please enter
default
in theNamespace
field. - Then, click the (+) icon on the
Sample Application
card and selectBooks Application
from the list.
Inject linkerd into sample application using
linkerd inject https://run.linkerd.io/booksapp.yml | kubectl apply -f -
In the following, one of the service has already beeen configured with the error let's remove the error rate from the same :
kubectl edit deploy/authors
Remove the lines
- name: FAILURE_RATE
value: "0.5
Now if you will see linkerd stat
, the success rate would be 100%
linkerd stat deploy
Now we will create our error service, we have NGINX pre-configured to only respond with HTTP 500 status code
apiVersion: apps/v1
kind: Deployment
metadata:
name: error-injector
labels:
app: error-injector
spec:
selector:
matchLabels:
app: error-injector
replicas: 1
template:
metadata:
labels:
app: error-injector
spec:
containers:
- name: nginx
image: nginx:alpine
ports:
- containerPort: 80
name: nginx
protocol: TCP
volumeMounts:
- name: nginx-config
mountPath: /etc/nginx/nginx.conf
subPath: nginx.conf
volumes:
- name: nginx-config
configMap:
name: error-injector-config
---
apiVersion: v1
kind: Service
metadata:
labels:
app: error-injector
name: error-injector
spec:
clusterIP: None
ports:
- name: service
port: 7002
protocol: TCP
targetPort: nginx
selector:
app: error-injector
type: ClusterIP
---
apiVersion: v1
data:
nginx.conf: |2
events {
worker_connections 1024;
}
http {
server {
location / {
return 500;
}
}
}
kind: ConfigMap
metadata:
name: error-injector-config
After deploying the above errored service, we will create a traffic split resource which will be responsible to direct 20% of the book service to the error.
apiVersion: split.smi-spec.io/v1alpha3
kind: TrafficSplit
metadata:
name: fault-inject
spec:
service: books
backends:
- service: books
weight: 800m
- service: error-injector
weight: 200m
You can now see an 20% error rate for calls from webapp to books
linkerd routes deploy/webapp --to service/books
You can also see the error on the web browser
kubectl port-forward deploy/webapp 7000 && open http://localhost:7000
If you refresh page few times, you will see Internal Server Error
.
kubectl delete trafficsplit/error-split
- Remove the book info application from the
Meshery Dashboard
by clicking on thetrash icon
in thesample application
card on the linkerd adapters' page.