In this tutorial, you will learn how to use the capabilities of Keptn to provide self-healing for an application without modifying code. The following tutorial will scale up the pods of an application if the application undergoes a response time degradation.
You'll find a time estimate until the end of this tutorial in the top right corner of your screen - this should give you guidance on how much time is needed for each step.
Keptn can be installed on a variety of Kubernetes distributions. Please find a full compatibility matrix for supported Kubernetes versions here.
Please find tutorials on how to set up your cluster here. For the best tutorial experience, please follow the sizing recommendations given in the tutorials.
Please make sure your environment matches these prerequisites:
Download the Istio command line tool by following the official instructions or by executing the following steps.
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.8.2 sh -
Check the version of Istio that has been downloaded and execute the installer from the corresponding folder, e.g.,
./istio-1.8.2/bin/istioctl install
The installation of Istio should be finished within a couple of minutes.
This will install the Istio default profile with ["Istio core" "Istiod" "Ingress gateways"] components into the cluster. Proceed? (y/N) y
✔ Istio core installed
✔ Istiod installed
✔ Ingress gateways installed
✔ Installation complete
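If you want to double-check the installation, you can list the pods in the istio-system namespace; this is an optional sanity check (the exact pod names will differ in your cluster):
kubectl get pods -n istio-system
All pods should reach the Running status within a few minutes.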
Every release of Keptn provides binaries for the Keptn CLI. These binaries are available for Linux, macOS, and Windows.
There are multiple options for getting the Keptn CLI onto your machine.
curl -sL https://get.keptn.sh | sudo -E bash
This will download and install the Keptn CLI automatically. Alternatively, you can download a release archive for your platform, make the keptn binary in the unpacked directory executable (chmod +x keptn), and move it to the desired destination (e.g., mv keptn /usr/local/bin/keptn). Now, you should be able to run the Keptn CLI:
keptn --help
.\keptn.exe --help
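To see which version of the CLI you have installed, you can also run the following command (an optional check):
keptn version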
To install the latest release of Keptn with full quality gate + continuous delivery capabilities in your Kubernetes cluster, execute the keptn install command.
keptn install --endpoint-service-type=ClusterIP --use-case=continuous-delivery
In the Keptn namespace, the following deployments should be found:
kubectl get deployments -n keptn
Here is the output of the command:
NAME READY UP-TO-DATE AVAILABLE AGE
api-gateway-nginx 1/1 1 1 2m44s
api-service 1/1 1 1 2m44s
bridge 1/1 1 1 2m44s
configuration-service 1/1 1 1 2m44s
eventbroker-go 1/1 1 1 2m44s
gatekeeper-service 1/1 1 1 2m44s
helm-service 1/1 1 1 2m44s
helm-service-continuous-deployment-distributor 1/1 1 1 2m44s
jmeter-service 1/1 1 1 2m44s
lighthouse-service 1/1 1 1 2m44s
mongodb 1/1 1 1 2m44s
mongodb-datastore 1/1 1 1 2m44s
remediation-service 1/1 1 1 2m44s
shipyard-service 1/1 1 1 2m44s
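If some deployments are not ready yet, you can wait for them instead of polling manually. This is an optional convenience command, not part of the original setup:
kubectl wait --for=condition=Available deployment --all -n keptn --timeout=300s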
We are using Istio for traffic routing and as an ingress to our cluster. To make the setup experience as smooth as possible, we have provided some scripts for your convenience. If you want to run the Istio configuration yourself step by step, please take a look at the Keptn documentation.
The first step of our configuration automation for Istio is to download the configuration bash script from GitHub:
curl -o configure-istio.sh https://raw.githubusercontent.com/keptn/examples/release-0.7.3/istio-configuration/configure-istio.sh
After that, you need to make the file executable using the chmod command.
chmod +x configure-istio.sh
Finally, let's run the configuration script to automatically create your Ingress resources.
./configure-istio.sh
With this script, you have created an Ingress based on the following manifest.
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: istio
  name: api-keptn-ingress
  namespace: keptn
spec:
  rules:
  - host: <IP-ADDRESS>.nip.io
    http:
      paths:
      - backend:
          serviceName: api-gateway-nginx
          servicePort: 80
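To double-check that the Ingress was created and picked up an address, you can inspect it directly (an optional sanity check):
kubectl -n keptn get ingress api-keptn-ingress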
In addition, the script has created a Gateway resource for you so that the onboarded services are also publicly available.
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: public-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      name: http
      number: 80
      protocol: HTTP
    hosts:
    - '*'
Finally, the helm-service pod of Keptn is restarted to fetch this new configuration.
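If you want to verify the Gateway as well, you can list it in the istio-system namespace (optional; this assumes the Istio CRDs installed by istioctl above):
kubectl get gateway -n istio-system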
In this section, we are referring to the Linux/macOS variants of the commands. If you are using a Windows host, please follow the official instructions. First, store the Keptn endpoint and API token in environment variables:
KEPTN_ENDPOINT=http://$(kubectl -n keptn get ingress api-keptn-ingress -ojsonpath='{.spec.rules[0].host}')/api
KEPTN_API_TOKEN=$(kubectl get secret keptn-api-token -n keptn -ojsonpath='{.data.keptn-api-token}' | base64 --decode)
Use this stored information and authenticate the CLI.
keptn auth --endpoint=$KEPTN_ENDPOINT --api-token=$KEPTN_API_TOKEN
That will give you:
Starting to authenticate
Successfully authenticated
If you want, you can go ahead and take a look at the Keptn API by navigating to the endpoint given by:
echo $KEPTN_ENDPOINT
For enabling the Keptn Quality Gates and for production monitoring, we are going to use Dynatrace as the data provider. Therefore, we are going to set up Dynatrace in our Kubernetes cluster so that our sample application is monitored and the monitoring data can serve both as the basis for evaluating quality gates and as a trigger to start self-healing.
If you don't have a Dynatrace tenant yet, sign up for a free trial or a developer account.
The DT_TENANT variable has to be set according to the appropriate pattern:
Dynatrace SaaS tenant: {your-environment-id}.live.dynatrace.com
Dynatrace-managed tenant: {your-domain}/e/{your-environment-id}
For ease of use, you can export the tenant and the tokens as environment variables instead of pasting the values into the kubectl command itself:
export DT_TENANT=yourtenant.live.dynatrace.com
export DT_API_TOKEN=yourAPItoken
export DT_PAAS_TOKEN=yourPAAStoken
If you used the variables, the next command can be copied and pasted without modifications. If you have not set the variables, please make sure to set the right values in the next command.
kubectl -n keptn create secret generic dynatrace --from-literal="DT_TENANT=$DT_TENANT" --from-literal="DT_API_TOKEN=$DT_API_TOKEN" --from-literal="DT_PAAS_TOKEN=$DT_PAAS_TOKEN" --from-literal="KEPTN_API_URL=http://$(kubectl -n keptn get ingress api-keptn-ingress -ojsonpath='{.spec.rules[0].host}')/api" --from-literal="KEPTN_API_TOKEN=$(kubectl get secret keptn-api-token -n keptn -ojsonpath='{.data.keptn-api-token}' | base64 --decode)" --from-literal="KEPTN_BRIDGE_URL=http://$(kubectl -n keptn get ingress api-keptn-ingress -ojsonpath='{.spec.rules[0].host}')/bridge"
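To confirm that the secret was created with all its keys, you can list it; this optional check does not print the secret values:
kubectl -n keptn get secret dynatrace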
To make the tutorial experience as smooth as possible, we are providing an automation script to set up the Dynatrace OneAgent Operator in your Kubernetes cluster. For details on the installation, we refer to the official Dynatrace documentation. You can download and run the script using the following instructions.
curl -o deploy-dynatrace-oneagent.sh https://raw.githubusercontent.com/keptn/examples/release-0.7.2/dynatrace-oneagent/deploy-dynatrace-oneagent.sh
After that, make the file executable using the chmod command.
chmod +x deploy-dynatrace-oneagent.sh
./deploy-dynatrace-oneagent.sh
When the installation is finished, you can verify the Dynatrace pods:
kubectl get pods -n dynatrace
dynatrace-oneagent-operator-696fd89b76-n9d9n 1/1 Running 0 6m26s
dynatrace-oneagent-webhook-78b6d99c85-h9759 2/2 Running 0 6m25s
oneagent-g9m42 1/1 Running 0 69s
Follow the next steps only if your Dynatrace OneAgent does not work properly.
In that case, the output of kubectl get pods -n dynatrace might look as follows:
NAME READY STATUS RESTARTS AGE
dynatrace-oneagent-operator-7f477bf78d-dgwb6 1/1 Running 0 8m21s
oneagent-b22m4 0/1 Error 6 8m15s
oneagent-k7jn6 0/1 CrashLoopBackOff 6 8m15s
This can often be fixed by enabling volume storage for the OneAgent. Add the following environment variable to the OneAgent custom resource:
env:
- name: ONEAGENT_ENABLE_VOLUME_STORAGE
  value: "true"
You can edit the custom resource with:
kubectl edit oneagent -n dynatrace
At the end of your installation, please verify that all Dynatrace resources are in a Ready and Running status by executing kubectl get pods -n dynatrace:
NAME READY STATUS RESTARTS AGE
dynatrace-oneagent-operator-7f477bf78d-dgwb6 1/1 Running 0 8m21s
oneagent-b22m4 1/1 Running 0 8m21s
oneagent-k7jn6 1/1 Running 0 8m21s
Next, install the dynatrace-service to integrate Dynatrace into Keptn:
kubectl apply -f https://raw.githubusercontent.com/keptn-contrib/dynatrace-service/release-0.10.4/deploy/service.yaml -n keptn
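You can optionally verify that the dynatrace-service pod has started (the exact pod name suffix will differ):
kubectl get pods -n keptn | grep dynatrace-service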
Afterwards, configure Dynatrace monitoring for Keptn:
keptn configure monitoring dynatrace
Output should be similar to this:
ID of Keptn context: 79f19c36-b718-4bb6-88d5-cb79f163289b
Configuring Dynatrace monitoring
Dynatrace OneAgent Operator is installed on cluster
Setting up auto-tagging rules in Dynatrace Tenant
Tagging rule keptn_service already exists
Tagging rule keptn_stage already exists
Tagging rule keptn_project already exists
Tagging rule keptn_deployment already exists
Setting up problem notifications in Dynatrace Tenant
Checking Keptn alerting profile availability
Keptn alerting profile available
Dynatrace Monitoring setup done
Verify Dynatrace configuration
Since Keptn has configured your Dynatrace tenant, let us take a look at what has been done for you:
A project in Keptn is the logical unit that can hold multiple (micro)services. Therefore, it is the starting point for each Keptn installation.
To get all files you need for this tutorial, please clone the example repo to your local machine.
git clone --branch release-0.7.3 https://github.com/keptn/examples.git --single-branch
cd examples/onboarding-carts
Create a new project for your services using the keptn create project command. In this example, the project is called sockshop. Before executing the following command, make sure you are in the examples/onboarding-carts folder.
Recommended: Create a new project with Git upstream:
To configure a Git upstream for this tutorial, the Git user (--git-user), an access token (--git-token), and the remote URL (--git-remote-url) are required. If a requirement is not met, go to the Keptn documentation where instructions for GitHub, GitLab, and Bitbucket are provided.
Let's define the variables before running the command:
GIT_USER=gitusername
GIT_TOKEN=gittoken
GIT_REMOTE_URL=remoteurl
Now let's create the project using the keptn create project command.
keptn create project sockshop --shipyard=./shipyard.yaml --git-user=$GIT_USER --git-token=$GIT_TOKEN --git-remote-url=$GIT_REMOTE_URL
Alternatively: If you don't want to use a Git upstream, you can create a new project without it, but please note that this is not the recommended way:
keptn create project sockshop --shipyard=./shipyard.yaml
For creating the project, the tutorial relies on a shipyard.yaml file as shown below:
stages:
- name: "dev"
  deployment_strategy: "direct"
  test_strategy: "functional"
- name: "staging"
  approval_strategy:
    pass: "automatic"
    warning: "automatic"
  deployment_strategy: "blue_green_service"
  test_strategy: "performance"
- name: "production"
  approval_strategy:
    pass: "automatic"
    warning: "manual"
  deployment_strategy: "blue_green_service"
  remediation_strategy: "automated"
This shipyard contains three stages: dev, staging, and production. This results in three Kubernetes namespaces: sockshop-dev, sockshop-staging, and sockshop-production.
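Once the stages have been deployed to, you can optionally confirm that these namespaces exist; depending on the Keptn version, they may already be created at project creation:
kubectl get namespaces | grep sockshop-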
Let's take a look at the project that we have just created. We can find all this information in the Keptn's Bridge. For this, we need the credentials that have been automatically generated for us.
keptn configure bridge --output
Now use these credentials to access your Keptn's Bridge.
echo http://$(kubectl -n keptn get ingress api-keptn-ingress -ojsonpath='{.spec.rules[0].host}')/bridge
You will find the just-created project with all its stages in the Bridge.
After creating the project, services can be onboarded to it.
keptn onboard service carts --project=sockshop --chart=./carts
After onboarding the service, add the functional test for the dev stage and the performance test for the staging stage:
keptn add-resource --project=sockshop --stage=dev --service=carts --resource=jmeter/basiccheck.jmx --resourceUri=jmeter/basiccheck.jmx
keptn add-resource --project=sockshop --stage=staging --service=carts --resource=jmeter/load.jmx --resourceUri=jmeter/load.jmx
Note: You can adapt the tests in basiccheck.jmx as well as load.jmx for your service. However, you must not rename the files because there is a hardcoded dependency on these file names in the current implementation of Keptn's jmeter-service.
Since the carts service requires a mongodb database, a second service needs to be onboarded. The --deployment-strategy flag specifies that a direct deployment strategy should be used for this service in all stages, regardless of the deployment strategy specified in the shipyard. Thus, the database is not blue/green deployed.
keptn onboard service carts-db --project=sockshop --chart=./carts-db --deployment-strategy=direct
Take a look in your Keptn's Bridge and see the newly onboarded services.
After onboarding the services, a built artifact of each service can be deployed.
keptn send event new-artifact --project=sockshop --service=carts-db --image=docker.io/mongo --tag=4.2.2
keptn send event new-artifact --project=sockshop --service=carts --image=docker.io/keptnexamples/carts --tag=0.11.1
Optionally, you can verify the pods that should have been created:
kubectl get pods --all-namespaces | grep carts-
sockshop-dev carts-77dfdc664b-25b74 1/1 Running 0 10m
sockshop-dev carts-db-54d9b6775-lmhf6 1/1 Running 0 13m
sockshop-production carts-db-54d9b6775-4hlwn 2/2 Running 0 12m
sockshop-production carts-primary-79bcc7c99f-bwdhg 2/2 Running 0 2m15s
sockshop-staging carts-db-54d9b6775-rm8rw 2/2 Running 0 12m
sockshop-staging carts-primary-79bcc7c99f-mbbgq 2/2 Running 0 7m24s
The carts service is now deployed in all three stages and reachable via these URLs:
echo http://carts.sockshop-dev.$(kubectl -n keptn get ingress api-keptn-ingress -ojsonpath='{.spec.rules[0].host}')
echo http://carts.sockshop-staging.$(kubectl -n keptn get ingress api-keptn-ingress -ojsonpath='{.spec.rules[0].host}')
echo http://carts.sockshop-production.$(kubectl -n keptn get ingress api-keptn-ingress -ojsonpath='{.spec.rules[0].host}')
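If you prefer the command line over the browser, you can check that a stage responds with a quick status-code probe; this is an optional check, shown here for the dev stage:
curl -s -o /dev/null -w "%{http_code}\n" http://carts.sockshop-dev.$(kubectl -n keptn get ingress api-keptn-ingress -ojsonpath='{.spec.rules[0].host}')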
Now that the service is running in all three stages, let us generate some traffic so we have some data we can base the evaluation on.
Change the directory to examples/load-generation/cartsloadgen. If you are still in the onboarding-carts directory, use the following command or change it accordingly:
cd ../load-generation/cartsloadgen
Now let us deploy a pod that will generate some traffic for all three stages of our demo environment.
kubectl apply -f deploy/cartsloadgen-base.yaml
The output will look similar to this.
namespace/loadgen created
deployment.extensions/cartsloadgen created
Optionally, you can verify that the load generator has been started.
kubectl get pods -n loadgen
NAME READY STATUS RESTARTS AGE
cartsloadgen-5dc47c85cf-kqggb 1/1 Running 0 117s
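Optionally, you can also take a look at the load generator's logs to see that requests are being sent (the output will vary):
kubectl logs -n loadgen deployment/cartsloadgen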
During the evaluation of a quality gate, the Dynatrace SLI provider is required. It is implemented by an internal Keptn service, the dynatrace-sli-service, which fetches the values for the SLIs that are referenced in an SLO configuration. Install it with the following command:
kubectl apply -f https://raw.githubusercontent.com/keptn-contrib/dynatrace-sli-service/0.7.1/deploy/service.yaml -n keptn
Next, we are going to add an SLI configuration file so that Keptn knows how to retrieve the data.
Please make sure you are in the correct folder, which is examples/onboarding-carts. If not, please change the directory accordingly, e.g., with cd ../../onboarding-carts/. We are going to add it globally to the project, i.e., for all services and stages we create.
keptn add-resource --project=sockshop --resource=sli-config-dynatrace.yaml --resourceUri=dynatrace/sli.yaml
For your information, this is what the file looks like:
---
spec_version: '1.0'
indicators:
  throughput: "builtin:service.requestCount.total:merge(\"dt.entity.service\"):sum?scope=tag(keptn_project:$PROJECT),tag(keptn_stage:$STAGE),tag(keptn_service:$SERVICE),tag(keptn_deployment:$DEPLOYMENT)"
  error_rate: "builtin:service.errors.total.count:merge(\"dt.entity.service\"):avg?scope=tag(keptn_project:$PROJECT),tag(keptn_stage:$STAGE),tag(keptn_service:$SERVICE),tag(keptn_deployment:$DEPLOYMENT)"
  response_time_p50: "builtin:service.response.time:merge(\"dt.entity.service\"):percentile(50)?scope=tag(keptn_project:$PROJECT),tag(keptn_stage:$STAGE),tag(keptn_service:$SERVICE),tag(keptn_deployment:$DEPLOYMENT)"
  response_time_p90: "builtin:service.response.time:merge(\"dt.entity.service\"):percentile(90)?scope=tag(keptn_project:$PROJECT),tag(keptn_stage:$STAGE),tag(keptn_service:$SERVICE),tag(keptn_deployment:$DEPLOYMENT)"
  response_time_p95: "builtin:service.response.time:merge(\"dt.entity.service\"):percentile(95)?scope=tag(keptn_project:$PROJECT),tag(keptn_stage:$STAGE),tag(keptn_service:$SERVICE),tag(keptn_deployment:$DEPLOYMENT)"
Configure the already onboarded project with the new SLI provider for Keptn to create some needed resources (e.g., a configmap):
keptn configure monitoring dynatrace --project=sockshop
To inform Keptn about any issues in a production environment, monitoring has to be set up correctly. The Keptn CLI helps with the automated setup and configuration of Dynatrace as the monitoring solution running in the Kubernetes cluster.
To add these files to Keptn and to automatically configure Dynatrace, execute the following commands:
cd examples/onboarding-carts
keptn add-resource --project=sockshop --stage=production --service=carts --resource=remediation.yaml --resourceUri=remediation.yaml
This is what the file we are going to add looks like:
apiVersion: spec.keptn.sh/0.1.4
kind: Remediation
metadata:
  name: service-remediation
spec:
  remediations:
  - problemType: Response time degradation
    actionsOnOpen:
    - action: scaling
      name: scaling
      description: Scale up
      value: 1
  - problemType: response_time_p90
    actionsOnOpen:
    - action: scaling
      name: scaling
      description: Scale up
      value: 1
Next, add an SLO configuration tailored to self-healing:
keptn add-resource --project=sockshop --stage=production --service=carts --resource=slo-self-healing.yaml --resourceUri=slo.yaml
Configure Dynatrace problem detection with a fixed threshold: For the sake of this demo, we will configure Dynatrace to detect problems based on fixed thresholds rather than automatically.
Log in to your Dynatrace tenant and go to Settings > Anomaly Detection > Services.
Within this menu, select the option Detect response time degradations using fixed thresholds, set the limit to 1000ms, and select Medium for the sensitivity as shown below.
To simulate user traffic that is causing an unhealthy behavior in the carts service, please execute the following script. This will add special items into the shopping cart that cause some extensive calculation.
cd ../load-generation/cartsloadgen/deploy
kubectl apply -f cartsloadgen-faulty.yaml
As you can see in the time series chart, the load generation script causes a significant increase in the response time.
After approximately 10-15 minutes, Dynatrace will send out a problem notification because of the response time degradation.
After receiving the problem notification, the dynatrace-service will translate it into a Keptn CloudEvent. This event will eventually be received by the remediation-service that will look for a remediation action specified for this type of problem and, if found, execute it.
In this tutorial, the number of pods will be increased to remediate the issue of the response time increase.
To verify the remediation action, check the deployments in the production namespace:
kubectl get deployments -n sockshop-production
You can see that the carts-primary deployment is now served by two pods:
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
carts-db 1 1 1 1 37m
carts-primary 2 2 2 2 32m
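If you want to observe the scaling as it happens, you can also watch the deployments; the -w flag streams updates until you stop it with Ctrl+C (optional):
kubectl get deployments -n sockshop-production -w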
kubectl get pods -n sockshop-production
NAME READY STATUS RESTARTS AGE
carts-db-57cd95557b-r6cg8 1/1 Running 0 38m
carts-primary-7c96d87df9-75pg7 2/2 Running 0 33m
carts-primary-7c96d87df9-78fh2 2/2 Running 0 5m
You have successfully walked through the example to scale up your application based on a response time degradation detected by Dynatrace.
Keptn can be easily extended with external tools such as notification tools, other SLI providers, bots to interact with Keptn, etc.
While we do not cover additional integrations in this tutorial, please feel free to take a look at our integration repositories:
Please visit us in our Keptn Slack and tell us how you like Keptn and this tutorial! We are happy to hear your thoughts & suggestions!
Also, make sure to follow us on Twitter to get the latest news on Keptn, our tutorials and newest releases!