Cloud-Native Workstation

A set of development and prototyping tools that can be useful in cloud-native projects



The components in this project are tailored towards cloud-native application development, delivery, and administration. Specific use cases include:

  1. Prototyping cloud-native systems
  2. Developing microservices and microsites
  3. Analyzing data - especially ETL processes with Python and REST APIs
  4. Provisioning cloud infrastructure with Terraform
  5. Managing Google Cloud Platform workloads
  6. Handling Helm charts and Kubernetes resources
  7. Working on data science notebooks (including AI/ML)

Relevant technologies include:

Code Server, Pgweb, Apache Guacamole, Jupyter, Selenium, SonarQube, Kanboard, Keycloak, Prometheus, Grafana, Terraform, Helm, Kubernetes, Docker, Certbot, Open Policy Agent, OAuth2 Proxy, and Nginx

My own use and testing are with Google Kubernetes Engine, but folks should find the system reasonably easy to adapt to other Kubernetes environments.

Architecture Diagram


Repository preparation

Pull the repository submodules with the following commands

git submodule init
git submodule update

Provisioning (Optional)

Google Kubernetes Engine (GCP)

If you would like to provision a new Kubernetes cluster on Google Kubernetes Engine to run your workstation, set the GOOGLE_PROJECT environment variable, then follow the steps below:
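
For example, set the project variable first (my-project is a placeholder):

export GOOGLE_PROJECT=my-project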

gcloud iam roles create workstation_provisioner \
    --project=$GOOGLE_PROJECT \
    --file=roles/provisioner.yaml
gcloud iam service-accounts create workstation-provisioner \
    --display-name="Workstation Provisioner"
gcloud projects add-iam-policy-binding $GOOGLE_PROJECT \
    --member="serviceAccount:workstation-provisioner@$GOOGLE_PROJECT.iam.gserviceaccount.com" \
    --role="projects/$GOOGLE_PROJECT/roles/workstation_provisioner"
gcloud projects add-iam-policy-binding $GOOGLE_PROJECT \
    --member="serviceAccount:workstation-provisioner@$GOOGLE_PROJECT.iam.gserviceaccount.com" \
    --role="roles/iam.serviceAccountUser"
gcloud iam service-accounts keys create workstation-provisioner.json \
    --iam-account="workstation-provisioner@$GOOGLE_PROJECT.iam.gserviceaccount.com"

Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the newly created key.
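
For example, assuming the key was created in the current directory:

export GOOGLE_APPLICATION_CREDENTIALS=$PWD/workstation-provisioner.json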

Navigate to the desired provisioning directory - either provision/gke or provision/gke-with-gpu. The gke specification creates a "normal" cluster with a single node pool. The gke-with-gpu specification adds Nvidia T4 GPU capabilities to the Jupyter component for AI/ML/GPU workloads. If you do not want to enable the Jupyter component, or want it for non-AI/ML/GPU workloads, use the gke specification, which is recommended for most users.

Once you've navigated to the desired infrastructure specification directory, provision using the default zone (us-central1-a) and cluster name (cloud-native-workstation):

terraform init
terraform apply

Or provision with a custom zone or cluster name:

terraform init
terraform apply -var compute_zone=YOUR_ZONE -var cluster_name=YOUR_CLUSTER_NAME

Return to the repository root directory:

cd ../..

Elastic Kubernetes Service (AWS)

If you would like to provision a new Kubernetes cluster on Elastic Kubernetes Service to run your workstation, follow the steps below.

  1. Create a Cloud Native Workstation policy in Amazon Web Services with the following permissions (a CLI sketch for steps 1-3 follows this list):
    1. iam:CreateRole
    2. iam:GetRole
    3. iam:ListRolePolicies
    4. iam:ListAttachedRolePolicies
    5. iam:ListInstanceProfilesForRole
    6. iam:DeleteRole
    7. iam:AttachRolePolicy
    8. iam:DetachRolePolicy
    9. logs:CreateLogGroup
    10. logs:PutRetentionPolicy
    11. logs:DescribeLogGroups
    12. logs:ListTagsLogGroup
    13. logs:DeleteLogGroup
    14. ec2:*
    15. eks:*
    16. autoscaling:CreateLaunchConfiguration
    17. autoscaling:DescribeLaunchConfigurations
    18. autoscaling:CreateAutoScalingGroup
    19. autoscaling:DescribeAutoScalingGroups
    20. autoscaling:DescribeScalingActivities
    21. autoscaling:SuspendProcesses
    22. autoscaling:UpdateAutoScalingGroup
    23. autoscaling:*
    24. cloudformation:ListStacks
  2. Create a new user and assign the Cloud Native Workstation and IAMFullAccess policies
  3. Create a key and set the AWS authentication environment variables
    export AWS_ACCESS_KEY_ID=YOURVALUEHERE
    export AWS_SECRET_ACCESS_KEY=YOURVALUEHERE
  4. Navigate to the EKS provisioning directory, then provision with:
    terraform init
    terraform apply
    
  5. Note the output values for efs_fs_id and efs_role_arn, as you will need them later
  6. Return to the repository root directory
    cd ../..
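
A minimal sketch of steps 1-3 with the AWS CLI (the workstation-policy.json policy document, user name, and account ID are illustrative placeholders):

# Sketch only: create the policy from a hypothetical workstation-policy.json containing the permissions above
aws iam create-policy \
    --policy-name CloudNativeWorkstation \
    --policy-document file://workstation-policy.json
# Create the user and attach the workstation policy plus IAMFullAccess
aws iam create-user --user-name workstation-provisioner
aws iam attach-user-policy \
    --user-name workstation-provisioner \
    --policy-arn arn:aws:iam::YOUR_ACCOUNT_ID:policy/CloudNativeWorkstation
aws iam attach-user-policy \
    --user-name workstation-provisioner \
    --policy-arn arn:aws:iam::aws:policy/IAMFullAccess
# Generate an access key for the AWS authentication environment variables in step 3
aws iam create-access-key --user-name workstation-provisioner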

Configure kubectl

Google Kubernetes Engine

If using Google Kubernetes Engine, execute the commands below. If you ran provisioning with the default GCP zone and cluster name, use cloud-native-workstation as the cluster name and us-central1-a as the cluster zone. If you are not using Google Kubernetes Engine, other cloud vendors provide a similar CLI and commands for configuring kubectl.

gcloud init
gcloud container clusters get-credentials cloud-native-workstation --zone us-central1-a

Next, create a namespace and configure kubectl to use that namespace:

kubectl create namespace cloud-native-workstation
kubectl config set-context --current --namespace cloud-native-workstation

Elastic Kubernetes Service (AWS)

aws eks update-kubeconfig --region us-east-1 --name cloud-native-workstation
kubectl create namespace cloud-native-workstation
kubectl config set-context --current --namespace cloud-native-workstation

Prepare SSL

Secure SSL setup is required. There are three options for SSL certificates:

  1. Cert Manager certificate provisioning and management, on top of Google Kubernetes Engine
  2. Automated SSL certificate generation using Let's Encrypt, Certbot, and the DNS01 challenge
  3. Bring your own certificate

Cert Manager with GKE

  1. Use Terraform to provision the resources in provision/cert-manager
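
For example, from the repository root:

cd provision/cert-manager
terraform init
terraform apply
cd ../..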

Later, during the Helm installations, make sure certbot.enabled is true and certbot.type is cert-manager-google in the deployment Helm values, and that certManager.enabled is true in the preparation Helm values.
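
For reference, these values can also be supplied with --set on the Helm installs described later, instead of editing the values files:

# Preparation (cluster prerequisites) chart, run from prepare/chart
helm install workstation-prerequisites . -n kube-system --set certManager.enabled=true
# Deployment chart, run from deploy
helm install workstation . --set certbot.enabled=true --set certbot.type=cert-manager-google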

Certbot with Google Cloud Platform DNS

  1. In Google Cloud Platform, create a Cloud DNS zone for your domain
  2. In your domain name registrar, ensure the domain nameservers are set to the values from Google
  3. In Google Cloud Platform, create a Cloud Native Workstation Certbot role with the following permissions:
    1. dns.changes.create
    2. dns.changes.get
    3. dns.managedZones.list
    4. dns.resourceRecordSets.create
    5. dns.resourceRecordSets.delete
    6. dns.resourceRecordSets.list
    7. dns.resourceRecordSets.update
  4. Create a new service account and assign the Cloud Native Workstation Certbot role
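
A minimal sketch of steps 3 and 4 with the gcloud CLI (the role and service account names are illustrative, and GOOGLE_PROJECT is assumed to be set):

# Custom role with the DNS permissions listed above
gcloud iam roles create workstation_certbot \
    --project=$GOOGLE_PROJECT \
    --permissions=dns.changes.create,dns.changes.get,dns.managedZones.list,dns.resourceRecordSets.create,dns.resourceRecordSets.delete,dns.resourceRecordSets.list,dns.resourceRecordSets.update
# Service account for Certbot, bound to the custom role
gcloud iam service-accounts create workstation-certbot \
    --display-name="Workstation Certbot"
gcloud projects add-iam-policy-binding $GOOGLE_PROJECT \
    --member="serviceAccount:workstation-certbot@$GOOGLE_PROJECT.iam.gserviceaccount.com" \
    --role="projects/$GOOGLE_PROJECT/roles/workstation_certbot"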

Generate a JSON key file for the service account, rename it to google.json, then add it to Kubernetes as a secret:

kubectl create secret generic google-json --from-file google.json

Later, during the installation, be sure certbot.enabled is true and certbot.type is google in the Helm values

Certbot with Cloudflare DNS

Create a Cloudflare API token with Zone:DNS:Edit permissions for only the zones you need certificates for. Create a cloudflare.ini file in the following format, substituting your API token:

# Cloudflare API token used by Certbot
dns_cloudflare_api_token = YOUR_TOKEN

Once you have created the cloudflare.ini file, run:

kubectl create secret generic cloudflare-ini --from-file cloudflare.ini

Later, during the installation, be sure certbot.enabled is true and certbot.type is cloudflare in the Helm values

Certbot with Route 53 (AWS) DNS

  1. In AWS, create a Route 53 DNS zone for your domain
  2. In your domain name registrar, ensure the domain nameservers are set to the values from AWS
  3. In AWS, create a Cloud Native Workstation Certbot user with the following permissions:
    1. route53:ListHostedZones
    2. route53:GetChange
    3. route53:ChangeResourceRecordSets

Create a credentials file called config with the following contents, using the Access Key ID and Secret Access Key for the user that you just set up.

[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

Add this file to Kubernetes as a secret:

kubectl create secret generic aws-config --from-file config

Later, during the installation, be sure certbot.enabled is true and certbot.type is aws in the Helm values

Bring your own SSL certificate

Create certificate and private key files. If you need a self-signed certificate, run the following command:

openssl req -x509 \
    -newkey rsa:2048 \
    -days 3650 \
    -nodes \
    -out example.crt \
    -keyout example.key

Load this up as a TLS Kubernetes secret:

kubectl create secret tls workstation-tls \
    --cert=example.crt \
    --key=example.key

Later, during the helm installation, be sure certbot.enabled is false.

Build (Optional)

If you do not want to use the public Docker Hub images, build the Docker images for the components you are interested in. For security, the Keycloak image is always required.

# Set the REPO environment variable to the image repository
REPO=us.gcr.io/my-project/my-repo  # for example
# Build and push images
cd build
docker build --tag $REPO/cloud-native-workstation-code-server:latest ./code-server
docker build --tag $REPO/cloud-native-workstation-initializers:latest ./initializers
docker build --tag $REPO/cloud-native-workstation-jupyter:latest ./jupyter
docker build --tag $REPO/cloud-native-workstation-pgweb:latest ./pgweb
docker push $REPO/cloud-native-workstation-code-server:latest
docker push $REPO/cloud-native-workstation-initializers:latest
docker push $REPO/cloud-native-workstation-jupyter:latest
docker push $REPO/cloud-native-workstation-pgweb:latest
cd ..

Configuration

Configure helm values, based on the instructions below.

Domain

Set the domain value, based on the domain that you would like to run your workstation on.

Certbot

The certbot.email should be configured if you are using the Certbot option for TLS certificates.
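
For example, the values can be supplied at install time with --set rather than editing the chart's values file (the domain and email below are placeholders):

# Run from the deploy directory during the workstation installation step
helm install workstation . \
    --set domain=workstation.example.com \
    --set certbot.enabled=true \
    --set certbot.type=google \
    --set certbot.email=admin@example.com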

Resource requests

For portability to low-resource environments like minikube, resource requests are zeroed for all components by default; adjust them to suit your cluster and workloads.

GPU capabilities

If you provisioned the cluster using the gke-with-gpu specification, ensure jupyter.enabled is true, set jupyter.gpu.enabled to true, and uncomment the two nvidia.com/gpu: 1 resource specification lines.

VM types

For GKE installations using the gke-beta provisioning specification (or similar), workloads can be configured to run on any combination of standard, spot, and/or preemptible VMs. If multiple VM types are enabled, no scheduling preference is given to any one option. Any workload run without node selectors or node affinities will be scheduled on standard VMs.

Installation

If you have already installed a workstation on the cluster, create a namespace for the new workstation, set kubeconfig to use that new namespace, prepare an SSL certificate, and skip to the workstation installation step.

Installation Architecture

Workstation prerequisites installation

The following commands install the Nginx Ingress Controller, Open Policy Agent Gatekeeper, and Keycloak CRDs. If you would like the ability (but not the obligation) to use EFS-backed persistent volume claims on Elastic Kubernetes Service (AWS), update the cluster preparation chart values before running the Helm install: set aws-efs-csi-driver.enabled to true, and set aws-efs-csi-driver.controller.serviceAccount.annotations.eks.amazonaws.com/role-arn and aws-efs-csi-driver.storageClasses[0].parameters.fileSystemId to the efs_role_arn and efs_fs_id provisioning output values that you noted earlier.

cd prepare/chart
helm dependency update
helm install workstation-prerequisites . -n kube-system
cd ../..

If using Cert Manager for TLS certificates:

kubectl annotate serviceaccount workstation-prerequisites-cert-manager \
    --namespace=kube-system \
    --overwrite \
    "iam.gke.io/gcp-service-account=workstation-cert-manager@$GOOGLE_PROJECT.iam.gserviceaccount.com"

CRDs installation

Constraint templates provide policy-based workstation controls and security. If you choose not to install these constraint templates, ensure policies.enabled is set to false in the helm values. Install with:

kubectl apply -f prepare/crds/constraint-templates.yaml --wait

Workstation installation

Install the workstation on the Kubernetes cluster with Helm:

cd deploy
helm dependency update
helm install workstation .
cd ..

Create a DNS entry (A record) that points *.YOURDOMAIN to the Load Balancer External IP created during the Helm installation. If using EKS, the A record should be an alias to the AWS ELB domain. To see the installed services, including this Load Balancer, run:

kubectl get service workstation-prerequisites-ingress-nginx-controller \
    -n kube-system \
    -o custom-columns=NAME:.metadata.name,TYPE:.spec.type,EXTERNAL-IP:.status.loadBalancer.ingress[0].ip,EXTERNAL-HOSTNAME:.status.loadBalancer.ingress[0].hostname
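
If the domain is hosted in Google Cloud DNS, a hedged example of creating the wildcard record (the zone name, domain, and IP are placeholders):

gcloud dns record-sets create "*.YOURDOMAIN." \
    --zone=your-dns-zone \
    --type=A \
    --ttl=300 \
    --rrdatas=LOAD_BALANCER_EXTERNAL_IP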

The domain must resolve before the components will work (access by IP only is not possible).

Note that workstation creation can take a few minutes; DNS propagation is particularly time-consuming.
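
To check whether the wildcard record has propagated, query one of the component hostnames, for example:

dig +short code.YOURDOMAIN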

Usage

Access the components that you've enabled in the Helm values (after authenticating with the Keycloak proxy):

  • code.YOUR_DOMAIN for Code Server IDE
  • code-dev-server.YOUR_DOMAIN for a development web server
    • e.g. hugo serve -D --bind=0.0.0.0 --baseUrl=code-dev-server.YOUR_DOMAIN --appendPort=false in Code Server
  • pgweb.YOUR_DOMAIN for Pgweb
  • selenium-hub.YOUR_DOMAIN for Selenium Grid hub
  • selenium-chrome.YOUR_DOMAIN for Selenium node (Chrome)
  • selenium-firefox.YOUR_DOMAIN for Selenium node (Firefox)
  • selenium-edge.YOUR_DOMAIN for Selenium node (Edge)
  • jupyter.YOUR_DOMAIN for Jupyter data science notebook
  • sonarqube.YOUR_DOMAIN for SonarQube
  • guacamole.YOUR_DOMAIN/guacamole/ for Apache Guacamole (default login guacadmin:guacadmin)
  • kanboard.YOUR_DOMAIN for Kanboard (default login admin:admin)
  • prometheus.YOUR_DOMAIN for Prometheus monitoring
  • grafana.YOUR_DOMAIN for Grafana visualization
  • keycloak.YOUR_DOMAIN for Keycloak administration

Deprovisioning

Helm uninstall

Uninstall both helm charts with:

helm uninstall workstation --wait
helm uninstall workstation-prerequisites --namespace kube-system --wait

Terraform destroy

If a Terraform provisioning specification was used to create the cloud resources, navigate to the provisioning directory and delete with:

terraform destroy

Contributing

If you fork this project and add something cool, please let me know or contribute it back.

Development

Skaffold is the recommended tool for developing Cloud Native Workstation. A configuration for development on Google Cloud Platform is provided for reference. A service account key is needed to use the required Google Cloud Platform services (e.g., create a build.json key file and point GOOGLE_APPLICATION_CREDENTIALS at it).

export GOOGLE_PROJECT={YOUR PROJECT}
export GOOGLE_APPLICATION_CREDENTIALS=build.json
cp skaffold.dev.yaml skaffold.yaml
sed -i "s|CODE_IMAGE|us.gcr.io/$GOOGLE_PROJECT/cloud-native-workstation/code|g" skaffold.yaml
sed -i "s|INITIALIZERS_IMAGE|us.gcr.io/$GOOGLE_PROJECT/cloud-native-workstation/initializers|g" skaffold.yaml
sed -i "s|JUPYTER_IMAGE|us.gcr.io/$GOOGLE_PROJECT/cloud-native-workstation/jupyter|g" skaffold.yaml
sed -i "s|PGWEB_IMAGE|us.gcr.io/$GOOGLE_PROJECT/cloud-native-workstation/pgweb|g" skaffold.yaml
skaffold dev

License

Licensed under the MIT license.

See the license in the LICENSE file