retain content from old operator guide
chrisghill committed Aug 30, 2024
1 parent 2d76cc3 commit bdbf3be
Showing 2 changed files with 111 additions and 34 deletions.
47 changes: 22 additions & 25 deletions README.md
@@ -17,7 +17,7 @@ Elastic Kubernetes Service is an open source container orchestration platform th

## Design

For detailed information, check out our [Operator Guide](operator.md) for this bundle.

## Usage

@@ -56,7 +56,7 @@ Form input parameters for configuring a bundle for deployment.

- **`fargate`** *(object)*: AWS Fargate provides on-demand, right-sized compute capacity for running containers on EKS without managing node pools or clusters of EC2 instances.
- **`enabled`** *(boolean)*: Enables EKS Fargate. Default: `False`.
- **`k8s_version`** *(string)*: The version of Kubernetes to run. **WARNING: Upgrading Kubernetes version must be done one minor version at a time**. For example, upgrading from 1.28 to 1.30 requires upgrading to 1.29 first. Must be one of: `['1.22', '1.23', '1.24', '1.25', '1.26', '1.27', '1.28', '1.29', '1.30']`. Default: `1.30`.
- **`monitoring`** *(object)*
- **`control_plane_log_retention`** *(integer)*: Duration to retain control plane logs in AWS Cloudwatch (Note: control plane logs do not contain application or container logs). Default: `7`.
- **One of**
@@ -69,35 +69,32 @@ Form input parameters for configuring a bundle for deployment.
- **`prometheus`** *(object)*: Configuration settings for the Prometheus instances that are automatically installed into the cluster to provide monitoring capabilities.
- **`grafana_enabled`** *(boolean)*: Install Grafana into the cluster to provide a metric visualizer. Default: `False`.
- **`persistence_enabled`** *(boolean)*: This setting will enable persistence of Prometheus data via EBS volumes. However, in small clusters (fewer than 5 nodes) this can cause pod scheduling and placement problems due to EBS volumes being zonally locked, and thus should be disabled. Default: `True`.
- **`node_groups`** *(array)*: Node groups to provision.
- **Items** *(object)*: Definition of a node group.
- **`advanced_configuration_enabled`** *(boolean)*: Default: `False`.
- **`instance_type`** *(string)*: Instance type to use in the node group.
- **One of**
- C5 High-CPU Large (2 vCPUs, 4.0 GiB)
- C5 High-CPU XL (4 vCPUs, 8.0 GiB)
- C5 High-CPU 2XL (8 vCPUs, 16.0 GiB)
- C5 High-CPU 4XL (16 vCPUs, 32.0 GiB)
- C5 High-CPU 9XL (36 vCPUs, 72.0 GiB)
- C5 High-CPU 12XL (48 vCPUs, 96.0 GiB)
- C5 High-CPU 18XL (72 vCPUs, 144.0 GiB)
- C5 High-CPU 24XL (96 vCPUs, 192.0 GiB)
- M5 General Purpose Large (2 vCPUs, 8.0 GiB)
- M5 General Purpose XL (4 vCPUs, 16.0 GiB)
- M5 General Purpose 2XL (8 vCPUs, 32.0 GiB)
- M5 General Purpose 4XL (16 vCPUs, 64.0 GiB)
- M5 General Purpose 8XL (32 vCPUs, 128.0 GiB)
- M5 General Purpose 12XL (48 vCPUs, 192.0 GiB)
- M5 General Purpose 16XL (64 vCPUs, 256.0 GiB)
- M5 General Purpose 24XL (96 vCPUs, 384.0 GiB)
- T3 Small (2 vCPUs for a 4h 48m burst, 2.0 GiB)
- T3 Medium (2 vCPUs for a 4h 48m burst, 4.0 GiB)
- T3 Large (2 vCPUs for a 7h 12m burst, 8.0 GiB)
- T3 XL (4 vCPUs for a 9h 36m burst, 16.0 GiB)
- T3 2XL (8 vCPUs for a 9h 36m burst, 32.0 GiB)
- **`max_size`** *(integer)*: Maximum number of instances in the node group. Minimum: `0`. Default: `10`.
- **`min_size`** *(integer)*: Minimum number of instances in the node group. Minimum: `0`. Default: `1`.
- **`name_suffix`** *(string)*: The name of the node group. Default: ``.
@@ -114,7 +111,7 @@ Form input parameters for configuring a bundle for deployment.
"fargate": {
"enabled": false
},
"k8s_version": "1.30",
"monitoring": {
"control_plane_log_retention": 7,
"prometheus": {
@@ -137,7 +134,7 @@ Form input parameters for configuring a bundle for deployment.
```json
{
"__name": "Development",
"k8s_version": "1.30",
"monitoring": {
"control_plane_log_retention": 7,
"prometheus": {
@@ -159,7 +156,7 @@ Form input parameters for configuring a bundle for deployment.
```json
{
"__name": "Production",
"k8s_version": "1.30",
"monitoring": {
"control_plane_log_retention": 365,
"prometheus": {
98 changes: 89 additions & 9 deletions operator.md
@@ -1,15 +1,90 @@
# aws-eks-cluster
AWS EKS (Elastic Kubernetes Service) is Amazon's managed Kubernetes service. It makes it easy to deploy, operate, and scale containerized applications, providing benefits such as automatic scaling of worker nodes, automatic upgrades and patching, integration with other AWS services, and access to the Kubernetes community and ecosystem.

## Use Cases
### Container orchestration
Kubernetes is the most powerful container orchestrator, making it easy to deploy, scale, and manage containerized applications.
### Microservices architecture
Kubernetes can be used to build and manage microservices-based applications, allowing for flexibility and scalability in a distributed architecture.
### Big Data and Machine Learning
Kubernetes can be used to deploy and manage big data and machine learning workloads, providing scalability and flexibility for processing and analyzing large data sets.
### Internet of Things (IoT)
Kubernetes can be used to manage and orchestrate IoT applications, providing robust management and scaling capabilities for distributed IoT devices and gateways.

## Design
EKS provides a "barebones" Kubernetes control plane, meaning that it only includes the essential components required to run a Kubernetes cluster. These components include the [Kubernetes API server](https://kubernetes.io/docs/concepts/overview/components/#kube-apiserver), [etcd](https://kubernetes.io/docs/concepts/overview/components/#etcd) (a distributed key-value store for storing Kubernetes cluster data), the [controller manager](https://kubernetes.io/docs/concepts/overview/components/#kube-controller-manager) and the [scheduler](https://kubernetes.io/docs/concepts/overview/components/#kube-scheduler).

To simplify deploying and operating a Kubernetes cluster, this bundle includes numerous optional addons to deliver a fully capable, feature-rich cluster that's ready for production workloads. Some of these addons are listed below.

### Cluster Autoscaler
A [cluster autoscaler](https://docs.aws.amazon.com/eks/latest/userguide/autoscaling.html#cluster-autoscaler) is installed into every EKS cluster to automatically scale the number of nodes in the cluster based on the current resource usage. This provides numerous benefits such as cost efficiency, higher availability, and better resource utilization.
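If you want to confirm the autoscaler is healthy, a minimal `kubectl` sketch follows. The namespace and deployment name below are assumptions (common defaults for cluster autoscaler installations); your cluster's may differ.

```shell
NS="kube-system"
DEPLOY="cluster-autoscaler"   # assumed deployment name -- adjust for your cluster

if command -v kubectl >/dev/null 2>&1; then
  # Show the autoscaler deployment and its recent scaling decisions
  kubectl -n "$NS" get deployment "$DEPLOY"
  kubectl -n "$NS" logs "deployment/$DEPLOY" --tail=20
else
  echo "kubectl not found; see the Connecting section"
fi
```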
### NGINX Ingress Controller
Users can optionally install the ["official" Kubernetes NGINX ingress controller](https://kubernetes.github.io/ingress-nginx/) (not to be confused with [NGINX's own ingress controller](https://docs.nginx.com/nginx-ingress-controller/), which is based on the paid NGINX Plus) into their cluster, which allows workloads in your EKS cluster to be accessible from the internet.
### External-DNS and Cert-Manager
If users associate one or more Route53 domains to their EKS cluster, this bundle will automatically install [external-dns](https://github.com/kubernetes-sigs/external-dns) and [cert-manager](https://cert-manager.io/docs/) in the cluster, allowing the cluster to automatically create and manage DNS records and TLS certificates for internet accessible workloads.
### EBS CSI Driver
[Beginning in Kubernetes version 1.23, EKS no longer comes with the default EBS provisioner](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html#kubernetes-1.23). In order to allow users to continue using the default `gp2` storage class, this bundle includes the [EBS CSI Driver](https://docs.aws.amazon.com/eks/latest/userguide/ebs-csi.html), which replaces the deprecated EBS provisioner.
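To see which storage classes the driver backs, a quick check with `kubectl` (assuming your kubeconfig is already set up as described in the Connecting section):

```shell
if command -v kubectl >/dev/null 2>&1; then
  # List all storage classes in the cluster; the default gp2 class
  # is provisioned by the EBS CSI driver
  kubectl get storageclass
  # Inspect the gp2 class in detail
  kubectl describe storageclass gp2
else
  echo "kubectl not found"
fi
```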
### EFS CSI Driver
Optionally, users can also install the [EFS CSI Driver](https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html) which will allow the EKS cluster to attach EFS volumes to cluster workloads for persistent storage. EFS volumes offer some benefits over EBS volumes, such as [allowing multiple pods to use the volume simultaneously (ReadWriteMany)](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes) and not being locked to a single AWS availability zone, but these benefits come with higher storage costs and increased latency.

### Fargate

Fargate can be enabled to allow AWS to provide on-demand, right-sized compute capacity for running containers on EKS without managing node pools or clusters of EC2 instances.

For workloads that require high uptime, it's recommended to keep some node pools populated even when enabling Fargate to ensure compute is always available during surges.

Fargate has many [limitations](https://docs.aws.amazon.com/eks/latest/userguide/fargate.html).

Currently only `namespace` selectors are implemented. If you need `label` selectors please file an [issue](https://github.com/massdriver-cloud/aws-eks-cluster/issues).

## Best Practices
### Managed Node Groups
Worker nodes in the cluster are provisioned as [managed node groups](https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html).
### Secure Networking
Cluster is designed according to [AWS's EKS networking best practices](https://docs.aws.amazon.com/eks/latest/userguide/network_reqs.html) including deploying nodes in private subnets and only deploying public load balancers into public subnets.
### Cluster Autoscaler
A cluster autoscaler is automatically installed to provide node autoscaling as workload demand increases.
### Open ID Connect (OIDC) Provider
Cluster is pre-configured for out-of-the box support of [IAM Roles for Service Accounts (IRSA)](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html).


## Security
### Nodes Deployed into Private Subnets
Worker nodes are provisioned into private subnets for security.
### IAM Roles for Service Accounts
IRSA allows kubernetes pods to assume AWS IAM Roles, removing the need for static credentials to access AWS services.
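The link between a pod and an IAM role is made by annotating the pod's service account with the role's ARN. A hypothetical sketch — the namespace, service account name, and role ARN below are placeholders, not values created by this bundle:

```shell
NS="default"
SA="my-app"                                            # hypothetical service account
ROLE_ARN="arn:aws:iam::111122223333:role/my-app-role"  # placeholder role ARN

if command -v kubectl >/dev/null 2>&1; then
  kubectl -n "$NS" create serviceaccount "$SA"
  # Pods using this service account can now assume the annotated IAM role
  kubectl -n "$NS" annotate serviceaccount "$SA" \
    "eks.amazonaws.com/role-arn=$ROLE_ARN"
fi
```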
### Secret Encryption
An AWS KMS key is created and associated to the cluster to enable [encryption of secrets](https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/) at rest.
### IMDSv2 Required on Node Groups
The Instance Metadata Service version 2 (IMDSv2) is required on all EKS node groups. IMDSv1, which was the cause of the [2019 Capital One data breach](https://divvycloud.com/capital-one-data-breach-anniversary/), is disabled on all node groups.

## Connecting
After you have deployed a Kubernetes cluster through Massdriver, you may want to interact with the cluster using the powerful [kubectl](https://kubernetes.io/docs/reference/kubectl/) command line tool.

### Install Kubectl

You will first need to install `kubectl` to interact with the kubernetes cluster. Installation instructions for Windows, Mac and Linux can be found [here](https://kubernetes.io/docs/tasks/tools/#kubectl).

Note: While `kubectl` generally has forwards and backwards compatibility of core capabilities, it is best if your `kubectl` client version is matched with your kubernetes cluster version. This ensures the best stability and compatibility for your client.
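You can compare your client version against the cluster's server version with `kubectl` itself; a short sketch:

```shell
if command -v kubectl >/dev/null 2>&1; then
  # Client version only -- works without any cluster access
  kubectl version --client
  # Client and server versions -- requires a reachable, configured cluster
  kubectl version
fi
```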



### Download the Kubeconfig File

The standard way to manage connection and authentication details for kubernetes clusters is through a configuration file called a [`kubeconfig`](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/) file. The `kubernetes-cluster` artifact that is created when you make a kubernetes cluster in Massdriver contains the basic information needed to create a `kubeconfig` file. Because of this, Massdriver makes it very easy for you to download a `kubeconfig` file that will allow you to use `kubectl` to query and administer your cluster.

To download a `kubeconfig` file for your cluster, navigate to the project and target where the kubernetes cluster is deployed and hover the mouse over the artifact connection port. This will open a window that allows you to download the artifact as raw JSON or as a `kubeconfig` YAML file. Select "Kube Config" from the drop-down menu and click the button. This will download the `kubeconfig` for the kubernetes cluster to your local system.

![Download Kubeconfig](https://github.com/massdriver-cloud/aws-eks-cluster/blob/main/images/kubeconfig-download.gif?raw=true)

### Use the Kubeconfig File

Once the `kubeconfig` file is downloaded, you can move it to your desired location. By default, `kubectl` will look for a file named `config` located in the `$HOME/.kube` directory. If you would like this to be your default configuration, you can rename and move the file to `$HOME/.kube/config`.
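For example, a sketch of moving the downloaded file into place as your default configuration. The download path below is a placeholder; adjust it to wherever your browser saved the file.

```shell
# Placeholder path -- adjust to where your browser saved the download
DOWNLOADED="$HOME/Downloads/kubeconfig.yaml"

# Create the default kubectl config directory if it doesn't exist
mkdir -p "$HOME/.kube"

if [ -f "$DOWNLOADED" ]; then
  mv "$DOWNLOADED" "$HOME/.kube/config"
  # The file contains credentials, so restrict its permissions
  chmod 600 "$HOME/.kube/config"
fi
```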

A single `kubeconfig` file can hold multiple cluster configurations, and you can select your desired cluster through the use of [`contexts`](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/#context). Alternatively, you can have multiple `kubeconfig` files and select your desired file through the `KUBECONFIG` environment variable or the `--kubeconfig` flag in `kubectl`.

Once you've configured your environment properly, you should be able to run `kubectl` commands.
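A few common commands for working with contexts and multiple `kubeconfig` files; the context name and second file path below are placeholders:

```shell
if command -v kubectl >/dev/null 2>&1; then
  # List all contexts in the active kubeconfig; '*' marks the current one
  kubectl config get-contexts
  # Switch to a different cluster ("my-eks-cluster" is a placeholder name)
  kubectl config use-context my-eks-cluster
fi

# Point kubectl at multiple kubeconfig files for this shell session
# (the second path is a placeholder)
export KUBECONFIG="$HOME/.kube/config:$HOME/.kube/dev-cluster.yaml"
```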

### Runbook

@@ -130,3 +205,8 @@ Verify that Grafana is accessible and that dashboards display the expected metrics.

By utilizing these runbook commands and tools, you can troubleshoot and manage your AWS EKS resources effectively.

## AWS Access

If you would like to manage access to your EKS cluster through AWS IAM principals, you can do so via the `aws-auth` ConfigMap. This will allow the desired AWS IAM principals to view cluster status in the AWS console, as well as generate short-lived credentials for `kubectl` access. Refer to the [AWS documentation](https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html) for more details.

**Note**: In order to connect to the EKS cluster to view or modify the `aws-auth` ConfigMap, you'll need to download the `kubeconfig` file and use `kubectl` as discussed earlier.
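A sketch of viewing and editing the ConfigMap once your kubeconfig is in place (the `edit` command opens the ConfigMap in your default editor):

```shell
if command -v kubectl >/dev/null 2>&1; then
  # View the current IAM principal mappings
  kubectl -n kube-system get configmap aws-auth -o yaml
  # Open the ConfigMap in an editor to add or change mappings
  kubectl -n kube-system edit configmap aws-auth
fi
```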
