This repository has been archived by the owner on Jan 12, 2023. It is now read-only.

Commit

Add TLS support to HTTP/GRPC clients (cortexproject#2502)
* Checkpoint

Signed-off-by: Annanay <[email protected]>

* Add tls options to grpc client

Signed-off-by: Annanay <[email protected]>

* Add new httpclient util package for use in all client configs

Signed-off-by: Annanay <[email protected]>

* Change all grpc clients to use grpcclient

Signed-off-by: Annanay <[email protected]>

* Fix build, add docs

Signed-off-by: Annanay <[email protected]>

* Fix tests

Signed-off-by: Annanay <[email protected]>

* Fix lint, add tls to store-gw-client

Signed-off-by: Annanay <[email protected]>

* Rename config parameters

Signed-off-by: Annanay <[email protected]>

* Lint

Signed-off-by: Annanay <[email protected]>

* Nit fix

Signed-off-by: Annanay <[email protected]>

* Checkpoint

Signed-off-by: Annanay <[email protected]>

* Checkpoint

Signed-off-by: Annanay <[email protected]>

* Checkpoint

Signed-off-by: Annanay <[email protected]>

* Add integration tests for TLS

Signed-off-by: Annanay <[email protected]>

* Correct package names, fix config file reference

Signed-off-by: Annanay <[email protected]>

* Fix cert paths

Signed-off-by: Annanay <[email protected]>

* Fix lint, add sample tls config file

Signed-off-by: Annanay <[email protected]>

* Crash quickly if certs are bad

Signed-off-by: Annanay <[email protected]>

* Fixed linter and doc generation

Signed-off-by: Marco Pracucci <[email protected]>

* Cleaned white noise

Signed-off-by: Marco Pracucci <[email protected]>

* Address review comments

Signed-off-by: Annanay <[email protected]>

* Fix docs, flags

Signed-off-by: Annanay <[email protected]>

* Fix test

Signed-off-by: Annanay <[email protected]>

* Fix lint, docs

Signed-off-by: Annanay <[email protected]>

* Do not use TLS options with GCP clients

Signed-off-by: Annanay <[email protected]>

* Add client auth type, go mod tidy/vendor

Signed-off-by: Annanay <[email protected]>

* Address comments

Signed-off-by: Annanay <[email protected]>

* Fix lint, add new integration test

Signed-off-by: Annanay <[email protected]>

* Revert logging level to warn, add CHANGELOG entry

Signed-off-by: Annanay <[email protected]>

Co-authored-by: Marco Pracucci <[email protected]>
Signed-off-by: Jacob Lisi <[email protected]>
2 people authored and jtlisi committed May 26, 2020
1 parent 9db28c1 commit 8555d7f
Showing 26 changed files with 537 additions and 51 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -37,6 +37,7 @@ Please make sure to review renamed metrics, and update your dashboards and alert
* [FEATURE] TLS config options added to the Server. #2535
* [FEATURE] Experimental: Added support for `/api/v1/metadata` Prometheus-based endpoint. #2549
* [FEATURE] Add ability to limit concurrent queries to Cassandra with `-cassandra.query-concurrency` flag. #2562
* [FEATURE] TLS config options added for GRPC clients in Querier (Query-frontend client & Ingester client), Ruler, Store Gateway, as well as HTTP client in Config store client. #2502
* [ENHANCEMENT] Experimental TSDB: sample ingestion errors are now reported via existing `cortex_discarded_samples_total` metric. #2370
* [ENHANCEMENT] Failures on samples at distributors and ingesters return the first validation error as opposed to the last. #2383
* [ENHANCEMENT] Experimental TSDB: Added `cortex_querier_blocks_meta_synced`, which reflects current state of synced blocks over all tenants. #2392
62 changes: 62 additions & 0 deletions docs/configuration/config-file-reference.md
@@ -646,6 +646,19 @@ The `querier_config` configures the Cortex querier.
# instances form a ring and addresses are picked from the ring).
# CLI flag: -experimental.querier.store-gateway-addresses
[store_gateway_addresses: <string> | default = ""]
store_gateway_client:
# TLS cert path for the client
# CLI flag: -experimental.querier.store-gateway-client.tls-cert-path
[tls_cert_path: <string> | default = ""]
# TLS key path for the client
# CLI flag: -experimental.querier.store-gateway-client.tls-key-path
[tls_key_path: <string> | default = ""]
# TLS CA path for the client
# CLI flag: -experimental.querier.store-gateway-client.tls-ca-path
[tls_ca_path: <string> | default = ""]
```

### `query_frontend_config`
@@ -757,6 +770,19 @@ The `ruler_config` configures the Cortex ruler.
# CLI flag: -ruler.external.url
[external_url: <url> | default = ]
ruler_client:
# TLS cert path for the client
# CLI flag: -ruler.client.tls-cert-path
[tls_cert_path: <string> | default = ""]
# TLS key path for the client
# CLI flag: -ruler.client.tls-key-path
[tls_key_path: <string> | default = ""]
# TLS CA path for the client
# CLI flag: -ruler.client.tls-ca-path
[tls_ca_path: <string> | default = ""]
# How frequently to evaluate rules
# CLI flag: -ruler.evaluation-interval
[evaluation_interval: <duration> | default = 1m]
@@ -1964,6 +1990,18 @@ grpc_client_config:
# Number of times to backoff and retry before failing.
# CLI flag: -ingester.client.backoff-retries
[max_retries: <int> | default = 10]
# TLS cert path for the client
# CLI flag: -ingester.client.tls-cert-path
[tls_cert_path: <string> | default = ""]
# TLS key path for the client
# CLI flag: -ingester.client.tls-key-path
[tls_key_path: <string> | default = ""]
# TLS CA path for the client
# CLI flag: -ingester.client.tls-ca-path
[tls_ca_path: <string> | default = ""]
```

### `frontend_worker_config`
@@ -2025,6 +2063,18 @@ grpc_client_config:
# Number of times to backoff and retry before failing.
# CLI flag: -querier.frontend-client.backoff-retries
[max_retries: <int> | default = 10]
# TLS cert path for the client
# CLI flag: -querier.frontend-client.tls-cert-path
[tls_cert_path: <string> | default = ""]
# TLS key path for the client
# CLI flag: -querier.frontend-client.tls-key-path
[tls_key_path: <string> | default = ""]
# TLS CA path for the client
# CLI flag: -querier.frontend-client.tls-ca-path
[tls_ca_path: <string> | default = ""]
```

### `etcd_config`
@@ -2530,6 +2580,18 @@ The `configstore_config` configures the config database storing rules and alerts
# Timeout for requests to Weave Cloud configs service.
# CLI flag: -<prefix>.configs.client-timeout
[client_timeout: <duration> | default = 5s]
# TLS cert path for the client
# CLI flag: -<prefix>.configs.tls-cert-path
[tls_cert_path: <string> | default = ""]
# TLS key path for the client
# CLI flag: -<prefix>.configs.tls-key-path
[tls_key_path: <string> | default = ""]
# TLS CA path for the client
# CLI flag: -<prefix>.configs.tls-ca-path
[tls_ca_path: <string> | default = ""]
```

### `tsdb_config`
100 changes: 100 additions & 0 deletions docs/configuration/single-process-config-blocks-tls.yaml
@@ -0,0 +1,100 @@

# Configuration for running Cortex in single-process mode.
# This should not be used in production. It is only for getting started
# and development.

# Disable the requirement that every request to Cortex has a
# X-Scope-OrgID header. `fake` will be substituted in instead.
auth_enabled: false

server:
  http_listen_port: 9009

  # Configure the server to allow messages up to 100MB.
  grpc_server_max_recv_msg_size: 104857600
  grpc_server_max_send_msg_size: 104857600
  grpc_server_max_concurrent_streams: 1000
  grpc_tls_config:
    cert_file: "server.crt"
    key_file: "server.key"
    client_auth_type: "RequireAndVerifyClientCert"
    client_ca_file: "root.crt"


distributor:
  shard_by_all_labels: true
  pool:
    health_check_ingesters: true

ingester_client:
  grpc_client_config:
    # Configure the client to allow messages up to 100MB.
    max_recv_msg_size: 104857600
    max_send_msg_size: 104857600
    use_gzip_compression: true
    tls_cert_path: "client.crt"
    tls_key_path: "client.key"
    tls_ca_path: "root.crt"

ingester:
  # Disable blocks transfers on ingesters shutdown or rollout.
  max_transfer_retries: 0

  lifecycler:
    # The address to advertise for this ingester. Will be autodiscovered by
    # looking up address on eth0 or en0; can be specified if this fails.
    # address: 127.0.0.1

    # We want to start immediately and flush on shutdown.
    join_after: 0
    min_ready_duration: 0s
    final_sleep: 0s
    num_tokens: 512

    # Use an in memory ring store, so we don't need to launch a Consul.
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1

storage:
  engine: tsdb

tsdb:
  dir: /tmp/cortex/tsdb
  bucket_store:
    sync_dir: /tmp/cortex/tsdb-sync

  # You can choose between local storage and Amazon S3, Google GCS and Azure storage. Each option requires additional configuration
  # as shown below. All options can be configured via flags as well which might be handy for secret inputs.
  backend: s3 # s3, gcs, azure or filesystem are valid options
  s3:
    bucket_name: cortex
    endpoint: s3.dualstack.us-east-1.amazonaws.com
    # Configure your S3 credentials below.
    # secret_access_key: "TODO"
    # access_key_id: "TODO"
  # gcs:
  #   bucket_name: cortex
  #   service_account: # if empty or omitted Cortex will use your default service account as per Google's fallback logic
  # azure:
  #   account_name:
  #   account_key:
  #   container_name:
  #   endpoint_suffix:
  #   max_retries: # Number of retries for recoverable errors (defaults to 20)
  # filesystem:
  #   dir: ./data/tsdb

compactor:
  data_dir: /tmp/cortex/compactor
  sharding_ring:
    kvstore:
      store: inmemory

frontend_worker:
  match_max_concurrent: true
  grpc_client_config:
    tls_cert_path: "client.crt"
    tls_key_path: "client.key"
    tls_ca_path: "root.crt"
111 changes: 111 additions & 0 deletions docs/production/tls.md
@@ -0,0 +1,111 @@
---
title: "Securing communication between Cortex components with TLS"
linkTitle: "Securing communication between Cortex components with TLS"
weight: 5
slug: tls
---

Cortex is a distributed system with significant traffic between its services.
To allow for secure communication, Cortex supports TLS between all its
components. This guide describes the process of setting up TLS.

### Generation of certs to configure TLS

The first step in securing inter-service communication in Cortex with TLS is
generating certificates. A Certificate Authority (CA) is used for this
purpose. It should be private to the organization, because any certificate
signed by this CA will be permitted to communicate with the cluster.

We will use the following script to generate a self-signed root CA certificate, plus client and server certificates signed by that CA:

```
# Refer: github.com/cortexproject/cortex/integration/certs/genCerts.sh
# keys
openssl genrsa -out root.key
openssl genrsa -out client.key
openssl genrsa -out server.key
# root cert / certifying authority
openssl req -x509 -new -nodes -key root.key -subj "/C=US/ST=KY/O=Org/CN=root" -sha256 -days 100000 -out root.crt
# csrs - certificate signing requests
openssl req -new -sha256 -key client.key -subj "/C=US/ST=KY/O=Org/CN=client" -out client.csr
openssl req -new -sha256 -key server.key -subj "/C=US/ST=KY/O=Org/CN=localhost" -out server.csr
# certificates
openssl x509 -req -in client.csr -CA root.crt -CAkey root.key -CAcreateserial -out client.crt -days 100000 -sha256
openssl x509 -req -in server.csr -CA root.crt -CAkey root.key -CAcreateserial -out server.crt -days 100000 -sha256
```

Note that the above script generates certificates that are valid for 100000 days.
This can be changed by adjusting the `-days` option in the above commands.
It is recommended that the certs be replaced at least once every 2 years.

The above script generates the keys `client.key` and `server.key` and the certs
`client.crt` and `server.crt` for the client and server respectively. The CA
key and cert are generated as `root.key` and `root.crt`.
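
To make the trust relationship concrete, here is a minimal Go sketch (not part of Cortex) that builds the same kind of chain in memory and checks it with `crypto/x509`. The helpers `newCert` and `signedBy` are hypothetical names introduced for illustration, and the sketch assumes ECDSA keys for speed, whereas the script above generates RSA keys.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newCert issues a short-lived certificate for commonName (hypothetical helper).
// With a nil parent the result is a self-signed CA, like root.crt; otherwise
// it is signed by parent, like client.crt and server.crt.
func newCert(commonName string, serial int64, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(serial),
		Subject:      pkix.Name{CommonName: commonName},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
	}
	signer, signerKey := tmpl, key // self-signed unless a parent is given
	if parent == nil {
		tmpl.IsCA = true
		tmpl.BasicConstraintsValid = true
		tmpl.KeyUsage |= x509.KeyUsageCertSign
	} else {
		signer, signerKey = parent, parentKey
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return nil, nil, err
	}
	return cert, key, nil
}

// signedBy returns nil only if cert chains up to ca.
func signedBy(ca, cert *x509.Certificate) error {
	roots := x509.NewCertPool()
	roots.AddCert(ca)
	_, err := cert.Verify(x509.VerifyOptions{
		Roots:     roots,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	return err
}

func main() {
	root, rootKey, err := newCert("root", 1, nil, nil)
	if err != nil {
		panic(err)
	}
	client, _, err := newCert("client", 2, root, rootKey)
	if err != nil {
		panic(err)
	}
	other, _, err := newCert("other-root", 1, nil, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println("client trusted by root:", signedBy(root, client) == nil)
	fmt.Println("client trusted by unrelated root:", signedBy(other, client) == nil)
}
```

The point of the sketch is that verification depends only on which CA signed the certificate, which is why the CA key should stay private to the organization.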

### Load certs into the HTTP/GRPC server/client

Every HTTP/GRPC link between Cortex components supports TLS configuration
through the following config parameters:

#### Server flags

```
# Path to the TLS Cert for the HTTP Server
-server.http-tls-cert-path=/path/to/server.crt
# Path to the TLS Key for the HTTP Server
-server.http-tls-key-path=/path/to/server.key
# Type of Client Auth for the HTTP Server
-server.http-tls-client-auth="RequireAndVerifyClientCert"
# Path to the Client CA Cert for the HTTP Server
-server.http-tls-ca-path="/path/to/root.crt"
# Path to the TLS Cert for the GRPC Server
-server.grpc-tls-cert-path=/path/to/server.crt
# Path to the TLS Key for the GRPC Server
-server.grpc-tls-key-path=/path/to/server.key
# Type of Client Auth for the GRPC Server
-server.grpc-tls-client-auth="RequireAndVerifyClientCert"
# Path to the Client CA Cert for the GRPC Server
-server.grpc-tls-ca-path=/path/to/root.crt
```
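
The client auth value used above, `RequireAndVerifyClientCert`, matches the name of a constant in Go's `crypto/tls` package. As a rough sketch, and assuming the accepted flag values mirror those constant names, a mapping from flag value to TLS policy could look like the following. `parseClientAuth` is a hypothetical helper; the server library Cortex uses may parse the flag differently.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// parseClientAuth maps a client-auth flag value to the crypto/tls policy of
// the same name. Hypothetical helper, written for illustration only.
func parseClientAuth(name string) (tls.ClientAuthType, error) {
	switch name {
	case "NoClientCert":
		return tls.NoClientCert, nil
	case "RequestClientCert":
		return tls.RequestClientCert, nil
	case "RequireAnyClientCert":
		return tls.RequireAnyClientCert, nil
	case "VerifyClientCertIfGiven":
		return tls.VerifyClientCertIfGiven, nil
	case "RequireAndVerifyClientCert":
		return tls.RequireAndVerifyClientCert, nil
	default:
		return tls.NoClientCert, fmt.Errorf("unknown client auth type %q", name)
	}
}

func main() {
	auth, err := parseClientAuth("RequireAndVerifyClientCert")
	if err != nil {
		panic(err)
	}
	// With this policy the server demands a client certificate and verifies
	// it against the configured client CA pool (tls.Config.ClientCAs).
	cfg := &tls.Config{ClientAuth: auth}
	fmt.Println("mutual TLS required:", cfg.ClientAuth == tls.RequireAndVerifyClientCert)
}
```

Rejecting unknown values with an error, rather than silently falling back to `NoClientCert`, avoids accidentally running a server without client verification because of a typo.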

#### Client flags

Client flags are component specific.

For an HTTP client in the Alertmanager:
```
# Path to the TLS Cert for the HTTP Client
-alertmanager.configs.tls-cert-path=/path/to/client.crt
# Path to the TLS Key for the HTTP Client
-alertmanager.configs.tls-key-path=/path/to/client.key
# Path to the TLS CA for the HTTP Client
-alertmanager.configs.tls-ca-path=/path/to/root.crt
```

For a GRPC client in the Querier:
```
# Path to the TLS Cert for the GRPC Client
-querier.frontend-client.tls-cert-path=/path/to/client.crt
# Path to the TLS Key for the GRPC Client
-querier.frontend-client.tls-key-path=/path/to/client.key
# Path to the TLS CA for the GRPC Client
-querier.frontend-client.tls-ca-path=/path/to/root.crt
```

TLS can be configured in a similar fashion for other GRPC clients like the
ingester client.
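
For readers curious how three such paths typically become a client-side TLS configuration in Go, here is a minimal sketch. It is not the code added in this commit: `newClientTLSConfig` is a hypothetical helper, and the choice to return an error on any unreadable file is an assumption in the spirit of the "Crash quickly if certs are bad" change.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// newClientTLSConfig builds a tls.Config from the cert, key and CA paths
// (hypothetical helper). Empty paths leave the matching setting at its default.
func newClientTLSConfig(certPath, keyPath, caPath string) (*tls.Config, error) {
	cfg := &tls.Config{}

	// The client certificate and key only make sense as a pair.
	if (certPath == "") != (keyPath == "") {
		return nil, errors.New("client cert and key paths must be set together")
	}
	if certPath != "" {
		cert, err := tls.LoadX509KeyPair(certPath, keyPath)
		if err != nil {
			return nil, fmt.Errorf("loading client key pair: %w", err)
		}
		cfg.Certificates = []tls.Certificate{cert}
	}

	// The CA cert is used to verify the certificate presented by the server.
	if caPath != "" {
		pem, err := os.ReadFile(caPath)
		if err != nil {
			return nil, fmt.Errorf("reading CA file: %w", err)
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pem) {
			return nil, errors.New("no PEM certificates found in CA file")
		}
		cfg.RootCAs = pool
	}
	return cfg, nil
}

func main() {
	cfg, err := newClientTLSConfig("client.crt", "client.key", "root.crt")
	if err != nil {
		// Fail at startup instead of on the first request.
		fmt.Fprintln(os.Stderr, "bad TLS configuration:", err)
		os.Exit(1)
	}
	fmt.Println("client certificates loaded:", len(cfg.Certificates))
}
```

The resulting `*tls.Config` could then be handed to an `http.Transransport`-style HTTP client or wrapped in gRPC transport credentials; how Cortex wires it up is defined by the packages changed in this commit.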
37 changes: 37 additions & 0 deletions integration/certs/genCerts.sh
@@ -0,0 +1,37 @@
#!/usr/bin/env bash
# Copied from https://github.com/joe-elliott/cert-exporter/blob/5ce49ebf6bfcdcb178d31145ae2a460f3b348cf5/test/files/genCerts.sh
# Copyright [2020] [cert-exporter authors]
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

certFolder=$1
days=$2

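# Quote the path and stop if the folder is missing, so keys are never written elsewhere.
pushd "$certFolder" || exit 1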

# keys
openssl genrsa -out root.key
openssl genrsa -out client.key
openssl genrsa -out server.key

# root cert
openssl req -x509 -new -nodes -key root.key -subj "/C=US/ST=KY/O=Org/CN=root" -sha256 -days $days -out root.crt

# csrs
openssl req -new -sha256 -key client.key -subj "/C=US/ST=KY/O=Org/CN=client" -out client.csr
openssl req -new -sha256 -key server.key -subj "/C=US/ST=KY/O=Org/CN=localhost" -out server.csr

openssl x509 -req -in client.csr -CA root.crt -CAkey root.key -CAcreateserial -out client.crt -days $days -sha256
openssl x509 -req -in server.csr -CA root.crt -CAkey root.key -CAcreateserial -out server.crt -days $days -sha256

popd
11 changes: 8 additions & 3 deletions integration/configs.go
@@ -22,16 +22,21 @@ const (
cortexConfigFile = "config.yaml"
cortexSchemaConfigFile = "schema.yaml"
blocksStorageEngine = "tsdb"
clientCertFile = "certs/client.crt"
clientKeyFile = "certs/client.key"
caCertFile = "certs/root.crt"
serverCertFile = "certs/server.crt"
serverKeyFile = "certs/server.key"
storeConfigTemplate = `
- from: {{.From}}
store: {{.IndexStore}}
schema: v9
index:
prefix: cortex_
period: 168h
period: 168h
chunks:
prefix: cortex_chunks_
period: 168h
period: 168h
`

cortexAlertmanagerUserConfigYaml = `route:
@@ -120,7 +125,7 @@ storage:
table_manager:
poll_interval: 1m
retention_period: 168h
retention_period: 168h
schema:
{{.SchemaConfig}}