Merge pull request #441 from logzio/logzio-logs-collector
Add `logzio-logs-collector` chart
yotamloe authored Mar 27, 2024
2 parents 6c8a989 + b4b027b commit 9f7f9cc
Showing 20 changed files with 1,851 additions and 0 deletions.
10 changes: 10 additions & 0 deletions charts/logzio-logs-collector/Chart.yaml
@@ -0,0 +1,10 @@
apiVersion: v2
name: logzio-logs-collector
version: 1.0.0
description: kubernetes logs collection agent for logz.io based on opentelemetry collector
type: application
home: https://logz.io/
maintainers:
- name: yotam loewenbach
email: [email protected]
appVersion: 0.80.0
139 changes: 139 additions & 0 deletions charts/logzio-logs-collector/README.md
@@ -0,0 +1,139 @@
# logzio-logs-collector

**In development**

Kubernetes logs collection agent for Logz.io, based on the OpenTelemetry Collector.

## Prerequisites

- Kubernetes 1.24+
- Helm 3.9+


* * *

Logz.io Logs Collector for Kubernetes
=====================================

The `logzio-logs-collector` Helm chart deploys a Kubernetes logs collection agent designed to forward logs from your Kubernetes clusters to Logz.io. This solution leverages the OpenTelemetry Collector, offering a robust and flexible way to manage log data, ensuring that your logging infrastructure scales with your application needs.

Features
--------

* **Easy Integration with Logz.io**: Pre-configured to send logs to Logz.io, simplifying setup and integration.
* **Secure Secret Management**: Option to automatically manage secrets for seamless and secure authentication with Logz.io.

Getting Started
---------------

### Add Logz.io Helm Repository

Before installing the chart, add the Logz.io Helm repository:

```
helm repo add logzio-helm https://logzio.github.io/logzio-helm
helm repo update
```

### Installation

1. **Create the Logz.io Secret**

   The chart can create and manage the Logz.io secret for you (`secrets.enabled=true`, the default). If you manage secrets externally instead, make sure a secret with your shipping token and the other relevant values exists before installing (a hedged example manifest is shown after the installation steps).

2. **Install the Chart**

   Install `logzio-logs-collector` from the Logz.io Helm repository, providing your Logz.io authentication values:

```
helm install logzio-logs-collector -n monitoring \
--set enabled=true \
--set secrets.logzioLogsToken=<<token>> \
--set secrets.logzioRegion=<<region>> \
--set secrets.env_id=<<env_id>> \
--set secrets.logType=<<logType>> \
logzio-helm/logzio-logs-collector
```
Replace `logzio-logs-collector` with your release name.
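
If you prefer to manage the secret yourself (`secrets.enabled=false`), a minimal manifest might look like the sketch below. The secret name matches the chart's default `secrets.name`, but the data key names are illustrative assumptions; verify the exact keys the chart expects in its `values.yaml` and templates before using this.

```yaml
# Hedged sketch of an externally managed secret (used with secrets.enabled=false).
# The data key names below are assumptions for illustration only.
apiVersion: v1
kind: Secret
metadata:
  name: logzio-log-collector-secrets   # chart default for secrets.name
  namespace: monitoring
type: Opaque
stringData:
  logzio-logs-token: "<<token>>"   # hypothetical key name
  logzio-region: "us"              # hypothetical key name
  env-id: "my_env"                 # hypothetical key name
  log-type: "k8s"                  # hypothetical key name
```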
### Uninstalling the Chart

To uninstall/delete the `logzio-logs-collector` deployment:

`helm delete -n monitoring logzio-logs-collector`

Configuration
-------------

The following table lists the configurable parameters of the `logzio-logs-collector` chart and their default values.

| Key | Description | Default Value |
|--------------------------|----------------------------------------------------------------------------------|----------------------------------------|
| enabled | Toggle for enabling the Helm chart deployment. | `false` |
| nameOverride | Override the default name for the deployment. | `""` |
| fullnameOverride | Set a full name override for the deployment. | `""` |
| mode | Deployment mode (currently supports only "daemonset"). | `"daemonset"` |
| namespaceOverride | Override the namespace into which the resources will be deployed. | `""` |
| fargateLogRouter.enabled | Whether to configure the Fargate log router (for EKS Fargate environments). | `false` |
| secrets.enabled | Toggle for creating and managing the Logz.io secret by this chart. | `true` |
| secrets.name | The name of the secret for Logz.io log collector. | `"logzio-log-collector-secrets"` |
| secrets.env_id | Environment identifier attribute added to all logs. | `"my_env"` |
| secrets.logType | Default log type field. | `"k8s"` |
| secrets.logzioLogsToken | Secret with your Logz.io logs shipping token. | `"token"` |
| secrets.logzioRegion | Secret with your Logz.io region. | `"us"` |
| secrets.customEndpoint | Secret with your custom endpoint, overrides Logz.io region listener address. | `""` |
| configMap.create | Specifies whether a configMap should be created. | `true` |
| config | Base collector configuration, supports templating. | Complex structure (see `values.yaml`) |
| image.repository | Docker image repository. | `"otel/opentelemetry-collector-contrib"` |
| image.pullPolicy | Image pull policy. | `"IfNotPresent"` |
| image.tag | Overrides the image tag. | `""` |
| image.digest | Pull images by digest. | `""` |
| imagePullSecrets | Specifies image pull secrets. | `[]` |
| command.name | OpenTelemetry Collector executable. | `"otelcol-contrib"` |
| command.extraArgs | Additional arguments for the command. | `[]` |
| serviceAccount.create | Specifies whether a service account should be created. | `true` |
| serviceAccount.name | The name of the service account to use. | `""` |
| clusterRole.create | Specifies whether a clusterRole should be created. | `true` |
| clusterRole.name | The name of the clusterRole to use. | `""` |
| podSecurityContext | Security context policies for the pod. | `{}` |
| securityContext | Security context policies for the container. | `{}` |
| nodeSelector | Node labels for pod assignment. | `{}` |
| tolerations | Tolerations for pod assignment. | `[]` |
| affinity | Affinity rules for pod assignment. | Complex structure (see `values.yaml`) |
| priorityClassName | Scheduler priority class name. | `""` |
| extraEnvs | Extra environment variables to set in the pods. | `[]` |
| ports | Defines ports configurations. | Complex structure (see `values.yaml`) |
| resources | CPU/memory resource requests/limits. | `limits.cpu: 250m`, `limits.memory: 512Mi` |
| podAnnotations | Annotations to add to the pod. | `{}` |
| podLabels | Labels to add to the pod. | `{}` |
| hostNetwork | Use the host's network namespace. | `false` |
| dnsPolicy | Pod DNS policy. | `""` |
| livenessProbe | Liveness probe configuration. | (see `values.yaml`) |
| readinessProbe | Readiness probe configuration. | (see `values.yaml`) |
| service.enabled | Enable the creation of a Service. | `true` |
| ingress.enabled | Enable ingress resource creation. | `false` |
| podMonitor.enabled | Enable the creation of a PodMonitor. | `false` |
| networkPolicy.enabled | Enable NetworkPolicy creation. | `false` |
| useGOMEMLIMIT | Set GOMEMLIMIT env var to a percentage of resources.limits.memory. | `false` |
### Configure customization options

You can use the following options to update the Helm chart parameters:

* Specify parameters using the `--set key=value[,key=value]` argument to `helm install`
* Edit the `values.yaml`
* Override default values with your own `my_values.yaml` and pass it to the `helm install` command (see the sketch below)
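
For example, a minimal `my_values.yaml` overriding the authentication values and resource limits might look like the following sketch (all values are placeholders for illustration):

```yaml
# my_values.yaml - illustrative overrides only; adjust to your environment
enabled: true
secrets:
  logzioLogsToken: "<<token>>"
  logzioRegion: "us"
  env_id: "my_env"
  logType: "k8s"
resources:
  limits:
    cpu: 250m
    memory: 512Mi
```

Apply it with `helm install logzio-logs-collector -n monitoring -f my_values.yaml logzio-helm/logzio-logs-collector`.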

Multiline logs configuration
-----------------------------

By default, the collector supports common container log formats, including multiline logs, such as `CRI-O`, `CRI-Containerd`, and `Docker`. You can also configure the chart to parse custom multiline log patterns; see the [Customizing Multiline Log Handling](./examples/multiline.md) guide for details.

## Change log

* 1.0.0
    - Kubernetes logs collection agent for Logz.io based on the OpenTelemetry Collector
211 changes: 211 additions & 0 deletions charts/logzio-logs-collector/examples/multiline.md
@@ -0,0 +1,211 @@
Customizing Multiline Log Handling
==================================

Handling multiline logs efficiently in the OpenTelemetry Collector deployed through the `logzio-logs-collector` Helm chart requires an understanding of the `filelog` receiver, particularly its `recombine` operator. This operator parses and recombines logs that span multiple lines, such as stack traces or other multiline application logs, into a single entry.

Key Configuration Options of the `recombine` Operator
-----------------------------------------------------

To tailor the `filelog` receiver for multiline logs, focus on these essential `recombine` operator configurations (a combined sketch follows the list):

* **`combine_field`**: Specifies which field of the log entry should be combined into a single entry. Typically, this is the message body of the log (`body` or `body.message`).

* **`is_first_entry`** and **`is_last_entry`**: Logical expressions that evaluate to `true` if the log entry being processed is the beginning or the end of a multiline series, respectively. You need to specify at least one of these based on whether you can reliably identify the start or the end of a multiline log entry.

* **`combine_with`**: Defines the delimiter used to concatenate the log entries. For most logs, `"\n"` (newline) is a suitable delimiter, preserving the original log's structure.

* **`source_identifier`**: Helps differentiate between log sources when combining entries, ensuring logs from different sources are not mistakenly merged.
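
Putting these options together, a standalone `recombine` operator might look like the following minimal sketch (the `id` and expressions here are illustrative, not the chart's defaults):

```yaml
# Minimal recombine sketch: merge indented continuation lines into the
# entry that precedes them. Values are illustrative only.
- type: recombine
  id: my-multiline-recombine                      # hypothetical operator id
  combine_field: body                             # field whose values get concatenated
  is_first_entry: body matches "^\\S"             # a new entry starts at a non-whitespace column
  combine_with: "\n"                              # keep the original line breaks
  source_identifier: attributes["log.file.path"]  # never merge lines from different files
```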


Creating Custom Formats for Multiline Logs
------------------------------------------

To configure custom formats, you must understand your logs' structure to accurately use `is_first_entry` or `is_last_entry` expressions. Regular expressions (regex) are powerful tools in matching specific log patterns, allowing you to identify the start or end of a multiline log entry effectively.

Custom multiline `recombine` operators should be added before the `move` operator that copies `attributes.log` to `body`:
```yaml
# Update body field after finishing all parsing
- from: attributes.log
  to: body
  type: move
```
Here is an example `custom-values.yaml` that shows where to add custom multiline `recombine` operators:
```yaml
secrets:
  enabled: true
  logzioLogsToken: "<<logzio-token>>"
config:
  receivers:
    filelog:
      operators:
        - id: get-format
          routes:
            - expr: body matches "^\\{"
              output: parser-docker
            - expr: body matches "^[^ Z]+ "
              output: parser-crio
            - expr: body matches "^[^ Z]+Z"
              output: parser-containerd
          type: router
        - id: parser-crio
          regex: ^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
          timestamp:
            layout: 2006-01-02T15:04:05.999999999Z07:00
            layout_type: gotime
            parse_from: attributes.time
          type: regex_parser
        - combine_field: attributes.log
          combine_with: ""
          id: crio-recombine
          is_last_entry: attributes.logtag == 'F'
          max_log_size: 102400
          output: extract_metadata_from_filepath
          source_identifier: attributes["log.file.path"]
          type: recombine
        - id: parser-containerd
          regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
          timestamp:
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
            parse_from: attributes.time
          type: regex_parser
        - combine_field: attributes.log
          combine_with: ""
          id: containerd-recombine
          is_last_entry: attributes.logtag == 'F'
          max_log_size: 102400
          output: extract_metadata_from_filepath
          source_identifier: attributes["log.file.path"]
          type: recombine
        - id: parser-docker
          output: extract_metadata_from_filepath
          timestamp:
            layout: '%Y-%m-%dT%H:%M:%S.%LZ'
            parse_from: attributes.time
          type: json_parser
        - id: extract_metadata_from_filepath
          parse_from: attributes["log.file.path"]
          regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
          type: regex_parser
        - from: attributes.stream
          to: attributes["iostream"]
          type: move
        - from: attributes.container_name
          to: resource["k8s.container.name"]
          type: move
        - from: attributes.namespace
          to: resource["k8s.namespace.name"]
          type: move
        - from: attributes.pod_name
          to: resource["k8s.pod.name"]
          type: move
        - from: attributes.restart_count
          to: resource["k8s.container.restart_count"]
          type: move
        - from: attributes.uid
          to: resource["k8s.pod.uid"]
          type: move
        # Add custom multiline parsers here. Add more `type: recombine` operators for custom multiline formats
        # https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/recombine.md
        - type: recombine
          id: stack-errors-recombine
          combine_field: body
          is_first_entry: body matches "^[^\\s]"
          source_identifier: attributes["log.file.path"]
        # Update body field after finishing all parsing
        - from: attributes.log
          to: body
          type: move
```
### Examples

#### Java Stack Trace Errors

Java exceptions span multiple lines, starting with an exception message followed by lines that detail the stack trace.

**Log Format:**
```java
Exception in thread "main" java.lang.NullPointerException
    at com.example.myproject.Book.getTitle(Book.java:16)
    at com.example.myproject.Author.getBookTitles(Author.java:25)
```
**Configuration:**

```yaml
config:
  receivers:
    filelog:
      operators:
        # previous operators
        - type: recombine
          id: Java-Stack-Trace-Errors
          combine_field: body
          is_first_entry: body matches "^[\\w]+(Exception|Error)"
          combine_with: "\n"
        # Update body field after finishing all parsing
        - from: attributes.log
          to: body
          type: move
```
#### Python Tracebacks

Python errors start with `Traceback`, followed by file paths and the actual error message.

**Log Format:**

```python
Traceback (most recent call last):
File "/path/to/script.py", line 1, in <module>
raise Exception("An error occurred")
Exception: An error occurred
```
**Configuration:**

```yaml
config:
  receivers:
    filelog:
      operators:
        # previous operators
        - type: recombine
          id: Python-Tracebacks
          combine_field: body
          is_first_entry: body matches "^Traceback"
          combine_with: "\n"
        # Update body field after finishing all parsing
        - from: attributes.log
          to: body
          type: move
```

#### Custom Multiline Log Format

Suppose logs start with a timestamp and include continuation lines prefixed with a special character (e.g., `>`).

**Log Format:**

```shell
2023-03-25 10:00:00 ERROR: An error occurred
> additional info
> more details
2023-03-25 10:05:00 INFO: A new entry starts
```
**Configuration:**
```yaml
config:
  receivers:
    filelog:
      operators:
        # previous operators
        - type: recombine
          id: custom-multiline
          combine_field: body
          is_first_entry: body matches "^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}"
          combine_with: "\n"
        # Update body field after finishing all parsing
        - from: attributes.log
          to: body
          type: move
```
25 changes: 25 additions & 0 deletions charts/logzio-logs-collector/templates/NOTES.txt
@@ -0,0 +1,25 @@
{{- if and (eq .Values.dnsPolicy "None") (not .Values.dnsConfig) }}
{{- fail "[ERROR] dnsConfig should be provided when dnsPolicy is None" }}
{{ end }}

{{/* validate extensions must include health_check */}}
{{- if not (has "health_check" .Values.config.service.extensions) }}
{{ fail "[ERROR] The logzio-logs-collector chart requires the health_check extension to be included in the extension list." }}
{{- end}}

{{- if not .Values.configMap.create }}
[WARNING] "configMap" wil not be created and "config" will not take effect.
{{ end }}

{{- if not .Values.resources }}
[WARNING] No resource limits or requests were set. Consider setting resource requests and limits for your collector(s) via the `resources` field.
{{ end }}

{{- if and (eq .Values.mode "daemonset") (eq .Values.service.internalTrafficPolicy "Cluster") }}
[WARNING] Setting internalTrafficPolicy to 'Cluster' on a DaemonSet is not recommended. Consider using 'Local' instead.
{{ end }}

{{- if and (.Values.useGOMEMLIMIT) (not ((((.Values.resources).limits).memory))) }}
[WARNING] "useGOMEMLIMIT" is enabled but memory limits have not been supplied, which means no GOMEMLIMIT env var was configured but the Memory Ballast Extension was removed. It is highly recommended to only use "useGOMEMLIMIT" when memory limits have been set.
{{ end }}
