Commit 42f3396

soander, Superskyyy, and wu-sheng authored
Support the telegraf receiver plugin module (apache#9620)
* The telegraf receiver plugin module development.
* Refactored code of converting telegraf data and fixed some errors.
* Support Telegraf receiver plugin module.
* Add telegraf-receiver.md, update changes.md and vm-monitoring.md.
* Rename YAML file and change code style.
* Change receiver-telegraf of application.yml.
* The receiver-telegraf e2e test.
* The telegraf receiver e2e test.
* Fix an issue and delete redundant configs.
* Add Unit Test about converting Telegraf metrics.
* Adjust Unit Test about converting Telegraf metrics.
* Add License Header to Unit Test and change binary.xml.
* Change module provider's config initialization mechanism.
* Exclude the telegraf-rules in server-starter pom.xml.
* Fix telegraf e2e test issues.
* Change telegraf e2e test.
* Fix issues
* Fix issues.
* Add Sample convert Unit Test.
* Change vm.yaml, related documents and fix some issues.
* Change backend-vm-monitoring.md.
* Fix SampleConvertTest checkstyle issue.
* Change vm.yaml swap MAL.
* Update menu.yml and binary.xml.
* Update Telegraf Unit test.
* Delete telegraf config package, use meter.analyzer.prometheus package to load config file.
* Reorder telegraf metrics in menu.yml.
* Change e2e, vm, config, linux-service and vm.md.
* Update telegraf.conf file.
* Change url of telegraf.conf file.
* Update e2e.yaml, conf file, menu.yml and delete useless code of provider.
* Update grouping sampleFamily by timestamp and name, and add new UTs.
* Update vm-monitoring.md and .asf.yaml.

Co-authored-by: Superskyyy (ONLINE) <[email protected]>
Co-authored-by: 吴晟 Wu Sheng <[email protected]>
1 parent d32a318 commit 42f3396

File tree

26 files changed

+1561
-16
lines changed

.asf.yaml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,7 @@ github:
3333
- open-telemetry
3434
- zabbix
3535
- ebpf
36+
- telegraf
3637
enabled_merge_buttons:
3738
squash: true
3839
merge: false

.github/workflows/skywalking.yaml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -549,6 +549,8 @@ jobs:
549549
config: test/e2e-v2/cases/vm/zabbix/e2e.yaml
550550
- name: VM Prometheus
551551
config: test/e2e-v2/cases/vm/prometheus-node-exporter/e2e.yaml
552+
- name: VM Telegraf
553+
config: test/e2e-v2/cases/vm/telegraf/e2e.yaml
552554
- name: So11y
553555
config: test/e2e-v2/cases/so11y/e2e.yaml
554556
- name: MySQL Prometheus and slowsql

apm-dist/src/main/assembly/binary.xml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -69,6 +69,7 @@
6969
<include>ui-initialized-templates/*/*.json</include>
7070
<include>lal/*</include>
7171
<include>log-mal-rules/*</include>
72+
<include>telegraf-rules/*</include>
7273
</includes>
7374
<outputDirectory>config</outputDirectory>
7475
</fileSet>

docs/en/setup/backend/backend-vm-monitoring.md

Lines changed: 32 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -3,38 +3,54 @@ SkyWalking leverages Prometheus node-exporter to collect metrics data from the V
33
[OpenTelemetry receiver](opentelemetry-receiver.md) and into the [Meter System](./../../concepts-and-designs/meter.md).
44
VM entity as a `Service` in OAP and on the `Layer: OS_LINUX`.
55

6+
SkyWalking can also receive VM metrics from InfluxDB Telegraf through the [Telegraf receiver](./telegraf-receiver.md).
7+
The Telegraf receiver plugin receives, processes, and converts the metrics, then sends the converted metrics to the [Meter System](./../../concepts-and-designs/meter.md).
8+
The VM entity is represented as a `Service` in OAP, on the `Layer: OS_LINUX`.
9+
610
## Data flow
11+
**For OpenTelemetry receiver:**
712
1. The Prometheus node-exporter collects metrics data from the VMs.
813
2. The OpenTelemetry Collector fetches metrics from node-exporter via Prometheus Receiver and pushes metrics to the SkyWalking OAP Server via the OpenCensus gRPC Exporter or OpenTelemetry gRPC exporter.
914
3. The SkyWalking OAP Server parses the expression with [MAL](../../concepts-and-designs/mal.md) to filter/calculate/aggregate and store the results.
1015

16+
**For Telegraf receiver:**
17+
1. The InfluxDB Telegraf [input plugins](https://docs.influxdata.com/telegraf/v1.24/plugins/) collect various metrics data from the VMs.
18+
2. The cpu, mem, system, disk, and diskio input plugins should be enabled in the `telegraf.conf` file.
19+
3. InfluxDB Telegraf sends `JSON`-format metrics in `HTTP` messages to the Telegraf receiver, which pushes the converted metrics to the SkyWalking OAP Server [Meter System](./../../concepts-and-designs/meter.md).
20+
4. The SkyWalking OAP Server parses the expression with [MAL](../../concepts-and-designs/mal.md) to filter/calculate/aggregate and store the results.
21+
5. For the Telegraf receiver, the `meter_vm_cpu_average_used` metric indicates the average usage of each CPU core.
1122

1223
## Setup
13-
24+
**For OpenTelemetry receiver:**
1425
1. Setup [Prometheus node-exporter](https://prometheus.io/docs/guides/node-exporter/).
1526
2. Setup [OpenTelemetry Collector ](https://opentelemetry.io/docs/collector/). This is an example for OpenTelemetry Collector configuration [otel-collector-config.yaml](../../../../test/e2e-v2/cases/vm/prometheus-node-exporter/otel-collector-config.yaml).
1627
3. Config SkyWalking [OpenTelemetry receiver](opentelemetry-receiver.md).
1728

29+
**For Telegraf receiver:**
30+
1. Set up InfluxDB Telegraf's `telegraf.conf` file according to the [official Telegraf documentation](https://docs.influxdata.com/telegraf/v1.24/).
31+
2. Apply the receiver-specific settings to the `telegraf.conf` file according to the [Telegraf receiver document](telegraf-receiver.md).
32+
3. Configure the SkyWalking [Telegraf receiver](telegraf-receiver.md).
33+
1834
## Supported Metrics
1935

20-
| Monitoring Panel | Unit | Metric Name | Description | Data Source |
21-
|-----|-----|-----|-----|-----|
22-
| CPU Usage | % | cpu_total_percentage | The total percentage usage of the CPU core. If there are 2 cores, the maximum usage is 200%. | Prometheus node-exporter |
23-
| Memory RAM Usage | MB | meter_vm_memory_used | The total RAM usage | Prometheus node-exporter |
24-
| Memory Swap Usage | % | meter_vm_memory_swap_percentage | The percentage usage of swap memory | Prometheus node-exporter |
25-
| CPU Average Used | % | meter_vm_cpu_average_used | The percentage usage of the CPU core in each mode | Prometheus node-exporter |
26-
| CPU Load | | meter_vm_cpu_load1<br />meter_vm_cpu_load5<br />meter_vm_cpu_load15 | The CPU 1m / 5m / 15m average load | Prometheus node-exporter |
27-
| Memory RAM | MB | meter_vm_memory_total<br />meter_vm_memory_available<br />meter_vm_memory_used | The RAM statistics, including Total / Available / Used | Prometheus node-exporter |
28-
| Memory Swap | MB | meter_vm_memory_swap_free<br />meter_vm_memory_swap_total | Swap memory statistics, including Free / Total | Prometheus node-exporter |
29-
| File System Mountpoint Usage | % | meter_vm_filesystem_percentage | The percentage usage of the file system at each mount point | Prometheus node-exporter |
30-
| Disk R/W | KB/s | meter_vm_disk_read,meter_vm_disk_written | The disk read and written | Prometheus node-exporter |
31-
| Network Bandwidth Usage | KB/s | meter_vm_network_receive<br />meter_vm_network_transmit | The network receive and transmit | Prometheus node-exporter |
32-
| Network Status | | meter_vm_tcp_curr_estab<br />meter_vm_tcp_tw<br />meter_vm_tcp_alloc<br />meter_vm_sockets_used<br />meter_vm_udp_inuse | The number of TCPs established / TCP time wait / TCPs allocated / sockets in use / UDPs in use | Prometheus node-exporter |
33-
| Filefd Allocated | | meter_vm_filefd_allocated | The number of file descriptors allocated | Prometheus node-exporter |
36+
| Monitoring Panel | Unit | Metric Name | Description | Data Source |
37+
|------------------------------|------|-------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------|-----------------------------------------------------|
38+
| CPU Usage | % | meter_vm_cpu_total_percentage | The total percentage usage of the CPU core. If there are 2 cores, the maximum usage is 200%. | Prometheus node-exporter<br />Telegraf input plugin |
39+
| Memory RAM Usage | MB | meter_vm_memory_used | The total RAM usage | Prometheus node-exporter<br />Telegraf input plugin |
40+
| Memory Swap Usage | % | meter_vm_memory_swap_percentage | The percentage usage of swap memory | Prometheus node-exporter<br />Telegraf input plugin |
41+
| CPU Average Used | % | meter_vm_cpu_average_used | The percentage usage of the CPU core in each mode | Prometheus node-exporter<br />Telegraf input plugin |
42+
| CPU Load | | meter_vm_cpu_load1<br />meter_vm_cpu_load5<br />meter_vm_cpu_load15 | The CPU 1m / 5m / 15m average load | Prometheus node-exporter<br />Telegraf input plugin |
43+
| Memory RAM | MB | meter_vm_memory_total<br />meter_vm_memory_available<br />meter_vm_memory_used | The RAM statistics, including Total / Available / Used | Prometheus node-exporter<br />Telegraf input plugin |
44+
| Memory Swap | MB | meter_vm_memory_swap_free<br />meter_vm_memory_swap_total | Swap memory statistics, including Free / Total | Prometheus node-exporter<br />Telegraf input plugin |
45+
| File System Mountpoint Usage | % | meter_vm_filesystem_percentage | The percentage usage of the file system at each mount point | Prometheus node-exporter<br />Telegraf input plugin |
46+
| Disk R/W | KB/s | meter_vm_disk_read,meter_vm_disk_written | The disk read and written | Prometheus node-exporter<br />Telegraf input plugin |
47+
| Network Bandwidth Usage | KB/s | meter_vm_network_receive<br />meter_vm_network_transmit | The network receive and transmit | Prometheus node-exporter<br />Telegraf input plugin |
48+
| Network Status | | meter_vm_tcp_curr_estab<br />meter_vm_tcp_tw<br />meter_vm_tcp_alloc<br />meter_vm_sockets_used<br />meter_vm_udp_inuse | The number of TCPs established / TCP time wait / TCPs allocated / sockets in use / UDPs in use | Prometheus node-exporter<br />Telegraf input plugin |
49+
| Filefd Allocated | | meter_vm_filefd_allocated | The number of file descriptors allocated | Prometheus node-exporter |
3450

3551
## Customizing
3652
You can customize your own metrics/expression/dashboard panel.
37-
The metrics definition and expression rules are found in `/config/otel-rules/vm.yaml`.
53+
The metrics definition and expression rules are found in `/config/otel-rules/vm.yaml` and `/config/telegraf-rules/vm.yaml`.
3854
The dashboard panel configurations are found in `/config/ui-initialized-templates/os_linux`.
3955

4056
## Blog
Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
1+
# Telegraf receiver
2+
3+
The Telegraf receiver accepts metrics from InfluxDB Telegraf and feeds them into the meter system.
4+
The OAP loads the rule configuration at bootstrap. The files are located at `$CLASSPATH/telegraf-rules`.
5+
If a configuration file is not well-formed, the OAP may fail to start up.
6+
7+
See the [InfluxDB Telegraf](https://docs.influxdata.com/telegraf/v1.24/) documentation for details on Telegraf itself.
8+
The Telegraf receiver can handle Telegraf's [CPU Input Plugin](https://github.com/influxdata/telegraf/blob/release-1.24/plugins/inputs/cpu/README.md)
9+
and [Memory Input Plugin](https://github.com/influxdata/telegraf/blob/release-1.24/plugins/inputs/mem/README.md).
10+
11+
Telegraf offers many other input plugins, and users can write custom rule files for them.
12+
Each rule file should be in YAML format, following the scheme described in [MAL](../../concepts-and-designs/mal.md).
13+
Please see the [Telegraf plugin directory](https://docs.influxdata.com/telegraf/v1.24/plugins/) for more information about input plugins.
14+
15+
**Notice:**
16+
* The Telegraf receiver module uses `HTTP` to receive Telegraf's metrics,
17+
so the output must be configured as `[[outputs.http]]` in the `telegraf.conf` file.
18+
Please see the [http outputs](https://github.com/influxdata/telegraf/blob/release-1.24/plugins/outputs/http/README.md)
19+
for more details.
20+
21+
* The Telegraf receiver module **only** processes Telegraf's `JSON` metrics format,
22+
so set `data_format = "json"` in the `telegraf.conf` file.
23+
Please see the [JSON data format](https://docs.influxdata.com/telegraf/v1.24/data_formats/output/json/)
24+
for more details.
25+
26+
* The default `json_timestamp_units` in JSON output is one second,
27+
and the Telegraf receiver module **only** processes timestamps in `second` units.
28+
If users set `json_timestamp_units` explicitly in the `telegraf.conf` file, use `json_timestamp_units = "1s"`.
29+
Please see the [JSON data format](https://docs.influxdata.com/telegraf/v1.24/data_formats/output/json/)
30+
for more details.
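
The notices above can be illustrated with a small sketch. Telegraf's JSON output carries `name`, `tags`, `fields`, and a `timestamp` per metric; the code below flattens one such metric into per-field samples. The class `TelegrafJsonSketch`, the `<name>_<field>` naming, and the conversion from seconds to milliseconds are assumptions made for illustration, not necessarily what the receiver in this commit does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: flattens one Telegraf JSON metric into per-field samples.
public class TelegrafJsonSketch {

    /** One flattened sample: name, labels, numeric value, and epoch milliseconds. */
    static final class Sample {
        final String name;
        final Map<String, String> labels;
        final double value;
        final long timestampMillis;

        Sample(String name, Map<String, String> labels, double value, long timestampMillis) {
            this.name = name;
            this.labels = labels;
            this.value = value;
            this.timestampMillis = timestampMillis;
        }

        @Override
        public String toString() {
            return name + labels + " " + value + " @" + timestampMillis;
        }
    }

    // Assumption: each field becomes "<metric name>_<field name>", and the
    // second-resolution timestamp is widened to milliseconds.
    static List<Sample> toSamples(String name, Map<String, String> tags,
                                  Map<String, Number> fields, long timestampSeconds) {
        long millis = timestampSeconds * 1000L;
        List<Sample> samples = new ArrayList<>();
        for (Map.Entry<String, Number> field : fields.entrySet()) {
            samples.add(new Sample(name + "_" + field.getKey(), tags,
                                   field.getValue().doubleValue(), millis));
        }
        return samples;
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("host", "vm-1");
        Map<String, Number> fields = new LinkedHashMap<>();
        fields.put("usage_idle", 98.5);
        fields.put("usage_user", 1.2);
        for (Sample s : toSamples("cpu", tags, fields, 1458229140L)) {
            System.out.println(s);
        }
    }
}
```

A timestamp sent in any unit other than seconds would be misread by a conversion like this, which is one way to see why the receiver restricts `json_timestamp_units` to `1s`.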
31+
32+
The following is the default Telegraf receiver configuration in `application.yml`.
33+
Set `SW_RECEIVER_TELEGRAF:default` through the system environment, or change `SW_RECEIVER_TELEGRAF_ACTIVE_FILES:vm`,
34+
to activate the Telegraf receiver with `vm.yaml` in `telegraf-rules`.
35+
```yaml
36+
receiver-telegraf:
37+
  selector: ${SW_RECEIVER_TELEGRAF:default}
38+
  default:
39+
    activeFiles: ${SW_RECEIVER_TELEGRAF_ACTIVE_FILES:vm}
40+
```
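
The `${NAME:default}` placeholders in the configuration above read as "use the environment variable `NAME` if it is set, otherwise fall back to `default`". The following is a minimal sketch of that semantics only; the class `PlaceholderSketch` and its `resolve` method are hypothetical and do not reproduce the OAP's own configuration loader.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of "${ENV_NAME:default}" resolution.
public class PlaceholderSketch {

    // Matches ${NAME:default}; group 1 is the variable name, group 2 the fallback.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{([A-Za-z0-9_]+):([^}]*)\\}");

    static String resolve(String value, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Prefer the environment value; otherwise use the inline default.
            String replacement = env.getOrDefault(m.group(1), m.group(2));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "${SW_RECEIVER_TELEGRAF_ACTIVE_FILES:vm}";
        System.out.println(resolve(raw, System.getenv()));
    }
}
```

With no environment variable set, the sketch resolves `activeFiles` to `vm`, which matches the rule file listed in the table below.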
41+
42+
| Rule Name | Description | Configuration File | Data Source |
43+
|-----------|----------------|------------------------|-------------------------------------------------------------------------|
44+
| vm | Metrics of VMs | telegraf-rules/vm.yaml | Telegraf input plugins --> Telegraf Receiver --> SkyWalking OAP Server  |
45+

docs/menu.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -119,6 +119,8 @@ catalog:
119119
path: "/en/setup/backend/backend-zabbix"
120120
- name: "Meter Analysis"
121121
path: "/en/setup/backend/backend-meter"
122+
- name: "Telegraf Metrics"
123+
path: "/en/setup/backend/telegraf-receiver"
122124
- name: "Apdex Threshold"
123125
path: "/en/setup/backend/apdex-threshold"
124126
- name: "Spring Sleuth Metrics Analysis"

oap-server/server-receiver-plugin/pom.xml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -46,6 +46,7 @@
4646
<module>skywalking-event-receiver-plugin</module>
4747
<module>skywalking-zabbix-receiver-plugin</module>
4848
<module>skywalking-ebpf-receiver-plugin</module>
49+
<module>skywalking-telegraf-receiver-plugin</module>
4950
</modules>
5051

5152
<dependencies>
Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
<?xml version="1.0" encoding="UTF-8"?>
2+
<!--
3+
~ Licensed to the Apache Software Foundation (ASF) under one or more
4+
~ contributor license agreements. See the NOTICE file distributed with
5+
~ this work for additional information regarding copyright ownership.
6+
~ The ASF licenses this file to You under the Apache License, Version 2.0
7+
~ (the "License"); you may not use this file except in compliance with
8+
~ the License. You may obtain a copy of the License at
9+
~
10+
~ http://www.apache.org/licenses/LICENSE-2.0
11+
~
12+
~ Unless required by applicable law or agreed to in writing, software
13+
~ distributed under the License is distributed on an "AS IS" BASIS,
14+
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
~ See the License for the specific language governing permissions and
16+
~ limitations under the License.
17+
~
18+
-->
19+
20+
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
21+
<parent>
22+
<artifactId>server-receiver-plugin</artifactId>
23+
<groupId>org.apache.skywalking</groupId>
24+
<version>9.3.0-SNAPSHOT</version>
25+
</parent>
26+
<modelVersion>4.0.0</modelVersion>
27+
28+
<artifactId>skywalking-telegraf-receiver-plugin</artifactId>
29+
<packaging>jar</packaging>
30+
31+
<dependencies>
32+
<dependency>
33+
<groupId>org.apache.skywalking</groupId>
34+
<artifactId>agent-analyzer</artifactId>
35+
<version>${project.version}</version>
36+
</dependency>
37+
<dependency>
38+
<groupId>org.apache.skywalking</groupId>
39+
<artifactId>skywalking-sharing-server-plugin</artifactId>
40+
<version>${project.version}</version>
41+
</dependency>
42+
</dependencies>
43+
44+
</project>
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,33 @@
1+
/*
2+
* Licensed to the Apache Software Foundation (ASF) under one or more
3+
* contributor license agreements. See the NOTICE file distributed with
4+
* this work for additional information regarding copyright ownership.
5+
* The ASF licenses this file to You under the Apache License, Version 2.0
6+
* (the "License"); you may not use this file except in compliance with
7+
* the License. You may obtain a copy of the License at
8+
*
9+
* http://www.apache.org/licenses/LICENSE-2.0
10+
*
11+
* Unless required by applicable law or agreed to in writing, software
12+
* distributed under the License is distributed on an "AS IS" BASIS,
13+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14+
* See the License for the specific language governing permissions and
15+
* limitations under the License.
16+
*
17+
*/
18+
19+
package org.apache.skywalking.oap.server.receiver.telegraf.module;
20+
21+
import org.apache.skywalking.oap.server.library.module.ModuleDefine;
22+
23+
public class TelegrafReceiverModule extends ModuleDefine {
24+
25+
public TelegrafReceiverModule() {
26+
super("receiver-telegraf");
27+
}
28+
29+
@Override
30+
public Class[] services() {
31+
return new Class[0];
32+
}
33+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
1+
/*
2+
* Licensed to the Apache Software Foundation (ASF) under one or more
3+
* contributor license agreements. See the NOTICE file distributed with
4+
* this work for additional information regarding copyright ownership.
5+
* The ASF licenses this file to You under the Apache License, Version 2.0
6+
* (the "License"); you may not use this file except in compliance with
7+
* the License. You may obtain a copy of the License at
8+
*
9+
* http://www.apache.org/licenses/LICENSE-2.0
10+
*
11+
* Unless required by applicable law or agreed to in writing, software
12+
* distributed under the License is distributed on an "AS IS" BASIS,
13+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14+
* See the License for the specific language governing permissions and
15+
* limitations under the License.
16+
*
17+
*/
18+
19+
package org.apache.skywalking.oap.server.receiver.telegraf.provider;
20+
21+
import lombok.Getter;
22+
import lombok.Setter;
23+
import org.apache.skywalking.oap.server.core.Const;
24+
import org.apache.skywalking.oap.server.library.module.ModuleConfig;
25+
26+
@Setter
27+
@Getter
28+
public class TelegrafModuleConfig extends ModuleConfig {
29+
30+
public static final String CONFIG_PATH = "telegraf-rules";
31+
32+
/**
33+
* Active receiver rule files, separated by ",".
34+
*/
35+
private String activeFiles = Const.EMPTY_STRING;
36+
37+
}
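
Since `activeFiles` holds a comma-separated list of rule file names, a consumer has to split it before loading anything from `telegraf-rules`. A minimal sketch of that step follows; `ActiveFilesSketch` and `parseActiveFiles` are hypothetical names, and the trimming and empty-entry handling are assumptions rather than the provider's actual behavior.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: turns an activeFiles value into a list of rule names.
public class ActiveFilesSketch {

    static List<String> parseActiveFiles(String activeFiles) {
        if (activeFiles == null || activeFiles.trim().isEmpty()) {
            // Mirrors the empty-string default: nothing is active.
            return Collections.emptyList();
        }
        return Arrays.stream(activeFiles.split(","))
                     .map(String::trim)
                     .filter(s -> !s.isEmpty())
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(parseActiveFiles("vm, custom-rule"));
    }
}
```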
