Infrastructure monitoring #87

rodecker · 2020-06-08T15:49:24Z

We need some kind of monitoring system that sends email when Ring infrastructure servers or services are down. Monitoring of hosts and services should be configured automatically when they are added to Ansible.
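To make the "configured automatically from Ansible" part concrete, here is a minimal sketch of one way it could work, assuming the Prometheus-style setup suggested further down in this thread and its file-based service discovery. The inventory contents, the `file_sd_targets` helper, and the `ansible_group` label are hypothetical; the only fixed values are node_exporter's default port (9100) and the `targets`/`labels` layout of a file_sd entry.

```python
import json

# node_exporter listens on port 9100 by default.
NODE_EXPORTER_PORT = 9100


def file_sd_targets(inventory, port=NODE_EXPORTER_PORT):
    """Convert {group: [hostnames]} into Prometheus file_sd target groups.

    `inventory` is a hypothetical, already-parsed view of the Ansible
    inventory; how it is obtained is left out of this sketch.
    """
    groups = []
    for group, hosts in sorted(inventory.items()):
        if not hosts:
            continue  # skip empty groups, they would produce no targets
        groups.append(
            {
                "targets": [f"{host}:{port}" for host in sorted(hosts)],
                "labels": {"ansible_group": group},  # hypothetical label name
            }
        )
    return groups


# Hypothetical inventory, for illustration only.
inventory = {
    "ring_nodes": ["node02.example.net", "node01.example.net"],
    "infra": ["db01.example.net"],
}

print(json.dumps(file_sd_targets(inventory), indent=2))
```

Written to a file that a `file_sd_configs` entry points at, output like this would let new hosts be picked up without editing the scrape configuration by hand. An Ansible template task could produce the same file directly.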

rodecker · 2020-06-08T15:49:59Z

Icinga, another Nagios fork, or something else entirely?

leoluk · 2020-06-08T18:51:39Z

Prometheus with Alertmanager :)

isodude · 2021-06-13T19:08:57Z

Telegraf + VictoriaMetrics was really nice to set up.
Either have Telegraf send Influx data to VictoriaMetrics, or let VictoriaMetrics fetch Prometheus-format metrics from Telegraf.

I also added MTR support to my Telegraf fork, which made it easy to get nice graphs in Grafana of how hops evolve over time. This could be especially useful for the Ring.

Let me know if it's of interest.

leoluk · 2021-06-13T19:47:35Z

For monitoring (vs. telemetry), the combination of Prometheus, node_exporter and Alertmanager is hard to beat.

isodude · 2021-06-14T10:38:13Z

I tried node_exporter first, but the 'everything shall be run on a different port' theme did not sit well with me.

Here is how it works: Telegraf, which by the way has excellent out-of-the-box support for most things and can execute custom binaries that emit different formats (influx, json, simple, etc.), exports its data through an output plugin in Prometheus format. VictoriaMetrics then pulls the data. You can still run Alertmanager as you normally would, or use their own vmalert: https://docs.victoriametrics.com/vmalert.html.
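For readers unfamiliar with what "Prometheus format" looks like on the wire, here is a small sketch that parses a sample of the text exposition format, i.e. the kind of payload a scraper would pull from Telegraf. The sample metrics and the `parse_exposition` helper are hypothetical, and the parser assumes samples carry no trailing timestamp (it is an illustration, not a full implementation of the format).

```python
def parse_exposition(text):
    """Parse Prometheus text exposition lines into {series: value}.

    Simplification: assumes each sample line is `<series> <value>` with
    no optional trailing timestamp. Comment lines (# HELP / # TYPE) and
    blank lines are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field; splitting from
        # the right keeps label values that contain spaces intact.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical sample of what a scrape might return.
SAMPLE = """\
# HELP cpu_usage_idle Telegraf collected metric
# TYPE cpu_usage_idle gauge
cpu_usage_idle{cpu="cpu-total",host="node01"} 97.5
# TYPE mem_used_percent gauge
mem_used_percent{host="node01"} 41.25
"""

for series, value in parse_exposition(SAMPLE).items():
    print(series, value)
```

Because the payload is plain text over HTTP, anything that can scrape Prometheus targets can consume it, which is why either Prometheus or VictoriaMetrics can sit on the pulling side.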

At the same time, you get the same features as Thanos, such as long-term storage.

I had a look, and a fairly recent VictoriaMetrics is available straight from the repo. If MTR support is wanted, however, I would need to build Telegraf from my own fork. I also improved the TLS client certificate support a bit, which means you could use client certificates between all nodes for transporting data.

So, in short, node_exporter + Alertmanager is technically equivalent to Telegraf + VictoriaMetrics.

isodude · 2021-09-27T07:53:40Z

If people prefer running Prometheus, maybe this is interesting? https://opensourcelibs.com/lib/network_exporter
