Releases: aerospike/aerospike-monitoring

Aerospike Monitoring v2.8.0

20 Sep 17:12
da6dde6

Description

NOTE: The v2.8.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.

  • This release includes 1 major feature: Connector dashboards, alerts and topology.
  • Aerospike Monitoring Stack version 2.8.0 adds 2 dashboards, alerts and bug fixes:
    • 2 dashboards to monitor connectors and connector JVM metrics.
    • Enhanced alerts covering key connector metric thresholds and JVM health.

NOTE:

  • Aerospike Prometheus exporter 1.13.0 or greater must be used to get the Aerospike 6.4 metrics.
  • The Multi-Cluster View dashboard now requires the Diagram Panel plugin.
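
The exporter version requirement above is easy to get wrong with plain string comparison, since "1.9.2" sorts after "1.13.0" as text. A minimal Python sketch of a numeric check follows; the helper names `parse_version` and `exporter_supports` are hypothetical and not part of any Aerospike tool:

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '1.13.0' into (1, 13, 0)."""
    return tuple(int(part) for part in version.strip().lstrip("v").split("."))


def exporter_supports(exporter_version: str, minimum: str = "1.13.0") -> bool:
    """Return True if exporter_version is at least the required minimum."""
    return parse_version(exporter_version) >= parse_version(minimum)


print(exporter_supports("1.13.0"))  # meets the minimum
print(exporter_supports("1.9.2"))   # older than 1.13.0 when compared numerically
```

Comparing tuples of integers keeps 1.13.0 newer than 1.9.2, which a lexical comparison would reverse.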

Features

  • [OM-64] - Create predefined Prometheus alert rules for Connectors.
    • This release includes 6 alerts covering the functional and process health of the Connectors.
      • Key alerts covered are connector-status, connector-request-lag, connector-request-errors, jvm heap, jvm cpu and jvm gc.
  • [OM-56] - Connectors alerts & Dashboards
    • Connector view dashboard which helps to monitor 6 connectors.
      • Connectors supported are xdr-proxy, kafka-outbound, pulsar-outbound, esp-outbound, elastic-search and jms-outbound.
      • Key metrics covered are request lag, request error, success, skipped, connections, xdr record byte size, etc.
  • [OM-107] - Create a dashboard for a Connector(s)
    • Connector JVM view dashboard which helps to monitor JVM health of 6 Connectors.
      • Connectors supported are xdr-proxy, kafka-outbound, pulsar-outbound, esp-outbound, elastic-search and jms-outbound.
      • Key metrics covered are uptime, cpu, memory, threads, files, classes and buffers.
    • Multi-cluster view dashboard is enhanced to display the Aerospike Server topology using the cluster-name and xdr dc configurations.
      NOTE: To view the data replication topology in the Multi-cluster view, cluster-name is mandatory, and the destination cluster-name must be configured as the dc name in the xdr section of the Aerospike Server configuration.
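
To illustrate the naming rule in the note above, here is a minimal Python sketch of how replication edges could be derived when each xdr dc name matches the destination's cluster-name. The input dictionary, the cluster names and the function `replication_edges` are hypothetical illustrations, not the dashboard's actual implementation:

```python
def replication_edges(xdr_dcs_by_cluster: dict) -> list:
    """Return (source, destination) pairs for dc names that match a known cluster-name."""
    known = set(xdr_dcs_by_cluster)
    edges = []
    for source, dc_names in sorted(xdr_dcs_by_cluster.items()):
        for dc in dc_names:
            # A dc only resolves to a cluster in the topology if its name
            # equals the cluster-name of some monitored cluster.
            if dc in known:
                edges.append((source, dc))
    return edges


# Hypothetical cluster-names mapped to the dc names in their xdr sections.
clusters = {
    "east": ["west"],
    "west": ["east", "backup-dc"],  # "backup-dc" matches no cluster-name
}
print(replication_edges(clusters))
```

In this sketch the "backup-dc" entry produces no edge, which mirrors why a dc name that differs from the destination's cluster-name would not appear in a topology view.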

Fixes

  • [OM-122] - Avoid duplicate defrag metric values on the namespace dashboard.
  • [OM-113] - Namespace view dashboard - average objects per sprig stat.
  • [OM-120] - Add high-water mark breached to the Rolling Restart dashboard.

Aerospike Monitoring v2.7.0

28 Aug 14:38
f57d636

Description

NOTE: The v2.7.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.

  • This release includes 2 major features: Enhanced Alerts and the All Flash use-case dashboard.
  • Aerospike Monitoring Stack version 2.7.0 adds a new dashboard and bug fixes:
    • All Flash dashboard, which shows key metrics to monitor when using flash storage for both the index and sindex.
    • Enhanced alerts covering various aspects of server metrics. This release covers alerts on Namespaces, XDR, Latencies, best-practices checks, node-exporter, etc.

NOTE:

  • Aerospike Prometheus exporter 1.13.0 or greater must be used to get the Aerospike 6.4 metrics.

Features

  • [OM-104] - Add new XDR bytes-shipped metrics to dashboards.
    • Displays bytes-shipped as both a stat and a time series, which helps monitor replication progress.
  • [OM-98] - Observability & Management Alerts - Enhance / enrich prometheus alerts from ACMS.
    • This release includes 40 alerts covering various metrics of Aerospike Server. Some key areas are:
      • Namespaces, latencies, data replication (xdr), sets, node-exporter, flash, best-practices checks, etc.
  • [OM-93] - Use-case Dashboard: all-flash.
    • This release introduces a new use-case dashboard that focuses on key metrics and alerts related to flash usage.
      • Some key metrics are average objects per sprig, index-pressure, primary index flash and secondary index flash.
  • [OM-48] - Use-case Dashboard Organization & Naming.
    • Added brief descriptions on each dashboard and updated tags to identify each dashboard easily.
  • [OM-111] - Observability dashboard unit tests.
    • Created a framework to automatically test our dashboards, including panels, expressions/queries, layout and expression results.
  • [OM-103] - Add user stat related alerts.
    • Added user stat specific alerts covering connections, connection churn, etc.
  • [OM-101] - Add warning for best practice failures.
    • Alerts if best practices are not followed when setting up the Aerospike server. The server sends this flag after a series of checks.
  • [OM-102] - Add warning for node-exporter not being present.
    • As a precursor to integrating node-exporter metrics into the Aerospike Monitoring stack, this alert raises a warning in the Alerts View dashboard if node-exporter is not configured.

Aerospike Monitoring v2.6.1

03 Aug 13:51
2fecad6

Description

  • The v2.6.1 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.
  • Aerospike Monitoring Stack version 2.6.1 adds bug fixes.

NOTE:

  • Aerospike Prometheus exporter 1.12.0 or greater must be used to get the Aerospike 6.3 metrics.
  • Deprecated
    • Existing Alerts dashboard is deprecated and will be removed in future releases.
    • Existing Jobs dashboard is deprecated and will be removed in future releases.

Fixes

  • [OM-100] - Issues in Multi-cluster view dashboard
    -- Corrected label and unit in XDR panel.
    -- Corrected links from XDR and Latencies to point to their respective dashboards (instead of cluster-view).
    -- Added an alert-severity based filter.
  • Issues in Alerts view
    -- Corrected panel colors to match the severity types.
  • Issues in Unique Data view
    -- Fixed unique data bytes not being shown correctly when custom labels are enabled in configuration.
    -- Added historical time-series for the unique data-bytes data point.

Aerospike Monitoring v2.6.0

12 Jul 12:25

Description

NOTE: The v2.6.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.

  • This release eliminates instances of hard coded values for variables. As a result, the user needs to ensure that the Aerospike Prometheus data source is selected as a default in order for dashboard data to populate correctly.
  • Aerospike Monitoring Stack version 2.6.0 adds new dashboards and bug fixes:
    • Rolling restarts dashboard, which shows key metrics to monitor during specific use cases.
    • Alerts View dashboard, which adopts more meaningful alert severity levels.

NOTE:

  • Aerospike Prometheus exporter 1.12.0 or greater must be used to get the Aerospike 6.3 metrics.
  • Deprecated
    • Existing Alerts dashboard is deprecated and will be removed in future releases.
    • Existing Jobs dashboard is deprecated and will be removed in future releases.

Features

  • OM-79 - Rolling Restarts dashboard. Data is shown in groups such as stats, errors and resources.

    • This dashboard curates key metrics which should be monitored during specific use cases, such as:
      • Node restart
      • Software upgrade
      • Investigation
      • etc.
    • Resource utilization is displayed for the TopK major consumers at a service and namespace level.
  • OM-85 - Added the new Alerts view dashboard. This visualizes alerts by severity, both as counts and as individual alerts.

    • Newly adopted alert levels in decreasing order
      • critical, error, warn and info.
    • This dashboard replaces the existing Alerts dashboard.
  • OM-82 - All Aerospike dashboards and panel visualizations are modified according to the Grafana 9.x version.

  • OM-49 - Improved and reorganized Aerospike Monitoring stack examples

    • Reorganized the docker compose file into the relevant folder.
    • Added examples of how to use AeroLab, which can spin up Aerospike clusters for Proof of Concept (POC) needs.
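
The severity ordering adopted by the Alerts view dashboard (critical, error, warn and info, in decreasing order) can be sketched in Python as follows. The alert dictionaries, their `name` and `severity` keys, and the example alert names are assumptions for illustration, not the stack's actual data model:

```python
from collections import Counter

# Severity levels from the release notes, in decreasing order.
SEVERITY_ORDER = ["critical", "error", "warn", "info"]
RANK = {level: index for index, level in enumerate(SEVERITY_ORDER)}


def sort_by_severity(alerts: list) -> list:
    """Order alerts from most to least severe; unknown levels go last."""
    return sorted(alerts, key=lambda a: RANK.get(a["severity"], len(RANK)))


def count_by_severity(alerts: list) -> dict:
    """Count alerts per severity level, including levels with zero alerts."""
    counts = Counter(a["severity"] for a in alerts)
    return {level: counts.get(level, 0) for level in SEVERITY_ORDER}


# Hypothetical alerts, named only for this example.
alerts = [
    {"name": "example-info", "severity": "info"},
    {"name": "example-critical", "severity": "critical"},
    {"name": "example-warn", "severity": "warn"},
]
print([a["name"] for a in sort_by_severity(alerts)])
print(count_by_severity(alerts))
```

The two helpers correspond to the two ways the dashboard is described as presenting alerts: as a count per severity and as individual alerts.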

Fixes

  • OM-82 - Includes bug fixes related to queries and visualizations
    • All queries now include a proper regex pattern to honor single or multiple value template variable selection.
    • All time-series panels are adjusted to use range vectors.
    • All dashboards have standardized template variables in the same order.
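
The regex fix above can be illustrated with a small Python sketch: a regex label matcher (`=~`) accepts either one selected value or an alternation of several, whereas an equality matcher only works for a single value. The label name `ns`, the helper names and the example values are hypothetical, and the sketch assumes the selected values contain no regex metacharacters:

```python
import re


def label_matcher(label: str, selected: list) -> str:
    """Build a regex label matcher that works for one or many selected values."""
    pattern = "|".join(selected)
    return f'{label}=~"{pattern}"'


def matches(pattern: str, value: str) -> bool:
    """Mimic a fully anchored regex match, as Prometheus applies to label matchers."""
    return re.fullmatch(pattern, value) is not None


print(label_matcher("ns", ["test"]))
print(label_matcher("ns", ["test", "bar"]))
print(matches("test|bar", "bar"))
print(matches("test|bar", "bartender"))
```

Because the match is anchored, "bartender" is not selected by the pattern for "test" and "bar".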

Aerospike Monitoring v2.5.0

20 Jun 04:17
a94b10d

Description

NOTE: The v2.5.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.

  • Aerospike Monitoring Stack version 2.5.0 adds the new Multi Cluster view dashboard, Otel integration examples and bug fixes.
    NOTE: Aerospike Prometheus exporter 1.11.0 or greater must be used to get the Aerospike 6.3 metrics.

Features

  • OM-45 - Added the new Multi cluster view dashboard. This visualizes multiple clusters across regions and data centers with a focus on health. This dashboard consists of 4 panels.

    • Geomap panel - displays multiple cluster view.
    • Cluster panel - displays key metrics like size, alerts, XDR lag, Read & Write latencies.
    • Node panel - uses the Polystat plugin and displays nodes in Green or Red indicating the health.
    • Namespace panel - displays namespaces in Green or Red indicating the health.
      Key metrics used in this dashboard
      - aerospike_node_up
      - aerospike_namespace_objects
      - aerospike_node_stats_cluster_size
      - aerospike_xdr_lag
      - aerospike_latencies_write_ms_bucket
      - aerospike_latencies_read_ms_bucket
  • OM-60 - Added new examples on how to integrate Aerospike prometheus exporter with the Otel collector and export metrics to a partner solution.
    Partner integration examples are provided for NewRelic, Datadog and Cloudwatch.

Fixes

  • OM-76 - In the Namespace dashboard, the Defrag row hides anomalies as a result of aggregation.
    • Removed the Defrag row and its aggregation. The defrag panels are moved to the namespace row to display defrag metrics for each namespace.

Aerospike Monitoring v2.4.0

16 May 07:32
44dcecd

Description

NOTE: the v2.4.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.

  • Aerospike Monitoring Stack version 2.4.0 adds support for metrics introduced in Aerospike 6.3.
    NOTE: Aerospike Prometheus exporter 1.11.0 or greater must be used to get the Aerospike 6.3 metrics.

Features

Fixes

Aerospike Monitoring v2.3.1

19 Apr 07:00
6c9c1a6

Fixes

  • [OM-37] - Issues in Set view, Unique data view, Sindex view, Namespace view and Node view:
    - Fixed issue in "Set view" dashboard to remove hardcoded datasource.
    - Re-exported Set view, Unique data view, Sindex view, Namespace view and Node view dashboards with the right configurations so they are suitable to be made available in Grafana Cloud.

Aerospike Monitoring v2.3.0

03 Apr 13:19
239b02f

Description

NOTE: the v2.3.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.

  • Aerospike Monitoring Stack version 2.3.0 adds support for metrics introduced in Aerospike 6.3.
    NOTE: Aerospike Prometheus exporter 1.10.0 or greater must be used to get the Aerospike 6.3 metrics.

Features

  • Added 6.3 metrics:
    • Adds aerospike_sindex_used_bytes secondary index metric.
    • Adds aerospike_namespace_nsup_cycle_deleted_pct NSUP metric.
    • Adds aerospike_sets_stop_writes_size set level configuration.
  • Updated the memory used panel in the secondary index dashboard to consider either aerospike_sindex_used_bytes or aerospike_sindex_memory_used, because aerospike_sindex_memory_used is deprecated in Aerospike 6.3.
  • Added nsup metrics panel to Namespace view dashboard.
  • Added set level quotas panel to Namespace view dashboard.
  • Added a new dashboard displaying set level metrics.
  • Added a new dashboard displaying unique data usage.
  • Added 4 new Prometheus alerts:
    • NamespaceSupervisorFallingBehind when NSUP is falling behind; it can also display how long the most recent NSUP cycle lasted.
    • NamespaceFreeMemoryCloseToStopWrites when memory on one of your Aerospike nodes is close to the stop-writes limit configured for a namespace.
    • NamespaceSetQuotaWarning when one of your Aerospike nodes is at 80% of the quota you have configured on a set.
    • NamespaceSetQuotaAlert when one of your Aerospike nodes is at 99% of the quota you have configured on a set.
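
The two set quota thresholds above (80% and 99%) can be sketched as a small Python function. This only illustrates the threshold logic; it is not the alert rule shipped with the stack. The function name `quota_alert` is hypothetical, and treating a quota of 0 as "no quota configured" is an assumption:

```python
from typing import Optional

WARNING_PCT = 80  # NamespaceSetQuotaWarning threshold from the release notes
ALERT_PCT = 99    # NamespaceSetQuotaAlert threshold from the release notes


def quota_alert(used_bytes: int, quota_bytes: int) -> Optional[str]:
    """Return the name of the alert that would apply, or None if below 80%."""
    if quota_bytes <= 0:
        # Assumption: a zero quota means no quota is configured on the set.
        return None
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    if used_bytes * 100 >= quota_bytes * ALERT_PCT:
        return "NamespaceSetQuotaAlert"
    if used_bytes * 100 >= quota_bytes * WARNING_PCT:
        return "NamespaceSetQuotaWarning"
    return None


print(quota_alert(850, 1000))
print(quota_alert(995, 1000))
print(quota_alert(100, 1000))
```

The higher threshold is checked first so that a set at 99% reports the alert rather than the warning.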

Aerospike Monitoring v2.2.0

26 Aug 23:22
148141c

Description

NOTE: the v2.2.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.

  • Aerospike Monitoring Stack version 2.2.0 adds support for metrics introduced in Aerospike 6.1.
    NOTE: Aerospike Prometheus exporter 1.8.0 or greater must be used to get the Aerospike 6.1 metrics.

Features

  • [TOOLS-2087] - Add server 6.1 metrics.
    • Adds aerospike_xdr_bytes_shipped.
    • Adds aerospike_sindex_entries_per_bval.
    • Adds aerospike_sindex_entries_per_rec.
  • [TOOLS-2132] - Replace latency panels with heat map and percentiles.
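
As background for the percentile panels, the following Python sketch shows how a percentile can be estimated from cumulative histogram buckets using linear interpolation within a bucket, in the spirit of Prometheus's `histogram_quantile`. It is a simplified illustration that ignores the `+Inf` bucket, not the dashboards' actual query; the function name and the bucket values are hypothetical:

```python
def quantile_from_buckets(q: float, buckets: list):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets."""
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return None  # no observations, so no estimate
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            in_bucket = count - lower_count
            if in_bucket == 0:
                return upper_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return buckets[-1][0]


# Hypothetical cumulative latency buckets: (upper bound in ms, cumulative count).
latency_buckets = [(1, 50), (2, 80), (4, 95), (8, 100)]
print(quantile_from_buckets(0.5, latency_buckets))
print(quantile_from_buckets(0.9, latency_buckets))
```

The estimate is only as precise as the bucket boundaries, which is one reason a heat map of the raw buckets is a useful companion to percentile lines.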

Aerospike Monitoring v2.1.0

29 Aug 21:38
dc39fef

Description

NOTE: the v2.1.0 Grafana dashboards are not backwards compatible with servers older than 6.0.0.0.

  • Add support for the batch-index latency metrics aerospike_latencies_batch_index_us_bucket and aerospike_latencies_batch_index_us_count.
    NOTE: Aerospike Prometheus exporter 1.7.0 or greater must be used to get the batch-index latency metrics.

Features

  • [TOOLS-2069] - Add batch-index latency panels.