
[Release] logzio APM Collector 1.0.0 #561

Open · wants to merge 16 commits into base: master
Conversation

8naama
Contributor

@8naama 8naama commented Nov 19, 2024

Description

As part of the unified Helm chart effort, this is the new chart for APM.

  • Changes from logzio-telemetry
    • Drop support for DaemonSet mode; in logzio-telemetry it is required for per-node metrics scraping, which is not needed for traces
    • support enabling autoscaling for SPM via VPA
    • Secret values are read from .Values.global.<secretParam> section. This change provides two key benefits:
      • Enables the definition of tokens in the parent chart without duplication
      • Preserves the ability of each sub chart to control the values of its secrets
    • OTEL configuration changes:
      • Merge the latency and calls configuration, which renames the span.name dimension to operation
      • Add the _seconds prefix back to Service Graph metrics (it was dropped in v109), following a product request to keep metric naming as it was
      • Add the connector.spanmetrics.legacyMetricNames feature gate to keep the old SPM metric names (which were changed in v109 to match the Service Graph metric names), following a product request
    • Fine-tune ClusterRole permissions and drop the cluster-admin role permissions
    • Add support for easily changing the OTEL Collector logging level via an otelLogLevel flag
    • Listener address for SPM is automatically generated based on the region, for better user experience
    • Move all custom configurations to be generated in .tpl files (_helpers and _config)
    • Add resource limitations to the pods
  • Aligned relevant tests
  • Added readme
    • add a custom tail sampling explanation section
    • moved the changelog to its own CHANGELOG.md file to keep the readme cleaner
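The `.Values.global.<secretParam>` pattern described above could look roughly like this; the key names below are illustrative, not necessarily the chart's actual parameters:

```yaml
# Parent chart values.yaml: tokens are defined once under `global`,
# so every sub chart can read them without duplication
# (key names here are hypothetical).
global:
  logzioRegion: us
  logzioTracesToken: "<<TRACES-SHIPPING-TOKEN>>"
  logzioSpmToken: "<<SPM-SHIPPING-TOKEN>>"
---
# Sub chart Secret template: each sub chart still controls
# its own Secret object, but reads values from the shared global section.
apiVersion: v1
kind: Secret
metadata:
  name: {{ .Release.Name }}-apm-secrets
type: Opaque
stringData:
  traces-token: {{ .Values.global.logzioTracesToken }}
```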

What type of PR is this?

(check all applicable)

  • 🍕 Feature
  • 🐛 Bug Fix
  • 🧑‍💻 Code Refactor
  • 🔥 Performance Improvements
  • ✅ Test
  • 🤖 Build / CI
  • ⏩ Revert

Added tests?

  • 👍 yes
  • 🙅 no, because they aren't needed
  • 🙋 no, because I need help from somebody

@8naama 8naama mentioned this pull request Nov 19, 2024
10 tasks
@8naama 8naama force-pushed the release/logzio-apm-collector-1.0.0 branch from 35f9f7e to 6796009 Compare December 10, 2024 11:56
@8naama 8naama marked this pull request as ready for review December 12, 2024 15:11
level: ${LOG_LEVEL}

# Exporter from Traces Collector to SPM Collector
spmForwarderConfig:
Contributor

I believe we need to rethink the separation of the spmForwarderConfig from the main traceConfig. It can be confusing when a user wants to add custom configuration.

Contributor Author

I'll think about whether there is a better way to manage it; perhaps keep the base config as is and only remove/add the exporter from the pipeline.
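That idea could be sketched like this in a `_config.tpl`-style template: the base trace pipeline stays fixed, and only the forwarding exporter is toggled (the value path and exporter names are made up for illustration):

```yaml
# Trace pipeline is defined once; only the exporter list varies.
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters:
        - logzio/traces
        {{- if .Values.spmForwarder.enabled }}
        - otlp/spm-forwarder   # hypothetical name for the SPM forwarding exporter
        {{- end }}
```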

level: ${LOG_LEVEL}

# Service Graph configuration
serviceGraphConfig:
Contributor

Again, I'm not sure it's best practice to separate the logic.

memory: 250Mi

# Configuration for ports
ports:
Contributor

Ports for the traces collector, or the SPM collector?

Contributor Author

I used it for both, but I can split them to be more precise and reduce the number of ports for SPM, since we need fewer open ports there 👍🏼
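Splitting the shared ports section per collector could look something like this (the top-level keys are illustrative):

```yaml
# Separate port lists so the SPM collector only exposes what it needs.
tracesCollector:
  ports:
    - name: otlp-grpc
      containerPort: 4317
    - name: otlp-http
      containerPort: 4318
spmCollector:
  ports:
    - name: otlp-grpc
      containerPort: 4317
```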

{{- end }}

{{- $region := lower .Values.global.logzioRegion }}
{{- if not (or (eq $region "us") (eq $region "eu") (eq $region "uk") (eq $region "ca") (eq $region "au")) }}
Contributor

Shouldn't we fail the installation if this condition is evaluated?

Contributor Author

I thought of the case where a new region is added; it would be easier for the user to wait for us to add the code than to configure a full custom endpoint. Do you think it would be better to fail?
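If failing were preferred, Helm's built-in `fail` function could abort the install instead of silently falling back (a sketch only, extending the template snippet above):

```yaml
{{- $region := lower .Values.global.logzioRegion }}
{{- if not (or (eq $region "us") (eq $region "eu") (eq $region "uk") (eq $region "ca") (eq $region "au")) }}
{{- fail (printf "Unsupported logzioRegion %q; expected one of: us, eu, uk, ca, au" $region) }}
{{- end }}
```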

action: insert
batch: {}
service:
extensions: [health_check, pprof, zpages]
Contributor

Can we try using https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/extension/storage/filestorage for data persistence, and see how it affects performance? Same goes for the SPM collector. Maybe add a value that can control the storage type.
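The suggested file_storage setup might look roughly like this in the collector config; the directory path and queue wiring are assumptions, not tested values:

```yaml
extensions:
  file_storage:
    directory: /var/lib/otelcol/file_storage  # assumed mount path
exporters:
  otlp:
    sending_queue:
      storage: file_storage   # persist the sending queue across restarts
service:
  extensions: [health_check, file_storage]
```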

headers:
user-agent: "{{ .Chart.Name }}-{{ .Chart.Version }}-helm"
extensions:
pprof:
Contributor

Not sure we still need to use this extension; same goes for zpages.
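Dropping pprof and zpages would trim the extensions down to the health check only, something like:

```yaml
extensions:
  health_check:
    endpoint: :13133
service:
  extensions: [health_check]
```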

health_check:
endpoint: :13133
receivers:
jaeger:
Contributor

I think we can remove this legacy Jaeger receiver at this point.
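With the legacy Jaeger receiver removed, the receivers section would collapse to OTLP only (a sketch; endpoints shown are the conventional OTLP defaults):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
```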
