[BUG] OpenSearch Dashboards: OTel + Jaeger vs. Data Prepper. No error statistics shown for the Data Prepper source. #9118

berezinsn opened this issue Dec 24, 2024 · 0 comments

berezinsn commented Dec 24, 2024

Setup:
OTel agents -> OTel Collector -> Jaeger / Data Prepper -> OpenSearch -> OpenSearch Dashboards

Versions:
OpenSearch Helm Chart version: 2.27.1, appVersion: 2.18.0
OpenSearch Dashboards Helm Chart version: 2.25.0, appVersion: 2.18.0
Jaeger Helm Chart version: 3.3.3, appVersion: 1.53.0
Data Prepper Helm Chart version: 0.1.0, appVersion: 2.8.0
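
For reference, here is a minimal sketch of the collector side of this setup, fanning one traces pipeline out to Jaeger's OTLP receiver and to Data Prepper's otel_trace_source (the endpoints are assumed service names, not the actual ones):

receivers:
  otlp:
    protocols:
      grpc: {}

exporters:
  otlp/jaeger:
    # assumed Jaeger collector service; OTLP gRPC port matches the chart values below
    endpoint: jaeger-collector.jaeger.svc.cluster.local:4317
    tls:
      insecure: true
  otlp/dataprepper:
    # assumed Data Prepper service; 21890 is the otel_trace_source default port
    endpoint: data-prepper.data-prepper.svc.cluster.local:21890
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/jaeger, otlp/dataprepper]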

Describe the issue:
I have a setup with applications instrumented with OpenTelemetry (OTel) agents, which push traces to an OTel Collector. The collector sends the same data to both Jaeger and Data Prepper. However, the same traces behave differently in OpenSearch Dashboards depending on the data source selected (Jaeger vs. Data Prepper).

Specifically, when I select Data Prepper as the data source, the trace as a whole is not marked as an error trace, and the errors are not displayed on the dashboard. In contrast, when using Jaeger as the data source, the errors are correctly visualized, and the entire trace is marked as an "error trace" if any span within it contains an error.
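
In case it helps triage: as far as I can tell, Trace Analytics derives the per-trace error flag from the trace group fields that the otel_trace_group processor copies from the root span (traceGroupFields.statusCode), not from each span's own status. Here is a quick check of whether those fields are populated for the affected trace (field names assume the standard otel-v1-apm-span mapping; the trace ID is a placeholder):

# <trace-id> is a placeholder for the TraceID mentioned below
GET otel-v1-apm-span-*/_search
{
  "_source": ["traceId", "traceGroup", "traceGroupFields", "status.code", "parentSpanId"],
  "query": {
    "term": { "traceId": "<trace-id>" }
  }
}

If spans come back with status.code: 2 while traceGroupFields.statusCode is 0 or missing, the root-span lookup in otel_trace_group presumably did not resolve, which would match the symptom.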

Configuration:
Jaeger:

jaeger:
  agent:
    enabled: false
  provisionDataStore:
    cassandra: false
    elasticsearch: false
  collector:
    enabled: true
    annotations: {}
    image:
      registry: ""
      repository: jaegertracing/jaeger-collector
      tag: ""
      digest: ""
    envFrom: []
    cmdlineParams: {}
    basePath: /
    replicaCount: 1
    service:
      otlp:
        grpc:
          name: "otlp-grpc"
          port: 4317
        http:
          name: "otlp-http"
          port: 4318
    serviceAccount:
      create: true
  storage:
    type: elasticsearch
    elasticsearch:
      scheme: http
      host: opensearch-cluster-master.opensearch-otel.svc.cluster.local
      port: 9200
      anonymous: true
      usePassword: false
      extraEnv:
        - name: SPAN_STORAGE_TYPE
          value: "opensearch"
        - name: ES_TAGS_AS_FIELDS_ALL
          value: "true"
      tls:
        enabled: false
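
For comparison, Jaeger marks errors per span via the error tag, and with ES_TAGS_AS_FIELDS_ALL=true the tag is stored as a top-level tag.error field, which appears to be what the Jaeger data source keys off (index pattern and field names assume Jaeger's default Elasticsearch/OpenSearch schema; the trace ID is a placeholder, and the tag value may be stored as the string "true"):

# list the error spans of the trace as Jaeger stored them
GET jaeger-span-*/_search
{
  "_source": ["traceID", "operationName", "tag.error"],
  "query": {
    "bool": {
      "filter": [
        { "term": { "traceID": "<trace-id>" } },
        { "term": { "tag.error": "true" } }
      ]
    }
  }
}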

DataPrepper:

    config:
      otel-trace-pipeline:
        delay: "1000"
        source:
          otel_trace_source:
            ssl: false
        buffer:
          bounded_blocking:
            buffer_size: 10240
            batch_size: 160
        sink:
          - pipeline:
              name: "raw-traces-pipeline"
          - pipeline:
              name: "otel-service-map-pipeline"
      raw-traces-pipeline:
        source:
          pipeline:
            name: "otel-trace-pipeline"
        buffer:
          bounded_blocking:
            buffer_size: 10240
            batch_size: 160
        processor:
          - otel_trace_raw:
          - otel_trace_group:
              hosts: [ "http://opensearch-cluster-master:9200" ]
              insecure: true
        sink:
          - opensearch:
              hosts: [ "http://opensearch-cluster-master:9200" ]
              insecure: true
              index_type: trace-analytics-raw
      otel-service-map-pipeline:
        delay: "1000"
        source:
          pipeline:
            name: "otel-trace-pipeline"
        buffer:
          bounded_blocking:
            buffer_size: 10240
            batch_size: 160
        processor:
          - service_map_stateful:
              window_duration: 300
        sink:
          - opensearch:
              hosts: [ "http://opensearch-cluster-master:9200" ]
              insecure: true
              index_type: trace-analytics-service-map
              index: otel-v1-apm-span-%{yyyy.MM.dd}
              #max_retries: 20
              bulk_size: 4
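
One detail that may be relevant: the service-map sink above sets index_type: trace-analytics-service-map but also overrides index: with the span index pattern, so service-map documents may end up in the span index. For clarity, the same sink without the override would look like this:

        sink:
          - opensearch:
              hosts: [ "http://opensearch-cluster-master:9200" ]
              insecure: true
              index_type: trace-analytics-service-map
              bulk_size: 4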

Relevant Logs or Screenshots:
Data Prepper source: the error is present on a span, but the trace as a whole is not marked with an error, and no error statistics are shown.
[Screenshot 2024-12-24 at 16:31:59]
[Screenshot 2024-12-24 at 16:30:49]

Jaeger source: the error is visible on the span, and the whole trace is marked as an error (top right corner of the second capture).
[Screenshot 2024-12-24 at 16:31:43]
[Screenshot 2024-12-24 at 16:31:07]

Please share your suggestions on how to fix this. The TraceID is the same in both cases.
Thanks!
