Too many recursive expansions; Cannot load file by ENV #11679

Open
Pwalne opened this issue Nov 14, 2024 · 6 comments

Pwalne commented Nov 14, 2024

Describe the bug

Up to at least v0.98.0, I was able to use an ENV var in the YAML to point to a configuration file to load. This is no longer possible and gives the error "cannot resolve the configuration: too many recursive expansions". It seems to not play nice with ${file:/file.yaml} at all.

Steps to reproduce

Use the given configuration, with an ENV var pointing to a YAML configuration file for enrichment.
As an example, for our own custom exporter we had these additional properties:

# These are defined inline; You can add it to the block
#   processors:
#     custom_exporter: ${file:config.yaml}
redis_addr: ${env:REDIS_ADDR}
redis_user: ${env:REDIS_USER}
redis_pass: ${env:REDIS_PASS}
metrics:
  enabled: ${env:METRICS_ENABLED}
  addr: ${env:STATSD_ADDR}
  system_name: ${env:METRICS_SYSTEM_NAME}
  app_name: ${env:METRICS_APP_NAME}

What did you expect to see?

Loads the configuration fine

What did you see instead?

cannot resolve the configuration: too many recursive expansions
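
For readers unfamiliar with this kind of error, here is a minimal sketch of how a bounded expansion loop can end in a message like the one above. It is a toy model and not the collector's implementation: the `expand` function, the `sources` dictionary, the purely textual substitution, and the limit of 100 rounds are all assumptions made for illustration.

```python
import re

# Matches an innermost ${scheme:value} reference (no nested "${" inside it).
REF = re.compile(r"\$\{(\w+):([^${}]*)\}")


class TooManyExpansions(Exception):
    """Raised when references are still present after the round limit."""


def expand(text, sources, max_rounds=100):
    """Substitute one ${scheme:value} reference per round until none remain.

    `sources` is a hypothetical lookup table: sources[scheme][value] -> text.
    """
    for _ in range(max_rounds):
        match = REF.search(text)
        if match is None:
            return text  # fully expanded
        scheme, key = match.group(1), match.group(2)
        replacement = sources[scheme][key]
        text = text[:match.start()] + replacement + text[match.end():]
    raise TooManyExpansions("too many recursive expansions")
```

In this model a chain such as env var -> file -> plain value terminates after a couple of rounds, while any value that (directly or indirectly) contains a reference back to itself never runs out of references and hits the limit.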

What version did you use?

v0.113.0

What config did you use?

# http://opentelemetry.io/docs/collector/configuration
receivers:
  otlp:
    protocols:
      http:
        endpoint: ":${env:OTEL_HTTP_PORT}"

processors:
  transform: ${env:CONFIG_TRANSFORM_FILE}
  batch: # this is setup for reference to pipeline

# Multiple exporters for forwarding; See https://github.com/open-telemetry/opentelemtry-collector/blob/main/docs/design.md
exporters:
  otlphttp:
    endpoint: ${env:OTEL_EXPORTER_OTLP_ENDPOINT}

extensions:
  health_check:

service:
  extensions: [ health_check ]
  telemetry:
    logs:
      level: ${env:LOG_LEVEL}
    metrics:
      address: ":8888"
  pipelines:
    traces:
      receivers: [ otlp ]
      processors: [ transform, batch ]
      exporters: [ otlphttp, ]

Environment

alpine docker image

Additional context

Pwalne added the bug label Nov 14, 2024
Member

mx-psi commented Nov 15, 2024

@Pwalne Thanks for filing this, it's unclear to me what environment variable you are setting to what value. Could you list the (I think) two configuration files and one environment variable that are involved here?

Author

Pwalne commented Nov 15, 2024

> @Pwalne Thanks for filing this, it's unclear to me what environment variable you are setting to what value. Could you list the (I think) two configuration files and one environment variable that are involved here?

If you were to set an exporter/processor value to ${env:CONFIG_TRANSFORM_FILE} and then set the ENV to CONFIG_TRANSFORM_FILE=${file:/file.yaml}, with the following file contents (they really don't matter):

# These are defined inline; You can add it to the block
#   processors:
#     custom_exporter: ${file:config.yaml}
redis_addr: ${env:REDIS_ADDR}
redis_user: ${env:REDIS_USER}
redis_pass: ${env:REDIS_PASS}

it complains about too many recursions, even though this worked fine previously. I cannot share the exact files we use, but in our specific case we were setting the ENV via the Dockerfile with ENV CONFIG_TRANSFORM_FILE=\${file:/file.yaml}. In testing, it really didn't matter how it was set up: the configuration DID NOT want to load a file at all, even without an env var being used.

I would think ${file:${env:CONFIG_TRANSFORM_FILE}} would work, but that's why I went down this rabbit hole in the first place.

Member

mx-psi commented Nov 15, 2024

@Pwalne Thanks for the additional details. I was not able to reproduce this on my first attempt:

My failed attempt at reproducing this:

File mainconfig.yaml:

receivers:
  otlp: ${env:CONFIG_OTLP_RECEIVER_FILE}

exporters:
  nop:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [nop]

File otlpconfig.yaml:

protocols:
  grpc:

Command that I used and logs:

❯ CONFIG_OTLP_RECEIVER_FILE='${file:otlpconfig.yaml}' ./otelcorecol --config mainconfig.yaml 

2024-11-15T13:00:54.513-0700    info    [email protected]/service.go:166 Setting up own telemetry...
2024-11-15T13:00:54.514-0700    info    telemetry/metrics.go:70 Serving metrics  {"address": "localhost:8888", "metrics level": "Normal"}
2024-11-15T13:00:54.516-0700    info    [email protected]/service.go:238 Starting otelcorecol...  {"Version": "0.113.0-dev", "NumCPU": 20}
2024-11-15T13:00:54.516-0700    info    extensions/extensions.go:39     Starting extensions...
2024-11-15T13:00:54.516-0700    info    [email protected]/otlp.go:112    Starting GRPC server    {"kind": "receiver", "name": "otlp", "data_type": "traces", "endpoint": "localhost:4317"}
2024-11-15T13:00:54.516-0700    info    [email protected]/service.go:261 Everything is ready. Begin running and processing data.

^C # I pressed Ctrl+C to stop

2024-11-15T13:00:56.380-0700    info    [email protected]/collector.go:328    Received signal from OS {"signal": "interrupt"}
2024-11-15T13:00:56.380-0700    info    [email protected]/service.go:303 Starting shutdown...
2024-11-15T13:00:56.380-0700    info    extensions/extensions.go:66     Stopping extensions...
2024-11-15T13:00:56.380-0700    info    [email protected]/service.go:317 Shutdown complete.

Am I doing something different from your setup? It seems like the problem is not in the "pass a filename through an environment variable and recursively expand" step itself (or at least you would need some extra condition to trigger this).

I would need your help to get a minimal working example. One way to go about this is to remove parts of your configuration until you stop getting the error.
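
The "remove parts until the error stops" approach can be automated. The following is a minimal sketch, assuming you supply your own `still_fails` callback (hypothetical here) that writes the candidate lines to a file, starts the collector against it, and returns True when the same error appears. It removes whole lines only, so a candidate may be invalid YAML; the callback should treat that as "does not reproduce".

```python
def minimize(lines, still_fails):
    """Greedily drop lines one at a time, keeping each removal that still fails.

    lines       -- list of configuration lines that currently triggers the error
    still_fails -- caller-supplied function: list of lines -> bool
    """
    current = list(lines)
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if still_fails(candidate):
            current = candidate  # line i was not needed to trigger the error
        else:
            i += 1  # line i is required; keep it and move on
    return current
```

The result is a configuration from which no single line can be removed without losing the error, which is usually small enough to share.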

Author

Pwalne commented Nov 18, 2024

@mx-psi It turns out this had to do with the comment we had in our YAML file.
Our issue went away when we removed the following block at the top of the file:

# These are defined inline; You can add it to the block
#   processors:
#     custom_exporter: ${file:config.yaml}

I can't seem to replicate it outside of our configuration, which is weird, so there may be something I'm missing that also triggered it.
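
Since the workaround was to delete a comment that contained a `${file:...}` reference, a quick way to audit other files is to list every reference that sits on a comment line. This is a small sketch under simple assumptions: it only looks at lines whose first non-blank character is `#` (trailing comments after a value are ignored), and the function name is made up for this example. It says nothing about whether the collector actually expands such references.

```python
import re

# Any ${...} reference, regardless of scheme.
REF = re.compile(r"\$\{[^}]*\}")


def refs_in_comments(yaml_text):
    """Return (line_number, reference) pairs for references on comment lines."""
    found = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            for ref in REF.findall(line):
                found.append((number, ref))
    return found
```

Run against the file quoted above, it reports only the reference in the third comment line and skips the `${env:...}` values on regular lines.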

Member

mx-psi commented Nov 18, 2024

Thanks! That's interesting; I think there may be an actual bug here. I suspect the issue is somewhere else and the error message is misleading, since some of our recent work made the errors a bit confusing at the edges.

Would love to have a minimal example for this, if you are able to share a configuration that reproduces the error after removing any sensitive details that would be great.

Author

Pwalne commented Nov 18, 2024

Here is our additional file. For clarity, all env vars were set other than REDIS_USER and REDIS_PASS.

# These are defined inline; You can add it to the block
#   processors:
#     workflow: ${file:config.yaml}
redis_addr: ${env:REDIS_ADDR}
redis_user: ${env:REDIS_USER}
redis_pass: ${env:REDIS_PASS}
metrics:
  enabled: ${env:METRICS_ENABLED}
  addr: ${env:STATSD_ADDR}
  system_name: ${env:METRICS_SYSTEM_NAME}
  app_name: ${env:METRICS_APP_NAME}
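
Because some of the variables referenced in this file were unset, it can help to list exactly which `${env:NAME}` references have no value in a given environment when narrowing down a reproduction. A minimal sketch follows; the function name is hypothetical, and it only extracts the variable name, so anything after the name inside the braces is ignored.

```python
import re

# Captures NAME from ${env:NAME...}.
ENV_REF = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)")


def unset_env_refs(config_text, environ):
    """Return referenced variable names missing from `environ`, in first-seen order."""
    names = []
    for name in ENV_REF.findall(config_text):
        if name not in environ and name not in names:
            names.append(name)
    return names
```

Typical use would be `unset_env_refs(open(path).read(), os.environ)`, run inside the same container or shell that starts the collector.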

it was being used as

processors:
  workflow: ${env:OTEL_WORKFLOW_CONFIG}

I tried to confirm it wasn't just a Docker issue by changing the line to workflow: ${file:/workflow_config.yaml}, but the image would always throw the error. Weirdly enough, I couldn't reproduce the issue on the command line...

Our Dockerfile looked something like this

FROM alpine:3.20

ARG VERSION=0.0.0

# Apparently alpine uses musl. So we need to symlink it. Blame golang deciding to put a dependency on resolve...
RUN apk add --no-cache gcompat libc6-compat curl

COPY otel-collector /
COPY collector-config.yaml /config.yaml
COPY exporters/workflow/config.yaml /workflow_config.yaml

ENV VERSION=${VERSION}
ENV OTEL_WORKFLOW_CONFIG='${file:/workflow_config.yaml}'
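
One difference between running in this image and running from a local shell is the working directory, which changes what a relative reference such as `config.yaml` points at. The sketch below only illustrates that path arithmetic. It assumes relative paths are resolved against the process working directory and that the container runs from `/`; neither is confirmed in this thread, and whether it is related to the error is not established.

```python
import posixpath


def resolve(reference_path, working_dir):
    """Return the absolute path a reference would name when read from working_dir."""
    return posixpath.normpath(posixpath.join(working_dir, reference_path))
```

Under those assumptions, `resolve("config.yaml", "/")` names the same `/config.yaml` that the Dockerfile copies the main collector configuration to, whereas from a project directory on a workstation it names a different file (or none at all). Absolute references such as `/workflow_config.yaml` are unaffected by the working directory.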

Like I said, after removing the comments the issue resolved itself and our integration tests ran fine.

This was run with a collector generated by the collector builder, target v0.113.0.
