Need Support for Dynamic CSV Headers in filelog Receiver. #36415

Open
VenuEmmadi opened this issue Nov 18, 2024 · 5 comments
Labels
discussion needed, enhancement, receiver/filelog

Comments

@VenuEmmadi

Component(s)

receiver/filelog

What happened?

Description

I am using the filelog receiver in the OpenTelemetry Collector Contrib to parse CSV log files. When parsing a single file with a predefined header, the configuration works as expected. However, when attempting to process multiple CSV files with different headers, there is no way to dynamically handle varying headers.

If the header is omitted, the configuration fails with an error. This limitation makes it impossible to manage directories containing multiple CSV files with different structures efficiently.

Steps to Reproduce

  1. Configure the filelog receiver to parse a single CSV file with a specified header:

    receivers:
      filelog/LightningInteractionLogs_quoted:
        include: [/u01/SFLogs/8292024/continuationcallout_hundred.csv]
        start_at: beginning
        operators:
          - type: csv_parser
            header: ApplicationName, page_app_name, Application_Version, Environment, HostName, EventType, timestamp, user_id, user_name, url, duration, request_form_size, response_size, status_code, success, TimestampDerived

  2. Attempt to configure the receiver to include multiple CSV files with varying headers:

    receivers:
      filelog/LightningInteractionLogs_multiple:
        include: [/u01/SFLogs/*.csv]
        start_at: beginning
        operators:
          - type: csv_parser
            # No way to handle multiple headers dynamically

  3. Observe the failure when the header is not explicitly provided:

    Error: failed to build pipelines: failed to create "filelog/LightningInteractionLogs_multiple" receiver for data type "logs"; missing required field "header" or "header_attribute"

Expected Result

The csv_parser operator should be able to:

  - Dynamically detect headers from the first row of each CSV file (e.g., via a dynamic_header option).
  - Alternatively, allow mapping specific headers to specific files or file patterns using a header_attribute or similar configuration.

For example:

    receivers:
      filelog/LightningInteractionLogs_dynamic:
        include: [/u01/SFLogs/*.csv]
        start_at: beginning
        operators:
          - type: csv_parser
            dynamic_header: true

Actual Result

The configuration fails when header is not explicitly provided, making it impossible to process multiple CSV files with different headers in the same receiver configuration.

Error message:
Error: failed to build pipelines: failed to create "filelog/LightningInteractionLogs_multiple" receiver for data type "logs"; missing required field "header" or "header_attribute"
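
One possible workaround with the existing options is to define a separate filelog receiver for each CSV layout, each with its own static header. The fragment below is only a sketch: the file patterns (interaction_*.csv, callout_*.csv), receiver names, and column lists are hypothetical placeholders, and the exporters section is omitted because it is the same as in the configuration in this report.

```yaml
# Workaround sketch: one filelog receiver per CSV layout.
# File patterns, receiver names, and column names are hypothetical.
receivers:
  filelog/interaction_logs:
    include: [/u01/SFLogs/interaction_*.csv]
    start_at: beginning
    operators:
      - type: csv_parser
        header: ApplicationName,EventType,timestamp,user_id,duration
  filelog/callout_logs:
    include: [/u01/SFLogs/callout_*.csv]
    start_at: beginning
    operators:
      - type: csv_parser
        header: ApplicationName,HostName,url,status_code,success

service:
  pipelines:
    logs:
      # Every per-layout receiver has to be listed here.
      receivers: [filelog/interaction_logs, filelog/callout_logs]
      exporters: [logging]
```

This only works when each layout can be told apart by file name, and it requires a new receiver for every new layout, which is the maintenance burden this request is trying to avoid.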

Collector version

v0.109.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

receivers:
  filelog/LightningInteractionLogs_multiple:
    include: [/u01/SFLogs/*.csv]
    start_at: beginning
    operators:
      - type: csv_parser
exporters:
  logging:
    loglevel: debug

service:
  pipelines:
    logs:
      receivers: [filelog/LightningInteractionLogs_multiple]
      exporters: [logging]

Log output

Error: failed to build pipelines: failed to create "filelog/LightningInteractionLogs_multiple" receiver for data type "logs"; missing required field "header" or "header_attribute"

Additional context

No response

@VenuEmmadi added the bug and needs triage labels Nov 18, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@VihasMakwana
Contributor

To me, this sounds like a valid enhancement request, but I'm not sure how to accomplish it.
Maybe the code owners have thoughts on this?

@VihasMakwana added the discussion needed and enhancement labels and removed the needs triage and bug labels Nov 20, 2024
@VihasMakwana
Contributor

This is an enhancement, not a bug. Please let me know if you disagree

@VenuEmmadi
Author

This is an enhancement, not a bug. Please let me know if you disagree

I’m not entirely sure whether this qualifies as an enhancement or a bug. At the very least, the receiver should not accept wildcard patterns for file names when csv_parser is used. Please share your thoughts.

@VihasMakwana
Contributor

I see what you mean, but I'm not very supportive of that idea. It seems like a strange case to me.

I'll explore the codebase and see what we can do here. In the meantime, I'll ask @djaglowski to share his thoughts on this issue.
