-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[chore][otel] Enable persistence in otel.yml #5549
Conversation
This pull request does not have a backport label. Could you fix it @VihasMakwana? 🙏
|
|
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
@elastic/elastic-agent-control-plane I propose two options for default directory:
For the latter option, we need to implement a config converter which maps Let me know how do you feel about this proposal. |
Introducing a new config converter would be straightforward and could be easily integrated into our distribution. |
As discussed let's leave this as an environment variable. How the collector is run and installed (if it is installed as a service at all) is currently completely up to the user and this goes along with that. The only place we likely want a default is in our reference Kubernetes configurations, which can set the environment variable or set a default in the configuration it includes directly. In general we do not want to use any of the paths that are also used by Beats or Agent, because a standalone OTel collector must be able to run alongside them without issue. In our OTel distribution I think we solve this by having an |
This pull request is now in conflicts. Could you fix it? 🙏
|
changelog/fragments/1726572104-enable-persistence-by-default.yaml
Outdated
Show resolved
Hide resolved
otel.yml
Outdated
file_storage/filelogreceiver: | ||
create_directory: true | ||
directory: ${env:STORAGE_DIR} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The addition of the File Storage extension may cause problems for users using this configuration with read-only filesystems, like in Docker, in Kubernetes, unless they configure a writable storage layer in their containers/pods. Should we be concerned about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The addition of the File Storage extension may cause problems for users using this configuration with read-only filesystems, like in Docker, in Kubernetes, unless they configure a writable storage layer in their containers/pods. Should we be concerned about this?
I believe we will have a separate recommendation in our k8s configs based on @cmacknz's comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change breaks running Elatic Agent as OTel Collector in Docker. I'd rather have this change applied in a way that doesn't break running the EDOT Collector in Docker. If we want to enable persistence by default, it should "just work", like it "just works" with the regular Elastic Agent.
$ docker run --rm -ti --entrypoint="elastic-agent" docker.elastic.co/beats/elastic-agent:9.0.0-SNAPSHOT otel
Starting in otel mode
2024-09-30T09:06:39.260Z info [email protected]/service.go:137 Setting up own telemetry...
2024-09-30T09:06:39.261Z info [email protected]/service.go:186 Skipped telemetry setup.
2024-09-30T09:06:39.261Z info builders/builders.go:26 Development component. May change in the future. {"kind": "exporter", "data_type": "logs", "name": "debug"}
2024-09-30T09:06:39.262Z info builders/extension.go:50 Development component. May change in the future. {"kind": "extension", "name": "memory_limiter"}
2024-09-30T09:06:39.262Z info memorylimiter/memorylimiter.go:75 Memory limiter configured {"kind": "extension", "name": "memory_limiter", "limit_mib": 700, "spike_limit_mib": 180, "check_interval": 1}
2024-09-30T09:06:39.252Z warn [email protected]/provider.go:59 Configuration references unset environment variable {"name": "STORAGE_DIR"}
failed to build extensions: failed to create extension "file_storage/filelogreceiver": mkdir /var/lib/otelcol: permission denied
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrzej-stencel I see what you mean. The error is due to an empty env:STORAGE_DIR
and hence, the filestorage
extension defaults to /var/lib/otelcol
. Since the default user in the Docker container is elastic-agent
, this causes a permission denied error.
If I use any of following commands, it works.
docker run --rm -ti --entrypoint="elastic-agent" --mount type=bind,source=$(pwd)/data,target=/usr/share/otel -e STORAGE_DIR=/usr/share/otel/ docker.elastic.co/beats/elastic-agent:9.0.0-SNAPSHOT otel
or
docker run --user root --rm -ti --entrypoint="elastic-agent" docker.elastic.co/beats/elastic-agent:9.0.0-SNAPSHOT otel
I wonder how should we approach this. Maybe add documentation? or update EDOT according to @cmacknz's comment. Or do both.
In our OTel distribution I think we solve this by having an install command that puts this in whatever base install directory we pick (/opt/Elastic, C:\Program Files\Elastic, or /Library/Elastic) but we need an installer that runs as a privileged user to do that. You can't do it with config alone.
@cmacknz @andrzej-stencel I'd appreciate your opinions over this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you that is should "just work".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the container we control the paths, and can pick a default that is writable by our user in the container. Elastic agent uses the STATE_PATH environment variable and puts it under /usr/share/elastic-agent.
elastic-agent/internal/pkg/agent/cmd/container.go
Lines 48 to 49 in 95021e1
agentBaseDirectory = "/usr/share/elastic-agent" // directory that holds all elastic-agent related files | |
defaultStateDirectory = agentBaseDirectory + "/state" // directory that will hold the state data |
We could add an /otelcol
sub-directory to /usr/share/elastic-agent
for one possibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly we can reuse the STATE_PATH
variable and change which sub-directory we use based on whether the entrypoint is the otel command or the elastic-agent container command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
Co-authored-by: Andrzej Stencel <[email protected]>
This pull request is now in conflicts. Could you fix it? 🙏
|
@andrzej-stencel @cmacknz PTAL 😅 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not fond of the prepareEnv
function setting the STORAGE_DIR
env var irrespective of whether the env var is used in configuration or not. On the other hand, I don't have an alternative "perfect" solution, so let's go with this approach now and perhaps change if we come up with a better way.
This pull request is now in conflicts. Could you fix it? 🙏
|
internal/pkg/agent/cmd/otel.go
Outdated
// The filestorage extension will handle directory creation since create_directory: true is set by default. | ||
// If the user hasn’t specified the env:STORAGE_DIR in filestorage, they may have opted for a custom path, and the extension will create the directory accordingly. | ||
// In this case, setting env:STORAGE_DIR will have no effect. | ||
if err := os.Setenv("STORAGE_DIR", filepath.Join(paths.Top(), "otel_registry")); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would just call the path otel
instead of otel_registry
since there is no guarantee we only use this directory to store file log registries long term.
internal/pkg/agent/cmd/otel.go
Outdated
// The filestorage extension will handle directory creation since create_directory: true is set by default. | ||
// If the user hasn’t specified the env:STORAGE_DIR in filestorage, they may have opted for a custom path, and the extension will create the directory accordingly. | ||
// In this case, setting env:STORAGE_DIR will have no effect. | ||
if err := os.Setenv("STORAGE_DIR", filepath.Join(paths.Top(), "otel_registry")); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, why can't we just reuse the STATE_PATH environment variable the container already supports so we don't need to add a new one? The default value is also writable by the elastic agent container user.
One reason to do this is it won't be a breaking change if we decide to get rid of STORAGE_DIR later.
Co-authored-by: Craig MacKenzie <[email protected]>
Co-authored-by: Craig MacKenzie <[email protected]>
@cmacknz I've reused the STATE_PATH and changed directory name. PTAL! |
internal/pkg/agent/cmd/otel.go
Outdated
@@ -118,3 +123,22 @@ func runCollector(cmdCtx context.Context, configFiles []string) error { | |||
return goerrors.Join(errs...) | |||
|
|||
} | |||
|
|||
func prepareEnv() error { | |||
if _, ok := os.LookupEnv("STORAGE_DIR"); !ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use STATE_PATH here as well (and in the collector examples)?
What I am trying to do is avoid introducing a new environment variable we may decide to remove in the future, trying to keep our public API as small as possible while we figure out how everything fits together.
STATE_PATH isn't going anywhere without making it a breaking change for many people, so it doesn't increase the surface area of configuration we need to maintain.
This pull request is now in conflicts. Could you fix it? 🙏
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Quality Gate passedIssues Measures |
* chore: update config, docs * fix: spelling mistake * fix: go tidy * remove comments * fix: go tidy * fix: add filestorage to extension * Update changelog/fragments/1726572104-enable-persistence-by-default.yaml Co-authored-by: Andrzej Stencel <[email protected]> * chore: update readme * chore: update readme * go.mod * chore: go.mod * comments * comments * chore: update notice * chore: go.sum and notice * chore: go.sum and notice * Update internal/pkg/otel/templates/README.md.tmpl Co-authored-by: Craig MacKenzie <[email protected]> * Update internal/pkg/otel/templates/README.md.tmpl Co-authored-by: Craig MacKenzie <[email protected]> * chore: address comments * chore: use STATE_PATH * chore: update readme * fix: add missing import --------- Co-authored-by: Andrzej Stencel <[email protected]> Co-authored-by: Craig MacKenzie <[email protected]> (cherry picked from commit 49579ae)
What does this PR do?
This PR adds the
filestorage
extension inotel.yaml
and updates the documentation.Updates
otel
dependencies to 0.110.0Why is it important?
By default, the OpenTelemetry Collector operates in a stateless mode, meaning it does not save file offsets to disk. To resolve this, we enable persistence in the default configuration files.
Checklist
./changelog/fragments
using the changelog toolHow to test this PR locally
elastic-agent otel --config otel.yaml
Related issues