[Fleet] Add ability to enable and configure HTTP Monitoring #153950

Open
1 task done
pierrehilbert opened this issue Mar 29, 2023 · 37 comments · May be fixed by #193361
Assignees
Labels
QA:Needs Validation Issue needs to be validated by QA Team:Fleet Team label for Observability Data Collection Fleet team v8.16.0

Comments

@pierrehilbert
Contributor

pierrehilbert commented Mar 29, 2023

Describe the feature:
When Elastic Agent is enrolled into Fleet, we can no longer configure the agent.monitoring settings because they are part of the elastic-agent.yml file (which is only taken into account when we enroll the Agent).
In the past, we could still configure them in the fleet.yml file, but that file is now encrypted, so this is no longer possible.

This issue follows from this SDH: https://github.com/elastic/sdh-beats/issues/3168

Requirements

In the agent policy settings page, under the Agent Monitoring section
image

Something to this effect:

image

For reference the full configuration options are:

Note: http.buffer.enabled does not work and can be omitted, see comment #153950 (comment)

image

@pierrehilbert pierrehilbert added the Team:Fleet Team label for Observability Data Collection Fleet team label Mar 29, 2023
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@jen-huang
Contributor

cc @nimarezainia - would like to have your input on the priority of exposing advanced agent.monitoring settings through the UI. Today we just have a simple toggle:

image

This is converted to the following in agent yaml:

agent:
  monitoring:
    enabled: true
    use_output: default
    namespace: default
    logs: true
    metrics: true

@pierrehilbert
Contributor Author

To give more context, it's something problematic for APM but we have a "workaround": re-enroll the Agent into Fleet.

@nimarezainia
Contributor

Re-enrolling the agent is never an acceptable solution. @jen-huang we should address this, but I'm not sure it's that urgent. I'll let you place it in the appropriate sprint.

@jen-huang
Contributor

@nimarezainia This will need design consideration to support all agent.monitoring settings. I found https://www.elastic.co/guide/en/fleet/current/elastic-agent-monitoring-configuration.html but that doesn't seem comprehensive as the original SDH reported needing to set fields like:

agent.monitoring:
  http:
    enabled: true 
    host: localhost 
    port: 6791

@pierrehilbert Are all the agent.monitoring fields documented somewhere?

@pierrehilbert
Contributor Author

pierrehilbert commented Apr 11, 2023

I'm not aware of any other documentation for that.
But as you mentioned, there are more fields that we can see in elastic-agent.reference.yml.
@nimarezainia do you know if we have something else?

@nimarezainia
Contributor

> I'm not aware of any other documentation for that. But as you mentioned, there are more fields that we can see in elastic-agent.reference.yml. @nimarezainia do you know if we have something else?

Sorry I am not aware of any other docs.
Where could I find the full list of configurable options in code? (obviously I see host and port). We probably need to redesign that section of the settings as Jen mentioned.

@fmiqbal

fmiqbal commented May 13, 2023

This has become a blocker for installation.

Environment: Kubernetes VM using microk8s

I have Elastic Agent installed inside k8s following this guide: https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-managed-by-fleet.html#running-on-kubernetes-managed-by-fleet. It then binds to port 6791 on the host, as seen in netstat.

Now I want to add a Fleet-managed Elastic Agent on the node itself using the default guide shown when adding an agent (which refers to https://www.elastic.co/guide/en/fleet/8.7/install-fleet-managed-elastic-agent.html), but it fails because it can't bind to 6791, and I don't think editing elastic-agent.yml does anything.

May 13 18:51:19 unpad-k8s-node-0 systemd[1]: Stopped Elastic Agent is a unified agent to observe, monitor and protect your system..
May 13 18:51:19 unpad-k8s-node-0 systemd[1]: Started Elastic Agent is a unified agent to observe, monitor and protect your system..
May 13 18:51:20 unpad-k8s-node-0 elastic-agent[3289118]: Error: could not start the HTTP server for the API: listen tcp 127.0.0.1:6791: bind: address already in use
May 13 18:51:20 unpad-k8s-node-0 elastic-agent[3289118]: For help, please see our troubleshooting guide at https://www.elastic.co/guide/en/fleet/8.7/fleet-troubleshooting.html
May 13 18:51:20 unpad-k8s-node-0 systemd[1]: elastic-agent.service: Main process exited, code=exited, status=1/FAILURE
May 13 18:51:20 unpad-k8s-node-0 systemd[1]: elastic-agent.service: Failed with result 'exit-code'.

@jerrac

jerrac commented Sep 8, 2023

> When Elastic Agent is enrolled into Fleet, we can no longer configure the agent.monitoring settings because they are part of the elastic-agent.yml file (which is only taken into account when we enroll the Agent).
> In the past, we could still configure them in the fleet.yml file, but that file is now encrypted, so this is no longer possible.

Wait, so the reason I've been getting no result from modifying elastic-agent.yml is that it is no longer allowed? Even though the file itself still has that comment on top "You can update this file to configure the settings that are not supported by Fleet."?

Specifically we're trying to set up backups via Veeam and it requires the 6791 port. So I've been trying to get Agent to stop listening on that port. Is my only choice to just stop using Agent?

@pierrehilbert
Contributor Author

elastic-agent.yml is used only by a standalone Agent.
When you enroll an Agent into Fleet, the local configuration file is merged with what you get from the Fleet policy, and the result is written to fleet.enc, which becomes the new configuration file.

If you want your local elastic-agent.yml file to be taken into account again, you have to run the enroll command again to regenerate fleet.enc.

This is currently the only way to change the monitoring port.
Warning: don't forget to use the content of elastic-agent.yml.<DATE>.bak if you want your previous configuration to be included in fleet.enc too.

@zez3

zez3 commented Sep 11, 2023

@pierrehilbert

> If you want your local elastic-agent.yml file to be taken into account again, you have to run the enroll command again to regenerate fleet.enc.

What did you mean by that?

Can I combine some local options with some coming from Fleet, or am I reading this wrong?

@pierrehilbert
Contributor Author

You can, but only during the enrollment phase.
Once your Agent is enrolled, we won't parse the elastic-agent.yml file again.

@zez3

zez3 commented Sep 11, 2023

So if I try to configure HTTP endpoint for metrics https://www.elastic.co/guide/en/beats/filebeat/current/http-endpoint.html

this should work as well ?

@zez3

zez3 commented Sep 11, 2023

Strangely, if I run:

/opt/Elastic/Agent/elastic-agent inspect

I can already see

http: enabled: true

agent:
  download:
    sourceURI: https://artifacts.elastic.co/downloads/
  features: null
  headers: null
  id: 99b69253-1d27-47b8-a5a6-c3024081e677
  logging:
    level: info
  monitoring:
    enabled: true
    http:
      buffer: null
      enabled: false
      host: ""
      port: 6791
    logs: true
    metrics: true
    namespace: ece
    use_output: default
  protection:
    enabled: false
    signing_key: mykey
    uninstall_token_hash: ""
fleet:
  access_api_key: mykey
  agent:
    id: ""
  enabled: true
  host: mydom.mytld:9243
  hosts:
  - https://mydom.mytld:9243
  protocol: http
  reporting:
    check_frequency_sec: 30
    threshold: 10000
  ssl:
    renegotiation: never
    verification_mode: ""
  timeout: 10m0s
host:
  id: d7d522adb50b4050aee658b2bbe4ebfd
http:
  enabled: true
id: ce165160-3b59-11ec-9e09-e3e9ddff6cd0
inputs:
- data_stream:
...

but that does not seem to work, I cannot see port 5066 open

Do I need to re-enroll?

@zez3

zez3 commented Sep 11, 2023

Or would this filebeat HTTP endpoint for metrics need to be configured under the - data_stream: config?

@lucabelluccini
Contributor

The setting http.enabled: true at the root of the elastic-agent configuration is likely ignored.
It is not present in https://www.elastic.co/guide/en/fleet/current/elastic-agent-reference-yaml.html

I did an elastic-agent inspect on a Fleet managed elastic Agent and I got no http... setting at the root.

In general, we recommend not editing the configuration of Elastic Agent.
The workarounds provided in this comment have not been tested against every possible policy configuration.

How to change or disable the 6791/http (monitoring) port

In case you want to disable the 6791/http (monitoring) port, you have 2 options:

  • [BEFORE INSTALL] set agent.monitoring.enabled: false in the elastic-agent.yml after extracting the tar.gz, then install
  • [AFTER INSTALL] set agent.monitoring.enabled: false in the elastic-agent.yml at /opt/Elastic/Agent and trigger an elastic-agent restart.

In both cases, also ensure you disable Collect agent metrics in the Elastic Agent Policy assigned to the Elastic Agent in Fleet UI. The local config overrides the policy setting anyway.
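
As a rough, untested sketch of the setting described above, the relevant part of elastic-agent.yml could look like the following. It assumes the nested form is equivalent to the dotted agent.monitoring.enabled key and that it is merged into any agent: block already present in the file; nothing beyond the key named above is implied.

```yaml
# elastic-agent.yml (sketch: only the key discussed above is shown)
agent:
  monitoring:
    # Disables agent monitoring, which per the workaround above
    # is what stops the agent from using the 6791/http port.
    enabled: false
```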

In case you want to change the 6791/http (monitoring) port, you have only 2 options:

  • [BEFORE INSTALL] set agent.monitoring.http.port: <preferred port number> in the elastic-agent.yml after extracting the tar.gz
  • [AFTER INSTALL] set agent.monitoring.http.port: <preferred port number> in the elastic-agent.yml at /opt/Elastic/Agent and re-enroll the Elastic Agent. WARNING: you will likely lose the state/registry of Elastic Agent, leading to possible data loss or duplicates. The Elastic Agent will temporarily appear twice in the Fleet UI in Kibana.

Once the port setting is effective for the Elastic Agent, subsequent upgrades should preserve it.

In this case, ensure you enable Collect agent metrics in the Elastic Agent Policy assigned to the Elastic Agent in Fleet UI.
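
For illustration only, the port change described above might look like this in elastic-agent.yml. The value 6792 is an arbitrary placeholder standing in for <preferred port number>, and the nested layout is assumed to be equivalent to the dotted agent.monitoring.http.port key.

```yaml
# elastic-agent.yml (sketch: only the key discussed above is shown)
agent:
  monitoring:
    http:
      # 6792 is an arbitrary example value, not a recommended port.
      port: 6792
```

Per the options above, after installation this only takes effect once the Elastic Agent is re-enrolled.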

Monitoring port listens on all interfaces

It is also a known issue that, by default, the monitoring port listens on all interfaces, not just localhost. This is being tracked via elastic/elastic-agent#2509.
As a workaround, it is possible to set the following configuration and use the same strategies as changing the port detailed in the previous paragraph.

agent.monitoring:
  http:
    enabled: true 
    host: localhost 
    port: 6791

How to change the 6789/grpc (management) port

In case you want to change the 6789/grpc (management) port:

  • [BEFORE INSTALL] set agent.grpc.port: <preferred port number> in the elastic-agent.yml. Then install/enroll.
  • [AFTER INSTALL] set agent.grpc.port: <preferred port number> in the elastic-agent.yml at /opt/Elastic/Agent. Then trigger an elastic-agent restart.

You can confirm the ports listening using netstat or lsof -i.
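
A minimal sketch of the management port setting above, again untested: 6790 is an arbitrary placeholder for <preferred port number>, and the nested layout is assumed to be equivalent to the dotted agent.grpc.port key.

```yaml
# elastic-agent.yml (sketch: only the key discussed above is shown)
agent:
  grpc:
    # 6790 is an arbitrary example value; the port discussed above is 6789.
    port: 6790
```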

@jerrac

jerrac commented Sep 11, 2023

> [BEFORE INSTALL] set agent.monitoring.http.port: <preferred port number> in the elastic-agent.yml after extracting the tar.gz

Just to clarify, you mean the elastic-agent.yml in the extracted directory. Not manually creating one in /opt/Elastic/Agent before the installation occurs. Right?

Long term, edits to elastic-agent.yml for settings not covered by Kibana/Fleet really should apply. Even if set after installation. Is there an issue (I'll go look myself in a bit) covering progress on making that happen?

Is there any movement on making the topic of this issue happen? As in making Elastic Agent fully configurable through Kibana/Fleet UI?

The workaround given is not exactly a stellar user interface. I'd only need to do it on maybe 10 or so VMs. I can't imagine working somewhere larger and needing to make it happen on hundreds. If it was just adding some config to elastic-agent.yml and restarting it, a small ad-hoc Ansible playbook would do. But since I have to un-enroll, edit, and re-enroll, I'd need to figure out how to securely pass the enrollment token around, and how to make sure the right token for the right policy goes to the right VM. Doable, I think, but enough extra work that I haven't done it yet.

@lucabelluccini
Contributor

I think this boils down to having a richer, structured Elastic Agent config deployable by Fleet Server through a policy.
What I've shared at #153950 (comment) is absolutely a workaround (hence the disclaimer).
I think your suggestion is great and I think Fleet/Elastic Agent team will value your feedback too.

@pierrehilbert
Contributor Author

Ping @nimarezainia to ensure that this is on your radar.

@nimarezainia
Contributor

> Long term, edits to elastic-agent.yml for settings not covered by Kibana/Fleet really should apply. Even if set after installation. Is there an issue (I'll go look myself in a bit) covering progress on making that happen?

@jerrac this would break the configuration model for Fleet managed agents. Fleet here (and its policies) are a configuration source of truth. If we allow changes to the configuration on the agent, it will quickly drift from the source of truth and end up with agents in a policy that are not behaving the same.

We will address this issue properly by adding the configurations to the policy.

@nimarezainia
Contributor

@amitkanfer @kpollich another use case for that advanced config conversation.

@jerrac

jerrac commented Sep 14, 2023

> this would break the configuration model for Fleet managed agents. Fleet here (and its policies) are a configuration source of truth. If we allow changes to the configuration on the agent, it will quickly drift from the source of truth and end up with agents in a policy that are not behaving the same.

I get that, even mostly agree with it, but the fact this issue exists makes me think I'd rather risk the configuration drift than not be able to actually use the tool at all.

That said, it sounds like there is work going on to make sure Fleet can manage the settings properly. Hopefully that will get everything under one hood and we won't end up with some settings only managed via yaml, and others via the UI. So I'll just look forward to seeing the results of that. :)

@jerrac

jerrac commented Apr 8, 2024

@nimarezainia Is there any progress on making the monitoring port configurable? As well as the rest of the settings that used to be configured in the yaml file?

@nimarezainia
Contributor

@jerrac No I am sorry I don't have an update at this point. There are other higher priorities on the roadmap but we will get to this in due course.

@jerrac

jerrac commented Aug 25, 2024

Um, so, this issue is about 15 months old. At what point will this get addressed? Is the number of people affected by this really so low that it can linger for that long? That'd explain the delay, still leave me frustrated, but it'd explain it...

@nimarezainia
Contributor

@jerrac Yes, there are higher priority issues that we are spending time on.
It sounds like you were just looking to change the monitoring port, correct? If so, that capability is available via a setting in the agent policy:
image

@kpollich kpollich removed their assignment Aug 29, 2024
@kpollich
Member

This is scheduled for delivery in 8.16.0.

@kpollich kpollich added QA:Needs Validation Issue needs to be validated by QA v8.16.0 labels Aug 30, 2024
@jen-huang
Contributor

@nimarezainia 2 Qs:

  1. Should the existing HTTP monitoring endpoint advanced setting be removed in favor of this new complete monitoring configuration UI? (I think yes)
  2. HTTP monitoring can be enabled independently of logs/metrics monitoring, correct? The example config leads me to believe we should have a general Enable monitoring toggle that then enables the rest of the monitoring options (logs, metrics, http, and the subsequent advanced options). If you agree, I can play around with what the UI looks like here.

@nimarezainia
Contributor

> @nimarezainia 2 Qs:
>
> 1. Should the existing `HTTP monitoring endpoint` advanced setting be removed in favor of this new complete monitoring configuration UI? (I think yes)
>
> 2. HTTP monitoring can be enabled independently of logs/metrics monitoring, correct? The example config leads me to believe we should have a general `Enable monitoring` toggle that then enables the rest of the monitoring options (logs, metrics, http, and the subsequent advanced options). If you agree, I can play around with what the UI looks like here.

@jen-huang the mock-up probably predates the liveness changes. But yes, the short answer is to have the more comprehensive set of options. It would probably be a bit weird to have an "advanced" expandable under the Agent Monitoring given that we now have an Advanced section further below.

One option would be to have an expandable section in the advanced settings say titled "Advanced Agent Monitoring" and have all the options under that. How does that sound?

(also as a tangent, could we have the whole "Advanced Settings" section of the policy as an expandable? )

@jen-huang
Contributor

jen-huang commented Sep 18, 2024

> It would probably be a bit weird to have an "advanced" expandable under the Agent Monitoring given that we now have an Advanced section further below.
>
> One option would be to have an expandable section in the advanced settings say titled "Advanced Agent Monitoring" and have all the options under that. How does that sound?

@nimarezainia The existing advanced settings at the bottom are built from config and therefore the UI is not very flexible. I opted for the original option of adding Advanced monitoring options under the existing logs and metrics collection and tweaked the field placements, see PR for screenshot: #193361.

The UI was also informed by my conversation with @cmacknz where we clarified that:

  • HTTP monitoring endpoint can be enabled independently of Beats (logs and metrics) collection so there is no need to disable the new advanced options if they are not being collected
  • The diagnostics rate limiting and file upload options are independent of Beats and HTTP monitoring so there is no need to conditionally disable them either
  • pprof.enabled only controls if /debug/pprof endpoint is enabled (it previously used to control the endpoint + profiling collection in diagnostics), so I have put this in-line with the other HTTP UI fields

Let me know if you have any questions or concerns. Craig, please correct me if I misunderstood any of the above.

@nimarezainia
Contributor

Brilliant, @jen-huang. Looks great.

On a tangential ask: can the Advanced Settings section (bottom of the agent policy) be put behind an expandable as well, hidden from the average user? While we are at it :-). Or we can open another issue for that.

@jen-huang
Contributor

@nimarezainia Would be great to have a separate small issue for that. I can pick that up separately.

@zez3

zez3 commented Sep 19, 2024

@jen-huang while you are at it, would it be possible to also change the custom processors text box so that it's expandable/resizable?

Thank you

@nimarezainia
Contributor

> @jen-huang while you are at it, would it be possible to also change the custom processors text box so that it's expandable/resizable?
>
> Thank you

@zez3 please open an enhancement issue for that. thank you.

@zez3

zez3 commented Sep 19, 2024

> @zez3 please open an enhancement issue for that. thank you.

@nimarezainia & @jen-huang
#193387

Thank you

@jen-huang
Contributor

@nimarezainia During my testing I found that http.buffer.enabled: true, which is supposed to enable a /buffer endpoint, does not work. Craig said that it hasn't ever worked and there aren't any plans currently to make it work. I'm going to remove this from the UI for now. It should be trivial to re-introduce if we ever decide to fix it.

@nimarezainia
Contributor

> @nimarezainia During my testing I found that http.buffer.enabled: true, which is supposed to enable a /buffer endpoint, does not work. Craig said that it hasn't ever worked and there aren't any plans currently to make it work. I'm going to remove this from the UI for now. It should be trivial to re-introduce if we ever decide to fix it.

@cmacknz can we just remove it from the filebeat config as well? as in deprecate

10 participants