
Inconsistent Fleet Behavior when Removing Configurations from kibana.yml for Managed Agent Policies #193407

Closed
eyalkraft opened this issue Sep 19, 2024 · 2 comments · Fixed by #193725 or #195377
Labels: bug (Fixes for quality problems that affect the customer experience) · Team:Fleet (Team label for Observability Data Collection Fleet team)

@eyalkraft (Contributor) commented:
Kibana version: latest (8.15)

Elasticsearch version: latest (8.15)

Describe the bug:
We observed inconsistent behavior in Fleet when removing configurations from kibana.yml, specifically for managed Agent policies.
First, we configured a Fleet Server host, a Fleet output, and a managed Agent policy (referencing that host and output) through kibana.yml. After starting Kibana, we removed these configurations from kibana.yml and restarted.
On restart, the Fleet Server host and Fleet output are deleted, but the managed Agent policy remains. This causes the Fleet UI to fail, since the Agent policy references a non-existent Fleet Server host and output.

Steps to Reproduce:

  1. Configure Kibana through kibana.yml with a managed Agent policy, Fleet Server host, and Fleet output.
  2. Remove the Fleet Server host, Fleet output, and Agent policy from kibana.yml.
  3. Stop and start Kibana so it re-reads kibana.yml.
  4. Observe that the Fleet UI fails due to the missing references in the managed Agent policy (a scripted check is sketched after the kibana.yml example below).
kibana.yml example

# Kibana configuration file

# Enabling agentless mode

xpack.cloud.serverless.project_id: 'some_fake_project_id'

xpack.securitySolutionServerless.productTypes:
  - product_line: security
    product_tier: complete
xpack.ml.nlp.enabled: true  # Enable NLP when security and complete are selected

# Fleet and agentless settings
xpack.fleet.enableExperimental:
  - agentless

xpack.fleet.packages:
  - name: "cloud_security_posture"
    version: "latest"

server.versioned.versionResolution: oldest

# Remove everything below

xpack.fleet.agentPolicies:
  - name: "Agentless"
    id: "agentless-policy"
    is_managed: true
    namespace: "default"
    fleet_server_host_id: "agentless-fleet-internal-host"
    data_output_id: "agentless-es-internal-output"
    monitoring_output_id: "agentless-es-internal-output"
    monitoring_enabled: ["logs", "metrics"]
    supports_agentless: true
    package_policies: []

xpack.fleet.fleetServerHosts:
  - id: "agentless-fleet-internal-host"
    name: "Agentless internal fleet server"
    is_default: false
    is_internal: true
    host_urls: ["https://internal-fleet-server-url/"]

xpack.fleet.outputs:
  - id: "agentless-es-internal-output"
    name: "Internal agentless output"
    type: "elasticsearch"
    is_default: false
    is_default_monitoring: false
    is_internal: true
    hosts: ["https://internal-es-url/"]

Expected Behavior:

  1. When a non-default Fleet Server host or output is removed from kibana.yml (and consequently deleted in Kibana), references to it should also be removed from managed Agent policies, just as already happens for non-managed Agent policies; the policies should fall back to the default Fleet Server host and output (see the sketch after this list).
  2. The Fleet UI should not crash if an Agent policy references a non-existent Fleet Server host or output. Instead, that specific Agent policy should display an error, while the remaining policies stay accessible and unaffected.
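
A minimal sketch of the fallback described in expectation 1. The types and function below are hypothetical illustrations, not Fleet's actual internals; it relies on the fact that an unset host/output id on a policy means "use the default" in Fleet:

// Hypothetical sketch — not Fleet's real code. Shows the expected fallback:
// managed policies referencing a deleted preconfigured host/output should be
// reset so they pick up the defaults again.
interface AgentPolicy {
  id: string;
  is_managed: boolean;
  fleet_server_host_id?: string;
  data_output_id?: string;
  monitoring_output_id?: string;
}

function resetDanglingReferences(
  policy: AgentPolicy,
  deletedHostIds: Set<string>,
  deletedOutputIds: Set<string>,
): AgentPolicy {
  const next = { ...policy };
  // Unsetting the id makes Fleet fall back to the default Fleet Server host.
  if (next.fleet_server_host_id && deletedHostIds.has(next.fleet_server_host_id)) {
    next.fleet_server_host_id = undefined;
  }
  // Same fallback for the data and monitoring outputs.
  if (next.data_output_id && deletedOutputIds.has(next.data_output_id)) {
    next.data_output_id = undefined;
  }
  if (next.monitoring_output_id && deletedOutputIds.has(next.monitoring_output_id)) {
    next.monitoring_output_id = undefined;
  }
  return next;
}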

Screenshots (if relevant):
[Three screenshots in the original issue showing the Fleet UI failure.]

Provide logs and/or server output (if relevant):


[2024-09-19T13:34:13.956+03:00][INFO ][status.plugins.fleet] fleet plugin is now available: Fleet setup failed
GET kbn:/api/fleet/agent_policies

{
    "item": {
        "id": "agentless-policy",
        "version": "WzM4NSwxXQ==",
        "space_ids": [],
        "monitoring_enabled": [
            "logs",
            "metrics"
        ],
        "inactivity_timeout": 1209600,
        "is_preconfigured": true,
        "data_output_id": "agentless-es-internal-output",
        "monitoring_output_id": "agentless-es-internal-output",
        "fleet_server_host_id": "agentless-fleet-internal-host",
        "schema_version": "1.1.1",
        "package_policies": [],
        "agents": 0,
        "namespace": "default",
        "name": "Agentless",
        "supports_agentless": true,
        "status": "active",
        "is_managed": true,
        "revision": 2,
        "updated_at": "2024-09-19T10:04:32.766Z",
        "updated_by": "system",
        "is_protected": false,
        "unprivileged_agents": 0
    }
}

Any additional context:

This bug was discovered as part of a planned migration of agentless on serverless from using a preconfigured (kibana.yml) agent policy to using the agentless API.

cc @kpollich @nchaulet

@elasticmachine (Contributor) commented:
Pinging @elastic/fleet (Team:Fleet)

@amirbenun (Contributor) commented:

I see that this was fixed for preconfigured outputs. However, I still get the same behavior for preconfigured Fleet Server hosts:
[Screenshot in the original comment showing the same failure.]
