Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Elastic Agent] Add a new Custom Filestream integration #5370

Open
nimarezainia opened this issue Aug 28, 2024 · 22 comments · May be fixed by elastic/integrations#11332
Open

[Elastic Agent] Add a new Custom Filestream integration #5370

nimarezainia opened this issue Aug 28, 2024 · 22 comments · May be fixed by elastic/integrations#11332
Assignees
Labels
Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

Comments

@nimarezainia
Copy link
Contributor

Our current Custom log integration is based on the log input which is now being deprecated. We therefore need to introduce a Custom Filestream integration so users can migrate over to using this integration instead.

@nimarezainia nimarezainia self-assigned this Aug 28, 2024
@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Aug 28, 2024
@elasticmachine
Copy link
Contributor

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@ycombinator
Copy link
Contributor

@nimarezainia Should this issue be moved to the https://github.com/elastic/integrations/ repo as that's where the custom log integration is defined?

@nimarezainia
Copy link
Contributor Author

@nimarezainia Should this issue be moved to the https://github.com/elastic/integrations/ repo as that's where the custom log integration is defined?

No I believe that this is owned by the data plane team. (fyi @flexitrev )

@nimarezainia
Copy link
Contributor Author

Linking these here for record keeping:

elastic/integrations#4609
https://github.com/elastic/ingest-dev/issues/1384

@belimawr
Copy link
Contributor

belimawr commented Sep 9, 2024

Could we migrate the Custom Logs integration to use Filestream instead? We have the take-over mode in Filestream that is capable of getting the state from a file being harvested by the Log input and use it in the Filestream input, the only thing missing is an automated way to convert the configuration (specially the custom configuration field). It might also need some extra work because the way the Elastic-Agent divides the integrations per input type when starting component.s

Just adding a new integration might make it harder to migrate the state and confusing for users finding both when searching.

@cmacknz
Copy link
Member

cmacknz commented Sep 9, 2024

Could we migrate the Custom Logs integration to use Filestream instead?

We could, but it is a lot more work, in addition to what you mentioned it is blocked by:

Creating a new integration at least stops new instances of the deprecated input being used with low effort. The migration problem should still be solved but it doesn't need to block creating a filestream based integration given how long solving migration correctly is likely to take.

@nimarezainia nimarezainia removed their assignment Sep 17, 2024
@nimarezainia
Copy link
Contributor Author

@belimawr would you be merging this PR again as part of this issue?

@belimawr
Copy link
Contributor

@belimawr would you be merging this PR again as part of this issue?

Do you mean reverting it? That's an option and it's an interesting starting point. My first idea was to just copy the Custom Logs integration, edit it to use filestream and update the docs, but I can start from reverting this PR as well.

@jlind23
Copy link
Contributor

jlind23 commented Sep 30, 2024

elastic/integrations#4609 was using a package spec version of 1.0.0 so there might be some changes required.

@belimawr
Copy link
Contributor

belimawr commented Oct 4, 2024

Folks, I create a draft PR, there are I'd like your opinion on (@nimarezainia, @jlind23)
1. Should we expose file identities other than fingerprint?
File identity docs: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-filestream.html#_file_identity_2

Currently I'm using fingerprint as the only option and allowing to configure the offset and length. I believe that is the best option because this is the file identity we implemented to solve the issues we were having with inode reuse and the one we recommend to our users.

The reason I did not add any of the other ones is because while I can define a variable/input field as a selection from a list, the template language used is somehow limited and there is no way to use this selection values to render the correct policy. Having all options a boolean/toggle is very confusing because only one can be used at a time. I asked on slack for some suggestions/help, if there is a way

2. Which name do we want for the integration?
Currently it's "Custom Filestream Logs" (from "reverting" and updating the PR removing it), however having "Custom logs" and "Custom Filestream Logs" can be confusing... Even the description is not very helpful in answering which one should be used.
Image

I was thinking about something like "Custom Logs V2","Custom Logs enhanced", something that makes it clear this one should be used instead of the "Custom Logs".

3. Is there any preference for the initial version?
Currently I put 0.0.1 because it is the very first release, the old one had 1.0.0 as the initial release. I'm not 100% sure what is the standard we use for integrations.

4. When do we need to release the first version?
I'm asking because if question 1., one way to solve is is to get a small change in Kibana to add template helpers that will allow me to implement the logic I need. my JS foo is rusty, but I could even give it a shot, but I don't know how long it would take to get the change in and it will make the integration require the newest version of Kibana, which might not be ideal.

@cmacknz
Copy link
Member

cmacknz commented Oct 4, 2024

Currently I'm using fingerprint as the only option and allowing to configure the offset and length. I believe that is the best option because this is the file identity we implemented to solve the issues we were having with inode reuse and the one we recommend to our users.

If people want to use the take over mode to migrate from custom logs one day, do we want to give them the option to do that with the same file identity? I think this might actually be a requirement of the take over mode (once it actually works for Elastic Agent).

@jlind23
Copy link
Contributor

jlind23 commented Oct 7, 2024

Currently it's "Custom Filestream Logs" (from "reverting" and updating the elastic/integrations#4609 removing it), however having "Custom logs" and "Custom Filestream Logs" can be confusing... Even the description is not very helpful in answering which one should be used.

IIRC we spoke with @nimarezainia and @strawgate about renaming "Custom logs" to "Custom logs (Legacy)" so naming this one "Custom Filestream Logs" would make sense.

@belimawr
Copy link
Contributor

belimawr commented Oct 7, 2024

Currently it's "Custom Filestream Logs" (from "reverting" and updating the elastic/integrations#4609 removing it), however having "Custom logs" and "Custom Filestream Logs" can be confusing... Even the description is not very helpful in answering which one should be used.

IIRC we spoke with @nimarezainia and @strawgate about renaming "Custom logs" to "Custom logs (Legacy)" so naming this one "Custom Filestream Logs" would make sense.

Would this rename happen at the same time/version where we release the Custom Filestream logs?

@belimawr
Copy link
Contributor

belimawr commented Oct 7, 2024

Currently I'm using fingerprint as the only option and allowing to configure the offset and length. I believe that is the best option because this is the file identity we implemented to solve the issues we were having with inode reuse and the one we recommend to our users.

If people want to use the take over mode to migrate from custom logs one day, do we want to give them the option to do that with the same file identity? I think this might actually be a requirement of the take over mode (once it actually works for Elastic Agent).

That makes sens. Then we need to find the best way to help users not to create an invalid configuration for their Filestream integration. I'll look into this.

@jlind23
Copy link
Contributor

jlind23 commented Oct 7, 2024

Would this rename happen at the same time/version where we release the Custom Filestream logs?

Once the Custom Filestream logs integration is published we would have to create a new PR that will update the Custom Log title, yes.

@nimarezainia
Copy link
Contributor Author

Would this rename happen at the same time/version where we release the Custom Filestream logs?

Once the Custom Filestream logs integration is published we would have to create a new PR that will update the Custom Log title, yes.

@belimawr
Yes this can happen at some other point. the idea is that at some point the whole "Custom Log" integration will be deprecated (and that would show in the title).

Regarding configurations available: all the options available to configure filestream should be configurable. e need to accommodate a user migrating from Custom Logs integration to Custom Filestream Logs.

@jlind23 jlind23 linked a pull request Oct 16, 2024 that will close this issue
5 tasks
@belimawr
Copy link
Contributor

If people want to use the take over mode to migrate from custom logs one day, do we want to give them the option to do that with the same file identity? I think this might actually be a requirement of the take over mode (once it actually works for Elastic Agent).

I did some testing and looked at the code for the take over mode, currently it only works if the file identity used is the same for the log and filestream inputs.

Regarding configurations available: all the options available to configure filestream should be configurable. e need to accommodate a user migrating from Custom Logs integration to Custom Filestream Logs.

@nimarezainia, I made all configurations available, however for the file_identity I only added an option for fingerprint. My reasoning is:

  • The fingerprint is our current recommended file identity, it solves the inode reuse from the native file identity and is much more versatile than the path.
  • Only one file identity can be enable
  • Our current policy template features does not allow for a drop down menu that can be used to implement the more complex template logic required by file identity
  • I'm having a hard time thinking about any scenario where the fingerprint file identity cannot be used/replace the others.

Here is how this part of the configuration looks like:
Image

@nimarezainia
Copy link
Contributor Author

@belimawr don't think about it too much :-) - file identity is fine. We can capture requirements if they come up.

@strawgate
Copy link
Contributor

The fingerprint is our current recommended file identity, it solves the inode reuse from the native file identity and is much more versatile than the path.

The fingerprint identity will be enabled by default, right?

@belimawr
Copy link
Contributor

The fingerprint is our current recommended file identity, it solves the inode reuse from the native file identity and is much more versatile than the path.

The fingerprint identity will be enabled by default, right?

At the moment it's not on the draft PR, but that's a very good point, I'll enable it by default.

@mbudge
Copy link
Contributor

mbudge commented Nov 7, 2024

https://www.elastic.co/guide/en/beats/filebeat/current/multiline-examples.html

Will the parsers section be included to configure multiline messages?

@belimawr
Copy link
Contributor

belimawr commented Nov 7, 2024

https://www.elastic.co/guide/en/beats/filebeat/current/multiline-examples.html

Will the parsers section be included to configure multiline messages?

Yes, the parsers will be there, the goal is to support all configuration options from Filestream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants