Add usage info for dead letter queues #7323
Conversation
@robbavey can you review this?
=== Dead Letter Queues

//REVIEWERS: I had to install logstash-input-dead_letter_queue. Is it not bundled with the alpha2 release?
I think it will be installed with alpha3; I don't think it was added to alpha2 - @suyograo?
Yeah, it was not bundled in alpha3. @dedemorton you can get it off the latest 5.5 BC as well; it is included in there.
Verified that the plugin is available in alpha3
of the Logstash plugins.

//REVIEWERS: I feel like we have to say something here ^^ but I'm not sure if this is enough info. How will users be able to tell if a specific output supports DLQs? Do we have a plan for when/how we will add DLQ support to plugins that we support?
Yeah, initially we've targeted the ES output, but I agree we can link back to the plugin docs, which should state explicit support. In this case, we should add docs in the ES output to mention what exactly gets DLQ'd. I think it's OK to have this in both places, but we can just cross-reference.
OK, I'll add something to indicate that users can check the plugin docs for DLQ support. For now, I'll mention that ES is the only output supported, and remove that bit when we have support in a few other outputs.
//REVIEWERS: It sounds like there might be some performance implications wrt enabling DLQs. If so, what are they? Should we document the restrictions?
No, I've created #7404 to track this
I don't think performance should be called out here at all. We can always improve performance, but the important point of DLQ is the resiliency it brings in. User docs should not highlight a performance impact from DLQ. If the DLQ is super slow that's a bug we can fix IMO.
but +1 to measuring it, as @robbavey pointed out.
<1> The path to the directory containing the dead letter queue. This is either
the default, `path.data/dead_letter_queue`, or the value specified for
`path.dead_letter_queue` in the `logstash.yml` file.
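As an aside, a minimal sketch of overriding the default location in `logstash.yml` might look like the following; the directory shown is illustrative, not a required location:

```yaml
# Store dead letter queue files outside path.data (illustrative path)
path.dead_letter_queue: /var/lib/logstash/dead_letter_queue
```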
I don't know if we need to call out that each pipeline with a plugin that supports DLQ output maintains its own DLQ folder underneath this directory; this folder refers to the 'top-level' DLQ folder.
Or whether this fact would muddy the waters somewhat...
I'll add something about this and see how it works. The main point here is to help users understand how they can find out where the files reside.
the dead letter queue on or after June 6, 2017, at 23:40:37.

//REVIEWERS: It's not clear to me what happens when the user configures start_timestamp and commit_offsets is true. If an offset's been committed, will the pipeline start reading at the offset, or go by the timestamp specified in start_timestamp?
Reading the code, the timestamp overrides the value of the offset.
It's an interesting thought as to whether this is the desired behaviour: restarting Logstash would see the events replayed from the earlier timestamp, even if offsets had been stored for events past the starting timestamp. That might cause some confusion. Thoughts @suyograo, @acchen97?
Yeah, this one is tricky. I think `start_timestamp` should honor the committed offset and should restrict the range based on the timestamp after the commit point.
@robbavey you are right that the current behavior will confuse users. We should open an issue.
I've created logstash-plugins/logstash-input-dead_letter_queue#6 to track this.
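To make the proposed semantics concrete, here is a small hypothetical Python sketch (the names `Entry` and `resolve_start` are illustrative, not the plugin's real API): a committed offset takes precedence, and `start_timestamp` only narrows the range past the commit point.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    offset: int
    timestamp: float  # epoch seconds when the event entered the DLQ

def resolve_start(entries, committed_offset=None, start_timestamp=None):
    """Yield entries to replay under the proposed semantics:
    skip anything at or before the committed offset, then apply
    start_timestamp as an additional lower bound."""
    for e in entries:
        if committed_offset is not None and e.offset <= committed_offset:
            continue  # already processed; honor the commit point
        if start_timestamp is not None and e.timestamp < start_timestamp:
            continue  # outside the requested time range
        yield e

entries = [Entry(0, 100.0), Entry(1, 200.0), Entry(2, 300.0)]
# With an offset committed through entry 1, only entry 2 replays,
# even though start_timestamp=150.0 alone would re-include entry 1.
replayed = list(resolve_start(entries, committed_offset=1, start_timestamp=150.0))
```

Under the current (confusing) behavior described above, the timestamp alone would win and entry 1 would be replayed again.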
@dedemorton The latest 5.5 should have most of the fixes if you want to re-test this feature.
This should be ready for a final review. Does anyone have a better example to use for the DLQ event? Something a bit more real-world-y.
//TODO: Need a better example here.
[source,json]
-------------------------------------------------------------------------------
Update with actual input event.
It'd be nice to show the full event payload as it sits in the DLQ, including the reason/error metadata.
This is an event that I pulled out of the DLQ created by the ES plugin:
{
    "@timestamp" => 2017-06-16T20:47:02.510Z,
    "@metadata" => {
        "dead_letter_queue" => {
            "plugin_id" => "095bc531617b5a5fe5f632e85c9a5a03ddb204a5-2",
            "reason" => "Could not index event to Elasticsearch. status: 404, action: [\"update\", {:_id=>\"1\", :_index=>\"logstash-2017.06.16\", :_type=>\"logs\", :_routing=>nil, :_retry_on_conflict=>1}, #<LogStash::Event:0x31ac3ede>], response: {\"update\"=>{\"_index\"=>\"logstash-2017.06.16\", \"_type\"=>\"logs\", \"_id\"=>\"1\", \"status\"=>404, \"error\"=>{\"type\"=>\"document_missing_exception\", \"reason\"=>\"[logs][1]: document missing\", \"index_uuid\"=>\"FHQaa_FKRcyIB6ef8t9qFQ\", \"shard\"=>\"3\", \"index\"=>\"logstash-2017.06.16\"}}}",
            "entry_time" => #<Java::OrgLogstash::Timestamp:0x7e5e7c82>,
            "plugin_type" => "elasticsearch"
        }
    },
    "@version" => "1",
    "host" => "dev.local",
    "message" => "Update"
}
I can clean that up, but does that make sense?
@robbavey Yes. Can you clean up the formatting to be more "pretty print" style?
I confirmed there is no field called `original`. I may have misled @robbavey. The original event gets appended to the `Event` object at the top level.
Here is the updated config to deal with mapping error:
input {
  dead_letter_queue {
    path => "/path.to/data/dead_letter_queue/"
  }
}
filter {
  mutate {
    remove_field => "[geoip][location]"
  }
}
output {
  elasticsearch {}
}
This removes the offending field and indexes other fields that were present in the original.
Here's a before and after event:
In the DLQ, before processing:
{
"@metadata" => {
"dead_letter_queue" => {
"entry_time" => #<Java::OrgLogstash::Timestamp:0x5b5dacd5>,
"plugin_id" => "fb80f1925088497215b8d037e622dec5819b503e-4",
"plugin_type" => "elasticsearch",
"reason" => "Could not index event to Elasticsearch. status: 400, action: [\"index\", {:_id=>nil, :_index=>\"logstash-2017.06.22\", :_type=>\"logs\", :_routing=>nil}, 2017-06-22T01:29:29.804Z Suyogs-MacBook-Pro-2.local {\"geoip\":{\"location\":\"home\"}}], response: {\"index\"=>{\"_index\"=>\"logstash-2017.06.22\", \"_type\"=>\"logs\", \"_id\"=>\"AVzNayPze1iR9yDdI2MD\", \"status\"=>400, \"error\"=>{\"type\"=>\"mapper_parsing_exception\", \"reason\"=>\"failed to parse\", \"caused_by\"=>{\"type\"=>\"illegal_argument_exception\", \"reason\"=>\"illegal latitude value [266.30859375] for geoip.location\"}}}}"
}
},
"@timestamp" => 2017-06-22T01:29:29.804Z,
"@version" => "1",
"geoip" => {
"location" => "home"
},
"host" => "Suyogs-MacBook-Pro-2.local",
"message" => "{\"geoip\":{\"location\":\"home\"}}"
}
After processing:
{
"@metadata" => {
"dead_letter_queue" => {
"entry_time" => #<Java::OrgLogstash::Timestamp:0x59f3b191>,
"plugin_id" => "fb80f1925088497215b8d037e622dec5819b503e-4",
"plugin_type" => "elasticsearch",
"reason" => "Could not index event to Elasticsearch. status: 400, action: [\"index\", {:_id=>nil, :_index=>\"logstash-2017.06.22\", :_type=>\"logs\", :_routing=>nil}, 2017-06-22T01:29:29.804Z Suyogs-MacBook-Pro-2.local {\"geoip\":{\"location\":\"home\"}}], response: {\"index\"=>{\"_index\"=>\"logstash-2017.06.22\", \"_type\"=>\"logs\", \"_id\"=>\"AVzNayPze1iR9yDdI2MD\", \"status\"=>400, \"error\"=>{\"type\"=>\"mapper_parsing_exception\", \"reason\"=>\"failed to parse\", \"caused_by\"=>{\"type\"=>\"illegal_argument_exception\", \"reason\"=>\"illegal latitude value [266.30859375] for geoip.location\"}}}}"
}
},
"@timestamp" => 2017-06-22T01:29:29.804Z,
"@version" => "1",
"geoip" => {},
"host" => "Suyogs-MacBook-Pro-2.local",
"message" => "{\"geoip\":{\"location\":\"home\"}}"
}
Notice the `geoip` field is empty.
@suyograo @dedemorton - apologies. I ran that config that I pasted here through my local rig and it appeared to DTRT - an entry got added to ES after going through the DLQ.
yeah, no worries. I think it just ignored the `json` filter parsing because the `original` field was not found :) All good now.
[[configuring-dlq]]
==== Configuring Logstash to Use Dead Letter Queues

You enable dead letter queues by setting the `dead_letter_queue_enable` option
Maybe say it's off by default?
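For instance, a minimal sketch of enabling the feature in `logstash.yml`, using the option name as given in the doc text above and assuming it defaults to off:

```yaml
# Dead letter queues are assumed off by default; enable explicitly
dead_letter_queue_enable: true
```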
Dead letter queues are stored as files in the local directory of the Logstash
instance. By default, the dead letter queue files are stored in
`path.data/dead_letter_queue`. Each pipeline has a separate queue. For example,
Mentioning each pipeline has a separate queue only makes sense when we have multi-pipeline. Also, mentioning the "main" pipeline may be confusing to users until multi-pipeline is introduced. I'd suggest massaging this a bit.
This PR will get merged into master, too, which we use to build 6.0.0-alpha2, so let's keep the description. I can massage it for 5.5 after we've merged the PR.
==== Processing Events in the Dead Letter Queue

When you are ready to process events in the dead letter queue, you create a
pipeline that uses the `dead_letter_queue` input plugin to read from the dead
we should link to the DLQ input docs here
will do once the DLQ input docs are building.
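Pulling the thread's discussion together, here is a hedged sketch of such a replay pipeline. The path is illustrative, and the `commit_offsets` and `start_timestamp` options are the ones discussed above; check the plugin docs for the exact option names and the expected timestamp format:

```ruby
input {
  dead_letter_queue {
    # Illustrative top-level DLQ directory (the default is path.data/dead_letter_queue)
    path => "/var/lib/logstash/dead_letter_queue"
    # Remember position across restarts (see the offset-vs-timestamp caveat above)
    commit_offsets => true
    # Only replay events that entered the queue on or after this time (assumed ISO8601)
    start_timestamp => "2017-06-06T23:40:37"
  }
}
output {
  # metadata => true surfaces the [@metadata][dead_letter_queue] fields
  stdout { codec => rubydebug { metadata => true } }
}
```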
docs/static/resiliency.asciidoc
pipeline without interruption, Logstash provides the following data resiliency
features.

* <<persistent-queues>> protect against data loss by storing events in a message
maybe call this an "internal queue"
Left some comments. @suyograo one thing that's missing here is how to clean up/purge data from the DLQ. We should address that even if it's a manual process today.
@acchen97 we should punt on the manual cleanup instructions as of now. I expect the data volume for DLQ is gonna be low, so this should really not be a big deal. Also, we're already adding a retention feature for 5.6, so I'm a little wary of adding manual delete-these-files instructions. Can we see how it goes and re-evaluate?
(force-pushed from 78c0cf8 to 1bf8679)
@suyograo This is ready for a final review. I need to retest the config in the example, but I can update in a separate request, if necessary.
A couple of minor comments, but LGTM from my side.
information about the plugin that wrote the event, and the timestamp for when
the event entered the dead letter queue.

//REVIEWERS: Decided that it might be more useful to show the event in the section where we show how to process it.
Makes sense
output {
  stdout {
    codec => rubydebug
Can we use `stdout { codec => rubydebug { metadata => true } }` in this example? If you add `metadata => true`, LS will emit the metadata information, which is useful in the DLQ case.
(force-pushed from 1bf8679 to 11540b3)
First draft of content for DLQ. The index changes add a new top-level entry to the TOC. It looks like this: