feat: add lines skipped metric to pattern ingesters #14997
Reasons for skipping:
- too few tokens
- too many tokens
- line too long
}
return nil
}
if len(tokens) > 80 {
80 is the value adaptive logs uses, so it seems reasonable to do the same. The difference is that they truncate the tokens slice at 80, whereas we drop the line entirely. My reason is the integration between pattern ingester patterns and pattern search in Explore Logs: searching by a truncated set of tokens won't yield the same results unless we know the pattern was truncated and insert a wildcard at the end of it.
LGTM! We might have to tweak the 80 value judging by the tests as I think these tokenizers generate more tokens than the adaptive logs ones do.
I'm happy to judge that once it has rolled out - we might need to make this a per-tenant config eventually.
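If the threshold does become per-tenant, the lookup could look roughly like the sketch below. The `tenantLimits` type, its field, and the tenant IDs are all hypothetical; this does not reflect any existing limits or overrides API in the codebase.

```go
package main

import "fmt"

// defaultMaxTokens is the fallback, matching the 80 used in this PR.
const defaultMaxTokens = 80

// tenantLimits is a hypothetical per-tenant override table.
type tenantLimits struct {
	maxTokensByTenant map[string]int
}

// maxTokens returns the tenant's override if one is set and positive,
// otherwise the default.
func (l tenantLimits) maxTokens(tenant string) int {
	if v, ok := l.maxTokensByTenant[tenant]; ok && v > 0 {
		return v
	}
	return defaultMaxTokens
}

// shouldSkip reports whether a tokenized line exceeds the tenant's limit.
func shouldSkip(l tenantLimits, tenant string, tokens []string) bool {
	return len(tokens) > l.maxTokens(tenant)
}

func main() {
	limits := tenantLimits{maxTokensByTenant: map[string]int{"tenant-a": 120}}
	tokens := make([]string, 100)

	fmt.Println(shouldSkip(limits, "tenant-a", tokens)) // override of 120 applies
	fmt.Println(shouldSkip(limits, "tenant-b", tokens)) // falls back to 80
}
```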
pkg/pattern/drain/drain.go
Outdated
return nil
}
if len(tokens) > 80 {
print(tokens)
debug leftovers
lgtm
What this PR does / why we need it:
This adds a `lines_skipped` metric to the pattern ingesters, which counts log lines that have been skipped for pattern ingestion. This also adds logic to skip lines with too many (> 80) tokens. A line is skipped for one of three reasons: too few tokens, too many tokens, or line too long.
Which issue(s) this PR fixes:
Fixes #14882
Special notes for your reviewer: