feat: Support search logs by timestamp for structured and unstructured logs. #42
base: main
std::upper_bound assumes the range is sorted, right? What would happen if the log events aren't in ascending timestamp order? It's technically not impossible and we have seen log files where that's true.
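For reference, a minimal sketch of the kind of lookup under discussion, assuming the goal is "the last event whose timestamp is at or before the target". The `LogEvent` struct and the name `find_nearest_sorted` are hypothetical and not taken from the PR. `std::upper_bound` requires the range to be partitioned with respect to the target, which a range sorted in ascending timestamp order satisfies; on unsorted input the result is not meaningful.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a log event; only the timestamp matters here.
struct LogEvent {
    std::int64_t timestamp;
};

// Returns the index of the last event whose timestamp is <= target.
// If every event is later than target, returns index 0.
// Returns std::nullopt only when there are no events.
// Precondition: events are sorted by timestamp in ascending order.
std::optional<std::size_t> find_nearest_sorted(
        std::vector<LogEvent> const& events,
        std::int64_t target
) {
    if (events.empty()) {
        return std::nullopt;
    }
    // First event whose timestamp is strictly greater than target.
    auto const it = std::upper_bound(
            events.begin(),
            events.end(),
            target,
            [](std::int64_t ts, LogEvent const& event) { return ts < event.timestamp; }
    );
    if (it == events.begin()) {
        return std::size_t{0};
    }
    return static_cast<std::size_t>(it - events.begin()) - 1;
}
```

With timestamps 10, 20, 20, 30 and a target of 20, this returns index 2 (the last event at timestamp 20).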
@junhaoliao iirc, I was told to assume the log events are in ascending order. If that's not the case, this whole search function won't work.
In that case, we would have to sort all of the log events, because that's the only way to fully guarantee they are sorted. And we (might?) also need a map to remember their original locations.
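A rough sketch of the "sort plus a map back to the original location" idea, assuming timestamps are available as a plain vector. The function name `sorted_index_map` is hypothetical. Instead of moving the events, it sorts a vector of indices, so the result doubles as the map from sorted position to original position.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Returns a permutation where element i is the original index of the event
// with the i-th smallest timestamp. Events with equal timestamps keep their
// original relative order because std::stable_sort is used.
std::vector<std::size_t> sorted_index_map(std::vector<std::int64_t> const& timestamps) {
    std::vector<std::size_t> sorted_to_original(timestamps.size());
    // Start with the identity mapping 0, 1, 2, ...
    std::iota(sorted_to_original.begin(), sorted_to_original.end(), std::size_t{0});
    std::stable_sort(
            sorted_to_original.begin(),
            sorted_to_original.end(),
            [&timestamps](std::size_t lhs, std::size_t rhs) {
                return timestamps[lhs] < timestamps[rhs];
            }
    );
    return sorted_to_original;
}
```

This costs O(n log n) time and O(n) extra memory per collection, which is the overhead being weighed in this thread.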
Since the case with out-of-order timestamps is rare (as far as we know), we could maybe do something like:
[...] find_nearest_log_event_by_timestamp. Otherwise, we use the implementation you have here.
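The steps of this suggestion did not survive in the thread, so the following is only one possible reading: record once whether the timestamps are sorted, then use a linear scan when they are not and the binary search otherwise. All names here (`find_sorted`, `find_linear`, `find_nearest`) are hypothetical, and both paths are assumed to return the position of the latest timestamp at or before the target.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using Timestamps = std::vector<std::int64_t>;

// Binary search; only valid when ts is sorted in ascending order.
std::optional<std::size_t> find_sorted(Timestamps const& ts, std::int64_t target) {
    auto const it = std::upper_bound(ts.begin(), ts.end(), target);
    if (it == ts.begin()) {
        return std::nullopt;  // Every timestamp is later than target.
    }
    return static_cast<std::size_t>(it - ts.begin()) - 1;
}

// Linear fallback; works for any order. Picks the greatest timestamp that is
// <= target, preferring the last position on ties.
std::optional<std::size_t> find_linear(Timestamps const& ts, std::int64_t target) {
    std::optional<std::size_t> best;
    for (std::size_t i = 0; i < ts.size(); ++i) {
        if (ts[i] <= target && (!best.has_value() || ts[i] >= ts[*best])) {
            best = i;
        }
    }
    return best;
}

// is_sorted is assumed to be computed once, e.g. with std::is_sorted right
// after the events are loaded, rather than on every query.
std::optional<std::size_t>
find_nearest(Timestamps const& ts, bool is_sorted, std::int64_t target) {
    return is_sorted ? find_sorted(ts, target) : find_linear(ts, target);
}
```

The one-time check is O(n), and queries stay O(log n) in the common sorted case. Whether the flag remains valid when the active filter changes the collection is not addressed by this sketch.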
If that's the case, can we divide the log events into chunks at these timestamp outliers, search within the proper chunk, and finally iterate over those outliers?
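A sketch of a simplified variant of the chunking idea, under the assumption that a "chunk" is a maximal run of non-decreasing timestamps. Rather than picking one chunk and handling outliers separately, it binary-searches every run and keeps the best candidate. The name `find_nearest_in_runs` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Returns the position of the greatest timestamp that is <= target, or
// std::nullopt if there is none. Works for any input order.
std::optional<std::size_t>
find_nearest_in_runs(std::vector<std::int64_t> const& ts, std::int64_t target) {
    std::optional<std::size_t> best;
    auto run_begin = ts.begin();
    while (run_begin != ts.end()) {
        // End of the longest sorted prefix starting at run_begin. For a
        // non-empty range this is always past run_begin, so the loop advances.
        auto const run_end = std::is_sorted_until(run_begin, ts.end());
        auto const it = std::upper_bound(run_begin, run_end, target);
        if (it != run_begin) {
            // Last element in this run with a timestamp <= target.
            auto const idx = static_cast<std::size_t>((it - 1) - ts.begin());
            if (!best.has_value() || ts[idx] >= ts[*best]) {
                best = idx;
            }
        }
        run_begin = run_end;
    }
    return best;
}
```

This version rescans for run boundaries on every call, which is O(n). A real implementation would presumably cache the boundaries, and they would need recomputing whenever the filtered collection changes.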
Would that be straightforward to do given that the log events collection changes based on the active filter? If not, then given how rare the case is (as far as we know), I'm not sure the extra code complexity is worth it.
In an offline discussion with @junhaoliao, we agreed that we can document the assumption and neglect this corner case.
By neglecting this corner case, we avoid spending resources on verifying that log events are in chronological order.
💡 Codebase verification
Attention: Log events ordering may not be enforced.
Our investigation shows that the implementation of generic_find_nearest_log_event_by_timestamp relies on log events being in ascending timestamp order, yet searches for sorting or ordering logic (e.g. explicit calls to std::sort or other ordering mechanisms) produced no evidence that this order is enforced during deserialization or insertion. This could potentially lead to incorrect log event lookups if the input sequence isn't already sorted.
[...] emplace_back without explicit ordering.
🔗 Analysis chain
Verify timestamp ordering assumption.
The implementation assumes log events are sorted by timestamp in ascending order. This assumption should be verified as log files might not always maintain this order.
Run this script to check if timestamps are always in ascending order:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
Length of output: 115
Script:
Length of output: 1721