Revise pcap parser file selection algorithm to eventually process 100% of the data #1022

Open
mattmathis opened this issue Sep 24, 2021 · 2 comments
Assignees
Labels
review/triage Team should review and assign priority

Comments

@mattmathis
Contributor

Revise the archive file selection algorithm for the pcap parser to rotate through all of the data in 10% batches.

Consider a hash-based selection:
if (HASH(filename) + epoch) % 10 == 0 { process file }
where epoch is incremented every time the pcap gardener reaches the end of the data, so each file is selected exactly once in every 10 consecutive epochs.
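
A minimal sketch of the hash-based selection proposed above, written in Python for illustration only; it is not the parser's actual code. The names `should_process`, `select_batch`, and `NUM_BATCHES`, the use of CRC32 as the hash, and the example filenames are all assumptions introduced here.

```python
import zlib

NUM_BATCHES = 10  # one tenth of the files per epoch, as in the proposal


def should_process(filename: str, epoch: int, num_batches: int = NUM_BATCHES) -> bool:
    """Return True if the file falls in the batch selected for this epoch."""
    # Assumption: zlib.crc32 stands in for HASH(). It is stable across runs,
    # unlike Python's built-in hash(), which is salted per process for strings.
    file_hash = zlib.crc32(filename.encode("utf-8"))
    return (file_hash + epoch) % num_batches == 0


def select_batch(filenames, epoch):
    """Return the files to process in the given epoch, preserving input order."""
    return [name for name in filenames if should_process(name, epoch)]


if __name__ == "__main__":
    # Hypothetical archive names, used only to exercise the selection.
    files = [f"example-archive-{i:04d}.tgz" for i in range(1000)]
    seen = set()
    for epoch in range(NUM_BATCHES):
        batch = select_batch(files, epoch)
        seen.update(batch)
        print(f"epoch {epoch}: {len(batch)} files")
    print(f"covered {len(seen)} of {len(files)} files")
```

Because the epoch is added before taking the remainder, any 10 consecutive epochs select every file exactly once, and the batch for epoch N repeats at epoch N + 10. Batch sizes are only approximately 10% each, since they depend on how evenly the hash spreads the filenames.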

@autolabel autolabel bot added the review/triage Team should review and assign priority label Sep 24, 2021
@mlab-code-reviews

mlab-code-reviews commented Sep 24, 2021 via email

@mattmathis
Contributor Author

We are now processing 10% of the pcaps every 16 days. Please update to process all current and historical files.
SELECT COUNT(DISTINCT date) AS days, MIN(parser.Time) AS OldestParse FROM `mlab-oti.ndt_raw.pcap`
Yields: days = 838, OldestParse = 2022-03-06 02:31:10.345666 UTC (query run on 2022-03-22)


4 participants