Added BUSpark notebook to forked repo. #191
base: master
Hi @parkerwstone. Thanks for your PR. I'm waiting for an aicoe-aiops member to verify that this patch is reasonable to test. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Shreyanand commented on 2021-03-26T01:46:42Z: Is this cell part of the analysis?
Shreyanand commented on 2021-03-26T01:46:45Z: Is the rest of this notebook old code? If yes, remove it, and use this notebook only for code that you want to publish. If the following cells are preprocessing steps or preliminary analysis, move them to the beginning of this notebook, before the drain code. In that case, add an interpretation of the analysis (word-based analysis of logs suggests that ....)
Hey guys! In its current state, the notebook has a lot of noise in it. Please use the comments to clean and structure it. The drain bit looks promising; let's work on understanding and presenting it in detail.
…to results from drain.
The changes are a good start, but the notebook is still incoherent. Take this notebook for example: it starts with a heading and tells us what to expect in the notebook. The cells are separated and connected based on logical steps or sections. Each section has markdown associated with it explaining what to expect in the code that follows. If you follow this format, the notebook should become clearer.
One of the major things to focus on at this point is adding explanations of the output of the drain parsing. Importing and applying it is the easy part; understanding and dissecting the results will require more time and effort. What does the current output mean? I think coming up with log examples that show how and when this method works would help.
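As one possible starting point for such log examples, here is a toy sketch of the template-merging idea behind Drain-style parsers. It is a simplified stand-in and not the drain library itself: it only groups logs by token count and a similarity threshold, and replaces differing positions with a wildcard. The function names, the sample log lines, and the 0.5 threshold are all made up for illustration.

```python
# Toy sketch of Drain-style template mining; NOT the drain library itself.
# All names, sample logs, and the threshold are illustrative assumptions.
WILDCARD = "<*>"


def similarity(template, tokens):
    """Fraction of positions where the template token equals the log token."""
    if not tokens:
        return 1.0  # two empty lines are treated as identical
    same = sum(1 for t, tok in zip(template, tokens) if t == tok)
    return same / len(tokens)


def merge(template, tokens):
    """Keep matching tokens; replace differing positions with a wildcard."""
    return [t if t == tok else WILDCARD for t, tok in zip(template, tokens)]


def cluster_logs(lines, threshold=0.5):
    """Assign each line to a cluster; return (cluster_ids, templates)."""
    templates = []    # templates[i] is the token list of cluster i
    cluster_ids = []
    for line in lines:
        tokens = line.split()
        best_id, best_sim = None, 0.0
        for cid, template in enumerate(templates):
            if len(template) != len(tokens):
                continue  # only compare logs with the same token count
            sim = similarity(template, tokens)
            if sim > best_sim:
                best_id, best_sim = cid, sim
        if best_id is not None and best_sim >= threshold:
            # Similar enough: generalize the existing template.
            templates[best_id] = merge(templates[best_id], tokens)
        else:
            # No suitable cluster: this line starts a new one.
            templates.append(tokens)
            best_id = len(templates) - 1
        cluster_ids.append(best_id)
    return cluster_ids, [" ".join(t) for t in templates]


logs = [
    "Connected to 10.0.0.1 port 22",
    "Connected to 10.0.0.7 port 8080",
    "Disk usage at 91 percent",
    "Disk usage at 47 percent",
    "Build finished",
]
ids, templates = cluster_logs(logs)
for cid, template in enumerate(templates):
    print(cid, template)
```

Walking through a handful of lines like these next to the real parser's output can make it easier to explain why two logs share a cluster ID and which tokens were treated as variable parts.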
…bout each code cell and describes what the expected output should be. Also separated each log by its cluster ID.
@@ -0,0 +1,1087 @@
{
Some suggestions:
- If there are 256 logs, training on 100 and testing on 156 may not leave the model enough data to train on. An 80% training / 20% test split with StratifiedShuffleSplit should give better results.
- Try xgboost:
  from xgboost import XGBClassifier
  XGBClassifier().fit(X_train, y_train)
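The 80/20 stratified split suggested above could be sketched roughly as follows with scikit-learn's `StratifiedShuffleSplit`. The feature matrix and labels here are synthetic placeholders (256 rows, 8 features, an assumed 3:1 class ratio), not the notebook's actual data.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical stand-in data: 256 logs, 8 numeric features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))
y = np.array([0] * 192 + [1] * 64)

# One shuffled split that keeps the class ratio the same in both parts.
splitter = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, y))

X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

print("train size:", len(train_idx), "test size:", len(test_idx))
print("positive share (train, test):", y_train.mean(), y_test.mean())
```

The resulting `X_train` and `y_train` can then be passed to `XGBClassifier().fit(X_train, y_train)` as in the suggestion above, with `X_test` and `y_test` held back for evaluation.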
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
…unprocessed logs.
…ed log classifying.
b6d6abf to a02d97b
499ae55 to 49b6f60
Related Issues and Dependencies
…
This introduces a breaking change
This Pull Request implements
… Explain your changes.
Description