COS / More alerts #517
Comments
Thank you for reporting your feedback to us! The internal ticket has been created: https://warthogs.atlassian.net/browse/DPE-6073.
Hi @gustavosr98, I have just opened a PR (#521) for these desired alerts. Thanks for taking the time to outline your wishlist! <3 I've gone ahead and added most of them. The ones I did not add in that PR are:
I did not add the first two since they are already there (screenshots 1+2). I do not believe the third is possible, since I don't think Grafana supports alerts based on user-provided input (i.e. X). If you have a specific latency in mind, let me know and I can implement that for you ASAP. Please note that #1 will be further improved on the o11y end, since they are currently working on it.
Addressing #517 by adding the following requested alerts:
- Cluster is not writable
- Cluster will not be writable if I lose one more node
- Number of connections is close to max connections limit

along with a few others from the Percona alert rules

## testing
- Cluster is not writable
<img width="1137" alt="Screenshot 2024-12-09 at 14 22 58" src="https://github.com/user-attachments/assets/9deb7250-7701-4a9f-bdc0-ee74b5069641">
- Cluster will not be writable if I lose one more node - note this is firing because it was deployed with a single replica, when the replica set is scaled up it goes back to green
<img width="1148" alt="Screenshot 2024-12-09 at 14 22 04" src="https://github.com/user-attachments/assets/50516710-97b5-4c08-a37d-37e43796bfb9">
- Number of connections is close to max connections limit (80%)
<img width="1117" alt="Screenshot 2024-12-09 at 14 23 32" src="https://github.com/user-attachments/assets/14da278e-e9e7-42b6-ba69-11927f6c9b0e">
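For readers who want the intent of these alerts spelled out, here is a rough Python sketch of the conditions they describe. It is not the alert rule code from the PR: the function names are hypothetical, and it assumes writability depends on a strict majority of replica set members being healthy, which may not match the exact expressions used in the actual rules.

```python
def majority(members: int) -> int:
    """Smallest number of members that forms a strict majority (assumed quorum rule)."""
    return members // 2 + 1


def is_writable(healthy: int, members: int) -> bool:
    """Assumption: the cluster accepts writes only while a majority is healthy."""
    return healthy >= majority(members)


def one_node_from_unwritable(healthy: int, members: int) -> bool:
    """True when the cluster is writable now but losing one more node would break that."""
    return is_writable(healthy, members) and not is_writable(healthy - 1, members)


def connections_near_limit(current: int, maximum: int, threshold: float = 0.8) -> bool:
    """True when current connections exceed the given fraction (80% by default) of the limit."""
    return current > threshold * maximum


if __name__ == "__main__":
    # Single replica: writable, but one failure away from losing writes.
    print(one_node_from_unwritable(healthy=1, members=1))
    # Three healthy members out of three: one node can be lost safely.
    print(one_node_from_unwritable(healthy=3, members=3))
    print(connections_near_limit(current=900, maximum=1000))
```

Under this assumption a single-replica deployment always sits one failure away from being unwritable, which is consistent with the alert firing in the test above and clearing once the replica set is scaled up.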
Steps to reproduce
Follow tutorial for COS integration
Expected behavior
A few alerts I would like to see
And any similar alerts related to degradation, so that problems are caught before the system stops working as expected
Actual behavior
No errors in the alerts, but only two alerts are available
Versions
Juju 3.5.4