Prometheus Alerting Display and Templates #1678

Open
3 tasks done
CannonLock opened this issue Oct 23, 2024 · 1 comment

Comments

@CannonLock
Contributor

Pelican Service:

  • Director
  • Origin
  • Cache

Is your feature request related to a problem? Please describe.

Create a UI item that displays templated Prometheus alerts and lets users enable them. Each template will have a description as well as the actual alert trigger equation. If it is easy, add a way to adjust the alert tolerances/triggers so that the templates are more broadly usable.

Currently thinking of things like % of failures, # of threads, GBs transferred, slow request #... I'm going to reach out to system admins for more ideas about things they have seen that correlate with issues.
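
To make the idea concrete, here is a rough sketch of what one template could look like: a description, a trigger equation with a tunable threshold, and a function that renders it into the standard Prometheus alerting rule fields (`alert`, `expr`, `for`, `annotations`). The `AlertTemplate` class, the `HighTransferFailureRate` name, and the `example_transfers_total` metric are all made up for illustration; they are not existing Pelican code or metrics.

```python
import json
from dataclasses import dataclass
from string import Template
from typing import Optional


@dataclass
class AlertTemplate:
    """Hypothetical alert template: a description plus a trigger equation."""

    name: str
    description: str
    expr_template: str        # PromQL with a $threshold placeholder
    default_threshold: float  # suggested value shown to the admin
    duration: str = "5m"      # value for the Prometheus "for" field

    def render(self, threshold: Optional[float] = None) -> dict:
        """Build a Prometheus alerting rule, using the default threshold
        unless the admin supplied their own."""
        value = self.default_threshold if threshold is None else threshold
        return {
            "alert": self.name,
            "expr": Template(self.expr_template).substitute(threshold=value),
            "for": self.duration,
            "annotations": {"description": self.description},
        }


# ASSUMPTION: the metric name below is invented for this example.
FAILURE_RATE = AlertTemplate(
    name="HighTransferFailureRate",
    description="Fraction of failed transfers over 5 minutes is above the threshold.",
    expr_template=(
        'sum(rate(example_transfers_total{status="failed"}[5m]))'
        " / sum(rate(example_transfers_total[5m])) > $threshold"
    ),
    default_threshold=0.05,
)

if __name__ == "__main__":
    # An admin loosens the trigger from 5% to 10% failures.
    rule = FAILURE_RATE.render(threshold=0.1)
    # JSON is also valid YAML, so this output can be used as a rule file.
    print(json.dumps({"groups": [{"name": "example", "rules": [rule]}]}, indent=2))
```

`string.Template` is used instead of `str.format` because PromQL label selectors already contain curly braces.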

@CannonLock CannonLock added the enhancement New feature or request label Oct 23, 2024
@CannonLock CannonLock added this to the v7.12.0 milestone Oct 23, 2024
@CannonLock CannonLock self-assigned this Oct 23, 2024
@CannonLock
Contributor Author

CannonLock commented Oct 23, 2024

One thing I want to stress: a v1 goal for this is easy tweaking of the triggers, so that people don't automatically turn alerts off when they fire constantly. Potentially we also add some guidance on what values are good.

Another idea is that for the first round we have smart triggers that are determined by previous context. So less "Trigger at 40 Mbps transfers" and more "Trigger at 2 deviations above existing max load average".
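
A minimal sketch of that kind of context-based trigger, assuming "deviations" means standard deviations and that the baseline is the mean of recently observed values (both are my reading, not settled design). The function names are hypothetical. In Prometheus itself, `avg_over_time` and `stddev_over_time` could probably play the same role inside the alert equation.

```python
from statistics import mean, pstdev
from typing import Sequence


def dynamic_threshold(history: Sequence[float], k: float = 2.0) -> float:
    """Return mean(history) + k * population standard deviation.

    ASSUMPTION: the baseline is the historical mean; swapping in
    max(history) would be another reading of the idea.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    return mean(history) + k * pstdev(history)


def should_trigger(current: float, history: Sequence[float], k: float = 2.0) -> bool:
    """True when the current value is above the context-derived threshold."""
    return current > dynamic_threshold(history, k)


if __name__ == "__main__":
    # Made-up load samples, purely for illustration.
    past_load = [10.0, 12.0, 11.0, 13.0, 9.0]
    print(dynamic_threshold(past_load))
    print(should_trigger(14.0, past_load))
```

The multiplier `k` is the knob an admin would tweak instead of a hard-coded absolute value.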
