Record current number of live pipelines running #3312
Conversation
@@ -152,6 +152,9 @@ func startControlPublish(control *url.URL, params aiRequestParams) {
	ControlPub: controlPub,
I think we should probably move this initialisation of the LivePipeline somewhere more central, now that we're using it to check stream existence as well as just storing these control-related fields. Also, do we need a mutex around it like we do in cleanupLive()?
@j0sh?
Up to you... I don't see a big issue right now. We might outgrow it at some point, but if you want to be preemptive about refactoring, go ahead. I just want to avoid spreading initialization logic around too much while it is fairly self-contained right now.

> Also do we need a mutex around it like we do in cleanupLive()?

Yep, we need the mutex, but I believe we already hold it at this point. The only slight issue is that we also hold the mutex while calling stats.Record, but that should be OK for the most part. Recording prom metrics is supposed to be fairly quick, but it's still kind of non-trivial IIRC.
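For context, a rough sketch of what this could look like on the startControlPublish side, assuming LiveMu is the mutex guarding LivePipelines and that it is held when the entry is created. The helper name addLivePipeline and its signature are illustrative, not the actual diff; ControlPub, LivePipelines, monitor.Enabled, and monitor.AICurrentLiveSessions are the names visible in the hunks above.

```go
// Hypothetical helper, not the actual diff: insert the pipeline entry and
// record the gauge while LiveMu is held, so the count matches the map state.
func addLivePipeline(node *core.LivepeerNode, stream string, p *core.LivePipeline) {
	node.LiveMu.Lock()
	defer node.LiveMu.Unlock()

	node.LivePipelines[stream] = p
	if monitor.Enabled {
		// stats.Record runs under the lock here; quick, but not free.
		monitor.AICurrentLiveSessions(len(node.LivePipelines))
	}
}
```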
Oh yeah missed the mutex 👍
Codecov Report
Attention: Patch coverage is …

Additional details and impacted files

@@              Coverage Diff               @@
##               master       #3312       +/-   ##
===================================================
+ Coverage   34.03772%   34.04646%   +0.00874%
===================================================
  Files            141         141
  Lines          37006       37020         +14
===================================================
+ Hits           12596       12604          +8
- Misses         23691       23697          +6
  Partials         719         719
===================================================

... and 1 file with indirect coverage changes

Continue to review full report in Codecov by Sentry.
@@ -555,6 +556,9 @@ func (ls *LivepeerServer) cleanupLive(stream string) {
	pub, ok := ls.LivepeerNode.LivePipelines[stream]
	delete(ls.LivepeerNode.LivePipelines, stream)
	ls.LivepeerNode.LiveMu.Unlock()
	if monitor.Enabled {
		monitor.AICurrentLiveSessions(len(ls.LivepeerNode.LivePipelines))
	}
@mjh1 @j0sh Are we sure that cleanupLive() is always called, even in case of errors, panics, etc.?
The number of in-use orchestrators seems high: https://eu-metrics-monitoring.livepeer.live/grafana/d/be6llteqebk00b/ai-overview-livestream?orgId=1&from=1734220800000&to=now&viewPanel=6
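One way to make that guarantee explicit (a hedged sketch, not necessarily how the stream handler is structured today; runLiveStream is a hypothetical name) is to register cleanupLive with defer at the point the stream starts, so it fires on error returns and panics alike:

```go
// Hypothetical sketch: tie cleanup to the handler's lifetime with defer,
// so it runs on normal returns, error paths, and panics.
func (ls *LivepeerServer) runLiveStream(stream string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("live stream %s panicked: %v", stream, r)
		}
		ls.cleanupLive(stream) // always runs, keeping the gauge accurate
	}()

	// ... ingest / publish loop for the stream ...
	return nil
}
```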
I'm wondering if we might be double-counting somehow; the current number is 12, which is double Eric's number of 24/7 streams.
@mjh1 I think the versions with/without music are separate streams.
Yeah, I was wondering that. I think the added music streams are just regular studio streams though: https://github.com/ericxtang/random-ai-stream
What does this pull request do? Explain your changes. (required)
Introduce a new metric to record the current number of in-flight live AI pipelines.
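For reference, a minimal sketch of how such a gauge could be wired up with OpenCensus (the stats.Record call mentioned in the review); the measure and view names here are illustrative, not necessarily the exact ones added by this PR:

```go
package monitor

import (
	"context"

	"go.opencensus.io/stats"
	"go.opencensus.io/stats/view"
)

// Illustrative measure; the PR's actual metric name may differ.
var mAICurrentLiveSessions = stats.Int64(
	"ai_current_live_sessions",
	"Current number of live AI pipelines running",
	stats.UnitDimensionless,
)

func init() {
	// LastValue keeps only the most recent reading, which is what a gauge needs.
	view.Register(&view.View{
		Name:        "ai_current_live_sessions",
		Measure:     mAICurrentLiveSessions,
		Description: "Current number of live AI pipelines running",
		Aggregation: view.LastValue(),
	})
}

// AICurrentLiveSessions records the current count of live pipelines.
func AICurrentLiveSessions(count int) {
	stats.Record(context.Background(), mAICurrentLiveSessions.M(int64(count)))
}
```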
How did you test each of these updates (required)
Does this pull request close any open issues?
Checklist:
- `make` runs successfully
- All tests in `./test.sh` pass