Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add per-Task cpu time metric #56320

Open
wants to merge 53 commits into
base: master
Choose a base branch
from

Conversation

nickrobinson251
Copy link
Contributor

@nickrobinson251 nickrobinson251 commented Oct 24, 2024

Close #47351 (builds on top of #48416)

Adds to Task a metric for scheduled CPU time and wall time

  • cpu time = amount of time we were actually running this task (according to our scheduler). Note: currently inclusive of GC time, but would be good to be able to separate that out (in a future PR)
  • wall time = amount of time between the scheduler becoming aware of this task and the task entering a terminal state (i.e. done or failed).

I have put record_cpu_time! in wait(), so there is a singular place this is recorded by the scheduler, plus in yield(t), yieldto(t) and throwto(t) (which skip the scheduler). Channel, ReentrantLock, Event, Timer, Semaphore are all implemented in terms of wait(Condition), which in turn calls wait(). And LibuvStream similarly calls wait(). So hopefully this captures everything (albeit, slightly over-counting task CPU time by including any enqueuing work done before we hit wait()).

The various metrics counters could be a separate inlined struct if we think that's a useful abstraction, but for now i've just put them directly in jl_task_t. They are all atomic, except the metrics_enabled flag itself (which we now have to check on task start/switch/done even if metrics are not enabled) which is set on task construction and marked const on the julia side.

I'd love to add these metrics in a way that is extendable, so that in future PRs we could add more per-task metrics (e.g. compilation time, GC time, allocations, potentially a wait-time breakdown (time waiting on locks, channels, in the scheduler run queue, etc.), potentially count the number of times it yielded).

Perhaps in future there could be ways to enable this on a per-thread and per-task basis. And potentially in future these same timings could be used by @time (e.g. writing this same timing data to a ScopedValue like in #55103 but only for tasks lexically scoped to inside the @time block).

Timings are off be default but can be turned on globally via starting julia with --task-metrics=yes or calling Base.Experimental.task_metrics(true). Metrics are collected for all tasks created when metrics are enabled. In other words, enabling/disabling timings via Base.Experimental.task_metrics does not affect existing Tasks, only new Tasks.

The other new APIs are Base.Experimental.task_cpu_time_ns(::Task) and Base.Experimental.task_wall_time_ns(::Task) for retrieving the new metrics. These are safe to call on any task (including the current task, or a task running on another thread). Names are up for discussion. I've put these all in Base.Experimental to give us room to change up the APIs as we add more metrics in future PRs (without worrying about release timelines).

cc @NHDaly @kpamnany @d-netto

@nickrobinson251 nickrobinson251 marked this pull request as ready for review October 24, 2024 21:59
@giordano giordano added multithreading Base.Threads and related functionality needs news A NEWS entry is required for this change feature Indicates new feature / enhancement requests labels Oct 24, 2024
src/init.c Outdated Show resolved Hide resolved
@vilterp vilterp mentioned this pull request Oct 25, 2024
4 tasks
Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. I believe you may need to add the reporting functions to public and docs, and the PR to NEWs

Copy link
Contributor

@kpamnany kpamnany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

I'm not too satisfied with the metrics being directly in jl_task_t; hopefully @vtjnash has a suggestion for what to do there.

We now have to separate out spin time, wait time, compilation time, and GC time (in another PR I guess) and the task CPU time will be actual user CPU time.

base/timing.jl Outdated
This metric is only updated when the task yields or completes.
See also [`task_wall_time_ns`](@ref).

Will be `UInt64(0)` if task timings are not enabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this method should throw if timings are not enabled for this task instead of returning 0, which is a valid return value. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or missing or nothing are also possible options for that case

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO you should not use unsigned integers for representing positive values - you should use Int64 for this instead.

It's better in julia than it is in C, since julia's defaults are less dangerous, but using unsigned ints for positive numeric values is a consistent source of bugs in C programs, and i think should be still frowned upon in julia. (https://jacobegner.blogspot.com/2019/11/unsigned-integers-are-dangerous.html)

IMO Unsigned ints should be used only for things like hashes, identifiers, addresses, and possibly for indexes but even for indexes i'd use signed ints.

But anyway, another benefit is that this also gives -1 as a possible return value for the unset case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've so far opted for returning an Int when enabled and nothing otherwise. I think nothing is maybe nicer because it'll force an error if you try to use it like an Int.

That said, these functions currently have _ns() names, similar to cumulative_compile_time_ns() and gc_time_ns() and time_ns() which all return UInt64, so don't love the inconsistency

I also moved these all to Experimental so we can change these APIs as we learn more about what makes sense (as we add more metrics).

base/timing.jl Outdated Show resolved Hide resolved
doc/man/julia.1 Outdated Show resolved Hide resolved
src/init.c Outdated Show resolved Hide resolved
src/jlapi.c Outdated Show resolved Hide resolved
src/jlapi.c Outdated Show resolved Hide resolved
src/jltypes.c Outdated
Comment on lines 3764 to 3768
"is_timing_enabled",
"first_enqueued_at",
"last_dequeued_at",
"cpu_time_ns",
"wall_time_ns"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these go in a separate struct called metrics or something @vtjnash? We won't be able to initialize that struct type before this, so it would have to be jl_any_type, but that would mean we'd have to alloc this in jl_new_task. I don't know how/if we can do an embedded struct here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be a separate struct inlined here, but I don't see much value to that. More valuable would probably be marking each of these as atomic though (in jltypes.c) so that observing them isn't UB. Most reads can probably be monotonic, but some writes are probably release and some reads might be acquires

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay on the separation concern. And yeah, I guess it is necessary for these to be atomic. Will discuss with @nickrobinson251.

Copy link
Contributor Author

@nickrobinson251 nickrobinson251 Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to atomics. But i wasn't sure from the docs how to select the correct memory ordering (so put :monotonic everywhere for now) -- would love advice

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vtjnash would you be able to take a look and advise on which orderings to use for the various atomic reads and writes in the PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can merge it with the current orderings.

src/julia.h Outdated Show resolved Hide resolved
@IanButterworth
Copy link
Member

It'd be helpful to add an example of using this to the original post.

@nickrobinson251 nickrobinson251 changed the title Add per-Task user cpu time metric Add per-Task cpu time metric Oct 25, 2024
src/task.c Outdated
@@ -1146,6 +1146,11 @@ JL_DLLEXPORT jl_task_t *jl_new_task(jl_function_t *start, jl_value_t *completion
t->ptls = NULL;
t->world_age = ct->world_age;
t->reentrant_timing = 0;
t->is_timing_enabled = jl_atomic_load_relaxed(&jl_task_timing_enabled) != 0;
Copy link
Member

@NHDaly NHDaly Oct 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, interesting, so once a Task is spawned, enabling/disabling the feature doesn't affect live tasks?

So i guess if we wanted to enable this e.g. in @time, we would have to explicitly set this field on the Task struct? 🤔 I think that seems okay to me. Either you enable the global field before spawning any Task of interest, or you set the field on the Task when you want to start timing it. 👍 Coolcool.

🤔 And actually, i'm not sure if these numbers would get screwed up if you enable it while a task is running. So maybe you really do need to do this only at Task creation?

src/julia.h Outdated Show resolved Hide resolved
src/julia.h Outdated Show resolved Hide resolved
test/cmdlineargs.jl Outdated Show resolved Hide resolved
base/timing.jl Outdated Show resolved Hide resolved
test/threads_exec.jl Outdated Show resolved Hide resolved
base/task.jl Outdated Show resolved Hide resolved
base/task.jl Outdated Show resolved Hide resolved
base/task.jl Outdated Show resolved Hide resolved
base/task.jl Outdated Show resolved Hide resolved
base/task.jl Show resolved Hide resolved
base/task.jl Outdated Show resolved Hide resolved
base/task.jl Outdated Show resolved Hide resolved
base/task.jl Outdated Show resolved Hide resolved
src/julia.h Outdated Show resolved Hide resolved
Copy link
Member

@NHDaly NHDaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coooooooool! 😁 Looks great from my perspective!

I'll let everyone else finish their reviews and your responses and such, but this LGTM from my perspective. 😎

src/task.c Outdated Show resolved Hide resolved
base/task.jl Outdated Show resolved Hide resolved
base/task.jl Show resolved Hide resolved
Comment on lines +3766 to +3770
"metrics_enabled",
"first_enqueued_at",
"last_started_running_at",
"cpu_time_ns",
"finished_at"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to instead store this state out-of-line and just have a single pointer here?

You could be cute and have a Union{Nothing,Metric} for the metric enabled query.
Alternativly these values could live in task-local-storage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We originally proposed that, yeah, but they were moved inline. I don't remember why. Maybe to dodge the extra cache-miss on the task switches?

CC @vtjnash

JL_DLLEXPORT void jl_task_metrics_enable(void)
{
// Increment the flag to allow reentrant callers.
jl_atomic_fetch_add(&jl_task_metrics_enabled, 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of a global, an alternative option here would be a scoped value.
That would avoid having to deal with underflow and global state and allow for precise measuring of task-metrics for sub-regions of the program.

Copy link
Member

@NHDaly NHDaly Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested the same, but i think the concern was whether scoped values are accessible from C, if I remember right?

Are they?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can call a Julia function from C that performs that query.

Copy link
Member

@NHDaly NHDaly Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that not going to be slow?
Since this has to happen on every task switch.

Co-authored-by: Valentin Churavy <[email protected]>
src/julia.h Outdated Show resolved Hide resolved
!!! compat "Julia 1.12"
This method was added in Julia 1.12.
"""
function task_cpu_time_ns(t::Task=current_task())
Copy link
Member

@topolarity topolarity Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name seems like it might be a bit mis-leading - What about task_running_time_ns()?

As the note you added makes clear, this is the time that the task has been "in the running state" (i.e. scheduled by Julia) but not the time the task has actually spent running on-cpu (i.e. scheduled by the OS)

It's plausible to me that one day we might want to hook up perf with our scheduler so that you can get the actual instruction / branch / tsc spent on-core, so it's probably good to leave room for the future names

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like good feedback to me. 👍
I'm supportive of a name change.

I like task_running_time_ns(). I also like task_executing_time_ns() or maybe task_scheduled_time_ns()?
I think task_running_time_ns is the best I've seen so far.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Indicates new feature / enhancement requests multithreading Base.Threads and related functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

observability: add CPU time counter to tasks
10 participants