-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extremely WIP/RFC: Extend invoke
to accept CodeInstance
#56660
base: master
Are you sure you want to change the base?
Conversation
This is an alternative mechanism to #56650 that largely achieves the same result, but by hooking into `invoke` rather than a generated function. They are orthogonal mechanisms, and its possible we want both. However, in #56650, both Jameson and Valentin were skeptical of the generated function signature bottleneck. This PR is sort of a hybrid of mechanism in #52964 and what I proposed in #56650 (comment). In particular, this PR: 1. Extends `invoke` to support a CodeInstance in place of its usual `types` argument. 2. Adds a new `typeinf` optimized generic. The semantics of this optimized generic allow the compiler to instead call a companion `typeinf_edge` function, allowing a mid-inference interpreter switch (like #52964), without being forced through a concrete signature bottleneck. However, if calling `typeinf_edge` does not work (e.g. because the compiler version is mismatched), this still has well defined semantics, you just don't get inference support. The additional benefit of the `typeinf` optimized generic is that it lets custom cache owners tell the runtime how to "cure" code instances that have lost their native code. Currently the runtime only knows how to do that for `owner == nothing` CodeInstances (by re-running inference). This extension is not implemented, but the idea is that the runtime would be permitted to call the `typeinf` optimized generic on the dead CodeInstance's `owner` and `def` fields to obtain a cured CodeInstance (or a user-actionable error from the plugin). This PR includes an implementation of `with_new_compiler` from #56650. This PR includes just enough compiler support to make the compiler optimize this to the same code that #56650 produced: ``` julia> @code_typed with_new_compiler(sin, 1.0) CodeInfo( 1 ─ $(Expr(:foreigncall, :(:jl_get_tls_world_age), UInt64, svec(), 0, :(:ccall)))::UInt64 │ %2 = builtin Core.getfield(args, 1)::Float64 │ %3 = invoke sin(%2::Float64)::Float64 └── return %3 ) => Float64 ``` However, the implementation here is extremely incomplete. I'm putting it up only as a directional sketch to see if people prefer it over #56650. If so, I would prepare a cleaned up version of this PR that has the optimized generics as well as the curing support, but not the full inference integration (which needs a fair bit more work).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I like this direction more.
let typeinf_world_age = Base.tls_world_age() | ||
@eval @noinline typeinf(::SplitCacheOwner, mi::MethodInstance, source_mode::UInt8) = | ||
Base.invoke_in_world($typeinf_world_age, Compiler.typeinf_ext, SplitCacheInterp(; world=Base.tls_world_age()), mi, source_mode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why you can't precompile it right?
I was doing something like:
const COMPILER_WORLD = Ref{UInt}(0)
function __init__()
COMPILER_WORLD[] = Base.tls_world_age()
end
To set the world age post loading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping to tie it to the definition world of the method, but needs some codegen work to make that fast.
Does this mean that propagating the compiler plugin context for a specific scope would involve performing a "Cassette"-like transformation, but by making the transformed source call |
This is an alternative mechanism to #56650 that largely achieves the same result, but by hooking into
invoke
rather than a generated function. They are orthogonal mechanisms, and its possible we want both. However, in #56650, both Jameson and Valentin were skeptical of the generated function signature bottleneck. This PR is sort of a hybrid of mechanism in #52964 and what I proposed in #56650 (comment).In particular, this PR:
Extends
invoke
to support a CodeInstance in place of its usualtypes
argument.Adds a new
typeinf
optimized generic. The semantics of this optimized generic allow the compiler to instead call a companiontypeinf_edge
function, allowing a mid-inference interpreter switch (like Support execution of code from external abstract interpreters #52964), without being forced through a concrete signature bottleneck. However, if callingtypeinf_edge
does not work (e.g. because the compiler version is mismatched), this still has well defined semantics, you just don't get inference support.The additional benefit of the
typeinf
optimized generic is that it lets custom cache owners tell the runtime how to "cure" code instances that have lost their native code. Currently the runtime only knows how to do that forowner == nothing
CodeInstances (by re-running inference). This extension is not implemented, but the idea is that the runtime would be permitted to call thetypeinf
optimized generic on the dead CodeInstance'sowner
anddef
fields to obtain a cured CodeInstance (or a user-actionable error from the plugin).This PR includes an implementation of
with_new_compiler
from #56650. This PR includes just enough compiler support to make the compiler optimize this to the same code that #56650 produced:However, the implementation here is extremely incomplete. I'm putting it up only as a directional sketch to see if people prefer it over #56650. If so, I would prepare a cleaned up version of this PR that has the optimized generics as well as the curing support, but not the full inference integration (which needs a fair bit more work).