Refactoring Fusion Executor, pulling out compiled kernel #3468
base: main
…d in executor::compile.
…need to run compileRTC and runRTC calls in tests with CompiledKernel instances directly. Also fix profiling calls in KernelExecutor.
…owering but is now checked after lowering.
!test
There are quite a few TODOs in this PR; I might not take on all of them here.
csrc/fusion.cpp
Outdated
@@ -231,6 +231,8 @@ void Fusion::removeVal(Val* val) {
void Fusion::addInput(Val* input) {
assertInContainer(input, "Cannot register input ");

std::cout << "Registering input: " << input->toString() << std::endl;
Remove
buffer << cuda_src.rdbuf();
return buffer.str();
}
} // namespace
Everything above this is only code motion.
: compile_params_(compile_params),
lowered_(std::make_unique<GpuLower>(fusion, compile_params)) {
FUSER_PERF_SCOPE("CompiledKernel::CompiledKernel");
// TODO: No hooks can be sent because this is in the constructor
TODO
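One possible way to address the hook limitation noted in the snippet above is two-phase construction: the constructor only stores state, and a separate compile step registers hooks before lowering runs. The sketch below is a minimal illustration under that assumption. `SketchLower`, `SketchKernel`, `addHook`, and `record` are hypothetical names invented for this example and are not nvFuser APIs.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lowering pass; not nvFuser's GpuLower.
class SketchLower {
 public:
  using Hook = std::function<void(SketchLower&)>;

  // Hooks must be registered before run() for them to take effect.
  void addHook(Hook hook) { hooks_.push_back(std::move(hook)); }

  // Lets hooks (and run itself) leave a trace of what happened.
  void record(std::string event) { log_.push_back(std::move(event)); }

  void run() {
    for (auto& hook : hooks_) {
      hook(*this);
    }
    record("lowered");
  }

  const std::vector<std::string>& log() const { return log_; }

 private:
  std::vector<Hook> hooks_;
  std::vector<std::string> log_;
};

// Hypothetical kernel wrapper: the constructor does no lowering, so the
// caller gets a chance to pass hooks to compile().
class SketchKernel {
 public:
  SketchKernel() = default;

  void compile(const std::vector<SketchLower::Hook>& hooks) {
    assert(!compiled_ && "compile() should only be called once");
    for (const auto& hook : hooks) {
      lower_.addHook(hook);
    }
    lower_.run();
    compiled_ = true;
  }

  const std::vector<std::string>& log() const { return lower_.log(); }

 private:
  SketchLower lower_;
  bool compiled_ = false;
};
```

With this shape, a caller constructs the kernel, then passes its hooks to `compile`, and each hook runs before the lowering step is recorded. Whether the real refactor should take this route is a design decision for the PR author.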
csrc/runtime/compiled_kernel.cpp
Outdated
lowered_->run();
}

// TODO:Rename to "compile"
TODO
csrc/runtime/compiled_kernel.cpp
Outdated
// between them.
return val->isFusionInput() && !val->isA<TensorView>();
})) {
// TODO: parameter cache is too big a hammer here. We should consider
This is an old TODO I don't plan to address here. I'm not sure what the challenge of caching with scalar inputs is.
// This could be refactored.
struct CompiledKernel : public NonCopyable {
NVF_API ~CompiledKernel();
struct CudaExecutable : public NonCopyable { |
TODO: can this be merged into compiled_kernel.h?
@@ -58,20 +47,11 @@ struct CompiledKernel : public NonCopyable {
int register_spills = -1;
};

// Returns executable function and the ptxas log from compilation
Moved to compiled_kernel.cpp
@@ -253,12 +233,5 @@ void validateCircularBuffering(
kir::Kernel* kernel,
ExpressionEvaluator& expr_eval);

//! Query the target GPU version number NVRTC compiles CUDA kernels for
Moved to compiled_kernel.cpp
Actually, these are likely just removed, as they should now be contained in runtime/compiled_kernel.cpp.
@@ -32,117 +30,6 @@
#include <cstdlib>

namespace nvfuser {
Moved to compiled_kernel.cpp
@@ -194,14 +81,6 @@ bool detectComputeSanitizer() {

namespace nvfuser {

namespace executor_utils {
Moved to compiled_kernel.cpp
TODO list:
!test
This PR pulls kernel compilation out of KernelExecutor into a separate CompiledKernel, separating the two concepts as we move toward a world where kernel execution is done through HostIr.
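To make the intended split concrete, here is a minimal sketch of an executor that delegates compilation to a separately owned kernel object. Every name in it (`SketchCompileParams`, `SketchCompiledKernel`, `SketchKernelExecutor`) is a hypothetical stand-in chosen for illustration, and the string "binary" is a placeholder for real lowering and NVRTC output. It is not the code in this PR.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for compile options; not nvFuser's CompileParams.
struct SketchCompileParams {
  int opt_level = 2;
};

// Owns everything produced by compilation and knows nothing about launching.
class SketchCompiledKernel {
 public:
  SketchCompiledKernel(std::string source, SketchCompileParams params)
      : source_(std::move(source)), params_(params) {}

  // Non-copyable, mirroring the NonCopyable base seen in the diff.
  SketchCompiledKernel(const SketchCompiledKernel&) = delete;
  SketchCompiledKernel& operator=(const SketchCompiledKernel&) = delete;

  void compile() {
    // Placeholder: a real implementation would lower the fusion and invoke
    // the runtime compiler here.
    binary_ = "compiled(O" + std::to_string(params_.opt_level) + "):" + source_;
    compiled_ = true;
  }

  bool isCompiled() const { return compiled_; }
  const std::string& binary() const { return binary_; }

 private:
  std::string source_;
  SketchCompileParams params_;
  std::string binary_;
  bool compiled_ = false;
};

// Only launches; compilation is delegated to the kernel object it holds.
class SketchKernelExecutor {
 public:
  void compile(std::string source, SketchCompileParams params = {}) {
    kernel_ = std::make_unique<SketchCompiledKernel>(std::move(source), params);
    kernel_->compile();
  }

  // Placeholder "launch": returns a string describing what would be run.
  std::string run(const std::string& input) const {
    assert(kernel_ && kernel_->isCompiled());
    return kernel_->binary() + "(" + input + ")";
  }

  const SketchCompiledKernel* compiledKernel() const { return kernel_.get(); }

 private:
  std::unique_ptr<SketchCompiledKernel> kernel_;
};
```

Because the compiled artifact lives in its own object, tests or a different execution path could in principle construct and compile a `SketchCompiledKernel` directly without going through the executor.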
Moved code: