
Commit 145bc5c

ezyang authored and facebook-github-bot committed
Rename Math to CompositeImplicitAutograd (pytorch#54466)
Summary:
Pull Request resolved: pytorch#54466

I had to very carefully audit all the use sites, since there are a lot of other uses of the string Math; I did most of the conversion by grepping for all occurrences of Math and then doing a search and replace. I also updated documentation for clarity.

Signed-off-by: Edward Z. Yang <[email protected]>

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D27253239

Pulled By: ezyang

fbshipit-source-id: afb485d07ff39575742a4f0e1e205179b60bc953
1 parent 87989a6 commit 145bc5c

File tree

17 files changed (+213 lines, -166 lines)


BUILD.bazel

Lines changed: 1 addition & 1 deletion
@@ -130,7 +130,7 @@ genrule(
         "aten/src/ATen/RegisterMkldnnCPU.cpp",
         "aten/src/ATen/RegisterQuantizedCPU.cpp",
         "aten/src/ATen/RegisterSparseCPU.cpp",
-        "aten/src/ATen/RegisterMath.cpp",
+        "aten/src/ATen/RegisterCompositeImplicitAutograd.cpp",
         "aten/src/ATen/RegisterMeta.cpp",
         "aten/src/ATen/RegisterDefaultBackend.cpp",
         "aten/src/ATen/RegisterSchema.cpp",

aten/src/ATen/core/boxing/KernelFunction.cpp

Lines changed: 4 additions & 3 deletions
@@ -21,9 +21,10 @@ void fallthrough_kernel(OperatorKernel*, const OperatorHandle&, DispatchKeySet,

 void ambiguous_autogradother_kernel(OperatorKernel*, const OperatorHandle& op, DispatchKeySet, Stack*) {
   TORCH_INTERNAL_ASSERT(0,
-    op.operator_name(), " has kernels registered to both Math and a backend mapped to AutogradOther. "
-    "This makes the backend kernel unreachable (see Note [Ambiguity in AutogradOther kernel]). "
-    "If it's intended to override Math kernel behavior, please open an issue to request a dedicated "
+    op.operator_name(), " has kernels registered to both CompositeImplicitAutograd and a backend mapped to AutogradOther. "
+    "This makes the backend kernel unreachable; the dispatcher will always prefer the CompositeImplicitAutograd lowering "
+    "(see Note [Ambiguity in AutogradOther kernel]). "
+    "If you want to override CompositeImplicitAutograd, please open an issue to request a dedicated "
     "Autograd dispatch key for the backend.\n",
     "If you only want to run inference instead of training, add `at::AutoNonVariableTypeMode guard(true);` "
     "before model.forward(). Note this guard is only available in C++ but not Python at present.",

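The rewritten message keeps the existing inference-only escape hatch: wrap the forward call in at::AutoNonVariableTypeMode. Below is a minimal C++ sketch of that usage; only the guard line comes from the error text itself, while the TorchScript loading code and the model file name are illustrative assumptions.

#include <torch/script.h>

int main() {
  // Hypothetical TorchScript module, used only to illustrate the guard.
  torch::jit::Module model = torch::jit::load("model.pt");

  // Inference-only: skip the Autograd/AutogradOther keys, as the error text
  // above suggests (C++ only, not Python, at the time of this commit).
  at::AutoNonVariableTypeMode guard(true);

  std::vector<torch::jit::IValue> inputs;
  inputs.push_back(torch::ones({1, 3}));
  at::Tensor out = model.forward(inputs).toTensor();
  return 0;
}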
aten/src/ATen/core/boxing/KernelFunction.h

Lines changed: 37 additions & 9 deletions
@@ -18,15 +18,43 @@ struct OperatorKernel;
 TORCH_API void fallthrough_kernel(OperatorKernel*, const OperatorHandle&, DispatchKeySet, Stack*);

 // Note [Ambiguity in AutogradOther kernel]
-// This kernel implements reporting an error message when there're kernels registered
-// to both Math and a backend of AutogradOther, we don't know which kernel to pick:
-// - if we pick Math kernel for AutogradOther, the kernel registered to backend will be
-//   silently ignored and never called.
-// - if we skip using Math kernel for AutogradOther (it might pick Autograd kernel if available),
-//   it'll break all backends mapped to AutogradOther without a direct registration to backend.
-//   See c10/core/DispatchKeySet.cpp for a list of backends mapped to AutogradOther.
-// Thus if backend extender indeed want to override Math kernel behavior, they should request
-// a dedicated Autograd key for their backend to resolve the ambiguity.
+// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+// This error-reporting kernel is registered to the AutogradOther entry in the
+// dispatch table when there is both a CompositeImplicitAutograd kernel and a
+// backend kernel for ANY backend that maps to AutogradOther.  To see why
+// this is necessary in the AutogradOther case, it's helpful to first see
+// why everything works out fine for a backend that has a reserved Autograd
+// entry (see rule 2.2 in [Note] DispatchTable computation):
+//
+//    CPU     AutogradCPU
+//    reg?    registers with...
+//    -------------------------------------------------
+//    y       Autograd registration takes precedence
+//            over CompositeImplicitAutograd.
+//            This is good, because the CPU specific backend
+//            implementation is more specialized and typically better;
+//            if we used the composite, we would bypass it.
+//            (NB: the Autograd key is guaranteed to exist because
+//            the autograd codegen requires it!)
+//
+//    n       CompositeImplicitAutograd takes precedence.
+//            This is also good, because the Autograd
+//            registration (if it exists) would try to redispatch
+//            to the (non-existent) CPU implementation; by
+//            using the composite, we ensure the operator
+//            actually works.
+//
+// As you can see, when we have a specific Autograd key (AutogradCPU), we can
+// decide whether or not to use the CompositeImplicitAutograd kernel or the
+// Autograd kernel based on whether or not the backend kernel exists.
+//
+// However, for AutogradOther (which is the catchall autograd kernel for
+// everything that doesn't have a specific Autograd key), we can't do this
+// trick because there isn't any unique backend to peek at to disambiguate;
+// if there are some backends that have implementations they prefer Autograd,
+// but unimplemented backends would prefer CompositeImplicitAutograd.  Rather
+// than arbitrarily pick one or the other, we just register a kernel that raises
+// an error and let the user decide how to proceed.
 TORCH_API void ambiguous_autogradother_kernel(OperatorKernel*, const OperatorHandle&, DispatchKeySet, Stack*);

 // Note [named_not_supported_kernel]
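To make the note concrete, here is a small sketch of the ambiguous pattern it describes, modeled on the throwsWhenRegisterToBackendMapsToAutogradOther test later in this commit; the namespace and operator name are invented for illustration.

#include <ATen/ATen.h>
#include <torch/library.h>

TORCH_LIBRARY(myns, m) {
  // Composite kernel: written in terms of other ops, so autograd "just works".
  m.def("fn", torch::dispatch(c10::DispatchKey::CompositeImplicitAutograd,
                              [](const at::Tensor& x) { return x * 2; }));
  // SparseCPU maps to AutogradOther. With both registrations present, the
  // dispatcher installs ambiguous_autogradother_kernel in the AutogradOther
  // slot, and calling the op on a sparse tensor that requires grad raises the
  // error defined in KernelFunction.cpp above.
  m.impl("fn", c10::DispatchKey::SparseCPU,
         [](const at::Tensor& x) { return x; });
}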

aten/src/ATen/core/dispatch/OperatorEntry.cpp

Lines changed: 19 additions & 19 deletions
@@ -108,8 +108,8 @@ std::list<AnnotatedKernel>::iterator OperatorEntry::registerKernel(

   // Add the kernel to the kernels list,
   // possibly creating the list if this is the first kernel.
-  // Redirect catchAll registrations to Math.
-  auto& k = dispatch_key.has_value() ? kernels_[*dispatch_key] : kernels_[DispatchKey::Math];
+  // Redirect catchAll registrations to CompositeImplicitAutograd.
+  auto& k = dispatch_key.has_value() ? kernels_[*dispatch_key] : kernels_[DispatchKey::CompositeImplicitAutograd];

   if (k.size() > 0) {
     TORCH_WARN("Overriding a previously registered kernel for the same operator and the same dispatch key\n",
@@ -138,8 +138,8 @@ void OperatorEntry::deregisterKernel_(
   c10::optional<DispatchKey> dispatch_key,
   std::list<AnnotatedKernel>::iterator kernel
 ) {
-  // Redirect catchAll deregistrations to Math.
-  DispatchKey dk = dispatch_key.has_value() ? *dispatch_key : DispatchKey::Math;
+  // Redirect catchAll deregistrations to CompositeImplicitAutograd.
+  DispatchKey dk = dispatch_key.has_value() ? *dispatch_key : DispatchKey::CompositeImplicitAutograd;
   auto found = kernels_.find(dk);
   TORCH_INTERNAL_ASSERT(found != kernels_.end(), "Tried to deregister a kernel for dispatch key ", toString(dispatch_key), " but there are no kernels registered for this dispatch key. The operator is ", toString(name_));
   auto& k = found->second;
@@ -186,13 +186,13 @@ std::pair<const AnnotatedKernel&, const char*> OperatorEntry::computeDispatchTab
   // (2.1) Use kernel from DispatchKey::DefaultBackend if available.
   //       This is used to register a kernel that works for all backend in inference. But it requires
   //       separate registration for Autograd keys to support training.
-  // (2.2) Use kernel from DispatchKey::Math if available.
-  //       For autograd keys, we only use kernel from Math when there's no direct registration
-  //       to its corresponding backend key or DefaultBackend. See Note [DefaultBackend and Math].
+  // (2.2) Use kernel from DispatchKey::CompositeImplicitAutograd if available.
+  //       For autograd keys, we only use kernel from CompositeImplicitAutograd when there's no direct registration
+  //       to its corresponding backend key or DefaultBackend. See Note [DefaultBackend and CompositeImplicitAutograd].
   //       For AutogradOther, we eagerly return ambiguousAutogradOtherKernel_ if there's registration to any of
   //       its backends and ask backend extender to request a decicated Autograd key for the backend.
   //       See Note [Ambiguity in AutogradOther kernel] for more details.
-  //       A DefaultBackend kernel prevents Math kernel being used for Autograd keys, but it doesn't
+  //       A DefaultBackend kernel prevents CompositeImplicitAutograd kernel being used for Autograd keys, but it doesn't
   //       cause confusion for AutogradOther. It's pretty straightforward to use Autograd (if available)
   //       in this case.
   // (2.3) Use kernel from DispatchKey::Autograd if available
@@ -201,11 +201,11 @@ std::pair<const AnnotatedKernel&, const char*> OperatorEntry::computeDispatchTab
   //       backend key. See Note [Refresh Runtime Autograd entries in dispatchTable_]
   // (3) Use fallthrough kernel that are registered as fallback.
   // Alias Key Precedence:
-  //   DefaultBackend > Math > Autograd
-  // Note [DefaultBackend and Math]
-  // When there're registrations to both DefaultBackend & Math & Autograd, from (2.2) we know DefaultBackend
-  // and Autograd kernels will be picked up and Math is overriden.
-  // This is fine and in practice DefaultBackend and Math shouldn't co-exist for an op.
+  //   DefaultBackend > CompositeImplicitAutograd > Autograd
+  // Note [DefaultBackend and CompositeImplicitAutograd]
+  // When there're registrations to both DefaultBackend & CompositeImplicitAutograd & Autograd, from (2.2) we know DefaultBackend
+  // and Autograd kernels will be picked up and CompositeImplicitAutograd is overriden.
+  // This is fine and in practice DefaultBackend and CompositeImplicitAutograd shouldn't co-exist for an op.
   // TODO: Update alias key precedence after we add new alias keys AutogradDispatchCPUOrCUDA .

   // 1. Operator registration
@@ -226,13 +226,13 @@ std::pair<const AnnotatedKernel&, const char*> OperatorEntry::computeDispatchTab
   bool has_backend_kernel =
     hasKernelForAnyDispatchKey(getBackendKeySetFromAutograd(dispatch_key).add(DispatchKey::DefaultBackend));

-  // 2.2. Use Math kernel if available. For autograd keys, we only use kernel from Math
+  // 2.2. Use CompositeImplicitAutograd kernel if available. For autograd keys, we only use kernel from CompositeImplicitAutograd
   //      when there's no direct registration to its corresponding backend key or DefaultBackend.
   //      For AutogradOther, we return ambiguousAutogradOtherKernel_ if there's registration
   //      to any of its backends.
   //      See Note [Undefined in dispatchTable_] for the special handling for Undefined.
-  if (dispatch_key == DispatchKey::Undefined || isIncludedInAlias(dispatch_key, DispatchKey::Math)) {
-    if (auto math_registration = getKernelForDispatchKey(DispatchKey::Math)) {
+  if (dispatch_key == DispatchKey::Undefined || isIncludedInAlias(dispatch_key, DispatchKey::CompositeImplicitAutograd)) {
+    if (auto math_registration = getKernelForDispatchKey(DispatchKey::CompositeImplicitAutograd)) {
       if (dispatch_key == DispatchKey::AutogradOther
           && hasKernelForAnyDispatchKey(c10::autogradother_backends)) {
         return {ambiguousAutogradOtherKernel_, "ambiguous autogradother"};
@@ -286,9 +286,9 @@ void OperatorEntry::updateDispatchTable_(const c10::Dispatcher& dispatcher, Disp
   for (auto k : c10::getRuntimeDispatchKeySet(dispatch_key)) {
     updateDispatchTableEntry_(dispatcher, k);
   }
-  // Registration to DefaultBackend and Math should be populated to Undefined.
+  // Registration to DefaultBackend and CompositeImplicitAutograd should be populated to Undefined.
   // We cannot do this above since Undefined cannot be represented in DispatchKeySet.
-  if (dispatch_key == DispatchKey::Math || dispatch_key == DispatchKey::DefaultBackend) {
+  if (dispatch_key == DispatchKey::CompositeImplicitAutograd || dispatch_key == DispatchKey::DefaultBackend) {
     updateDispatchTableEntry_(dispatcher, DispatchKey::Undefined);
   }
   // Note [Refresh Runtime Autograd entries in dispatchTable_]
@@ -319,7 +319,7 @@ void OperatorEntry::updateDispatchTableFull_(const c10::Dispatcher& dispatcher)
   // the error message.
   // In the old world of catchAll, the only way to "register" a kernel to Undefined is by registering it to
   // catchAll. After catchAllKernel_ is removed, Undefined now can get a kernel from either DefaultBackend
-  // or Math alias key so that we don't break the support. Ideally isIncludedInAlias(Undefined, Math)
+  // or CompositeImplicitAutograd alias key so that we don't break the support. Ideally isIncludedInAlias(Undefined, CompositeImplicitAutograd)
   // should return true, it returns false because Undefined cannot be represented in a DispatchKeySet.
   for (uint8_t iter = 0; iter != static_cast<uint8_t>(DispatchKey::NumDispatchKeys); ++iter) {
     updateDispatchTable_(dispatcher, static_cast<DispatchKey>(iter));
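For operator authors, the practical effect of the catchAll redirection above is that an impl() with no dispatch key now lands in the CompositeImplicitAutograd entry of kernels_, so one kernel serves the backend keys and the autograd keys alike unless something more specific is registered. A minimal sketch, with an invented namespace and operator name:

#include <ATen/ATen.h>
#include <torch/library.h>

TORCH_LIBRARY(myns_catchall, m) {
  m.def("add_one(Tensor x) -> Tensor");
  // No dispatch key: registerKernel() redirects this old-style catchAll
  // registration to DispatchKey::CompositeImplicitAutograd, per the change
  // in OperatorEntry.cpp above.
  m.impl("add_one", [](const at::Tensor& x) { return x + 1; });
}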

aten/src/ATen/core/op_registration/op_registration_test.cpp

Lines changed: 18 additions & 18 deletions
@@ -520,7 +520,7 @@ TEST(OperatorRegistrationTest, whenRegisteringAutogradKernelWithCatchAllKernel_t
   auto op = Dispatcher::singleton().findSchema({"_test::dummy", ""});
   ASSERT_TRUE(op.has_value());

-  // catchAll now maps to Math which has higher precedence than Autograd
+  // catchAll now maps to CompositeImplicitAutograd which has higher precedence than Autograd
   called_nonautograd = called_autograd = false;
   op->typed<void (Tensor)>().call(dummyTensor(DispatchKey::CPU, /*requires_grad=*/true));
   EXPECT_TRUE(called_nonautograd);
@@ -1306,7 +1306,7 @@ TEST(NewOperatorRegistrationTest, whenRegisteringBackendFallbackKernelAndCatchal

   called = false;
   auto stack = callOp(*op, dummyTensor(c10::DispatchKey::CPU), "hello ");
-  // CatchAll now maps to Math and has higher precedence than backend fallback.
+  // CatchAll now maps to CompositeImplicitAutograd and has higher precedence than backend fallback.
   EXPECT_TRUE(called);
 }

@@ -1325,10 +1325,10 @@ TEST(NewOperatorRegistrationTest, whenRegisteringAutogradKernelWithRegularKernel
   EXPECT_FALSE(called_autograd);
 }

-TEST(NewOperatorRegistrationTest, dispatchWithMathKernel) {
+TEST(NewOperatorRegistrationTest, dispatchWithCompositeImplicitAutogradKernel) {
   bool math_called = false;
   auto m = MAKE_TORCH_LIBRARY(test);
-  m.def("fn", torch::dispatch(c10::DispatchKey::Math, [&](const Tensor& x) { math_called = true; return x; }));
+  m.def("fn", torch::dispatch(c10::DispatchKey::CompositeImplicitAutograd, [&](const Tensor& x) { math_called = true; return x; }));

   auto op = Dispatcher::singleton().findSchema({"test::fn", ""});
   ASSERT_TRUE(op.has_value());
@@ -1370,17 +1370,17 @@ TEST(NewOperatorRegistrationTest, dispatchWithMathKernel) {
   }
 }

-TEST(NewOperatorRegistrationTest, dispatchWithMathAndAutogradKernel) {
+TEST(NewOperatorRegistrationTest, dispatchWithCompositeImplicitAutogradAndAutogradKernel) {
   bool math_called = false;
   bool autograd_called = false;
   auto m = MAKE_TORCH_LIBRARY(test);
-  m.def("fn", torch::dispatch(c10::DispatchKey::Math, [&](const Tensor& x) { math_called = true; return x; }));
+  m.def("fn", torch::dispatch(c10::DispatchKey::CompositeImplicitAutograd, [&](const Tensor& x) { math_called = true; return x; }));
   m.impl("fn", c10::DispatchKey::Autograd, [&](const Tensor& x) { autograd_called = true; return x; });

   auto op = Dispatcher::singleton().findSchema({"test::fn", ""});
   ASSERT_TRUE(op.has_value());

-  // Math has higher precedence than Autograd
+  // CompositeImplicitAutograd has higher precedence than Autograd
   {
     math_called = autograd_called = false;
     callOp(*op, dummyTensor(c10::DispatchKey::CPU, /*requires_grad=*/true));
@@ -1396,17 +1396,17 @@ TEST(NewOperatorRegistrationTest, dispatchWithMathAndAutogradKernel) {
   }
 }

-TEST(NewOperatorRegistrationTest, dispatchWithMathAndCatchAllKernel) {
+TEST(NewOperatorRegistrationTest, dispatchWithCompositeImplicitAutogradAndCatchAllKernel) {
   bool math_called = false;
   bool catchall_called = false;
   auto m = MAKE_TORCH_LIBRARY(test);
-  m.def("fn", torch::dispatch(c10::DispatchKey::Math, [&](const Tensor& x) { math_called = true; return x; }));
+  m.def("fn", torch::dispatch(c10::DispatchKey::CompositeImplicitAutograd, [&](const Tensor& x) { math_called = true; return x; }));
   m.impl("fn", [&](const Tensor& x) { catchall_called = true; return x; });

   auto op = Dispatcher::singleton().findSchema({"test::fn", ""});
   ASSERT_TRUE(op.has_value());

-  // catchAll now maps to Math, which means we have two registrations to Math key.
+  // catchAll now maps to CompositeImplicitAutograd, which means we have two registrations to CompositeImplicitAutograd key.
   // The last registration is used.
   {
     catchall_called = math_called = false;
@@ -1423,11 +1423,11 @@ TEST(NewOperatorRegistrationTest, dispatchWithMathAndCatchAllKernel) {
   }
 }

-TEST(NewOperatorRegistrationTest, AutogradBackendOverridesMathKernel) {
+TEST(NewOperatorRegistrationTest, AutogradBackendOverridesCompositeImplicitAutogradKernel) {
   bool math_called = false;
   bool autograd_called = false;
   auto m = MAKE_TORCH_LIBRARY(test);
-  m.def("fn", torch::dispatch(c10::DispatchKey::Math, [&](const Tensor& x) { math_called = true; return x; }));
+  m.def("fn", torch::dispatch(c10::DispatchKey::CompositeImplicitAutograd, [&](const Tensor& x) { math_called = true; return x; }));
   m.impl("fn", c10::DispatchKey::AutogradCPU, [&](const Tensor& x) { autograd_called = true; return x; });

   auto op = Dispatcher::singleton().findSchema({"test::fn", ""});
@@ -1462,11 +1462,11 @@ TEST(NewOperatorRegistrationTest, AutogradBackendOverridesMathKernel) {
   }
 }

-TEST(NewOperatorRegistrationTest, BackendOverridesMathKernel) {
+TEST(NewOperatorRegistrationTest, BackendOverridesCompositeImplicitAutogradKernel) {
   bool math_called = false;
   bool backend_called = false;
   auto m = MAKE_TORCH_LIBRARY(test);
-  m.def("fn", torch::dispatch(c10::DispatchKey::Math, [&](const Tensor& x) { math_called = true; return x; }));
+  m.def("fn", torch::dispatch(c10::DispatchKey::CompositeImplicitAutograd, [&](const Tensor& x) { math_called = true; return x; }));
   m.impl("fn", c10::DispatchKey::CPU, [&](const Tensor& x) { backend_called = true; return x; });

   auto op = Dispatcher::singleton().findSchema({"test::fn", ""});
@@ -1550,12 +1550,12 @@ TEST(NewOperatorRegistrationTest, dispatchWithDefaultBackendKernel) {
   }
 }

-TEST(NewOperatorRegistrationTest, dispatchWithDefaultBackendAndMathKernel) {
+TEST(NewOperatorRegistrationTest, dispatchWithDefaultBackendAndCompositeImplicitAutogradKernel) {
   bool backend_called = false;
   bool math_called = false;
   auto m = MAKE_TORCH_LIBRARY(test);
   m.def("fn", torch::dispatch(c10::DispatchKey::DefaultBackend, [&](const Tensor& x) { backend_called = true; return x; }));
-  m.impl("fn", c10::DispatchKey::Math, [&](const Tensor& x) { math_called = true; return x; });
+  m.impl("fn", c10::DispatchKey::CompositeImplicitAutograd, [&](const Tensor& x) { math_called = true; return x; });

   auto op = Dispatcher::singleton().findSchema({"test::fn", ""});
   ASSERT_TRUE(op.has_value());
@@ -1735,7 +1735,7 @@ TEST(NewOperatorRegistrationTest, throwsWhenRegisterToBackendMapsToAutogradOther
   bool sparsecpu_called, math_called = false;
   auto m = MAKE_TORCH_LIBRARY(test);
   m.def("fn", torch::dispatch(c10::DispatchKey::SparseCPU, [&](const Tensor& x) { sparsecpu_called = true; return x; }));
-  m.impl("fn", c10::DispatchKey::Math, [&](const Tensor& x) { math_called = true; return x; });
+  m.impl("fn", c10::DispatchKey::CompositeImplicitAutograd, [&](const Tensor& x) { math_called = true; return x; });

   auto op = Dispatcher::singleton().findSchema({"test::fn", ""});
   ASSERT_TRUE(op.has_value());
@@ -1748,7 +1748,7 @@ TEST(NewOperatorRegistrationTest, throwsWhenRegisterToBackendMapsToAutogradOther
   {
     expectThrows<c10::Error>([&] {
       callOp(*op, dummyTensor(c10::DispatchKey::SparseCPU, /*requires_grad=*/true));
-    }, "test::fn has kernels registered to both Math and a backend mapped to AutogradOther.");
+    }, "test::fn has kernels registered to both CompositeImplicitAutograd and a backend mapped to AutogradOther.");
   }
 }
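For backends that already have a dedicated Autograd key, the override path that the error message asks AutogradOther backends to request is exactly what the AutogradBackendOverridesCompositeImplicitAutogradKernel test above exercises. A hedged sketch with an invented namespace and operator name:

#include <ATen/ATen.h>
#include <torch/library.h>

TORCH_LIBRARY(myns_override, m) {
  m.def("fn", torch::dispatch(c10::DispatchKey::CompositeImplicitAutograd,
                              [](const at::Tensor& x) { return x; }));
  // A direct AutogradCPU registration outranks the composite when autograd is
  // in play for CPU tensors; no-grad CPU calls still fall through to the
  // CompositeImplicitAutograd kernel, mirroring the test above.
  m.impl("fn", c10::DispatchKey::AutogradCPU,
         [](const at::Tensor& x) { return x; });
}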
