Mirror of https://github.com/zebrajr/pytorch.git, synced 2025-12-07 12:21:27 +01:00
39df901b2a
63 Commits

24efd29d19 | Check commutativity for computed dispatch table and add a test to check entries. (#44088)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44088

Test Plan: Imported from OSS
Reviewed By: ezyang
Differential Revision: D23492793
Pulled By: ailzhang
fbshipit-source-id: 37502f2a8a4d755219b400fcbb029e49d6cdb6e9

1b2da9ed82 | Expose alias key info in dumpState and update test_dispatch. (#44081)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44081

Test Plan: Imported from OSS
Reviewed By: ezyang
Differential Revision: D23492794
Pulled By: ailzhang
fbshipit-source-id: 27a2978591900463bda2e92e0201c9fd719f9792

224232032c | Move Autograd to an alias dispatch key (#43070)

Summary: This PR moves `DispatchKey::Autograd` to an alias dispatch key mapping to the `AutogradCPU, AutogradCUDA, AutogradXLA, AutogradOther, AutogradPrivate*` keys. A few things are handled in this PR:
- Update the alias dispatch key mapping and the precomputed dispatchTable logic.
- Move the `Autograd` key from the `always_included` set to the TensorImpl constructor.
- Update the `dummyTensor` constructor to take `requires_grad` as an optional argument so that it is closer to the real usage in op_registration_test.
- Use the `BackendSelect` key for backend selection both before and after the autograd layer (a one-liner in the backend_select codegen).

A few planned follow-ups, ordered by priority:
- [cleanup] Update `test_dispatch.py` to include testing `Autograd`.
- [cleanup] Add a Math alias key and move catchAll to Math (to remove 2.2 in `computeDispatchTableEntryWithDebug`).
- [new feature] Add support for Math in native_functions.yaml.
- [cleanup] Add iterator-like functionality to DispatchKeySet.
- [cleanup/large] Only add Autograd backend keys when the tensor requires grad. (cc: ljk53 ?)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43070
Reviewed By: ezyang
Differential Revision: D23281535
Pulled By: ailzhang
fbshipit-source-id: 9ad00b17142e9b83304f63cf599f785500f28f71

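To make the alias-key idea in this commit concrete, here is a minimal, self-contained C++ sketch of expanding a registration against an alias key (Autograd) into the per-backend runtime keys it covers. The enum, bitset, and function below are simplified stand-ins for illustration only, not PyTorch's actual DispatchKey/DispatchKeySet types.

```
// Simplified model: an alias key is not a real runtime key; registering
// against it fills the slots of every runtime key it maps to.
#include <bitset>
#include <cstdint>
#include <initializer_list>
#include <iostream>

enum class Key : uint8_t {
  CPU, CUDA, XLA,
  AutogradCPU, AutogradCUDA, AutogradXLA, AutogradOther,
  NumKeys,
  Autograd,  // alias key, deliberately outside the runtime-key range
};

using KeySet = std::bitset<static_cast<size_t>(Key::NumKeys)>;

KeySet expandAlias(Key k) {
  KeySet s;
  if (k == Key::Autograd) {
    // The alias expands to all backend-specific autograd keys.
    for (Key b : {Key::AutogradCPU, Key::AutogradCUDA,
                  Key::AutogradXLA, Key::AutogradOther}) {
      s.set(static_cast<size_t>(b));
    }
  } else {
    s.set(static_cast<size_t>(k));
  }
  return s;
}

int main() {
  std::cout << "Autograd expands to "
            << expandAlias(Key::Autograd).count() << " runtime keys\n";
}
```
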
a0ba7fb43e | Precompute entries in dispatch tables (#40512)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40512
Fixes https://github.com/pytorch/pytorch/issues/32454

The heart of this diff is changing this:

```
inline const KernelFunction& Dispatcher::dispatch_(const DispatchTable& dispatchTable, DispatchKey dispatchKey) const {
  const KernelFunction* backendKernel = dispatchTable.lookup(dispatchKey);
  if (nullptr != backendKernel) {
    return *backendKernel;
  }
  const auto& backendFallbackKernel = backendFallbackKernels_[dispatchKey];
  if (backendFallbackKernel.isValid()) {
    return backendFallbackKernel;
  }
  const KernelFunction* catchallKernel = dispatchTable.lookupCatchallKernel();
  if (C10_LIKELY(nullptr != catchallKernel)) {
    return *catchallKernel;
  }
  reportError(dispatchTable, dispatchKey);
}
```

to this:

```
const KernelFunction& OperatorEntry::lookup(DispatchKey k) const {
  const auto& kernel = dispatchTable_[static_cast<uint8_t>(k)];
  if (C10_UNLIKELY(!kernel.isValid())) {
    reportError(k);
  }
  return kernel;
}
```

The difference is that instead of checking a bunch of places to find the right kernel to use for an operator, all of the entries are precomputed into dispatchTable_ itself, so you don't have to consult anything else at runtime. OperatorEntry::computeDispatchTableEntry contains that computation (which is exactly the same as it was before). By doing this, we are able to substantially simplify many runtime components of dispatch.

The diff is fairly large, as there are also some refactors interspersed with the substantive change:
- I deleted the DispatchTable abstraction, folding it directly into OperatorEntry. It might make sense to have some sort of DispatchTable abstraction (if only to let you do operator[] on DispatchKey without having to cast it to integers first), but I killed DispatchTable to avoid having to design a new abstraction; the old abstraction wasn't appropriate for the new algorithm.
- I renamed OperatorEntry::KernelEntry to AnnotatedKernel, and use it to store backend fallbacks as well as regular kernel registrations (this improves error messages when you incorrectly register a backend fallback twice).
- I moved schema_ and debug_ into an AnnotatedSchema type, to make the invariant clearer that these are set together, or not at all.
- I moved catch-all kernels out of kernels_ into their own property (undoing a refactor I did before). The main reason I did this is that our intended future state is not to have a single catch-all, but rather possibly multiple catch-alls which fill in different portions of the dispatch table. This may change some more in the future: if we allow registrations for multiple types of catch-alls, we will need a NEW data type (representing bundles of dispatch keys) which can represent this case, or perhaps overload DispatchKey to also record these types.

The key changes for precomputation:
- OperatorEntry::updateDispatchTable_ is now updated to fill in the entry at a DispatchKey, considering kernels (what it did before) as well as the catch-all and backend fallback. There is also OperatorEntry::updateDispatchTableFull_, which updates the entire dispatch table (necessary when someone sets a catch-all kernel). OperatorEntry::computeDispatchTableEntry holds the canonical algorithm specifying how we decide what function will handle a dispatch key for the operator.
- Because dispatch table entry computation requires knowledge of what backend fallbacks are (which is recorded in Dispatcher, not OperatorEntry), several functions on OperatorEntry now take Dispatcher as an argument so they can query this information.
- I modified the manual boxing wrapper invariant: previously, kernels stored in kernels_ did NOT have manual boxing wrappers and this was maintained by DispatchTable. Now, we just ALWAYS maintain manual boxing wrappers for all KernelFunctions we store.
- DispatchKeyExtractor is greatly simplified: we only need to maintain a single per-operator bitmask of which entries are fallthrough (we don't need the global bitmask anymore).
- Introduced a new debugging 'dumpComputedTable' method, which prints out the computed dispatch table and how we computed it. This was helpful for debugging cases where the dispatch table and the canonical metadata were not in sync.

Things that I didn't do but would be worth doing at some point:
- I really wanted to get rid of the C10_UNLIKELY branch for whether or not the KernelFunction is valid, but it looks like I cannot easily do this while maintaining good error messages. In principle, I could always populate a KernelFunction which errors, but the KernelFunction needs to know which dispatch key is missing (this is not passed in from the calling convention). Actually, it might be possible to do something with functors, but I didn't do it here.
- If we are going to get serious about catch-alls for subsets of operators, we will need to design a new API for them. This diff is agnostic to this question; we don't change the public API at all.
- Precomputation opens up the possibility of subsuming DispatchStub by querying CPU capability when filling in the dispatch table. This is not implemented yet. (There is also a mild blocker here, which is that DispatchStub is also used to share TensorIterator configuration, and this cannot be directly supported by the regular Dispatcher.)

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D22236352
Pulled By: ezyang
fbshipit-source-id: d6d90f267078451816b1899afc3f79737b4e128c

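The precedence this commit describes (per-key kernel, else backend fallback, else catch-all, else error) can be illustrated with a small self-contained sketch. The types below are simplified stand-ins, not PyTorch's OperatorEntry, but the split between a registration-time updateDispatchTableFull_() and a single-array-lookup hot path mirrors the structure described above.

```
// Simplified model of precomputing a dispatch table: resolve the precedence
// once while registering, so the hot-path lookup is one array index.
#include <array>
#include <cstddef>
#include <functional>
#include <iostream>
#include <optional>

constexpr size_t kNumKeys = 4;  // stand-in for the size of the DispatchKey enum
using Kernel = std::function<void()>;

struct OperatorEntry {
  std::array<std::optional<Kernel>, kNumKeys> kernels_;          // per-key registrations
  std::array<std::optional<Kernel>, kNumKeys> backendFallback_;  // per-key fallbacks
  std::optional<Kernel> catchAll_;
  std::array<Kernel, kNumKeys> dispatchTable_;                   // precomputed result

  void updateDispatchTableFull_() {
    for (size_t k = 0; k < kNumKeys; ++k) {
      // Same precedence the old chained lookup applied at call time.
      if (kernels_[k])              dispatchTable_[k] = *kernels_[k];
      else if (backendFallback_[k]) dispatchTable_[k] = *backendFallback_[k];
      else if (catchAll_)           dispatchTable_[k] = *catchAll_;
      else dispatchTable_[k] = [k] { std::cout << "missing kernel for key " << k << "\n"; };
    }
  }

  // Hot path: a single indexed load, no chain of checks.
  const Kernel& lookup(size_t k) const { return dispatchTable_[k]; }
};

int main() {
  OperatorEntry op;
  op.catchAll_   = [] { std::cout << "catch-all\n"; };
  op.kernels_[0] = [] { std::cout << "CPU kernel\n"; };
  op.updateDispatchTableFull_();
  op.lookup(0)();  // prints: CPU kernel
  op.lookup(1)();  // prints: catch-all
}
```
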
a4cabd1a3c | Generalize Python dispatcher testing API; disallow overwriting fallback (#40469)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40469
- The old testing interface C._dispatch_import was based off the old c10::import variation, which meant the API lined up in a strange way with the actual torch/library.h. This diff reduces the differences by letting you program the Library constructor directly.
- Using this newfound flexibility, we add a test for backend fallbacks from Python, specifically testing that we disallow registering a backend fallback twice.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D22236351
Pulled By: ezyang
fbshipit-source-id: f8365e3033e9410c7e6eaf9f78aa32e1f7d55833

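For reference, this is roughly what the C++ side of a backend fallback registration looks like (the new Python test exercises the same path). This is a hedged sketch using torch/library.h; the dispatch key is chosen arbitrarily, and exact signatures may differ across PyTorch versions.

```
#include <torch/library.h>

// The wildcard namespace "_" means the fallback applies to every operator at
// the given dispatch key. Here we register a fallthrough, i.e. "skip this key".
TORCH_LIBRARY_IMPL(_, PrivateUse1, m) {
  m.fallback(torch::CppFunction::makeFallthrough());
}

// A second fallback registration for PrivateUse1 anywhere else in the program
// is what the new test checks: it should be rejected with an error rather than
// silently overwriting the first registration.
```
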
e29348f828 | Switch to pybind11 style registration function API. (#36258)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36258

Previously we had a &&-chaining style API. There are some downsides to this API:
- It's easy to forget the 'static' qualifier in front, leading to subtle ODR bugs.
- It is not compatible with torchbind class_ definitions, as these need multiple levels of chaining. So in practice people end up having to define multiple static initializers, one per class.
- It's not like pybind11.
- There's no way to conveniently get the file and line number of the registration, as there is no macro point in the API.
- The old API doesn't really encourage people to put all of their definitions for a library in one place, and to give a custom namespace for it. Similarly, the old API wasn't very DRY, because you had to keep repeating the namespace/dispatch key you were writing implementations for.

The new API is modeled exactly on the PYBIND11_MODULE macro: you write

```
TORCH_LIBRARY(aten, m) {
  m.def("aten::add(Tensor self, Tensor other) -> Tensor");
  ...
}
```

in a non-chaining fashion, and under the hood the macro expands to define a function, and define a static initializer that allocates a c10::Library (previously called c10::Module, but we renamed it to avoid confusion with the existing NN module concept), passes it to your function, and then retains it for the rest of the lifetime of the program. Specification of the namespace is mandatory, and in a later commit I plan to make it a hard error to TORCH_LIBRARY the same library name twice.

If you are specifying an implementation for an existing operator (e.g., you're the XLA backend, or even if you're just putting registrations for implementations at the implementation site), you should use TORCH_LIBRARY_IMPL, which instead takes a backend argument (instead of a namespace) and can be used to specify an implementation for a backend. Unlike TORCH_LIBRARY, you can do as many of these as you want for a backend.

This needs updates to the mobile code analyzer.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D20929257
Pulled By: ezyang
fbshipit-source-id: ba04d78492e8c93ae7190165fb936f6872896ada

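As a companion to the aten snippet above, here is a sketch of how a custom library would use the two macros together: one TORCH_LIBRARY block to declare schemas, plus a TORCH_LIBRARY_IMPL block per backend. The myops namespace, the myadd schema, and the CPU kernel are made up for illustration.

```
#include <ATen/ATen.h>
#include <torch/library.h>

// Hypothetical CPU kernel for the hypothetical operator below.
at::Tensor myadd_cpu(const at::Tensor& self, const at::Tensor& other) {
  return self + other;
}

// One TORCH_LIBRARY block per library name: declares the schemas owned by
// the "myops" namespace.
TORCH_LIBRARY(myops, m) {
  m.def("myadd(Tensor self, Tensor other) -> Tensor");
}

// Any number of TORCH_LIBRARY_IMPL blocks: each provides kernels for one
// dispatch key, here CPU.
TORCH_LIBRARY_IMPL(myops, CPU, m) {
  m.impl("myadd", myadd_cpu);
}
```
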
dd64e738c5 | Expunge TensorId from all DispatchKey names. (#36240)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36240

It's annoying, historical, and unnecessary (enum class is already namespaced). I did this codemod with:

```
git grep -l 'CPUTensorId' | xargs sed -i 's/CPUTensorId/CPU/g'
git grep -l 'CUDATensorId' | xargs sed -i 's/CUDATensorId/CUDA/g'
git grep -l 'VariableTensorId' | xargs sed -i 's/VariableTensorId/Autograd/g'
git grep -l 'HIPTensorId' | xargs sed -i 's/HIPTensorId/HIP/g'
git grep -l 'MSNPUTensorId' | xargs sed -i 's/MSNPUTensorId/MSNPU/g'
git grep -l 'XLATensorId' | xargs sed -i 's/XLATensorId/XLA/g'
git grep -l 'PrivateUse1_TensorId' | xargs sed -i 's/PrivateUse1_TensorId/PrivateUse1/g'
git grep -l 'PrivateUse2_TensorId' | xargs sed -i 's/PrivateUse2_TensorId/PrivateUse2/g'
git grep -l 'PrivateUse3_TensorId' | xargs sed -i 's/PrivateUse3_TensorId/PrivateUse3/g'
git grep -l 'AutocastTensorId' | xargs sed -i 's/AutocastTensorId/Autocast/g'
git grep -l '_PreAutogradTensorId' | xargs sed -i 's/_PreAutogradTensorId/_PreAutograd/g'
git grep -l 'TESTING_ONLY_GenericWrapperTensorId' | xargs sed -i 's/TESTING_ONLY_GenericWrapperTensorId/TESTING_ONLY_GenericWrapper/g'
git grep -l 'TESTING_ONLY_GenericModeTensorId' | xargs sed -i 's/TESTING_ONLY_GenericModeTensorId/TESTING_ONLY_GenericMode/g'
```

Then I did a git grep for remaining TensorId occurrences and manually killed those (mostly in codegen, and some docs that needed updating).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D20929255
Pulled By: ezyang
fbshipit-source-id: dc371b6aa6e6ea7c0a5660137c14debde806a09d

ef07bb65e9 | [RELAND] Add DispatchKey impl overload; remove use of torch::dispatch (#36222)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36222

Reland of #35706, with fixes to the code analyzer.

It is extremely common to define implementations of operators at a specific dispatch key, so we add an overload to impl specifically for this case. I then delete most uses of torch::dispatch. dispatch_autograd call sites can't make use of this overload, so instead the new preferred way to specify something as autograd is to pass kAutograd as the dispatch key (short form, analogous to kCPU/kCUDA, which we support today). I flip-flopped about whether kAutograd should have the type DispatchKey or some other type (to help better encapsulate the DispatchKey enum); this is more direct and I can't think of any BC problems from this usage.

Some other reorganization I did:
- I renamed all of the worker functions in op_registration to have a leading underscore and made them private, just to make it clearer what the public versus private API is (the private API shouldn't be used by users because it doesn't come with && overloads). Note that this means I needed to adjust the regex in the code analyzer.
- In a few places where I was touching lines already, I replaced fully typed-out DispatchKey enums with shorter kFoo names, similar to kAutograd, but I didn't publish these globally.
- The code analyzer now prints a unified diff, and in the other order (because I tend to think of the diff as reporting how the /new/ result is different).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D20929256
Pulled By: ezyang
fbshipit-source-id: c69b803d2b3a1a8aff70e14da33d3adec5239f13

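A hedged sketch of the impl overload the commit describes, i.e. passing the dispatch key directly instead of wrapping the kernel in torch::dispatch(...). The myns namespace, the myop schema, and both kernels are hypothetical, and a real autograd kernel would also set up the backward and redispatch; this only shows the registration shape.

```
#include <ATen/ATen.h>
#include <torch/library.h>

// Hypothetical kernels for a hypothetical "myns::myop" operator.
at::Tensor myop_cpu(const at::Tensor& x)      { return x.clone(); }
at::Tensor myop_autograd(const at::Tensor& x) { return x.clone(); }

TORCH_LIBRARY(myns, m) {
  m.def("myop(Tensor self) -> Tensor");
  // The new overload: name, dispatch key, kernel.
  m.impl("myop", c10::DispatchKey::CPU, myop_cpu);
  // The commit also adds a kAutograd shorthand for this key; spelled out here.
  m.impl("myop", c10::DispatchKey::Autograd, myop_autograd);
}
```
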
0f99b28431 | Revert D20775783: Add DispatchKey impl overload; remove use of torch::dispatch

Test Plan: revert-hammer
Differential Revision: D20775783
Original commit changeset: e45b289e5d1f
fbshipit-source-id: 08551428fa886e93cfda14eb51a0f920c335df34

2db61193bb | Add DispatchKey impl overload; remove use of torch::dispatch (#35706)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35706

It is extremely common to define implementations of operators at a specific dispatch key, so we add an overload to impl specifically for this case. I then delete most uses of torch::dispatch. dispatch_autograd call sites can't make use of this overload, so instead the new preferred way to specify something as autograd is to pass kAutograd as the dispatch key (short form, analogous to kCPU/kCUDA, which we support today). I flip-flopped about whether kAutograd should have the type DispatchKey or some other type (to help better encapsulate the DispatchKey enum); this is more direct and I can't think of any BC problems from this usage.

Some other reorganization I did:
- I renamed all of the worker functions in op_registration to have a leading underscore and made them private, just to make it clearer what the public versus private API is (the private API shouldn't be used by users because it doesn't come with && overloads).
- In a few places where I was touching lines already, I replaced fully typed-out DispatchKey enums with shorter kFoo names, similar to kAutograd, but I didn't publish these globally.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D20775783
Pulled By: ezyang
fbshipit-source-id: e45b289e5d1f86c180b24cf14c63cf4459ab5337

9e3605de98 | [RELAND] New operator registration API (#35061) (#35629)

Summary: Reland of https://github.com/pytorch/pytorch/pull/35061; removed the get-qualified-type-name magic from debug strings to work around an MSVC 2017 bug.

Main points of the new API:
- You can register implementations (impl) without having to specify a schema.
- Registrations are commutative, so no matter what order your static initializers run, you end up with the same end result.

op_registration_test.cpp contains a reasonably comprehensive accounting of the available API surface.

How does this implementation proceed? The basic concept is to relax the internal invariants of the Dispatcher data structures to allow the possibility that a FunctionSchema is not specified in an Operator.
- DispatchKeyExtractor has an uninitialized state where it doesn't look for dispatch keys in any arguments of the stack. It can have a schema (de)registered to itself post facto with registerSchema/unregisterSchema.
- DispatchTable has a new constructor taking only an OperatorName for the uninitialized state. It can have a schema (de)registered to itself post facto with registerSchema/unregisterSchema.
- OperatorDef maintains counts of both defs as well as defs_and_impls. defs_and_impls keeps track of the outstanding impl registrations; you may have impl registrations but no defs. If there are no defs (no schema), the operator is not returned by findSchema. A new findOperatorByName function unconditionally returns the OperatorHandle even if there's no schema. OperatorHandle::hasSchema can be used to check if the operator has a schema.
- Replaced 'registerKernel' with 'registerImpl', which is the new interface for directly registering kernels without a schema.
- Because 'registerImpl' no longer requires an OperatorHandle, 'registerDef' was changed to only return a RegistrationHandleRAII. This is marginally less efficient (since we're doing two hash table lookups on a registration now), but this won't matter in the long term, and probably doesn't matter now either.
- Renamed registerBackendFallbackKernel to registerFallback (this exposed a bunch of places where we're improperly directly interfacing with the Dispatcher; we need to add this capability to the true public API).
- All code-generated internal registrations are switched to use the new API. This includes VariableType registrations (which previously weren't converted) and the mobile autograd stuff.
- Switch the new-style def()/impl() APIs to interact directly with the Dispatcher, rather than indirecting through the old API.
- We deleted alias analysis kind merging entirely. As a nod to BC, it's possible to define a full schema with an alias analysis kind, and then later do another full schema def with a missing alias analysis kind, but the opposite direction is not allowed. We can remove this entirely following the plan at https://github.com/pytorch/pytorch/issues/35040
- Schema matching is moved inside the dispatcher, because we might not be able to immediately schema match at the point of an impl() (because we don't have the schema yet). To do this, we store the inferred function schema inside a KernelEntry, so we can check it when we get the real schema.
- Registered kernel functions now store a debug string which can be used to more easily identify them. Tests use this to distinguish between multiple distinct registrations; regular invocations get only very basic information.

Because we need our static initializers to work no matter what order they're run, the testing strategy on this PR is quite involved. The general concept:
- Bind a (very gimped) version of the dispatcher API from Python, so that we can easily write a more complex testing harness using expect tests.
- For each series of registrations we want to test, exhaustively test every possible permutation of registrations (and deregistrations), and show that the intermediate states agree no matter what path is taken.
- Intermediate states are rendered using a new dumpState() debugging method that prints the internal state of the dispatcher. This method may be generally useful for people who want to see what's in the dispatcher.
- Simultaneously, add a new invariant-testing function which checks that the internal invariants of the dispatcher are upheld (so we don't have to print internal implementation details of the dispatcher).

The testing framework found a few bugs in development. For example, here is a case where we registered a schema too early, before checking if it was valid:

```
Traceback (most recent call last):
  File "test/test_dispatch.py", line 164, in test_def_impl_schema_mismatch
    ], raises=True)
  File "test/test_dispatch.py", line 135, in commute
    results=results, raises=raises)
  File "test/test_dispatch.py", line 83, in run_permutation
    .format(ctor_order[:i], op_ix))
  File "test/test_dispatch.py", line 59, in check_invariants
    .format(expected_provenance, actual_provenance)
AssertionError: 'name[16 chars]ema: (none)\ncatchall: boxed unboxed :: (Tenso[18 chars]0)\n' != 'name[16 chars]ema: test::foo(Tensor x, Tensor y) -> (Tensor)[53 chars]0)\n'
  name: test::foo
- schema: (none)
+ schema: test::foo(Tensor x, Tensor y) -> (Tensor)
  catchall: boxed unboxed :: (Tensor _0) -> (Tensor _0)
 : expected from running ctors (1,); actual from running ctors (1,) and then failing to run ctor 0 (did this failure leave the dispatcher in a wedged state? it shouldn't!)
```

There are also C++ smoke tests for the API. These tests comprehensively cover the C++ API surface of the new operator registration API, but don't check very hard if the API does the right thing (that's what test_dispatch.py is for).

Some miscellaneous changes which could have been split into other PRs, but I was too lazy to do so:
- Add torch::jit::parseName (mirroring parseSchema/parseSchemaOrName).
- Add cloneWithName functionality to FunctionSchema.
- Unconditionally generate schema registration, even when type_method_dispatch is a dict. The one exception is for manual registrations.
- Add fallback, CppFunction::makeFallthrough and CppFunction::makeFromBoxedFunction to the public API of op_registration, so we can stop calling the internal registerImpl directly.
- Add new syntax sugar dispatch_autograd for registering autograd kernels.
- Minor OperatorName cleanup: storing OperatorName in DispatchTable and defining operator<< on OperatorName.
- Refactored the op registration API to take FunctionSchema directly. We now do namespacing by post facto fixing up the OperatorName embedded in the FunctionSchema. This also means that you can now do torch::import("ns1").def("ns2::blah") and have ns2 override ns1 (although maybe this is not the correct behavior).
- New torch::schema public API, for attaching alias analysis kind annotations. This meant we had to template up some function signatures which previously took const char*. There's now a nice comment explaining this strategy.
- torch::import now takes std::string, which means we can use the namespacing from Python.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35629
Differential Revision: D20724551
Pulled By: ezyang
fbshipit-source-id: befa46a1affb4ec4ae1fb39e3564a63695a6ca41

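A hedged C++ sketch of the first two bullet points above (schema-less impl plus commutativity): the two registration blocks below can run in either order, even across translation units, and should leave the dispatcher in the same state. The demo namespace, the my_relu schema, and the kernel are hypothetical.

```
#include <ATen/ATen.h>
#include <torch/library.h>

// Hypothetical backend kernel.
at::Tensor my_relu_cpu(const at::Tensor& x) {
  return x.clamp_min(0);
}

// An impl-only registration: no schema is supplied here. With this PR, this
// block may run before or after the TORCH_LIBRARY block below and the end
// state is the same.
TORCH_LIBRARY_IMPL(demo, CPU, m) {
  m.impl("my_relu", my_relu_cpu);
}

// The def() that supplies the schema for demo::my_relu.
TORCH_LIBRARY(demo, m) {
  m.def("my_relu(Tensor self) -> Tensor");
}
```
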
227beb9095 | Revert D20680520: New operator registration API

Test Plan: revert-hammer
Differential Revision: D20680520
Original commit changeset: 5d39a28e4ec7
fbshipit-source-id: 5b2497ffc24db9a05b01d526f161bc0164f9f707

28ab8c6ff8 | New operator registration API (#35061)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35061

Main points of the new API:
- You can register implementations (impl) without having to specify a schema.
- Registrations are commutative, so no matter what order your static initializers run, you end up with the same end result.

op_registration_test.cpp contains a reasonably comprehensive accounting of the available API surface.

How does this implementation proceed? The basic concept is to relax the internal invariants of the Dispatcher data structures to allow the possibility that a FunctionSchema is not specified in an Operator.
- DispatchKeyExtractor has an uninitialized state where it doesn't look for dispatch keys in any arguments of the stack. It can have a schema (de)registered to itself post facto with registerSchema/unregisterSchema.
- DispatchTable has a new constructor taking only an OperatorName for the uninitialized state. It can have a schema (de)registered to itself post facto with registerSchema/unregisterSchema.
- OperatorDef maintains counts of both defs as well as defs_and_impls. defs_and_impls keeps track of the outstanding impl registrations; you may have impl registrations but no defs. If there are no defs (no schema), the operator is not returned by findSchema. A new findOperatorByName function unconditionally returns the OperatorHandle even if there's no schema. OperatorHandle::hasSchema can be used to check if the operator has a schema.
- Replaced 'registerKernel' with 'registerImpl', which is the new interface for directly registering kernels without a schema.
- Because 'registerImpl' no longer requires an OperatorHandle, 'registerDef' was changed to only return a RegistrationHandleRAII. This is marginally less efficient (since we're doing two hash table lookups on a registration now), but this won't matter in the long term, and probably doesn't matter now either.
- Renamed registerBackendFallbackKernel to registerFallback (this exposed a bunch of places where we're improperly directly interfacing with the Dispatcher; we need to add this capability to the true public API).
- All code-generated internal registrations are switched to use the new API. This includes VariableType registrations (which previously weren't converted) and the mobile autograd stuff.
- Switch the new-style def()/impl() APIs to interact directly with the Dispatcher, rather than indirecting through the old API.
- We deleted alias analysis kind merging entirely. As a nod to BC, it's possible to define a full schema with an alias analysis kind, and then later do another full schema def with a missing alias analysis kind, but the opposite direction is not allowed. We can remove this entirely following the plan at https://github.com/pytorch/pytorch/issues/35040
- Schema matching is moved inside the dispatcher, because we might not be able to immediately schema match at the point of an impl() (because we don't have the schema yet). To do this, we store the inferred function schema inside a KernelEntry, so we can check it when we get the real schema.
- Registered kernel functions now store a debug string which can be used to more easily identify them. There's some best-effort stuff based on `__FUNCSIG__`, but this is only really capable of reporting types and not function symbols. Tests use this to distinguish between multiple distinct registrations.

Because we need our static initializers to work no matter what order they're run, the testing strategy on this PR is quite involved. The general concept:
- Bind a (very gimped) version of the dispatcher API from Python, so that we can easily write a more complex testing harness using expect tests.
- For each series of registrations we want to test, exhaustively test every possible permutation of registrations (and deregistrations), and show that the intermediate states agree no matter what path is taken.
- Intermediate states are rendered using a new dumpState() debugging method that prints the internal state of the dispatcher. This method may be generally useful for people who want to see what's in the dispatcher.
- Simultaneously, add a new invariant-testing function which checks that the internal invariants of the dispatcher are upheld (so we don't have to print internal implementation details of the dispatcher).

The testing framework found a few bugs in development. For example, here is a case where we registered a schema too early, before checking if it was valid:

```
Traceback (most recent call last):
  File "test/test_dispatch.py", line 164, in test_def_impl_schema_mismatch
    ], raises=True)
  File "test/test_dispatch.py", line 135, in commute
    results=results, raises=raises)
  File "test/test_dispatch.py", line 83, in run_permutation
    .format(ctor_order[:i], op_ix))
  File "test/test_dispatch.py", line 59, in check_invariants
    .format(expected_provenance, actual_provenance)
AssertionError: 'name[16 chars]ema: (none)\ncatchall: boxed unboxed :: (Tenso[18 chars]0)\n' != 'name[16 chars]ema: test::foo(Tensor x, Tensor y) -> (Tensor)[53 chars]0)\n'
  name: test::foo
- schema: (none)
+ schema: test::foo(Tensor x, Tensor y) -> (Tensor)
  catchall: boxed unboxed :: (Tensor _0) -> (Tensor _0)
 : expected from running ctors (1,); actual from running ctors (1,) and then failing to run ctor 0 (did this failure leave the dispatcher in a wedged state? it shouldn't!)
```

There are also C++ smoke tests for the API. These tests comprehensively cover the C++ API surface of the new operator registration API, but don't check very hard if the API does the right thing (that's what test_dispatch.py is for).

Some miscellaneous changes which could have been split into other PRs, but I was too lazy to do so:
- Add torch::jit::parseName (mirroring parseSchema/parseSchemaOrName).
- Add cloneWithName functionality to FunctionSchema.
- Unconditionally generate schema registration, even when type_method_dispatch is a dict. The one exception is for manual registrations.
- Add fallback, CppFunction::makeFallthrough and CppFunction::makeFromBoxedFunction to the public API of op_registration, so we can stop calling the internal registerImpl directly.
- Add new syntax sugar dispatch_autograd for registering autograd kernels.
- Minor OperatorName cleanup: storing OperatorName in DispatchTable and defining operator<< on OperatorName.
- Refactored the op registration API to take FunctionSchema directly. We now do namespacing by post facto fixing up the OperatorName embedded in the FunctionSchema. This also means that you can now do torch::import("ns1").def("ns2::blah") and have ns2 override ns1 (although maybe this is not the correct behavior).
- New torch::schema public API, for attaching alias analysis kind annotations. This meant we had to template up some function signatures which previously took const char*. There's now a nice comment explaining this strategy.
- torch::import now takes std::string, which means we can use the namespacing from Python.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D20680520
Pulled By: ezyang
fbshipit-source-id: 5d39a28e4ec7c73fe4b1fb2222e865ab65e188f5