pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Sebastian Messmer	4534bf5799	Fix NativeFunctions.h for c10-full ops (#46090 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46090 ghstack-source-id: 114269272 Test Plan: vs base diff: https://www.internalfb.com/intern/fblearner/details/223884639/ Reviewed By: ezyang Differential Revision: D24219942 fbshipit-source-id: 6f338c7c0dd5adfe2fba8b36ccc340032d3faef8	2020-10-14 06:32:36 -07:00
Edward Yang	d705083c2b	Refactor dispatcher and native to use Signature structure. (#45990 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45990 In #45890 we introduced the concept of a CppSignature, which bundled up all of the information necessary to declare a C++ signature for the cpp API. This PR introduces analogous concepts for dispatcher and native: DispatcherSignature and NativeSignature. The three interfaces are not particularly well coupled right now, but they do have some duck typing coincidences: - defn() which renders the C++ definition "bool f(int x)" - decl() which renders the C++ declaration "bool f(int x = 2)" - type() which renders the C++ function type "bool(int)" Maybe at some point we'll introduce a Protocol, or a supertype. Many other methods (like arguments()) have varying types. These signatures also have some helper methods that forward back to real implementations in the api modules. Something to think about is whether or not we should attempt to reduce boilerplate here or not; I'm not too sure about it yet. The net effect is we get to reduce the number of variables we have to explicitly write out in the codegen, since now these are all bundled together into a signature. Something extra special happens in BackendSelect, where we now dynamically select between dispatcher_sig and native_sig as "how" the backend select is implemented. A little bit of extra cleanup: - Some places where we previously advertised Sequence, we now advertise a more informative Tuple. - defn() may take an optional positional parameter overriding the entire name, or a kwarg-only prefix parameter to just add a prefix to the name. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D24223100 Pulled By: ezyang fbshipit-source-id: f985eced08af4a60ba9641d125d0f260f8cda9eb	2020-10-13 08:34:48 -07:00
Edward Yang	8d5c899b19	Rename legacy_dispatcher to native. (#45974 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45974 The term "legacy dispatcher" caused a bunch of confusion between me and Sebastian when discussing what the intended semantics of legacy dispatcher argument is. Legacy dispatcher argument implies that you ought NOT to use it when you have use_c10_dispatcher: full; but that's not really what's going on; legacy dispatcher API describes the API that you write native:: functions (NativeFunctions.h) to. Renaming it here makes this more clear. I applied these seds: ``` git grep -l 'legacy_dispatcher' \| xargs sed -i 's/legacy_dispatcher/native/g' git grep -l 'legacydispatcher' \| xargs sed -i 's/legacydispatcher/native/g' git grep -l 'LegacyDispatcher' \| xargs sed -i 's/LegacyDispatcher/Native/g' ``` and also grepped for "legacy" in tools/codegen and fixed documentation. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D24223101 Pulled By: ezyang fbshipit-source-id: d1913b8b823b3b95e4546881bc0e876acfa881eb	2020-10-13 08:34:43 -07:00
Edward Yang	9079aea1ac	Rewrite implementation of faithful cpp signatures (#45890 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45890 This rewrite is as per my comments at https://github.com/pytorch/pytorch/pull/44087#issuecomment-701664506 I did the rewrite by reverting #44087 and then reimplementing it on top. You may find it easier to review by diffing against master with only #44087 reverted. There are two main ideas. First, we now factor cpp argument processing into two phases operating on three representations of data: 1. `FunctionSchema` - this is the source from native_functions.yaml 2. `Union[Argument, ThisArgument, TensorOptionsArgument]` - this is the arguments after doing some basic semantic analysis to group them (for TensorOptions) or identify the this argument (if this is a method). There is only ever one of these per functions. 3. `Union[CppArgument, CppThisArgument, CppTensorOptionsArgument]` - this is the arguments after we've elaborated them to C++. There may be multiple of these per actual C++ signature. You can think of (2) as common processing, whereas (3) bakes in specific assumptions about whether or not you have a faithful or non-faithful signature. Second, we now have CppSignature and CppSignatureGroup representing the total public C++ API signature. So those dataclasses are what know how to render definitions/declarations, and you no longer have to manually type it out in the Functions/TensorMethods codegen. Here is an exhaustive accounting of the changes. tools.codegen.api.types - CppSignature and CppSignatureGroup got moved to tools.codegen.api.types - Add new CppThisArgument and CppTensorOptionsArguments (modeled off of ThisArgument and TensorOptionsArguments) so that we can retain high level semantic structure even after elaborating terms with C++ API information. Once this is done, we can refine CppArgument.argument to no longer contain a ThisArgument (ThisArgument is always translated to CppThisArgument. Note that this doesn't apply to TensorOptionsArguments, as those may be expanded or not expanded, and so you could get a single CppArgument for 'options') - Add no_default() functional mutator to easily remove default arguments from CppArgument and friends - Add an explicit_arguments() method to CppArgument and friends to extract (flat) argument list that must be explicitly written in the signature. This is everything except (Cpp)ThisArgument, and is also convenient when you don't care about the extra structure of CppTensorOptionsArguments tools.codegen.api.cpp - group_arguments is back, and it doesn't send things directly to a CppSignatureGroup; instead, it moves us from representation (1) to (2) (perhaps it should live in model). Here I changed my mind from my PR comment; I discovered it was not necessary to do classification at grouping time, and it was simpler and easier to do it later. - argument got split into argument_not_this/argument/argument_faithful. argument and argument_faithful are obvious enough what they do, and I needed argument_not_this as a more refined version of argument so that I could get the types to work out on TensorOptionsArguments tools.codegen.api.dispatcher - Here we start seeing the payoff. The old version of this code had a "scatter" mode and a "gather" mode. We don't need that anymore: cppargument_exprs is 100% type-directed via the passed in cpp arguments. I am able to write the functions without any reference to use_c10_dispatcher tools.codegen.gen - Instead of having exprs_str and types_str functions, I moved these to live directly on CppSignature, since it seemed pretty logical. - The actual codegen for TensorMethods/Functions is greatly simplified, since (1) all of the heavy lifting is now happening in CppSignature(Group) construction, and (2) I don't need to proxy one way or another, the new dispatcher translation code is able to handle both cases no problem. There is a little faffing about with ordering to reduce the old and new diff which could be removed afterwards. Here are codegen diffs. For use_c10_dispatcher: full: ``` +// aten::_cudnn_init_dropout_state(float dropout, bool train, int dropout_seed, , ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False) -> Tensor Tensor _cudnn_init_dropout_state(double dropout, bool train, int64_t dropout_seed, const TensorOptions & options) { - return _cudnn_init_dropout_state(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt()); + static auto op = c10::Dispatcher::singleton() + .findSchemaOrThrow("aten::_cudnn_init_dropout_state", "") + .typed<Tensor (double, bool, int64_t, c10::optional<ScalarType>, c10::optional<Layout>, c10::optional<Device>, c10::optional<bool>)>(); + return op.call(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt()); } ``` Otherwise: ``` +// aten::empty_meta(int[] size, , ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, MemoryFormat? memory_format=None) -> Tensor Tensor empty_meta(IntArrayRef size, c10::optional<ScalarType> dtype, c10::optional<Layout> layout, c10::optional<Device> device, c10::optional<bool> pin_memory, c10::optional<MemoryFormat> memory_format) { - return empty_meta(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format); + static auto op = c10::Dispatcher::singleton() + .findSchemaOrThrow("aten::empty_meta", "") + .typed<Tensor (IntArrayRef, const TensorOptions &, c10::optional<MemoryFormat>)>(); + return op.call(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format); } ``` Things that I probably did not get right: - The Union[Argument, TensorOptionsArguments, ThisArgument] and the Cpp variants are starting to get a little unwieldy. Not sure if this means I should add a supertype (or at the very least an alias); in some cases I do purposely omit one of these from the Union - Code may not necessarily live in the most logical files. There isn't very much rhyme or reason to it. - The fields on CppSignature. They're not very well constrained and it will be better if people don't use them directly. - Disambiguation. We should do this properly in #44087 and we don't need special logic for deleting defaulting for faithful signatures; there is a more general story here. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D24144035 Pulled By: ezyang fbshipit-source-id: a185f8bf9df8b44ca5718a7a44dac23cefd11c0a	2020-10-13 08:31:54 -07:00
Sebastian Messmer	6ba6ecb048	Only use hacky_wrapper_for_legacy_signatures if an op needs it (#45742 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45742 Add a new flag to native_functions.yaml: `use_c10_dispatcher: hacky_wrapper_for_legacy_signatures` and the codegen only wraps kernels in the aforementioned wrapper if that flag is set. Apart from that, `use_c10_dispatcher: hacky_wrapper_for_legacy_signatures` is equivalent to `full`, i.e. it has full boxing and unboxing support. This greatly reduces the number of ops we apply the hacky_wrapper to, i.e. all ops marked as `use_c10_dispatcher: full` don't have it anymore. ghstack-source-id: 113982139 Test Plan: waitforsandcastle vs fbcode: https://www.internalfb.com/intern/fblearner/details/214511705/ vs base diff: https://www.internalfb.com/intern/fblearner/details/214693207/ Reviewed By: ezyang Differential Revision: D23328718 fbshipit-source-id: be120579477b3a05f26ca5f75025bfac37617620	2020-10-12 09:39:18 -07:00
Sebastian Messmer	6e2eee2b9d	Add faithful C++ API (#44087 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44087 Each op taking a TensorOptions argument now has an additional overload in the C++ frontend where it takes scattered ScalarType, Layout, Device, bool instead of one TensorOptions argument. If it is a c10-full op, then the scattered version calls into the dispatcher and the gathered version is a proxy calling into the scattered version. If it is a non-c10-full op, then the gathered version calls into the dispatcher and the scattered version is a proxy calling into the gathered version. This should minimize the amount of gathering and scattering needed. This PR is also a prerequisite to remove the re-gathering of arguments that is currently happening in VariableKernel. Currently, VariableKernels gather arguments into a TensorOptions object to call into the C++ API. In a PR stacked on top of this, VariableKernel will just directly call into the scattered C++ API introduced here and avoid the gathering step. ghstack-source-id: 113355689 Test Plan: waitforsandcastle vs master: https://www.internalfb.com/intern/fblearner/details/216169815/ vs previous diff: https://www.internalfb.com/intern/fblearner/details/216169957/ Reviewed By: ezyang Differential Revision: D23492188 fbshipit-source-id: 3e84c467545ad9371e98e09075a311bd18411c5a	2020-10-02 04:08:53 -07:00
Ailing Zhang	606b1a9a2e	Move xla codegen to aten. (#45241 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45241 Test Plan: Imported from OSS Reviewed By: soumith Differential Revision: D23926750 Pulled By: ailzhang fbshipit-source-id: f768e24a9baeca9f9df069a62d6f8b94a853a1ee	2020-09-25 18:07:32 -07:00
Sebastian Messmer	2ac7de7d53	Remove hacky_wrapper from BackendSelect kernels (#44062 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44062 Previously, BackendSelect kernels were still written in the legacy way, i.e. they took one TensorOptions argument instead of scattered dtype, layout, device, pin_memory, and they used hacky_wrapper to be callable. This caused a re-wrapping step. Calling into a BackencSelect kernel required taking the individual scattered arguments, packing them into a TensorOptions, and the kernel itself then gathered them again for redispatch. Now with this PR, BackendSelect kernels are written in the new way and no hacky_wrapper or rewrapping is needed for them. ghstack-source-id: 112825789 Test Plan: vs master: https://www.internalfb.com/intern/fblearner/details/216117032/ vs previous diff: https://www.internalfb.com/intern/fblearner/details/216170194/ Reviewed By: ezyang Differential Revision: D23484192 fbshipit-source-id: e8fb49c4692404b6b775d18548b990c4cdddbada	2020-09-25 09:04:03 -07:00
Edward Yang	6ea89166bd	Rewrite of ATen code generator (#42629 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42629 How to approach reviewing this diff: - The new codegen itself lives in `tools/codegen`. Start with `gen.py`, then read `model.py` and them the `api/` folder. The comments at the top of the files describe what is going on. The CLI interface of the new codegen is similar to the old one, but (1) it is no longer necessary to explicitly specify cwrap inputs (and now we will error if you do so) and (2) the default settings for source and install dir are much better; to the extent that if you run the codegen from the root source directory as just `python -m tools.codegen.gen`, something reasonable will happen. - The old codegen is (nearly) entirely deleted; every Python file in `aten/src/ATen` was deleted except for `common_with_cwrap.py`, which now permanently finds its home in `tools/shared/cwrap_common.py` (previously cmake copied the file there), and `code_template.py`, which now lives in `tools/codegen/code_template.py`. We remove the copying logic for `common_with_cwrap.py`. - All of the inputs to the old codegen are deleted. - Build rules now have to be adjusted to not refer to files that no longer exist, and to abide by the (slightly modified) CLI. - LegacyTHFunctions files have been generated and checked in. We expect these to be deleted as these final functions get ported to ATen. The deletion process is straightforward; just delete the functions of the ones you are porting. There are 39 more functions left to port. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D23183978 Pulled By: ezyang fbshipit-source-id: 6073ba432ad182c7284a97147b05f0574a02f763	2020-08-31 09:00:22 -07:00

9 Commits