pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Jiakai Liu	d91cefb0d8	[pytorch][codegen] migrate gen_annotated_fn_args.py to new codegen model (#47745 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47745 This is a relatively small codegen. Reintroduced 'simple_type' to preserve old codegen output. It depends on some methods defined in gen_python_functions.py - next PR will clean up the remaining Declarations.yaml methods in gen_python_functions.py. Confirmed byte-for-byte compatible with the old codegen: ``` Run it before and after this PR: .jenkins/pytorch/codegen-test.sh <baseline_output_dir> .jenkins/pytorch/codegen-test.sh <test_output_dir> Then run diff to compare the generated files: diff -Naur <baseline_output_dir> <test_output_dir> ``` Differential Revision: D24885068 Test Plan: Imported from OSS Reviewed By: ezyang Pulled By: ljk53 fbshipit-source-id: c0fbd726bcc450c3c7fe232c23e5b31779d0b65f	2020-11-14 02:24:39 -08:00
Edward Yang	809660ffa4	ATen DerivedType is dead, long live ATen RegisterDispatchKey (#47011) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47011 smessmer has complained about how it is difficult to find generated code. Well hopefully this diff helps a bit with that. There are three components to this refactor: - Rename TypeDerived (CPUType) to RegisterDispatchKey (RegisterCPU). The 'Type' nomenclature is vestigial and I think Register says what these files do a lot more clearly. I also got rid of the CPUType namespace; everything just goes in anonymous namespace now, less moving parts this way. - Give Math and DefaultBackend their own files (RegisterMath and RegisterDefaultBackend) - Restructure code generation so that schema definition is done completely separately from RegisterDispatchKey I decided to name the files RegisterCPU rather than the old convention BackendSelectRegister, because it seems better to me if these files clump together in an alphabetical listing rather than being spread out everywhere. There are a few manual registration files which should probably get similar renaming. I also did a little garden cleaning about how we identify if a dispatch key is a cuda key or a generic key (previously called KEYWORD_ALL_BACKENDS but I like my naming better). Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: D24600806 Test Plan: Imported from OSS Reviewed By: smessmer Pulled By: ezyang fbshipit-source-id: c1b510dd7515bd95e3ad25b8edf961b2fb30a25a	2020-11-12 09:53:48 -08:00
Edward Yang	0c64f9f526	Convert from higher order functions to classes in tools.codegen.gen (#47008 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47008 bhosmer has been complaining about how it is difficult to distinguish between local variables and closed over variables in the higher order functions. Well, closures and objects do basically the same thing, so just convert all these HOFs into objects. The decoder ring: - Higher order function => Constructor for object - Access to closed over variable => Access to member variable on object - with_native_function => method_with_native_function (because it's hard writing decorators that work for both functions and methods) I didn't even have to change indentation (much). When there is no need for closed over variables (a few functions), I kept them as plain old functions, no need for an object with no members. While I was at it, I also deleted the kwargs, since the types are enough to prevent mistakes. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D24600805 Pulled By: ezyang fbshipit-source-id: 7e3ce8cb2446e3788f934ddcc17f7da6e9299511	2020-11-11 10:30:50 -08:00
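The "decoder ring" in the commit above can be illustrated with a small sketch. The names `make_prefixer` and `Prefixer` are hypothetical and do not appear in tools.codegen.gen; the point is only that a closed-over variable and a member variable carry the same information.

```python
from dataclasses import dataclass
from typing import Callable


def make_prefixer(prefix: str) -> Callable[[str], str]:
    """Higher-order function: `prefix` is a closed-over variable."""
    def go(name: str) -> str:
        # Inside `go`, nothing marks `prefix` as coming from the outer scope.
        return f"{prefix}{name}"
    return go


@dataclass(frozen=True)
class Prefixer:
    """Object form: the closed-over variable becomes an explicit member."""
    prefix: str

    def __call__(self, name: str) -> str:
        # `self.prefix` makes the origin of the value visible at the use site.
        return f"{self.prefix}{name}"


names = ["add", "mul"]
via_closure = [make_prefixer("at::")(n) for n in names]
via_object = list(map(Prefixer("at::"), names))
```

Both forms produce the same strings; the object form simply spells out `self.` wherever captured state is read.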
Iurii Zdebskyi	1c45631f10	Revert D24737050: [WIP] Adding bunch of unary foreach APIs Test Plan: revert-hammer Differential Revision: D24737050 (`b6a2444eff`) Original commit changeset: deb59b41ad1c fbshipit-source-id: 76cd85028114cfc8fc5b7bb49cd27efc2e315aa5	2020-11-10 09:41:41 -08:00
Iurii Zdebskyi	b6a2444eff	[WIP] Adding bunch of unary foreach APIs (#47383 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47383 Test Plan: Imported from OSS Reviewed By: anjali411 Differential Revision: D24737050 Pulled By: izdeby fbshipit-source-id: deb59b41ad1c79b66cafbd9a9d3d6b069794e743	2020-11-09 14:14:28 -08:00
Jiakai Liu	4159191f0e	[pytorch] split out trace type generator and migrate to new codegen model (#47438 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47438 Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D24808211 Pulled By: ljk53 fbshipit-source-id: 44dfadf550a255c05aa201e54b48101aaf722885	2020-11-09 12:39:39 -08:00
Jiakai Liu	499d2fad98	[pytorch] factor out return_names api (#47437 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47437 Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D24808213 Pulled By: ljk53 fbshipit-source-id: 8ec6d58952fd677ab2d97e63b060cafda052411a	2020-11-09 12:39:37 -08:00
Jiakai Liu	16c72a5a6b	[pytorch] continue to rewrite gen_python_functions.py with typed models (#46978) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46978 Refactored and added type annotations to most of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D24589210 Pulled By: ljk53 fbshipit-source-id: e0c7e5b3672b41983f321400c2e2330d1462e76e	2020-11-08 01:34:12 -08:00
Brian Hirsh	7a0f0d24d0	Codegen - error when an argument that looks like an out argument isn't a kwarg (fix #43273 ) (#47284 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47284 Test Plan: Imported from OSS Reviewed By: nikithamalgifb Differential Revision: D24706763 Pulled By: bdhirsh fbshipit-source-id: 60fbe81a0dff7e07aa8c169235d15b84151d3ed7	2020-11-03 16:30:01 -08:00
Edward Yang	843cab3f2e	Delete TypeDefault.h and TypeDerived.h codegen entirely. (#47002 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47002 There was no good reason for TypeDerived.h (CPUType.h) codegen to exist after static dispatch was deleted, and now that we have Math alias key TypeDefault.h header is not needed either. Sorry to anyone who was using these out of tree. I didn't entirely delete TypeDefault.h as it has a use in a file that I can't conveniently compile test locally. Will kill it entirely in a follow up. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D24596583 Pulled By: ezyang fbshipit-source-id: b5095d3509098ff74f836c5d0c272db0b2d226aa	2020-10-29 14:43:53 -07:00
Edward Yang	41f8641f1e	Delete SchemaRegister.cpp, make flag operate on TypeDefault.cpp (#46991 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46991 This change is motivated by a problem bdhirsh observed which is that in internal builds that include both SchemaRegister.cpp and TypeDefault.cpp, some operators have their schemas defined multiple times. Instead of dumping schema registrations in multiple files, it seems better to just toggle how many schemas we write into TypeDefault.cpp. ljk53 observes that technically SchemaRegister.cpp is only needed by full-JIT frontend, and not by light interpreter (to resolve schema lookups). However, in practice, the registration file seems to be unconditionally loaded. This change will make it harder to do the optimization where we drop schemas in the light interpreter, but you probably want to architect this differently (similar to per-op registrations, DON'T do any registrations in ATen, and then write out the schema registrations in a separate library.) I took this opportunity to also simplify the TypeDefault generation logic by reworking things so that we only ever call with None argument when registering. Soon, we should be able to just split these files up entirely. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: ljk53 Differential Revision: D24593704 Pulled By: ezyang fbshipit-source-id: f01ea22a3999493da77b6e254d188da0ce9adf2f	2020-10-29 14:43:47 -07:00
Edward Yang	54d83296a9	Desugar missing dispatch field into singleton Math entry (#46970) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46970 Now that catchall declarations are reinterpreted as registrations to dispatch key Math, we can now simplify code generation logic by directly generating to Math, and bypassing logic for catchall. This also helps avoid bugs where we incorrectly classify some kernels as Math and others as not, even though they get registered in the same way. Bill of changes: - Give Math its own unique TORCH_LIBRARY_IMPL - Make it so NativeFunction.dispatch is always non-None. Simplify downstream conditionals accordingly - When parsing NativeFunction, fill in missing dispatch with a singleton Math entry (pointing to the cpp.name!) One thing that is a little big about this change is a lot of kernels which previously didn't report as "math" now report as math. I picked a setting for these booleans that made sense to me, but I'm not sure if e.g. XLA will handle it 100% correctly. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D24592391 Pulled By: ezyang fbshipit-source-id: 2e3355f19f9525698864312418df08411f30a85d	2020-10-29 14:43:44 -07:00
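A minimal sketch of the desugaring idea described above, under loud assumptions: `ToyNativeFunction` and `parse_toy` are hypothetical stand-ins, not the real `NativeFunction` model, and the toy uses the operator name as the kernel name where the commit says the real code points at the cpp name.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ToyNativeFunction:
    name: str
    # After parsing this is never None, so callers need no None checks.
    dispatch: Dict[str, str]


def parse_toy(name: str, dispatch: Optional[Dict[str, str]] = None) -> ToyNativeFunction:
    """Fill in a missing dispatch table with a single Math entry."""
    if dispatch is None:
        # Assumption: the toy reuses `name` as the kernel name.
        dispatch = {"Math": name}
    return ToyNativeFunction(name=name, dispatch=dict(dispatch))


implicit = parse_toy("relu6")
explicit = parse_toy("add", {"CPU": "add_cpu", "CUDA": "add_cuda"})
```

Downstream code can then iterate `fn.dispatch.items()` unconditionally instead of branching on whether a dispatch table was written.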
Edward Yang	87e86fa84c	Some miscellaneous cleanup in codegen (#46940 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46940 - Remove inaccurate generated comments - Delete some dead code - Delete some unused headers - Delete unnecessary SparseTypeDerived.cpp template Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D24573971 Pulled By: ezyang fbshipit-source-id: 3de05d9cd9bada4c73f01d6cfaf51f16ada66013	2020-10-29 14:43:41 -07:00
Edward Yang	dc6f723cb4	Delete Vulkan from code generator. (#46938 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46938 It turns out that after https://github.com/pytorch/pytorch/pull/42194 landed we no longer actually generate any registrations into this file. That means it's completely unnecessary. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: IvanKobzarev Differential Revision: D24573518 Pulled By: ezyang fbshipit-source-id: b41ada9e394b780f037f5977596a36b896b5648c	2020-10-29 14:40:54 -07:00
Jiakai Liu	9d23fd5c00	[pytorch] get rid of cpp_type_str from pybind codegen (#46977 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46977 Clean up a few TODOs in the new python binding codegen. Get rid of the _simple_type() hack and the uses of cpp_type_str. Now python argument type strings and PythonArgParser unpacking methods are directly generated from the original Type model. Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D24589209 Pulled By: ljk53 fbshipit-source-id: b2a6c3911d58eae49c031d319c8ea6f804e2cfde	2020-10-28 21:25:55 -07:00
Jiakai Liu	79474a1928	[pytorch] simplify tensor options logic in pybinding codegen (#46976) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46976 Technically, it's not semantics-preserving, e.g.: emission of 'requires_grad' is no longer gated by 'has_tensor_return' - there is no guarantee that is_like_or_new_function should all have tensor return. But the output is identical so there might be some invariant - could also add assertion to fail loudly when it's broken. Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D24589211 Pulled By: ljk53 fbshipit-source-id: 47c7e43b080e4e67a526fde1a8a53aae99df4432	2020-10-28 21:22:59 -07:00
Alban Desmaison	46b252b83a	Revert D24262885: [pytorch][PR] Added foreach_zero_ API Test Plan: revert-hammer Differential Revision: D24262885 (`8e37dcb1f3`) Original commit changeset: 144c283dd009 fbshipit-source-id: 451b202e23bc1fcb11b20d26c11d9a1329789d22	2020-10-28 06:48:59 -07:00
iurii zdebskyi	8e37dcb1f3	Added foreach_zero_ API (#46215) Summary: Added foreach_zero_(TensorList) API. Tested via unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46215 Reviewed By: zhangguanheng66 Differential Revision: D24262885 Pulled By: izdeby fbshipit-source-id: 144c283dd00924083096d6d92eb9085cbd6097d3	2020-10-27 18:03:34 -07:00
Ansley Ussery	6c5f634657	Fix grammar and spelling errors (#46713 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46713 Test Plan: Imported from OSS Reviewed By: Lilyjjo Differential Revision: D24477771 Pulled By: ansley fbshipit-source-id: bc39b63ab2158a5233e48b89bfaa97a4cfb1f7a1	2020-10-23 01:31:17 -07:00
Alexander Grund	93719440b8	Replace map(lambda constructs (#46462 ) Summary: Follow-up of https://github.com/pytorch/pytorch/issues/46461 with a similar goal Makes them more readable and possibly faster. Care has to be taken because `map` applies the function immediately while `(x for x in xs)` is a generator expression which gets evaluated later. This is a benefit in some cases where it is not required to actually create the list of values in memory (e.g. when passing to `tuple` or `extend` or `join`) Pull Request resolved: https://github.com/pytorch/pytorch/pull/46462 Reviewed By: zou3519 Differential Revision: D24422343 Pulled By: ezyang fbshipit-source-id: 252e33499c92ac0b15238f2df32681dbbda2b237	2020-10-22 09:50:22 -07:00
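The evaluation-order caveat in the commit above can be checked with a short sketch. The helper `tag` is hypothetical. One point worth keeping in mind when reading the message: in Python 3 the `map` built-in also returns a lazy iterator, so the eager behaviour comes from wrapping it in `list(...)` or from using a list comprehension.

```python
calls = []


def tag(x: str) -> str:
    calls.append(x)  # record the moment the function actually runs
    return f"<{x}>"


xs = ["a", "b"]

# A list comprehension runs `tag` immediately and builds the list in memory.
eager = [tag(x) for x in xs]
calls_after_eager = list(calls)

# A generator expression defers every call until something consumes it.
calls.clear()
lazy = (tag(x) for x in xs)
calls_before_consume = list(calls)

# `join` consumes the generator without an intermediate list.
joined = ", ".join(lazy)
calls_after_consume = list(calls)
```

The generator form is the one that avoids materialising a list when the consumer is `tuple`, `extend` or `join`, at the cost of side effects happening later than the line that creates it.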
Ailing Zhang	33e82c0269	Update error message to include link to readme. (#46613 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46613 Test Plan: CI Reviewed By: ezyang Differential Revision: D24430852 fbshipit-source-id: 811e4d10508d47ef830d2b8445f11592f342461f	2020-10-21 19:38:19 -07:00
Alexander Grund	5b0f400488	Replace list(map(...)) constructs by list comprehensions (#46461 ) Summary: As discussed in https://github.com/pytorch/pytorch/issues/46392 this makes the code more readable and possibly more performant. It also fixes a bug detected by this where the argument order of `map` was confused: `030a24906e (diff-5bb26bd3a23ee3bb540aeadcc0385df2a4e48de39f87ed9ea76b21990738fe98L1537-R1537)` Fixes https://github.com/pytorch/pytorch/issues/46392 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46461 Reviewed By: ailzhang Differential Revision: D24367015 Pulled By: ezyang fbshipit-source-id: d55a67933cc22346b00544c9671f09982ad920e7	2020-10-19 18:42:49 -07:00
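As an illustration of the kind of mistake mentioned above (not a reproduction of the actual bug in the linked diff), the sketch below compares `list(map(...))` with a list comprehension and shows what happens when the arguments of `map` are swapped. The function `upper` and the sample data are hypothetical.

```python
def upper(s: str) -> str:
    return s.upper()


xs = ["cpu", "cuda"]

# Equivalent results; the comprehension has no argument order to get wrong.
old_style = list(map(upper, xs))
new_style = [upper(x) for x in xs]

# Swapping the arguments passes a list where a callable is expected and a
# function where an iterable is expected, which raises TypeError.
try:
    list(map(xs, upper))
    swapped_error = None
except TypeError as exc:
    swapped_error = type(exc).__name__
```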
Jiakai Liu	3d421b3137	[pytorch] rewrite of the python binding codegen with the v2 API (#46244)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46244

- What does the generated binding code do?

The Python binding codegen produces code that takes the input list of PyObjects, finds the matching ATen C++ function using PythonArgParser, converts the PyObjects into C++ types and calls the ATen C++ function:

```
+--------+  parsing   +------------------------+  binding   +-----------------------+
| PyObjs | ---------> | PythonArgParser Output | ---------> | Cpp Function Dispatch |
+--------+            +------------------------+            +-----------------------+
```

- Are Python arguments 1-1 mapped to C++ arguments?

Python arguments might be reordered, packed, unpacked when binding to C++ arguments, as illustrated below:

```
// Binding - Reorder & Packing
// aten::empty.names(int[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, MemoryFormat? memory_format=None) -> Tensor

            Python Args               Cpp Args
-----------------------------------------------------------
         0: size                      size
         1: names                     names
         2: memory_format -------+
         3: dtype         -----+-|--> options
         4: layout            /  |
         5: device           /   +--> memory_format
         6: pin_memory      /
         7: requires_grad -+

// Binding - Unpacking
// aten::max.names_dim(Tensor self, Dimname dim, bool keepdim=False) -> (Tensor values, Tensor indices)

            Python Args               Cpp Args
-----------------------------------------------------------
                               +----> max
                              /-----> max_values
         0: input            /        self
         1: dim             /         dim
         2: keepdim        /          keepdim
         3: out      -----+
```

- Why do we want to rewrite the python binding codegen?

The old codegen takes Declarations.yaml as input. It doesn't distinguish between Python arguments and C++ arguments - they are all mixed together as a bag of non-typed dict objects. Different methods process these arg objects and add new attributes for various different purposes. It's not so obvious to figure out the semantics of these attributes. The complicated binding logic happens implicitly and scatteredly.

```
+--------------------+
|  Native Functions  |
+--------------------+
  |
  |
  v
+--------------------+
|   Cpp Signatures   |
+--------------------+
  |
  |
  v
+--------------------+
| Declarations.yaml  |
+--------------------+
  |                        +-------------------------------------+
  |              +-------> |       PythonArgParser Schema        |
  |              |         +-------------------------------------+
  |              |                            .
  |              |                            .
  v              |                            .
+--------------------+     +-------------------------------------+
| NonTyped Args Objs | --> | PythonArgParser -> Cpp Args Binding |
+--------------------+     +-------------------------------------+
                 |                            .
                 |                            .
                 |                            .
                 |         +-------------------------------------+
                 +-------> |        Cpp Function Dispatch        |
                           +-------------------------------------+
```

This PR leverages the new immutable data models introduced in the new aten codegen. It introduces dedicated data models for python schema. This way, we can not only avoid subtle Declaration.yaml conversions but also decouple the generation of python schema, python to c++ binding and c++ function call. The ultimate state will be like the following diagram:

```
            +-------------------+     +-------------------------------------+
  +-------> | Python Signatures | --> |       PythonArgParser Schema        |
  |         +-------------------+     +-------------------------------------+
  |                         |                            .
  |                         |                            .
  |                         |                            .
+------------------+        |         +-------------------------------------+
| Native Functions |        +-------> | PythonArgParser -> Cpp Args Binding |
+------------------+        |         +-------------------------------------+
  |                         |                            .
  |                         |                            .
  |                         |                            .
  |         +-------------------+     +-------------------------------------+
  +-------> |  Cpp Signatures   | --> |        Cpp Function Dispatch        |
            +-------------------+     +-------------------------------------+
```

This PR has migrated the core binding logic from tools/autograd/gen_python_functions.py to tools/codegen/api/python.py. It produces the byte-for-byte same results (tested with #46243). Will migrate the rest of gen_python_functions.py in subsequent PRs.

Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D24388874 Pulled By: ljk53 fbshipit-source-id: f88b6df4e917cf90d868a2bbae2d5ffb680d1841	2020-10-19 17:36:45 -07:00
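The reorder-and-pack binding that this commit message describes for `aten::empty.names` can be sketched in a few lines of Python. This is a toy model, not the code in tools/codegen/api/python.py: `bind_toy`, `TENSOR_OPTIONS_FIELDS` and the plain-dict representation are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumption: these Python-side arguments are the ones packed into `options`.
TENSOR_OPTIONS_FIELDS = ("dtype", "layout", "device", "pin_memory", "requires_grad")


def bind_toy(python_args: Dict[str, object], cpp_order: List[str]) -> Dict[str, object]:
    """Pack tensor-option fields into one `options` value, then reorder."""
    options = {k: python_args[k] for k in TENSOR_OPTIONS_FIELDS if k in python_args}
    scope = {k: v for k, v in python_args.items() if k not in TENSOR_OPTIONS_FIELDS}
    scope["options"] = options
    # Emit the arguments in the order the C++ signature expects.
    return {name: scope[name] for name in cpp_order}


python_args = {
    "size": [2, 3],
    "names": None,
    "memory_format": None,
    "dtype": "float32",
    "layout": "strided",
    "device": "cpu",
    "pin_memory": False,
    "requires_grad": False,
}
bound = bind_toy(python_args, ["size", "names", "options", "memory_format"])
```

Here `memory_format` moves from position 2 on the Python side to the last position on the C++ side, and five Python arguments collapse into the single `options` argument.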
Iurii Zdebskyi	e7564b076c	Refactor scalar list APIs to use overloads (#45673 ) Summary: Refactor foreach APIs to use overloads in case of scalar list inputs. Tested via unit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45673 Reviewed By: heitorschueroff Differential Revision: D24053424 Pulled By: izdeby fbshipit-source-id: 35976cc50b4acfe228a32ed26cede579d5621cde	2020-10-19 09:28:49 -07:00
Dhruv Matani	0c5cd8c2b9	[RFC] Switch PyTorch Selective Build (Custom Build) to use the SelectiveBuilder abstraction (#45722) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45722 This diff does a bunch of things: 1. Introduces some abstractions as detailed in https://fb.quip.com/2oEzAR5MKqbD to help with selective build related codegen in multiple files. 2. Adds helper methods to combine operators, debug info, operator lists, etc... 3. Currently, the selective build machinery queries `op_registration_whitelist` directly at various places in the code. `op_registration_whitelist` is a list of allowed operator names (without overload name). We want to move to a world where the overload names are also included so that we can be more selective about which operators we include. To that effect, it makes sense to hide the checking logic in a separate abstraction and have the build use that abstraction instead of putting all this selective build specific logic in the code-generator itself. This change is attempting to do just that. 4. Updates generate_code, unboxing-wrapper codegen, and autograd codegen to accept the operator selector paradigm as opposed to a selected operator list. 5. Update `tools/code_analyzer/gen_op_registration_allowlist.py` to expose providing an actual structured operator dependency graph in addition to a serialized string. There are a bunch of structural changes as well: 1. `root_op_list.yaml` and `combined_op_list.yaml` are now actual YAML files (not a space separated list of operator names) 2. `generate_code.py` accepts only paths to operator list YAML files (both old style as well as new style) and not list of operator names on the command line as arguments 3. `gen.py` optionally also accepts a custom build related operators YAML path (this file has information about which operators to register in the generated library). ghstack-source-id: 114578753 (Note: this ignores all push blocking failures!) 
Test Plan: `buck test caffe2/test:selective_build` Generated YAML files after the change: {P143981979} {P143982025} {P143982056} Ensure that the generated files are same before and after the change: ``` [dhruvbird@devvm2490 /tmp/TypeDefault.cpp] find -name "*.cpp" | xargs md5sum d72c3d125baa7b77e4c5581bbc7110d2 ./after_change/gen_aten/TypeDefault.cpp 42353036c83ebc7620a7159235b9647f ./after_change/lite_predictor_lib_aten/TypeDefault.cpp d72c3d125baa7b77e4c5581bbc7110d2 ./before_change/gen_aten/TypeDefault.cpp 42353036c83ebc7620a7159235b9647f ./before_change/lite_predictor_lib_aten/TypeDefault.cpp ``` `VariableTypes_N.cpp` are generated the same both before and after the change: ``` [dhruvbird@devvm2490 /tmp/VariableType] find -name "*.cpp" | xargs -n 1 md5sum | sort 3be89f63fd098291f01935077a60b677 ./after/VariableType_2.cpp 3be89f63fd098291f01935077a60b677 ./before/VariableType_2.cpp 40a3e59d64e9dbe86024cf314f127fd6 ./after/VariableType_4.cpp 40a3e59d64e9dbe86024cf314f127fd6 ./before/VariableType_4.cpp a4911699ceda3c3a430f08c64e8243fd ./after/VariableType_1.cpp a4911699ceda3c3a430f08c64e8243fd ./before/VariableType_1.cpp ca9aa611fcb2a573a8cba4e269468c99 ./after/VariableType_0.cpp ca9aa611fcb2a573a8cba4e269468c99 ./before/VariableType_0.cpp e18f639ed23d802dc4a31cdba40df570 ./after/VariableType_3.cpp e18f639ed23d802dc4a31cdba40df570 ./before/VariableType_3.cpp ``` Reviewed By: ljk53 Differential Revision: D23837010 fbshipit-source-id: ad06b1756af5be25baa39fd801dfdf09bc565442	2020-10-18 15:10:42 -07:00
Ailing Zhang	8c629ecc9a	[WIP] Move catchAll to Math (#45939 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45939 Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D24165890 Pulled By: ailzhang fbshipit-source-id: 72fe71ea95a738251b2fafc9eea4ab3831cf426b	2020-10-16 16:17:16 -07:00
Ailing Zhang	ec5f81f9d3	Remove variable_excluded_from_dispatch() check for factory functions. (#46371 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46371 Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D24324545 Pulled By: ailzhang fbshipit-source-id: 78038054690dff14883df711073be4c2da4e1f8b	2020-10-15 21:15:41 -07:00
Ailing Zhang	419dafe791	[Reland] Update native_functions.yaml to add DefaultBackend. (#46236 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46236 Test Plan: Imported from OSS Reviewed By: Krovatkin Differential Revision: D24273378 Pulled By: ailzhang fbshipit-source-id: bed1d4c84c0bba88a7da4d9bd2ccaa58253cf91e	2020-10-14 22:37:28 -07:00
Iurii Zdebskyi	8a074af929	Added scalar lists APIs for addcdiv and addcmul (#45932 ) Summary: 1) Added new APIs: _foreach_addcdiv(Tensor(a!)[] self, Tensor[] tensor1, Tensor[] tensor2, float[] scalars) _foreach_addcdiv_(Tensor(a!)[] self, Tensor[] tensor1, Tensor[] tensor2, float[] scalars) _foreach_addcmul(Tensor(a!)[] self, Tensor[] tensor1, Tensor[] tensor2, float[] scalars) _foreach_addcmul_(Tensor(a!)[] self, Tensor[] tensor1, Tensor[] tensor2, float[] scalars) 2) Updated optimizers to use new APIs Tested via unit tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/45932 Reviewed By: navahgar Differential Revision: D24150306 Pulled By: izdeby fbshipit-source-id: c2e65dedc95d9d81a2fdd116e41df0accb0b6f26	2020-10-14 08:12:37 -07:00
Sebastian Messmer	69e152e60b	Fix device guard for c10-full ops (#46091 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46091 ghstack-source-id: 114269274 Test Plan: vs prev diff: https://www.internalfb.com/intern/fblearner/details/224487971/ vs D23328718 (`6ba6ecb048`) : https://www.internalfb.com/intern/fblearner/details/224488043/ Reviewed By: ezyang Differential Revision: D24219943 fbshipit-source-id: bbabafb5c5b76ce0e93df4fdae2f08221354d9f7	2020-10-14 06:32:43 -07:00
Sebastian Messmer	4534bf5799	Fix NativeFunctions.h for c10-full ops (#46090 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46090 ghstack-source-id: 114269272 Test Plan: vs base diff: https://www.internalfb.com/intern/fblearner/details/223884639/ Reviewed By: ezyang Differential Revision: D24219942 fbshipit-source-id: 6f338c7c0dd5adfe2fba8b36ccc340032d3faef8	2020-10-14 06:32:36 -07:00
Ailing Zhang	a37f2749cd	Avoid computing AutogradKey if not needed. (#46252 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46252 Test Plan: CI Reviewed By: ngimel Differential Revision: D24272744 fbshipit-source-id: 6cb66d13e6c910df1ad1a8badd43f990e7b55368	2020-10-13 15:01:55 -07:00
chengjun	5741de883a	Define the record_stream method in native_functions.yaml (#44301 ) Summary: The record_stream method was hard coded for CUDA device. Define the record_stream in the native_functions.yaml to enable the dynamic dispatch to different end device. Fixes https://github.com/pytorch/pytorch/issues/36556 Pull Request resolved: https://github.com/pytorch/pytorch/pull/44301 Reviewed By: glaringlee Differential Revision: D23763954 Pulled By: ezyang fbshipit-source-id: e6d24f5e7892b56101fa858a6cad2abc5cdc4293	2020-10-13 09:15:22 -07:00
Edward Yang	d705083c2b	Refactor dispatcher and native to use Signature structure. (#45990 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45990 In #45890 we introduced the concept of a CppSignature, which bundled up all of the information necessary to declare a C++ signature for the cpp API. This PR introduces analogous concepts for dispatcher and native: DispatcherSignature and NativeSignature. The three interfaces are not particularly well coupled right now, but they do have some duck typing coincidences: - defn() which renders the C++ definition "bool f(int x)" - decl() which renders the C++ declaration "bool f(int x = 2)" - type() which renders the C++ function type "bool(int)" Maybe at some point we'll introduce a Protocol, or a supertype. Many other methods (like arguments()) have varying types. These signatures also have some helper methods that forward back to real implementations in the api modules. Something to think about is whether or not we should attempt to reduce boilerplate here or not; I'm not too sure about it yet. The net effect is we get to reduce the number of variables we have to explicitly write out in the codegen, since now these are all bundled together into a signature. Something extra special happens in BackendSelect, where we now dynamically select between dispatcher_sig and native_sig as "how" the backend select is implemented. A little bit of extra cleanup: - Some places where we previously advertised Sequence, we now advertise a more informative Tuple. - defn() may take an optional positional parameter overriding the entire name, or a kwarg-only prefix parameter to just add a prefix to the name. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D24223100 Pulled By: ezyang fbshipit-source-id: f985eced08af4a60ba9641d125d0f260f8cda9eb	2020-10-13 08:34:48 -07:00
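The three "duck typing coincidences" listed above (`defn()`, `decl()`, `type()`) can be sketched with a toy signature class. `ToyArg` and `ToySignature` are hypothetical and far simpler than CppSignature, DispatcherSignature or NativeSignature; the sketch only reproduces the example strings quoted in the commit message.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyArg:
    ctype: str
    name: str
    default: Optional[str] = None


@dataclass(frozen=True)
class ToySignature:
    returns: str
    name: str
    args: Tuple[ToyArg, ...]

    def defn(self, name: Optional[str] = None, *, prefix: str = "") -> str:
        # Definitions omit defaults; `name` overrides the whole name,
        # while the keyword-only `prefix` is prepended to it.
        args = ", ".join(f"{a.ctype} {a.name}" for a in self.args)
        return f"{self.returns} {prefix}{name or self.name}({args})"

    def decl(self) -> str:
        # Declarations carry default values.
        args = ", ".join(
            f"{a.ctype} {a.name}" + (f" = {a.default}" if a.default is not None else "")
            for a in self.args
        )
        return f"{self.returns} {self.name}({args})"

    def type(self) -> str:
        # The bare C++ function type, without names or defaults.
        return f"{self.returns}({', '.join(a.ctype for a in self.args)})"


sig = ToySignature("bool", "f", (ToyArg("int", "x", "2"),))
```

For this `sig`, `defn()` renders the definition form, `decl()` the declaration form with `= 2`, and `type()` the function type, matching the three examples in the message.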
Edward Yang	f086032676	Remove unnecessary byte-for-byte compatibility code that is not needed. (#45975 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45975 I reordered declarations in the faithful API reimplementation to make sure the diffs lined up nicely; they're not necessary now. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D24223102 Pulled By: ezyang fbshipit-source-id: 77c6ae40c9a3dac36bc184dd6647d6857c63a50c	2020-10-13 08:34:46 -07:00
Edward Yang	8d5c899b19	Rename legacy_dispatcher to native. (#45974) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45974 The term "legacy dispatcher" caused a bunch of confusion between me and Sebastian when discussing what the intended semantics of legacy dispatcher argument is. Legacy dispatcher argument implies that you ought NOT to use it when you have use_c10_dispatcher: full; but that's not really what's going on; legacy dispatcher API describes the API that you write native:: functions (NativeFunctions.h) to. Renaming it here makes this more clear. I applied these seds: ``` git grep -l 'legacy_dispatcher' | xargs sed -i 's/legacy_dispatcher/native/g' git grep -l 'legacydispatcher' | xargs sed -i 's/legacydispatcher/native/g' git grep -l 'LegacyDispatcher' | xargs sed -i 's/LegacyDispatcher/Native/g' ``` and also grepped for "legacy" in tools/codegen and fixed documentation. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D24223101 Pulled By: ezyang fbshipit-source-id: d1913b8b823b3b95e4546881bc0e876acfa881eb	2020-10-13 08:34:43 -07:00
Edward Yang	527a8bee02	Reorder dispatcher/legacy_dispatcher types (#45973 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45973 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D24163527 Pulled By: ezyang fbshipit-source-id: 2631a2ccd7ab525fe32fa56192ded4ff7ac3723f	2020-10-13 08:34:39 -07:00
Edward Yang	944eb0e31d	Add NativeFunctionGroup (#45918 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45918 This groups together related native functions (functional, inplace, out) into a single group. It's not used by anything but Jiakai said this would be useful for his stuff so I'm putting it in immediately. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D24163526 Pulled By: ezyang fbshipit-source-id: 9979b0fe9249c78e4a64a50c5ed0e2ab99f499b9	2020-10-13 08:34:36 -07:00
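A rough sketch of what grouping functional, inplace and out variants could look like. This is not the NativeFunctionGroup implementation: `classify` and `group_ops` are hypothetical, and the toy classifies by naming convention alone (trailing underscore for inplace, an `out` overload name for out variants), which is an assumption made for brevity.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def classify(op: str) -> Tuple[str, str]:
    """Return (base name, kind) for a toy operator name like 'add.out'."""
    base, _, overload = op.partition(".")
    if overload == "out":
        return base, "out"
    if base.endswith("_"):
        return base[:-1], "inplace"
    return base, "functional"


def group_ops(ops: List[str]) -> Dict[str, Dict[str, str]]:
    """Collect related variants of the same operator under one key."""
    groups: Dict[str, Dict[str, str]] = defaultdict(dict)
    for op in ops:
        base, kind = classify(op)
        groups[base][kind] = op
    return dict(groups)


groups = group_ops(["add", "add_", "add.out", "relu", "relu_"])
```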
Edward Yang	9079aea1ac	Rewrite implementation of faithful cpp signatures (#45890 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45890 This rewrite is as per my comments at https://github.com/pytorch/pytorch/pull/44087#issuecomment-701664506 I did the rewrite by reverting #44087 and then reimplementing it on top. You may find it easier to review by diffing against master with only #44087 reverted. There are two main ideas. First, we now factor cpp argument processing into two phases operating on three representations of data: 1. `FunctionSchema` - this is the source from native_functions.yaml 2. `Union[Argument, ThisArgument, TensorOptionsArgument]` - this is the arguments after doing some basic semantic analysis to group them (for TensorOptions) or identify the this argument (if this is a method). There is only ever one of these per functions. 3. `Union[CppArgument, CppThisArgument, CppTensorOptionsArgument]` - this is the arguments after we've elaborated them to C++. There may be multiple of these per actual C++ signature. You can think of (2) as common processing, whereas (3) bakes in specific assumptions about whether or not you have a faithful or non-faithful signature. Second, we now have CppSignature and CppSignatureGroup representing the total public C++ API signature. So those dataclasses are what know how to render definitions/declarations, and you no longer have to manually type it out in the Functions/TensorMethods codegen. Here is an exhaustive accounting of the changes. tools.codegen.api.types - CppSignature and CppSignatureGroup got moved to tools.codegen.api.types - Add new CppThisArgument and CppTensorOptionsArguments (modeled off of ThisArgument and TensorOptionsArguments) so that we can retain high level semantic structure even after elaborating terms with C++ API information. Once this is done, we can refine CppArgument.argument to no longer contain a ThisArgument (ThisArgument is always translated to CppThisArgument. 
Note that this doesn't apply to TensorOptionsArguments, as those may be expanded or not expanded, and so you could get a single CppArgument for 'options') - Add no_default() functional mutator to easily remove default arguments from CppArgument and friends - Add an explicit_arguments() method to CppArgument and friends to extract (flat) argument list that must be explicitly written in the signature. This is everything except (Cpp)ThisArgument, and is also convenient when you don't care about the extra structure of CppTensorOptionsArguments tools.codegen.api.cpp - group_arguments is back, and it doesn't send things directly to a CppSignatureGroup; instead, it moves us from representation (1) to (2) (perhaps it should live in model). Here I changed my mind from my PR comment; I discovered it was not necessary to do classification at grouping time, and it was simpler and easier to do it later. - argument got split into argument_not_this/argument/argument_faithful. argument and argument_faithful are obvious enough what they do, and I needed argument_not_this as a more refined version of argument so that I could get the types to work out on TensorOptionsArguments tools.codegen.api.dispatcher - Here we start seeing the payoff. The old version of this code had a "scatter" mode and a "gather" mode. We don't need that anymore: cppargument_exprs is 100% type-directed via the passed in cpp arguments. I am able to write the functions without any reference to use_c10_dispatcher tools.codegen.gen - Instead of having exprs_str and types_str functions, I moved these to live directly on CppSignature, since it seemed pretty logical. - The actual codegen for TensorMethods/Functions is greatly simplified, since (1) all of the heavy lifting is now happening in CppSignature(Group) construction, and (2) I don't need to proxy one way or another, the new dispatcher translation code is able to handle both cases no problem. 
There is a little faffing about with ordering to reduce the old and new diff which could be removed afterwards. Here are codegen diffs. For use_c10_dispatcher: full: ``` +// aten::_cudnn_init_dropout_state(float dropout, bool train, int dropout_seed, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False) -> Tensor Tensor _cudnn_init_dropout_state(double dropout, bool train, int64_t dropout_seed, const TensorOptions & options) { - return _cudnn_init_dropout_state(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt()); + static auto op = c10::Dispatcher::singleton() + .findSchemaOrThrow("aten::_cudnn_init_dropout_state", "") + .typed<Tensor (double, bool, int64_t, c10::optional<ScalarType>, c10::optional<Layout>, c10::optional<Device>, c10::optional<bool>)>(); + return op.call(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt()); } ``` Otherwise: ``` +// aten::empty_meta(int[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, MemoryFormat? 
memory_format=None) -> Tensor Tensor empty_meta(IntArrayRef size, c10::optional<ScalarType> dtype, c10::optional<Layout> layout, c10::optional<Device> device, c10::optional<bool> pin_memory, c10::optional<MemoryFormat> memory_format) { - return empty_meta(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format); + static auto op = c10::Dispatcher::singleton() + .findSchemaOrThrow("aten::empty_meta", "") + .typed<Tensor (IntArrayRef, const TensorOptions &, c10::optional<MemoryFormat>)>(); + return op.call(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format); } ``` Things that I probably did not get right: - The Union[Argument, TensorOptionsArguments, ThisArgument] and the Cpp variants are starting to get a little unwieldy. Not sure if this means I should add a supertype (or at the very least an alias); in some cases I do purposely omit one of these from the Union - Code may not necessarily live in the most logical files. There isn't very much rhyme or reason to it. - The fields on CppSignature. They're not very well constrained and it will be better if people don't use them directly. - Disambiguation. We should do this properly in #44087 and we don't need special logic for deleting defaulting for faithful signatures; there is a more general story here. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: smessmer Differential Revision: D24144035 Pulled By: ezyang fbshipit-source-id: a185f8bf9df8b44ca5718a7a44dac23cefd11c0a	2020-10-13 08:31:54 -07:00
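The move from representation (1) to (2) in 9079aea1ac, and the faithful versus non-faithful elaboration that follows it, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the class and function names mirror those in the commit message, but the bodies, the fixed four-name window, and the helper functions `faithful_names` and `gathered_names` are hypothetical and do not reproduce the real `tools.codegen` code.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Argument:
    name: str
    type: str


@dataclass(frozen=True)
class TensorOptionsArguments:
    dtype: Argument
    layout: Argument
    device: Argument
    pin_memory: Argument


# Assumption: the four TensorOptions fields always appear in this order.
TENSOR_OPTIONS_NAMES = ("dtype", "layout", "device", "pin_memory")

Grouped = Union[Argument, TensorOptionsArguments]


def group_arguments(args: List[Argument]) -> List[Grouped]:
    """Flat schema arguments -> grouped arguments (one result per function)."""
    grouped: List[Grouped] = []
    i = 0
    while i < len(args):
        window = args[i:i + 4]
        if tuple(a.name for a in window) == TENSOR_OPTIONS_NAMES:
            grouped.append(TensorOptionsArguments(*window))
            i += 4
        else:
            grouped.append(args[i])
            i += 1
    return grouped


def faithful_names(grouped: List[Grouped]) -> List[str]:
    """Faithful signature: TensorOptions stays scattered into four arguments."""
    names: List[str] = []
    for g in grouped:
        if isinstance(g, TensorOptionsArguments):
            names += [g.dtype.name, g.layout.name, g.device.name, g.pin_memory.name]
        else:
            names.append(g.name)
    return names


def gathered_names(grouped: List[Grouped]) -> List[str]:
    """Non-faithful signature: the group collapses into a single 'options'."""
    return ["options" if isinstance(g, TensorOptionsArguments) else g.name
            for g in grouped]


# Arguments of aten::empty_meta as listed in the commit message.
schema_args = [
    Argument("size", "int[]"),
    Argument("dtype", "ScalarType?"),
    Argument("layout", "Layout?"),
    Argument("device", "Device?"),
    Argument("pin_memory", "bool?"),
    Argument("memory_format", "MemoryFormat?"),
]
grouped = group_arguments(schema_args)
print(faithful_names(grouped))
print(gathered_names(grouped))
```

Grouping once and rendering twice is the point of the split: the grouped form is shared, and only the final step decides whether the signature is faithful.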
Ailing Zhang	e6d30c89c1	Revert D24165889: Update native_functions.yaml to add DefaultBackend. Test Plan: revert-hammer Differential Revision: D24165889 (`1f9ddf64d2`) Original commit changeset: 7f3ccdb3499b fbshipit-source-id: b5d0de57d918011f1e19c9ef6aafa89fefcb42d5	2020-10-12 23:17:06 -07:00
Ailing Zhang	1f9ddf64d2	Update native_functions.yaml to add DefaultBackend. (#45938 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45938 Test Plan: Imported from OSS Reviewed By: ezyang Differential Revision: D24165889 Pulled By: ailzhang fbshipit-source-id: 7f3ccdb3499b40795bc34af716d0e63241ae8de3	2020-10-12 22:06:50 -07:00
Sebastian Messmer	6ba6ecb048	Only use hacky_wrapper_for_legacy_signatures if an op needs it (#45742 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45742 Add a new flag to native_functions.yaml: `use_c10_dispatcher: hacky_wrapper_for_legacy_signatures` and the codegen only wraps kernels in the aforementioned wrapper if that flag is set. Apart from that, `use_c10_dispatcher: hacky_wrapper_for_legacy_signatures` is equivalent to `full`, i.e. it has full boxing and unboxing support. This greatly reduces the number of ops we apply the hacky_wrapper to, i.e. all ops marked as `use_c10_dispatcher: full` don't have it anymore. ghstack-source-id: 113982139 Test Plan: waitforsandcastle vs fbcode: https://www.internalfb.com/intern/fblearner/details/214511705/ vs base diff: https://www.internalfb.com/intern/fblearner/details/214693207/ Reviewed By: ezyang Differential Revision: D23328718 fbshipit-source-id: be120579477b3a05f26ca5f75025bfac37617620	2020-10-12 09:39:18 -07:00
Ailing Zhang	d811d4d7ba	Support DefaultBackend keyword in native_functions.yaml. (#45719 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45719 Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D24165888 Pulled By: ailzhang fbshipit-source-id: 9b3c5e71f5b6a985e1a43157813e7d77dbe13b07	2020-10-09 16:28:26 -07:00
Peter Bell	8d14b50e94	codegen: Improve array default handing (#45163 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45163 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D24132279 Pulled By: mruberry fbshipit-source-id: 77069e7526b35cf8d13ba448e313c90f20cc67cf	2020-10-07 22:27:28 -07:00
Peter Bell	8b39498a23	codegen: Allow string arguments to have defaults (#45665 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45665 Fixes #43944 Note that the codegen doesn't use a proper parser so, in the same way as with lists, the string `, ` cannot appear in defaults or it will be interpreted as a splitting point between arguments. Test Plan: Imported from OSS Reviewed By: albanD Differential Revision: D24141835 Pulled By: ezyang fbshipit-source-id: 578127861fd2504917f4486c44100491a2c40343	2020-10-06 21:53:56 -07:00
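The limitation noted in 8b39498a23 follows directly from splitting on the literal string `, ` instead of parsing. A minimal sketch, assuming a hypothetical `split_arguments` helper (the real codegen function and the example schemas below are not taken from the source):

```python
from typing import List


def split_arguments(args: str) -> List[str]:
    # Naive splitting: every ", " is treated as an argument boundary,
    # including one that sits inside a quoted default value.
    return args.split(", ")


# A string default without ", " splits as intended.
ok = split_arguments("Tensor self, str mode='floor'")

# A string default containing ", " is cut in the middle.
broken = split_arguments("Tensor self, str sep='a, b'")

print(ok)
print(broken)
```

The second call yields three pieces instead of two, which is why such defaults cannot be expressed without a proper parser.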
Sebastian Messmer	6e2eee2b9d	Add faithful C++ API (#44087 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44087 Each op taking a TensorOptions argument now has an additional overload in the C++ frontend where it takes scattered ScalarType, Layout, Device, bool instead of one TensorOptions argument. If it is a c10-full op, then the scattered version calls into the dispatcher and the gathered version is a proxy calling into the scattered version. If it is a non-c10-full op, then the gathered version calls into the dispatcher and the scattered version is a proxy calling into the gathered version. This should minimize the amount of gathering and scattering needed. This PR is also a prerequisite to remove the re-gathering of arguments that is currently happening in VariableKernel. Currently, VariableKernels gather arguments into a TensorOptions object to call into the C++ API. In a PR stacked on top of this, VariableKernel will just directly call into the scattered C++ API introduced here and avoid the gathering step. ghstack-source-id: 113355689 Test Plan: waitforsandcastle vs master: https://www.internalfb.com/intern/fblearner/details/216169815/ vs previous diff: https://www.internalfb.com/intern/fblearner/details/216169957/ Reviewed By: ezyang Differential Revision: D23492188 fbshipit-source-id: 3e84c467545ad9371e98e09075a311bd18411c5a	2020-10-02 04:08:53 -07:00
Edward Yang	4583edb5d6	Add NativeFunction.signature and kind. (#45131 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45131 These make it easier to group native functions together and determine what kind of native function it is (inplace/out/functional). Currently they are not used but they may be useful for tools.autograd porters. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: zhangguanheng66 Differential Revision: D23872526 Pulled By: ezyang fbshipit-source-id: 1d6e429ab9a1f0fdb764be4228c5bca4dce8f24e	2020-10-01 08:46:40 -07:00
Edward Yang	41bd5a5ee0	Switch all Sequences in tools.codegen.model to Tuple (#45127 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45127 I thought I was being clever by using Sequence, which doesn't commit to List or Tuple, but forces read-onlyness in the type system. However, there is runtime implication to using List or Tuple: Lists can't be hashed, but Tuples can be! This is important because I shortly want to group by FunctionSchema, and to do this I need FunctionSchema to be hashable. Switch everything to Tuple for true immutability. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: gchanan Differential Revision: D23872527 Pulled By: ezyang fbshipit-source-id: 5c8fae1c50a5ae47b4167543646d94ddcafff8c3	2020-10-01 08:41:53 -07:00
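The runtime difference behind 41bd5a5ee0 is easy to reproduce with a frozen dataclass. The sketch below uses hypothetical `SchemaWithList` and `SchemaWithTuple` classes rather than the real `FunctionSchema`; it only shows why a `List` field blocks hashing while a `Tuple` field does not.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SchemaWithList:
    name: str
    arguments: List[str]


@dataclass(frozen=True)
class SchemaWithTuple:
    name: str
    arguments: Tuple[str, ...]


def is_hashable(obj: object) -> bool:
    try:
        hash(obj)
        return True
    except TypeError:
        return False


list_schema = SchemaWithList("add", ["self", "other"])
tuple_schema = SchemaWithTuple("add", ("self", "other"))

# The generated __hash__ hashes the fields, so a list field raises TypeError.
print(is_hashable(list_schema))
print(is_hashable(tuple_schema))

# Only the tuple-based schema can be used as a dict key for grouping.
by_schema = {tuple_schema: ["add", "add_", "add.out"]}
```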
Michael Carilli	72bc3d9de4	Use MTA for amp grad unscaling, enforce op math type in MTA functors, and allow op lambdas (#44778 ) Summary: Amp gradient unscaling is a great use case for multi tensor apply (in fact it's the first case I wrote it for). This PR adds an MTA unscale+infcheck functor. Really excited to have it for `torch.cuda.amp`. izdeby your interface was clean and straightforward to use, great work! Labeled as bc-breaking because the native_functions.yaml exposure of unscale+infcheck changes from [`_amp_non_finite_check_and_unscale_` to `_amp_foreach_non_finite_check_and_unscale_`]( https://github.com/pytorch/pytorch/pull/44778/files#diff-f1e4b2c15de770d978d0eb77b53a4077L6289-L6293). The PR also modifies Unary/Binary/Pointwise Functors to - do ops' internal math in FP32 for FP16 or bfloat16 inputs, which improves precision ([and throughput, on some architectures!](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#arithmetic-instructions)) and has no downside for the ops we care about. - accept an instantiated op functor rather than an op functor template (`template<class> class Op`). This allows calling code to pass lambdas. Open question: As written now, the PR has MTA Functors take care of pre- and post-casting FP16/bfloat16 inputs to FP32 before running the ops. However, alternatively, the pre- and post-math casting could be deferred/written into the ops themselves, which gives them a bit more control. I can easily rewrite it that way if you prefer. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44778 Reviewed By: gchanan Differential Revision: D23944102 Pulled By: izdeby fbshipit-source-id: 22b25ccad5f69b413c77afe8733fa9cacc8e766d	2020-10-01 07:51:16 -07:00
Ailing Zhang	606b1a9a2e	Move xla codegen to aten. (#45241 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45241 Test Plan: Imported from OSS Reviewed By: soumith Differential Revision: D23926750 Pulled By: ailzhang fbshipit-source-id: f768e24a9baeca9f9df069a62d6f8b94a853a1ee	2020-09-25 18:07:32 -07:00

62 Commits