Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53973
Two parts to this PR; I had to put them together because adding support for X exercises more test code, which in turn may require a fix for Y.
The first part is restoring the concept of storage to meta tensors. Previously, meta tensors had a nullptr storage (so, e.g., `meta_tensor.storage()` raised an error). As I was increasing the coverage of meta tensors, I started running into test cases (specifically memory overlap tests) that were failing because, without storage, I couldn't check for memory overlap. After some discussion, we decided that it would make sense for meta tensors to model storage as well (we already model strides, so getting accurate view information also seems useful). This PR does that as follows:
* Rewrite all of the factory functions in MetaTensor.cpp to use the generic versions (which are carefully written to never actually poke at the data pointer, so everything works out). The key idea is to give meta tensors a special allocator, MetaAllocator, which always returns nullptr even if you ask for a nonzero number of bytes. `resize_` is also made generic; the normal variant can be used directly rather than having to instruct it to avoid resizing storage
* Turn on memory overlap checking in TensorIterator even for meta tensors
* Although meta tensors now have storage, the concept of meta storage is NOT exposed to Python land (as it would imply I would have to codegen MetaFloatStorage, MetaDoubleStorage, etc. classes). So `x.storage()` still raises an error, and I have a kludge in `__deepcopy__` to break storage sharing upon deep copy (this is wrong, but no tests exercise this at the moment). See the sketch below.
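A rough sketch of the resulting user-visible behavior, per the description above (illustrative only; not taken from the test suite):
```python
import copy
import torch

x = torch.empty(2, 3, device='meta')
v = x.view(3, 2)        # views work: meta tensors now model storage and strides
# x.storage()           # still raises: meta storage is not exposed to Python
y = copy.deepcopy(x)    # works, but storage sharing is (incorrectly) broken
assert y.device.type == 'meta'
```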
The second part is adding support for more of the most commonly used functions in the test suite.
* Inplace operations have very simple meta functions. I added `fill_`, `zero_`, `random_`, `uniform_` and `normal_`. In the case of the random kernels, I take advantage of pbelevich's templates for defining them, so that I can reuse the common scaffolding and just register a no-op stub in place of the code that actually does the RNG. (Look, another tiny variant of structured kernels!)
* `copy_` is now implemented. Copying into a meta tensor is always OK, but copying out of a meta tensor raises an error, as we don't know what the "correct" data to copy out would be (see the sketch after this list)
* `empty_strided` usage from structured kernels is now implemented (TBH, this could have been done as soon as `empty_strided` was added)
* Meta was missing from a few TensorOptions/DispatchKey utility functions, so I added it
* The autograd engine now correctly groups meta tensors with CPU tensors (they have device index -1, so CUDA queues wouldn't work anyway)
* `apply_`, `map_` and `map2_` are special-cased to no-op when self is a meta tensor. These count as inplace operations too, but they are implemented a little differently.
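A minimal sketch of the semantics described above (hypothetical snippet, not taken from the test suite):
```python
import torch

meta = torch.empty(3, device='meta')
meta.fill_(1.0)      # OK: the inplace meta function only touches metadata
meta.normal_()       # OK: the registered RNG stub is a no-op

src = torch.randn(3)
meta.copy_(src)      # OK: copying *into* a meta tensor always succeeds
# src.copy_(meta)    # error: no "correct" data to copy out of a meta tensor
```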
Broader meta function support exposes a number of bugs in the test suite, which I then fix:
- Linear algebra functions sometimes don't report NotImplementedError because the error gets swallowed by catch-all try blocks. This is tracked in https://github.com/pytorch/pytorch/issues/53739
- dlpack obviously doesn't work with meta tensors, so I just disabled the test
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D27036572
Test Plan: Imported from OSS
Reviewed By: agolynski, bdhirsh
Pulled By: ezyang
fbshipit-source-id: 7005ecf4feb92a643c37389fdfbd852dbf00ac78
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54470
```
git grep -l 'DefaultBackend' | xargs sed -i 's/DefaultBackend/CompositeExplicitAutograd/g'
```
Plus a quick fixup in native/README.md
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D27253240
Pulled By: ezyang
fbshipit-source-id: 964df951ea8b52fa72937f3cc66aeaf49a702e6f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54427
A StructuredNativeFunctions is no longer guaranteed to actually
be structured (test the `structured` property for that), so we rename
it to a more neutral name.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ailzhang
Differential Revision: D27235380
Pulled By: ezyang
fbshipit-source-id: 2b438d615bf06a47fc9c7bf6eb66fd8b4df31bc8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54426
Previously, we only put NativeFunctions in StructuredNativeFunctions
if the out variant advertised that the kernel was structured. However,
there are a few code generation things that can take advantage of
this trio structure even if the kernel itself hasn't been ported
to be structured. So it is better to always group related things,
and then let clients decide whether to use the structure or throw
it away.
While doing this, I had hoped that there weren't any functional/inplace
pairs that didn't also have an out variant. This turned out not to
be true. These are probably all oversights and should get fixed at
some point.
Bill of changes:
- The actual operational change happens in
`StructuredNativeFunctions.from_dict`; I then need to relax some
`__post_init__` invariants. To tell if a StructuredNativeFunctions
is actually structured, there is a new `structured` property (sketched
below), which is queried from a few new locations in the code
- Refactor native_functions.py into gen_structured/gen_unstructured
functions so I can easily call gen_unstructured from two contexts
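Roughly, the relaxed grouping looks like this (a schematic sketch with simplified field types; not the verbatim implementation):
```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class NativeFunction:
    # stub: the real class models a parsed native_functions.yaml entry
    structured: bool

@dataclass(frozen=True)
class StructuredNativeFunctions:
    functional: Optional[NativeFunction]
    inplace: Optional[NativeFunction]
    out: NativeFunction

    @property
    def structured(self) -> bool:
        # the group only counts as structured if the out variant's
        # kernel has actually been ported to be structured
        return self.out.structured
```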
I intend to s/StructuredNativeFunctions/NativeFunctionsGroup/ but
for ease of review this rename hasn't been done in this PR.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ailzhang
Differential Revision: D27235379
Pulled By: ezyang
fbshipit-source-id: d8a15de9abb75b365348ab94e67b830704e30cf0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53759
Fixes #53587; see the issue for an in-depth explanation of the bug.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D26971342
Pulled By: ezyang
fbshipit-source-id: 805983fed2658e27fb033f36a71fd30950a29328
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53143
Meta is now an honest-to-goodness device type, like cpu, so you can use
device='meta' to trigger allocation of meta tensors. This is way better
than empty_meta, since we now have a working API for most factory
functions (they don't necessarily all work yet, though, because we
still need to register Meta versions of those functions).
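For example (assuming the corresponding Meta kernel has been registered for the factory function):
```python
import torch

x = torch.empty(2, 3, device='meta')  # an ordinary factory call
print(x.device)  # meta
print(x.shape)   # torch.Size([2, 3])
# No data is ever allocated; only metadata (sizes, strides, dtype) exists.
```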
Some subtleties:
- I decided to drop the concept of CPU versus CUDA meta tensors; meta
tensors are device agnostic. It's hard to say exactly what the
correct level of abstraction here is, but in this particular case
implementation considerations trump semantic considerations: it is
way easier to have a single device-agnostic meta device than to have
a separate cpu meta device AND a cuda meta device. This may limit the
applicability of meta tensors for tracing models that do explicit
cpu()/cuda() conversions (unless, perhaps, we make those operations
no-ops on meta tensors).
- I noticed that the DeviceType uppercase strings are kind of weird;
are they really supposed to be all caps?
- I moved the Meta dispatch key to live with the rest of the "device"
dispatch keys.
- I intentionally did NOT add a Backend for Meta. For now, I'm going to
hope meta tensors never exercise any of the Backend conversion code;
even if they do, it's better to fix that code to just stop converting
to and from Backend.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: samestep
Differential Revision: D26763552
Pulled By: ezyang
fbshipit-source-id: 14633b6ca738e60b921db66a763155d01795480d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53037
As remarked in #52277, it is easy to give an (inefficient, due to extra
redispatches) DefaultBackend implementation of foo and foo_ in terms of
foo_out. This patch enables code generation for DefaultBackend in these
cases by default for all structured kernels. You can see the payoff in
the MSNPU extension: it only has to register a kernel for add.out, and
it gets add and add_ kernels automatically.
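In Python-flavored pseudocode, the generated pattern for add looks roughly like this (a sketch only: the real codegen emits C++, and the output allocation uses the proper meta-derived shape/dtype rather than empty_like):
```python
import torch

# functional variant: allocate a fresh output, then redispatch to the
# out-of-place kernel (add.out)
def add(self, other, *, alpha=1):
    out = torch.empty_like(self)  # simplified output allocation
    return torch.add(self, other, alpha=alpha, out=out)

# inplace variant: reuse self as the output buffer
def add_(self, other, *, alpha=1):
    return torch.add(self, other, alpha=alpha, out=self)
```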
The actual code changes are very modest:
- For DefaultBackend, call the dispatched functions (not direct
native:: calls) to allocate tensors, set the device guard, etc.
- Don't call impl() for DefaultBackend (as it doesn't exist); instead,
directly generate a call to at::foo_out to do the actual work.
- Do NOT generate a DefaultBackend implementation for foo_out.
Actually, there is a case to be made for this being a good idea with
more infra; see the comments inside.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D26731225
Pulled By: ezyang
fbshipit-source-id: 939da7cb69f694722ec293e5e42e74a755dd0985
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51590
This PR backports a subset of Jiakai's changes from
https://github.com/pytorch/pytorch/pull/51554 that add support
for at::cpu in non-structured kernels.
The unusual bits:
- Need to add a new forward inference rule for doing conversions
from `const optional<Tensor>&` to `const Tensor&`
- Need to give the wrapper functions a prefix so that the call to
the wrapper is not ambiguous
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D26209871
Pulled By: ezyang
fbshipit-source-id: 8162686039675ab92a2af7a14f6b18941f8944df
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51585
Some payoff from the stack of refactors. When I initially landed
at::cpu, Brian asked me why I couldn't just separate the anonymous
and namespaced definitions. Well, it used to be annoying. Now it's
not annoying anymore, so go ahead and split them up.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D26209873
Pulled By: ezyang
fbshipit-source-id: 63057d22acfaa0c17229947d9e65ec1193e360ec
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51583
There are no substantive changes in this PR. The cluster of structured
helper methods is now split off into its own class. To make sure all of
the original closure was available, I subclassed RegisterDispatchKey and
passed it all on; the only new thing closed over is the structured
functions group being processed. I also renamed all the methods to
remove structured_ from their names as it is now redundant.
Most of the benefit is being able to remove a level of indentation
from gen_one.
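Schematically, with hypothetical names and simplified fields (just the shape of the refactor, not the actual code):
```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RegisterDispatchKey:
    dispatch_key: str  # simplified stand-in for the original closure

@dataclass(frozen=True)
class StructuredRegisterDispatchKey(RegisterDispatchKey):
    g: 'StructuredNativeFunctions'  # the only new thing closed over

    def gen_one(self, f) -> str:
        # one structured kernel's registration; body elided
        ...
```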
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: bhosmer
Differential Revision: D26209872
Pulled By: ezyang
fbshipit-source-id: 76c11410a24968d4f3d8a2bbc9392251a7439e6e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51508
No substantive changes. The codegen for this file was getting a
bit long, so I moved it off into the tools.codegen.dest submodule (I
wanted tools.codegen.gen, but that conflicts with the existing
module; oy vey!). To do this I had to move some other functions
around so that they were more generally accessible. Otherwise
self-explanatory.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D26187856
Pulled By: ezyang
fbshipit-source-id: fd3784571d03d01c4acb7ca589fcde4492526408