pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
BowenBao	8f4edf1e1d	[ONNX] Initial version of diagnostics infrastructure. (#85107 ) This PR introduces a general Python diagnostics infrastructure powered by SARIF, and the exporter diagnostics module that builds on top of it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85107 Approved by: https://github.com/abock, https://github.com/justinchuby	2022-09-30 07:47:26 +00:00
BowenBao	33401ee81f	[ONNX] Rename 'sarif_om' to 'sarif' (#85918 ) 'sarif_om' was the module name in the original repository https://github.com/microsoft/sarif-python-om. But since we have moved along with various extensions, it wouldn't hurt to rename the module for clarity. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85918 Approved by: https://github.com/abock, https://github.com/thiagocrepaldi, https://github.com/justinchuby	2022-09-30 05:39:49 +00:00
BowenBao	e9b254a025	[ONNX] Migrate SARIF from attr to dataclasses (#85651 ) Move to dataclasses since PyTorch does not depend on `attr`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85651 Approved by: https://github.com/justinchuby, https://github.com/AllenTiTaiWang, https://github.com/abock, https://github.com/thiagocrepaldi	2022-09-30 05:34:40 +00:00
BowenBao	91667d1d21	[ONNX] Introduce SARIF (#85428 ) That's the parent issue tracking this and more follow up tasks, so will keep open after this. This PR introduces the python classes for SARIF object model, along with script for generation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85428 Approved by: https://github.com/justinchuby, https://github.com/AllenTiTaiWang, https://github.com/abock, https://github.com/thiagocrepaldi	2022-09-30 05:32:41 +00:00
soulitzer	7e4684009c	Improve codegen for jvp decomposition (#84894 ) Fixes: https://github.com/pytorch/pytorch/issues/84888 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84894 Approved by: https://github.com/albanD	2022-09-29 03:04:15 +00:00
soulitzer	bd65adf4e9	Properly fix log_sigmoid vmapjvp and remove hack (#84892 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84892 Approved by: https://github.com/albanD, https://github.com/zou3519	2022-09-29 01:19:13 +00:00
Mikayla Gawarecki	afaee00fec	Add python `nested_tensor` and `as_nested_tensor` constructors in `torch.nested` (#85593 ) Remove `torch.nested_tensor` which has erroneous behavior wrt gradients (could be either leaf or not leaf). Introduce `torch.nested.nested_tensor` and `torch.nested.as_nested_tensor` in the vein of `torch.tensor` and `torch.as_tensor`. Done in nested `__init__.py` for now but can move to pybind in future (when we want to load from numpy/nested lists ). Discussed offline with @cpuhrsch and pybind constructor (https://github.com/pytorch/pytorch/pull/85536) was more gnarly than expected, so we can move to that when we do need loading from numpy etc. Differential Revision: [D39806622](https://our.internmc.facebook.com/intern/diff/D39806622) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85593 Approved by: https://github.com/drisspg, https://github.com/cpuhrsch	2022-09-28 20:15:02 +00:00
Horace He	a4bd89b267	Revert "Revert "Symintified mmm/addmm derivative formulas (#85794 )"" (#85820 ) This reverts commit `823dc33b00`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85820 Approved by: https://github.com/huydhn	2022-09-28 17:34:11 +00:00
PyTorch MergeBot	a0b1693996	Revert "Update `amax/amin/norm/count_nonzero` signatures with `int[*]? dim` (#83300 )" This reverts commit `1c0f0b33a0`. Reverted https://github.com/pytorch/pytorch/pull/83300 on behalf of https://github.com/jeffdaily due to The commit breaks nvfuser tests	2022-09-28 17:04:53 +00:00
PyTorch MergeBot	823dc33b00	Revert "Symintified mmm/addmm derivative formulas (#85794 )" This reverts commit `230edd2515`. Reverted https://github.com/pytorch/pytorch/pull/85794 on behalf of https://github.com/janeyx99 due to Sorry, reverting as this breaks an aot_autograd mac test on functorch `230edd2515`	2022-09-28 16:02:05 +00:00
Horace He	230edd2515	Symintified mmm/addmm derivative formulas (#85794 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85794 Approved by: https://github.com/ezyang	2022-09-28 14:07:57 +00:00
Edward Z. Yang	793488cda2	Revert "Revert "Symintifying slice ops (#85196 )"" (#85746 ) This reverts commit `3a171dfb0c`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85746 Approved by: https://github.com/albanD	2022-09-28 04:37:35 +00:00
Kurt Mohler	1c0f0b33a0	Update `amax/amin/norm/count_nonzero` signatures with `int[]? dim` (#83300 ) Changes `dim` arg to use `int[]?` type for the following functions in `native_functions.yaml`: * `amax` * `amin` * `norm` * `frobenius_norm` * `native_norm` * `count_nonzero` Part of #29137 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83300 Approved by: https://github.com/ngimel, https://github.com/albanD, https://github.com/kulinseth	2022-09-28 01:56:37 +00:00
PyTorch MergeBot	572dd862c4	Revert "Update `amax/amin/norm/count_nonzero` signatures with `int[*]? dim` (#83300 )" This reverts commit `8c7c7ed322`. Reverted https://github.com/pytorch/pytorch/pull/83300 on behalf of https://github.com/huydhn due to The commit pin breaks XLA test somehow	2022-09-28 01:36:43 +00:00
Kurt Mohler	8c7c7ed322	Update `amax/amin/norm/count_nonzero` signatures with `int[]? dim` (#83300 ) Changes `dim` arg to use `int[]?` type for the following functions in `native_functions.yaml`: * `amax` * `amin` * `norm` * `frobenius_norm` * `native_norm` * `count_nonzero` Part of #29137 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83300 Approved by: https://github.com/ngimel, https://github.com/albanD, https://github.com/kulinseth	2022-09-27 23:50:04 +00:00
PyTorch MergeBot	3a171dfb0c	Revert "Symintifying slice ops (#85196 )" This reverts commit `4c01c51266`. Reverted https://github.com/pytorch/pytorch/pull/85196 on behalf of https://github.com/atalman due to Break internal build Exutorch	2022-09-27 18:01:27 +00:00
soulitzer	15c52ffc4f	Disallow auto_element_wise for in-place and fix some in-place gradients (#85634 ) Fixes https://github.com/pytorch/pytorch/issues/85535 Also fixes the backward and forward gradients of `nn.functional.threshold`. The issue was that in-place gradients weren't tested because the in-place variants were not properly registered to the OpInfo. Perhaps an alternative to this is to make auto_element_wise smart enough to actually handle the in-place cases (we have 4 cases total now where we manually copy_ after doing auto_element_wise), but that requires a few more changes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85634 Approved by: https://github.com/albanD	2022-09-27 15:35:24 +00:00
George Qi	686555b663	[maskedtensor] port torch/_masked into torch/masked (#85515 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85515 Approved by: https://github.com/cpuhrsch	2022-09-26 23:41:13 +00:00
Brian Hirsh	4a2d2e5e40	Change API type `Tensor[]` for structured kernels. (#73350 ) Partially fixes: #66328 This PR: - adds support for `ITensorList` to the dispatcher for: - computing the dispatch key - boxing and unboxing `ITensorList` - modified the codegen for structured kernels: - codegen APIs use `ITensorList` instead of `ArrayRef<Tensor>` Changes summary: - Signature changes due to the different APIs: - dispatcher API (e.g. `BatchingRegistrations.cpp`) - C++ API (e.g. `TensorShape.cpp`) - Miscellaneous functions used by codegen'd functions (e.g. `FunctionalTensorWrapper.`) - Dispatcher changes for handling `ITensorList` correctly (e.g. `DispatchKeyExtractor.h`) - Signature changes of `at::cat` due to the need of `const` inside `TensorBody.h` - Forward declarations of `ITensorList` (e.g. `MethodOperators.h`) - Codegen changes, special casing structured kernels (e.g. `gen.py`) Short description of structured kernels special casing: I introduced, mainly, 5 types of changes to the codegen for generating code depending on whether the kernel is structured or not: 1. Added a `structured_type_override` flag to the `argument_type` function definition of the affected APIs (mainly the dispatcher and C++ APIs). - `api/cpp.py`, `api/dispatcher.py`, `api/native.py` 2. Added a `structured_type_override` member to the signature classes (e.g. `CppSignature`), since `FunctionSchema` doesn't really know whether the function is structured or not - `api/types.py` 3. Added a `part_of_structured_group` to `NativeFunction` class, which is just a convenient function to forward to `structured_type_override` wherever needed - `model.py` 4. Appropriately changed the rest of the codegen, whenever it used either the signature classes or the `arguments` function directly 5. Added a check for `const ITensorList&` type wherever there was a check for `TensorList` Pull Request resolved: https://github.com/pytorch/pytorch/pull/73350 Approved by: https://github.com/bdhirsh	2022-09-26 21:46:38 +00:00
Edward Z. Yang	4c01c51266	Symintifying slice ops (#85196 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85196 Approved by: https://github.com/ezyang	2022-09-23 22:01:32 +00:00
Catherine Lee	49e10c1598	[ci] test_ops in parallel, ci tests log to file (#85528 ) part one of splitting up https://github.com/pytorch/pytorch/pull/84961 into (probably 2) parts contains * logging to file * testing test_ops in parallel Pull Request resolved: https://github.com/pytorch/pytorch/pull/85528 Approved by: https://github.com/huydhn	2022-09-23 20:45:20 +00:00
Ivan Yashchuk	539076e2c2	Remove deprecated torch.lstsq (#70980 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.lstsq`. There's a note in `tools/codegen/gen.py` about `lstsq` schema in `native_functions.yaml` that I will not remove: `87139d8532/tools/codegen/gen.py (L734-L770)` cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/70980 Approved by: https://github.com/lezcano, https://github.com/kit1980	2022-09-23 00:16:55 +00:00
Richard Zou	848437590f	Delete functorch's monkeypatching (#85430 ) By upstreaming functorch's tensor printing logic into PyTorch. There's no way of creating a custom print function for a TensorImpl subclass (as opposed to a torch_dispatch or torch_function tensor subclass, which can just override repr()) right now, so we need to directly interpose inside regular Tensor printing in PyTorch. Monkey patching is bad; users do not expect `import blah` to change something about another library. Fixes https://github.com/pytorch/functorch/issues/900 Test Plan: - existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/85430 Approved by: https://github.com/ezyang	2022-09-22 18:47:12 +00:00
kshitij12345	56a41b5998	[composite compliance] ctc_loss (#84752 ) Ref #69991 I have mixed feelings about adding new (private) operators. Backend writers will have to override them as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84752 Approved by: https://github.com/zou3519	2022-09-22 00:21:11 +00:00
PyTorch MergeBot	3dce26635f	Revert "test in parallel at file granularity (#84961 )" This reverts commit `8107666c6a`. Reverted https://github.com/pytorch/pytorch/pull/84961 on behalf of https://github.com/clee2000 due to makes test_forward_ad_nn_functional_max_unpool2d_cuda_float32 flakily unexpectedly pass	2022-09-21 20:21:25 +00:00
Mikayla Gawarecki	77f1f98479	Re-introduce `torch.Tensor.to_padded_tensor` (#85293 ) Differential Revision: [D39629004](https://our.internmc.facebook.com/intern/diff/D39629004) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85293 Approved by: https://github.com/cpuhrsch	2022-09-21 18:45:56 +00:00
Catherine Lee	8107666c6a	test in parallel at file granularity (#84961 ) run tests in parallel at the test file granularity runs 3 files in parallel using multiprocessing pool, output goes to a file, which is then printed when the test finishes. Some tests cannot be run in parallel (usually due to lacking memory), so we run those after. Sharding is changed to attempt to mask large files with other large files/run them on the same shard. test_ops* gets a custom handler to run it because it is simply too big (2hrs on windows) and linalg_cholesky fails (I would really like a solution to this if possible, but until then we use the custom handler). reduces cuda tests by a lot, reduces total windows test time by ~1hr Ref. https://github.com/pytorch/pytorch/issues/82894 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84961 Approved by: https://github.com/huydhn	2022-09-21 16:58:11 +00:00
Edward Z. Yang	3eb27229dd	as_strided symbolic support (#85264 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: [D39662820](https://our.internmc.facebook.com/intern/diff/D39662820) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85264 Approved by: https://github.com/wconstab	2022-09-21 13:34:55 +00:00
Edward Z. Yang	e1f634753c	Setup fake tensor and symbolic shapes once at beginning of AOTAutograd (#85233 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: [D39662822](https://our.internmc.facebook.com/intern/diff/D39662822) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85233 Approved by: https://github.com/wconstab	2022-09-20 19:11:25 +00:00
Thomas Viehmann	e41d758e26	Handle implicit real->complex casting for backward of stack (#84993 ) Fixes: #75852 P.S.: Yay for the PyTorch foundation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84993 Approved by: https://github.com/soulitzer	2022-09-19 21:20:34 +00:00
Edward Z. Yang	6a18616296	Support for sym_strides() in backwards formulas (#85210 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/85210 Approved by: https://github.com/Chillee, https://github.com/voznesenskym	2022-09-19 18:05:09 +00:00
Brian Hirsh	1838957e6f	fix external codegen kernel error checking (#85029 ) Fixes https://github.com/pytorch/pytorch/issues/84987. I followed the repro steps from the issue (changed `empty_symint` to `empty_symint2`) and confirmed that an error gets raised. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85029 Approved by: https://github.com/ezyang	2022-09-17 04:08:09 +00:00
Edward Z. Yang	490727a35f	New calling convention for Python dispatcher (#85133 ) Instead of calling into the Python dispatcher for EVERY dispatcher call, we now have a two step process. First, we getattr(op: OpOverload, dispatch_key) to "load" the handler for the function. This can either be a conventional function (in which case we will call it, in the same way the old Python dispatcher worked), or it can be a DispatchKey, in which case we will directly call that DispatchKey in C++, bypassing marshalling between Python and C++ entirely. OpOverload.__getattr__ is carefully written so that it will cache the A further optimization would be to define __slots__ on OpOverload, and ensuring that the DispatchKey strings are interned. The resulting Python dispatcher is less flexible: after the first lookup, the handler is cached and we won't recompute it. Furthermore, by default, dispatches will not go into Python, and so you won't get stack frames for the Python dispatcher by default. But we get a huge performance improvement: on the following microbenchmark we go from 2.5s to 1.9s. ``` import time import torch from functorch import make_fx def f(x): for i in range(1000): x = x * x return x begin = time.time() res = make_fx(f, tracing_mode="symbolic")(torch.randn(10, 20)) print(time.time()-begin) ``` Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/85133 Approved by: https://github.com/wconstab	2022-09-16 20:38:21 +00:00
lezcano	d710c95cc0	Implement forward AD for scatter_reduce (#85000 ) I left the case `reduction="prod"` for future work as it's a bit of a pain. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85000 Approved by: https://github.com/soulitzer	2022-09-16 17:45:07 +00:00
Edward Z. Yang	00ce302c07	Performance optimizations to proxy tensor (#85049 ) - Lazily allocate FX nodes for size/stride accessors on proxy tensor - Properly track derived computations on strides/numel/etc - Remove unnecessary tree_map at end of proxy tensor trace checking invariants; we will just have to be smart (it's too expensive) - Avoid tree_map in sym proxy tracing Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/85049 Approved by: https://github.com/wconstab	2022-09-16 00:28:50 +00:00
soulitzer	7f88934a8f	[reland 2] Call jit decomp in VariableType to improve forward AD coverage (#84976 ) Reland of https://github.com/pytorch/pytorch/pull/84675 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84976 Approved by: https://github.com/zou3519	2022-09-15 22:46:19 +00:00
Michael Voznesensky	8ca1839d32	Python Dispatcher integration with C++ dispatcher (#85050 ) #84826 but without ghstack Pull Request resolved: https://github.com/pytorch/pytorch/pull/85050 Approved by: https://github.com/malfet	2022-09-15 00:43:36 +00:00
PyTorch MergeBot	706b990306	Revert "Python Dispatcher integration with C++ dispatcher (#84826 )" This reverts commit `35f6a69191`. Reverted https://github.com/pytorch/pytorch/pull/84826 on behalf of https://github.com/malfet due to Broke dynamo, see `35f6a69191`	2022-09-14 14:07:58 +00:00
Michael Voznesensky	35f6a69191	Python Dispatcher integration with C++ dispatcher (#84826 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> From @ezyang's original PR: There are a number of situations where we have non-backend kernels (e.g., CompositeImplicitAutograd, batching rules) which we would like to port to Python, but we have no way to integrate these ports with the overall system while using preexisting C++ registrations otherwise. This PR changes that by introducing a Python dispatcher (which can have its own kernels directly in Python), which can be interposed over ordinary C++ dispatch. The ingredients: We introduce a new PythonDispatcher dispatch key, that has the same tenor as FuncTorchDynamicLayerFrontMode: it works by getting triggered before every other dispatch key in the dispatch key, and shunting to a Python implementation The Python dispatcher is a per-interpreter global object that is enabled/disabled via the guard EnablePythonDispatcher/DisablePythonDispatcher. We don't make it compositional as I have no idea what a compositional version of this feature would look like. Because it is global, we don't need to memory manage it and so I use a simpler SafePyHandle (newly added) to control access to this pointer from non-Python C++. Like __torch_dispatch__, we use PyInterpreter to get to the Python interpreter to handle the dispatch. I need to reimplement dispatch table computation logic in Python. To do this, I expose a lot more helper functions for doing computations on alias dispatch keys and similar. I also improve the pybind11 handling for DispatchKey so that you can either accept the pybind11 bound enum or a string; this simplifies our binding code. See https://github.com/pybind/pybind11/issues/483#issuecomment-1237418106 for how this works; the technique is generally useful. I need to be able to call backend fallbacks. I do this by permitting you to call at a dispatch key which doesn't have a kernel for the operator; if the kernel doesn't exist, we check the backend fallback table instead. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84826 Approved by: https://github.com/ezyang	2022-09-14 06:57:19 +00:00
PyTorch MergeBot	36d79143ce	Revert "[reland] Call jit decomposition in VariableType to increase forward AD coverage (#84151 ) (#84675 )" This reverts commit `bb4e96c964`. Reverted https://github.com/pytorch/pytorch/pull/84675 on behalf of https://github.com/osalpekar due to causing asan xplat link-time errors like ld.lld: error: undefined symbol: torch::jit::has_jit_decomposition(c10::FunctionSchema const&)	2022-09-13 22:54:54 +00:00
drisspg	bda8a5729b	[Nested Tensor] Create differentiable nt to tensor view functions (#83371 ) This PR attempts to implement 2) "the safe way" of creating a view of nested tensor that returns a regular tensor. The rest of the break down is here: https://fb.quip.com/J8QCAx41af11 https://gist.github.com/drisspg/8622e9c97d374fa920ac647e1167cabc This is a short list of some edge cases. After some more work I was able to address two of the test cases in the above gist. There are a few complex aspects here for which I left defeated comments inline. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83371 Approved by: https://github.com/bdhirsh	2022-09-13 20:35:58 +00:00
Thomas Orozco	b4799736ee	autograd: fix non-deterministic output in codegen comments (#84695 ) Summary: Like it says in the title. Currently, this will return output like this: In Buck1, that's OK because Buck1's caching doesn't really care too much about However, in Buck2, this is a disaster, because caching is based exclusively on inputs and outputs and The diff here proposes making the path relative to the codegen script itself, which should carry about as much info, but avoid cache misses. Concretely, this: ``` // generated from /dev/shm/uid-34135/cfbc5712-seed-nspid4026533424_cgpid2794673-ns-4026533443/tools/autograd/templates/python_functions.h ``` Becomes this: ``` // generated from ../tools/autograd/templates/python_functions.h ``` So, we keep the useful part, and we get caching. This matters because those headers are used in actions like: ``` fbcode//deeplearning/fbgemm/fbgemm_gpu/codegen:embedding_ops -- action (cxx_compile gen_embedding_backward_adam_split_unweighted_cuda.cu (pic)) ``` Those actions take upwards of 5 minutes to finish, so by allowing a cache hit, we are a) saving our users a lot of time and b) saving some RE capacity as well. This actually matters a lot because right now those targets are produced by `//caffe2:generate-code`, which itself doesn't get cache hits from RE because `generate_code.par` is non-deterministic (this is, unfortunately, true of PARs in general), so that rule introduces non-determinism that the codegen propagates and we get zero caching. This diff doesn't fix `//caffe2:generate-code`'s inputs being non-deterministic, but it does fix its outputs being non-deterministic, which means the non-determinism stops there, and we get back to cache hits. Test Plan: - CI ``` buck2 build fbcode//caffe2:generate-code buck2 build fbcode//deeplearning/fbgemm/fbgemm_gpu/codegen:embedding_ops ``` Reviewed By: ndmitchell Differential Revision: D39348565 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84695 Approved by: https://github.com/soulitzer	2022-09-13 18:41:15 +00:00
soulitzer	bb4e96c964	[reland] Call jit decomposition in VariableType to increase forward AD coverage (#84151 ) (#84675 ) This reverts commit `acb4a09628`. In addition, we also fix a memory leak in layer norm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84675 Approved by: https://github.com/zou3519	2022-09-12 20:33:14 +00:00
Mikayla Gawarecki	e217b30b0f	Add `torch.nested` namespace (#84102 ) First step towards #83775 - only `to_padded_tensor` is moved to the nested namespace for now - following the schema used for `special`, `fft`, `linalg` and other namespaces, nested functions are registered in native_functions.yaml as `nested_{function_name}` and are bound to the desired Python name in `torch/nested/__init__.py`, and the desired C++ name in `torch/csrc/api/include/torch/nested.h`. ~~Question: should we keep the documentation for `Tensor.to_padded_tensor` or can this deleted since it is shared by `torch.nested.to_padded_tensor`?~~ [generated nested docs](https://docs-preview.pytorch.org/84102/nested.html?highlight=nested#module-torch.nested) Differential Revision: [D39361148](https://our.internmc.facebook.com/intern/diff/D39361148) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84102 Approved by: https://github.com/drisspg	2022-09-12 16:31:05 +00:00
Mengwei Liu	2765243cd5	[torchgen] Refactor static_dispatch to take in source signature (#84384 ) Summary: Context: currently `static_dispatch` assumes that given a native function `f`, we always want to map from its `DispatcherSignature` to its `CppSignature`. This assumption may not hold true for some use cases, where the source bindings may not come from its `DispatcherSignature`. Here I'm changing the argument `sig: DispatcherSignature` to be `sig: Union[CppSignature, DispatcherSignature]`, and also removing the unused `f`. Test Plan: Rely on added unit test. Differential Revision: D39192969 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84384 Approved by: https://github.com/iseeyuan	2022-09-10 06:58:56 +00:00
Ivan Yashchuk	01c54ad6de	Remove deprecated torch.eig (#70982 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.eig`. cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/70982 Approved by: https://github.com/Lezcano, https://github.com/malfet	2022-09-09 21:31:57 +00:00
Eli Uriegas	93aef3a010	Use presence of _symint in kernel name to generate symint sig or not (#84579 ) Something people found confusing was that whether or not a native:: signature would get SymInt or not in its type was based on the dispatch key. This changes it so that SymInt or not in type is based on whether or not you have _symint in the name of the kernel or not. This means that even when we make operators support SymInt, you no longer have to go and update all the preexisting definitions; instead, you now selectively write _symint to opt individual kernels into SymInt support. I then go and update a bunch of kernels that don't have proper SymInt support to make use of this convention. There is some hacking around for view generation code. I also add support for external backends to specify 'symint' operators, for which we generate SymInt signatures instead of regular signatures. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: [D39310060](https://our.internmc.facebook.com/intern/diff/D39310060) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84579 Approved by: https://github.com/wconstab	2022-09-09 18:31:56 +00:00
Dhruv Matani	18a31cc044	[Mobile] Fix The Build For Model Tracer (#84755 ) Summary: Currently, the model tracer build is broken for 2 reasons: 1. A few source files are missing, resulting in missing link time symbols 2. The `TRACING_BASED` flag isn't passed correctly from the command line (specified as an environment variable) as a CMake flag Both these issues were fixed. Test Plan: Ran this command: `USE_CUDA=0 TRACING_BASED=1 python setup.py develop --cmake` and saw that the tracer binary was built at `build/bin/model_tracer` - also ran it to ensure that it can generate a YAML file. Differential Revision: [D39391270](https://our.internmc.facebook.com/intern/diff/D39391270) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84755 Approved by: https://github.com/cccclai	2022-09-09 18:22:24 +00:00
Justin Chu	2fa8142cf9	[ONNX] Rename constants for clarity (#84645 ) Rename constants to make them more clear. Fix styles to upper case. Removed `onnx_stable_opsets` because it can be computed from `ONNX_MIN_OPSET` and `ONNX_MAX_OPSET`. Fixes #84643 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84645 Approved by: https://github.com/BowenBao	2022-09-09 01:22:14 +00:00
PyTorch MergeBot	acb4a09628	Revert "Call jit decomposition in VariableType to increase forward AD coverage (#84151 )" This reverts commit `42d99e6f19`. Reverted https://github.com/pytorch/pytorch/pull/84151 on behalf of https://github.com/malfet due to Regressed test_jvpvjp_nn_functional_layer_norm_cuda_float32, see `42d99e6f19`	2022-09-07 18:02:27 +00:00
soulitzer	42d99e6f19	Call jit decomposition in VariableType to increase forward AD coverage (#84151 ) This PR: - updates forward AD codegen in core to generate code that tries calling into decompositions registered to jit when - (1) the function is not in-place or out variant - AND (2) the function is differentiable (requires_derivative=True) - AND (3) there are no forward AD formulas registered - To simplify things we always generating the if/else (as long as (1) is true), but generate 'false' when either (2) or (3) are false. - removes the mechanism from functorch - (follow up) some functorch tests should be updated here so they no longer have to compute the Jacobian with vjp - factors out some logic to generate the any_has_forward_grad condition - (bc-breaking) when TensorList inputs unexpectedly have forward grad, the error will no longer contain the name See https://github.com/pytorch/pytorch/pull/84151#issuecomment-1238519247 for codegen output and more discussion. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84151 Approved by: https://github.com/samdow, https://github.com/albanD, https://github.com/zou3519	2022-09-07 15:31:46 +00:00
Mikayla Gawarecki	1cad744694	Enable select.int when NestedTensor requires grad (#83875 ) Previously indexing a nested tensor when it requires_grad would raise an error because the backward formula for `select.int` uses `self.sizes()`. This PR fixes that by temporarily registering a _nested_select_backward function which can be removed when we start using the symint approach to register kernels. For now this functionality is needed for creating a POC that nested tensor can be an API to `segment_coo` and `segment_csr` in the torch_scatter repo ``` a = torch.arange(10).reshape(2, 5).float() b = torch.arange(12).reshape(2, 6).float() nt = torch.nested_tensor([a, b], dtype=torch.float).requires_grad_(True) nt[0] # RuntimeError: Internal error: NestedTensorImpl doesn't support sizes. Please file an issue on https://github.com/pytorch/nestedtensor ``` whereas ``` nt = torch.nested_tensor([a, b], dtype=torch.float).requires_grad_(False) nt[0] ``` would succeed Pull Request resolved: https://github.com/pytorch/pytorch/pull/83875 Approved by: https://github.com/albanD, https://github.com/drisspg	2022-09-06 22:19:32 +00:00
mikey dagitses	4f0b9f3c31	move PyTorch internal-only starlark files into fb/ subdirectories (#84548 ) Summary: These are not used in OSS so should not clutter them there. Test Plan: Rely on CI. Differential Revision: D39262135 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84548 Approved by: https://github.com/DanilBaibak	2022-09-06 18:08:42 +00:00
Nikolay Korovaiko	f725009a48	as_strided supports SymInt; codegen supports optional SymInt (#84393 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84393 Approved by: https://github.com/ezyang	2022-09-06 16:39:24 +00:00
Edward Z. Yang	2a332afbf4	Add SymFloat, support SymInt to SymFloat conversion (#84284 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84284 Approved by: https://github.com/albanD	2022-09-03 01:30:32 +00:00
YifanShenSZ	673b35c847	Better reshape with autograd support (#82754 ) (#84154 ) The original author is @YifanShenSZ and the original PR is: #82754 # Summary: Previous reshape [https://github.com/pytorch/pytorch/issues/80981](https://github.com/pytorch/pytorch/pull/80981) is ok for forward, but needs improvement for backward: need to handle "sometimes view sometimes copy" behavior. This pull request fixes it by: 1. add a new alias dispatch key `CompositeImplicitAutogradNestedTensor`, which ideally would work as nested-tensor version of `CompositeImplicitAutograd` 2. register `reshape_nested` to `reshape` by `CompositeImplicitAutogradNestedTensor` Side changes: * add contiguous memory format support to `clone_nested` * add `view_nested` * add `reshape_as_nested` Fix issue [https://github.com/pytorch/pytorch/issues/83041](https://github.com/pytorch/pytorch/issues/83041) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82754 Test Plan: Imported from GitHub, without a `Test Plan:` line. Static Docs Preview: executorch \|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D39023822/V13/executorch/)\| \|Modified Pages\| Reviewed By: albanD Differential Revision: D39023822 Pulled By: drisspg Pull Request resolved: https://github.com/pytorch/pytorch/pull/84154 Approved by: https://github.com/bdhirsh, https://github.com/albanD	2022-09-01 20:01:39 +00:00
Edward Z. Yang	f1ee162193	Use SymInt signature to compute saved variables (#84354 ) This seems to have been accidentally working, but it broke when I added support for saving optional SymInt directly from input arguments. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84354 Approved by: https://github.com/Krovatkin	2022-09-01 16:30:00 +00:00
Elias Ellison	f701cb04fb	Test Dynamo CI w Fake Tensors (#84282 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84282 Approved by: https://github.com/anijain2305	2022-09-01 00:15:05 +00:00
Nikolay Korovaiko	eda217ab67	Reland symint_numel (#84281 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84281 Approved by: https://github.com/ezyang	2022-08-30 21:53:34 +00:00
Jeff Daily	d09486ab23	[ROCm] enable nvfuser (#82498 ) ### Description The nvfuser is enabled for ROCm. ### Testing CI label ciflow/trunk covers the newly enabled ROCm functionality as well as any CUDA regressions caused by these changes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82498 Approved by: https://github.com/jjsjann123, https://github.com/davidberard98	2022-08-30 21:50:39 +00:00
Nikolay Korovaiko	44a975335e	Revert "Re-land sym_numel (#82374 ) (#82726 ) (#82731 ) (#82855 )" (#84207 ) This reverts commit `bfebf254dd`. Differential Revision: [D39104562](https://our.internmc.facebook.com/intern/diff/D39104562) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84207 Approved by: https://github.com/robieta	2022-08-30 13:22:58 +00:00
Edward Z. Yang	ad44670fa1	Back out "Revert D38984222: Don't introduce new overload for SymInt (#83628 )" (#84173 ) Also Back out "Revert D39075159: [acc_tensor] Use SymIntArrayRef for overloaded empty.memory_format's signature" Original commit changeset: dab4a9dba4fa Original commit changeset: dcaf16c037a9 Original Phabricator Diff: D38984222 Original Phabricator Diff: D39075159 Also update Metal registrations for C++ registration changes. Also update NNPI registration to account for tightened schema checking Differential Revision: [D39084762](https://our.internmc.facebook.com/intern/diff/D39084762/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39084762/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/84173 Approved by: https://github.com/Krovatkin	2022-08-29 18:01:07 +00:00
PyTorch MergeBot	c7edcd6968	Revert "Don't introduce new overload for SymInt (#83628 )" This reverts commit `9790d90e4b`. Reverted https://github.com/pytorch/pytorch/pull/83628 on behalf of https://github.com/malfet due to Breaks internal builds, see D39076487	2022-08-27 01:23:17 +00:00
Catherine Lee	582c0833d5	mac circleci workflows (#82780 ) Add mac and ios workflows to circleci so they can be run on pull. m1 tests are not included because circleci doesn't have machines. Unsure how to get certain environment variables (specifically for arm64 ios builds that require env vars like `IOS_SIGN_KEY_2022` and `IOS_DEV_TEAM_ID`, which are stored in the org-member context that is not accessible by everyone). Doc regarding env vars: https://docs.google.com/document/d/1J_3Z9sfu2vlHMF1fjdJfeTuxPXC6dgqJs7aU0KpYSBU/edit# Pull Request resolved: https://github.com/pytorch/pytorch/pull/82780 Approved by: https://github.com/malfet, https://github.com/huydhn	2022-08-26 18:48:48 +00:00
Edward Z. Yang	9790d90e4b	Don't introduce new overload for SymInt (#83628 ) Previously, we introduced new SymInt overloads for every function we wanted. This led to a lot of boilerplate, and also a lot of confusion about how the overloads needed to be implemented. This PR takes a simpler but more risky approach: just take the original function and change its ints to SymInts. This is BC-breaking in the following ways: * The C++ API for registering implementations for aten operators will change from int64_t to SymInt whenever you make this change. Code generated registrations in PyTorch do not change as codegen handles the translation automatically, but manual registrations will need to follow the change. Typically, if you now accept a SymInt where you previously only took int64_t, you have to convert it back manually. This will definitely break XLA, see companion PR https://github.com/pytorch/xla/pull/3914 Note that not all dispatch keys get the automatic translation; all the composite keys and Meta keys are modified to take SymInt directly (because they should handle them directly), and so there are adjustments for this. This is not BC-breaking in the following ways: * The user facing C++ API remains compatible. Even if a function changes from int to SymInt, the default C++ binding still takes only ints (e.g., at::empty(IntArrayRef, ...)). To call with SymInts, you must call at::empty_symint instead. This involved adding two more signatures to CppSignatureGroup; in many cases I refactored code to iterate over all signatures in the group instead of hard-coding the two that previously existed. * This is TorchScript compatible; internally we treat SymInts as ints so there is no change to what happens at runtime in TorchScript. In particular, it's OK to reference an empty schema by its old type (using int types), as long as you're not doing string equality (which you shouldn't be), these parse to the same underlying type. 
Structure of the PR: * The general strategy of this PR is that, even when you write `SymInt` inside `native_functions.yaml`, sometimes, we will treat it as if it were an `int`. This idea pervades the codegen changes, where we have a translation from SymInt to c10::SymInt or int64_t, and this is controlled by a symint kwarg which I added and then audited all call sites to decide which I wanted. Here are some of the major places where we pick one or the other: * The C++ FunctionSchema representation represents `SymInt` as `int`. There are a few places we do need to know that we actually have a SymInt and we consult `real_type()` to get the real type in this case. In particular: * When we do schema validation of C++ operator registration, we must compare against true schema (as the C++ API will provide `c10::SymInt`, and this will only be accepted if the schema is `SymInt`). This is handled with cloneWithRealTypes before we check for schema differences. * In `toIValue` argument parsing, we parse against the true schema value. For backwards compatibility reasons, I do still accept ints in many places where Layout/SymInt/etc were expected. (Well, accepting int where SymInt is expected is not BC, it's just the right logic!) * In particular, because SymInt never shows up as type() in FunctionSchema, this means that we no longer need a dedicated Tag::SymInt. This is good, because SymInts never show up in mobile anyway. * Changes to functorch/aten are mostly about tracking changes to the C++ API registration convention. Additionally, since SymInt overloads no longer exist, registrations for SymInt implementations are deleted. In many cases, the old implementations did not properly support SymInts; I did not add any new functionality with this PR, but I did try to annotate with TODOs where there is work to do. 
Finally, because the signature of `native::` API changed from int to SymInt, I need to find alternative APIs for people who were calling these functions directly. Typically, I insert a new dispatch call when perf doesn't matter, or use `at::compositeexplicitautograd` namespace to handle other cases. * The change to `make_boxed_from_unboxed_functor.h` is so that we accept a plain IntList IValue anywhere a SymIntList is expected; these are read-only arguments so covariant typing is OK. * I change how unboxing logic works slightly. Previously, we interpreted the C++ type for Layout/etc directly as IntType JIT type, which works well because the incoming IValue is tagged as an integer. Now, we interpret the C++ type for Layout as its true type, e.g., LayoutType (change to `jit_type.h`), but then we accept an int IValue for it anyway. This makes it symmetric with SymInt, where we interpret the C++ type as SymIntType, and then accept SymInt and int IValues for it. * I renamed the `empty.names` overload to `empty_names` to make it less confusing (I kept mixing it up with the real empty overload) * I deleted the `empty.SymInt` overload, which ended up killing a pile of functions. (This was originally a separate PR but the profiler expect test was giving me grief so I folded it in.) * I deleted the LazyDynamicOpsTest tests. These were failing after these changes, and I couldn't figure out why they used to be passing: they make use of `narrow_copy` which didn't actually support SymInts; they were immediately converted to ints. * I bashed LTC into working. The patches made here are not the end of the story. The big problem is that SymInt translates into Value, but what if you have a list of SymInt? This cannot be conveniently represented in the IR today, since variadic Values are not supported. 
To work around this, I translate SymInt[] into plain int[] (this is fine for tests because LTC dynamic shapes never actually worked); but this will need to be fixed for proper LTC SymInt support. The LTC codegen also looked somewhat questionable; I added comments based on my code reading. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83628 Approved by: https://github.com/albanD, https://github.com/bdhirsh	2022-08-26 01:35:40 +00:00
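The `symint` kwarg described in the message above can be pictured with a minimal Python sketch. This is only an illustration under stated assumptions: `cpp_argument_type` and the lookup tables are hypothetical names, not torchgen's actual functions, and only a handful of schema types are covered.

```python
# Hypothetical sketch of a codegen type translation controlled by a `symint`
# flag. None of these names are taken from torchgen itself.

# Schema types whose C++ spelling depends on the flag:
# (type used when symint=True, type used when symint=False)
_SYMINT_AWARE = {
    "SymInt": ("c10::SymInt", "int64_t"),
    "SymInt[]": ("c10::SymIntArrayRef", "at::IntArrayRef"),
}

# Schema types that are unaffected by the flag in this sketch.
_PASSTHROUGH = {
    "int": "int64_t",
    "Tensor": "const at::Tensor &",
}


def cpp_argument_type(schema_type: str, *, symint: bool) -> str:
    """Map a schema type to the C++ type used in a generated signature."""
    if schema_type in _SYMINT_AWARE:
        symbolic, concrete = _SYMINT_AWARE[schema_type]
        return symbolic if symint else concrete
    return _PASSTHROUGH[schema_type]


# The same schema yields two C++ signatures, one per flag value.
print(cpp_argument_type("SymInt[]", symint=True))
print(cpp_argument_type("SymInt[]", symint=False))
```

Under this reading, a binding such as `at::empty` would correspond to `symint=False` and `at::empty_symint` to `symint=True`, so each call site only has to decide which flag value it wants.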
Mario Lezcano	f5a3515083	Make linalg.inv composite of linalg.solve (#80074 ) The `getri` kernel calls inside `getrs` so we can do so explicitly ourselves and save ourselves from having to maintain an extra kernel. This way we just need to optimise `lu_factor` and `lu_solve` and `inv` will be as efficient as it can be, as it'll be choosing the best backend to perform the factorisation and the best backend (not necessarily the same) to perform the solve. Fixes https://github.com/pytorch/pytorch/issues/77498 The benchmarks: https://github.com/pytorch/pytorch/pull/80074#issuecomment-1164309071 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80074 Approved by: https://github.com/IvanYashchuk, https://github.com/albanD, https://github.com/malfet	2022-08-25 09:28:55 +00:00
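The idea behind #80074, that an inverse is just the solution of `A X = I`, can be sketched in pure Python. This is a minimal illustration, not the PyTorch implementation: the commit relies on `lu_factor` and `lu_solve`, whereas the hypothetical `solve` helper below uses plain Gauss-Jordan elimination with partial pivoting on nested lists.

```python
def solve(a, b):
    """Solve a @ x = b for x using Gauss-Jordan elimination with partial pivoting.

    `a` is an n x n matrix and `b` an n x k matrix, both as nested lists.
    """
    n = len(a)
    # Build the augmented matrix [a | b] as floats.
    m = [[float(v) for v in a[i]] + [float(v) for v in b[i]] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest absolute value in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot becomes 1.
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [rv - f * cv for rv, cv in zip(m[r], m[col])]
    # The right-hand block now holds the solution.
    return [row[n:] for row in m]


def inv(a):
    """Compute the inverse as the solution of a @ x = identity."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return solve(a, identity)


print(inv([[4, 7], [2, 6]]))
```

Because `inv` is only a thin wrapper, any improvement to the solver benefits it automatically, which is the maintenance argument the commit message makes.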
PyTorch MergeBot	a7edf71360	Revert "Don't introduce new overload for SymInt (#83628 )" This reverts commit `8fae7027b3`. Reverted https://github.com/pytorch/pytorch/pull/83628 on behalf of https://github.com/malfet due to breaking internal builds, see https://www.internalfb.com/diff/D38984222	2022-08-25 00:49:40 +00:00
PyTorch MergeBot	5321bf52f2	Revert "Make linalg.inv composite of linalg.solve (#80074 )" This reverts commit `4737b33614`. Reverted https://github.com/pytorch/pytorch/pull/80074 on behalf of https://github.com/malfet due to Depends on the changes from https://github.com/pytorch/pytorch/pull/83628	2022-08-25 00:43:00 +00:00
Catherine Lee	4a6726a840	use condensed disabled tests file (#84017 ) follow up to https://github.com/pytorch/test-infra/pull/545 then we can get rid of the non condensed version Pull Request resolved: https://github.com/pytorch/pytorch/pull/84017 Approved by: https://github.com/huydhn, https://github.com/janeyx99	2022-08-25 00:34:25 +00:00
Mario Lezcano	3e6e0a1d10	Support a stable double backward on linalg.det for real inputs (#80217 ) The complex case still fails. I do not know why. Fixes https://github.com/pytorch/pytorch/issues/62327 Fixes https://github.com/pytorch/pytorch/issues/53364 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80217 Approved by: https://github.com/nikitaved, https://github.com/albanD, https://github.com/malfet	2022-08-24 15:18:56 +00:00
Mario Lezcano	4737b33614	Make linalg.inv composite of linalg.solve (#80074 ) The `getri` kernel calls inside `getrs` so we can do so explicitly ourselves and save ourselves from having to maintain an extra kernel. This way we just need to optimise `lu_factor` and `lu_solve` and `inv` will be as efficient as it can be, as it'll be choosing the best backend to perform the factorisation and the best backend (not necessarily the same) to perform the solve. Fixes https://github.com/pytorch/pytorch/issues/77498 The benchmarks: https://github.com/pytorch/pytorch/pull/80074#issuecomment-1164309071 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80074 Approved by: https://github.com/IvanYashchuk, https://github.com/albanD, https://github.com/malfet	2022-08-24 15:18:56 +00:00
Edward Z. Yang	0491e1a13a	Support returning symbolic strides from t.stride() in Python (#83842 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83842 Approved by: https://github.com/albanD, https://github.com/Chillee, https://github.com/bdhirsh	2022-08-24 04:32:51 +00:00
Sergii Dymchenko	591222f5d9	Fix use-dict-literal lint (#83718 ) Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR should do the change for every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718 Approved by: https://github.com/albanD	2022-08-24 00:26:46 +00:00
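As a small illustration of the `use-dict-literal` change in #83718 (the variable names here are made up for the example), both spellings build equal dictionaries; the literal form simply avoids looking up and calling the `dict` builtin:

```python
# Before: calls the `dict` builtin, which needs a name lookup plus a call.
options_call = dict()
# After: the literal builds the same empty dict directly.
options_literal = {}

# With initial items, the keyword form and the literal form are equivalent too.
config_call = dict(lr=0.1, momentum=0.9)
config_literal = {"lr": 0.1, "momentum": 0.9}

print(options_call == options_literal)
print(config_call == config_literal)
```

A test that deliberately exercises the constructor, such as the one the commit message exempts, would keep `dict()` on purpose.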
Edward Z. Yang	8fae7027b3	Don't introduce new overload for SymInt (#83628 ) Previously, we introduced new SymInt overloads for every function we wanted. This led to a lot of boilerplate, and also a lot of confusion about how the overloads needed to be implemented. This PR takes a simpler but more risky approach: just take the original function and change its ints to SymInts. This is BC-breaking in the following ways: * The C++ API for registering implementations for aten operators will change from int64_t to SymInt whenever you make this change. Code generated registrations in PyTorch do not change as codegen handles the translation automatically, but manual registrations will need to follow the change. Typically, if you now accept a SymInt where you previously only took int64_t, you have to convert it back manually. This will definitely break XLA, see companion PR https://github.com/pytorch/xla/pull/3914 Note that not all dispatch keys get the automatic translation; all the composite keys and Meta keys are modified to take SymInt directly (because they should handle them directly), and so there are adjustments for this. This is not BC-breaking in the following ways: * The user facing C++ API remains compatible. Even if a function changes from int to SymInt, the default C++ binding still takes only ints (e.g., at::empty(IntArrayRef, ...)). To call with SymInts, you must call at::empty_symint instead. This involved adding two more signatures to CppSignatureGroup; in many cases I refactored code to iterate over all signatures in the group instead of hard-coding the two that previously existed. * This is TorchScript compatible; internally we treat SymInts as ints so there is no change to what happens at runtime in TorchScript. In particular, it's OK to reference an empty schema by its old type (using int types), as long as you're not doing string equality (which you shouldn't be), these parse to the same underlying type. 
Structure of the PR: * The general strategy of this PR is that, even when you write `SymInt` inside `native_functions.yaml`, sometimes, we will treat it as if it were an `int`. This idea pervades the codegen changes, where we have a translation from SymInt to c10::SymInt or int64_t, and this is controlled by a symint kwarg which I added and then audited all call sites to decide which I wanted. Here are some of the major places where we pick one or the other: * The C++ FunctionSchema representation represents `SymInt` as `int`. There are a few places we do need to know that we actually have a SymInt and we consult `real_type()` to get the real type in this case. In particular: * When we do schema validation of C++ operator registration, we must compare against true schema (as the C++ API will provide `c10::SymInt`, and this will only be accepted if the schema is `SymInt`). This is handled with cloneWithRealTypes before we check for schema differences. * In `toIValue` argument parsing, we parse against the true schema value. For backwards compatibility reasons, I do still accept ints in many places where Layout/SymInt/etc were expected. (Well, accepting int where SymInt is expected is not BC, it's just the right logic!) * In particular, because SymInt never shows up as type() in FunctionSchema, this means that we no longer need a dedicated Tag::SymInt. This is good, because SymInts never show up in mobile anyway. * Changes to functorch/aten are mostly about tracking changes to the C++ API registration convention. Additionally, since SymInt overloads no longer exist, registrations for SymInt implementations are deleted. In many cases, the old implementations did not properly support SymInts; I did not add any new functionality with this PR, but I did try to annotate with TODOs where there is work to do. 
Finally, because the signature of `native::` API changed from int to SymInt, I need to find alternative APIs for people who were calling these functions directly. Typically, I insert a new dispatch call when perf doesn't matter, or use `at::compositeexplicitautograd` namespace to handle other cases. * The change to `make_boxed_from_unboxed_functor.h` is so that we accept a plain IntList IValue anywhere a SymIntList is expected; these are read-only arguments so covariant typing is OK. * I change how unboxing logic works slightly. Previously, we interpreted the C++ type for Layout/etc directly as IntType JIT type, which works well because the incoming IValue is tagged as an integer. Now, we interpret the C++ type for Layout as its true type, e.g., LayoutType (change to `jit_type.h`), but then we accept an int IValue for it anyway. This makes it symmetric with SymInt, where we interpret the C++ type as SymIntType, and then accept SymInt and int IValues for it. * I renamed the `empty.names` overload to `empty_names` to make it less confusing (I kept mixing it up with the real empty overload) * I deleted the `empty.SymInt` overload, which ended up killing a pile of functions. (This was originally a separate PR but the profiler expect test was giving me grief so I folded it in.) * I deleted the LazyDynamicOpsTest tests. These were failing after these changes, and I couldn't figure out why they used to be passing: they make use of `narrow_copy` which didn't actually support SymInts; they were immediately converted to ints. * I bashed LTC into working. The patches made here are not the end of the story. The big problem is that SymInt translates into Value, but what if you have a list of SymInt? This cannot be conveniently represented in the IR today, since variadic Values are not supported. 
To work around this, I translate SymInt[] into plain int[] (this is fine for tests because LTC dynamic shapes never actually worked); but this will need to be fixed for proper LTC SymInt support. The LTC codegen also looked somewhat questionable; I added comments based on my code reading. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83628 Approved by: https://github.com/albanD, https://github.com/bdhirsh	2022-08-23 22:04:07 +00:00
Driss Guessous	7cfc8b7820	[MPS] Move mps_linear to mps dispatch key (#80068 ) Fixes #77394 This is related to #79920 which adds linear support for nested tensors. Codegen still throws an assert stopping this from compiling. However I tested locally by commenting out this assert: `61305cd638/tools/autograd/gen_variable_type.py (L798)` and the intended behavior appears to be working. I am not sure what changes need to be made to codegen to make this work. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80068 Approved by: https://github.com/albanD, https://github.com/malfet, https://github.com/kulinseth	2022-08-23 01:13:17 +00:00
chenlai	7aba6f8e7b	Rename flatbuffer_serializer to _mobile or _full_jit (#82827 ) The target named `flatbuffer_serializer` in fbcode depends on the full JIT, while the one in xplat depends on mobile only. Rename them accordingly ``` flatbuffer_serializer in fbcode -> flatbuffer_serializer_full_jit flatbuffer_serializer in xplat -> flatbuffer_serializer_mobile ``` so it's more readable. Differential Revision: [D38413369](https://our.internmc.facebook.com/intern/diff/D38413369/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D38413369/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/82827 Approved by: https://github.com/qihqi	2022-08-19 01:29:46 +00:00
Mario Lezcano	88d3acd6b1	Fix and improve the efficiency of the backward of xlog* functions. (#82713 ) That is `xlogy`, `special.xlogy`, `special.xlog1py`. Fixes https://github.com/pytorch/pytorch/issues/80770 Fixes https://github.com/pytorch/pytorch/issues/74279 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82713 Approved by: https://github.com/albanD	2022-08-18 21:55:42 +00:00
Mario Lezcano	aad89bb771	Make the derivative of masked_fill more efficient (#83515 ) There's no need to add all the zeros if we extract all the non-zero elements. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83515 Approved by: https://github.com/albanD, https://github.com/soulitzer	2022-08-18 13:00:12 +00:00
Mengwei Liu	badbdb0330	[torchgen] Relax the restriction on number of custom namespaces (#83580 ) Summary: We started to see use cases where it involves more than 1 custom namespace to live within the same yaml file. Hence relaxing the restriction that 1 yaml file can only have 1 custom namespace other than `aten`. Updated unit test as well. Differential Revision: D38775685 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83580 Approved by: https://github.com/JacobSzwejbka	2022-08-18 04:47:13 +00:00
Edward Z. Yang	52be908225	Delete unnecessary sum.SymInt overload (#83591 ) Dims argument only ever takes dimensions, which we do not need to SymInt-ify. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83591 Approved by: https://github.com/albanD	2022-08-18 02:00:50 +00:00
Jay Chae	451c6296af	[kineto] deprecate USE_KINETO_UPDATED (#83305 ) Summary: This is used to do cross repo updates but has not been cleaned up properly Test Plan: CI Reviewed By: aaronenyeshi Differential Revision: D38633379 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83305 Approved by: https://github.com/aaronenyeshi	2022-08-17 22:31:49 +00:00
Mikayla Gawarecki	bd0ad7a84f	Add backward support for rudimentary NestedTensor.sum(dim) (#82625 ) Per offline discussion, this will be updated to use expand once expand semantics for nested tensor have been fleshed out. Next steps will be to add support for other features for forward sum mentioned on #82387 and likewise update the backward Pull Request resolved: https://github.com/pytorch/pytorch/pull/82625 Approved by: https://github.com/albanD	2022-08-17 18:12:00 +00:00
Larry Liu	11d4d91bdc	[torchgen] Add logic in annotation parser to accept alias set (#83501 ) Extending the current regex in `model.py` to support annotation alias set. See issue #83214. Ideally we should have a full fledged lexer similar to `schema_type_parser.cpp`, since regex can be more and more difficult to read if we add more support to it. Adding this to unblock this issue for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83501 Approved by: https://github.com/SherlockNoMad	2022-08-17 07:04:25 +00:00
Justin Chu	cd68f08992	[ONNX] Update the script for version updates (#83283 ) This PR updates the `tools/onnx/update_default_opset_version.py` script to ensure files are edited correctly to prepare for the opset 17 support in torch.onnx. - (clean up) Move script to `main()` - Add a `--skip_build` option to avoid building pytorch if we want to rerun the process due to errors after compilation is done - Update to edit the correct files now that the onnx files were refactored Pull Request resolved: https://github.com/pytorch/pytorch/pull/83283 Approved by: https://github.com/thiagocrepaldi, https://github.com/AllenTiTaiWang, https://github.com/abock	2022-08-16 22:28:54 +00:00
Nikita Shulga	a8941aa996	[BE] Better test stats errors (#83484 ) When `BUILD_ENVIRONMENT` is not defined, print sensible error message Which is better than: ``` Could not download https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/test-times.json because: 'BUILD_ENVIRONMENT' ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/83484 Approved by: https://github.com/huydhn, https://github.com/ZainRizvi	2022-08-16 07:51:12 +00:00
Edward Z. Yang	2d8f091f6a	Move TorchDispatchModeTLS to c10/core (#83370 ) I need to access it directly from TensorImpl to route directly TensorImpl induced operations to modes (upcoming PR). Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83370 Approved by: https://github.com/zou3519	2022-08-15 17:59:57 +00:00
Mor Tzur	316cb8a06a	embedded_interpreter_hip (#83329 ) Summary: Adding embedded_interpreter_hip and deps to enable torch::deploy on AMD. Test Plan: Sandcastle Reviewed By: zrphercule Differential Revision: D38546701 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83329 Approved by: https://github.com/jfix71	2022-08-15 15:08:55 +00:00
Mengwei Liu	d0d6b1f222	[torchgen] Generate out variant for functional operator (#81437 ) Summary: Previously we didn't generate an out variant (both schema and kernel) for an operator with a functional variant only. This adds support for that and adds a test. ## Changes on `native_function_generation.py` We are generating out variants for all functional variants if possible. This PR introduces a lot of newly generated out variants and `native_functions.yaml` needs to incorporate the changes by adding `autogen` keywords. The logic for determining what operators we should generate an out variant for is the following: 1. No existing out variant for this `NativeFunction` 2. Contains an existing in place, mutable or functional variant 3. Contains at least 1 tensor-like return For operators matching the first two conditions but failing the third, I listed them in `FUNCTIONAL_OPS_THAT_CANNOT_GET_AN_OUT_VARIANT`. ## Special handling The following operators satisfy all 3 criteria above but we chose not to autogen them, for the following reasons. * `mkldnn_adaptive_avg_pool2d`, the generated out variant `mkldnn_adaptive_avg_pool2d.out` collides with the `mkldnn_adaptive_avg_pool2d_out` kernel in the `adaptive_avg_pool2d.out` operator. I manually created `mkldnn_adaptive_avg_pool2d.out` and renamed `mkldnn_adaptive_avg_pool2d_out` to `mkldnn_adaptive_avg_pool2d_out_stub`. * `min`, `max` and `mean`. There already exist `min.out`, `max.out` and `mean.out` but they have different semantics from the functional ones. I manually created `min.unary_out`, `max.unary_out` and `mean.dtype_out` to disambiguate. ## Autograd Changes We introduced logic to not match derivative info in `derivatives.yaml` to out variants, since we are generating `NOT_IMPLEMENTED` kernels for those out variants anyway. The issue we are seeing with the original logic is that it doesn't handle `TensorOption` arguments really well. For example we have these two operators: * `_to_copy(Tensor self, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, bool non_blocking=False, MemoryFormat? memory_format=None) -> Tensor` * `_to_copy.out(Tensor self, *, bool non_blocking=False, MemoryFormat? memory_format=None, Tensor(a!) out) -> Tensor(a!)` If we use the `_to_copy` derivative info, there will be a compilation error since `dtype` is missing from the `_to_copy.out` signature. Test Plan: Rely on unit test Differential Revision: D37832342 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81437 Approved by: https://github.com/iseeyuan, https://github.com/bdhirsh	2022-08-13 05:44:53 +00:00
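The three-step eligibility rule listed in #81437 can be sketched as a small predicate. This is a sketch under stated assumptions: `OpGroup`, its fields, and `should_generate_out_variant` are hypothetical names chosen for illustration and are not the data structures in `native_function_generation.py`; the `denylist` parameter stands in for the manually handled operators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpGroup:
    """Hypothetical summary of the variants that exist for one operator."""

    name: str
    has_out: bool
    has_inplace: bool
    has_mutable: bool
    has_functional: bool
    num_tensor_returns: int


def should_generate_out_variant(op: OpGroup, denylist: frozenset = frozenset()) -> bool:
    """Apply the three conditions from the commit message, in order."""
    if op.name in denylist:
        # Operators handled manually are skipped outright.
        return False
    if op.has_out:
        # 1. There must be no existing out variant.
        return False
    if not (op.has_inplace or op.has_mutable or op.has_functional):
        # 2. There must be an in-place, mutable or functional variant.
        return False
    # 3. There must be at least one tensor-like return.
    return op.num_tensor_returns >= 1


functional_only = OpGroup("foo", False, False, False, True, 1)
already_has_out = OpGroup("bar", True, False, False, True, 1)
print(should_generate_out_variant(functional_only))
print(should_generate_out_variant(already_has_out))
```

An operator that passes the first two checks but returns no tensor falls through to the last line and is rejected, which matches the group the message says is listed in `FUNCTIONAL_OPS_THAT_CANNOT_GET_AN_OUT_VARIANT`.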
Nikolay Korovaiko	88d7322b07	fix a comment since the options in arg parser no longer require Declarations.yaml (#83337 ) fix a comment since the options in arg parser no longer require Declarations.yaml Pull Request resolved: https://github.com/pytorch/pytorch/pull/83337 Approved by: https://github.com/albanD	2022-08-12 21:10:41 +00:00
Edward Z. Yang	d423722607	Add data_dependent_output tag; generalize proxy tensor to test it (#83312 ) Fixes https://github.com/pytorch/pytorch/issues/83251 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83312 Approved by: https://github.com/albanD	2022-08-12 17:31:55 +00:00
Yifan Shen	7f18ef14c1	Register nested matmul as an addition to CompositeImplicit (#82786 ) The initial matmul_nested in [#81957](https://github.com/pytorch/pytorch/pull/81957) is imperfect: * it is allowed now to register another kernel in addition to CompositeImplicit * so we should do that, instead of the code smell is_nested() Pull Request resolved: https://github.com/pytorch/pytorch/pull/82786 Approved by: https://github.com/albanD	2022-08-11 21:46:05 +00:00
richard	382ef1fda7	Autograd graphtask trim unnecessary edges (#82544 ) ### Introduction Removing unnecessary weight gradient calculation is very important for applications that need high-order derivatives during training. However, this is not supported by the current Autograd engine. For more detail: The backward function of a `matmul` operator (e.g., `linear`, `addmm`, `mm`) has two matmuls, one for `input gradient` and another for `weight gradient`. For a typical neural network (nn) with a few linear layers and activation functions, if the user calls `torch.autograd.grad()` to calculate the derivative of the nn output `y` w.r.t the nn input `x`, only the `input gradient` of the `matmul` operator is needed, and the `weight gradient` is discarded. However, the current PyTorch autograd engine will always calculate the `weight gradient` if `weight` requires gradient (the calculation of the high-order derivative is performed during training). The figure attached shows the autograd graph of the following code snippet: ```py y = torch.nn.functional.linear(x, weight, bias) y = y.pow(2) # first order derivative y__x, = torch.autograd.grad(y, x, grad_outputs=grad_outputs, create_graph=True) # second order derivative y__x__x, = torch.autograd.grad(y__x, x, grad_outputs=grad_outputs, create_graph=True) ``` The path with ❌ is not needed when calculating derivatives. <img width="50%" alt="image" src="https://user-images.githubusercontent.com/9999318/182018117-719c5a23-bcc6-4a63-8e8d-1bca3ebda2e3.png"> ### Issue Related issue: https://github.com/pytorch/pytorch/issues/56500 ### Method When calling `torch.autograd.grad`, `exec_info_` is created for each GraphTask, which allows filtering paths on the graph that are not needed. However, when the GraphTask calls into the node, the node still does not know whether the edges are needed or not. 
In the case of matmul, `weight.requires_grad is True` so the weight gradient is always calculated. Following https://github.com/pytorch/pytorch/issues/56500#issuecomment-825694656, this PR passes the graph task's thread_local `exec_info_` into the node, so it could trim unnecessary edges during `torch.autograd.grad` calls. ### Benchmark Benchmark script: https://gist.github.com/yueyericardo/24158433a2021c51eeef9c3e2722df99 Benchmark result: 6 hidden layers, batch size 10000, on A100 FP32 result \| hessian benchmark \| FP32 (before) \| FP32 (After) \| FP32 (Functorch v0.1.1) \| \| ----------------------------- \| ------------- \| ----------------- \| ----------------------- \| \| Linear + ReLU (no backward) \| 55.658 ms \| 29.392 ms (1.90X) \| 29.547 ms (1.90X) \| \| Linear + ReLU (with backward) \| 81.173 ms \| 54.917 ms (1.47X) \| 68.988 ms (1.18X) \| TF32 result \| hessian benchmark \| TF32 (before) \| TF32 (after) \| TF32 (Functorch v0.1.1) \| \| ----------------------------- \| ------------- \| ----------------- \| ----------------------- \| \| Linear + ReLU (no backward) \| 19.801 ms \| 11.259 ms (1.76X) \| 10.754 ms (1.84X) \| \| Linear + ReLU (with backward) \| 29.167 ms \| 20.466 ms (1.42X) \| 22.784 ms (1.28X) \| For FP32 result, we could get 1.9X speed up for hessian calculation, and 1.47X speed up during training, which is even faster than functorch `vmap(jacfwd(jacrev` implementation. (functorch has performance regression on v0.2.0, https://github.com/pytorch/functorch/issues/989, so we are using v0.1.1 for benchmark) @zou3519 does functorch also includes similar optimizations during hessian calculation? If not, what do we need to do so the functorch could also benefit from this PR? ### Testing <!-- How did you test your change? 
--> - [x] we need to figure out a way for unittest ### Thanks Thanks for the great blog: [How Computational Graphs are Executed in PyTorch \| PyTorch](https://pytorch.org/blog/how-computational-graphs-are-executed-in-pytorch/) cc @zasdfgbnm @albanD Pull Request resolved: https://github.com/pytorch/pytorch/pull/82544 Approved by: https://github.com/soulitzer	2022-08-11 18:50:09 +00:00
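The edge-trimming idea in the entry above can be illustrated with a toy graph, without PyTorch. This is only a sketch under stated assumptions: `mark_needed`, `trim_edges`, and the node names are hypothetical and do not correspond to the autograd engine's real data structures. It shows how a per-task "needed" map lets a node drop the edges that cannot reach a requested input.

```python
def mark_needed(graph, root, targets):
    """Return {node: bool}: does this node lead to one of the requested targets?

    `graph` maps a backward node to the nodes it feeds gradients into
    (a hypothetical stand-in for autograd edges; assumed to be a DAG).
    """
    needed = {}

    def visit(node):
        if node in needed:
            return needed[node]
        result = node in targets
        for nxt in graph.get(node, []):
            # Visit every successor (no short-circuit) so all nodes get a flag.
            if visit(nxt):
                result = True
        needed[node] = result
        return result

    visit(root)
    return needed


def trim_edges(graph, needed):
    """Keep only the outgoing edges that lead to a requested target."""
    return {node: [n for n in edges if needed[n]] for node, edges in graph.items()}


# Toy version of linear -> pow: only the gradient w.r.t. `x` is requested.
graph = {
    "pow_backward": ["linear_backward"],
    "linear_backward": ["x", "weight", "bias"],
}
needed = mark_needed(graph, "pow_backward", targets={"x"})
trimmed = trim_edges(graph, needed)
print(trimmed["linear_backward"])
```

In this toy model the `weight` and `bias` edges are dropped, which corresponds to skipping the weight-gradient matmul when only the input gradient is requested.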
Mengwei Liu	c322fc03a1	[torchgen] Fix selective build error on custom namespace (#83141 ) Summary: Currently `SelectiveBuilder` is hardcoding namespace `aten` for operators. This is not working anymore since operators started to have custom namespaces. This fixes it. Test Plan: Rely on newly added unit test Differential Revision: D38565527 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83141 Approved by: https://github.com/JacobSzwejbka	2022-08-10 21:27:05 +00:00
PyTorch MergeBot	f534b2c627	Revert "Remove split functional wrapper (#74727 )" This reverts commit `a58876ace7`. Reverted https://github.com/pytorch/pytorch/pull/74727 on behalf of https://github.com/seemethere due to Fails internal use cases, might extend out to external use cases as well. Need to assess overall impact of this change more widely	2022-08-10 19:45:23 +00:00
Mikayla Gawarecki	e3e33cfae0	Enable codegen of per-dispatch key derivative formulas in derivatives.yaml (#82801 ) `derivatives.yaml` can now take a `dispatch` entry which registers per-autograd dispatch key derivatives such as

```
name: foo(Tensor self, Tensor y) -> Tensor
dispatch:
  Default:
    x: grad
    y: grad.expand(y.sizes())
  AutogradNestedTensor:
    x: grad
    y: NestedTensor_foo_backward(grad, y)
output_differentiability: [True]
```

The old schema, where there is no `dispatch` entry, is still supported. Would greatly appreciate feedback on how to improve the testing strategy of this PR. I have currently registered an aten test op in TestOps.cpp with dummy gradients in derivatives.yaml and have some tests in test_autograd.py:TestAutogradMultipleDispatch, but I am not sure whether these are sufficiently rigorous. Additionally, this PR makes the assumption that sets like [VIEW_FUNCTIONS](`ff5399e528/tools/autograd/gen_inplace_or_view_type.py (L60)`) are per-native-function and not per-native-function-and-dispatch-key. I'm not sure whether this is necessarily the case: would there ever be a situation where, e.g., a nested_tensor op is a view op but the aten function is not, or vice versa? Pull Request resolved: https://github.com/pytorch/pytorch/pull/82801 Approved by: https://github.com/bhosmer, https://github.com/albanD	2022-08-10 19:26:29 +00:00
Peter Bell	a58876ace7	Remove split functional wrapper (#74727 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/74727 Approved by: https://github.com/albanD, https://github.com/khabinov	2022-08-10 17:57:48 +00:00
Kurt Mohler	be5b3df6cc	Update `std_mean/var_mean/nanmean/nansum` signatures with `int[1]? dim` (#82912 ) ### Description Change the type of the `dim` arg for `std_mean/var_mean/nanmean/nansum` to `int[1]?` in `native_functions.yaml` ### Issue Part of #29137 ### Testing Pull Request resolved: https://github.com/pytorch/pytorch/pull/82912 Approved by: https://github.com/albanD	2022-08-10 16:58:26 +00:00
Nicolas Macchioni	b236352036	Add mask identifier for multiplexed src_mask/src_key_padding_mask in BT (#81947 ) Summary: Transformer fastpath multiplexes two arguments, src_mask [seq_len x seq_len] and src_key_padding_mask [batch_size x seq_len], and later deduces the type based on mask shape. In the event that batch_size == seq_len, any src_mask is wrongly interpreted as a src_key_padding_mask. This is fixed by requiring a mask_type identifier be supplied whenever batch_size == seq_len. Additionally, added support for src_mask in masked_softmax CPU path. Test Plan: existing unit tests + new unit tests (batch_size == seq_len) Differential Revision: D37932240 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81947 Approved by: https://github.com/zrphercule	2022-08-09 23:42:16 +00:00
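The shape ambiguity described in this entry can be sketched in plain Python. The function name `infer_mask_type` and the string labels are assumptions made for illustration; they are not the actual fastpath API. The sketch only shows why a shape-based guess breaks down when batch_size == seq_len and why an explicit identifier resolves it.

```python
def infer_mask_type(mask_shape, batch_size, seq_len, mask_type=None):
    """Guess which mask was passed from its shape (hypothetical helper).

    src_mask has shape (seq_len, seq_len);
    src_key_padding_mask has shape (batch_size, seq_len).
    """
    if mask_type is not None:
        # An explicit identifier always wins; no guessing needed.
        return mask_type

    looks_like_src_mask = mask_shape == (seq_len, seq_len)
    looks_like_padding_mask = mask_shape == (batch_size, seq_len)

    if looks_like_src_mask and looks_like_padding_mask:
        # Only possible when batch_size == seq_len: the shape is ambiguous.
        raise ValueError("ambiguous mask shape; pass mask_type explicitly")
    if looks_like_src_mask:
        return "src_mask"
    if looks_like_padding_mask:
        return "src_key_padding_mask"
    raise ValueError(f"unexpected mask shape {mask_shape}")


print(infer_mask_type((6, 6), batch_size=4, seq_len=6))
print(infer_mask_type((4, 6), batch_size=4, seq_len=6))
print(infer_mask_type((5, 5), batch_size=5, seq_len=5, mask_type="src_mask"))
```

Raising on the ambiguous case, rather than silently picking one interpretation, mirrors the requirement stated in the commit that the identifier be supplied whenever the two shapes coincide.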
soulitzer	b55f9047e1	Add forward AD support for elu_, celu_, selu_ (#83080 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83080 Approved by: https://github.com/albanD	2022-08-09 20:15:44 +00:00
Natalia Gimelshein	e77d4ec5eb	fix where backward to use scalar 0 (#83043 ) Per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/83043 Approved by: https://github.com/Chillee	2022-08-09 16:27:44 +00:00
Shunting Zhang	943553965e	support custom class in torchgen schema parser (#82925 ) Differential Revision: [D38480514](https://our.internmc.facebook.com/intern/diff/D38480514/) torchgen schema parser does not support parsing function schemas using custom class so far. Here is an example: ``` quantized::conv2d_relu.new(Tensor qx, __torch__.torch.classes.quantized.Conv2dPackedParamsBase packed_weight, float output_scale, int output_zero_point) -> (Tensor) ``` This PR parse custom class name and encapsulate that into an object of CustomClassType. The only thing we need right now is just store the string class name and return that in `__str__` method. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82925 Approved by: https://github.com/ezyang, https://github.com/bdhirsh	2022-08-08 22:24:43 +00:00
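A minimal sketch of the approach described in this entry: recognize a custom class type string, keep only its name, and print it back through `__str__`. The regular expression and the `parse_custom_class` helper are assumptions for illustration and are not taken from the torchgen source; only the class name `CustomClassType` comes from the commit message.

```python
import re
from dataclasses import dataclass

# Assumed pattern: custom classes are spelled __torch__.torch.classes.<dotted name>.
_CUSTOM_CLASS_RE = re.compile(r"^__torch__\.torch\.classes\.([A-Za-z0-9_.]+)$")


@dataclass(frozen=True)
class CustomClassType:
    """Stores just the class name and reproduces the schema spelling in __str__."""

    class_name: str

    def __str__(self) -> str:
        return f"__torch__.torch.classes.{self.class_name}"


def parse_custom_class(type_str):
    """Return a CustomClassType if the string names a custom class, else None."""
    match = _CUSTOM_CLASS_RE.match(type_str)
    return CustomClassType(match.group(1)) if match else None


parsed = parse_custom_class("__torch__.torch.classes.quantized.Conv2dPackedParamsBase")
print(parsed.class_name)
print(str(parsed))
```

Storing only the string keeps the parser change small: the schema can be round-tripped without the parser needing to know anything about the class itself.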
Nikolay Korovaiko	35b4ac4eeb	remove unused/debug header (#82845 ) ### Description Missed one of the review comments in https://github.com/pytorch/pytorch/pull/82731 . Namely, to remove an unused `<iostream>` that was used for debugging Pull Request resolved: https://github.com/pytorch/pytorch/pull/82845 Approved by: https://github.com/Chillee, https://github.com/albanD	2022-08-08 21:40:17 +00:00
Peter Bell	4f255dbfb3	Remove manual bindings for arange (#81380 ) The functional variant of one of the `arange` overloads has a schema mismatch with the out variant. The functional one has `Scalar step`, but the corresponding out variant has `Scalar step=1`. This isn't allowed, so it had to be special-cased in the python codegen and manually bound. This adds the default `step` value to the functional overload and removes the special-casing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81380 Approved by: https://github.com/ngimel	2022-08-07 00:10:27 +00:00
Peter Bell	adc5e7d32e	Remove manual bindings for linspace, logspace and full (#81378 ) These functions are bound manually because their default dtype isn't always the same as `torch.get_default_dtype()`. This was necessary because the python binding codegen effectively translated `ScalarType? dtype=None` to `ScalarType dtype=torch.get_default_dtype()`. I've fixed the python bindings generator to correctly pass through `None`, and thus we can safely remove the manual bindings. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81378 Approved by: https://github.com/ngimel	2022-08-07 00:10:27 +00:00
Nikolay Korovaiko	bfebf254dd	Re-land sym_numel (#82374 ) (#82726 ) (#82731 ) (#82855 ) ### Description This is a reland of (#82374) (#82726) (#82731). This PR has no extra fixes; it simply updates the pin to point to the XLA side that has the corresponding changes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82855 Approved by: https://github.com/ezyang, https://github.com/qihqi	2022-08-05 03:36:09 +00:00
PyTorch MergeBot	78bd95b13a	Revert "Re-land sym_numel (#82374 ) (#82726 ) (#82731 )" This reverts commit `c90e00cf85`. Reverted https://github.com/pytorch/pytorch/pull/82731 on behalf of https://github.com/zengk95 due to This is breaking XLA tests on trunk. It seems to have passed on PR and was able to checkout that commit `c90e00cf85`.	2022-08-04 22:45:26 +00:00
Nikolay Korovaiko	c90e00cf85	Re-land sym_numel (#82374 ) (#82726 ) (#82731 ) This PR relands sym_numel #82374 and fixes the ios build break in this commit : `8cbd0031c5` which was a type mismatch in an equality. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82731 Approved by: https://github.com/malfet	2022-08-04 21:05:24 +00:00
Michael Gschwind	82f558feee	Allow user to assert no mask contiguous check is necessary (#82533 ) Summary: Allow user to assert no mask contiguous check is necessary: (1) prevents a sync event which would disrupt CUDA Graph collection, and (2) offers slightly better performance by avoiding a sync. This needs to be a separate opt-in option because we change the behavior of malformed masks. It's the only way to get BT into CUDA Graph based on what I understood about CUDA Graph collection from ngimel. Test Plan: sandcastle unit tests Differential Revision: D38040418 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82533 Approved by: https://github.com/jbschlosser, https://github.com/zrphercule	2022-08-04 17:57:57 +00:00
zengk95	d0e6e5a5bb	Revert "sym_numel (#82374 )" (#82726 ) TSIA It looks like this PR #82374 is breaking mac builds on trunk but I can't revert it normally since there's a merge conflict in the XLA hash. <img width="1753" alt="image" src="https://user-images.githubusercontent.com/34172846/182644661-b7fdda4b-e5ce-45c3-96a2-ad6737d169ae.png"> I reverted it and resolved the conflict using the old XLA hash that this commit was based upon Pull Request resolved: https://github.com/pytorch/pytorch/pull/82726 Approved by: https://github.com/albanD, https://github.com/janeyx99	2022-08-03 15:23:47 +00:00
Nikolay Korovaiko	fd68b0931f	sym_numel (#82374 ) ### Description This PR makes `numel` symint-aware similar to `sym_sizes()` and `sym_strides()`. Similar to https://github.com/pytorch/pytorch/pull/81300 . This PR is the part of a bigger project to support dynamic_shapes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82374 Approved by: https://github.com/ezyang	2022-08-03 06:33:45 +00:00
Edward Z. Yang	df69660832	Revert "Revert "Add a lint rule for torch/csrc/util/pybind.h include (#82552 )"" (#82599 ) This reverts commit `532b8a9e00`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82599 Approved by: https://github.com/albanD	2022-08-02 19:37:02 +00:00
Elias Ellison	642aed8b99	Add Autocast Support for FakeTensors / use fake device dispatch keys (#82449 ) From PR: ``` Note: [Fake Tensor Dispatch Keys] In order to model the behavior of device-specific autocast and autograd logic, we update the dispatch keys of FakeTensors to reflect their fake device. This includes the BackendComponent (DispatchKey::Meta -> DispatchKey::CUDA), and also the BackendComponent related Autocast and Autograd keys. __torch_dispatch__ sits below Autocast and Autograd, and is only invoked when we are at the kernel for the BackendComponent. Then, we add Meta to the thread-local dispatch include set to hit the meta kernel instead of the kernel of the BackendComponent for the fake device. ``` Also adds the `conv1/2/3d.padding` operators to the Autocast rule set. Without that fix, the FakeTensor dtype would diverge. See: https://github.com/pytorch/pytorch/issues/81608 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82449 Approved by: https://github.com/ezyang	2022-08-01 21:40:36 +00:00
PyTorch MergeBot	532b8a9e00	Revert "Add a lint rule for torch/csrc/util/pybind.h include (#82552 )" This reverts commit `9465c0e0b5`. Reverted https://github.com/pytorch/pytorch/pull/82552 on behalf of https://github.com/zengk95 due to This seems to be breaking windows binary wheels	2022-08-01 20:25:35 +00:00
kshitij12345	e93b5210ec	[composite compliance] allclose, linalg_eig (#82437 ) Ref: #69991 Make `allclose` CompositeExplicit as it calls `item` (we can't get away from it) which makes it non Composite Compliant. `linalg_eig` backward passes CompositeCompliance as it calls on `allclose` Pull Request resolved: https://github.com/pytorch/pytorch/pull/82437 Approved by: https://github.com/zou3519	2022-08-01 18:01:15 +00:00
Edward Z. Yang	9465c0e0b5	Add a lint rule for torch/csrc/util/pybind.h include (#82552 ) We define specializations for pybind11 defined templates (in particular, PYBIND11_DECLARE_HOLDER_TYPE) and consequently it is important that these specializations always be #include'd when making use of pybind11 templates whose behavior depends on these specializations, otherwise we can cause an ODR violation. The easiest way to ensure that all the specializations are always loaded is to designate a header (in this case, torch/csrc/util/pybind.h) that ensures the specializations are defined, and then add a lint to ensure this header is included whenever pybind11 headers are included. The existing grep linter didn't have enough knobs to do this conveniently, so I added some features. I'm open to suggestions for how to structure the features better. The main changes: - Added an --allowlist-pattern flag, which turns off the grep lint if some other line exists. This is used to stop the grep lint from complaining about pybind11 includes if the util include already exists. - Added --match-first-only flag, which lets grep only match against the first matching line. This is because, even if there are multiple includes that are problematic, I only need to fix one of them. We don't /really/ need this, but when I was running lintrunner -a to fixup the preexisting codebase it was annoying without this, as the lintrunner overall driver fails if there are multiple edits on the same file. I excluded any files that didn't otherwise have a dependency on torch/ATen, this was mostly caffe2 and the valgrind wrapper compat bindings. Note the grep replacement is kind of crappy, but clang-tidy lint cleaned it up in most cases. See also https://github.com/pybind/pybind11/issues/4099 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82552 Approved by: https://github.com/albanD	2022-08-01 17:16:58 +00:00
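The two linter knobs described in this entry can be sketched as a small standalone function. This is an illustrative approximation only: `find_violations` and its parameters are hypothetical names, not the real grep linter's implementation or command-line interface. It shows the intended semantics of an allowlist pattern (suppress the lint if some other line exists) and of matching only the first hit.

```python
import re


def find_violations(lines, pattern, allowlist_pattern=None, match_first_only=False):
    """Return 1-based line numbers that match `pattern` (hypothetical helper).

    If any line matches `allowlist_pattern`, the whole file is considered fine.
    With `match_first_only`, reporting stops after the first matching line.
    """
    if allowlist_pattern is not None and any(
        re.search(allowlist_pattern, line) for line in lines
    ):
        return []

    hits = []
    for number, line in enumerate(lines, start=1):
        if re.search(pattern, line):
            hits.append(number)
            if match_first_only:
                break
    return hits


pybind_include = r"#include <pybind11/"
util_include = r"#include <torch/csrc/util/pybind\.h>"

bad_file = ["#include <pybind11/pybind11.h>", "#include <pybind11/stl.h>"]
good_file = ["#include <torch/csrc/util/pybind.h>"] + bad_file

print(find_violations(bad_file, pybind_include, util_include))
print(find_violations(bad_file, pybind_include, util_include, match_first_only=True))
print(find_violations(good_file, pybind_include, util_include))
```

Reporting only the first hit matches the motivation given in the commit: fixing a single include is enough, and it avoids several edits landing on the same file.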
Edward Z. Yang	a9320e6d96	Delete SymInt::data() in favor of as_int_unchecked() (#82477 ) I audited all the sites while I was at it, and marked a few suspicious ones. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82477 Approved by: https://github.com/Chillee	2022-08-01 15:07:22 +00:00
Edward Z. Yang	50e8abbcad	Change SymIntNode into an intrusive pointer (#82548 ) This will make the pointer type a single word, which is important for packing it into an int64_t. This time, this diff doesn't segfault when you build with DEBUG mode; more details at https://github.com/pybind/pybind11/issues/4099 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82548 Approved by: https://github.com/albanD	2022-08-01 15:07:21 +00:00
YifanShenSZ	4bb7e148c4	add nested tensor matmul support (#81957 ) There was a discussion on whether letting nested tensor `reshape` support collapsing and splitting dimension 0. The conclusion was to make reshape simple, so we need a tweaked `matmul`, which only supports 3+ dimension nonbroadcast case, i.e. a generalized `bmm`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81957 Approved by: https://github.com/jbschlosser	2022-07-30 22:35:09 +00:00
Kurt Mohler	14d0296e5c	Rename `_Typed/_UntypedStorage` to `Typed/UntypedStorage` and update docs (#82438 ) ### Description Since the major changes for `_TypedStorage` and `_UntypedStorage` are now complete, they can be renamed to be public. `TypedStorage._untyped()` is renamed to `TypedStorage.untyped()`. Documentation for storages is improved as well. ### Issue Fixes #82436 ### Testing N/A Pull Request resolved: https://github.com/pytorch/pytorch/pull/82438 Approved by: https://github.com/ezyang	2022-07-30 19:37:08 +00:00
Mengwei Liu	301fe8c27d	[torchgen] Fix multiple backends with custom namespace (#82133 ) Summary: Some quantized operators need the `QuantizedCPU` backend. Due to an issue in namespace checking, if we currently have two backends as well as a custom namespace in a native function, codegen hits an assertion error. This PR fixes this issue. The root cause is that codegen right now asserts that a native function should only have one namespace. The current behavior is that if a native function is not found in a `BackendIndex`, we use the default namespace for that backend, for fallback kernels. However, that default namespace may not be listed in the yaml file and it should not be counted when checking if we have two different namespaces for that backend. In our error case, we have 2 `BackendIndex`, one for `QuantizedCPU` and one for `CPU`. The native function doesn't have a kernel in `QuantizedCPU` but we still use a default namespace (`at::native`) for it. Since we have a custom namespace for dispatch key `CPU`, we ran into the assertion error. This PR changes the assertion criteria. We only error out if a namespace has two or more kernels and they have two or more different namespaces. Test Plan: rely on newly added unit test Differential Revision: D38101345 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82133 Approved by: https://github.com/iseeyuan	2022-07-29 22:53:58 +00:00
Shintaro Iwasaki	ccd30a12a2	[PyTorch][Kineto] add ActivityType.h when USE_KINETO is not set (#82028 ) Summary: This patch fixes an error "'ActivityType.h' file not found" when `use_kineto()` is false. ## Problem Even when `use_kineto()` is not set (i.e., `-DUSE_KINETO` is not passed), `ActivityType.h` is required for PyTorch compilation: https://github.com/pytorch/pytorch/blob/master/torch/csrc/profiler/kineto_shim.h#L15 ## Solution Add `ActivityType.h` dependency even when `use_kineto() == False`. Test Plan: PyTorch internal and external CI tests. Differential Revision: D38090153 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82028 Approved by: https://github.com/kit1980, https://github.com/robieta	2022-07-29 20:57:59 +00:00
Peter Bell	ba4727d4e5	Codegen: Parse deprecated signatures as a full FunctionSchema (#82179 ) Deprecated signatures are currently "parsed" manually to find the relative order of the argument names and all other information is inferred from the aten schema for the non-deprecated overload. However, this leads to problems if the argument names don't match or if there are multiple candidates that match the ATen function call. Instead, this makes the deprecated function a full FunctionSchema and so the entire python signature comes solely from the deprecated schema, with the `aten:` clause only used for the dispatch lambda call. I have confirmed locally that there is no change to `python_torch_functionsEverything.cpp`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82179 Approved by: https://github.com/albanD	2022-07-29 17:19:54 +00:00
Kurt Mohler	2bfae07a79	Enable `dim=None` for `torch.mean` (#81286 ) Part of #79525 This will require coordination with XLA before merging, just like #79881 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81286 Approved by: https://github.com/albanD	2022-07-28 22:34:56 +00:00
Edward Z. Yang	fd5ac1e6b5	Rename SymbolicIntNode to SymIntNodeImpl (#82350 ) Done via ``` git grep -l 'SymbolicIntNode' | xargs sed -i 's/SymbolicIntNode/SymIntNodeImpl/g' ``` Reasoning for the change: * Sym is shorter than Symbolic, and consistent with SymInt * You usually will deal in shared_ptr<...>, so we're going to reserve the shorter name (SymIntNode) for the shared pointer. But I don't want to update the Python name, so afterwards I ran ``` git grep -l _C.SymIntNodeImpl | xargs sed -i 's/_C.SymIntNodeImpl/_C.SymIntNode/' ``` and manually fixed up the binding code Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82350 Approved by: https://github.com/Krovatkin	2022-07-28 18:27:45 +00:00
PyTorch MergeBot	40a0150f8b	Revert "libtorch: exclude from libomnibus to support multipy usage from pybind (#81672 )" This reverts commit `0933c037e7`. Reverted https://github.com/pytorch/pytorch/pull/81672 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally	2022-07-28 17:59:16 +00:00
Catherine Lee	86f038dd56	download test times during build to avoid race conditions (#81915 ) After https://github.com/pytorch/pytorch/pull/81116, we started pulling test times straight from the source instead of first downloading them in the build job and then having the test job take the build job's version. This can cause issues where different shards pull different versions of the file, leading to incorrect sharding (e.g. two shards running the same test file by accident). This generally happens if the test jobs run while the test times file is being updated (unlikely, but not impossible) or if someone reruns a test job the next day. In this PR, I return to the old method of downloading the test times file during the build job and having the test jobs pull from the build job's uploaded artifacts. If there is no test times file in the build job's artifacts, we fall back to the default sharding plan. Notes: * script moved to a new file to avoid needing to import torch, which would require torch to be built, which can cause issues with asan * I got errors with asan (`ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.`), so I put the script at the beginning of the build ### Test Plan Verified that the number of tests run in the pull and trunk workflows is similar to workflows run on master. Checked logs to see if artifacts were being used for sharding. Spot checked a few test configs to check that their lists of selected tests didn't overlap. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81915 Approved by: https://github.com/huydhn	2022-07-28 16:35:01 +00:00
Edward Z. Yang	d38ffa6a4c	Make all of new_/_like factory functions composite explicit autograd (#82238 ) Once CompositeImplicitAutograd gets registered to Python key, this will ensure that tensor subclasses can interpose on these functions directly rather than getting decomposed. We prefer not decomposing as these functions are functional, but their implementations use inplace operations (and are thus more difficult to deal with, unless you use functionalization.) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82238 Approved by: https://github.com/zou3519, https://github.com/bdhirsh	2022-07-27 18:33:46 +00:00
Tristan Rice	0933c037e7	libtorch: exclude from libomnibus to support multipy usage from pybind (#81672 ) Summary: When libtorch is bundled into libomnibus all of the symbols are marked as unexported which causes issues when deploy/multipy tries to link in a subinterpreter at runtime. This excludes `libtorch` and `ATen-core` from libomnibus so the symbols remain exported and available. Test Plan: stacked diff ``` buck2 test @//mode/opt -c python.package_style=inplace //multipy/runtime:test_deploy_from_python ``` Differential Revision: D37946374 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81672 Approved by: https://github.com/PaliC	2022-07-27 17:27:57 +00:00
Xinya Zhang	ec99a8003a	[ROCM] Improvements of incremental hipification and build (#82190 ) ### Description Improve the incremental build process on ROCM by eliminating unnecessary file changes. ### Issue N/A ### Testing 1. Run `python tools/amd_build/build_amd.py --out-of-place-only` multiple times, and ensure File `third_party/gloo/cmake/Modules/Findrccl.cmake` does not contain patterns like `RCCL_LIBRARY_PATH_PATH` 2. Run `python tools/amd_build/build_amd.py; USE_ROCM=1 python3 setup.py develop` twice, and confirm the second run does not trigger the compiling of thousands of files. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82190 Approved by: https://github.com/jithunnair-amd, https://github.com/ezyang	2022-07-27 13:37:40 +00:00
Horace He	fc389cc0a0	Added new_empty.symint overload and a new_empty ref (#82049 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82049 Approved by: https://github.com/ezyang	2022-07-27 00:31:57 +00:00
PyTorch MergeBot	6c10a598ca	Revert "add nested tensor matmul support (#81957 )" This reverts commit `7bdafed4f1`. Reverted https://github.com/pytorch/pytorch/pull/81957 on behalf of https://github.com/osalpekar due to Reverting this in order to revert https://github.com/pytorch/pytorch/pull/80981 cleanly. That diff caused GPU Inference breakage internally	2022-07-26 21:10:28 +00:00
Nikolay Korovaiko	d2c47d559c	Revert "Revert "Enabling SymInt in autograd; take 3 (#81145 )"" ; make sure is_intlist checks for symintnodes (#82189 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82189 Approved by: https://github.com/ezyang	2022-07-26 20:47:11 +00:00
Nikolay Korovaiko	30e74be784	a new section for ir generation (#81847 ) This is to get a conversation started. * @JackCaoG we could add attributes to items in `ir_codegen` section to customize IR generation logic (e.g. not generating `::Lower`). Though it could be a bit tricky to thread it through. * Adding an extra argument to `map_codegen` to filter native functions out seems like a step in the right direction. Otherwise, it's a bit confusing how do we go from a full list to a codegen list. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81847 Approved by: https://github.com/JackCaoG, https://github.com/wconstab, https://github.com/bdhirsh	2022-07-26 20:39:07 +00:00
YifanShenSZ	7bdafed4f1	add nested tensor matmul support (#81957 ) There was a discussion on whether letting nested tensor `reshape` support collapsing and splitting dimension 0. The conclusion was to make reshape simple, so we need a tweaked `matmul`, which only supports 3+ dimension nonbroadcast case, i.e. a generalized `bmm`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81957 Approved by: https://github.com/jbschlosser	2022-07-26 16:58:42 +00:00
Edward Z. Yang	9d45243e24	Move empty_like to DONT_REQUIRE_DERIVATIVE list (#82178 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82178 Approved by: https://github.com/soulitzer	2022-07-26 04:26:22 +00:00
Catherine Lee	6f2a88dd50	script to monitor memory + cpu utilization (#82006 ) Add a python script that runs in the background during test jobs to log cpu + gpu memory usage and cpu utilization of python tests (really any python process) to a file and upload the file as an artifact. I plan on using the gpu memory usage stats to better understand how to parallelize them, but it is easy to add on other stats if people want them. In the future, we want to add the ability to track network usage to see if we can decrease it. GPU utilization will also likely need to be improved. Click the hud link to see uploaded usage log artifacts Pull Request resolved: https://github.com/pytorch/pytorch/pull/82006 Approved by: https://github.com/huydhn	2022-07-25 16:53:31 +00:00
Kshiteej K	db0e121b46	[composite compliance] put, take (#81094 ) Reference: #69991 This PR makes `put` CompositeExplicit as it is implemented in terms of `put_` (for which we can't handle Composite Compliance at the implementation level). Ref (put implementation) `478081c698/aten/src/ATen/native/TensorAdvancedIndexing.cpp (L619-L621)` Also, we update the `take` gradient formula to handle Tensor Subclass . Pull Request resolved: https://github.com/pytorch/pytorch/pull/81094 Approved by: https://github.com/zou3519	2022-07-25 15:05:16 +00:00
PyTorch MergeBot	c078476eb0	Revert "Enabling SymInt in autograd; take 3 (#81145 )" This reverts commit `032facd6e6`. Reverted https://github.com/pytorch/pytorch/pull/81145 on behalf of https://github.com/jeanschmidt due to breaking internal builds	2022-07-22 11:15:20 +00:00
Zain Rizvi	d28e667159	Update actionlint (#81922 ) This PR will: 1. Update actionlint to fix false positives from https://github.com/pytorch/pytorch/issues/81807 2. Establish a new naming convention for S3 file paths for linter adapters which allows older commits of pytorch to no longer be broken 3. Add update instructions to the s3_init_config.json file. Why are the instructions embedded in this json file and not the pytorch wiki? Anyone who tries to update the binaries will definitely find this file easily and can see the instructions above. The wiki is not nearly as searchable and is likely to not get noticed. Why embed the comment as data in the json file? Json doesn't support native comments. But since nothing is validating the exact shape of this json file, adding an extra dictionary entry to serve as a comment is perfectly safe. ## Testing I validated the architectures of the old binaries by running `file actionlint` on them and inspecting the outputs. I validated the hash was sha256 by checking tools/linter/adapters/s3_init.py and by also downloading the binaries from s3 and verifying their sha256 matches what's in s3_init_config.json. I validated end to end behavior by: 1. Deleting `.lintbin\actionlint` locally, running `lintrunner init` and verifying it got installed correctly and could lint files 2. Changing the sha to an invalid value and verifying `lintrunner init` failed to install actionlint Pull Request resolved: https://github.com/pytorch/pytorch/pull/81922 Approved by: https://github.com/kit1980, https://github.com/janeyx99	2022-07-22 01:55:42 +00:00
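Two points from this entry, a comment stored as ordinary JSON data and a sha256 check of a downloaded binary, can be sketched with the standard library. The key names (`_comment`, `hash`) and the config layout below are assumptions for illustration; the real s3_init_config.json may be structured differently.

```python
import hashlib
import json

# Stand-in bytes for a downloaded binary (purely illustrative).
fake_binary = b"pretend this is the actionlint binary"

# JSON has no comment syntax, so the note is stored as an extra dictionary entry.
# The layout and key names here are hypothetical.
config_text = json.dumps(
    {
        "_comment": "Update instructions would live here as plain data.",
        "actionlint": {"hash": hashlib.sha256(fake_binary).hexdigest()},
    }
)
config = json.loads(config_text)


def matches_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True if the sha256 digest of `data` equals `expected_hex`."""
    return hashlib.sha256(data).hexdigest() == expected_hex


expected = config["actionlint"]["hash"]
print(matches_sha256(fake_binary, expected))
print(matches_sha256(fake_binary + b"tampered", expected))
```

The second check corresponds to the manual test in the commit message, where changing the hash to an invalid value is expected to make installation fail.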
Nikolay Korovaiko	032facd6e6	Enabling SymInt in autograd; take 3 (#81145 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81145 Approved by: https://github.com/ezyang	2022-07-22 00:14:50 +00:00
lezcano	c5330183ca	[PrimTorch] Reference for linalg.matrix_norm (#81113 ) As per title. I corrected a thing or two from my previous implementation to give better errors in some weird edge cases and to have a clearer understanding of when this function supports low_precision types and when it doesn't. We also use the optimisation for bfloat16 within `vector_norm` within this function. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81113 Approved by: https://github.com/ngimel	2022-07-21 23:07:32 +00:00
Edward Z. Yang	a7c1f74426	Revert "Revert "Call lift_fresh after scalar_to_tensor in composite derivative formulas (#81609 )"" (#81885 ) This reverts commit `fdc2af0090`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81885 Approved by: https://github.com/soulitzer	2022-07-21 17:35:49 +00:00
Peter Bell	8d0cbce069	Lower randint default dtype to the C++ API (#81410 ) The default dtype for randint is currently handled with manual python binding code, this moves it into the `native_functions.yaml` declaration for API consistency. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81410 Approved by: https://github.com/albanD	2022-07-21 16:42:49 +00:00
Peter Bell	5f2e31797a	Replace _dtype_default_type_hack (#81479 ) Currently any function with a default dtype other than None has to be manually entered into this function. Instead, this reads the default directly from `native_functions.yaml`. In order to do this, I also change `PythonSignatureGroup` to take `tensor_options_args` from the functional variant since the out variant doesn't actually have tensor options arguments to take the default values from. Also note that we need to use `default_init` instead of `default` because the out argument version doesn't have a `tensor_options` argument to extract the default value from and so the PythonSignature objects wouldn't match. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81479 Approved by: https://github.com/albanD	2022-07-21 16:42:49 +00:00
PyTorch MergeBot	fdc2af0090	Revert "Call lift_fresh after scalar_to_tensor in composite derivative formulas (#81609 )" This reverts commit `aad7a1c06c`. Reverted https://github.com/pytorch/pytorch/pull/81609 on behalf of https://github.com/jeanschmidt due to breaking internal builds	2022-07-21 10:50:05 +00:00
Edward Z. Yang	aad7a1c06c	Call lift_fresh after scalar_to_tensor in composite derivative formulas (#81609 ) `scalar_to_tensor` is not dispatched and thus there is no interposition point for modes to ensure that the resulting tensor is appropriately wrapped. `lift_fresh` introduces this interposition point. This prevents FakeTensorMode from erroring. I can't make these wrapped numbers because there is some downstream logic on convolution backwards that expects these inputs to be honest to goodness tensors for conjugation. This fixes test_aot_autograd_exhaustive_special_ndtr_cpu_float32 in https://github.com/pytorch/functorch/pull/935 See https://github.com/pytorch/pytorch/issues/81608 for more discussion Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/81609 Approved by: https://github.com/soulitzer	2022-07-21 04:46:05 +00:00
ssjia	96958be6be	[vulkan] Automatically generate shader layout from GLSL (#81715 ) Differential Revision: [D37966838](https://our.internmc.facebook.com/intern/diff/D37966838/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81715 Approved by: https://github.com/kirklandsign	2022-07-20 01:57:59 +00:00
Catherine Lee	06a0cfc0ea	pytest to run test_ops, test_ops_gradients, test_ops_jit in non linux cuda environments (#79898 ) This PR uses pytest to run test_ops, test_ops_gradients, and test_ops_jit in parallel in non linux cuda environments to decrease TTS. I am excluding linux cuda because running in parallel results in errors due to running out of memory. Notes: * update hypothesis version for compatibility with pytest * use rerun-failures to rerun tests (similar to flaky tests, although these test files generally don't have flaky tests) * reruns are denoted by a rerun tag in the xml. Failed reruns also have the failure tag. Successes (meaning that the test is flaky) do not have the failure tag. * see https://docs.google.com/spreadsheets/d/1aO0Rbg3y3ch7ghipt63PG2KNEUppl9a5b18Hmv2CZ4E/edit#gid=602543594 for info on speedup (or slowdown in the case of slow tests) * expecting windows tests to decrease by 60 minutes total * slow test infra is expected to stay the same - verified by running pytest and unittest on the same job and checking the number of skipped/run tests * test reports to s3 changed - add entirely new table to keep track of invoking_file times Pull Request resolved: https://github.com/pytorch/pytorch/pull/79898 Approved by: https://github.com/malfet, https://github.com/janeyx99	2022-07-19 19:50:57 +00:00
Larry Liu	e345138591	[retake2][mobile] Fix lightweight dispatch OOM error by introducing selective build (#80791 ) To fix #78540 I committed #78983, which was reverted due to an internal CI failure. Then I committed #79215, which only fixed the failure but didn't have the full feature of #78983. This PR is another try. This PR adds a script to dump all operators from test models and automatically write them into `lightweight_dispatch_ops.yaml`. This way we don't have to manually update the yaml file. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80791 Approved by: https://github.com/raziel	2022-07-15 18:04:25 +00:00
Peter Bell	00459c2c87	[primTorch] Implement constant_pad_nd (#80182 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80182 Approved by: https://github.com/mruberry, https://github.com/ngimel	2022-07-15 15:13:42 +00:00

4284 Commits