pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
PyTorch MergeBot	9a28a7b498	Revert "Add support for `torch.Generator` type in TorchScript (#110413 )" This reverts commit `27e31ab6e8`. Reverted https://github.com/pytorch/pytorch/pull/110413 on behalf of https://github.com/PaliC due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/110413#issuecomment-1799003164))	2023-11-07 15:53:32 +00:00
Antonio Kim	27e31ab6e8	Add support for `torch.Generator` type in TorchScript (#110413 ) - Add support for `torch.Generator` type in TorchScript - Add `generator` args to all `torch.nn.init` functions that call `uniform_` or `normal_` - Add support for `torch.Generator` in LTC's TorchScript backend (CC: @wconstab) CC: @eellison @davidberard98 @GlebKazantaev @behzad-a Pull Request resolved: https://github.com/pytorch/pytorch/pull/110413 Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/glebk-cerebras, https://github.com/davidberard98	2023-11-06 21:27:02 +00:00
Maxime Arthaud	62cbe86ac0	[torch] Skip the assertion on the return type when the annotation is a forward reference (#112870 ) Summary: The assertion is causing build failures when running Pysa, our security-focused static analyzer. This is because we run `pyre infer` on the source code before analyzing it, which introduces annotations such as `def foo() -> 'torch._tensor.Tensor'`. This does not work with the `out_wrapper` decorator which relies on inspecting the signature of the decorated function. Let's skip the check on the return type if we detect that it was introduced by `pyre infer`. Test Plan: eyes Differential Revision: D50976601 Pull Request resolved: https://github.com/pytorch/pytorch/pull/112870 Approved by: https://github.com/ZainRizvi	2023-11-04 00:22:13 +00:00
Peter Bell	46e80ce58a	[ATen] Support multi dim any and all reductions (#110310 ) This adds a new overload to `all` and `any` with support for multiple reduction dims. ``` all.dims(Tensor self, int[1]? dim=None, bool keepdim=False) -> Tensor any.dims(Tensor self, int[1]? dim=None, bool keepdim=False) -> Tensor ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/110310 Approved by: https://github.com/lezcano, https://github.com/albanD, https://github.com/justinchuby	2023-10-24 21:33:53 +00:00
Zhang, Wuxun	f32eb9bc55	fix missing non-contiguous output handling for add op (#111758 ) patch for https://github.com/pytorch/pytorch/pull/104689 which is missing similar handling for add op Pull Request resolved: https://github.com/pytorch/pytorch/pull/111758 Approved by: https://github.com/karthiknagasub, https://github.com/ezyang	2023-10-24 17:27:50 +00:00
Aaron Gokaslan	cb856b08b2	[BE]: Attach cause to some exceptions and enable RUFF TRY200 (#111496 ) Did some easy fixes from enabling TRY200. Most of these seem like oversights instead of intentional. The proper way to silence intentional errors is with `from None` to note that you thought about whether it should contain the cause and decided against it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111496 Approved by: https://github.com/malfet	2023-10-19 21:56:36 +00:00
lezcano	2fd546aa5e	Allow strided layout in torch.normal (#111205 ) Fixes https://github.com/pytorch/pytorch/issues/111119 Pull Request resolved: https://github.com/pytorch/pytorch/pull/111205 Approved by: https://github.com/ezyang	2023-10-13 21:17:38 +00:00
isdanni	2f53085f3f	[BE] Enable Ruff's Flake8 PYI030 (#111103 ) Enable [unnecessary-literal-union (PYI030)](https://docs.astral.sh/ruff/rules/unnecessary-literal-union/) Link: #110950 Pull Request resolved: https://github.com/pytorch/pytorch/pull/111103 Approved by: https://github.com/albanD	2023-10-12 13:31:59 +00:00
PyTorch MergeBot	98c329b19e	Revert "[core ATen IR] Add decompositions for max, min, var_mean (#110906 )" This reverts commit `9606cda64e`. Reverted https://github.com/pytorch/pytorch/pull/110906 on behalf of https://github.com/SS-JIA due to Breaks internal CI ([comment](https://github.com/pytorch/pytorch/pull/110906#issuecomment-1757490740))	2023-10-11 11:41:21 +00:00
Edward Z. Yang	24bf9aeb6b	Fix arange with dynamic end argument. (#110979 ) Fixes https://github.com/pytorch/pytorch/issues/93468 There's a few extra tests that are sort of unrelated, but I ended up writing them while working on the fix and decided to keep them. The big idea here is to split the `_check` so that `expect_true` works; I could have probably also improved the symbolic reasoning but I'm lazy. One small logging fix too. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/110979 Approved by: https://github.com/Skylion007	2023-10-11 00:32:34 +00:00
SS-JIA	9606cda64e	[core ATen IR] Add decompositions for max, min, var_mean (#110906 ) ## Context Add decompositions for `aten.max`, `aten.min`, and `aten.var_mean`. These operators follow a pattern of returning a tuple of outputs from two component operators: ``` aten.max(x) -> return aten.amax(x), aten.argmax(x) aten.min(x) -> return aten.amin(x), aten.argmin(x) aten.var_mean(x) -> return aten.var(x), aten.mean(x) ``` For `var_mean`, the `refs` implementation was doing something similar, so I changed it to call `torch.` ops instead like was done for other `refs` implementations previously. cc: @peterbell10 @lezcano Note that Inductor lowers all these directly, so they are excluded from the Inductor decomp table. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110906 Approved by: https://github.com/manuelcandales	2023-10-11 00:06:24 +00:00
Stephen Jia	c2e7a0d689	[core IR] Add decomps for `aten.sum` and `aten.squeeze` variants (#110645 ) Summary: ## Context Both `aten.sum` and `aten.squeeze` have a "most generic" variant in the form of `aten.sum.dim_IntList` and `aten.squeeze.dims` respectively. Add decompositions for other non generic variants of these operators to express them using the most generic variant. Note that to register these decomps, the reference implementation under `_refs` had to be removed as registered decompositions. cc: @lezcano @peterbell10 Test Plan: Github CI + Meta Internal CI Differential Revision: D49965952 Pull Request resolved: https://github.com/pytorch/pytorch/pull/110645 Approved by: https://github.com/peterbell10, https://github.com/digantdesai, https://github.com/manuelcandales	2023-10-07 04:21:51 +00:00
Peter Bell	d796518485	[refs] Fix size check from #108360 (#109083 ) PR #108360 uses the same default `last_dim_size` formula from complex-to-real (C2R) transforms for complex-to-complex (C2C) and real-to-complex (R2C). However, this is not correct because for C2R the input is only half the size of the full tensor, which is not the case for C2C and R2C. This error is mostly benign since `last_dim_size` was only used for the `>= 1` condition which is almost always met anyway. For this PR I now use it as the argument to `_apply_norm` which makes it load-bearing for correctness and so is thoroughly tested now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109083 Approved by: https://github.com/lezcano	2023-09-27 23:59:29 +00:00
SS-JIA	dec140f1ea	[core IR] Add a core decomposition for aten.all (#110093 ) ## Context Change the ref implementation of `aten.all` to only use other `torch` operators such that we can use it for the core ATen decomposition table. This will replace the decomposition for `aten.all` that was used specifically by Inductor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110093 Approved by: https://github.com/manuelcandales, https://github.com/peterbell10, https://github.com/lezcano	2023-09-27 01:31:41 +00:00
SS-JIA	5df8aca994	[core IR] Add a core decomposition for floor_divide (#110046 ) ## Context Introduce a core decomposition for `aten.floor_divide` into other `aten` ops, and add it to the core ATen decomposition table. This replaces the decomposition of `floor_divide` that was used by Inductor. I noticed there was a note on that decomposition ``` # TorchInductor-only decomposition. It should not be taken to core. # See https://github.com/pytorch/torchdynamo/pull/1120 ``` but couldn't discern the reason why this is the case. cc: @lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/110046 Approved by: https://github.com/peterbell10	2023-09-26 08:39:21 +00:00
Mwiza Kunda	5c4b5baf21	Fix python decomps for OpOverloadPackets and add tests (#107707 ) - Extend `test_torch_dispatch_meta_outplace` to test torch ops that do not have an out parameter but have aten op overloads that have out parameters. Additionally, Python decompositions may register `OpOverloadPacket`'s so decompositions need to be tested to ensure all `OpOverloads` still function for the `Meta` key (e.g. if a python decomposition is registered for an aten op `aten.foo` with overloads `[default, out]`, the python function needs to support receiving out arguments) - Add out parameter wrappers to python decomps for aten ops that have out overloads CC. @ezyang @albanD @lezcano Fixes #107713 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107707 Approved by: https://github.com/lezcano	2023-09-25 20:53:30 +00:00
SS-JIA	7de669f2f9	[core IR] Remove trunc decomp and add trunc to core (#109902 ) Following up from [this comment](https://github.com/pytorch/pytorch/pull/109319#discussion_r1330803226). Remove the decomposition for `trunc`, and add it as a core operator. Going forward, provide similar treatment for operators that map cleanly to hardware instructions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109902 Approved by: https://github.com/peterbell10	2023-09-25 18:18:06 +00:00
Mwiza Kunda	6b7b9c796e	Fix registering jit decompositions for jvp for out wrapped decomps (#109367 ) Python decompositions wrapped by `out_wrapper` need to be unwrapped before compiling with TorchScript since: - `out_wrapper` extends the decomposition's signature with an out parameter, however this `out` parameter is not present in the source code of the original decomposition so the resulting `ScriptFunction` will not have an `out` parameter - `out_wrapper` is in the `torch._prims_common.wrappers` module so its `globals()` are different to the globals of the decomposition to be wrapped. This may cause symbol resolution to fail with the TorchScript compiler since it is compiling the unwrapped decomp's source code rather than the wrapper The python decomposition for `aten.trace` is wrapped as an example, other decompositions are to be fixed in https://github.com/pytorch/pytorch/pull/107707 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109367 Approved by: https://github.com/lezcano	2023-09-21 16:36:51 +00:00
Peter Bell	9e629dd73c	[decomp] Add all std and std_mean overloads to core decompositions (#109667 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/109667 Approved by: https://github.com/lezcano	2023-09-20 18:45:56 +00:00
Salil Desai	2e721aab98	[Decomposition] Trunc (#109319 ) Summary: Add Decomp for Trunc and add it to core_aten_decompositions Differential Revision: D49042033 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109319 Approved by: https://github.com/SherlockNoMad	2023-09-19 13:30:13 +00:00
Edward Z. Yang	677a1010e6	Implement traceable torch.tensor when you have SymInt/SymFloat inputs (#109515 ) I just ported the C++ torch.tensor implementation to Python, swapping out the inner bits to successively stack tensors together, so that we can trace through `scalar_tensor`. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/109515 Approved by: https://github.com/voznesenskym ghstack dependencies: #109513	2023-09-19 13:19:57 +00:00
Li-Huai (Allan) Lin	b2cba439b4	Introduce Tensor overload to linspace and logspace (#104889 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/104889 Approved by: https://github.com/zou3519 ghstack dependencies: #107958	2023-09-11 23:30:40 +00:00
PyTorch MergeBot	a7f5abeade	Revert "Introduce Tensor overload to linspace and logspace (#104889 )" This reverts commit `57e5239321`. Reverted https://github.com/pytorch/pytorch/pull/104889 on behalf of https://github.com/clee2000 due to sorry have to revert this to revert https://github.com/pytorch/pytorch/pull/107958 ([comment](https://github.com/pytorch/pytorch/pull/104889#issuecomment-1714305768))	2023-09-11 17:33:48 +00:00
Li-Huai (Allan) Lin	57e5239321	Introduce Tensor overload to linspace and logspace (#104889 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/104889 Approved by: https://github.com/zou3519 ghstack dependencies: #107958	2023-09-11 15:29:39 +00:00
ekamiti	0f88d93b10	decomposition spectral ops fixes (#108360 ) Fixes https://github.com/pytorch/pytorch/issues/105986, https://github.com/pytorch/pytorch/issues/108204, https://github.com/pytorch/pytorch/issues/108205 Fix all issues flagged when making changes for https://github.com/pytorch/pytorch/pull/107421 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108360 Approved by: https://github.com/ezyang	2023-09-09 04:48:09 +00:00
Ken Jin	c458fa0d35	Decompose/add reference for `view_as_complex` (#108005 ) Aten source: `d4a99631dd/aten/src/ATen/native/ComplexHelper.h (L78)` Documentation reference: https://pytorch.org/docs/stable/generated/torch.view_as_complex.html Note: this adds a new primitive `view_of_dtype`, which is trivially implemented, as its meta function is already implemented elsewhere. Finally, this is not registered as a decomposition (yet), because TorchInductor does not yet support complex types. It should be added once we do. Closes https://github.com/pytorch/pytorch/issues/108020 as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108005 Approved by: https://github.com/peterbell10, https://github.com/ezyang	2023-09-07 23:49:20 +00:00
Guilherme Leobas	7e878c9d10	Add decomposition for `aten.take_along_dim` (#108185 ) xref #107875 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108185 Approved by: https://github.com/lezcano	2023-09-04 13:49:53 +00:00
Vishwa Raj Singh	1b3dc05c3e	Use contiguous() to handle noncontiguous outputs during elementwise decomposition (#108140 ) Fixes https://github.com/pytorch/pytorch/issues/108218 Use contiguous() API to handle noncontiguous outputs during elementwise decomp. With this change, the op decomposes properly (testcase from the bug): ``` graph(): %arg0_1 : [#users=3] = placeholder[target=arg0_1] %abs_1 : [#users=1] = call_function[target=torch.ops.aten.abs.default](args = (%arg0_1,), kwargs = {}) %floor : [#users=1] = call_function[target=torch.ops.aten.floor.default](args = (%abs_1,), kwargs = {}) %sign : [#users=1] = call_function[target=torch.ops.aten.sign.default](args = (%arg0_1,), kwargs = {}) %mul : [#users=1] = call_function[target=torch.ops.aten.mul.Tensor](args = (%floor, %sign), kwargs = {}) %sub : [#users=1] = call_function[target=torch.ops.aten.sub.Tensor](args = (%arg0_1, %mul), kwargs = {}) return (sub,) ``` Output: ``` tensor([[ 0.2871, 0.7189, 0.7297], [ 0.8782, -0.4899, 0.7055]], device='hpu:0') ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/108140 Approved by: https://github.com/ezyang	2023-09-03 04:32:22 +00:00
lezcano	239ee76177	Add refs/decomps for dot/vdot (#108194 ) Follow-up on https://github.com/pytorch/pytorch/issues/108127#issuecomment-1698142427 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108194 Approved by: https://github.com/peterbell10 ghstack dependencies: #108188	2023-08-31 15:30:23 +00:00
lezcano	239fed7e1e	Add reference for linalg.vecdot (#108188 ) Was addressing https://github.com/pytorch/pytorch/issues/108127, but then I realised that vecdot is already CompositeImplicit. Pushing anyway as a short-and-sweet PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108188 Approved by: https://github.com/peterbell10	2023-08-31 15:30:23 +00:00
David Watson	598babf017	Added normal op decomposition for specializations of the normal op (#106792 ) This fixes running normal with the meta key. ``` import torch t = torch.tensor(4.0, device='meta') torch.normal(0.5, t) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/106792 Approved by: https://github.com/lezcano	2023-08-25 16:18:28 +00:00
Vishwa Raj Singh	35de780aa6	Fix Inplace tensor update on transpose (#104689 ) Fixes https://github.com/pytorch/pytorch/issues/103650 - To align with HPU device backend architecture. Ensure all non-view ops return contiguous fake tensor outputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104689 Approved by: https://github.com/ezyang	2023-08-24 16:58:50 +00:00
Sherlock Huang	ee4b99cc3a	Decomp for aten.dropout (#106274 ) When exporting dropout with a CPU tensor, we get the following graph module ``` class GraphModule(torch.nn.Module): def forward(self, arg0_1: f32[512, 10]): empty_memory_format: f32[512, 10] = torch.ops.aten.empty.memory_format([512, 10], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False, memory_format = torch.contiguous_format) bernoulli_p: f32[512, 10] = torch.ops.aten.bernoulli.p(empty_memory_format, 0.9); empty_memory_format = None div_scalar: f32[512, 10] = torch.ops.aten.div.Scalar(bernoulli_p, 0.9); bernoulli_p = None mul_tensor: f32[512, 10] = torch.ops.aten.mul.Tensor(arg0_1, div_scalar); arg0_1 = div_scalar = None return (mul_tensor,) ``` In addition, if we export with eval() mode, we will have an empty graph. However, when exporting with a CUDA tensor, we get ``` class GraphModule(torch.nn.Module): def forward(self, arg0_1: f32[512, 10]): native_dropout_default = torch.ops.aten.native_dropout.default(arg0_1, 0.1, True); arg0_1 = None getitem: f32[512, 10] = native_dropout_default[0]; native_dropout_default = None return (getitem,) ``` and exporting under eval() mode will still have a dropout node in the graph. This PR makes exporting with a CPU tensor also produce aten.native_dropout. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106274 Approved by: https://github.com/ezyang	2023-08-23 21:12:37 +00:00
Aaron Gokaslan	660e8060ad	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings. I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I am enabling it so that it stays that way. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-22 23:16:38 +00:00
PyTorch MergeBot	d59a6864fb	Revert "[BE]: Update ruff to 0.285 (#107519 )" This reverts commit `88ab3e4322`. Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was prob accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))	2023-08-22 19:53:32 +00:00
Aaron Gokaslan	88ab3e4322	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings. I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I am enabling it so that it stays that way. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-20 01:36:18 +00:00
Edward Z. Yang	5673c0874c	Use expect_true to make split with unbacked sizes work. (#106788 ) This pattern shows up in torchrec KeyedJaggedTensor. Most of the change in this PR is mechanical: whenever we failed an unbacked symint test due to just error checking, replace the conditional with something that calls expect_true (e.g., torch._check or TORCH_SYM_CHECK). Some of the changes are a bit more nuanced, I've commented on the PR accordingly. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/106788 Approved by: https://github.com/lezcano ghstack dependencies: #106720	2023-08-15 20:31:30 +00:00
lezcano	2c5f96deac	[Inductor] Make softshrink composite implicit (#107052 ) The backward is pretty much equivalent to the one we had written Pull Request resolved: https://github.com/pytorch/pytorch/pull/107052 Approved by: https://github.com/peterbell10 ghstack dependencies: #107038, #107039, #107051	2023-08-14 21:01:50 +00:00
lezcano	3b1254e800	Make hardshrink's decomp composite implicit (#107039 ) The generated code is the same Pull Request resolved: https://github.com/pytorch/pytorch/pull/107039 Approved by: https://github.com/peterbell10 ghstack dependencies: #107038	2023-08-14 21:01:50 +00:00
lezcano	45c7880486	Simplify some decompositions. (#107038 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107038 Approved by: https://github.com/peterbell10	2023-08-14 21:01:50 +00:00
Peter Bell	ab6efb1649	[pt2] Add reference implementations of torch.{stft,istft} (#106400 ) This allows symbolic shapes to be traced through `torch.stft` and `torch.istft`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106400 Approved by: https://github.com/lezcano ghstack dependencies: #106319	2023-08-07 20:59:30 +00:00
Yanbo Liang	0ad93a3d56	Fix aten.logspace decomposition (#105201 ) Fixes #104118 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105201 Approved by: https://github.com/ezyang	2023-07-22 04:10:20 +00:00
Justin Chu	8a688277a2	[BE] Enable ruff's UP rules and autoformat dynamo / functorch and refs (#105432 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105432 Approved by: https://github.com/ezyang	2023-07-19 13:48:44 +00:00
Nikita Shulga	5837e95d30	[Reland] Update mypy to 1.4.1 (#105227 ) This PR re-lands - [Typing] Fix PEP 484 Violation (#105022) - Update mypy to 1.4.1 (#91983) That were reverted due to the conflict with internal source repo. Mostly fixes for PEP-484 violation (i.e. when default arg is set to None, but type is not annotated as optional) Plus a few real fixes: - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi` - Add missing return statement to `torch._export.deserialize_graph` - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights` - Add assert in `torch/optim/optimizer.py` that Optional list is not None TODO (in followup PR): - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py` Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04: - Add hack to squash older libstdc++ from conda environment in favor of the one from the OS to `.ci/docker/install_conda.sh` - Update bazel cuda builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where it is done) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227 Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007	2023-07-15 20:30:20 +00:00
PyTorch MergeBot	15fd1ea118	Revert "[Reland] Update mypy to 1.4.1 (#105227 )" This reverts commit `c9c4f8efc3`. Reverted https://github.com/pytorch/pytorch/pull/105227 on behalf of https://github.com/atalman due to trying to mitigate ci sev #105248 ([comment](https://github.com/pytorch/pytorch/pull/105227#issuecomment-1636510935))	2023-07-14 22:28:35 +00:00
Nikita Shulga	c9c4f8efc3	[Reland] Update mypy to 1.4.1 (#105227 ) This PR re-lands - [Typing] Fix PEP 484 Violation (#105022) - Update mypy to 1.4.1 (#91983) That were reverted due to the conflict with internal source repo. Mostly fixes for PEP-484 violation (i.e. when default arg is set to None, but type is not annotated as optional) Plus a few real fixes: - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi` - Add missing return statement to `torch._export.deserialize_graph` - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights` - Add assert in `torch/optim/optimizer.py` that Optional list is not None TODO (in followup PR): - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227 Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007	2023-07-14 20:45:12 +00:00
PyTorch MergeBot	b4d91b1c5b	Revert "[Typing] Fix PEP 484 Violation (#105022 )" This reverts commit `4148b7bada`. Reverted https://github.com/pytorch/pytorch/pull/105022 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/105022#issuecomment-1635967734))	2023-07-14 14:45:09 +00:00
Nikita Shulga	4148b7bada	[Typing] Fix PEP 484 Violation (#105022 ) Not sure how it worked before, but arguments must be annotated as optional if they default to None. Towards enabling mypy-1.4.1 in lintrunner. ### 🤖 Generated by Copilot at 5e1b9f4 > _We annotate the arguments of doom_ > _To show the `None` values of gloom_ > _We improve the type checking and readability_ > _With `Optional` annotations of metal-ity_ Pull Request resolved: https://github.com/pytorch/pytorch/pull/105022 Approved by: https://github.com/izaitsevfb, https://github.com/huydhn, https://github.com/Skylion007	2023-07-12 10:20:48 +00:00
Peter Bell	5c580a9846	[decomp] Add test tracking core ATen operators (#104262 ) This adds an expect-test that finds the set of core ATen operators by subtracting the operators with decomposition in core_aten_decompositions from the set of all operators that have decompositions and could be decomposed. This is useful because if you add a new decomposition but forget to add it to the list of core decompositions, it will appear in the PR diff. Also, by going through this list I have identified some operators where the functional variant is decomposed, but not the inplace variant which must be an oversight. Pull Request resolved: https://github.com/pytorch/pytorch/pull/104262 Approved by: https://github.com/lezcano	2023-07-04 16:41:44 +00:00
Peter Bell	8b418f197c	[decomp] Add decomposition for torch.renorm (#103858 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103858 Approved by: https://github.com/ezyang, https://github.com/nkaretnikov	2023-06-21 20:57:43 +00:00
Peter Bell	a61096fb94	[decomp] Decompose logaddexp2 (#103765 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103765 Approved by: https://github.com/Chillee	2023-06-21 20:16:24 +00:00
Kurt Mohler	ee83c646bb	Replace `_prims_common.check` with `torch._check` (#103240 ) This relands most of the changes from #102219 which were backed out by #103128. However, instead of removing `_prims_common.check`, it adds a warning and a comment mentioning that it will be removed in the future and `torch._check` should be used instead. As mentioned in https://github.com/pytorch/pytorch/pull/103128#pullrequestreview-1466414415, `_prims_common.check` cannot yet be removed because of some internal usage Part of #72948 Pull Request resolved: https://github.com/pytorch/pytorch/pull/103240 Approved by: https://github.com/albanD	2023-06-21 00:46:17 +00:00
PyTorch MergeBot	7b6dc72ffa	Revert "[decomp] Decompose logaddexp2 (#103765 )" This reverts commit `bab21d20eb`. Reverted https://github.com/pytorch/pytorch/pull/103765 on behalf of https://github.com/ezyang due to looks like land race ([comment](https://github.com/pytorch/pytorch/pull/103765#issuecomment-1599030496))	2023-06-20 15:35:02 +00:00
Peter Bell	bab21d20eb	[decomp] Decompose logaddexp2 (#103765 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/103765 Approved by: https://github.com/Chillee	2023-06-20 09:24:21 +00:00
BowenBao	724a1ba2de	Tidy __all__ under torch._refs (#103712 ) - Added ops that were missing under `__all__`. - Some misc changes to helper functions to make them private. - Set correct `fn.__module__` for `fn` created by `_make_alias`, when called in another module. All modification largely references results from a hacked version of `test_public_bindings::test_correct_module_names`. By default `torch._refs` is not included in the test because it is technically a private package. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103712 Approved by: https://github.com/lezcano	2023-06-20 00:04:58 +00:00
ekkapricious	5d34656fd7	Update dynamo sum dtype handling to match eager (#103037 ) The current behaviour for dynamo is to set the dtype to torch.int64 for integral types if the dtype is not specified explicitly which results in mismatched behaviour as compared to eager mode. In eager mode the semantics are: - If both out is specified and dtype is specified then they have to match - If dtype is not specified but out is specified then the dtype is set to match the out dtype - If neither dtype nor out is set then the dtype is set to kLong if it is a bool or an integral type Fixes #100698 Pull Request resolved: https://github.com/pytorch/pytorch/pull/103037 Approved by: https://github.com/ngimel	2023-06-19 22:26:37 +00:00
Yu, Guangye	ad4ee297ed	allow cpu scalar to be moved to xpu in masked_fill (#103645 ) # Motivation Align to CUDA scenario, allow cpu scalar to be moved to xpu device in masked_fill. # Solution Add "xpu" support in condition control. # Additional no need for more UT. Pull Request resolved: https://github.com/pytorch/pytorch/pull/103645 Approved by: https://github.com/jgong5, https://github.com/ezyang	2023-06-16 12:15:43 +00:00
Yanbo Liang	686d7e4c48	[Inductor] Fix x.view(dtype) decomp and make inductor support it (#102920 ) Fixes #99804 Pull Request resolved: https://github.com/pytorch/pytorch/pull/102920 Approved by: https://github.com/jansel, https://github.com/ngimel	2023-06-07 17:10:54 +00:00
Ivan Zaitsev	821493715c	Back out "Remove `check` from `_prims_common`, replace with `torch._check*` (#102219 )", Back out "Forward fix for D46427687" (#103128 ) Test Plan: revertitparrot Reviewed By: malfet Differential Revision: D46506433 Pull Request resolved: https://github.com/pytorch/pytorch/pull/103128 Approved by: https://github.com/malfet	2023-06-07 01:41:41 +00:00
Kurt Mohler	a84bb2709a	Remove `check` from `_prims_common`, replace with `torch._check*` (#102219 ) Part of #72948 Pull Request resolved: https://github.com/pytorch/pytorch/pull/102219 Approved by: https://github.com/lezcano, https://github.com/albanD	2023-06-03 02:23:21 +00:00
PyTorch MergeBot	a7efa0ce35	Revert "Remove `check` from `_prims_common`, replace with `torch._check*` (#102219 )" This reverts commit `fb79d43649`. Reverted https://github.com/pytorch/pytorch/pull/102219 on behalf of https://github.com/malfet due to Broke lint, see https://github.com/pytorch/pytorch/actions/runs/5158949959/jobs/9293466925 ([comment](https://github.com/pytorch/pytorch/pull/102219#issuecomment-1574245414))	2023-06-02 20:00:48 +00:00
Kurt Mohler	fb79d43649	Remove `check` from `_prims_common`, replace with `torch._check*` (#102219 ) Part of #72948 Pull Request resolved: https://github.com/pytorch/pytorch/pull/102219 Approved by: https://github.com/lezcano, https://github.com/albanD	2023-06-02 19:13:45 +00:00
vfdev-5	319a1cb4e5	[inductor] Replaced refs.op by torch.op in _refs/* (#102176 ) Description: - Replaced `refs.op` by `torch.op` in `_refs/*` Pull Request resolved: https://github.com/pytorch/pytorch/pull/102176 Approved by: https://github.com/lezcano	2023-05-29 22:36:14 +00:00
vfdev-5	e3d97b6213	[inductor] Added `smooth_l1_loss` refs (#102077 ) Added `smooth_l1_loss` to refs + tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/102077 Approved by: https://github.com/lezcano, https://github.com/ngimel	2023-05-24 15:07:08 +00:00
Khushi	51fe53e619	[opinfo] item (#100313 ) Follows #100223 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100313 Approved by: https://github.com/ezyang	2023-05-10 11:32:45 +00:00
Khushi	5a933d044f	[opinfo prims] equal (#100663 ) Follows: #100223 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100663 Approved by: https://github.com/ezyang	2023-05-10 08:16:00 +00:00
Nikita Karetnikov	e87ed2a88d	[primTorch] add ref for `polar` (#100345 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/100345 Approved by: https://github.com/ezyang	2023-05-04 01:37:02 +00:00
Angela Yi	d06b93b0c7	Decompose arange.default to arange.start_step (#99739 ) The aten op arange.default is not in the core aten IR, and should decompose into the arange.start_step op. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99739 Approved by: https://github.com/SherlockNoMad	2023-04-27 19:06:36 +00:00
Yanbo Liang	4c6f7cbc86	Fix prims unbind if given dimension size is 0 (#100122 ) Fixes #99832 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100122 Approved by: https://github.com/ngimel	2023-04-26 23:40:21 +00:00
Nikita Karetnikov	f89b7c2bec	[pt2] add `SymInt` support for `roll` (#99114 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/99114 Approved by: https://github.com/ezyang	2023-04-15 18:01:39 +00:00
Peter Bell	7b91bd2a7b	[primTorch] Add count_nonzero (#98995 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98995 Approved by: https://github.com/lezcano	2023-04-13 22:08:19 +00:00
Peter Bell	7d74dca780	[primTorch] Add rad2deg and deg2rad (#98994 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98994 Approved by: https://github.com/lezcano	2023-04-13 22:08:19 +00:00
Nikita Karetnikov	ff825de442	[primTorch] add ref for `cumprod` (#98670 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98670 Approved by: https://github.com/ezyang	2023-04-09 15:22:28 +00:00
albanD	0210481dcb	Fix _like meta registrations (#98160 ) The meta implementation for these _like functions is wrong whenever device != "meta" (it doesn't fill the memory!). zeros_like is special due to sparse and is fixed directly by always filling it with zeros. Every other one is a CompositeExplicit implementation; I went with removing their meta registration and tweaking code to avoid infinite recursions. I can do the same as zeros_like (and add the proper filling for each) but that would duplicate the C++ logic and make the meta registrations non-trivial. I can do it if you prefer that to removal. test_meta works fine with these fixes, relying on CI to see if other tests are breaking as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98160 Approved by: https://github.com/ezyang	2023-04-06 18:44:34 +00:00
Aaron Gokaslan	9c3fbe7475	[BE] Enable flake8-simplify checks (#97984 ) Enable some sensible flake8-simplify rules. Mainly wanted to enable the SIM101, and `yield from` SIM103 checks. @kit1980 since you wanted to be tagged on this CI check. Enabling this check also helped flag one logical bug so it's definitely beneficial (also fixed in this PR). Pull Request resolved: https://github.com/pytorch/pytorch/pull/97984 Approved by: https://github.com/ezyang	2023-03-31 03:40:21 +00:00
Aaron Gokaslan	47dca20d80	[BE] Enable flake8-comprehension rule C417 (#97880 ) Enables flake8-comprehension rule C417. Ruff autogenerated these fixes to the codebase. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97880 Approved by: https://github.com/ezyang, https://github.com/kit1980, https://github.com/albanD	2023-03-30 14:34:24 +00:00
Aaron Gokaslan	597b558c51	[BE]: Update flake8 and plugins and fix bugs (#97795 ) Update flake8 and flake8-plugins in lintrunner to a modern version. Enables more checks and makes flake8 checks significantly faster. Added a few additional rule ignores that will need to be fixed in the future. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97795 Approved by: https://github.com/alexsio27444, https://github.com/janeyx99, https://github.com/ezyang	2023-03-28 23:51:55 +00:00
Vivek Khandelwal	5da86bbb68	Add decomposition for aten.squeeze.dims op (#97020 ) Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/97020 Approved by: https://github.com/jansel	2023-03-27 20:13:19 +00:00
Chung-chieh Shan	2c588b3ad5	Allow new_full's fill_value argument type to be complex (#91345 ) It seems that this code should type-check but doesn't: ```python torch.zeros((2,3),dtype=torch.cdouble).new_full((4,5),complex(6,7)) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/91345 Approved by: https://github.com/zou3519, https://github.com/ezyang	2023-03-21 12:34:00 +00:00
Edward Z. Yang	3606f59366	Default specialize_int to False (#96624 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96624 Approved by: https://github.com/janeyx99	2023-03-16 02:54:18 +00:00
BowenBao	60a68477a6	Bump black version to 23.1.0 (#96578 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578 Approved by: https://github.com/ezyang	2023-03-15 06:27:59 +00:00
PyTorch MergeBot	ba4fb9b6ad	Revert "Default specialize_int to False (#96624 )" This reverts commit `1ac8782db2`. Reverted https://github.com/pytorch/pytorch/pull/96624 on behalf of https://github.com/kit1980 due to Broke inductor/test_torchinductor_dynamic_shapes.py	2023-03-14 19:43:47 +00:00
Edward Z. Yang	1ac8782db2	Default specialize_int to False (#96624 ) Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/96624 Approved by: https://github.com/janeyx99	2023-03-14 18:37:47 +00:00
Khushi Agrawal	301a28bf8c	[primTorch] move diagonal & add linalg.diagonal refs (#95774 ) Fixes #85419 Also, add `_refs.linalg.diagonal`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95774 Approved by: https://github.com/lezcano	2023-03-06 17:59:47 +00:00
Nikita Karetnikov	c72fbf2e5a	[inductor] do not use `ceil` in `arange` ref (#95773 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/95773 Approved by: https://github.com/ezyang	2023-03-03 20:38:18 +00:00
mfkasim1	975333d80c	Logaddexp for complex in CPU (#95717 ) Continuation of PR #93153 where I implemented logaddexp for complex, but didn't expose it to `torch.logaddexp`. So this PR is to expose the complex logaddexp to `torch.logaddexp`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95717 Approved by: https://github.com/lezcano	2023-03-01 20:37:46 +00:00
Brian Hirsh	84e2d957a1	fix primtorch handling for sub.scalar with alpha and float64 arg (#95421 ) This fixes the primtorch issue stemming from https://github.com/pytorch/pytorch/issues/95181 Pull Request resolved: https://github.com/pytorch/pytorch/pull/95421 Approved by: https://github.com/ngimel, https://github.com/SherlockNoMad	2023-02-28 00:24:38 +00:00
Edward Z. Yang	4833e47feb	Add support for nonzero, some improvements to reduce guards (#95387 ) This takes the strategy described in https://docs.google.com/document/d/1lFRYAJo5nrfxRhwIzGnfi2pbLpU6T4ytSRSuLJ5qebI/edit# It is essentially https://github.com/pytorch/pytorch/pull/95222 but squashed and with changes that are unnecessary given that we assume nonzero returns > 1. What's in the PR: * nonzero now supports meta propagation. When `capture_dynamic_output_shape_ops`, it will return a tensor with an unbacked SymInt representing the size in question. * The unbacked SymInt is UNSOUNDLY assumed to be not equal to 0/1. We will still error if you guard otherwise. * PrimTorch pointwise operators are updated to use empty_permuted, to avoid guarding on unbacked SymInt from empty_strided (tested in `test_dynamic_pointwise_scalar`) * Convolution is updated to skip backend selection if batch is unbacked, to avoid guarding on unbacked SymInt (tested in `test_unbacked_batch_resnet`) * I kept the helper utilities like `definitely_true` for working with possibly unbacked SymInts. They're not used right now but maybe someone will find them useful. * Added `constrain_unify` to let you specify two unbacked SymInts must have the same value Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95387 Approved by: https://github.com/voznesenskym	2023-02-24 00:27:45 +00:00
Peter Bell	bc438af6fe	std/var: support floating point correction value (#94073 ) Ref https://github.com/pytorch/pytorch/issues/61492#issuecomment-1413003480 The array API specifies correction to be `Union[int, float]` while we currently only support integers. https://data-apis.org/array-api/latest/API_specification/generated/array_api.std.html As std/var is calculated currently, the final count of elements is already done in floating point so we can make the correction floating point without any loss of precision or generality. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94073 Approved by: https://github.com/ezyang	2023-02-23 05:50:45 +00:00
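As a rough illustration of the floating-point `correction` described in the commit above, here is a minimal pure-Python sketch. The helper name `variance` and the choice to raise on a non-positive denominator are assumptions of this sketch, not PyTorch's implementation.

```python
def variance(values, correction=1.0):
    """Variance with a correction that may be an int or a float.

    Hypothetical helper for illustration only. The denominator is
    len(values) - correction, computed in floating point.
    """
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    denominator = n - correction
    if denominator <= 0:
        # Sketch-only choice: refuse instead of returning inf/nan.
        raise ValueError("correction must be smaller than the element count")
    return squared_deviations / denominator


data = [1.0, 2.0, 3.0, 4.0]
population = variance(data, correction=0)     # divide by N
sample = variance(data, correction=1)         # Bessel's correction
fractional = variance(data, correction=1.5)   # non-integer correction
```

Because the element count is already converted to floating point before the division, accepting a fractional correction costs nothing in precision.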
Peter Bell	640b9c80f9	[primTorch] Redefine prim.collapse{,_view} end point to be inclusive (#92017 ) This makes `prims.collapse(a, start, end)` match the behavior of `torch.flatten(a, start, end)` more closely. Pull Request resolved: https://github.com/pytorch/pytorch/pull/92017 Approved by: https://github.com/mruberry	2023-02-21 20:36:50 +00:00
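The inclusive end point described in the commit above can be sketched on plain shape tuples. `collapse_shape` is a hypothetical helper that only handles non-negative dimension indices; it is not the `prims.collapse` implementation.

```python
from math import prod


def collapse_shape(shape, start, end):
    """Shape after merging dims start..end, with end inclusive.

    Hypothetical helper: mirrors how a flatten over (start, end) would
    change the shape. Assumes 0 <= start <= end < len(shape).
    """
    if not 0 <= start <= end < len(shape):
        raise ValueError("expected 0 <= start <= end < len(shape)")
    merged = prod(shape[start:end + 1])
    return shape[:start] + (merged,) + shape[end + 1:]


# Dims 1 and 2 are merged; dim 3 is left alone because end is inclusive.
print(collapse_shape((2, 3, 4, 5), 1, 2))
```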
Edward Z. Yang	ce950b412f	Reland "Add torch.empty_permuted (#95069 )" (#95208 ) This reverts commit `92e03cd583`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95208 Approved by: https://github.com/albanD	2023-02-21 18:02:48 +00:00
PyTorch MergeBot	92e03cd583	Revert "Add torch.empty_permuted (#95069 )" This reverts commit `bedeb1f014`. Reverted https://github.com/pytorch/pytorch/pull/95069 on behalf of https://github.com/jeanschmidt due to Breaking internal builds. More in https://fburl.com/phabricator/ztrxrroq	2023-02-21 12:05:20 +00:00
Edward Z. Yang	bedeb1f014	Add torch.empty_permuted (#95069 ) torch.empty_permuted is a generalized version of torch.empty(memory_format=...), where you can pass an arbitrary physical layout as a tuple of dims to allow you to set up dense, non-overlapping tensors with non-standard memory format. Check the docblock for a full description of semantics. The initial motivation for this PR is with guard-less unbacked SymInts. Traditionally, the way we allocate dense tensors with arbitrary layout is with `empty_strided`. However, `empty_strided` does not know that the given strides are actually contiguous, and must test this manually to find out if it is the case. With `empty_permuted`, this is known statically to be the case and helps us skip some 0/1 guards. However, I also think torch.empty_permuted is a useful API in its own right. It is technically possible to simulate this with an empty and a permute; however, there are some downsides: * The manual incant is tricky to work out. To allocate an NHWC tensor, the invocation is `torch.empty(N, H, W, C).permute(0, 3, 1, 2)`; the permute call has to take NHWC to NCHW, and is the inverse of the permutation people are typically thinking of when they talk about NHWC (0, 2, 3, 1). Instead, torch.empty_permuted lets you say `torch.empty_permuted((N, C, H, W), (0, 2, 3, 1))`, letting you provide the intuitive permutation. It can literally be read off as NHWC if you assign N=0, C=1, H=2, W=3. * An empty(requires_grad=True).permute() is no longer a leaf tensor. You can force it to be a leaf with a detach(), but it is more straightforward and less error prone to allow directly allocating a tensor with the correct permutation. It is also technically possible to simulate this with empty_strided. However, this requires the user to manually compute the contiguous output strides and is bad from a reduction of guards perspective. 
For what it's worth, this is one of the more common uses of as_strided in the wild, and it would be nice to get rid of it. A nice enhancement of this feature would be to accept `physical_layout` anywhere `memory_format` is accepted. However, this would be a pretty involved change, so I'm doing the easy thing instead. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95069 Approved by: https://github.com/malfet, https://github.com/ngimel, https://github.com/albanD, https://github.com/dagitses	2023-02-20 00:23:10 +00:00
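To make the `physical_layout` argument described in the commit above concrete, the following pure-Python sketch derives the strides such a layout implies. `permuted_strides` is a hypothetical helper, and reading the layout as "outermost to innermost dimension in memory" is an assumption based on the NHWC example in the message.

```python
def permuted_strides(size, physical_layout):
    """Strides of a dense tensor whose dims are stored in the given order.

    Hypothetical helper: physical_layout lists logical dims from the
    outermost (largest stride) to the innermost (stride 1).
    """
    if sorted(physical_layout) != list(range(len(size))):
        raise ValueError("physical_layout must be a permutation of the dims")
    strides = [0] * len(size)
    running = 1
    # Walk from the innermost dim outwards, accumulating the element count.
    for dim in reversed(physical_layout):
        strides[dim] = running
        running *= size[dim]
    return tuple(strides)


N, C, H, W = 2, 3, 4, 5
# Logical shape stays (N, C, H, W); memory order is N, H, W, C.
print(permuted_strides((N, C, H, W), (0, 2, 3, 1)))
# The identity layout gives ordinary contiguous strides.
print(permuted_strides((N, C, H, W), (0, 1, 2, 3)))
```

Since the strides are derived from a permutation, they are dense and non-overlapping by construction, which is the property the message says `empty_strided` has to re-check at run time.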
kshitij12345	06489a3c1c	[functorch] roll : fix batching rule for scalar tensor (#95048 ) Fixes https://github.com/pytorch/pytorch/issues/94925 Pull Request resolved: https://github.com/pytorch/pytorch/pull/95048 Approved by: https://github.com/Skylion007, https://github.com/ngimel	2023-02-19 09:30:30 +00:00
Edward Z. Yang	ef5de0a4cf	Don't use PrimTorch decomposition for empty (#94512 ) This PR removes the unnecessary == 0 guard when constructing empty tensors, by ensuring that when we create a contiguous tensor we go directly to the C++ torch.empty implementation (instead of indirecting through empty_strided), where we can bypass doing zero tests when computing the size of the storage. This probably also speeds up trace time. When I did this, I found out that `empty_tensor_restride_symint` was flagrantly wrong (we had never exercised it before because we redirected to `empty_strided` in PrimTorch decomp, which doesn't hit this codepath.) The bugs: * Stride computation was wrong (only `last_idx` was ever written to) * Using set_sizes_and_strides with `sym_sizes` input doesn't work, because there is some sort of ordering problem where `clone_symvec` isn't safe when you clone a vector into itself. Probably should fix this. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/94512 Approved by: https://github.com/ngimel	2023-02-16 16:04:41 +00:00
min-jean-cho	b6df987671	[Inductor] Added aten.normal_ decomp (#91207 ) Fixes #91085 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91207 Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano	2023-02-15 21:21:46 +00:00
min-jean-cho	22e2fd554c	OpInfo for aten.exponential, Add check for dtype, parameter in decomp ref (#92709 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/92709 Approved by: https://github.com/lezcano	2023-02-14 10:11:07 +00:00
Fabio Rocha	1dbaa5c290	Use decompositions for some fallbacks introduced in #94039 (#94206 ) In some cases, implements required inductor primitives. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94206 Approved by: https://github.com/jansel, https://github.com/ngimel	2023-02-14 09:31:30 +00:00
Aaron Gokaslan	67d9790985	[BE] Apply almost all remaining flake8-comprehension checks (#94676 ) Applies the remaining flake8-comprehension fixes and checks. This change replaces all remaining unnecessary generator expressions with list/dict/set comprehensions, which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as `set(a for a in b)`, resolving it into just the set call. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676 Approved by: https://github.com/ezyang	2023-02-12 01:01:25 +00:00
Fabio Rocha	e116ca93e1	Run test_torchinductor*.py with implicit_fallbacks=False (#94039 ) This way it errors out for ops that don't have decomps and requires you to add explicit fallbacks to lowering.py Turns out there are a lot, and this commit adds them as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94039 Approved by: https://github.com/lezcano, https://github.com/jansel, https://github.com/ngimel	2023-02-10 18:10:56 +00:00
Xuehai Pan	69e0bda999	[BE] Import `Literal`, `Protocol`, and `Final` from standard library `typing` as of Python 3.8+ (#94490 ) Changes: 1. `typing_extensions -> typing-extensions` in dependency. Use dash rather than underscore to fit the [PEP 503: Normalized Names](https://peps.python.org/pep-0503/#normalized-names) convention. ```python import re def normalize(name): return re.sub(r"[-_.]+", "-", name).lower() ``` 2. Import `Literal`, `Protocol`, and `Final` from standard library as of Python 3.8+ 3. Replace `Union[Literal[XXX], Literal[YYY]]` with `Literal[XXX, YYY]`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94490 Approved by: https://github.com/ezyang, https://github.com/albanD	2023-02-09 19:17:49 +00:00
min-jean-cho	81853354c3	added aten.log_normal_ decomp (#91674 ) Fixes #91275 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91674 Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano	2023-02-09 18:34:25 +00:00
min-jean-cho	92f569fe11	[Inductor] added aten.geometric_ decomp (#91672 ) Fixes #91671 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91672 Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano	2023-02-09 07:29:14 +00:00
min-jean-cho	66ae3aa096	[Inductor] added aten.cauchy_ decomp (#92047 ) Fixes #91675 TODO: compare perf of decomposed tan --vs-- libdevice tan, aten tan for triton, cpp backends Pull Request resolved: https://github.com/pytorch/pytorch/pull/92047 Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano, https://github.com/ngimel	2023-02-09 00:02:56 +00:00
Peter Bell	819990f595	[decomp] Decompose std/std_mean into aten.var/var_mean (#94072 ) These are currently decomposed into prims.var which is less useful for inductor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94072 Approved by: https://github.com/lezcano	2023-02-06 10:22:07 +00:00
Natalia Gimelshein	3c79ea2607	Removes stray print (#94079 ) Pertitle Pull Request resolved: https://github.com/pytorch/pytorch/pull/94079 Approved by: https://github.com/voznesenskym	2023-02-03 21:56:45 +00:00
Peter Bell	77acb556e6	[primTorch] Rewrite nan_to_num ref in terms of aten functions (#93952 ) This de-duplicates `_refs.nan_to_num` with the inductor decomposition and simplifies it to not reimplement `isnan`, `isposinf` and `isneginf`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/93952 Approved by: https://github.com/lezcano	2023-02-03 13:51:37 +00:00
Peter Bell	72385bbd03	[primTorch] Rewrite is{,pos,neg}inf refs in terms of aten functions (#93951 ) `isposinf` and `isneginf` currently fallback in inductor. Here, I enable the existing decompositions to work with inductor. `isinf` can also be written with aten functions, however I don't add it to inductor's decompositions because `isinf` is lowered to `tl.libdevice.isinf` in triton. Pull Request resolved: https://github.com/pytorch/pytorch/pull/93951 Approved by: https://github.com/lezcano	2023-02-03 13:51:37 +00:00
Peter Bell	5817695bfa	[pt2] Fix arange to match ATen behavior (#93353 ) Fixes #92676 `arange` infers the output dtype from the argument types, but in order to reduce falling back to ATen, inductor preferred to cast whole number float arguments to int which gave the wrong output dtype. Instead, this decomposes floating point arange into the prim equivalent for integers. This also changes the signature of `prims.arange` to ```python prims.iota(length, *, start, step, **factory_kwargs) ``` which only supports integer arguments. This is done because calculating the output size from `start, end, step` is surprisingly complex and liable to off-by-one errors so should not be duplicated in each backend. Pull Request resolved: https://github.com/pytorch/pytorch/pull/93353 Approved by: https://github.com/ngimel, https://github.com/lezcano	2023-02-03 00:44:32 +00:00
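The split described in the commit above (compute the length once, then hand an integer-only primitive a `length`, `start` and `step`) can be sketched in pure Python. Both functions below are hypothetical stand-ins that return lists and handle integers only; they are not the `prims.iota` implementation.

```python
def arange_length(start, end, step):
    """Number of elements in the half-open integer range [start, end).

    Hypothetical helper: ceil((end - start) / step) computed with integer
    arithmetic only, clamped at zero for empty ranges.
    """
    if step == 0:
        raise ValueError("step must be non-zero")
    return max(0, -(-(end - start) // step))


def iota(length, *, start, step):
    """Integer-only stand-in: start, start + step, ... (length items)."""
    return [start + i * step for i in range(length)]


# The backend only ever sees a length, never the end point.
print(iota(arange_length(0, 10, 3), start=0, step=3))
print(iota(arange_length(5, 0, -2), start=5, step=-2))
```

Keeping the length computation in one place means the rounding direction and the negative-step case are decided once, which is the off-by-one risk the message refers to.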
Edward Z. Yang	37fcc53096	Remove import cycle from torch._refs.nn.functional (#93948 ) This makes it possible to import torch._refs from torch._subclasses.fake_tensor Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/93948 Approved by: https://github.com/albanD	2023-02-02 21:06:37 +00:00
XiaobingSuper	db87396474	inductor: align the decomposition output stride with none-decomposition path for torch.lerp (#93336 ) As title, we need to align the decomposition output stride with the none-decomposition path for torch.lerp. And also enable its lowering path for inductor. After this PR for the following case: ``` def fn(i0, i1): # i0: (10, 3, 10) # i1: (3, 10, 10) x1 = i0.transpose(-2, -3) #y = torch.lerp(x1, x1, 70000) z = torch.lerp(i1, x1, 70000) return z x0 = torch.rand(10, 3, 10) x1 = torch.rand(3, 10, 10) ret_eager = fn(x0, x1) print('==== Eager mode OK! ====') compiled = torch.compile(fn, fullgraph=True) ret_compiled = compiled(x0, x1) print('==== compile mode OK! ====') ret_compiled = compiled(x0, x1) print(torch.equal(ret_eager, ret_compiled)) print(ret_eager.stride()==ret_compiled.stride()) ``` the inductor output code will be like(CPU): ``` from ctypes import c_void_p, c_long import torch import random from torch import empty_strided, as_strided, device from torch._inductor.codecache import AsyncCompile from torch._inductor.select_algorithm import extern_kernels aten = torch.ops.aten assert_size_stride = torch._C._dynamo.guards.assert_size_stride async_compile = AsyncCompile() kernel_cpp_0 = async_compile.cpp(''' #include "/tmp/torchinductor_xiaobing/77/c7773nj5pwikpmm2pwa62rcudlf7p3if7eyqb5k4sjsvewwje4le.h" extern "C" void kernel(const float* __restrict__ in_ptr0, const float* __restrict__ in_ptr1, float* __restrict__ out_ptr0) { { #pragma GCC ivdep for(long i0=0; i0<3; i0+=1) { #pragma GCC ivdep for(long i1=0; i1<10; i1+=1) { for(long i2=0; i2<0; i2+=1) { auto tmp7 = at::vec::Vectorized<float>::loadu(in_ptr0 + (10*i0) + (16*i2) + (30*i1)); auto tmp8 = at::vec::Vectorized<float>::loadu(in_ptr1 + (10*i1) + (16*i2) + (100*i0)); auto tmp0 = at::vec::Vectorized<float>(static_cast<float>(70000.0)); auto tmp1 = tmp0.abs(); auto tmp2 = at::vec::Vectorized<float>(static_cast<float>(0.5)); auto tmp3 = tmp1 >= tmp2; auto tmp4 = 
at::vec::Vectorized<float>(static_cast<float>(1)); auto tmp5 = tmp0 - tmp4; auto tmp6 = decltype(tmp5)::blendv(tmp0, tmp5, tmp3); auto tmp9 = tmp7 - tmp8; auto tmp10 = tmp6 * tmp9; auto tmp11 = decltype(tmp7)::blendv(tmp8, tmp7, tmp3); auto tmp12 = tmp10 + tmp11; tmp12.store(out_ptr0 + (10*i1) + (16*i2) + (100*i0)); } #pragma omp simd simdlen(8) for(long i2=0; i2<10; i2+=1) { auto tmp7 = in_ptr0[i2 + (10*i0) + (30*i1)]; auto tmp8 = in_ptr1[i2 + (10*i1) + (100*i0)]; auto tmp0 = static_cast<float>(70000.0); auto tmp1 = std::abs(tmp0); auto tmp2 = static_cast<float>(0.5); auto tmp3 = tmp1 >= tmp2; auto tmp4 = static_cast<float>(1); auto tmp5 = tmp0 - tmp4; auto tmp6 = tmp3 ? tmp5 : tmp0; auto tmp9 = tmp7 - tmp8; auto tmp10 = tmp6 * tmp9; auto tmp11 = tmp3 ? tmp7 : tmp8; auto tmp12 = tmp10 + tmp11; out_ptr0[i2 + (10*i1) + (100*i0)] = tmp12; } } } } } ''') async_compile.wait(globals()) del async_compile def call(args): arg0_1, arg1_1 = args args.clear() buf1 = empty_strided((3, 10, 10), (100, 10, 1), device='cpu', dtype=torch.float32) kernel_cpp_0(c_void_p(arg0_1.data_ptr()), c_void_p(arg1_1.data_ptr()), c_void_p(buf1.data_ptr())) del arg0_1 del arg1_1 return (buf1, ) if __name__ == "__main__": from torch._dynamo.testing import rand_strided from torch._inductor.utils import print_performance arg0_1 = rand_strided((10, 3, 10), (30, 10, 1), device='cpu', dtype=torch.float32) arg1_1 = rand_strided((3, 10, 10), (100, 10, 1), device='cpu', dtype=torch.float32) print_performance(lambda: call([arg0_1, arg1_1])) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/93336 Approved by: https://github.com/jansel	2023-02-02 07:40:28 +00:00
Sherlock Huang	438f12d91a	Rewrite some decomps to allow producing aten ops (#93099 ) This introduces a new stop to the decomposition train. Before reaching prims.view_of, it will stop at aten.alias. Export path wants to get off the train at aten ops. Pull Request resolved: https://github.com/pytorch/pytorch/pull/93099 Approved by: https://github.com/ngimel	2023-01-31 17:46:20 +00:00
Peter Bell	5644059489	[inductor] Lower torch.exp2 and use it for torch.pow(2, x) (#92632 ) Before ```python tmp0 = 2.0 tmp2 = tl.libdevice.pow(tmp0, tmp1) ``` After ```python tmp1 = tl.libdevice.exp2(tmp0) ``` I've benchmarked on CPU and CUDA with the following examples ``` @torch._dynamo.optimize() def exp2(x): return torch.pow(2, x) @torch._dynamo.optimize() def logaddexp2(a, b): m = torch.maximum(a, b) return m + torch.log2(1 + torch.pow(2, -torch.abs(a-b))) ``` triton is able to specialize `pow(2, x)` such that this makes no difference, but on CPU I see a surprisingly large speedup.

| device | Function  | Master (us) | This PR (us) | Speedup |
|--------|-----------|-------------|--------------|---------|
| CUDA   | exp2      | 64          | 63           | 1.0     |
|        | logaddexp | 109         | 107          | 1.0     |
| CPU    | exp2      | 220         | 40           | 5.5     |
|        | logaddexp | 282         | 140          | 2.0     |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92632 Approved by: https://github.com/lezcano, https://github.com/ngimel	2023-01-20 22:06:23 +00:00
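The `logaddexp2` benchmark function in the commit above relies on a standard rewrite that avoids overflow. A scalar pure-Python sketch of the same formula follows; it uses `math` rather than torch and is only meant to show why the rewritten form is preferred over the direct one.

```python
import math


def logaddexp2(a, b):
    """log2(2**a + 2**b) computed without forming 2**a or 2**b.

    Scalar stand-in for the benchmark function in the commit message:
    factor out the larger exponent so the remaining power is at most 1.
    """
    m = max(a, b)
    return m + math.log2(1.0 + 2.0 ** (-abs(a - b)))


# Agrees with the direct formula where the direct formula is safe.
print(logaddexp2(1.0, 3.0), math.log2(2.0 ** 1 + 2.0 ** 3))
# Still works where 2.0 ** 2000 would raise OverflowError.
print(logaddexp2(2000.0, 2000.0))
```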
Peter Bell	dd760c98f8	[decomp] Use new squeeze.dims overload in decompositions (#91602 ) This removes the now-redundant `_squeeze_multiple` helpers and instead decomposes into a single call to `aten::squeeze.dims` which also has the effect of reducing the lowered graph size in inductor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91602 Approved by: https://github.com/ngimel	2023-01-20 18:08:18 +00:00
Peter Bell	a9f4462847	[primTorch] Remove prims.to_dtype (#92380 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/92380 Approved by: https://github.com/lezcano, https://github.com/ngimel	2023-01-19 12:07:47 +00:00
lezcano	8b861544f9	Remove lowering and decompositions of zero_, zero, zeros_like... in favour of their references (#92071 ) The generated triton code is identical. Pull Request resolved: https://github.com/pytorch/pytorch/pull/92071 Approved by: https://github.com/ngimel	2023-01-18 23:22:36 +00:00
Peter Bell	8770a7ed6f	Decompose more inplace ops (#90967 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/90967 Approved by: https://github.com/anijain2305	2023-01-18 21:07:47 +00:00
Peter Bell	f0b592dae7	Make masked_fill reference traceable (#90972 ) As the comment states, `item()` cannot be used since you can't trace through a scalar. Pull Request resolved: https://github.com/pytorch/pytorch/pull/90972 Approved by: https://github.com/ngimel	2023-01-18 10:54:42 +00:00
min-jean-cho	fb50a4b4ce	[Inductor] added aten.exponential_ decomp (#91673 ) Fixes #91276 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91673 Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano	2023-01-18 09:19:35 +00:00
lezcano	d162c8f92b	Assorted decomposition fixes (#87183 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87183 Approved by: https://github.com/ngimel	2023-01-17 16:53:31 +00:00
lezcano	da58f9eb8f	Rewrite out-of-place decompositions in terms of out-of-place ops (#92003 ) Fixes https://github.com/pytorch/torchdynamo/issues/1863 Pull Request resolved: https://github.com/pytorch/pytorch/pull/92003 Approved by: https://github.com/ngimel	2023-01-17 16:53:27 +00:00
Peter Bell	fb1427ea8f	squeeze: allow squeezing multiple dimensions at once (#89017 ) Ref #70924 This addresses part 1 of the issue, allowing `torch.squeeze` to be passed a tuple of dimensions. e.g. ```python x.squeeze(0).squeeze(0) ``` can now be written ```python x.squeeze((0, 1)) ``` (assuming x has at least 2 dimensions) Pull Request resolved: https://github.com/pytorch/pytorch/pull/89017 Approved by: https://github.com/albanD	2023-01-17 14:20:15 +00:00
David Berard	d7dc1c2fd5	Support zero dimensions in softmax decompositions (#91322 ) The eager implementation of softmax supports computation along zero dimensions, but many of the other implementations did not, including: * decompositions & refs (this was causing dynamo failures) * forward AD for logsumexp * MPS log_softmax_backward This PR handles the `input.numel() == 0` cases separately to avoid running `amax()`, which fails for zero dimensions, and updates opinfos. example of "computation along zero dimensions": ```python # example of where import torch t = torch.rand((4, 0, 0)) print("~") print(torch.nn.functional.softmax(t, dim=-1)) # this passes print("~") torch._refs.softmax(t, dim=-1) # this fails print("~") ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/91322 Approved by: https://github.com/lezcano	2023-01-11 09:35:43 +00:00
Nikita Karetnikov	d1cc64b2ac	[primTorch] Fix masking in `logsumexp` ref (#91941 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/91941 Approved by: https://github.com/ngimel, https://github.com/lezcano	2023-01-10 10:55:04 +00:00
lezcano	138a0188e0	Add support for logaddexp(float16) in CUDA and implement its reference (#91869 ) The reference is implemented so that it generates efficient and numerically stable triton code. Fixes https://github.com/pytorch/pytorch/issues/91683 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91869 Approved by: https://github.com/ngimel	2023-01-10 00:19:24 +00:00
Nikita Karetnikov	00e5f3a9c5	[primTorch] Move `logsumexp` decomp to refs (#91860 ) Fixes #91843. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91860 Approved by: https://github.com/lezcano	2023-01-09 17:00:43 +00:00
Natalia Gimelshein	2c00064113	remove unnecessary decomps (#91828 ) in favor of refs. Generated triton code is the same. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91828 Approved by: https://github.com/lezcano, https://github.com/soumith	2023-01-07 20:37:12 +00:00
PyTorch MergeBot	c73147f741	Revert "[decomp] Use new squeeze.dims overload in decompositions (#91602 )" This reverts commit `9262ffc692`. Reverted https://github.com/pytorch/pytorch/pull/91602 on behalf of https://github.com/clee2000 due to stacked pr was reverted, this is dependent	2023-01-05 20:39:52 +00:00
PyTorch MergeBot	df4b3b13bc	Revert "squeeze: allow squeezing multiple dimensions at once (#89017 )" This reverts commit `e26cb06681`. Reverted https://github.com/pytorch/pytorch/pull/89017 on behalf of https://github.com/mehtanirav due to Internal breakages	2023-01-05 19:25:08 +00:00
Peter Bell	9262ffc692	[decomp] Use new squeeze.dims overload in decompositions (#91602 ) This removes the now-redundant `_squeeze_multiple` helpers and instead decomposes into a single call to `aten::squeeze.dims` which also has the effect of reducing the lowered graph size in inductor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91602 Approved by: https://github.com/ngimel	2023-01-05 17:59:32 +00:00
lezcano	700399e3f1	Make sure the ends of linspace are correct regardless of the precision (#91625 ) This operation is usually called with small sizes, so the fact that this adds a couple of operations should be alright. Even more, given the structure of the data, the branching in the `where` is pretty much free. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91625 Approved by: https://github.com/peterbell10, https://github.com/ngimel	2023-01-05 00:23:19 +00:00
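One way to guarantee exact end points, as the commit above aims to do, is to compute the first half of the sequence forward from `start` and the second half backward from `end`. The pure-Python sketch below assumes that approach; the helper is hypothetical and is not the reference implementation from the PR.

```python
def linspace(start, end, steps):
    """Evenly spaced values whose first and last items are exact.

    Hypothetical helper: the first half is built forward from start and
    the second half backward from end, so neither end point accumulates
    rounding error from the step.
    """
    if steps < 0:
        raise ValueError("steps must be non-negative")
    if steps == 0:
        return []
    if steps == 1:
        return [float(start)]
    step = (end - start) / (steps - 1)
    half = steps // 2
    return [
        start + i * step if i < half else end - (steps - 1 - i) * step
        for i in range(steps)
    ]


values = linspace(0.1, 0.7, 7)
print(values[0], values[-1])
```

The per-element branch plays the role of the `where` mentioned in the message: every element evaluates one short expression, chosen by which half it falls in.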
lezcano	223d1aa692	Improve linspace decomposition and remove its lowering (#91621 ) The code produced by the lowering and the decomposition is now the same modulo a casting to `float32`. This casting is necessary as otherwise the tests do not pass due to accuracy errors. We prefer accuracy over speed here, given that this is an associative scan, and thus it's prone to numerical errors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91621 Approved by: https://github.com/ngimel	2023-01-05 00:23:19 +00:00
Peter Bell	e26cb06681	squeeze: allow squeezing multiple dimensions at once (#89017 ) Ref #70924 This addresses part 1 of the issue, allowing `torch.squeeze` to be passed a tuple of dimensions. e.g. ```python x.squeeze(0).squeeze(0) ``` can now be written ```python x.squeeze((0, 1)) ``` (assuming x has at least 2 dimensions) Pull Request resolved: https://github.com/pytorch/pytorch/pull/89017 Approved by: https://github.com/albanD	2023-01-04 14:40:56 +00:00
Joel Schlosser	8b55b86dbd	Move sym_int and sym_float alongside SymInt / SymFloat in base torch package (#91317 ) This PR moves the definitions for: * `sym_int` * `sym_ceil` (used only for `sym_int`) * `sym_floor` (used only for `sym_int`) * `sym_float` from `torch/fx/experimental/symbolic_shapes.py` to `torch/__init__.py`, where `SymInt` and `SymFloat` are already defined. This removes the need for several in-line imports, and enables proper JIT script gating for #91318. I'm very open to doing this in a better way! Pull Request resolved: https://github.com/pytorch/pytorch/pull/91317 Approved by: https://github.com/ezyang, https://github.com/anijain2305	2022-12-28 16:08:16 +00:00
Brian Hirsh	c47bdd7522	_scatter ops should preserve input stride/storage_offset (#91029 ) It turns out that we *do* need to update *_scatter ops to return the exact same strides as their inputs. I added a test to `test/test_functionalization.py`, which now trips thanks to Ed's functionalization stride debugging check. It only actually ends up tripping silent correctness if you try to .backward() on that function. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91029 Approved by: https://github.com/ezyang	2022-12-22 19:41:53 +00:00
Nikita Shulga	fd3a7264ae	[MPS] Add `group_norm[fwd+backward]` and `mean_var` (take 2) (#91190 ) Use Prims to implement group_norm, group_norm_backward and mean_var Use `torch._ops.ops` instead of `torch.ops` in numerous subpackages in order to be able to make them importable from `torch/backend/mps/__init__.py` as this alias is defined in `15af4b1cee/torch/__init__.py (L1095)` is executed last during init process. Add `__all__` to `torch/backends/mps/__init__.py` as well as alias all imports as private Add `TestNNMPS.test_group_norm_backward` that validates no NaNs are generated during the backward pass Fixes https://github.com/pytorch/pytorch/issues/88331 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91190 Approved by: https://github.com/albanD	2022-12-22 08:54:37 +00:00
PyTorch MergeBot	645eda0a00	Revert "[MPS] Add `group_norm[fwd+backward]` and `mean_var` (#91190 )" This reverts commit `371716eb36`. Reverted https://github.com/pytorch/pytorch/pull/91190 on behalf of https://github.com/kit1980 due to Broke test_correct_module_names because of underscore _ops	2022-12-21 19:37:43 +00:00
Nikita Shulga	371716eb36	[MPS] Add `group_norm[fwd+backward]` and `mean_var` (#91190 ) Use Prims to implement group_norm, group_norm_backward and mean_var Use `torch._ops.ops` instead of `torch.ops` in numerous subpackages in order to be able to make them importable from `torch/backend/mps/__init__.py` as this alias is defined in `15af4b1cee/torch/__init__.py (L1095)` is executed last during init process. Depends on https://github.com/pytorch/pytorch/pull/91203 Fixes https://github.com/pytorch/pytorch/issues/88331 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91190 Approved by: https://github.com/albanD	2022-12-21 17:33:27 +00:00
Nikita Shulga	c8546c930f	[BE] Use `aten` global in `torch._refs` (#91189 ) Similar to pattern used in `torch._decomp` Pull Request resolved: https://github.com/pytorch/pytorch/pull/91189 Approved by: https://github.com/ngimel	2022-12-21 02:28:51 +00:00
soulitzer	98a9235dce	Fix prelu ref when a.ndim < 2 (#89809 ) Fixes https://github.com/pytorch/pytorch/issues/89560 Previously the test case for "input is 1-D or scalar + weight is not scalar" did not exist; adding it introduced some failures: - forward AD (fixed in this PR) - vmap (filed https://github.com/pytorch/pytorch/issues/89895) - ref/meta (fixed this PR, though this also regresses nvFuser support) Pull Request resolved: https://github.com/pytorch/pytorch/pull/89809 Approved by: https://github.com/ngimel	2022-12-12 23:55:31 +00:00
Peter Bell	79406378ae	[primTorch] Add prim and ref for as_strided_scatter (#88426 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88426 Approved by: https://github.com/mruberry	2022-12-08 00:17:39 +00:00
Peter Bell	5caa27a3fd	as_strided: Fix default storage_offset for reference implementation (#89513 ) This fixes the default storage_offset to take it from the input. This was previously untested, so I've also added a new OpInfo which includes samples with non-zero storage_offsets on the input tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89513 Approved by: https://github.com/ezyang, https://github.com/ngimel	2022-12-06 22:39:21 +00:00
Yanbo Liang	25f39c1bce	Fix uniform ref implementation (#90094 ) Fixes https://github.com/pytorch/torchdynamo/issues/1954 Pull Request resolved: https://github.com/pytorch/pytorch/pull/90094 Approved by: https://github.com/ngimel	2022-12-06 21:28:17 +00:00
PyTorch MergeBot	e645771e95	Revert "as_strided: Fix default storage_offset for reference implementation (#89513 )" This reverts commit `ba70a8be03`. Reverted https://github.com/pytorch/pytorch/pull/89513 on behalf of https://github.com/kit1980 due to Broke multiple workflows, 2 unexpected successes for autograd tests	2022-12-06 07:14:16 +00:00
Peter Bell	ba70a8be03	as_strided: Fix default storage_offset for reference implementation (#89513 ) This fixes the default storage_offset to take it from the input. This was previously untested, so I've also added a new OpInfo which includes samples with non-zero storage_offsets on the input tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89513 Approved by: https://github.com/ezyang, https://github.com/ngimel	2022-12-06 04:07:16 +00:00
PyTorch MergeBot	8845a8f899	Revert "as_strided: Fix default storage_offset for reference implementation (#89513 )" This reverts commit `eded97ac72`. Reverted https://github.com/pytorch/pytorch/pull/89513 on behalf of https://github.com/peterbell10 due to broke master	2022-12-05 17:53:23 +00:00
Peter Bell	eded97ac72	as_strided: Fix default storage_offset for reference implementation (#89513 ) This fixes the default storage_offset to take it from the input. This was previously untested, so I've also added a new OpInfo which includes samples with non-zero storage_offsets on the input tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89513 Approved by: https://github.com/ezyang, https://github.com/ngimel	2022-12-05 15:52:49 +00:00
Nikita Karetnikov	0a1a53083e	[primTorch] Enable regex error testing for some refs (#87765 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87765 Approved by: https://github.com/mruberry	2022-11-23 23:36:27 +00:00
Peter Bell	ac19c5be82	FFT: disable dimension wrapping for scalar tensors (#89234 ) Fixes #88985 By default, `maybe_wrap_dim` allows through `dim=0` or `dim=-1` for scalar tensors which leads to an invalid dimension being used to index into `tensor.sizes()` as in the code sample from the issue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89234 Approved by: https://github.com/mruberry	2022-11-23 21:55:00 +00:00
Sergii Dymchenko	504570d577	Delete unused variable assignment in _refs/__init__.py (#89538 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/89538 Approved by: https://github.com/huydhn	2022-11-23 02:59:25 +00:00
Edward Z. Yang	dbeacf1182	Fix cat striding in PrimTorch (#89332 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/89332 Approved by: https://github.com/ngimel	2022-11-20 04:05:33 +00:00
Sherlock Huang	caf3d5319f	Symintify numel(), infer_size, prims.elementwise_meta (#88956 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88956 Approved by: https://github.com/ezyang	2022-11-20 00:42:03 +00:00
PyTorch MergeBot	8ad39536d7	Revert "Symintify numel(), infer_size, prims.elementwise_meta (#88956 )" This reverts commit `ce2f8700ba`. Reverted https://github.com/pytorch/pytorch/pull/88956 on behalf of https://github.com/ezyang due to somehow breaks torch.numel	2022-11-19 21:47:55 +00:00
lezcano	154e58c032	Add most in-place references/decompositions (#88117 ) We add most in-place references in a generic way. We also implement a wrapper to implement the annoying interface that `nn.functional` nonlinearities have. We fix along the way a couple decompositions for some non-linearities by extending the arguments that the references have. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88117 Approved by: https://github.com/mruberry	2022-11-18 14:59:46 +00:00
lezcano	ce0e22a81a	Fix names of some reference functions (#88115 ) The `__name__` field of some binary reference functions was wrong. We fix this to be consistent with unary reference functions. In the future, we should probably make the binary reference wrapper return a wrapper itself to avoid all those calls to `partial`. This change helps performing some homogeneous treatment of functions by their name. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88115 Approved by: https://github.com/mruberry	2022-11-18 14:59:43 +00:00
Kazuaki Ishizaki	1cd6ebe095	Fix typos in messages under torch (#89049 ) This PR fixes typos of messages in `.py` files under torch directory. Only in `torch/onnx/symbolic_opset16.py`, fix a typo in comment to make the operator name correct. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89049 Approved by: https://github.com/lezcano	2022-11-17 04:18:14 +00:00
lezcano	e1ecf53d84	Simplify linspace decomp and increase its tolerance (#87203 ) This is an interesting one Since this is an operation that's intrinsically defined on the reals, we should perform the ops on that dtype always, and just cast to the desired dtype at the end. This simplifies the decomposition. Now, I started looking at this one when I started seeing failures on a test that's added in a later PR. What's going on here is that, by doing an upcast to a higher dtype and then cast down to integers, sometimes there's an off-by-one error. I think this is fine, as the decomposition is more accurate than the original function, which goes in line with the whole PrimTorch effort. Pull Request resolved: https://github.com/pytorch/pytorch/pull/87203 Approved by: https://github.com/mruberry	2022-11-16 17:46:54 +00:00
Sherlock Huang	ce2f8700ba	Symintify numel(), infer_size, prims.elementwise_meta (#88956 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88956 Approved by: https://github.com/ezyang	2022-11-16 03:36:00 +00:00
Khushi Agrawal	f1a5044de0	[primTorch] _refs & opinfo alpha_dropout (#87989 ) Add _refs and OpInfo for `nn.functional.alpha_dropout` Pull Request resolved: https://github.com/pytorch/pytorch/pull/87989 Approved by: https://github.com/mruberry	2022-11-14 18:18:45 +00:00
Natalia Gimelshein	06f1b52705	don't use prims.unsqueeze in group_norm (#88927 ) inductor doesn't have prims.squeeze lowering, so this breaks it. Longer term, `squeeze` with multiple dimensions is not a prim, nvfuser implements it with a loop, inductor uses `_squeeze_multiple` helper which turns it into a loop. Prim should accept only a single dimension. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88927 Approved by: https://github.com/eellison	2022-11-14 17:37:24 +00:00
Nikita Karetnikov	76af71444a	[primTorch] Add ref for `complex` (#88562 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88562 Approved by: https://github.com/ezyang	2022-11-13 20:31:16 +00:00
Nikita Karetnikov	4270bb37da	[primTorch] Improve `narrow` and `narrow_copy`: refs, tests, docs (#87045 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87045 Approved by: https://github.com/mruberry	2022-11-12 15:03:50 +00:00
Sherlock Huang	495e7b1c72	Ref for aten.full; symint changes in prim (#88762 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88762 Approved by: https://github.com/ezyang	2022-11-11 02:32:09 +00:00
Ryan Spring	534ae6ae47	[primTorch] Implement group norm reference (#87054 ) Add group norm reference Split from #81191 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87054 Approved by: https://github.com/mruberry	2022-11-11 01:08:20 +00:00
PyTorch MergeBot	93d3bd626e	Revert "[primTorch] Improve `narrow` and `narrow_copy`: refs, tests, docs (#87045 )" This reverts commit `aa8279bcb8`. Reverted https://github.com/pytorch/pytorch/pull/87045 on behalf of https://github.com/izaitsevfb due to BC-breaking change, D41161182	2022-11-09 20:48:32 +00:00
Nikita Karetnikov	aa8279bcb8	[primTorch] Improve `narrow` and `narrow_copy`: refs, tests, docs (#87045 ) Fixes #87019. Pull Request resolved: https://github.com/pytorch/pytorch/pull/87045 Approved by: https://github.com/mruberry	2022-11-09 09:19:28 +00:00
Edward Z. Yang	860e354d1c	Support diag_embed.out decomposition (#88671 ) This is a little tricky: there is a diag_embed.out, but it's not bound in Python because it's autogenerated, see https://github.com/pytorch/pytorch/issues/88598 So I can't "just" add the out variant to the ref, as this makes it inconsistent with the torch API. To work around this, I mark the ref as supporting out, but not the original function. This is useful to do, because it means that diag_embed.out now supports symbolic shapes. However, this cannot be easily tested because I can't mark the out variant as being supported in the normal OpInfo test. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/88671 Approved by: https://github.com/mruberry	2022-11-08 18:28:36 +00:00
lezcano	1a7c4b0de7	Create _make_alias to preserve the name of a function when creating an alias (#88114 ) Before, we would inherit the name of the aliased function, which was very confusing, and disallowed some homogeneous treatment of references, as we do later in this stack Pull Request resolved: https://github.com/pytorch/pytorch/pull/88114 Approved by: https://github.com/mruberry	2022-11-08 13:09:34 +00:00
Sherlock Huang	95d57b54e0	Handle pin_memory in refs.randn (#88473 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88473 Approved by: https://github.com/mruberry	2022-11-07 20:25:56 +00:00
lezcano	39d9d2ed70	Implement reference for lerp (#87424 ) We follow the vectorised CPU implementation for numerical accuracy Pull Request resolved: https://github.com/pytorch/pytorch/pull/87424 Approved by: https://github.com/ezyang	2022-11-02 11:21:01 +00:00
Sherlock Huang	0a4ca9d083	Fix meta for aten.angle and aten.index_copy (#88066 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88066 Approved by: https://github.com/albanD	2022-10-31 17:11:29 +00:00
Khushi	a3f8495b84	[primTorch fix] use _maybe_convert_to_dtype (#85163 ) Fixes #84561 - [x] fix lint tests cc: @Lezcano!! Pull Request resolved: https://github.com/pytorch/pytorch/pull/85163 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-31 17:08:55 +00:00
Sherlock Huang	5723fd503c	Fix meta function for aten.flip and aten.rot90 (#88065 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88065 Approved by: https://github.com/mruberry	2022-10-31 16:52:05 +00:00
Sherlock Huang	e8a97a3721	FakeTensorMode and Prims.add/sub/mul/div support scalar only inputs (#87759 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87759 Approved by: https://github.com/ngimel, https://github.com/mruberry, https://github.com/eellison	2022-10-28 04:34:25 +00:00
lezcano	fd27246c16	Fix decomposition for std (#87181 ) The previous implementation was lacking a few features and incurred a pretty large error cc @ezyang @mruberry @ngimel @Lezcano @fdrocha Pull Request resolved: https://github.com/pytorch/pytorch/pull/87181 Approved by: https://github.com/ngimel, https://github.com/peterbell10	2022-10-28 00:50:29 +00:00
lezcano	f21d0b310c	Add decomposition for diagonal_scatter (#87282 ) cc @ezyang @mruberry @ngimel @Lezcano @fdrocha Pull Request resolved: https://github.com/pytorch/pytorch/pull/87282 Approved by: https://github.com/mruberry	2022-10-28 00:50:29 +00:00
Sherlock Huang	b21fe312c0	Fix meta for index_add and index_put (#87775 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87775 Approved by: https://github.com/ezyang, https://github.com/ngimel	2022-10-26 20:33:23 +00:00
Sherlock Huang	0b162f5b49	Fix stride for prims.where (#87563 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/87563 Approved by: https://github.com/ngimel, https://github.com/mruberry	2022-10-25 21:22:50 +00:00
Sherlock Huang	ece3758afc	Fix _refs for aten.zeros/ones/empty/randn (#87569 ) refs for aten.zeros/ones/empty/randn doesn't support .names overload. Pull Request resolved: https://github.com/pytorch/pytorch/pull/87569 Approved by: https://github.com/ngimel	2022-10-25 20:06:57 +00:00
Sherlock Huang	eb99c1efce	Prefer python meta function over c++ meta function (#87426 ) This is a policy update for meta registration. We now prefer python meta implementation over C++ meta function. This is a flip of the previous policy, where we preferred C++ meta function over python meta function if they both exist. Here's the meta registration process: 1. register_meta and register_decomposition will place the python meta/decomp functions into the `global_decomp_table`. However, they will NOT register them into dispatcher. 2. After global_decomp_table is populated, we will compile an `active_meta_table`. For a given op, we pick the most specific decomp function from `global_decomp_table` in the preference order of Meta > PostAutograd > PreAutograd. 3. We will unconditionally register all of them into python dispatcher. And register them into C++ dispatcher, unless it is one of the following 3 cases - 1. the op is a CompositeImplicitAutograd, and should rely on decomposed op's meta - 2. the op is a view op, as the MetaTensor doesn't support aliased storage - 3. the op is in the blocklist (due to UT failures, and we will burn down this list op by op) Over the long run, we wish to implement all meta functions in python. With this PR, 321 op_overloads will have cpp meta overridden by python meta. There are still 400 op_overloads using cpp meta. The exact list can be found here https://gist.github.com/SherlockNoMad/d20bb736178df8eebd3b054c8bb7cdc5 cc @ngimel @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang Pull Request resolved: https://github.com/pytorch/pytorch/pull/87426 Approved by: https://github.com/ezyang, https://github.com/jansel	2022-10-25 16:49:02 +00:00
lezcano	faf9c47abb	Simplify a few diagonal-related functions (#87180 ) `diag` was unnecessarily implemented as a kernel rather than as a composite function, which made it unnecessarily difficult (explicit backward + all it entails). We also change a few uses of `diag` on 2D tensors for `diagonal()`. The latter returns a view rather than creating a new tensor. We also upgrade its meta implementation to a fully-fledged decomposition I tried implementing the backwards of `diagonal()` via `diag_scatter` (or better `diag_scatter_` to keep the perf) but functionalisation was failing and I was not sure how to fix this, so I moved on. It may be possible to simplify that one as well if @soulitzer or someone knows how to do this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/87180 Approved by: https://github.com/ngimel, https://github.com/albanD, https://github.com/mruberry	2022-10-24 06:11:53 +00:00
lezcano	08c2314d98	[PrimTorch] Add maker for *_copy variants of view functions (#87278 ) Implements `diagonal_copy` as an example. This PR also fixes a number of correctness issues with `diagonal_copy`. cc @ezyang @mruberry @ngimel @Lezcano @fdrocha Pull Request resolved: https://github.com/pytorch/pytorch/pull/87278 Approved by: https://github.com/mruberry	2022-10-24 06:11:53 +00:00
Ryan Spring	9bb4926de0	Add xlogy and xlog1py references (#77712 ) * Add reference implementations for `xlogy` and `xlog1py` * Replace `_wrap_scalar` helper function with `scalar_tensor` prim Pull Request resolved: https://github.com/pytorch/pytorch/pull/77712 Approved by: https://github.com/mruberry	2022-10-22 17:59:25 +00:00
Edward Z. Yang	d73d4aa7de	Audit for error prone isinstance int/float and add lint (#87345 ) We recently fixed a bug on symbolic-shapes branch where an isinstance(x, int) test failed when passed a SymIntNode. To prevent this, I've added a lint for all the codepaths where we may pass SymInt/SymFloat directly to reject direct isinstance int/float tests, and instead use one of the aliases. The lint rule explains the options. I then go and fix all of them. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/87345 Approved by: https://github.com/bdhirsh, https://github.com/albanD	2022-10-21 15:55:24 +00:00
Nikita Karetnikov	1b8af28fe8	[primTorch] Add refs for `softmax`, `softmin`, `log_softmax` (#84956 ) cc @ezyang @mruberry @ngimel @Lezcano @fdrocha Pull Request resolved: https://github.com/pytorch/pytorch/pull/84956 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-20 12:29:04 +00:00
PyTorch MergeBot	cd21613526	Revert "[primTorch] Add refs for `softmax`, `softmin`, `log_softmax` (#84956 )" This reverts commit `c09ca93e47`. Reverted https://github.com/pytorch/pytorch/pull/84956 on behalf of https://github.com/ZainRizvi due to This is causing the MPS test test_output_match_log_softmax_with_dtype_cpu_float32 (__main__.TestConsistencyCPU) to fail	2022-10-19 20:36:55 +00:00
Nikita Karetnikov	c09ca93e47	[primTorch] Add refs for `softmax`, `softmin`, `log_softmax` (#84956 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84956 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-19 18:45:40 +00:00
Nikita Karetnikov	b886cd15f5	[primTorch] Add a ref for NumPy-style `T` (#86850 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86850 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-18 10:19:47 +00:00
Nikita Karetnikov	841995d53b	[primTorch] Add refs for data conversion ops (#86561 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86561 Approved by: https://github.com/lezcano, https://github.com/mruberry, https://github.com/zou3519	2022-10-18 08:38:51 +00:00
Nikita Karetnikov	91b3cd0b5a	[primTorch] Add a ref for `narrow_copy` (#86748 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86748 Approved by: https://github.com/mruberry	2022-10-17 10:16:05 +00:00
Ryan Spring	847ded6db3	[primTorch] Implement NLL loss reference (#81128 ) Add Reference: - nll_loss Depends on: - expand https://github.com/pytorch/pytorch/pull/79820 - advance indexing Pull Request resolved: https://github.com/pytorch/pytorch/pull/81128 Approved by: https://github.com/mruberry	2022-10-17 06:20:31 +00:00
Nikita Karetnikov	4460e40db4	[primTorch] Add a ref for `addcmul` (#86731 ) Based on: https://github.com/pytorch/pytorch/pull/79827 https://github.com/pytorch/pytorch/pull/72949 Pull Request resolved: https://github.com/pytorch/pytorch/pull/86731 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-14 14:26:23 +00:00
Will Constable	b97ae59e29	Change legacy wrap_dim to work with symint == (#86842 ) - previously, sizes == vector<T>({0}) failed to hit SymInt::operator==, causing the loop to bail out too early and make an invalid call to downstream maybe_wrap_dim helper Pull Request resolved: https://github.com/pytorch/pytorch/pull/86842 Approved by: https://github.com/Chillee, https://github.com/malfet, https://github.com/albanD	2022-10-13 15:10:46 +00:00
Brian Hirsh	e17732b234	[test] add cross-ref tests for python meta kernels (#86228 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86228 Approved by: https://github.com/albanD	2022-10-13 14:14:26 +00:00
Khushi Agrawal	77d29bcee2	[primTorch] special: ndtr, ndtri, log_ndtr, erfcx (#86077 ) - Adds prims and _refs for `erfcx` and `ndtri`. - Adds _refs for `ndtr`, and `log_ndtr`. cc @kshitij12345 @lezcano @mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/86077 Approved by: https://github.com/mruberry	2022-10-13 01:18:30 +00:00
Nikita Karetnikov	d56017a14f	[primTorch] Add ref for `triplet_margin_loss`, improve `triplet_margin_with_distance_loss` (#85614 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85614 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-12 18:37:58 +00:00
Ivan Yashchuk	cd7c86eaa4	Add prims.clone (#86705 ) This simple PR adds `clone` as a primitive. Current implementation of `clone` is not supported with nvFuser executor because of `empty_like` + `copy_to`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/86705 Approved by: https://github.com/mruberry	2022-10-12 18:22:00 +00:00
Fabio Rocha	493ded249e	[primTorch] decomposition for bucketize (#86366 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86366 Approved by: https://github.com/mruberry	2022-10-12 12:25:42 +00:00
Khushi	2344135179	[primTorch] special: entr, expit (#86592 ) Add _refs for `entr` & `expit`. cc @mruberry @kshitij12345! Pull Request resolved: https://github.com/pytorch/pytorch/pull/86592 Approved by: https://github.com/mruberry	2022-10-12 07:00:40 +00:00
Sherlock Huang	8a3a54e012	Fix index_select decomp (#86469 ) For decomposing index_select with 0-dim tensor, we cannot write `x.unsqueeze(0)[index].squeeze(0).clone()` , as tensor[index] will trigger index.item() if index is a 0-dim tensor, and .item() cannot be symbolically traced with FakeTensor. We use `torch.ops.aten.index(x.unsqueeze(0), [index]).squeeze(0).clone()` as a workaround. Pull Request resolved: https://github.com/pytorch/pytorch/pull/86469 Approved by: https://github.com/ngimel	2022-10-07 22:59:49 +00:00

559 Commits