Commit Graph

278 Commits

Author SHA1 Message Date
Yanbo Liang
dc58259746 [Inductor] [FX passes] Group linear fusion (#105116)
Summary:
This adds a draft version of a group + batch fusion framework, along with the group linear fusion implementation.
In the future, it should be straightforward to add a new group/batch fusion policy by defining a class with `match` + `fuse` functions (see the sketch below).
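
For illustration, a minimal sketch of what such a policy could look like under the `match` + `fuse` contract; the class and method signatures here are hypothetical, not the actual framework API:

```python
import torch

class GroupLinearFusion:
    """Hypothetical fusion policy: match candidate nodes, then fuse the group."""

    def match(self, node):
        # Decide whether this FX node can join the current fusion group,
        # e.g. an F.linear call whose inputs are mutually batchable.
        return (
            node.op == "call_function"
            and node.target is torch.nn.functional.linear
        )

    def fuse(self, graph, matched_nodes):
        # Rewrite the matched nodes into a single grouped/batched linear op.
        ...
```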

Test Plan: buck2 test 'fbcode//mode/dev-nosan' fbcode//caffe2/test/inductor:group_batch_fusion

Differential Revision: D46956695

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105116
Approved by: https://github.com/jansel
2023-07-18 03:56:42 +00:00
ekamiti
32d422f335 Make adding buffers more like adding parameters (#104069)
Add semantics for creating a buffer object that mirror creating a parameter. This is done by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same, as the `register_buffer` method has not been changed. The `persistent` parameter on the `Buffer` type indicates whether a buffer object should be persistent or not. The other non-test changes have to do with getting the new `Buffer` type recognized by inductor and dynamo. The remaining changes are test changes to make sure that the `Buffer` type can be used as a drop-in replacement for `register_buffer`, as it just leads to `register_buffer` being called. This new functionality still allows normal tensors to be used as buffers, so these changes are intended to be backwards compatible.
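
A minimal sketch of the intended usage, assuming the new type is exposed as `torch.nn.Buffer`:

```python
import torch
from torch import nn

class Stats(nn.Module):
    def __init__(self):
        super().__init__()
        # Assigning a Buffer registers it, mirroring nn.Parameter semantics;
        # under the hood this still goes through register_buffer.
        self.running_mean = nn.Buffer(torch.zeros(4), persistent=True)

m = Stats()
print("running_mean" in dict(m.named_buffers()))  # True
```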

Fixes #35735

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104069
Approved by: https://github.com/mikaylagawarecki
2023-07-17 17:59:05 +00:00
Edward Z. Yang
1152e86da1 Transmute refined SymInt into int (#104828)
Previously, x.size(0) could return a SymInt, even when the internal
sympy expression was actually already constant (e.g., due to an
introduced guard). We now allow querying the Python object with
maybe_as_int, which lets us transmute these objects back to int
when possible.

It is still possible to end up with a constant SymInt even after this
change, e.g., if you get out a SymInt and while holding onto it
specialize it, but casual users are more likely to get ints when they
want to.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104828
Approved by: https://github.com/Skylion007
2023-07-15 18:46:10 +00:00
kshitij12345
d552c271db [pt2] grad support (#102264)
Teach dynamo about grad

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102264
Approved by: https://github.com/zou3519
2023-06-21 10:13:09 +00:00
PyTorch MergeBot
e737a8486f Revert "[pt2] grad support (#102264)"
This reverts commit 85b83954c8.

Reverted https://github.com/pytorch/pytorch/pull/102264 on behalf of https://github.com/huydhn due to This is failing in trunk 85b83954c8 and looks like a landrace ([comment](https://github.com/pytorch/pytorch/pull/102264#issuecomment-1600001309))
2023-06-21 03:02:55 +00:00
kshitij12345
85b83954c8 [pt2] grad support (#102264)
Teach dynamo about grad

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102264
Approved by: https://github.com/zou3519
2023-06-21 01:37:08 +00:00
Kurt Mohler
ee83c646bb Replace _prims_common.check with torch._check* (#103240)
This relands most of the changes from #102219, which were backed out by #103128. However, instead of removing `_prims_common.check`, it adds a warning and a comment mentioning that it will be removed in the future and that `torch._check*` should be used instead. As mentioned in https://github.com/pytorch/pytorch/pull/103128#pullrequestreview-1466414415, `_prims_common.check` cannot yet be removed because of some internal usage.
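
A minimal sketch of the replacement pattern (`torch._check` takes a condition and an optional callable producing the error message, raising `RuntimeError` when the condition is False):

```python
import torch

def my_op(x: torch.Tensor) -> torch.Tensor:
    # Previously: _prims_common.check(cond, lambda: msg)
    torch._check(x.dim() == 2, lambda: f"expected a 2D tensor, got {x.dim()}D")
    return x.t()
```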

Part of #72948

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103240
Approved by: https://github.com/albanD
2023-06-21 00:46:17 +00:00
Richard Zou
08a054649c [operator_compile_check] Add FakeTensor testing (#103595)
This PR adds dedicated FakeTensor testing to operator_compile_check. We
reuse CrossRefFakeMode to do this and improve the error messages on it.

Note that this only really runs detailed tests for operators that do not
have data-dependent output shape. In the future we should add something
like a dynamic CrossRefFakeMode.

Test Plan:
- existing tests (these now have improved error messages).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103595
Approved by: https://github.com/ezyang, https://github.com/soulitzer
2023-06-16 16:55:51 +00:00
xuanqi
b27c3558a4 [RFC]: Create aten native op for constrain_range (#103346)
At a high level, the current implementation of the constraint functions (`constrain_as_*`) raises an exception for the following code snippet:
```
def f(x):
    a = x.item()
    constrain_as_size(a, 4, 7)
    return torch.empty((a, 4))

inp = torch.tensor([5])
ep = torch._export.export(f, (inp,))
```

The reason is as follows:
1) The current constraint logic is purely Python, so it won't survive AOT export (the full node is gone after AOT export, since AOT export only maintains aten-level ops).
2) It relies on a side effect to add range constraints to the traced symbol's shape env ([code](9591e52880/torch/fx/experimental/symbolic_shapes.py (L370-L372))).
3) If runtime assertions are turned on (the default), [`_AddRuntimeAssertionsForConstraintsPass`](9591e52880/torch/_export/passes/add_runtime_assertions_for_constraints_pass.py (L98-L100)) will try to append assertion nodes based on the range constraints extracted from each symbol's shape env during another interpretation round.
4) However, because of 1), the range constraint logic won't run for symbols generated during the AOT export round, so no range constraint information is available for the assertion round, which causes the issue.
5) As a result, export fails at `torch.empty((a, 4))` (there is no constraint saying `a` must be positive).

The fix here is to implement the range constraint logic as a native aten op (with a no-op CPU implementation) so that it survives AOT export.
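
A hedged sketch of the idea, using an illustrative namespace and schema (not the actual aten registration): because the constraint is a real op, it appears as a node in the exported aten graph and thus survives AOT export, while costing nothing at runtime.

```python
import torch

lib = torch.library.Library("demo", "DEF")  # hypothetical namespace
lib.define("sym_constrain_range(Scalar size, int min, int max) -> ()")

def _noop(size, min, max):
    # No-op at runtime; the op's value is its presence in the graph,
    # which downstream passes read to recover the range constraint.
    pass

lib.impl("sym_constrain_range", _noop, "CPU")
```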

**NOTE:**
The [logic](2d745b95d7/torch/fx/experimental/symbolic_shapes.py (L350-L365C15)) within [`constrain_range`](2d745b95d7/torch/fx/experimental/symbolic_shapes.py (LL313C74-L313C74)) is split out as `constrain_range_int` to handle the case when a non-`SymInt` is passed in, and is reused in the new `_constrain_range`. The reason is that when a non-`SymInt` is provided:
* If it directly calls `sym_constrain_range`, the C++ version will be called, which is a no-op.
* So in this case it calls `constrain_range_int` instead, to catch issues such as a user providing an input whose tensor shape is out of range during export, as in the following variation of the code example above:
```
...
inp = torch.tensor([10])
ep = torch._export.export(f, (inp,)) # immediately raise error
```

Differential Revision: [D46734204](https://our.internmc.facebook.com/intern/diff/D46734204)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103346
Approved by: https://github.com/tugsbayasgalan
2023-06-16 14:55:40 +00:00
Thiago Crepaldi
6f655d4195 Add symbolic tracing support to torch._dynamo.export (fake input + weights) (#100017)
Fixes #95900
Using the following repro as a guide:

```python
import torch
import torch._dynamo
from torch._subclasses import fake_tensor
from torch.fx.experimental.symbolic_shapes import ShapeEnv
from torch._dynamo.output_graph import config
class Model(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(2, 2)
        self.linear2 = torch.nn.Linear(2, 2)

    def forward(self, x):
        out = self.linear(x)
        out = self.linear2(out)
        return out

fake_mode = fake_tensor.FakeTensorMode(allow_non_fake_inputs=False,
                                       allow_fallback_kernels=True,
                                       shape_env=ShapeEnv(
                                            allow_scalar_outputs=config.capture_scalar_outputs,
                                            allow_dynamic_output_shape_ops=config.capture_dynamic_output_shape_ops,
                                            frame_id=0
                                        ),
)
# Fakefying input/model before calling torch._dynamo.export
with fake_mode:
    fake_x = torch.rand(5, 2, 2)
    model = Model()

# Calling torch._dynamo.export without active fake mode
graph_module, guards = torch._dynamo.export(
    model,
    fake_x,
    aten_graph=True,
    fake_mode=fake_mode
)
graph_module.print_readable()
graph_module.graph.print_tabular()
```

Summary of changes:

    * Plumb `fake_mode` through the `torch._dynamo.export` API. When specified, it
    replaces the creation of a new `FakeTensorMode` at `InstructionTranslator` on behalf of `OutputGraph`.
    * Hack `FakeTensor.__new__` to prevent a
    `torch.Tensor._make_subclass` call for inputs that are already fakefied by
    the user. This probably needs to be fixed in a nicer way. Any idea?
    * Removed a few asserts that rejected fake tensors coming
    from the user script.
    * Added `torch._subclasses.fake_tensor.FakeTensor` to the type list on a few
    assert checks to allow fake inputs.

The changes above allow symbolic tracing with both static and dynamic shapes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100017
Approved by: https://github.com/ezyang
2023-06-15 21:28:10 +00:00
Edward Z. Yang
1c3a7d9a7e Resolve TODO by deleting assert sparse cannot be meta on SymInt (#103299)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103299
Approved by: https://github.com/bdhirsh
2023-06-09 17:13:54 +00:00
Edward Z. Yang
96fd283640 Preserve CreationMeta when metafying views. (#103152)
This helps us avoid erroring and generates more accurate error messages
in Dynamo when doing mutations on views.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103152
Approved by: https://github.com/albanD
2023-06-09 12:34:54 +00:00
Jack Taylor
e936277cc2 [ROCm] force HIP context initialization for inductor UTs (#103149)
Workaround for https://github.com/pytorch/pytorch/issues/102886
related to: https://github.com/pytorch/pytorch/issues/102476 https://github.com/pytorch/pytorch/issues/102475 https://github.com/pytorch/pytorch/issues/102474 https://github.com/pytorch/pytorch/issues/102473 https://github.com/pytorch/pytorch/issues/102472

Since 9aaa12e328, the first inductor (CPU) UT fails until the GPU context is correctly initialised, and the subsequent UTs pass. CUDA observed the same issue, and a workaround was pushed to force initialisation of the CUDA context by declaring an empty tensor (https://github.com/pytorch/pytorch/issues/92627); we have adopted the same approach but opted for `torch.zeros`, which correctly activates the HIP context after the kernel launch.

**Reproducer:**
```
import torch
from torch._subclasses.fake_tensor import FakeTensorMode
import argparse
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Swap between torch.empty and torch.randn operations.')
    parser.add_argument('--empty', action='store_true', help='Use torch.empty operation')
    parser.add_argument('--rand', action='store_true', help='Use torch.rand operation')
    args = parser.parse_args()

    torch.cuda.set_device(0)
    if args.empty:
        torch.empty(1, device="cuda")
    elif args.rand:
        torch.rand(1, device="cuda")
    print(f": hasPrimaryContext: {torch._C._cuda_hasPrimaryContext(0)")
    with FakeTensorMode():
        p = torch.randn(4, 2, requires_grad=True, device='cuda')
        x = torch.randn(8, 4, device='cuda')
        y = torch.mm(x, p).square().sum()
        y.backward()
```

**ROCm python repro.py --empty**
0: hasPrimaryContext: False

**ROCm python repro.py --rand**
0: hasPrimaryContext: True

**CUDA python repro.py --empty**
0: hasPrimaryContext: True

**CUDA python repro.py --rand**
0: hasPrimaryContext: True

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103149
Approved by: https://github.com/eellison
2023-06-07 21:42:33 +00:00
Ivan Zaitsev
821493715c Back out "Remove check from _prims_common, replace with torch._check* (#102219)", Back out "Forwatd fix for D46427687" (#103128)
Test Plan: revertitparrot

Reviewed By: malfet

Differential Revision: D46506433

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103128
Approved by: https://github.com/malfet
2023-06-07 01:41:41 +00:00
Richard Zou
5b700fc914 Disable fallback for custom kernels (#101131)
Previous failed attempt was here: https://github.com/pytorch/pytorch/pull/97715.
Basically we tried to disable fallback for all ops (aten + custom) but hit many CI failures due to missing fake tensor coverage. Let's just disable it for custom kernels for now.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101131
Approved by: https://github.com/zou3519
2023-06-06 23:25:29 +00:00
shibo19
9d20b47e47 make device normalization more generic in faketensor (#102519)
Fixes #ISSUE_NUMBER
Make the device normalization more generic in FakeTensor to support devices like "cuda", "foo", and so on.
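
An illustration of why normalization matters: an unindexed device string compares unequal to its indexed form.

```python
import torch

print(torch.device("cuda") == torch.device("cuda:0"))     # False: index None vs 0
print(torch.device("cuda:0") == torch.device("cuda", 0))  # True
```
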
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102519
Approved by: https://github.com/albanD
2023-06-04 01:44:21 +00:00
Kurt Mohler
a84bb2709a Remove check from _prims_common, replace with torch._check* (#102219)
Part of #72948

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102219
Approved by: https://github.com/lezcano, https://github.com/albanD
2023-06-03 02:23:21 +00:00
PyTorch MergeBot
a7efa0ce35 Revert "Remove check from _prims_common, replace with torch._check* (#102219)"
This reverts commit fb79d43649.

Reverted https://github.com/pytorch/pytorch/pull/102219 on behalf of https://github.com/malfet due to Broke lint, see https://github.com/pytorch/pytorch/actions/runs/5158949959/jobs/9293466925 ([comment](https://github.com/pytorch/pytorch/pull/102219#issuecomment-1574245414))
2023-06-02 20:00:48 +00:00
Kurt Mohler
fb79d43649 Remove check from _prims_common, replace with torch._check* (#102219)
Part of #72948

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102219
Approved by: https://github.com/lezcano, https://github.com/albanD
2023-06-02 19:13:45 +00:00
William Wen
da963d793b Fix aten.copy device mismatch bug in FakeTensor (#102664)
Fixes `pytest ./generated/test_yizhou_wang_RODNet.py -k test_000` failure in https://github.com/pytorch/pytorch/issues/92670.

FakeTensor would raise an error upon trying to run `aten.copy` with inputs with different devices, although this is allowed behavior.

Also fix `aten.slice_scatter`, since it also takes args with different devices.
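
For reference, the eager behavior the fake mode needs to match: cross-device copies are legal. A minimal example, guarded on CUDA availability:

```python
import torch

if torch.cuda.is_available():
    dst = torch.empty(4, device="cuda")
    src = torch.arange(4.0)  # CPU tensor
    dst.copy_(src)           # allowed: aten.copy takes args on different devices
```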

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102664
Approved by: https://github.com/yanboliang
2023-06-01 23:05:20 +00:00
Edward Z. Yang
e03800a93a Add torch._utils.render_call, improve printoptions (#102623)
- Add get_printoptions and printoptions context manager
- Improve edgeitems handling when it is zero
- Add render_call which can be used to conveniently print command
  line arguments of a function call, while suppressing actual
  tensor data
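
A hedged sketch of how the new helpers fit together (import paths assumed from the summary above):

```python
import torch
from torch._tensor_str import printoptions  # context manager added here
from torch._utils import render_call

x = torch.randn(3, 3)
with printoptions(threshold=0, edgeitems=0):
    print(x)  # tensor data suppressed per the options

# Render a call's arguments without dumping tensor contents:
print(render_call(torch.add, (x, x), {"alpha": 2}))
```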

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102623
Approved by: https://github.com/albanD
2023-05-31 22:08:04 +00:00
Vishwa Raj Singh
c27cefccd3 Faketensor hpu device normalization (#102512)
FakeTensor doesn't normalize the device index and fails on the test case below:

```
import torch
import habana_frameworks.torch.hpu
from torch._subclasses.fake_tensor import FakeTensorMode

with FakeTensorMode.push():
    a = torch.empty(1, device="hpu")
    b = torch.empty(1, device="hpu:0")
    result = a + b
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102512
Approved by: https://github.com/albanD
2023-05-31 17:06:44 +00:00
Peter Bell
ce42010722 [inductor][decomp] Add aten._unsafe_index_put for unchecked indexing (#101812)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101812
Approved by: https://github.com/lezcano
2023-05-24 22:17:32 +00:00
Edward Z. Yang
3318a832b3 Tighten FakeTensor reentrancy asserts, add debugging (#102091)
When investigating failures in https://github.com/pytorch/pytorch/pull/100017, I realized that we were re-entering FakeTensorMode even though there was already one on the stack. Although we have attempted asserts for these cases in the past (e.g., in https://github.com/pytorch/pytorch/pull/97186), it seems the existing protections were insufficient.

In this particular case, the reapplication of FakeTensorMode was due to an interaction with NotImplemented multiple dispatch handling. If proxy tensor mode detects an unrecognized tensor type (this includes FakeTensor, if it is not tracked with a proxy), it will return NotImplemented to give this tensor a chance to unpack itself into a proxyable operation. However, this is never the right thing for FakeTensor, where no unpacking is possible. Yet today, FakeTensor attempts to reapply the FakeTensorMode, resulting in FakeTensorMode being on the stack twice.

This PR does a number of things:

* It adds an assert in `FakeTensorMode.__torch_dispatch__` that you must not already have this mode on the stack, this is ALWAYS an error
* It modifies `FakeTensor.__torch_dispatch__` to return `NotImplemented` if the mode is already active. This prevents us from re-adding the mode to the stack (see the sketch after this list).
* It adds a new logging artifact `not_implemented` which you can use to get debug logs about all of the times a `__torch_dispatch__` handler returned NotImplemented and why it did so. Your subclass has to manually opt into this logging, but I inserted the necessary logs for ProxyTensorMode and FakeTensor(Mode)
* `with fake_mode` now no-ops if the fake mode is already on the stack, which is what users want anyway
* I am BREAKING pre-autograd tracing, because it is currently doing something weird with the original C++ mode stack. Brian is going to follow up with a fix next week.
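
A generic sketch of the `NotImplemented` pattern from the second bullet (simplified, not the actual FakeTensor code):

```python
import torch

_MODE_ACTIVE = False  # stand-in for "is our paired mode on the stack?"

class MySubclass(torch.Tensor):
    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        # If the paired mode is already handling dispatch, defer instead of
        # re-entering it: NotImplemented tells multiple dispatch to move on.
        if _MODE_ACTIVE:
            return NotImplemented
        return func(*args, **(kwargs or {}))
```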

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102091
Approved by: https://github.com/thiagocrepaldi, https://github.com/eellison, https://github.com/wanchaol, https://github.com/bdhirsh
2023-05-24 05:37:51 +00:00
PyTorch MergeBot
5147fe4969 Revert "[inductor][decomp] Add aten._unsafe_index_put for unchecked indexing (#101812)"
This reverts commit b9721bd705.

Reverted https://github.com/pytorch/pytorch/pull/101812 on behalf of https://github.com/osalpekar due to Causing test_nn_cuda tests to crash during runtime. More details at [D46093942](https://www.internalfb.com/diff/D46093942) ([comment](https://github.com/pytorch/pytorch/pull/101812#issuecomment-1560238085))
2023-05-23 23:06:21 +00:00
Richard Zou
8487105fae [custom_op] Create a new torch._custom_op namespace (#101823)
torch/custom_op.py is getting long, and the autograd pieces are going to
make it even longer. I'm planning on just organizing the files under
a torch/_custom_op folder.

Note that the imports now look a bit crazy (from torch._custom_op.impl
import...) but they will look more OK when we figure out the plan to
make custom_op public (coming later).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101823
Approved by: https://github.com/ezyang, https://github.com/albanD, https://github.com/bdhirsh
2023-05-23 18:31:29 +00:00
Elias Ellison
e9246b290f Initialize cuda tensor in fake tensor (#102027)
Fix for https://github.com/pytorch/pytorch/issues/92627

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102027
Approved by: https://github.com/ngimel
2023-05-23 06:24:50 +00:00
Peter Bell
b9721bd705 [inductor][decomp] Add aten._unsafe_index_put for unchecked indexing (#101812)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101812
Approved by: https://github.com/lezcano
2023-05-22 20:39:18 +00:00
Elias Ellison
f99eeb5bdf Check devices on meta functions that return inputs (#101807)
FakeTensor has a default device logic that wraps meta tensors to the right device after running meta kernels and throws on multiple devices. This logic was only running on the wrapping from meta kernels -> fake. For out variants, where the output of the meta kernel was already a fake tensor because it was an input, the device logic wasn't running.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101807
Approved by: https://github.com/ngimel
2023-05-19 16:13:39 +00:00
Richard Zou
3ffeab7f80 [custom_op] Make repeated registrations error gracefully (#100979)
Previously the error message went through torch.library. This PR changes
it so that on each custom_op.impl_* call:
- we store a (function, location) tuple
- if a (function, location) tuple exists already, then we raise an
error.

This logic already existed for the abstract impl (the impl for meta and
fake tensors), so this PR just extends it to the others.
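
A generic sketch of the duplicate-registration guard described above (not PyTorch's actual code):

```python
import traceback

_registry = {}  # (op_name, kind) -> (function, "file:line" of registration)

def register_impl(op_name, kind, fn):
    key = (op_name, kind)
    if key in _registry:
        _, location = _registry[key]
        raise RuntimeError(f"{key} already registered at {location}")
    frame = traceback.extract_stack(limit=2)[0]  # the caller's frame
    _registry[key] = (fn, f"{frame.filename}:{frame.lineno}")
```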

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100979
Approved by: https://github.com/bdhirsh, https://github.com/soulitzer
2023-05-12 13:49:15 +00:00
Bert Maher
d283075282 Reduce fake_tensor create_mode logging (#101074)
A lot of Meta-internal logging is at INFO level, so this produces a lot of spam

Differential Revision: [D45732720](https://our.internmc.facebook.com/intern/diff/D45732720/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101074
Approved by: https://github.com/eellison, https://github.com/ezyang
2023-05-11 13:26:38 +00:00
Nikita Karetnikov
37f1be041a [pt2] enable svd in fake_tensor (#100130)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100130
Approved by: https://github.com/ezyang, https://github.com/lezcano
2023-05-05 06:27:59 +00:00
Richard Zou
e6f9bc500b CustomOp simple abstract implementation registration (#99439)
This PR:
- adds an abstract registration API for CustomOp (CustomOp.impl_abstract)
that is used for both FakeTensor and meta tensors
- deletes CustomOp.impl_meta

The user story behind this API is that it is the one-stop shop for
registering implementations for data-less Tensors, i.e. FakeTensor and
Meta tensor.

The abstract implementation provided by the user:
- gets registered as the FakeTensor implementation AND the meta formula
- can be written like a regular meta formula. If the user decides that
they need something more special (i.e. data-dependent output shape),
then they are able to query a current context object (FakeTensorImplCtx)
that has methods to construct new unbacked symints.

Caveats:
- we really need to make FakeTensor/FakeTensorMode public. Otherwise,
there isn't a way for the user to interactively test that their abstract
implementation is correct without running through large pieces of the
PT2 stack (make_fx or torch.compile).
- We do not memoize the symints produced by
ctx.create_unbacked_symint(). It is possible to do this in the
future, but it is difficult to do soundly and I am not convinced of
the utility outside of the nonzero() usecase mentioned in #95399

Public API:
- More docs will come when we actually expose this API to users by
putting it in a public namespace, unless you folks want it now.
- The APIs mentioned in `__all__` are the ones that are intended to be
public.

Test Plan:
- Updated existing custom_op_db operators
- Added new numpy_nonzero and numpy_nms operations that test operations
that have data-dependent output shape.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99439
Approved by: https://github.com/ezyang
2023-04-28 13:45:39 +00:00
Brian Hirsh
9834358e0f Get SchemaCheckMode to error on ops that return inputs directly. Expose as a dynamo backend, eager_debug (#99744)
Talked to @zou3519 and @ezyang on what the right UX is: tentatively, adding a new dynamo backend is cheap and simple, so it seems worth doing. And longer term, we agreed (?) that it's worth seeing if we can get custom ops sanity asserts to run more automatically, instead of needing a separate backend.

Side comment: that actually seems tough: the mode detects secret mutations by cloning every input to every op, running the op, and checking that the data matches between the real input and the cloned input. So I doubt we'll be able to make that behavior always-on? It would need some config at least.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99744
Approved by: https://github.com/albanD, https://github.com/ezyang, https://github.com/zou3519
2023-04-27 20:12:42 +00:00
Brian Hirsh
1f2d00e537 move SchemaCheckMode to torch/_subclasses (#99743)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99743
Approved by: https://github.com/albanD
2023-04-27 20:12:41 +00:00
Aaron Gokaslan
e2a3817dfd [BE] Enable C419 rule for any all shortcircuiting (#99890)
Apparently https://github.com/pytorch/pytorch/pull/78142 made TorchScript allow simple generator expressions, which lets us enable rules that replace unnecessary list comprehensions with generators in any/all. This was originally part of #99280, but I split it off into this PR so that it can be easily reverted should anything break.
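
The rule in a nutshell:

```python
xs = [1, -2, 3]

# Flagged by C419: materializes the whole list before any() can look at it.
print(any([x > 0 for x in xs]))

# Preferred: a generator expression; any()/all() short-circuit lazily.
print(any(x > 0 for x in xs))
```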

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99890
Approved by: https://github.com/justinchuby, https://github.com/kit1980, https://github.com/malfet
2023-04-25 15:02:13 +00:00
Jiong Gong
e5c9a0fcf5 [dynamo] avoid graph break on repeat_interleave.self_int (#99528)
Address convit_base failure: https://github.com/pytorch/torchdynamo/issues/1886 mentioned in https://github.com/pytorch/pytorch/issues/93777
Also for models like EleutherAI/gpt-j-6B.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99528
Approved by: https://github.com/ezyang
2023-04-25 04:47:39 +00:00
Edward Z. Yang
41280a0791 Don't detach to create parameters in MetaConverter (#99618)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99618
Approved by: https://github.com/albanD
2023-04-24 19:01:26 +00:00
Michael Voznesensky
5e73569ab4 Add memoized_only mode to tensor conversion (#99741)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99741
Approved by: https://github.com/ezyang
2023-04-22 19:19:39 +00:00
Edward Z. Yang
18fd6394dc Give distinct names to __unknown_tensor (#99729)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99729
Approved by: https://github.com/albanD
2023-04-21 21:03:43 +00:00
Elias Ellison
638feec4e3 Turn on meta converter for complex (#98869)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98869
Approved by: https://github.com/ngimel
2023-04-20 16:42:38 +00:00
Richard Zou
57e1a50da3 Fix FakeTensor printing (#99205)
I got too confused by the FakeTensor printing, so this PR fixes it to
print normally.

Before:
```
with FakeTensorMode():
    x = torch.empty(2, 2, device="cpu")
    print(x)
    # FakeTensor(FakeTensor(..., device='meta', shape=(2, 2)), cpu)
```
After (Tensor printing doesn't print the default device):
```
FakeTensor(..., shape=(2, 2))
```

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99205
Approved by: https://github.com/eellison
2023-04-18 13:26:27 +00:00
Tugsbayasgalan Manlaibaatar
7401f0f8ce Add unbacked symbool support (#98877)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98877
Approved by: https://github.com/ezyang
2023-04-17 17:45:10 +00:00
Michael Voznesensky
d5f7ec8a31 Apply dynamic shapes policy correctly to _base tensor (#99211)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99211
Approved by: https://github.com/ezyang
2023-04-15 17:18:45 +00:00
andrewor14
651c1be885 Recompute flat_arg_fake_tensors after fakification (#98769)
Summary: This fixes the case when some of the input tensors were
real tensors and fakified in `validate_and_convert_non_fake_tensors`,
but `flat_arg_fake_tensors` would not contain all the inputs
because it was computed before the fakification. We fix this by
recomputing `flat_arg_fake_tensors` after fakification as well.

Test Plan:
python test/dynamo/test_export.py ExportTests.test_mixed_real_and_fake_inputs

Reviewers: Chillee, voznesenskym

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98769
Approved by: https://github.com/voznesenskym
2023-04-14 19:14:29 +00:00
Elias Ellison
fc53472ce4 Move/Fix FakeTensor logic for detecting multiple fake modes (#97186)
This was left over from when we had more logic in FakeTensor rather than FakeTensorMode, and it wasn't firing correctly. It also makes more sense for it to be in the other validation function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97186
Approved by: https://github.com/bdhirsh
2023-04-13 19:20:01 +00:00
PyTorch MergeBot
4828585019 Revert "Move/Fix FakeTensor logic for detecting multiple fake modes (#97186)"
This reverts commit 8a057c445d.

Reverted https://github.com/pytorch/pytorch/pull/97186 on behalf of https://github.com/huydhn due to This breaks ONNX test in trunk and it looks like a landrace as the CI signal is green
2023-04-12 19:24:54 +00:00
Elias Ellison
8a057c445d Move/Fix FakeTensor logic for detecting multiple fake modes (#97186)
This was left over from when we had more logic in FakeTensor rather than FakeTensorMode, and it wasn't firing correctly. It also makes more sense for it to be in the other validation function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97186
Approved by: https://github.com/bdhirsh
2023-04-12 17:40:41 +00:00
Edward Z. Yang
b09722f540 Convert logging f-strings to use % format, part two (#98700)
This hits multi-line logging strings

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98700
Approved by: https://github.com/voznesenskym
2023-04-10 12:19:31 +00:00
Edward Z. Yang
d01ee10b25 Add detect_fake_mode (#98321)
This replaces fake_mode_from_tensors, but it preferentially looks for
a fake mode in TracingContext, and then for an active fake mode
on the dispatch stack, before groveling in the tensors to find it.
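
A small sketch of the fallback-to-tensors path (import location assumed; current PyTorch exposes the helper from `torch._guards`):

```python
import torch
from torch._guards import detect_fake_mode
from torch._subclasses import FakeTensorMode

with FakeTensorMode() as mode:
    x = torch.empty(2, 2)

# No TracingContext and no active mode here, so the fake mode is
# recovered by groveling in the tensor arguments.
assert detect_fake_mode([x]) is mode
```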

This advances PegasusForCausalLM, which was previously failing because
we generated a graph that had a parameter (non-fake) and a SymInt,
and thus previously we failed to detect the correct fake mode.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98321
Approved by: https://github.com/voznesenskym
2023-04-05 22:15:16 +00:00
Edward Z. Yang
8372c5dc68 Refactor dynamic dims api, stateless internals, higher level export API (#96699)
The purpose of this API is to execute a few large components of work:

1) Refactor all the internals of plumbing dynamic dimension information after dynamo to be stateless
2) Decouple allocation controls around dynamic dimensions from verification
3) For (2), for allocation, create an enum that dictates whether we are in DUCK (the default today), STATIC (aka assume_static_default in the past), or DYNAMIC (user constrained, do not duck shape) mode; see the sketch after this list.
4) For (2), for verification, we separate the list of dynamic ranges out entirely from allocation. This means the shape_env does no tracking of what we verify on; instead, it is the caller's job to invoke produce_guards() with the various things they want verified, specifically, with the valid ranges. We do use constrained ranges to refine value ranges when doing analysis.
5) We have decided, therefore, as an extension of (4), to double down on "late" checks versus "eager" checks, primarily because the mechanism for gathering what actually matters happens during guards, and should be the purview of the caller seeking guards, not the shape env. However, for dynamo, these structures are essentially one and the same.
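
An illustrative sketch of the allocation policy from point (3); the enum and member names here mirror the description, not necessarily the exact code:

```python
from enum import Enum, auto

class DimDynamic(Enum):
    DUCK = auto()     # default: duck shaping may reuse symbols across equal sizes
    STATIC = auto()   # assume static: specialize the dim to its observed size
    DYNAMIC = auto()  # user-constrained: always allocate a fresh symbol
```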

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96699
Approved by: https://github.com/avikchaudhuri, https://github.com/ezyang
2023-03-29 16:55:49 +00:00
Edward Z. Yang
fb7f983357 Graph break on operators that fake tensor doesn't support (#97708)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97708
Approved by: https://github.com/eellison
2023-03-28 19:49:54 +00:00
Elias Ellison
6854fd7189 Add Config to Skip Cpp Codegen, Enable in FBCode (#97204)
Differential Revision: [D44353662](https://our.internmc.facebook.com/intern/diff/D44353662)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97204
Approved by: https://github.com/ngimel, https://github.com/bertmaher, https://github.com/mikekgfb, https://github.com/cpuhrsch
2023-03-28 18:21:15 +00:00
Edward Z. Yang
fa82080016 Don't run fallback if symbolic sizes in fake tensor (#97148)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97148
Approved by: https://github.com/Skylion007, https://github.com/eellison, https://github.com/bdhirsh
2023-03-21 02:23:44 +00:00
BowenBao
60a68477a6 Bump black version to 23.1.0 (#96578)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578
Approved by: https://github.com/ezyang
2023-03-15 06:27:59 +00:00
Edward Z. Yang
98ff841a75 Use maxint to bound integers. (#96121)
We don't actually support arbitrary precision integers.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96121
Approved by: https://github.com/tugsbayasgalan, https://github.com/lezcano
2023-03-07 12:46:19 +00:00
Edward Z. Yang
027ebca4d7 Don't use guardless contiguity/stride-like implementations (#95733)
These prevent us from simplifying tests involving unbacked SymInts,
and then you end up with unbacked SymInt in guards, which is bad.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95733
Approved by: https://github.com/tugsbayasgalan
2023-03-03 21:56:41 +00:00
PyTorch MergeBot
4026c62174 Revert "Don't use guardless contiguity/stride-like implementations (#95733)"
This reverts commit deaf077de8.

Reverted https://github.com/pytorch/pytorch/pull/95733 on behalf of https://github.com/ezyang due to apparently this regresses executorch tests internally
2023-03-03 17:43:05 +00:00
Wonjoo Lee
3095c95828 Fixes for PyTorch/XLA functionalization integration (#94537)
Fixes for PyTorch/XLA functionalization integration

---
Some notable changes include:
- More asserts in `FunctionalTensorWrapper`, so bugs show up more cleanly in cases where we e.g. forget to wrap an output
- Make the *_scatter ops `CompositeExplicitAutogradNonFunctional`, so we get a better error message and XLA doesn't accidentally try to use them
- Fix LTC/XLA codegen in core to handle multi-tensor out= ops with no returns
- Better erroring: Allow XLA to use the CPU fallback from core in a way so that it always errors on view ops, which XLA should no longer see.
- Update MetaConverter to exclude XLA tensors in raising NotImplemented…
- Add `_propagate_xla_data` op
- Add meta tensor support for some ops
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94537
Approved by: https://github.com/bdhirsh
2023-03-02 23:02:34 +00:00
Edward Z. Yang
deaf077de8 Don't use guardless contiguity/stride-like implementations (#95733)
These prevent us from simplifying tests involving unbacked SymInts,
and then you end up with unbacked SymInt in guards, which is bad.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95733
Approved by: https://github.com/tugsbayasgalan
2023-03-01 23:14:58 +00:00
Edward Z. Yang
8efe4fd590 Memoize repeated nonzero calls to the same fake tensor (#95399)
This removes the need to explicitly constrain_unify `x[mask]` and `y[mask]` when mask is a boolean tensor. It's very narrow but it seems to work in practice.

To invalidate the nonzero call when mutation occurs, I use version counter. I know there are ways to bypass this but I think it's good enough for now.
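
A generic sketch of version-counter invalidation (not the actual code):

```python
_cache = {}  # id(tensor) -> (version seen, memoized result)

def memoized_nonzero(t, compute):
    entry = _cache.get(id(t))
    if entry is not None and entry[0] == t._version:
        return entry[1]  # same tensor, not mutated since: reuse the result
    result = compute(t)
    _cache[id(t)] = (t._version, result)  # any mutation bumps _version
    return result
```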

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95399
Approved by: https://github.com/eellison
2023-02-24 00:27:45 +00:00
Edward Z. Yang
4833e47feb Add support for nonzero, some improvements to reduce guards (#95387)
This takes the strategy described in https://docs.google.com/document/d/1lFRYAJo5nrfxRhwIzGnfi2pbLpU6T4ytSRSuLJ5qebI/edit#

It is essentially https://github.com/pytorch/pytorch/pull/95222 but squashed and with changes that are unnecessary given that we assume nonzero returns > 1.

What's in the PR:

* nonzero now supports meta propagation. When `capture_dynamic_output_shape_ops` is set, it will return a tensor with an unbacked SymInt representing the size in question (see the sketch after this list).
* The unbacked SymInt is UNSOUNDLY assumed to be not equal to 0/1. We will still error if you guard otherwise.
* PrimTorch pointwise operators are updated to use empty_permuted, to avoid guarding on unbacked SymInt from empty_strided (tested in `test_dynamic_pointwise_scalar`)
* Convolution is updated to skip backend selection if batch is unbacked, to avoid guarding on unbacked SymInt (tested in `test_unbacked_batch_resnet`)
* I kept the helper utilities like `definitely_true` for working with possibly unbacked SymInts. They're not used right now but maybe someone will find them useful.
* Added `constrain_unify` to let you specify two unbacked SymInts must have the same value
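
A sketch of the flag's effect (config name from the first bullet above):

```python
import torch

torch._dynamo.config.capture_dynamic_output_shape_ops = True

@torch.compile(fullgraph=True)
def f(x):
    idx = x.nonzero()  # output size is an unbacked SymInt during tracing
    return idx.sum()

print(f(torch.tensor([0, 1, 0, 1])))
```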

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95387
Approved by: https://github.com/voznesenskym
2023-02-24 00:27:45 +00:00
Edward Z. Yang
a2f44d82f8 Flag guard unbacked SymInt/SymFloat support (#94987)
I believe this fixes the AllenaiLongformerBase problem in periodic.

The longer version of the problem: we are currently optimistically converting all item() calls into unbacked SymInt/SymFloat, but sometimes this results in a downstream error due to a data-dependent guard. Fallbacks for this case are nonexistent; this will just crash the model. This is bad. So we flag-guard until we get working fallbacks.

What could these fallbacks look like? One idea I have is to optimistically make data-dependent calls unbacked, but then if it results in a crash, restart Dynamo analysis with the plan of graph breaking when the item() call immediately happened.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94987
Approved by: https://github.com/Skylion007, https://github.com/malfet
2023-02-17 00:25:05 +00:00
Edward Z. Yang
ef5de0a4cf Don't use PrimTorch decomposition for empty (#94512)
This PR removes the unnecessary == 0 guard when constructing empty tensors, by ensuring that when we create a contiguous tensor we go directly to the C++ torch.empty implementation (instead of indirecting through empty_strided), where we can bypass doing zero tests when computing the size of the storage. This probably also speeds up trace time.

When I did this, I found out that `empty_tensor_restride_symint` was flagrantly wrong (we had never exercised it before because we redirected to `empty_strided` in PrimTorch decomp, which doesn't hit this codepath.) The bugs:

* Stride computation was wrong (only `last_idx` was ever written to)
* Using set_sizes_and_strides with `sym_sizes` input doesn't work, because there is some sort of ordering problem where `clone_symvec` isn't safe when you clone a vector into itself. Probably should fix this.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94512
Approved by: https://github.com/ngimel
2023-02-16 16:04:41 +00:00
Aaron Gokaslan
8fce9a09cd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308)
Apply parts of pyupgrade to torch (starting with the safest changes).
This PR only does two things: removes the need to inherit from object and removes unused future imports.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-07 21:10:56 +00:00
Edward Z. Yang
d690a596dc Fast path binary ops in fake tensor (#94047)
Fast path execution of a few binary ops in fake tensor, to speed up trace time. When testing `python benchmarks/dynamo/timm_models.py --accuracy --timing --backend aot_eager --dynamic-shapes --float32 --only hrnet_w18`, I get the following trace speedup.

Before:

```
cuda eval  hrnet_w18                           PASS
TIMING: entire_frame_compile:53.97591 backend_compile:33.60832
STATS: call_* op count: 1369 | FakeTensor.__torch_dispatch__:4995 | FakeTensorMode.__torch_dispatch__:89985 | ProxyTorchDispatchMode.__torch_dispatch__:3010
```

After:

```
cuda eval  hrnet_w18                           PASS
TIMING: entire_frame_compile:40.18931 backend_compile:25.28828
STATS: call_* op count: 1369 | FakeTensor.__torch_dispatch__:4995 | FakeTensorMode.__torch_dispatch__:69478 | attempt fast:4399 | fast is_contiguous:4399 | ProxyTorchDispatchMode.__torch_dispatch__:3010
```

My experiment notebook can be found at https://docs.google.com/document/d/1_dTIQUwjIVnEWmiFAavJQYVF8uzXqD9Dk6b9gGQLF_U/edit#

This is not the "most" optimized version of the code; compared with Horace/Voz roofline experiment:

```
diff --git a/torch/_subclasses/fake_tensor.py b/torch/_subclasses/fake_tensor.py
index e3bf545f3b8..395942c6ffe 100644
--- a/torch/_subclasses/fake_tensor.py
+++ b/torch/_subclasses/fake_tensor.py
@@ -774,6 +774,10 @@ class FakeTensorMode(TorchDispatchMode):
     def __torch_dispatch__(self, func, types, args=(), kwargs=None):
         kwargs = kwargs if kwargs else {}

+        with no_dispatch():
+            if func in {aten.mul.Tensor, aten.add.Tensor, aten.sub.Tensor, aten.relu.default}:
+                return FakeTensor(self, torch.empty(args[0].shape, device='meta'), device='cuda')
+
         if func == torch.ops.prim.device.default:
             assert len(args) == 1 and isinstance(args[0], FakeTensor)
             if args[0].fake_mode.in_kernel_invocation:
```

I am still leaving about 5s of trace time improvement on the table (3s of which is attributable to not yet handling relu.)

The implementation here is based off of https://github.com/pytorch/pytorch/pull/93118/ but I modeled the short circuit logic off of TensorIterator's implementation, for ease of code review and correctness verification. However, there are some important divergences:

* Traditional fast setup in TensorIterator only short circuits if the shapes of all input elements are equal. On hrnet_w18, only 5% of fastpath'ed binary operators actually satisfy this. So instead, I compute the broadcasted shape, but then I only allow the fast path if (1) at least one input tensor has a shape that is exactly the output size, and (2) all the tensors are contiguous (or if all the tensors are channels last).
* I had to manually adjust the logic to handle wrapped numbers (which ordinarily are handled by wrapping into tensors). I think I got this right.

Some evidence that this heuristic is correct is in https://gist.github.com/ezyang/b22fa7b72b7349137211d8dc7041f758: I exhaustively test all dim=3 tensors with sizes [1, 2] and show that we get the same significant strides between PrimTorch and the new algorithm. There ARE differences between this algorithm and PrimTorch, but this algorithm agrees with TensorIterator in the cases where PrimTorch is wrong (sample case: size=(1, 1, 2), stride=(1, 1, 1), stride=(1, 1, 1))

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94047
Approved by: https://github.com/eellison
2023-02-07 18:34:24 +00:00
Yanbo Liang
605b661805 FakeTensor should constant propagate through ops that allow numbers as scalars (#94145)
Fixes #92655

Thanks @eellison for the code change suggestion.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94145
Approved by: https://github.com/eellison
2023-02-07 06:20:35 +00:00
Edward Z. Yang
2481fc0df4 Add count to FakeTensorMode.__torch_dispatch__ (#93936)
Most calls to fake tensor never hit `FakeTensor.__torch_dispatch__`

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93936
Approved by: https://github.com/bdhirsh, https://github.com/albanD
2023-02-03 14:21:11 +00:00
Edward Z. Yang
12f22655b1 Short circuit device property access on FakeTensor (#93946)
Before:

```
(/home/ezyang/local/a/pytorch-env) [ezyang@devgpu020.ftw1 ~/local/a/pytorch (ab0e3db0)]$ python benchmarks/dynamo/timm_models.py --accuracy --timing --backend aot_eager --dynamic-shapes --float32 --only hrnet_w18
cuda eval  hrnet_w18                           PASS
TIMING: entire_frame_compile:54.19504 backend_compile:33.86702
STATS: call_* op count: 1369 | FakeTensor.__torch_dispatch__:72549 | FakeTensorMode.__torch_dispatch__:115542 | ProxyTorchDispatchMode.__torch_dispatch__:3103
```

After

```
(/home/ezyang/local/a/pytorch-env) [ezyang@devgpu020.ftw1 ~/local/a/pytorch (ab0e3db0)]$ python benchmarks/dynamo/timm_models.py --accuracy --timing --backend aot_eager --dynamic-shapes --float32 --only hrnet_w18
cuda eval  hrnet_w18                           PASS
TIMING: entire_frame_compile:53.97591 backend_compile:33.60832
STATS: call_* op count: 1369 | FakeTensor.__torch_dispatch__:4995 | FakeTensorMode.__torch_dispatch__:89985 | ProxyTorchDispatchMode.__torch_dispatch__:3010
```

It doesn't really help end-to-end wall time all that much, but it does cut the number of calls to FakeTensor.__torch_dispatch__ by an order of magnitude, which hopefully has other positive effects.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93946
Approved by: https://github.com/eellison, https://github.com/albanD
2023-02-03 14:20:30 +00:00
Elias Ellison
e4f11e01bd [Fake Tensor] Allow fake meta by default, delete unused ctor args (#93993)
Two small changes that I'm bundling together because one of them needs to touch fbcode and I'm not sure how to do stacked diffs + internal changes + land before release cut.

Remove allow_meta from the ctor, and allow by default: we should be able to trace through meta with fake tensors, so in some senses it's a bit weird to expose a way for the user to disallow this. However, it's still useful debug-wise to error from time to time, so I've added an option to the config that restores the previous behavior.

Remove `throw_on_data_dependent_ops=True`: this was intended as temporary behavior while we smoothed out turning on the erroring. There are no uses anywhere of `throw_on_data_dependent_ops=False` that I could find.

These are technically backward-incompatble, but fake tensor is new since the last release / in a private namespace, and I don't want to release it with baggage that would be hard to remove later.

Fix for https://github.com/pytorch/pytorch/issues/92877.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93993
Approved by: https://github.com/bdhirsh, https://github.com/ezyang
2023-02-03 09:23:38 +00:00
Jason Ansel
8c09a005c5 [inductor] Pattern matching engine (copy) (#93291)
This is an exact duplicate of https://github.com/pytorch/pytorch/pull/90739

The fbcode workflow for landing that diff seems buggy.  The github-export-checks task is failing with credentials errors.  Plan to try to land it using GH1.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93291
Approved by: https://github.com/desertfire
2023-01-31 04:51:00 +00:00
Michael Voznesensky
d322f82b05 Add @count util to torch, use it to track benchmark stats (#93013)
[screenshot: benchmark stats output from the @count util]

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93013
Approved by: https://github.com/ezyang
2023-01-26 03:09:12 +00:00
Edward Z. Yang
1237cf6b6c Allow direct Tensor constructor to return preexisting PyObject (#92754)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92754
Approved by: https://github.com/albanD, https://github.com/voznesenskym
2023-01-23 20:20:43 +00:00
PyTorch MergeBot
db466ae057 Revert "[Modes] Add assert that the mode isn't already on the stack (#90770)"
This reverts commit 702838637d.

Reverted https://github.com/pytorch/pytorch/pull/90770 on behalf of https://github.com/DanilBaibak due to Break internal build
2023-01-12 16:44:29 +00:00
samdow
702838637d [Modes] Add assert that the mode isn't already on the stack (#90770)
Redo of #89726 on a clean PR, thanks @voznesenskym for the first draft!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90770
Approved by: https://github.com/ezyang
2023-01-11 15:19:43 +00:00
Samantha Andow
a7749ae177 [reland] rename DisableTorchFunction to DisableTorchFunctionSubclass (#88218) (#89221)
Summary: First half of #87990. This doesn't change any of the behavior and is just a rename

#88218 got reverted for internal breakages. This is the reland of started from internal

Differential Revision:
D41268423

LaMa Project: L1098534

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89221
Approved by: https://github.com/meliy-meyada, https://github.com/zou3519
2023-01-04 18:32:49 +00:00
Edward Z. Yang
bcf15cd93b Store source, not sname, in Symbol (#91057)
I'm going to need this in the follow up PR. Instead of storing only Source.name() in Symbol, I now store a full on Source. Lots of replumbing reoccurs. In particular:

- Move Source to torch._guards to break cycles
- I have to add TensorPropertySource and NegateSource to handle x.size()[0] and -x codegen that I was doing with string manipulation previously
- I tighten up invariants so that I never pass source=None; instead I pass ConstantSource (these are constant sources right) and test for that rather than source being missing. I think this is more parsimonious
- Some mypy wobbles from new imports

I didn't move LocalSource and friends to torch._guards, but I ended up needing to access them in a few places. The main annoyance with moving these is that then I also need to move the bytecode codegen stuff, and that's not so easy to move without bringing in the kitchen sink.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91057
Approved by: https://github.com/albanD, https://github.com/voznesenskym, https://github.com/zou3519
2022-12-30 05:56:56 +00:00
Kurt Mohler
08a47549af Rename Tensor._storage to Tensor.untyped_storage and update docs (#91414)
Fixes #89224

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91414
Approved by: https://github.com/ezyang
2022-12-28 19:21:34 +00:00
Nikita Shulga
fd3a7264ae [MPS] Add group_norm[fwd+backward] and mean_var (take 2) (#91190)
Use Prims to implement group_norm, group_norm_backward and mean_var

Use `torch._ops.ops` instead of `torch.ops` in numerous subpackages in
order to make them importable from `torch/backends/mps/__init__.py`, as this alias, defined in
15af4b1cee/torch/__init__.py (L1095),
is executed last during the init process.

Add `__all__` to `torch/backends/mps/__init__.py` as well as alias all imports as private

Add `TestNNMPS.test_group_norm_backward` that validates no NaNs are generated during the backward pass

Fixes https://github.com/pytorch/pytorch/issues/88331
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91190
Approved by: https://github.com/albanD
2022-12-22 08:54:37 +00:00
PyTorch MergeBot
b68fd7e319 Revert "Store source, not sname, in Symbol (#91057)"
This reverts commit 88c581be87.

Reverted https://github.com/pytorch/pytorch/pull/91057 on behalf of https://github.com/atalman due to causing internal build failures
2022-12-21 22:33:15 +00:00
PyTorch MergeBot
645eda0a00 Revert "[MPS] Add group_norm[fwd+backward] and mean_var (#91190)"
This reverts commit 371716eb36.

Reverted https://github.com/pytorch/pytorch/pull/91190 on behalf of https://github.com/kit1980 due to Broke test_correct_module_names because of underscore _ops
2022-12-21 19:37:43 +00:00
Nikita Shulga
371716eb36 [MPS] Add group_norm[fwd+backward] and mean_var (#91190)
Use Prims to implement group_norm, group_norm_backward and mean_var

Use `torch._ops.ops` instead of `torch.ops` in numerous subpackages in
order to make them importable from `torch/backends/mps/__init__.py`, as this alias, defined in
15af4b1cee/torch/__init__.py (L1095),
is executed last during the init process.

Depends on https://github.com/pytorch/pytorch/pull/91203

Fixes https://github.com/pytorch/pytorch/issues/88331
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91190
Approved by: https://github.com/albanD
2022-12-21 17:33:27 +00:00
Edward Z. Yang
88c581be87 Store source, not sname, in Symbol (#91057)
I'm going to need this in the follow up PR. Instead of storing only Source.name() in Symbol, I now store a full on Source. Lots of replumbing reoccurs. In particular:

- Move Source to torch._guards to break cycles
- I have to add TensorPropertySource and NegateSource to handle x.size()[0] and -x codegen that I was doing with string manipulation previously
- I tighten up invariants so that I never pass source=None; instead I pass ConstantSource (these are constant sources right) and test for that rather than source being missing. I think this is more parsimonious
- Some mypy wobbles from new imports

I didn't move LocalSource and friends to torch._guards, but I ended up needing to access them in a few places. The main annoyance with moving these is that then I also need to move the bytecode codegen stuff, and that's not so easy to move without bringing in the kitchen sink.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91057
Approved by: https://github.com/albanD, https://github.com/voznesenskym
2022-12-21 04:51:51 +00:00
Edward Z. Yang
0b22f5ae9f Deeply rework WeakIdKeyDictionary (#90825)
In the prior patch, I just YOLOed a mutable mapping implementation.
Many edge cases were not handled correctly.  In this PR, I just
copy-pasted the WeakKeyDictionary from CPython and then hacked it up
to use WeakIdRef instead of weakref.ref.  You can see each line
I changed with the comment CHANGED; there aren't many.
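
For background, a sketch of the identity-keyed weak reference the dictionary relies on (mirroring the idea, not necessarily the exact code):

```python
import weakref

class WeakIdRef(weakref.ref):
    # Hash/compare by referent identity rather than by its __eq__/__hash__,
    # which for Tensor would be semantically wrong (and can even throw).
    __slots__ = ["_id"]

    def __init__(self, key, callback=None):
        self._id = id(key)
        super().__init__(key, callback)

    def __hash__(self):
        return self._id

    def __eq__(self, other):
        a, b = self(), other()
        if a is not None and b is not None:
            return a is b
        return self is other
```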

Being exactly API compatible with WeakKeyDictionary means I can also
rob all of the tests from CPython, which I also did for
test/test_weak.py

How to review?  You could either try taking the delta from CPython
(recommended), or review everything from scratch (not recommended).
Can post diff representing delta on request.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90825
Approved by: https://github.com/albanD
2022-12-15 08:43:08 +00:00
Edward Z. Yang
54563e6288 Don't put tracing state on Tensor (#90628)
Fixes https://github.com/pytorch/pytorch/issues/89626

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90628
Approved by: https://github.com/voznesenskym
2022-12-15 08:43:08 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
1aab755320 Fakify params and weights under private config (#90417)
Previously, we planned to lift the parameters and weights while exporting and implement our own transformer to "unlift" the lifted weights and params back into the graph as attributes. But this is a bit challenging because:

- We need to maintain correct ordering for weights and parameters that are passed as inputs so that we know how to map them back.
- Some weights are unused in the graph, so our transformer needs to be aware of which weights and parameters are not used in the graph. And we need to distinguish which are real user inputs and which are parameters.
- There can be more edge cases we haven't seen in other models yet.

I am aware that @Chillee  and @bdhirsh mentioned that functionalization won't work with fake-tensor attributes but this is fine for the short term as we don't expect users to be modifying weights and params in inference mode. In fact, we explicitly disable attribute mutation in torchdynamo export mode right now.

Given the above conditions, it might be OK to just fakify params when we need to. I use a flag to guard this change.

Differential Revision: [D41891201](https://our.internmc.facebook.com/intern/diff/D41891201)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90417
Approved by: https://github.com/eellison
2022-12-14 09:33:18 +00:00
Edward Z. Yang
f7365eca90 Add unbacked symints support; item works now (#90624)
The big idea is to add `create_unbacked_symfloat` and `create_unbacked_symint` to ShapeEnv, allowing you to allocate symbolic floats/ints corresponding to data you don't know about at compile time. Then, instead of immediately erroring out when you try to call local_scalar_dense on a FakeTensor, we instead create a fresh symint/symfloat and return that.

There are a bunch of odds and ends that need to be handled:

* A number of `numel` calls converted to `sym_numel`
* When we finally return from item(), we need to ensure we actually produce a SymInt/SymFloat when appropriate. The previous binding code assumed that you would have to get a normal Python item. I add a pybind11 binding for Scalar (to PyObject only) and refactor the code to use that. There is some trickiness where you are NOT allowed to go through c10::SymInt if there isn't actually any SymInt involved. See comment.
* One of our unit tests tripped an implicit data dependent access which occurs when you pass a Tensor as an argument to a sizes parameter. This is also converted to support symbolic shapes
* We now support tracking bare SymInt/SymFloat returns in proxy tensor mode (this was already in symbolic-shapes branch)
* Whenever we allocate an unbacked symint, we record the stack trace it was allocated at. These get printed when you attempt data dependent access on the symint (e.g., you try to guard on it)
* Subtlety: unbacked symints are not necessarily > 1. I added a test for this.

These unbacked symints are not very useful right now as you will almost always immediately raise an error later when you try to guard on them. The next logical step is adding an assertion refinement system that lets ShapeEnv learn facts about unbacked symints so it can do a better job eliding guards that are unnecessary.
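
A sketch of the allocation API described above (exact signatures may differ across versions):

```python
from torch.fx.experimental.symbolic_shapes import ShapeEnv

shape_env = ShapeEnv()
u0 = shape_env.create_unbacked_symint()    # fresh symbol, no backing hint
f0 = shape_env.create_unbacked_symfloat()

# There is no concrete value behind u0, so any guard on it -- e.g.
# bool(u0 > 1) -- raises a data-dependent error that includes the stack
# trace recorded when u0 was allocated.
```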

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90624
Approved by: https://github.com/Skylion007, https://github.com/voznesenskym
2022-12-12 13:33:07 +00:00
Edward Z. Yang
b68dead20c Keep track of source name on all allocated SymInts (#90295)
Wow, I had to sweat so much to get this PR out lol.

This PR enforces the invariant that whenever we allocate SymInts as part of fakeification, the SymInt is associated with a Source, and in fact we store the string source name on SymbolWithSourceName. We use 'sname' as the shorthand for source name, as 'name' is already used by sympy to name symbols.

In order to store source names, we have to plumb source names from Dynamo to PyTorch. This made doing this PR a bit bone-crushing, because there are many points in the Dynamo codebase where we are improperly converting intermediate tensors into fake tensors, with no source (and there cannot be one, because it's a frickin' intermediate tensor). I've fixed all of the really awful cases in earlier PRs in the stack. This PR is just plumbing in source names from the places where we do have them.
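
Illustratively, the storage side of this is small (a sketch, not the actual class):

```python
import sympy

# Sketch: a sympy Symbol that also remembers its source name. 'sname'
# avoids colliding with sympy's own 'name' attribute.
class SymbolWithSourceName(sympy.Symbol):
    def __new__(cls, name, sname, **assumptions):
        instance = super().__new__(cls, name, **assumptions)
        instance.sname = sname  # e.g. "x.size()[0]", plumbed from Dynamo
        return instance

s0 = SymbolWithSourceName("s0", "x.size()[0]", integer=True, positive=True)
print(s0, s0.sname)  # s0 x.size()[0]
```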

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90295
Approved by: https://github.com/voznesenskym
2022-12-10 13:17:34 +00:00
Ram Rachum
351d73b97f Fix exception causes all over the codebase (#90271)
This is the continuation to #90134 and hopefully the final PR in this series.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90271
Approved by: https://github.com/kit1980
2022-12-07 04:29:00 +00:00
Edward Z. Yang
3d4b92b171 Ensure that we fakeify tensor subclasses when they are initially tracked (#90009)
The old code didn't actually fakeify traceable tensor subclasses at the
time they are added as a GraphArg to the module; now we do, by ignoring
the subclass during fakeification and relying on Dynamo to simulate
the subclass on top.  See comments for more details.

BTW, this codepath is super broken; see the filed issues linked on
the inside.
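
In sketch form (the accessor below is hypothetical; real wrapper subclasses expose their inner tensors differently):

```python
import torch
from torch._subclasses.fake_tensor import FakeTensorMode
from torch.utils._python_dispatch import is_traceable_wrapper_subclass

def fakeify_graph_arg(t: torch.Tensor, mode: FakeTensorMode) -> torch.Tensor:
    if is_traceable_wrapper_subclass(t):
        # Hypothetical accessor: fakeify only the plain tensor inside and
        # let Dynamo simulate the subclass semantics on top of it.
        t = t.elem
    return mode.from_tensor(t)
```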

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90009
Approved by: https://github.com/wconstab, https://github.com/voznesenskym
2022-12-06 22:36:32 +00:00
Michael Voznesensky
3b9a386d48 Add TORCH_FAKE_TENSOR_DEBUG use it to enable storage of traces on fake tensors at init time (#90215)
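A hypothetical sketch of the mechanism (the env var name comes from the title; the truthiness check and attribute name are assumptions):

```python
import os
import traceback

STORE_TRACES = bool(os.environ.get("TORCH_FAKE_TENSOR_DEBUG"))

class FakeTensorSketch:
    def __init__(self):
        if STORE_TRACES:
            # Record where this fake tensor was created so later errors
            # can point back at the allocation site.
            self._debug_trace = traceback.format_stack()
```
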
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90215
Approved by: https://github.com/ezyang
2022-12-06 22:28:52 +00:00
Edward Z. Yang
a1ab06ab65 ShapeEnv.create_symbolic_sizes_strides_storage_offset (#89962)
Instead of having the storage offset hang out on its own, allocate
all of these symbols in one go.
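
In sketch form (argument spelling is an assumption; see ShapeEnv for the real signature):

```python
import torch
from torch.fx.experimental.symbolic_shapes import ShapeEnv
from torch._dynamo.source import ConstantSource

shape_env = ShapeEnv()
t = torch.randn(2, 3)
# One call hands back symbolic sizes, strides, and the storage offset
# together, instead of allocating the offset separately.
size, stride, storage_offset = shape_env.create_symbolic_sizes_strides_storage_offset(
    t, ConstantSource("t")
)
```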

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89962
Approved by: https://github.com/albanD, https://github.com/voznesenskym
2022-12-06 21:27:02 +00:00
Elias Ellison
1a33b7cbfa Make fake tensors preserve dense strides in type conversion (#89803)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89803
Approved by: https://github.com/ngimel
2022-11-30 01:28:51 +00:00
XiaobingSuper
07151a6bd6 TorchDynamo: weight prepack for onednn convolution external call (#88988)
This PR enables weight prepacking using MKLDNN tensors:
1. Enable fake tensor mode for MKLDNN tensor inputs.
2. Make the convolution fusion kernels support MKLDNN tensor inputs.
3. Do the weight prepack at the FX fusion step.

For better performance, we always use channels_last for the CPU convolution path: our measurements show that channels_last performs better than the blocked-input path, and it also avoids layout conversions on the activations (plain to block and block to plain). Currently only a plain-to-plain format conversion is needed.
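
Roughly, on an MKLDNN-enabled build (a sketch of the layout policy, not the fusion pass itself):

```python
import torch

# Activations stay in (plain) channels_last; only the weight gets
# prepacked into the blocked MKLDNN layout, which is what the fused
# kernel would consume after the FX fusion step.
x = torch.randn(1, 3, 32, 32).to(memory_format=torch.channels_last)
conv = torch.nn.Conv2d(3, 8, kernel_size=3)
packed_weight = conv.weight.data.to_mkldnn()  # one-time weight prepack
with torch.no_grad():
    y = conv(x)  # activation path needs no plain<->block conversion
```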

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88988
Approved by: https://github.com/jgong5, https://github.com/jansel
2022-11-25 01:16:11 +00:00
Edward Z. Yang
860bae49e4 Suppress guards on as_strided call only. (#89569)
See comment in meta_utils.py for the whole story.

This doesn't have a substantive impact yet, but will in the next
PR on the stack.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89569
Approved by: https://github.com/albanD
2022-11-24 14:01:12 +00:00
Edward Z. Yang
ea50549ce6 Suppress guards when creating fake tensors (#89349)
When we create fake tensors, we may call operators that introduce
guards in order to accurately reconstruct views.  But these guards are spurious:
if a user is able to present a tensor that "looks the same", they have
implicitly fulfilled the contract that the view is creatable.
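
A self-contained sketch of the suppression pattern (the real mechanism lives on ShapeEnv; the names here are illustrative):

```python
import contextlib

class ShapeEnvSketch:
    def __init__(self):
        self.guards = []
        self._suppress = False

    @contextlib.contextmanager
    def suppress_guards(self):
        prev, self._suppress = self._suppress, True
        try:
            yield
        finally:
            self._suppress = prev

    def add_guard(self, expr):
        if not self._suppress:  # spurious guards get dropped here
            self.guards.append(expr)

env = ShapeEnvSketch()
with env.suppress_guards():
    env.add_guard("s0 == 5")  # emitted while reconstructing a view
assert env.guards == []  # the spurious guard never lands
```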

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89349
Approved by: https://github.com/voznesenskym
2022-11-21 23:14:20 +00:00
Sherlock Huang
caf3d5319f Symintify numel(), infer_size, prims.elementwise_meta (#88956)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88956
Approved by: https://github.com/ezyang
2022-11-20 00:42:03 +00:00
PyTorch MergeBot
8ad39536d7 Revert "Symintify numel(), infer_size, prims.elementwise_meta (#88956)"
This reverts commit ce2f8700ba.

Reverted https://github.com/pytorch/pytorch/pull/88956 on behalf of https://github.com/ezyang due to somehow breaks torch.numel
2022-11-19 21:47:55 +00:00
Edward Z. Yang
5582001bd5 Reland 2 "Towards unifying symbolic and non symbolic fake tensor (#89038) (#89143)" (#89346)
This reverts commit 8e4c9828f4.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89346
Approved by: https://github.com/wconstab
2022-11-19 21:14:31 +00:00
PyTorch MergeBot
8e4c9828f4 Revert "Reland "Towards unifying symbolic and non symbolic fake tensor (#89038)" (#89143)"
This reverts commit e686b8c3ba.

Reverted https://github.com/pytorch/pytorch/pull/89143 on behalf of https://github.com/ZainRizvi due to This seems to be causing the test_make_fx_symbolic_exhaustive_rad2deg_cpu_float32 and test_make_fx_symbolic_exhaustive_inplace_rad2deg_cpu_float32 test to fail across multiple jobs
2022-11-17 17:02:36 +00:00