pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
PyTorch MergeBot	3f1a97a99c	Revert "[dynamic shapes] unbacked-safe slicing (#157944 )" This reverts commit `44549c7146`. Reverted https://github.com/pytorch/pytorch/pull/157944 on behalf of https://github.com/pianpwk due to this PR & internal diff landed out of sync, just reverted internal with D80720654, will revert this & reland as codev ([comment](https://github.com/pytorch/pytorch/pull/157944#issuecomment-3215610135))	2025-08-22 20:48:46 +00:00
Pian Pawakapan	44549c7146	[dynamic shapes] unbacked-safe slicing (#157944 ) Generates new unbacked symbols for slice output size & storage offset, when appropriate semantics are unclear. Teaches inductor to codegen the slice with flexible semantics. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157944 Approved by: https://github.com/laithsakka	2025-08-20 22:52:56 +00:00
PyTorch MergeBot	6ea4be1e2e	Revert "[dynamic shapes] unbacked-safe slicing (#157944 )" This reverts commit `2f0cba934d`. Reverted https://github.com/pytorch/pytorch/pull/157944 on behalf of https://github.com/seemethere due to This is blocking internal sync due to merge conflicts ([comment](https://github.com/pytorch/pytorch/pull/157944#issuecomment-3206833193))	2025-08-20 15:16:45 +00:00
Pian Pawakapan	2f0cba934d	[dynamic shapes] unbacked-safe slicing (#157944 ) Generates new unbacked symbols for slice output size & storage offset, when appropriate semantics are unclear. Teaches inductor to codegen the slice with flexible semantics. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157944 Approved by: https://github.com/laithsakka	2025-08-19 17:32:47 +00:00
PyTorch MergeBot	5e98d9f9ba	Revert "[dynamic shapes] unbacked-safe slicing (#157944 )" This reverts commit `56218d85e2`. Reverted https://github.com/pytorch/pytorch/pull/157944 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think this is failing test_draft_export in trunk `56218d85e2` ([comment](https://github.com/pytorch/pytorch/pull/157944#issuecomment-3198874677))	2025-08-19 01:16:17 +00:00
Pian Pawakapan	56218d85e2	[dynamic shapes] unbacked-safe slicing (#157944 ) Generates new unbacked symbols for slice output size & storage offset, when appropriate semantics are unclear. Teaches inductor to codegen the slice with flexible semantics. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157944 Approved by: https://github.com/laithsakka	2025-08-18 22:38:16 +00:00
PyTorch MergeBot	b82aa3df20	Revert "Remove guard_size_oblivious from default contiguity python check, and add aten.sym_is_contiguous. (#159197 )" This reverts commit `e444cd24d4`. Reverted https://github.com/pytorch/pytorch/pull/159197 on behalf of https://github.com/laithsakka due to internal build failures ([comment](https://github.com/pytorch/pytorch/pull/159197#issuecomment-3195436668))	2025-08-18 07:22:13 +00:00
Laith Sakka	e444cd24d4	Remove guard_size_oblivious from default contiguity python check, and add aten.sym_is_contiguous. (#159197 ) This might cause some new DDEs on call sites that do not use is_contiguous_or_false() or sym_is_contiguous(), but we want to find those call sites and handle them properly by calling is_contiguous_or_false() rather than is_contiguous() explicitly when appropriate. I had to fix one issue after removing the implicit size-oblivious reasoning. Here is the context: in https://github.com/pytorch/pytorch/pull/157472 we defined sym_is_contiguous to be the function computing contiguity for dynamic shapes in C++. It returns a symbolic expression that represents contiguity and is guaranteed not to throw a DDE. When people call is_contiguous we do sym_is_contiguous().guard_bool(); when people call is_contiguous_or_false we do sym_is_contiguous().guard_or_false(). One issue not handled well was this path: ``` c10::SymBool TensorImpl::sym_is_contiguous_custom( at::MemoryFormat memory_format) const { if (C10_UNLIKELY(matches_python_custom(SizesStridesPolicy::CustomStrides))) { return pyobj_slot_.load_pyobj_interpreter()->is_contiguous( this, memory_format); } return sym_is_contiguous_default(memory_format); } ``` Namely, if we call sym_is_contiguous_custom while matches_python_custom(SizesStridesPolicy::CustomStrides) returns true, we used to call is_contiguous(this, memory_format). This went through load_pyobj_interpreter and ended up calling the Python is_contiguous, which used implicit size-oblivious reasoning. Once we removed that implicit size-oblivious reasoning, the right thing is to call return pyobj_slot_.load_pyobj_interpreter()->sym_is_contiguous(this, memory_format); otherwise we would get a DDE even if the caller is doing sym_is_contiguous. So I had to define it for pyinterpreter, and then I had to override it for nested tensors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/159197 Approved by: https://github.com/ezyang	2025-08-16 09:15:58 +00:00
Laith Sakka	fea7e9dd37	extract shape in _view_has_unbacked_input (#160255 ) Summary: We were still getting a DDE on reshape. I looked deeper and found an issue in _view_has_unbacked_input: when the input is [[,,]] it needs to be normalized to [..]. Test Plan: existing tests. Rollback Plan: Differential Revision: D79951119 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160255 Approved by: https://github.com/bobrenjc93	2025-08-12 08:38:19 +00:00
Laith Sakka	2523e58781	unbacked handling for view_copy (#159244 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159244 Approved by: https://github.com/bobrenjc93	2025-07-29 07:10:46 +00:00
Laith Sakka	aaa384b2d4	move view_meta to fake impl (#158406 ) The Python dispatcher is not always enabled in fake tensors and has to be called explicitly. While it should be, it requires some work to get all tests working. I have been running into several issues where I had to add enable_python_dispatcher (e.g. XLA, Helom, etc.). To avoid issues related to that for view specifically, I moved it to the fake tensor impl. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158406 Approved by: https://github.com/bobrenjc93	2025-07-25 08:21:27 +00:00
Laith Sakka	0b2ef76e85	DDE-Free select with unbacked index. (#157605 ) When select has a data-dependent input, we can't tell whether the actual index should be index+size or index. To avoid throwing a DDE, we allocate a new unbacked symbol to represent the storage offset of the output view and compute its value dynamically at runtime when inductor is lowered. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157605 Approved by: https://github.com/ColinPeppler	2025-07-24 20:08:05 +00:00
PyTorch MergeBot	23550ab735	Revert "DDE-Free select with unbacked index. (#157605 )" This reverts commit `79d7c754ab`. Reverted https://github.com/pytorch/pytorch/pull/157605 on behalf of https://github.com/laithsakka due to fail pr time benchmarks ([comment](https://github.com/pytorch/pytorch/pull/157605#issuecomment-3084663020))	2025-07-17 16:20:02 +00:00
Laith Sakka	79d7c754ab	DDE-Free select with unbacked index. (#157605 ) When select has a data-dependent input, we can't tell whether the actual index should be index+size or index. To avoid throwing a DDE, we allocate a new unbacked symbol to represent the storage offset of the output view and compute its value dynamically at runtime when inductor is lowered. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157605 Approved by: https://github.com/ColinPeppler	2025-07-17 05:08:11 +00:00
Xuehai Pan	7f14b42adf	[BE][2/16] fix typos in torch/ (torch/_*/) (#156312 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156312 Approved by: https://github.com/albanD	2025-07-12 05:47:06 +00:00
PyTorch MergeBot	e15f4248ad	Revert "[BE][2/16] fix typos in torch/ (torch/_*/) (#156312 )" This reverts commit `7a92b51196`. Reverted https://github.com/pytorch/pytorch/pull/156312 on behalf of https://github.com/XuehaiPan due to landrace ([comment](https://github.com/pytorch/pytorch/pull/156312#issuecomment-3064672250))	2025-07-12 04:40:52 +00:00
Xuehai Pan	7a92b51196	[BE][2/16] fix typos in torch/ (torch/_*/) (#156312 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156312 Approved by: https://github.com/albanD	2025-07-12 01:47:22 +00:00
Laith Sakka	ed5d6d2a20	python definitely_contiguous-> is_contiguous_or_false (#156515 ) We probably can avoid having those in python as well and just depend on c++ impl after we land https://github.com/pytorch/pytorch/pull/155590 but that is for a different PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/156515 Approved by: https://github.com/bobrenjc93	2025-06-30 17:31:51 +00:00
PyTorch MergeBot	75a7d9e868	Revert "python definitely_contiguous-> is_contiguous_or_false (#156515 )" This reverts commit `4c0091fda6`. Reverted https://github.com/pytorch/pytorch/pull/156515 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause some torch.export failures internally ([comment](https://github.com/pytorch/pytorch/pull/156515#issuecomment-3014104570))	2025-06-27 19:07:06 +00:00
Laith Sakka	4c0091fda6	python definitely_contiguous-> is_contiguous_or_false (#156515 ) We probably can avoid having those in python as well and just depend on c++ impl after we land https://github.com/pytorch/pytorch/pull/155590 but that is for a different PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/156515 Approved by: https://github.com/bobrenjc93	2025-06-26 00:47:14 +00:00
Xuehai Pan	162ca185ff	[BE][PYFMT] migrate PYFMT for `torch/_[a-h]*/` to `ruff format` (#144551 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144551 Approved by: https://github.com/ezyang ghstack dependencies: #148186	2025-06-25 06:16:06 +00:00
Laith Sakka	35321b2ad6	remove make_fast_binary_impl from make_fast_binary_impl (#156528 ) This was added in https://github.com/pytorch/pytorch/pull/133584. Take the slow path when we can't determine that the fast path is valid. Pull Request resolved: https://github.com/pytorch/pytorch/pull/156528 Approved by: https://github.com/bobrenjc93	2025-06-21 08:27:54 +00:00
Pian Pawakapan	8ad6197b46	[draft export] avoid storing intermediate real tensors in proxies (#154630 ) Handles GC for non-strict draft export; GPU memory usage shouldn't be much more than eager mode + input tensors now. While trying to do draft export CPU offloading, I found out GC is feasible, because in non-strict, there's 2 places holding references to a `.real_tensor` attribute: 1) the FakeTensors in fake tensor prop, but these are held by the actual variables in the model's forward call, and so the real tensor gets gc-ed along with the fake one when the variable goes out of scope. 2) A clone of the fake tensor in 1) stored in `proxy.node.meta["val"]`, which was added in https://github.com/pytorch/pytorch/pull/150948. But we didn't actually need to store them on intermediate values; the placeholders are enough for retracing/lowering. Avoiding storing the intermediate values in 2), the values in 1) should be naturally GC-ed, and the real-tensor memory usage for non-strict should be pretty similar to eager computation? Strict still OOMs; dynamo still holds these in variable tracking, and not sure how to GC those. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154630 Approved by: https://github.com/angelayi, https://github.com/yushangdi	2025-06-12 01:18:57 +00:00
Oguz Ulgen	d1947a8707	Migrate from lru_cache to cache (#155613 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155613 Approved by: https://github.com/ezyang ghstack dependencies: #155612	2025-06-11 19:44:18 +00:00
Colin Peppler	7b7cd56f5e	[export] support linear & layer_norm unbacked (#155260 ) ## What - use `definitely_contiguous_for_memory_format` instead of `is_contiguous` when the non-contiguous case is fine if we encounter a DDE. - use ref's contiguous over Aten's contiguous because Aten's version will DDE and stop tracing. ref's version will use `definitely_contiguous_for_memory_format` and clone if there's a DDE. ## Example DDEs - Fixed with `definitely_contiguous_for_memory_format` in `fast_binary_impl` ``` torch._dynamo.exc.UserError: Could not guard on data-dependent expression Eq((u0//387), 0) (unhinted: Eq((u0//387), 0)). (Size-like symbols: u0) Caused by: layer_norm = self.layer_norm(linear) # caffe2/test/export/test_export.py:4566 in forward (_subclasses/fake_impls.py:1022 in fast_binary_impl) ``` - Fixed with `refs.contiguous` instead of calling aten's contiguous (that'd require a bigger re-write in Aten) ``` File "c10/core/TensorImpl.h", line 825, in torch::autograd::THPVariable_contiguous(_object, _object, _object) File "c10/core/SymbolicShapeMeta.h", line 87, in c10::TensorImpl::is_contiguous_default(c10::MemoryFormat) const File "c10/core/SymbolicShapeMeta.cpp", line 250, in c10::SymbolicShapeMeta::init_is_contiguous() const torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode: Could not guard on data-dependent expression Eq(128*((u0//387)), 0) (unhinted: Eq(128*((u0//387)), 0)). (Size-like symbols: u0) Caused by: (_refs/__init__.py:3302 in native_layer_norm) ``` - Fixed with `definitely_contiguous_for_memory_format` in ref's contiguous ``` torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode: Could not guard on data-dependent expression 387*((u0//387)) < 2 (unhinted: 387*((u0//387)) < 2). (Size-like symbols: u0) Caused by: (_prims_common/__init__.py:279 in is_contiguous) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/155260 Approved by: https://github.com/laithsakka ghstack dependencies: #155499	2025-06-11 16:47:34 +00:00
bobrenjc93	fc77269262	Add randint_like tensor overload for high (#154899 ) Fixes #135664 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154899 Approved by: https://github.com/StrongerXi	2025-06-06 15:48:00 +00:00
PyTorch MergeBot	5130ac64f4	Revert "Add randint_like tensor overload for high (#154899 )" This reverts commit `72fe1d5f42`. Reverted https://github.com/pytorch/pytorch/pull/154899 on behalf of https://github.com/seemethere due to Failing internal tests see https://fburl.com/diff/bai044ob ([comment](https://github.com/pytorch/pytorch/pull/154899#issuecomment-2942740661))	2025-06-05 04:54:05 +00:00
bobrenjc93	72fe1d5f42	Add randint_like tensor overload for high (#154899 ) Fixes #135664 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154899 Approved by: https://github.com/StrongerXi ghstack dependencies: #154863	2025-06-04 03:37:09 +00:00
Ryan Guo	467235027c	[AOTDispatch] Use the proper meta function for `_amp_foreach_non_finite_check_and_unscale_` (#154930 ) As title, this fixes part of #138412. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154930 Approved by: https://github.com/zou3519	2025-06-03 18:18:40 +00:00
PyTorch MergeBot	0fab32290a	Revert "[draft export] avoid storing intermediate real tensors in proxies (#154630 )" This reverts commit `5acb8d5080`. Reverted https://github.com/pytorch/pytorch/pull/154630 on behalf of https://github.com/malfet due to This still ooms, at least occasionally see `78624679a8/1` ([comment](https://github.com/pytorch/pytorch/pull/154630#issuecomment-2923759745))	2025-05-31 00:07:56 +00:00
Pian Pawakapan	5acb8d5080	[draft export] avoid storing intermediate real tensors in proxies (#154630 ) Handles GC for non-strict draft export; GPU memory usage shouldn't be much more than eager mode + input tensors now. While trying to do draft export CPU offloading, I found out GC is feasible, because in non-strict, there's 2 places holding references to a `.real_tensor` attribute: 1) the FakeTensors in fake tensor prop, but these are held by the actual variables in the model's forward call, and so the real tensor gets gc-ed along with the fake one when the variable goes out of scope. 2) A clone of the fake tensor in 1) stored in `proxy.node.meta["val"]`, which was added in https://github.com/pytorch/pytorch/pull/150948. But we didn't actually need to store them on intermediate values; the placeholders are enough for retracing/lowering. Avoiding storing the intermediate values in 2), the values in 1) should be naturally GC-ed, and the real-tensor memory usage for non-strict should be pretty similar to eager computation? Strict still OOMs; dynamo still holds these in variable tracking, and not sure how to GC those. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154630 Approved by: https://github.com/angelayi, https://github.com/yushangdi	2025-05-30 21:06:55 +00:00
Pian Pawakapan	4166373908	[dynamic shapes] guard_or_false for infer_size (#152146 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/152146 Approved by: https://github.com/laithsakka	2025-05-08 21:27:22 +00:00
Pian Pawakapan	5521e6b671	[export] support SymInt minlength for torch.bincount() (#152497 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152497 Approved by: https://github.com/angelayi	2025-05-01 00:45:58 +00:00
angelayi	7deed1946f	Fix assert_tensor_meta (#150808 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150808 Approved by: https://github.com/pianpwk ghstack dependencies: #150806, #150807	2025-04-14 19:28:54 +00:00
Shangdi Yu	92e81cf41a	Add real_tensor to the FakeTensor in node.meta["val"] (#150948 ) Summary: We need real_tensor on the FakeTensor in node.meta["val"] in order to aot_compile the draft exported programs. Otherwise, we cannot propagate real tensors even when fake_mode.propagate_real_tensors = True. This also fixes real tensor propagation in `run_decomposition()`. Test Plan: ``` buck2 run @mode/dev-nosan caffe2/test:test_export -- -r test_dedup_data_dependent_failure ``` Differential Revision: D72732714 Pull Request resolved: https://github.com/pytorch/pytorch/pull/150948 Approved by: https://github.com/angelayi	2025-04-10 00:11:46 +00:00
Shangdi Yu	cfab04d01b	Fix aten.div type promotion for FakeTensor (#150874 ) Summary: When we divide a FakeTensor by an integer using the fast op implementation, the type promotion should be `ELEMENTWISE_TYPE_PROMOTION_KIND.INT_TO_FLOAT` so we get a float when dividing an int FakeTensor by an integer. ``` FAST = get_fast_op_impls() fast_div = FAST[torch.ops.aten.div.Tensor] fast_div(fake_tensor, some_int) ``` Test Plan: ``` python test/test_fake_tensor.py -k test_fast_div ``` Differential Revision: D72667430 Pull Request resolved: https://github.com/pytorch/pytorch/pull/150874 Approved by: https://github.com/angelayi	2025-04-09 18:52:01 +00:00
angelayi	ea188ac0c7	[export] Add meta for aten.bincount (#147129 ) Fixes https://github.com/pytorch/pytorch/issues/147094 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147129 Approved by: https://github.com/pianpwk	2025-02-14 10:33:54 +00:00
rzou	2e5886dcc4	Add fake_impl for unique_consecutive (#145649 ) Summary: It's fairly similar to torch.unique and torch.unique_dim. Test Plan: New test Pull Request resolved: https://github.com/pytorch/pytorch/pull/145649 Approved by: https://github.com/ezyang, https://github.com/eellison	2025-01-29 22:33:16 +00:00
Edward Z. Yang	87fdadde1d	Remove FFT from stride incorrect ops (#145080 ) I gotta say, the FFT implementation is completely insane, there's gotta be a better way to do this than repeatedly inplace restriding the output tensor. Anyway, this is a faithful translation of both the MKL and cuFFT paths to Python. Fixes https://github.com/pytorch/pytorch/issues/135087 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/145080 Approved by: https://github.com/Skylion007, https://github.com/albanD ghstack dependencies: #145530	2025-01-27 04:26:04 +00:00
Edward Z. Yang	90448f0128	Output of nonzero is transposed, fix fake tensor (#144695 ) Needs this companion executorch PR: https://github.com/pytorch/executorch/pull/7657 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/144695 Approved by: https://github.com/bobrenjc93, https://github.com/albanD	2025-01-26 01:07:22 +00:00
PyTorch MergeBot	ad36f4f42c	Revert "Add generator parameter to rand*_like functions (#136780 )" This reverts commit `c7b2f7dd14`. Reverted https://github.com/pytorch/pytorch/pull/136780 on behalf of https://github.com/izaitsevfb due to internal regression ([comment](https://github.com/pytorch/pytorch/pull/136780#issuecomment-2613191933))	2025-01-24 19:00:21 +00:00
PyTorch MergeBot	f0a210bf5d	Revert "Output of nonzero is transposed, fix fake tensor (#144695 )" This reverts commit `693d8c7e94`. Reverted https://github.com/pytorch/pytorch/pull/144695 on behalf of https://github.com/izaitsevfb due to breaking internal tests, see D68461259 ([comment](https://github.com/pytorch/pytorch/pull/144695#issuecomment-2608443589))	2025-01-22 23:04:50 +00:00
Edward Z. Yang	693d8c7e94	Output of nonzero is transposed, fix fake tensor (#144695 ) Needs this companion executorch PR: https://github.com/pytorch/executorch/pull/7657 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/144695 Approved by: https://github.com/bobrenjc93, https://github.com/albanD	2025-01-21 20:50:09 +00:00
Sam	c7b2f7dd14	Add generator parameter to rand*_like functions (#136780 ) Fixes #128786 Fixes #101974 Fixes #27072 Pull Request resolved: https://github.com/pytorch/pytorch/pull/136780 Approved by: https://github.com/Chillee, https://github.com/ezyang	2025-01-15 21:16:52 +00:00
Tom Ritchford	dc23f1944a	Remove unused Python variables in torch/[_-a]* (#133492 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492 Approved by: https://github.com/albanD	2024-12-12 17:39:14 +00:00
PyTorch MergeBot	5c97ac9721	Revert "Remove unused Python variables in torch/[_-a]* (#133492 )" This reverts commit `fda975a7b3`. Reverted https://github.com/pytorch/pytorch/pull/133492 on behalf of https://github.com/clee2000 due to Sorry, I need to revert this in order to revert something else. The only thing you need to do is rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/133492#issuecomment-2536635516))	2024-12-11 17:29:12 +00:00
Tom Ritchford	fda975a7b3	Remove unused Python variables in torch/[_-a]* (#133492 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492 Approved by: https://github.com/albanD	2024-12-10 21:48:44 +00:00
Aaron Gokaslan	12e95aa4ee	[BE]: Apply PERF401 autofixes from ruff (#140980 ) * Automatically applies ruff rule 401. Turns loops into equivalent list comprehensions which are faster and do not leak the scope of the loop variables. * list comprehensions not only often have better typing, but are 50+% faster than for loops on overhead. They also preserve length information etc and are better for the interpreter to optimize. * Manually went back and made mypy happy after the change. * Also fixed style lints in files covered by flake8 but not by pyfmt Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-11-20 17:52:07 +00:00
Tugsbayasgalan Manlaibaatar	2b21a653d8	Register CIA ops to FakeTensorMode directly in export (#140465 ) During export, we stub out most CIA ops to return NotImplemented to avoid decomposing them during tracing. To recover the existing shape propagation behavior, we register these CIA decomps directly as FakeTensorMode rules as well. We have to do this because when we return NotImplemented, FakeTensor falls back to running these CIAs with the Meta backend, causing device-branching CIA ops to fail (because the device is now Meta; one example is sdpa). If we register a kernel directly to FakeTensorMode, we won't fall back to the Meta backend. Differential Revision: [D65716260](https://our.internmc.facebook.com/intern/diff/D65716260/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140465 Approved by: https://github.com/bdhirsh	2024-11-19 15:00:35 +00:00
Jack Zhang	dd688099af	Update unbacked symints in torch.nonzero more precisely (#137663 ) ### Summary The fake impl for `nonzero` sets the symint's upper range to `sys.maxsize - 1` if there are any SymInts in the original input tensor shape. This PR constrains the range more intelligently by using the upper ranges of each SymInt in the input tensor shape. See https://github.com/pytorch/pytorch/pull/134899 as a merged solution for a similar problem for a different op. ### Test plan Added unit test to verify upper bound reduction calculation (`python test/export/test_export.py TestExport.test_nonzero_dynamic`) Pull Request resolved: https://github.com/pytorch/pytorch/pull/137663 Approved by: https://github.com/ezyang	2024-10-28 20:57:23 +00:00
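
Note on the last entry (#137663): the message says the fake impl for `nonzero` now derives the unbacked symint's upper range from the upper ranges of each SymInt in the input shape, instead of defaulting to `sys.maxsize - 1`. A minimal sketch of that idea follows, in plain Python and without torch. The helper name `nonzero_count_upper_bound` and the use of `None` for an unknown bound are hypothetical, and taking the product of the per-dimension bounds is an assumption (nonzero can return at most one row per input element); the actual PyTorch implementation may differ.

```python
import math
import sys


def nonzero_count_upper_bound(dim_upper_bounds):
    """Hypothetical helper: bound the number of rows nonzero can return.

    dim_upper_bounds holds one entry per input dimension: an int upper
    bound for that dimension, or None when no bound is known.
    """
    # Loose default described in the commit message.
    default = sys.maxsize - 1
    if any(ub is None for ub in dim_upper_bounds):
        return default
    # Assumption: at most one nonzero row per element, so the element
    # count implied by the per-dimension bounds is a valid upper bound.
    return min(math.prod(dim_upper_bounds), default)


# A tensor whose dims are bounded by 4 and 8 has at most 32 nonzero entries.
print(nonzero_count_upper_bound([4, 8]))
```

With an unknown bound on any dimension, the sketch falls back to the loose default rather than guessing.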


96 Commits