Commit Graph

32108 Commits

Author SHA1 Message Date
Jerry Zhang
1b51d29b66 [quant][pt2e] Enable constant folding for quantize ops (#109343)
Summary:
This PR adds constant folding for quantize ops so that instead of storing fp32 weights in the
quantized model, we store int8/int16 etc. weights.
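
A minimal sketch of the idea, not the PR's implementation: a quantize op applied to a constant fp32 weight can be evaluated once ahead of time and its integer payload stored instead.

```python
import torch

# Constant weight that would otherwise be stored as fp32 in the program.
fp32_weight = torch.randn(4, 4)

# Before folding: the graph keeps fp32_weight and quantizes it at runtime.
qweight = torch.quantize_per_tensor(fp32_weight, scale=0.1, zero_point=0, dtype=torch.qint8)

# After folding: the int8 representation is computed once and persisted.
int8_weight = qweight.int_repr()
print(int8_weight.dtype)  # torch.int8
```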

Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_fold_quantize

Will also verify in ExecuTorch later.

Differential Revision: [D49399210](https://our.internmc.facebook.com/intern/diff/D49399210)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109343
Approved by: https://github.com/kimishpatel, https://github.com/jgong5
2023-09-27 06:04:45 +00:00
Angela Yi
ddbf1aab64 [export] Add dynamic_shapes to _export.aot_compile (#110101)
Summary: Following the new dynamic_shapes API (introduced in https://github.com/pytorch/pytorch/pull/108448), we also add a dynamic_shapes API to _export.aot_compile.
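
A usage sketch, assuming the new keyword mirrors the torch.export dynamic_shapes API; the exact call shape here is an assumption, not taken from the PR:

```python
import torch
from torch.export import Dim

def fn(x):
    return x.sin()

batch = Dim("batch")

# Hypothetical call: dynamic_shapes maps each positional input to its
# dynamic dimensions, mirroring torch.export.
so_path = torch._export.aot_compile(
    fn,
    (torch.randn(4, 8),),
    dynamic_shapes=({0: batch},),
)
```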

Test Plan: CI

Differential Revision: D49653815

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110101
Approved by: https://github.com/gmagogsfm
2023-09-27 04:10:22 +00:00
Edward Z. Yang
f7c9ef88f5 Add masked_select abstract impl (#110103)
Fixes https://github.com/pytorch/pytorch/issues/109871

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110103
Approved by: https://github.com/bdhirsh
2023-09-27 04:07:58 +00:00
SS-JIA
dec140f1ea [core IR] Add a core decomposition for aten.all (#110093)
## Context

Change the ref implementation of `aten.all` to only use other `torch` operators such that we can use it for the core ATen decomposition table. This will replace the decomposition for `aten.all` that was used specifically by Inductor.
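
An assumed form of such a decomposition (the PR's exact ref implementation may differ):

```python
import torch

def all_decomp(x: torch.Tensor) -> torch.Tensor:
    # aten.all in terms of other torch ops: every element is truthy
    # iff no element's negation is truthy.
    return torch.logical_not(torch.any(torch.logical_not(x)))

t = torch.tensor([True, True, False])
assert all_decomp(t) == torch.all(t)
```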

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110093
Approved by: https://github.com/manuelcandales, https://github.com/peterbell10, https://github.com/lezcano
2023-09-27 01:31:41 +00:00
Yukio Siraichi
51a8c166a6 Add test for ShapeEnv recording fallback. (#109944)
This PR adds a test for the previous PR in this stack: #109904. In summary, it calls
functions decorated with `@record_shapeenv_event` that don't have an explicit `ShapeEnv`
parameter, with arguments that don't hold a `ShapeEnv` instance.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109944
Approved by: https://github.com/ezyang
2023-09-27 00:50:14 +00:00
SS-JIA
9928c10e71 [core IR] Add glu as a core decomposition (#110043)
## Context

Add a decomposition for `aten.glu` to the core ATen decomposition table. Don't use it in the Inductor decomposition table, since Inductor has a lowering for it.
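
A sketch of the standard glu split-and-gate form (assumed; may differ from the PR's decomposition):

```python
import torch
import torch.nn.functional as F

def glu_decomp(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # glu halves the input along `dim` and gates one half with the
    # sigmoid of the other.
    a, b = x.chunk(2, dim=dim)
    return a * torch.sigmoid(b)

x = torch.randn(3, 8)
torch.testing.assert_close(glu_decomp(x), F.glu(x, dim=-1))
```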

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110043
Approved by: https://github.com/peterbell10, https://github.com/lezcano
ghstack dependencies: #110046
2023-09-27 00:23:05 +00:00
Yang Chen
4d0ae7c9da [inductor] support _scaled_dot_product_flash_attention fallback (#110085)
Summary:
This PR supports the _scaled_dot_product_flash_attention fallback kernel.
Note that in abi_compatible mode, we retrieve outputs by passing
output-argument pointers rather than relying on std::get.

It also fixes an issue related to dynamic shapes, where we wrongly
queried undefined dynamic symbols.

Test Plan: ci

Reviewed By: frank-wei

Differential Revision: D49620191

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110085
Approved by: https://github.com/desertfire
2023-09-27 00:09:56 +00:00
Shiyan Deng
19ca883f8b [pytorch][jit] allow passing in obj loader in unpickle api (#109730)
Summary: We are trying to use wire messages to pass Python objects like KJT. To make JIT able to unpickle them, we need to provide a type resolver as well as an obj loader. This diff modifies the interface so that we can do that.

Test Plan:
Rely on current CI to make sure existing usage doesn't break.

In the next diff, test e2e

Differential Revision: D49438569

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109730
Approved by: https://github.com/davidberard98
2023-09-26 23:50:20 +00:00
Edward Z. Yang
3262c5358f Use _check_is_size for validate_dim_length (#109849)
_check_is_size has some extra juice for unbacked SymInts, so use it.
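
A usage sketch (assumed call pattern for the private helper):

```python
import torch

def validate_dim_length(length):
    # Unlike a plain bounds assert, _check_is_size also teaches the
    # symbolic shape machinery that an unbacked SymInt is a valid
    # (non-negative) size.
    torch._check_is_size(length)

validate_dim_length(3)
```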

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109849
Approved by: https://github.com/yanboliang
2023-09-26 23:33:31 +00:00
Wanchao Liang
27443eadeb [dtensor][7/n] remove reduction rule (#109144)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109144
Approved by: https://github.com/fduwjj
ghstack dependencies: #108263, #108264
2023-09-26 22:24:50 +00:00
Wanchao Liang
2dd9a79d22 [dtensor][6/n] refactor reduction to use op strategy (#108264)
This PR refactors the reduction op to use strategy-based propagation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108264
Approved by: https://github.com/fduwjj
ghstack dependencies: #108263
2023-09-26 22:24:50 +00:00
Wanchao Liang
986d255db2 [dtensor][5/n] switch random ops to op strategy (#108263)
This PR switches the random ops to use op strategy instead of rule-based
propagation. It is the first in a series of PRs refactoring ops after the
op dispatch logic refactor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108263
Approved by: https://github.com/fduwjj
2023-09-26 22:24:42 +00:00
Richard Zou
bb9779ecd2 Revert D49640259: Revert D49615962: [optests] Test names in failure dicts should be prefixed with test class (#110094)
Summary: Revert D49640259: Revert D49615962: [optests] Test names in failure dicts should be prefixed with test class

Test Plan: revert-hammer

Differential Revision: D49645397

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110094
Approved by: https://github.com/izaitsevfb
2023-09-26 21:16:36 +00:00
PyTorch MergeBot
194d9aa0f2 Revert "[Dynamo] Match closures by code ID (#109427)"
This reverts commit 3de0857503.

Reverted https://github.com/pytorch/pytorch/pull/109427 on behalf of https://github.com/voznesenskym due to Fails test `PYTORCH_TEST_WITH_DYNAMO=1 python test_ops.py -k test_out_warning__refs_cat_cpu ([comment](https://github.com/pytorch/pytorch/pull/109427#issuecomment-1736101561))
2023-09-26 18:54:36 +00:00
Angela Yi
a7409695bb [export] Verifier for exported program (#109519)
Summary:
X-link: https://github.com/pytorch/executorch/pull/292

Added a verifier for the graph signature in a exported program

Test Plan: CI

Differential Revision: D48926643

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109519
Approved by: https://github.com/zhxchen17
2023-09-26 18:47:43 +00:00
Jane Xu
0a60219fe3 [foreach] Fix 0-size handling for real for real (#109402)
@crcrpar's last attempt to fix the 0-size problem unfortunately did not pass all cases. See my comment in https://github.com/pytorch/pytorch/issues/100701. When we have a tail tensor of size 0, the old code would mess with the chunk logic to check the previous tensor's length. This is flawed because:
1. if the previous tensor was also 0-sized (so a tensor list of [tensor, tensor, tensor, ..., 0-sized tensor, 0-sized tensor]), chunks would still be 0 and the nested for loop would be skipped.
2. the nested for loop introduces side effects on tensorListMeta that _shouldn't_ be there! This can mess up the compute in unexpected ways that I haven't fully reasoned through.

An internal report alerted us that the problem had not been fixed. This PR solves the issue by:
- removing the finagling of chunks when the tail tensor is 0-sized
- adding a surefire way for the kernel to be launched when the last tensor is 0-sized AND there's content in the metadata, signifying there is still work to compute (see the repro sketch below).
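
A hypothetical repro sketch of the failure mode (assumes a CUDA device; the shapes from the internal report are unknown):

```python
import torch

# A foreach op over a list whose tail tensors are 0-sized: previously the
# trailing empty tensors could derail the launch covering the earlier work.
tensors = [
    torch.ones(5, device="cuda"),
    torch.empty(0, device="cuda"),
    torch.empty(0, device="cuda"),
]
out = torch._foreach_add(tensors, 1.0)
assert torch.equal(out[0], tensors[0] + 1) and out[1].numel() == 0
```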

## test plan

As I went through the code, I also added some comments explaining what's going on, and modified our tensor inputs to ensure this case is exercised by the test_parity test in test_foreach.py. Yes, I do realize there is quite a bit of duplication and that this file is due for a refactor. That said, the primary goal of this PR is to fix a fairly egregious bug; refactoring can be a follow-up.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109402
Approved by: https://github.com/albanD
2023-09-26 17:38:20 +00:00
Rodrigo Kumpera
317e39a8ad [C10d] Cleanup collective sequence number. (#109136)
Sequence numbers must be associated with a Work object
if we want to use them as a way to report collective progress.

The API surface change introduces Work::getSequenceNumber, which
should eventually be exposed to Python.

The bulk of this change makes gloo always use the sequence number
and weaves it through the dozens of subclasses of Work.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109136
Approved by: https://github.com/fduwjj
2023-09-26 17:17:04 +00:00
Li-Huai (Allan) Lin
d91492a7a4 [MPS] Fix sort with empty tensor. (#109584)
Fixes #107284
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109584
Approved by: https://github.com/kulinseth, https://github.com/albanD
ghstack dependencies: #109557, #109574
2023-09-26 16:30:38 +00:00
Bin Bao
993530ee4f [aotinductor] Relax the CUDAGuard device index check (#110030)
Summary: Although AOTInductor only supports running on a single CUDA device, it does work when there is a mix of CPU and CUDA ops. So instead of asserting the first time a CUDA device index appears, we check that only one CUDA device index is used. This solves https://github.com/pytorch/pytorch/issues/109655
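
A sketch of the relaxed check (assumed logic, not the PR's actual codegen):

```python
import torch

seen_cuda_indices = set()

def on_device(device: torch.device):
    # Old behavior: assert the first time any CUDA index appears.
    # New behavior: CPU devices pass freely; CUDA is limited to one index.
    if device.type == "cuda":
        seen_cuda_indices.add(device.index)
        assert len(seen_cuda_indices) == 1, "AOTInductor supports a single CUDA device"

on_device(torch.device("cpu"))
on_device(torch.device("cuda", 0))
```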

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110030
Approved by: https://github.com/jansel
2023-09-26 16:23:23 +00:00
leslie-fang-intel
0dcea70bfd fix sfdp pattern 13 accuracy issue (#110001)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110001
Approved by: https://github.com/eellison
2023-09-26 15:23:45 +00:00
PyTorch MergeBot
2393864070 Revert "[optests] Test names in failure dicts should be prefixed with test class (#110045)"
This reverts commit 76fcec74c4.

Reverted https://github.com/pytorch/pytorch/pull/110045 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/110045#issuecomment-1735711094))
2023-09-26 14:56:08 +00:00
rzou
ea20db8aa0 [optests] Excise unused operator_compile_check (#110011)
The recommendation is to just use `opcheck`, which has superseded all
uses of `operator_compile_check`.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110011
Approved by: https://github.com/ezyang
ghstack dependencies: #109912
2023-09-26 13:24:21 +00:00
PyTorch MergeBot
812bf847b7 Revert "Add test for ShapeEnv recording fallback. (#109944)"
This reverts commit a4dec8d306.

Reverted https://github.com/pytorch/pytorch/pull/109944 on behalf of https://github.com/atalman due to New test failing internally ([comment](https://github.com/pytorch/pytorch/pull/109944#issuecomment-1735512734))
2023-09-26 13:11:22 +00:00
Peter Bell
92d86cd1ad [inductor] Fix triton compiler error in multilayer any (#109325)
Fixes #109196

When we have a split reduction and the tensor is not an even multiple of the split size,
we use `ops.masked` to pad to an even multiple. In the case here we generated:
```python
tmp5 = tl.where(mask, tmp4, 0)
```

which implicitly promotes our boolean value to `int32`. The fix is to give the default
value the same dtype as `result`.
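
A hypothetical user-level repro of this class of failure (the exact shape from #109196 may differ):

```python
import torch

# A boolean reduction big enough to trigger a split ("multilayer")
# reduction, sized so it is not an even multiple of the split size.
fn = torch.compile(lambda x: x.any())
x = torch.zeros(2**20 + 1, dtype=torch.bool, device="cuda")
print(fn(x))  # previously failed to compile due to the int32-typed default
```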

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109325
Approved by: https://github.com/lezcano
2023-09-26 12:29:29 +00:00
PyTorch MergeBot
1b90f07f5a Revert "Reland "Update AOTAutograd to use FunctionalTensorMode instead of C++ functionalization (#106406)" (#109906)"
This reverts commit d0fe8fa5db.

Reverted https://github.com/pytorch/pytorch/pull/109906 on behalf of https://github.com/atalman due to Breaks internal tests ([comment](https://github.com/pytorch/pytorch/pull/109906#issuecomment-1735416852))
2023-09-26 12:10:25 +00:00
wz337
8140494afd [3/N][2D] Enable training with new 2D flow (#110034)
Replacing https://github.com/pytorch/pytorch/pull/109553, which was reverted.

This PR enables training with the new 2D flow and adds an associated test. In addition, it moves the FSDP-specific parts of tensor/parallel/_data_parallel_utils.py back to tensor/parallel/fsdp.py to avoid a circular dependency for ddp.py and test/distributed/tensor/parallel/test_ddp_2d_parallel.py.

state_dict-related changes will come in later PRs.

cc. @fegin, @fduwjj, @wanchaol, @awgu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110034
Approved by: https://github.com/fduwjj
2023-09-26 09:14:15 +00:00
Animesh Jain
0673aa3d28 [dynamo][guards-log] Print nn module guard saved dict versions for debugging (#110028)
This is the output for nn module guards

~~~
[DEBUG] GUARDS:
[DEBUG] hasattr(L['x'], '_dynamo_dynamic_indices') == False           # _dynamo/variables/builder.py:1356 in wrap_fx_proxy_cls
[DEBUG] ___check_obj_id(L['self'], 139820807110912)                   # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] __nn_module_guard_0(L['self']) # versions(mod=9998, _parameters=1194395, _buffers=1194397, _modules=1194423, _forward_hooks=1194405, _forward_pre_hooks=1194411, _backward_hooks=1194402, _backward_pre_hooks=1194400)  # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] ___check_obj_id(L['self'].mods[0], 139817945727568)           # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] __nn_module_guard_1(L['self'].mods[0]) # versions(mod=10001, _parameters=1194428, _buffers=1194430, _modules=1194522, _forward_hooks=1194438, _forward_pre_hooks=1194444, _backward_hooks=1194435, _backward_pre_hooks=1194433)  # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] ___check_obj_id(L['self'].mods[1], 139817945560640)           # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] __nn_module_guard_2(L['self'].mods[1]) # versions(mod=10001, _parameters=1194660, _buffers=1194662, _modules=1194753, _forward_hooks=1194670, _forward_pre_hooks=1194676, _backward_hooks=1194667, _backward_pre_hooks=1194665)  # for mod in self.mods:  # examples/graph_break.py:35 in forward
[DEBUG] ___check_obj_id(L['self'].mods[0].linear, 139817945727856)    # return self.linear(a)  # examples/graph_break.py:24 in helper
[DEBUG] __nn_module_guard_3(L['self'].mods[0].linear) # versions(mod=10004, _parameters=1470004, _buffers=1194467, _modules=1194493, _forward_hooks=1194475, _forward_pre_hooks=1194481, _backward_hooks=1194472, _backward_pre_hooks=1194470)  # return self.linear(a)  # examples/graph_break.py:24 in helper
[DEBUG] ___check_obj_id(L['self'].mods[1].linear, 139817945561120)    # return self.linear(a)  # examples/graph_break.py:24 in helper
[DEBUG] __nn_module_guard_4(L['self'].mods[1].linear) # versions(mod=10004, _parameters=1470008, _buffers=1194699, _modules=1194725, _forward_hooks=1194707, _forward_pre_hooks=1194713, _backward_hooks=1194704, _backward_pre_hooks=1194702)  # return self.linear(a)  # examples/graph_break.py:24 in helper
[DEBUG] utils_device.CURRENT_DEVICE == None                           # _dynamo/output_graph.py:373 in init_ambient_guards
~~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110028
Approved by: https://github.com/ezyang
ghstack dependencies: #110023, #110039
2023-09-26 08:53:07 +00:00
SS-JIA
5df8aca994 [core IR] Add a core decomposition for floor_divide (#110046)
## Context

Introduce a core decomposition of `aten.floor_divide` into other `aten` ops, and add it to the core ATen decomposition table.
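
One assumed form such a decomposition can take (the PR's exact decomposition may differ):

```python
import torch

def floor_divide_decomp(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # floor_divide is division rounded toward negative infinity,
    # i.e. div with floor rounding.
    return torch.div(a, b, rounding_mode="floor")

a = torch.tensor([7.0, -7.0])
b = torch.tensor([2.0, 2.0])
torch.testing.assert_close(floor_divide_decomp(a, b), torch.floor_divide(a, b))
```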

This replaces the decomposition of `floor_divide` that was used by Inductor. I noticed there was a note on that decomposition

```
# TorchInductor-only decomposition. It should not be taken to core.
# See https://github.com/pytorch/torchdynamo/pull/1120
```

but couldn't discern why that is the case. cc: @lezcano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110046
Approved by: https://github.com/peterbell10
2023-09-26 08:39:21 +00:00
Yukio Siraichi
26e8cc0465 Add test for ShapeEnv state when not recording. (#109945)
This PR adds a test for checking `ShapeEnv` state when it's built with
`should_record_events=False`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109945
Approved by: https://github.com/ezyang
ghstack dependencies: #109904, #109944
2023-09-26 07:20:46 +00:00
Animesh Jain
2ac7e52d34 [dynamo][nn_module_guards] Config flag to disable nn_module_guards (#110039)
This flag was requested by @Chillee, who is seeing recompilations in simple GPT experiments. We are observing recompilations because the `_parameters` ordered dict keeps changing from run to run, and it's unclear why that is happening.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110039
Approved by: https://github.com/Chillee
ghstack dependencies: #110023
2023-09-26 06:35:23 +00:00
rzou
76fcec74c4 [optests] Test names in failure dicts should be prefixed with test class (#110045)
We want to use the same failures dict for multiple TestCases; this is
common in e.g. fbgemm. To move toward that, we need to prefix each test name
with its test class to avoid ambiguity (see the sketch below).
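
A hypothetical failures-dict entry illustrating the new keying (field names and statuses are illustrative, not the library's exact schema):

```python
failures_dict = {
    "mylib::my_op": {
        # Test names are now qualified by their TestCase class, so two
        # classes can share one failures dict without ambiguity.
        "TestFoo.test_aot_dispatch_static": {"status": "xfail", "comment": ""},
        "TestBar.test_aot_dispatch_static": {"status": "skip", "comment": ""},
    },
}
```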

Differential Revision: [D49615962](https://our.internmc.facebook.com/intern/diff/D49615962/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110045
Approved by: https://github.com/williamwen42
2023-09-26 03:21:12 +00:00
Jez Ng
41bb5c27a2 Enable typechecking for _inductor/fx_passes/joint_graph.py (#109955)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109955
Approved by: https://github.com/Skylion007
ghstack dependencies: #109951, #109952, #109954
2023-09-26 02:49:43 +00:00
Jez Ng
86762f33d1 Enable typechecking for _inductor/fx_passes/pad_mm.py (#109954)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109954
Approved by: https://github.com/Skylion007
ghstack dependencies: #109951, #109952
2023-09-26 02:49:43 +00:00
Jez Ng
55f8553078 Enable typechecking for _inductor/fx_passes/pre_grad.py (#109952)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109952
Approved by: https://github.com/Skylion007
ghstack dependencies: #109951
2023-09-26 02:49:42 +00:00
Jez Ng
89fc66fb36 Enable typechecking for _inductor/fx_passes/split_cat.py (#109951)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109951
Approved by: https://github.com/Skylion007
2023-09-26 02:49:40 +00:00
rzou
f8fcc54f70 Add torch.library.impl_abstract (#109912)
Changelog:
- torch.library.impl_abstract optionally accepts a torch.library.Library
  object. If passed in, then the lifetime of the registration is tied to
  the Library object.
- we've also changed torch.library.impl_abstract to work on all
  operators, including overloads.
- we refactored the `torch._custom_ops.*` and `torch._custom_op.*`
  impl_abstract APIs and put them under torch._library. This is their
  final resting place. I will follow up by deleting
  all the `torch._custom_ops.*` stuff later.
- There is a new "SimpleOperatorRegistry" where we actually collect the
  abstract_impl. We will expand this to also hold the other
  torch._custom_ops.* APIs when we move those to torch.library

NB: We previously designed
`impl_abstract` assuming a very high-level Python-only custom op API.
We've since revisited that; now, impl_abstract works for all custom ops,
whether Python or C++, whatever the schema. The new refactored design
reflects this better.
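
A usage sketch of the Library-scoped registration (the namespace and op here are made up for illustration):

```python
import torch

lib = torch.library.Library("mylib", "DEF")
lib.define("bar(Tensor x) -> Tensor")

# Passing lib= ties the abstract impl's lifetime to the Library object.
@torch.library.impl_abstract("mylib::bar", lib=lib)
def bar_abstract(x):
    # Abstract (meta) impl: describe output metadata without real compute.
    return torch.empty_like(x)
```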

Test Plan:
- existing and new tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109912
Approved by: https://github.com/ezyang
2023-09-26 01:59:50 +00:00
Animesh Jain
b481349d3c [dynamo][guards-log] Do not print duplicate guard entries (#110023)
Cleans up logs for nn module guards, which always get duplicated.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110023
Approved by: https://github.com/ezyang
2023-09-26 01:59:25 +00:00
Yuqing Jiang
56659844f9 [profiler] Show shapes for lists of tensors in chrome traces #109263 (#109751)
Summary:
https://github.com/pytorch/pytorch/issues/109263
Show the shapes of a tensor list when its length is < 30.
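
A usage sketch: with `record_shapes=True`, ops that take tensor lists (e.g. `torch.stack`) report per-element shapes in the exported trace.

```python
import torch
from torch.profiler import ProfilerActivity, profile

with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    torch.stack([torch.randn(2, 3) for _ in range(4)])

# The stack op's recorded input dims now list each tensor's shape.
prof.export_chrome_trace("trace.json")
```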

Test Plan:
{F1097707985}
and unit tests

Reviewed By: davidberard98

Differential Revision: D49351902

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109751
Approved by: https://github.com/davidberard98
2023-09-26 01:03:54 +00:00
Bin Bao
4bf1cd6961 [aotinductor] Rename aot_runtime to aoti_runtime (#110007)
Summary: Make the naming more explicit

Differential Revision: D49593528

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110007
Approved by: https://github.com/houseroad
2023-09-26 00:46:54 +00:00
Yanbo Liang
a81cb0de16 [Dynamo] Support python class member_descriptor (#109956)
Fixes Meta-internal cases.
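
For context, `member_descriptor` is the CPython type behind `__slots__` attributes; a minimal illustration:

```python
class Point:
    __slots__ = ("x", "y")

print(type(Point.x).__name__)  # member_descriptor
```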

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109956
Approved by: https://github.com/jansel
2023-09-26 00:03:41 +00:00
Edward Z. Yang
5f6216b12c Add torch.fx.experimental.recording to uninteresting_files() (#109887)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109887
Approved by: https://github.com/Chillee
2023-09-25 23:22:29 +00:00
Mu-Chu Lee
7af30ea54c [AOTInductor] Bug fix for redefining symbol name (#110041)
Summary:
Bug fix for redefining symbol name.

Test Plan:
python benchmarks/dynamo/huggingface.py --bfloat16 --accuracy --inference --device cuda --export-aot-inductor --cold-start-latency --only OPTForCausalLM

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110041
Approved by: https://github.com/desertfire
2023-09-25 23:03:06 +00:00
Andrei Gheorghe
6275f91654 Improved DDP checkpoint documentation (#106985)
Amended the documentation for the specified case.

Fixes #84589

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106985
Approved by: https://github.com/wanchaol, https://github.com/fduwjj
2023-09-25 22:54:24 +00:00
Sam Larsen
7ed06e8317 [inductor] enable mypy checking in torch/_inductor/codegen/cpp.py (#109729)
Summary: Add enough typehints / ignores to enable mypy checking in torch/_inductor/codegen/cpp.py

Test Plan: lintrunner

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109729
Approved by: https://github.com/Skylion007
2023-09-25 22:53:05 +00:00
Pritam Damania
ab70183c53 [RFC] Allow "spawn" start method for torchinductor workers. (#108850)
Context: https://github.com/pytorch/pytorch/issues/108586

This PR adds a config to TorchInductor so users can specify the multiprocessing context for the TorchInductor workers in codecache.

This gives users the choice of "spawn" in multithreaded environments instead of the hardcoded "fork" default (see the usage sketch below).
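
A hypothetical usage sketch; the option name `worker_start_method` is an assumption, not confirmed by the summary:

```python
import torch._inductor.config as inductor_config

# Assumed config knob: pick the multiprocessing context used by the
# parallel-compile workers ("fork" remains the default).
inductor_config.worker_start_method = "spawn"
```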

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108850
Approved by: https://github.com/ezyang, https://github.com/zdevito
2023-09-25 21:30:17 +00:00
Yukio Siraichi
a4dec8d306 Add test for ShapeEnv recording fallback. (#109944)
This PR adds a test for the previous PR in this stack: #109904. In summary, it calls
functions decorated with `@record_shapeenv_event` that don't have an explicit `ShapeEnv`
parameter, with arguments that don't hold a `ShapeEnv` instance.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109944
Approved by: https://github.com/ezyang
ghstack dependencies: #109904
2023-09-25 20:59:41 +00:00
Mwiza Kunda
5c4b5baf21 Fix python decomps for OpOverloadPackets and add tests (#107707)
- Extend `test_torch_dispatch_meta_outplace` to test torch ops that do not have an out parameter but whose aten overloads do. Additionally, Python decompositions may register `OpOverloadPacket`s, so decompositions need to be tested to ensure all `OpOverload`s still function for the `Meta` key (e.g. if a Python decomposition is registered for an aten op `aten.foo` with overloads `[default, out]`, the Python function needs to support receiving out arguments).

- Add out-parameter wrappers to Python decomps for aten ops that have out overloads (see the sketch below).
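
A sketch of such an out-parameter wrapper (hypothetical helper, not the PR's exact code):

```python
import torch

def with_out_support(decomp):
    # Wrap a functional Python decomposition so the .out overload also
    # works: compute the result, then copy it into `out` when provided.
    def wrapper(*args, out=None, **kwargs):
        result = decomp(*args, **kwargs)
        if out is None:
            return result
        return out.copy_(result)
    return wrapper

x, y, buf = torch.ones(2), torch.ones(2), torch.empty(2)
with_out_support(torch.add)(x, y, out=buf)  # buf now holds x + y
```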

CC. @ezyang @albanD @lezcano

Fixes #107713

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107707
Approved by: https://github.com/lezcano
2023-09-25 20:53:30 +00:00
PyTorch MergeBot
c1a2f35805 Revert "Disallow skipping dynamo (#109476)"
This reverts commit 7bb1d10c2f.

Reverted https://github.com/pytorch/pytorch/pull/109476 on behalf of https://github.com/atalman due to Failing internal CI ([comment](https://github.com/pytorch/pytorch/pull/109476#issuecomment-1734402581))
2023-09-25 20:20:50 +00:00
fwenguang
c4f2b6dbd2 [profiler] use PyCFunction_Check to check both PyCMethod_Type and PyC… (#110002)
At https://github.com/pytorch/pytorch/blob/main/torch/csrc/autograd/profiler_python.cpp#L1096, when `what` is `PyTrace_C_CALL`, `Py_TYPE(arg)` can only be `PyCFunction_Type` before Python 3.9. But in Python 3.9 or later, `Py_TYPE(arg)` can also be `PyCMethod_Type`.
`PyCMethod_Type` is a subtype of `PyCFunction_Type`; ref:
f2eaa92b0c/Objects/methodobject.c (L372).
So we should use `PyCFunction_Check` to check `arg->ob_type`.

Fixes #109877

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110002
Approved by: https://github.com/ezyang
2023-09-25 20:17:25 +00:00
PyTorch MergeBot
83deaa16ed Revert "[1/N] Cleanup header inclusions in torch_cpu by iwyu (#101178)"
This reverts commit b7a95f4fdb.

Reverted https://github.com/pytorch/pytorch/pull/101178 on behalf of https://github.com/atalman due to Break internal CI ([comment](https://github.com/pytorch/pytorch/pull/101178#issuecomment-1734384645))
2023-09-25 20:05:25 +00:00