Commit Graph

253 Commits

Author SHA1 Message Date
PyTorch MergeBot
b2f09c1859 Revert "[compiled autograd] support custom ops backed by c++ autograd::Function (#120681)"
This reverts commit d27509c384.

Reverted https://github.com/pytorch/pytorch/pull/120681 on behalf of https://github.com/xmfan due to breaking internal builds, see D54707287 ([comment](https://github.com/pytorch/pytorch/pull/120681#issuecomment-1989542344))
2024-03-11 22:18:36 +00:00
Simon Fan
d27509c384 [compiled autograd] support custom ops backed by c++ autograd::Function (#120681)
- Adds support for custom ops backed by c++ custom autograd functions, e.g. fbgemm
- Include files more granularly to avoid namespace pollution and circular imports

limitations:
- Requires users to audit their code and opt in their custom autograd::Function via autograd::Function::is_traceable, plus possibly an additional compiled_args + apply_with_saved implementation. This was the only way I could think of to guarantee soundness.
- Will throw if we can't hash the saved_data, i.e. for any unimplemented type other than list and dict in at::IValue::hash b0cfa96e82/aten/src/ATen/core/ivalue.cpp (L364)
- Can technically silently fail if both the typeid hash and the typeid string name of the custom autograd::Function collide at the same time, and an identical autograd graph containing a different custom autograd::Function with an identical implementation is called. This case seems extremely unlikely, and the only alternative to hash collisions I can think of is compiling with reflection.
- Tensors not saved via save_variables are not lifted; they are specialized on the TensorImpl*'s hash (treated as a memory address). If needed, we can lift them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120681
Approved by: https://github.com/jansel
2024-03-08 20:43:29 +00:00
PyTorch MergeBot
2b1661c7a0 Revert "[compiled autograd] support custom ops backed by c++ autograd::Function (#120681)"
This reverts commit 05c256849b.

Reverted https://github.com/pytorch/pytorch/pull/120681 on behalf of https://github.com/izaitsevfb due to breaking internal builds, see D54617701 ([comment](https://github.com/pytorch/pytorch/pull/120681#issuecomment-1984214079))
2024-03-07 18:53:51 +00:00
Simon Fan
05c256849b [compiled autograd] support custom ops backed by c++ autograd::Function (#120681)
- Adds support for custom ops backed by c++ custom autograd functions, e.g. fbgemm
- Include files more granularly to avoid namespace pollution and circular imports

limitations:
- Requires users to audit their code and opt in their custom autograd::Function via autograd::Function::is_traceable, plus possibly an additional compiled_args + apply_with_saved implementation. This was the only way I could think of to guarantee soundness.
- Will throw if we can't hash the saved_data, i.e. for any unimplemented type other than list and dict in at::IValue::hash b0cfa96e82/aten/src/ATen/core/ivalue.cpp (L364)
- Can technically silently fail if both the typeid hash and the typeid string name of the custom autograd::Function collide at the same time, and an identical autograd graph containing a different custom autograd::Function with an identical implementation is called. This case seems extremely unlikely, and the only alternative to hash collisions I can think of is compiling with reflection.
- Tensors not saved via save_variables are not lifted; they are specialized on the TensorImpl*'s hash (treated as a memory address). If needed, we can lift them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120681
Approved by: https://github.com/jansel
2024-03-06 18:01:56 +00:00
Sheng Fu
31bfa59970 Capture primitive data type arguments for profiling python_function (#120949)
RECORD_FUNCTION in python_function only captures arguments that are Tensors. However, it is very common for users to pass non-tensor arguments to custom ops, for example the sequence length in a GPT attention custom op. My previous PR tried to capture all non-tensor arguments, but that turned out to be very expensive in some cases.

This PR adds support for primitive (or container-of-primitive) arguments in RECORD_FUNCTION.
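
A rough Python-level sketch of the kind of op this helps with: a custom autograd function that takes a primitive argument (the `ScaledSin` class and its `scale` parameter below are illustrative, not from this PR), run under the profiler so RECORD_FUNCTION sees the call.

```python
import torch
from torch.profiler import profile, ProfilerActivity

class ScaledSin(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, scale: float):  # `scale` is a primitive, non-tensor argument
        ctx.save_for_backward(x)
        ctx.scale = scale
        return torch.sin(x) * scale

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * torch.cos(x) * ctx.scale, None

x = torch.randn(8, requires_grad=True)
with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    ScaledSin.apply(x, 2.0).sum().backward()
# With this change, the RECORD_FUNCTION event for the python_function call can
# also capture the scalar 2.0, not just the tensor input.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```
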
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120949
Approved by: https://github.com/soulitzer
2024-03-06 05:09:22 +00:00
PyTorch MergeBot
4903e33e19 Revert "Capture non tensor arguments in record_function (#120017)"
This reverts commit 5c5b71b6ee.

Reverted https://github.com/pytorch/pytorch/pull/120017 on behalf of https://github.com/soulitzer due to regresses perf on autograd Function when using profiler ([comment](https://github.com/pytorch/pytorch/pull/120017#issuecomment-1969883792))
2024-02-28 20:43:33 +00:00
Jason Ansel
01ec8df6d8 [Compiled Autograd] Introduce BackwardState capture (#120382)
This adds support for backwards hooks that are *both*:
1) Interior to the graph; and
2) Dynamically generated (e.g. lambdas)

We do this by creating a BackwardState object that is used to register the hooks in the forward, then populated by dynamo *after* the forward runs.
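
A minimal sketch of the pattern this enables (the function and hook below are illustrative): a dynamically generated lambda hook registered on an intermediate tensor inside a compiled function.

```python
import torch

def fn(x):
    y = x.sin()
    # A dynamically generated hook (a lambda) on an intermediate tensor: it is
    # interior to the graph, so it is registered in the forward via
    # BackwardState and populated by dynamo after the forward runs.
    y.register_hook(lambda grad: grad * 2)
    return y.cos()

x = torch.randn(4, requires_grad=True)
torch.compile(fn)(x).sum().backward()
```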

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120382
Approved by: https://github.com/xmfan
2024-02-28 20:36:47 +00:00
Sheng Fu
5c5b71b6ee Capture non tensor arguments in record_function (#120017)
Summary: RECORD_FUNCTION only captures an argument when it is a Tensor. However, it is very common for users to pass arguments with primitive data types (int, float, index, bool). This diff adds support for non-tensor arguments in RECORD_FUNCTION.

Test Plan:
unit test
    buck test  mode/dev-nosan caffe2/test:profiler -- test_execution_trace_with_pt2 test_execution_trace_alone test_execution_trace_with_kineto test_execution_trace_start_stop test_execution_trace_repeat_in_loop test_execution_trace_no_capture

Differential Revision: D53674768

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120017
Approved by: https://github.com/soulitzer
2024-02-22 09:40:08 +00:00
cyy
8a3c241094 Remove unused header inclusion (#119667)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119667
Approved by: https://github.com/Skylion007
2024-02-12 05:36:25 +00:00
PyTorch MergeBot
dabb90f2a4 Revert "[Exception] [6/N] Remove use of torch::TypeError (#117964)"
This reverts commit 87335fabae.

Reverted https://github.com/pytorch/pytorch/pull/117964 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/117964#issuecomment-1913079096))
2024-01-27 08:44:34 +00:00
cyy
87335fabae [Exception] [6/N] Remove use of torch::TypeError (#117964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117964
Approved by: https://github.com/albanD
2024-01-25 03:35:58 +00:00
Simon Fan
9eb842cbd6 Compiled autograd: Lift autograd functions' backward and provide default key for custom autograd functions (#115573)
This PR adds support for torch.autograd.Function subclasses in compiled autograd. We do this by:
- Creating a uid for all torch.autograd.Function via its metaclass. This uid is used in the compiled autograd key, which is a subset of the cache key to the compiled graph
- "Lifting" the backward/saved_tensors, having them as input arguments in the compiled graph
  - Creating proxies to track the backward's inputs and outputs. Since the backward's outputs (grads) have to match the forward's inputs, we pass the node's `input_info` (forward's input sizes) to build the proxies tracking the backward's outputs.
  - Use a `FakeContext` class as a replacement for the autograd node's context object (`BackwardCFunction`) during tracing; it only supports passing saved_tensors from the forward to the backward
  - Index each backward, to support multiple torch.autograd.Functions in the same graph
  - Special-case `CompiledFunctionBackward`: lifting CompiledFunction would fail 4 tests and requires some skipfiles changes that I'd rather do in a separate PR

Example graph: test_custom_fn_saved_multiple_tensors (eager fw + compiled autograd)
```python
class MyFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, y):
        ctx.save_for_backward(x, y)
        return torch.sin(x), torch.sin(y)

    @staticmethod
    def backward(ctx, gO_x, gO_y):
        (x, y) = ctx.saved_tensors
        return gO_x * torch.cos(x), gO_y * torch.cos(y)
```
The backward is lifted via `getitem_5` and `call_backward`
```python
# Compiled autograd graph
 ===== Compiled autograd graph =====
 <eval_with_key>.0 class CompiledAutograd(torch.nn.Module):
    def forward(self, inputs, sizes, hooks):
        # No stacktrace found for following nodes
        getitem: "f32[]" = inputs[0]
        getitem_1: "f32[10]" = inputs[1]
        getitem_2: "f32[10]" = inputs[2]
        getitem_3: "f32[10]" = inputs[3]
        getitem_4: "f32[10]" = inputs[4];  inputs = None
        expand: "f32[10]" = torch.ops.aten.expand.default(getitem, [10]);  getitem = None
        mul: "f32[10]" = torch.ops.aten.mul.Tensor(expand, getitem_2);  getitem_2 = None
        mul_1: "f32[10]" = torch.ops.aten.mul.Tensor(expand, getitem_1);  expand = getitem_1 = None
        getitem_5 = hooks[0];  hooks = None
        call_backward = torch__dynamo_external_utils_call_backward(getitem_5, (getitem_3, getitem_4), mul_1, mul);  getitem_5 = mul_1 = mul = None
        getitem_6: "f32[10]" = call_backward[0]
        getitem_7: "f32[10]" = call_backward[1];  call_backward = None
        accumulate_grad_ = torch.ops.inductor.accumulate_grad_.default(getitem_4, getitem_7);  getitem_4 = getitem_7 = None
        accumulate_grad__1 = torch.ops.inductor.accumulate_grad_.default(getitem_3, getitem_6);  getitem_3 = getitem_6 = None
        return []
```

It is then inlined by dynamo
```python
# Dynamo graph
 ===== __compiled_fn_0 =====
 <eval_with_key>.1 class GraphModule(torch.nn.Module):
    def forward(self, L_inputs_0_ : torch.Tensor, L_inputs_1_ : torch.Tensor, L_inputs_2_ : torch.Tensor, L_inputs_3_ : torch.Tensor, L_inputs_4_ : torch.Tensor):
        getitem = L_inputs_0_
        getitem_1 = L_inputs_1_
        getitem_2 = L_inputs_2_
        x = L_inputs_3_
        y = L_inputs_4_

        # File: <eval_with_key>.0:10, code: expand = torch.ops.aten.expand.default(getitem, [10]);  getitem = None
        expand = torch.ops.aten.expand.default(getitem, [10]);  getitem = None

        # File: <eval_with_key>.0:11, code: mul = torch.ops.aten.mul.Tensor(expand, getitem_2);  getitem_2 = None
        mul = torch.ops.aten.mul.Tensor(expand, getitem_2);  getitem_2 = None

        # File: <eval_with_key>.0:12, code: mul_1 = torch.ops.aten.mul.Tensor(expand, getitem_1);  expand = getitem_1 = None
        mul_1 = torch.ops.aten.mul.Tensor(expand, getitem_1);  expand = getitem_1 = None

        # File: /data/users/xmfan/core/pytorch/test/inductor/test_compiled_autograd.py:412, code: return gO_x * torch.cos(x), gO_y * torch.cos(y)
        cos = torch.cos(x)
        getitem_6 = mul_1 * cos;  mul_1 = cos = None
        cos_1 = torch.cos(y)
        getitem_7 = mul * cos_1;  mul = cos_1 = None

        # File: <eval_with_key>.0:17, code: accumulate_grad_ = torch.ops.inductor.accumulate_grad_.default(getitem_4, getitem_7);  getitem_4 = getitem_7 = None
        accumulate_grad__default = torch.ops.inductor.accumulate_grad_.default(y, getitem_7);  y = getitem_7 = None

        # File: <eval_with_key>.0:18, code: accumulate_grad__1 = torch.ops.inductor.accumulate_grad_.default(getitem_3, getitem_6);  getitem_3 = getitem_6 = None
        accumulate_grad__default_1 = torch.ops.inductor.accumulate_grad_.default(x, getitem_6);  x = getitem_6 = None
        return ()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115573
Approved by: https://github.com/jansel
2024-01-10 18:01:28 +00:00
cyy
91bbcf8c71 [1/N] replace THPUtils_assert with TORCH_CHECK (#116675)
This PR replaces THPUtils_assert with TORCH_CHECK.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116675
Approved by: https://github.com/albanD
2024-01-04 11:15:33 +00:00
Scott Wolchok
165f4f6ccf [PyTorch] Redirect c10::optional to std::optional (#101995)
We have C++17 now!

I am intentionally dropping the `c10::optional<c10::ArrayRef>` size optimization. It was intended to improve dispatch, but thanks to D34602980 / #70864 we don't use `optional<ArrayRef>` in function arguments anymore anyway.

Differential Revision: [D46079028](https://our.internmc.facebook.com/intern/diff/D46079028/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101995
Approved by: https://github.com/malfet, https://github.com/Skylion007, https://github.com/ezyang
2023-11-30 02:46:41 +00:00
Ying Liu
85b97605ab Enable set sequence nr (#114120)
Summary:
In some cases (especially those involving collective calls) we want to always kick off a collective call first, before going down another path.

For  example:

```
tbe lookup -> a2a ->
                     overarch
dense ------------->
```

if the forward code is written as

```
a2a_out = a2a
dense = dense_net
out = overarch(a2a_out, dense)
out.backward()
```

The current default is to run backward in the opposite order of the forward calls. However, there is no data dependency between a2a and dense, so in reality either of them could run first. We would like the a2a to run first because it provides optimal (on average) overlap.

Changing the seq_nr of a2a_out to something large enough would allow the autograd engine to kick it off first.

Test Plan: Tests incoming

Differential Revision: D51445261

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114120
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-11-21 19:47:28 +00:00
soulitzer
c435b8c10a Fix autograd engine callback error propagation from device thread (#113702)
The existing try-catch doesn't work because it doesn't call err.persist(). This is in contrast to the try-catch for evaluate_function which does work because it calls into python_engine's thread_on_exception which calls persist.

Calling persist on a python_error stashes the PyErr state from the thread-local PyThreadState onto the python_error object, so that when this error object is stored onto the future and passed back to the calling cpu thread, python_engine's execute try-catch can then err.restore() the error state. Finally, the python_engine's execute would re-raise so that this is re-caught by the HANDLE_TH_ERRORS macro.

Fixes https://github.com/pytorch/pytorch/issues/75750

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113702
Approved by: https://github.com/albanD
2023-11-17 20:17:02 +00:00
albanD
5e8be63e99 Allow specifiying inputs as GradientEdge in autograd APIs (#110867)
This can be useful for advanced users (like AOTAutograd) who don't want to keep the corresponding Tensor alive (for memory reasons, for example), or when an in-place op will change the Tensor's grad_fn (but gradients with respect to the original value are needed).

I went with a minimal API change but am open to suggestions.
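
A minimal sketch of the intended use, assuming the edge is obtained via `torch.autograd.graph.get_gradient_edge`: take the gradient with respect to an intermediate without keeping the tensor itself alive.

```python
import torch
from torch.autograd.graph import get_gradient_edge  # helper assumed to accompany this change

x = torch.randn(4, requires_grad=True)
y = x.exp()
out = (y * 3).sum()

# Keep only the lightweight GradientEdge, not the (potentially large) tensor y.
edge = get_gradient_edge(y)
del y

(g,) = torch.autograd.grad(out, inputs=(edge,))
print(g)  # gradient w.r.t. the intermediate y, i.e. a tensor of 3s
```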

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110867
Approved by: https://github.com/soulitzer
2023-10-12 04:08:44 +00:00
soulitzer
73f4c1a406 [reland2] Update custom Function preserve torch function when inputs returned as-is (#110895)

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110895
Approved by: https://github.com/albanD
2023-10-11 21:37:19 +00:00
PyTorch MergeBot
d1c157c598 Revert "[reland] Update custom Function preserve torch function when inputs r… (#110679)"
This reverts commit 563728f61c.

Reverted https://github.com/pytorch/pytorch/pull/110679 on behalf of https://github.com/kit1980 due to The diff has Meta-internal changes, please land from Phabricator ([comment](https://github.com/pytorch/pytorch/pull/110679#issuecomment-1753523182))
2023-10-09 19:09:01 +00:00
soulitzer
563728f61c [reland] Update custom Function preserve torch function when inputs returned as-is (#110679)

reland of https://github.com/pytorch/pytorch/pull/109825#issuecomment-1749803837

Opening this without ghstack to do codev. In our PR, we changed the signature of `_wrap_outputs`. There is some internal code that calls `_wrap_outputs` directly, so we also need to update that callsite.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110679
Approved by: https://github.com/albanD
2023-10-07 00:27:45 +00:00
PyTorch MergeBot
236afe73a2 Revert "Update custom Function preserve torch function when inputs returned as-is (#109825)"
This reverts commit 4e73eee93f.

Reverted https://github.com/pytorch/pytorch/pull/109825 on behalf of https://github.com/PaliC due to causing a plethora of internal failures ([comment](https://github.com/pytorch/pytorch/pull/109825#issuecomment-1749802739))
2023-10-05 23:49:41 +00:00
soulitzer
4e73eee93f Update custom Function preserve torch function when inputs returned as-is (#109825)
Fixes https://github.com/pytorch/pytorch/issues/109805
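
A rough illustration of the scenario (the subclass and function below are made up for this sketch, not the exact repro from the issue): a Tensor subclass relying on `__torch_function__` is returned as-is from a custom Function, and with this fix the subclass type is expected to survive.

```python
import torch

class MyTensor(torch.Tensor):
    pass  # relies on the default __torch_function__ behavior

class Identity(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x  # input returned as-is

    @staticmethod
    def backward(ctx, g):
        return g

t = torch.randn(3, requires_grad=True).as_subclass(MyTensor)
out = Identity.apply(t)
print(type(out))  # expected to be MyTensor after this change
```
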
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109825
Approved by: https://github.com/albanD
2023-10-04 22:45:11 +00:00
cyy
d0ad848aa5 Enable misc clang-tidy checks (#110283)
This PR enables the misc-XX checks in clang-tidy. Meanwhile, I excluded some of them that require a lot of code changes and have no immediate benefit. Some additional fixes and suppressions were also applied.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110283
Approved by: https://github.com/albanD
2023-09-30 10:39:52 +00:00
Pritam Damania
550b0ec3d4 Release GIL around VariableInfo::zeros to avoid deadlocks (#109454)
See https://github.com/pytorch/pytorch/issues/109074#issue-1891369807 and https://github.com/pytorch/pytorch/issues/109074#issuecomment-1718825855
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109454
Approved by: https://github.com/albanD
2023-09-18 22:28:48 +00:00
cyy
a14d30d8d1 [1/N] apply clang-tidy in torch/csrc/autograd (#109032)
This PR begins a new series of patches for enabling clang-tidy checks in torch/csrc/autograd
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109032
Approved by: https://github.com/albanD, https://github.com/Skylion007
2023-09-15 23:28:43 +00:00
cyy
36b8ca4e48 [2/N] apply clang-tidy in torch/csrc/autograd (#109277)
This PR follows the work of PR #109032.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109277
Approved by: https://github.com/albanD
2023-09-15 00:39:12 +00:00
Alex Settle
9ba0558d48 Add sequence_nr to aot_autograd to map forward ops to their corresponding backward ops (#103129)
Fixes #102375

Sequence_nr increments in the forward pass and decrements in the backward pass.  Backward ops with the same sequence_nr as a forward op represent the backward implementation for that op.  The long-term goal is to make this information available to the profiler so users can observe which ops are fused by the Inductor-generated OpenAI Triton kernels.

Added a test for this feature: **test/dynamo/test_aot_autograd.py::AotAutogradFallbackTests::test_aot_sequence_nr**.  The test case uses **aot_export_module()** to create a joint fwd/bwd fx graph.  Then it walks all the nodes in the fx graph using fx_graph.graph.nodes.  The seq_nr of each node is recorded in node.meta.  During the fwd pass the seq_nr increments, and it decrements during the bwd pass.  This allows the user to map forward ops to their corresponding bwd ops, which is useful for performance analysis.

Expected output from the test case:

```
SeqNr|OrigAten|SrcFn
0|aten.convolution.default|l__self___conv1
0|aten.add.Tensor|l__self___bn1
1|aten._native_batch_norm_legit_functional.default|l__self___bn1
2|aten.relu.default|l__self___relu1
3|aten.add.Tensor|add
4|aten.view.default|flatten
5|aten.t.default|l__self___fc1
6|aten.unsqueeze.default|l__self___fc1
7|aten.mm.default|l__self___fc1
8|aten.squeeze.dim|l__self___fc1
9|aten.add.Tensor|l__self___fc1
10|aten.sub.Tensor|l__self___loss_fn
11|aten.abs.default|l__self___loss_fn
12|aten.mean.default|l__self___loss_fn
12|aten.ones_like.default|
12|aten.expand.default|
12|aten.div.Scalar|
11|aten.sgn.default|
11|aten.mul.Tensor|
8|aten.unsqueeze.default|
7|aten.t.default|
7|aten.mm.default|
7|aten.t.default|
7|aten.t.default|
7|aten.mm.default|
6|aten.squeeze.dim|
5|aten.t.default|
4|aten.view.default|
2|aten.threshold_backward.default|
1|aten.native_batch_norm_backward.default|
0|aten.convolution_backward.default|
0|aten.add.Tensor|
```
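
A rough sketch of reading these values from an exported joint graph (the import path and the `seq_nr` meta key are assumed from the description above and may differ across releases):

```python
import torch
from torch._functorch.aot_autograd import aot_export_module  # location assumed

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.fc(x).relu().sum()  # scalar loss for the joint graph

gm, _ = aot_export_module(M(), [torch.randn(2, 4)], trace_joint=True, output_loss_index=0)
for node in gm.graph.nodes:
    # seq_nr increments through the forward ops and decrements through the
    # backward ops, so matching values pair a forward op with its backward.
    print(node.meta.get("seq_nr"), node.target)
```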

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103129
Approved by: https://github.com/soulitzer
2023-08-02 00:52:52 +00:00
Jason Ansel
457d01bcfd [Compiled Autograd] Remove TORCH_API from generated autograd nodes (#105286)
This works around the Windows symbol count issues in #103822.  Unfortunately, removing TORCH_API only works on Windows, but causes build issues on Linux, so we need the `#ifdef`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105286
Approved by: https://github.com/albanD
2023-07-27 02:33:14 +00:00
Jason Ansel
5a114f72bf [Compiled Autograd] Move to torch::dynamo::autograd namespace (#105854)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105854
Approved by: https://github.com/albanD
2023-07-27 00:36:47 +00:00
PyTorch MergeBot
e60af5c8e4 Revert "[Compiled Autograd] Move to torch::dynamo::autograd namespace (#105854)"
This reverts commit 26e3b4020f.

Reverted https://github.com/pytorch/pytorch/pull/105854 on behalf of https://github.com/PaliC due to breaking internal embedded device tests (details shared with author) ([comment](https://github.com/pytorch/pytorch/pull/105854#issuecomment-1650559375))
2023-07-25 21:09:18 +00:00
Jason Ansel
26e3b4020f [Compiled Autograd] Move to torch::dynamo::autograd namespace (#105854)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105854
Approved by: https://github.com/albanD
2023-07-25 01:14:04 +00:00
Jason Ansel
c902b84e0b Compiled autograd (#103822)
This branch:
1) converts the autograd tape into an FX graph
2) caches that conversion using a "shadow" graph
3) compiles and runs the generated FX graph instead of the normal autograd

What works currently:
1) Caching, capture, and initial integration
2) Backwards hooks
3) Inlining AotAutograd generated subgraphs
4) torch.compiling the generated FX graph
5) Auto-detecting dynamic shapes based on changes

Future work
1) Larger scale testing
2) Boxed calling convention, so memory can be freed incrementally
3) Support hooks on SavedTensor
4) Additional testing by running eager autograd tests under compiled_autograd.enable()
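
A minimal usage sketch (the exact `enable()` signature and compiler choice here are illustrative and may have evolved since this landed):

```python
import torch
from torch._dynamo import compiled_autograd

model = torch.nn.Linear(8, 8)
loss = model(torch.randn(2, 8)).sum()

# The autograd tape for this backward is converted to an FX graph, cached via
# the shadow graph, and compiled with the given compiler before running.
with compiled_autograd.enable(torch.compile):
    loss.backward()
```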

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103822
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-07-24 21:12:05 +00:00
soulitzer
c85468a94c [autograd Function] Add private API to not materialize grads for non-differentiable outputs (#104291)
Fixes https://github.com/pytorch/pytorch/issues/104272

This PR adds a new private API `materialize_non_diff_grads` (default True) such that when set to False, grad outputs corresponding to outputs marked non-differentiable receive None instead of a zero-filled tensor. This overrides the setting of `materialize_grads`, i.e. grad outputs corresponding to non-differentiable outputs would still be None even if `materialize_grads=True` (the default).
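
A sketch of the behavior, assuming the private flag is toggled on the ctx under the name from the PR title (the exact spelling and placement of the attribute are assumed here):

```python
import torch

class SplitOut(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Name taken from the PR title; exact spelling of the private flag is assumed.
        ctx.materialize_non_diff_grads = False
        aux = x.detach().clone()
        ctx.mark_non_differentiable(aux)
        return x * 2, aux

    @staticmethod
    def backward(ctx, g, g_aux):
        # With the flag off, g_aux is expected to arrive as None rather than a
        # zero-filled tensor, even though materialize_grads defaults to True.
        return g * 2

x = torch.randn(3, requires_grad=True)
out, aux = SplitOut.apply(x)
out.sum().backward()
```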

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104291
Approved by: https://github.com/albanD
2023-07-08 14:53:54 +00:00
Thiago Crepaldi
3834582327 [ONNX] Add autograd_inlining flag to torch.onnx.export (#104067)
Fixes #88286, Fixes #97160

Repro:

```python
import torch
import io
from torch.utils.checkpoint import checkpoint

class A(torch.nn.Module):
    # A supported module.
    def __init__(self):
        super(A, self).__init__()
        self.l1 = torch.nn.Linear(2, 2)

    def forward(self, x):
        return self.l1(x)

class B(torch.nn.Module):
    # This module is not exportable to ONNX because it
    # uses gradient-checkpointing. However, its two sub-modules
    # are exportable, so ORTModule should be used to compute them.
    def __init__(self):
        super(B, self).__init__()
        self.l1 = torch.nn.Linear(2, 2)
        self.a = A()

    def forward(self, x):
        def custom():
            def custom_forward(x_):
                return self.a(x_)

            return custom_forward

        z = self.l1(checkpoint(custom(), x))
        return z

torch.onnx.export(
    B(),
    (torch.randn(2, 2),),
    io.BytesIO(),
    autograd_inlining=True
)
```

`torch.onnx.export(autograd_inlining=True)` should repro the user error as this is the original execution path.
```bash
Traceback (most recent call last):
  File "repro88286.py", line 36, in <module>
    torch.onnx.export(
  File "<@beartype(torch.onnx.utils.export) at 0x7f0f011faee0>", line 385, in export
  File "/opt/pytorch/torch/onnx/utils.py", line 511, in export
    _export(
  File "/opt/pytorch/torch/onnx/utils.py", line 1576, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "<@beartype(torch.onnx.utils._model_to_graph) at 0x7f0f01187dc0>", line 11, in _model_to_graph
  File "/opt/pytorch/torch/onnx/utils.py", line 1130, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "/opt/pytorch/torch/onnx/utils.py", line 1006, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/opt/pytorch/torch/onnx/utils.py", line 910, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
  File "/opt/pytorch/torch/jit/_trace.py", line 1269, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/opt/pytorch/torch/nn/modules/module.py", line 1502, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/pytorch/torch/nn/modules/module.py", line 1511, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/pytorch/torch/jit/_trace.py", line 128, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/opt/pytorch/torch/jit/_trace.py", line 119, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "/opt/pytorch/torch/nn/modules/module.py", line 1502, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/pytorch/torch/nn/modules/module.py", line 1511, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/pytorch/torch/nn/modules/module.py", line 1492, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "repro88286.py", line 32, in forward
    z = self.l1(checkpoint(custom(), x))
  File "/opt/pytorch/torch/utils/checkpoint.py", line 412, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/opt/pytorch/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
RuntimeError: _Map_base::at
```
By using `autograd_inlining=False`, the export still fails with a different error because autograd inlining is not enabled:

```bash
Traceback (most recent call last):
  File "repro88286.py", line 36, in <module>
    torch.onnx.export(
  File "<@beartype(torch.onnx.utils.export) at 0x7f6088b32ee0>", line 385, in export
  File "/opt/pytorch/torch/onnx/utils.py", line 511, in export
    _export(
  File "/opt/pytorch/torch/onnx/utils.py", line 1615, in _export
    ) = graph._export_onnx(  # type: ignore[attr-defined]
RuntimeError: ONNX export failed: Couldn't export Python operator CheckpointFunction
```
To allow `CheckpointFunction` into the onnx graph, the `operator_export_type=torch.onnx.OperatorExportTypes.ONNX_FALLTHROUGH` flag can be added to `torch.onnx.export`, which would lead to the following ONNX graph:

```bash
Exported graph: graph(%prim::PythonOp_0 : Float(2, 2, strides=[2, 1], requires_grad=0, device=cpu),
      %l1.weight : Float(2, 2, strides=[2, 1], requires_grad=1, device=cpu),
      %l1.bias : Float(2, strides=[1], requires_grad=1, device=cpu)):
  %/PythonOp_output_0 : Float(2, 2, strides=[2, 1], requires_grad=0, device=cpu) = ^CheckpointFunction[inplace=0, module="torch.utils.checkpoint", onnx_name="/PythonOp"](<function B.forward.<locals>.custom.<locals>.custom_forward at 0x7fdf9182f670>, True)(%prim::PythonOp_0), scope: __main__.B:: # /opt/pytorch/torch/autograd/function.py:506:0
  %6 : Float(2, 2, strides=[2, 1], requires_grad=1, device=cpu) = onnx::Gemm[alpha=1., beta=1., transB=1, onnx_name="/l1/Gemm"](%/PythonOp_output_0, %l1.weight, %l1.bias), scope: __main__.B::/torch.nn.modules.linear.Linear::l1 # /opt/pytorch/torch/nn/modules/linear.py:114:0
  return (%6)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104067
Approved by: https://github.com/BowenBao, https://github.com/kit1980
2023-07-05 15:27:36 +00:00
soulitzer
896d997dd0 Remove incorrect THP{Cpp,}Function_traverse PyObject traversals (#102860)
Fixes https://github.com/pytorch/pytorch/issues/102174

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102860
Approved by: https://github.com/albanD
2023-06-02 22:05:25 +00:00
PandaNinjas
f0786ad776 Use %zu instead of %ld when formatting size_t (#101412)
This fixes compiling on systems where `size_t` is an `unsigned int` instead of an `unsigned long int` (32-bit Raspberry Pi OS is one example).
`%ld` expects a `long int`, while `%zu` is the dedicated specifier for a `size_t`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101412
Approved by: https://github.com/albanD
2023-05-16 02:45:55 +00:00
soulitzer
abe96654de [reland][BE][autograd Function] Raise an error if input is returned as-is and saved for forward or backward in setup_context (#98051)

Fixes #ISSUE_NUMBER

Relanding this in a new non-ghstack PR so I can import this to do co-dev
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98051
Approved by: https://github.com/zou3519
2023-04-11 15:42:54 +00:00
PyTorch MergeBot
45acfc8574 Revert "[BE][autograd Function] Raise an error if input is returned as-is and saved for forward or backward in setup_context (#97212)"
This reverts commit 313db584f3.

Reverted https://github.com/pytorch/pytorch/pull/97212 on behalf of https://github.com/soulitzer due to Internally someone relies on _wrap_outputs and we updated its signature
2023-03-30 22:03:07 +00:00
soulitzer
313db584f3 [BE][autograd Function] Raise an error if input is returned as-is and saved for forward or backward in setup_context (#97212)
Fixes https://github.com/pytorch/pytorch/issues/96887

We error out in BOTH cases: when the graph is created and when it is not.

Still bc-breaking, but not as severe because we are limiting it to the case where someone uses setup_context.

This makes setup_context and non-setup_context versions diverge in their behavior
- With the non-setup_context version, saved variables are assumed to have the grad_fn of the inputs.
- But now with the setup_context version, we produce an error for this case.
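
Roughly the pattern that now raises (a sketch, not the exact test from the PR):

```python
import torch

class PassThrough(torch.autograd.Function):
    @staticmethod
    def forward(x):
        return x  # input returned as-is

    @staticmethod
    def setup_context(ctx, inputs, output):
        # Saving a tensor that is an input returned as-is now raises, instead
        # of silently assuming the saved variable has the input's grad_fn.
        ctx.save_for_backward(output)

    @staticmethod
    def backward(ctx, g):
        return g

x = torch.randn(3, requires_grad=True)
y = PassThrough.apply(x)  # expected to error after this change
```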

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97212
Approved by: https://github.com/zou3519
2023-03-29 17:54:00 +00:00
PyTorch MergeBot
2ef6ffdfa1 Revert "[BE][autograd Function] Raise an error if input is returned as-is and saved for forward or backward in setup_context (#97212)"
This reverts commit f3aca45a16.

Reverted https://github.com/pytorch/pytorch/pull/97212 on behalf of https://github.com/soulitzer due to TestAutogradFunctionCUDA.test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_output_mark_dirty_True_cuda leaks
2023-03-28 18:30:51 +00:00
soulitzer
f3aca45a16 [BE][autograd Function] Raise an error if input is returned as-is and saved for forward or backward in setup_context (#97212)
Fixes https://github.com/pytorch/pytorch/issues/96887

We error out in BOTH cases: when the graph is created and when it is not.

Still bc-breaking, but not as severe because we are limiting it to the case where someone uses setup_context.

This makes setup_context and non-setup_context versions diverge in their behavior
- With the non-setup_context version, saved variables are assumed to have the grad_fn of the inputs.
- But now with the setup_context version, we produce an error for this case.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97212
Approved by: https://github.com/zou3519
2023-03-28 03:14:32 +00:00
Aaron Gokaslan
8c8cd9539d Add missing moves to torch autograd (#92772)
Applies std::move in some additional places in torch/csrc/autograd where opportunities were found via static analysis.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92772
Approved by: https://github.com/ezyang
2023-01-24 02:01:52 +00:00
soulitzer
a112814a7f Simplify retains grad hook implementation (#92604)
How the old retains_grad hooks were implemented:
- retains_grad hooks are stored on the autograd_meta, as entries in a vector
- upon registration, a wrapper hook CppFunctionTensorPreHook is created to wrap that vector, and then that wrapper hook is registered to the grad_fn, i.e., by appending it to a vector of retains_grad hooks on the grad_fn
- upon in-place, for the old grad_fn we set the retains_grad hook to nullptr, so that even though the old grad_fn still references the vector, the vector contains a single nullptr. For the new grad_fn, we create a new wrapper hook around the vector (storing the single retains_grad hook) on autograd_meta.

The new retains_grad hook implementation:
- we store std::function by value, and we store it on the grad_fn rather than the autograd_meta
- a single grad_fn can have multiple outputs, so it can potentially hold multiple retains_grad hooks. We use an unordered_map (previously a vector).
- on in-place we remove the hook from the old grad_fn and put it in the new grad_fn (a small implication of this change is that we now need access to both the old grad_fn and the new grad_fn; this isn't a problem)

Other details:
- CppFunctionTensorPreHook took a shared_ptr to vector of std::function. In our new implementation, we add a new wrapper hook CppFunctionSingleTensorPreHook, which takes a single std::function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92604
Approved by: https://github.com/albanD
2023-01-23 20:10:46 +00:00
soulitzer
1bc60c6b31 [reland] Improve hooks ordering behavior (#92559)
This reverts commit e525f433e1.

Original PR:  #85849
Fixes #ISSUE_NUMBER

In addition to reverting the revert, this PR:
- defines the virtual destructor of FunctionPreHook in the header. Why? Presumably the internal build imports the header from somewhere, but does not have function_hooks.cpp (where the virtual destructor was previously defined) in the same compilation unit.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92559
Approved by: https://github.com/albanD
2023-01-19 08:17:32 +00:00
PyTorch MergeBot
e525f433e1 Revert "Improve hooks ordering behavior (#85849)"
This reverts commit 049838f249.

Reverted https://github.com/pytorch/pytorch/pull/85849 on behalf of https://github.com/albanD due to fails internal build
2023-01-18 15:27:22 +00:00
Richard Zou
98b78aa11c [autograd.Function] setup_context always appears on the Function (#92312)
Previously, we used the existence of setup_context to decide whether
forward should take a ctx object.

To be consistent with all the other staticmethods (which always exist on the
autograd.Function), this PR changes it so that whether the user overrides
setup_context determines whether forward takes a ctx object.
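
For reference, a small sketch of the two styles (illustrative function; what matters is whether `setup_context` is overridden):

```python
import torch

# "Traditional" style: setup_context is NOT overridden, so forward takes ctx.
class MulOld(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, c):
        ctx.c = c
        return x * c

    @staticmethod
    def backward(ctx, g):
        return g * ctx.c, None

# New style: overriding setup_context means forward does NOT take ctx.
class MulNew(torch.autograd.Function):
    @staticmethod
    def forward(x, c):
        return x * c

    @staticmethod
    def setup_context(ctx, inputs, output):
        _, c = inputs
        ctx.c = c

    @staticmethod
    def backward(ctx, g):
        return g * ctx.c, None

x = torch.randn(3, requires_grad=True)
MulOld.apply(x, 2.0).sum().backward()
MulNew.apply(x, 2.0).sum().backward()
```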

Fixes https://github.com/pytorch/pytorch/issues/91451

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92312
Approved by: https://github.com/albanD, https://github.com/soulitzer
2023-01-18 02:55:42 +00:00
soulitzer
049838f249 Improve hooks ordering behavior (#85849)
Addresses: https://github.com/pytorch/pytorch/issues/35802

Design doc: https://docs.google.com/document/d/19xSib7FFknRQ5f3ptGFUmiOt3BrgXSUlTQH2xMcZJYg/edit#

### Changes in this PR

#### Implementation
- We now have 3 fields: pre_hooks, retains_grad_hooks, and tensor_pre_hooks, so that we can more precisely define their ordering and when they are executed.
- Since retains_grad uses an entirely new field, we cannot reuse the old retains_grad logic. We refactor retains_grad to call directly into the variable.cpp logic. Other logic in variable.cpp that handles cpp hooks must also be updated.

#### Hooks ordering and execution:
- Defines pre-hooks registered on tensor to run before pre-hooks registered on grad_fn
- Updates pre-hooks registered on tensor to always run, even if they are the inputs= to .grad()
- Post hooks (and pre hooks) can now observe the modifications to gradient by the tensor pre hook

#### Retains grad hooks
- retains grad hooks always execute last, even if there are other tensor pre-hooks registered

#### Unchanged:
- pre_hooks registered to grad_fn aren't expected to execute if they are the inputs= to .grad()

Follow ups:
- simplify retains_grad field to not be a vector, since it always holds a single hook
- potentially merge capture hooks with tensor pre-hooks; this would involve some additional refactoring
- python hooks registered to tensor behavior on in-place is still wrong
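
A small sketch of the resulting ordering described above (hook bodies are illustrative):

```python
import torch

order = []
x = torch.randn(3, requires_grad=True)
y = x * 2

y.retain_grad()                                           # retains_grad hook: runs last
y.register_hook(lambda g: order.append("tensor_pre"))     # tensor pre-hook: runs first
y.grad_fn.register_prehook(lambda g: order.append("grad_fn_pre"))

y.sum().backward()
print(order)  # ["tensor_pre", "grad_fn_pre"]; y.grad is populated after both
```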

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85849
Approved by: https://github.com/albanD
2023-01-17 16:23:21 +00:00
Richard Zou
81cc9bba5e [autograd.Function] Kill the extension feature flag (#92026)
This PR removes the autograd.Function extension feature flag. This was
previously used for development of the functorch <> autograd.Function
interaction.

It's been in master for long enough with the feature flag defaulting to
True, so it's time to remove it.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92026
Approved by: https://github.com/soulitzer
2023-01-17 13:36:42 +00:00
Richard Zou
7aaad0b832 Rename flag that enables/disables _SingleLevelFunction for functorch (#92025)
functorch used to have a switch that enables/disables autograd.Function.
That switch now enables/disables torch.autograd.function._SingleLevelFunction, so
I've renamed it accordingly.

We could just delete the switch because users should not be directly
working with torch.autograd.function._SingleLevelFunction. However,
it was useful for debugging when something went wrong when I was
implementing the autograd.Function <> functorch interaction, so I want
to keep it around as a debugging tool for a while since the code is
already there.

Test Plan:
- updated tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92025
Approved by: https://github.com/soulitzer
2023-01-17 13:36:41 +00:00
PyTorch MergeBot
b3603f8129 Revert "Deduplicate c10 error and PyTorchError hierarchy (#87855)"
This reverts commit 34f2d3e6ae.

Reverted https://github.com/pytorch/pytorch/pull/87855 on behalf of https://github.com/osalpekar due to perf regression in quantization tests
2023-01-06 19:56:35 +00:00