pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Simon Fan	ab04f3aee1	[ca] set autograd graph task state (#143108 ) GraphTask holds metadata needed for a single execution of backward(), it is 1:1 with backward calls, at least for compiled autograd. It is used for certain torch._C global autograd state APIs. In SAC, we use torch._C._current_graph_task_id() as a dict key to store information during unpack hook execution: `a5fb07af27/torch/utils/checkpoint.py (L1128)` If we don't set an active task, it will randomize the key, and will do its logic as if each unpacked tensor was from a different graph task `a5fb07af27/torch/utils/checkpoint.py (L1112-L1115)` The sketchy part of this PR is that in eager autograd, GraphTask is mutated during execution. But inspecting the struct, the mutation seems to only be used to communicate between autograd threads (created when multiple devices are involved) or for deprecated uses. We shouldn't run into the mutation case at all in compiled autograd. Also, only the graph task id is accessible from python hooks. FIXES https://github.com/pytorch/pytorch/issues/142862 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143108 Approved by: https://github.com/jansel, https://github.com/albanD	2024-12-13 03:10:48 +00:00
Richard Barnes	7667235a23	c10::optional -> std::optional (#142514 ) Fixes issues introduced in https://github.com/pytorch/pytorch/pull/141348 and https://github.com/pytorch/pytorch/pull/139578 Pull Request resolved: https://github.com/pytorch/pytorch/pull/142514 Approved by: https://github.com/malfet Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>	2024-12-12 17:23:46 +00:00
cyy	f7b9533c3f	[4/N] Apply bugprone-unchecked-optional-access (#142832 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/142832 Approved by: https://github.com/albanD	2024-12-12 04:33:32 +00:00
cyy	7d98b3dcee	[3/N] Apply bugprone-unchecked-optional-access (#142442 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/142442 Approved by: https://github.com/albanD	2024-12-11 01:39:10 +00:00
cyy	b4c0973b59	[2/N] Apply bugprone-unchecked-optional-access (#141091 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/141091 Approved by: https://github.com/Skylion007, https://github.com/albanD Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>	2024-12-09 19:30:19 +00:00
rzou	215f5d77b5	[functional autograd] Refactor validate_outputs into a functional variant (#141348 ) Today, validate_outputs is stateful (it depends on the autograd graph). This PR refactors it into a stateless form that just depends on InputMetadata. Test Plan: - new unittest Pull Request resolved: https://github.com/pytorch/pytorch/pull/141348 Approved by: https://github.com/soulitzer ghstack dependencies: #141278	2024-12-04 18:06:31 +00:00
Simon Fan	db4e8a1d8a	[ca] expose option to collect sizes as dynamic (#141153 ) This is to address recompiles from eager nodes that saved dynamic activations Pull Request resolved: https://github.com/pytorch/pytorch/pull/141153 Approved by: https://github.com/jansel ghstack dependencies: #141152	2024-11-22 19:26:27 +00:00
PyTorch MergeBot	614e727191	Revert "[Environment Variable][7/N] Use thread-safe getenv functions (#140211 )" This reverts commit `cd942d00dd`. Reverted https://github.com/pytorch/pytorch/pull/140211 on behalf of https://github.com/izaitsevfb due to causes crash internally during test listing ([comment](https://github.com/pytorch/pytorch/pull/140211#issuecomment-2492328790))	2024-11-21 21:05:22 +00:00
cyyever	cd942d00dd	[Environment Variable][7/N] Use thread-safe getenv functions (#140211 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140211 Approved by: https://github.com/ezyang, https://github.com/eqy	2024-11-21 00:25:20 +00:00
PyTorch MergeBot	4a18e26ff5	Revert "[Environment Variable][7/N] Use thread-safe getenv functions (#140211 )" This reverts commit `a3cff4bbd4`. Reverted https://github.com/pytorch/pytorch/pull/140211 on behalf of https://github.com/ezyang due to One of these diffs had incorrect downstream optional handling, we must reaudit all of these diffs ([comment](https://github.com/pytorch/pytorch/pull/140211#issuecomment-2473709246))	2024-11-13 14:05:01 +00:00
cyy	a3cff4bbd4	[Environment Variable][7/N] Use thread-safe getenv functions (#140211 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140211 Approved by: https://github.com/ezyang, https://github.com/eqy	2024-11-12 18:49:51 +00:00
cyy	032135f8a2	[2/N] Turn inline static functions into static (#140068 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/140068 Approved by: https://github.com/ezyang	2024-11-09 03:31:24 +00:00
soulitzer	d6f340f66c	Determine autograd engine ready queue based on InputMetadata instead of InputBuffer (#135633 ) Thanks @awgu for raising this issue and the small repro. From offline discussion with @albanD, in the case where a forward returns multiple outputs with different devices, we'd want to select the ready queue based on the device of the first one. Even though this is somewhat arbitrary, we prefer this over deciding which ready queue to push to based on whichever input buffers we happen to compute last, which can vary depending on more factors and thus be harder to reason about. This is in theory bc-breaking, but it seems unlikely that someone would depend on this behavior. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135633 Approved by: https://github.com/albanD	2024-10-04 23:59:46 +00:00
Jane Xu	7f2d20e687	Run all autograd node post hooks (#134728 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134728 Approved by: https://github.com/albanD, https://github.com/soulitzer	2024-09-06 19:44:28 +00:00
cyy	929d2f8253	[3/N] Fix clang-tidy warnings in torch/csrc/autograd (#133389 ) Follows #133295 Pull Request resolved: https://github.com/pytorch/pytorch/pull/133389 Approved by: https://github.com/Skylion007	2024-08-16 00:57:54 +00:00
cyy	71efbf701d	[3/N] Change #include <c10/util/Optional.h> to #include <optional> (#130300 ) Follows #130236 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130300 Approved by: https://github.com/ezyang	2024-07-09 13:32:57 +00:00
cyy	f4dcf2ae93	[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301 Approved by: https://github.com/ezyang, https://github.com/r-barnes	2024-07-08 07:03:53 +00:00
PyTorch MergeBot	846bb30e13	Revert "[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301 )" This reverts commit `bd72e28314`. Reverted https://github.com/pytorch/pytorch/pull/128301 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it fails XLA build `bd72e28314`. Please rebase your PR before relanding because I think the failure is hidden by an unrelated broken trunk XLA failure from your current base commit ([comment](https://github.com/pytorch/pytorch/pull/128301#issuecomment-2169035822))	2024-06-15 01:58:20 +00:00
cyy	bd72e28314	[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301 Approved by: https://github.com/ezyang	2024-06-14 23:21:01 +00:00
Jeff Daily	ae9a4fa63c	[ROCm] enforce ROCM_VERSION >= 6.0 (#125646 ) Remove any code relying on ROCM_VERSION < 6.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125646 Approved by: https://github.com/albanD, https://github.com/eqy	2024-05-12 18:01:28 +00:00
Richard Barnes	98e5238ad8	[codemod][lowrisk] Remove unused exception parameter from caffe2/caffe2/image/image_input_op.h (#123056 ) Summary: `-Wunused-exception-parameter` has identified an unused exception parameter. This diff removes it. This: ``` try { ... } catch (exception& e) { // no use of e } ``` should instead be written as ``` } catch (exception&) { ``` If the code compiles, this is safe to land. Test Plan: Sandcastle Reviewed By: palmje Differential Revision: D55548497 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123056 Approved by: https://github.com/Skylion007	2024-04-04 17:24:43 +00:00
Valentin Andrei	8bb3e0b643	[pytorch] Name the main and autograd threads for better debugging (#121170 ) The main thread and the autograd one are latency critical threads. They launch CPU/GPU/Accelerator kernels and if for some reason they get preempted, the rank can become a straggler in a distributed training application. By naming these threads we can debug performance issues that impact the latency sensitive threads. I used Kineto traces to verify if the thread names were propagated: ![Screenshot 2024-03-04 at 3 07 43 PM](https://github.com/pytorch/pytorch/assets/23515689/68b4a09c-b8e5-4f14-a5c0-6593f866c03f) Also:
```
nvidia-smi
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A   3065920      C   ...me#python#py_version_3_10     1968MiB |
|    1   N/A  N/A   3065926      C   ...me#python#py_version_3_10     1978MiB |
|    2   N/A  N/A   3065930      C   ...me#python#py_version_3_10     2084MiB |
|    3   N/A  N/A   3065936      C   ...me#python#py_version_3_10     2016MiB |
|    4   N/A  N/A   3065939      C   ...me#python#py_version_3_10     1998MiB |
|    5   N/A  N/A   3065943      C   ...me#python#py_version_3_10     2070MiB |
|    6   N/A  N/A   3065948      C   ...me#python#py_version_3_10     2026MiB |
|    7   N/A  N/A   3065952      C   ...me#python#py_version_3_10     2070MiB |
+-----------------------------------------------------------------------------+
[me@myhost ~]$ ps -T -p 3065920
    PID    SPID TTY          TIME CMD
3065920 3065920 pts/14   00:01:04 pt_main_thread
...
3065920 3092181 pts/14   00:00:40 pt_autograd_d0
3065920 3092182 pts/14   00:00:00 pt_autograd_d1
3065920 3092183 pts/14   00:00:00 pt_autograd_d2
3065920 3092184 pts/14   00:00:00 pt_autograd_d3
3065920 3092185 pts/14   00:00:00 pt_autograd_d4
3065920 3092186 pts/14   00:00:00 pt_autograd_d5
3065920 3092187 pts/14   00:00:00 pt_autograd_d6
3065920 3092188 pts/14   00:00:00 pt_autograd_d7
...
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121170 Approved by: https://github.com/albanD	2024-03-05 22:15:39 +00:00
Rohan Potdar	f67c77c497	Update engine.cpp (#120773 ) Minor comment fix; `backward` and `grad` are flipped here. See https://pytorch.org/docs/stable/_modules/torch/autograd.html#backward Pull Request resolved: https://github.com/pytorch/pytorch/pull/120773 Approved by: https://github.com/albanD, https://github.com/janeyx99, https://github.com/soulitzer	2024-02-28 18:23:35 +00:00
albanD	ca777fbbb7	Add Accelerator device and shell hooks (#119329 ) This adds a concept of Accelerator that points to one of our devices. See DeviceAccelerator.h in this PR for details https://github.com/pytorch/pytorch/pull/119329/files#diff-83cc748bed5df1a453c272cc5ecc7e572d4eb694c5125384d8fbd17a0b5f50c8 It also adds scaffolding for shared C++ API to allow generic feature implementation. This PR in particular updates the autograd engine to use this generic API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119329 Approved by: https://github.com/ezyang, https://github.com/huydhn	2024-02-13 23:15:24 +00:00
PyTorch MergeBot	214f06ae3a	Revert "Add Accelerator device and shell hooks (#119329 )" This reverts commit `4b9568a360`. Reverted https://github.com/pytorch/pytorch/pull/119329 on behalf of https://github.com/huydhn due to Breaks internal build and requires OSS file update to fix it ([comment](https://github.com/pytorch/pytorch/pull/119329#issuecomment-1940278598))	2024-02-13 02:23:45 +00:00
Edward Z. Yang	482345d747	Refactor out shape test into InputMetadata::maybe_reduce (#119559 ) I'm going to gut this function shortly, and having it all on InputMetadata is convenient for this purpose. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/119559 Approved by: https://github.com/soulitzer	2024-02-12 19:27:48 +00:00
albanD	4b9568a360	Add Accelerator device and shell hooks (#119329 ) This adds a concept of Accelerator that points to one of our devices. See DeviceAccelerator.h in this PR for details https://github.com/pytorch/pytorch/pull/119329/files#diff-83cc748bed5df1a453c272cc5ecc7e572d4eb694c5125384d8fbd17a0b5f50c8 It also adds scaffolding for shared C++ API to allow generic feature implementation. This PR in particular updates the autograd engine to use this generic API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119329 Approved by: https://github.com/ezyang	2024-02-09 18:54:28 +00:00
albanD	a6e16fe202	Fix global in header warning (#119380 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119380 Approved by: https://github.com/janeyx99	2024-02-07 20:35:21 +00:00
garfield1997	fbf92500fb	enable privateuseone to perform streaming backward (#117111 ) Fixes #116957 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117111 Approved by: https://github.com/soulitzer	2024-01-30 15:13:31 +00:00
cyy	2f17a21b2b	[Reland] [13/N] Enable clang-tidy on headers of torch/csrc (#117088 ) Reland of #116560 and fixes the issue reported by #116695 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117088 Approved by: https://github.com/albanD	2024-01-10 23:58:04 +00:00
PyTorch MergeBot	791db94c62	Revert "[13/N] Enable clang-tidy on headers of torch/csrc (#116560 )" This reverts commit `b0629cdd67`. Reverted https://github.com/pytorch/pytorch/pull/116560 on behalf of https://github.com/izaitsevfb due to Reverting, as it depends on #116353, which has to be reverted ([comment](https://github.com/pytorch/pytorch/pull/116560#issuecomment-1876033363))	2024-01-03 22:08:40 +00:00
cyy	b0629cdd67	[13/N] Enable clang-tidy on headers of torch/csrc (#116560 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116560 Approved by: https://github.com/Skylion007, https://github.com/albanD	2024-01-02 05:33:04 +00:00
cyy	bb2a1e9941	Enable readability-redundant-smartptr-get in clang-tidy (#116381 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116381 Approved by: https://github.com/Skylion007	2023-12-26 06:05:15 +00:00
mantaionut	d521857411	Terminate handler (#101332 ) Fixes #50051. This PR is based on #50320 and I address the last feedback. On Windows it is enabled by default. Can be enabled or disabled via USE_CUSTOM_TERMINATE env variable. This PR adds support for overriding the terminate handler in order to log uncaught exceptions in the threads. If an exception is thrown and not caught, it will print <Unhandled exception caught in c10/util/AbortHandler.h> The point of doing this is that in issue #50051, exceptions were thrown but not logged. With this logging system it will be easier to debug it in the future. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101332 Approved by: https://github.com/albanD, https://github.com/malfet	2023-12-12 17:55:27 +00:00
Scott Wolchok	165f4f6ccf	[PyTorch] Redirect c10::optional to std::optional (#101995 ) We have C++17 now! I am intentionally dropping the `c10::optional<c10::ArrayRef>` size optimization. It was intended to improve dispatch, but thanks to D34602980 / #70864 we don't use `optional<ArrayRef>` in function arguments anymore anyway. Differential Revision: [D46079028](https://our.internmc.facebook.com/intern/diff/D46079028/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/101995 Approved by: https://github.com/malfet, https://github.com/Skylion007, https://github.com/ezyang	2023-11-30 02:46:41 +00:00
soulitzer	c435b8c10a	Fix autograd engine callback error propagation from device thread (#113702 ) The existing try-catch doesn't work because it doesn't call err.persist(). This is in contrast to the try-catch for evaluate_function which does work because it calls into python_engine's thread_on_exception which calls persist. Calling persist on a python_error stashes the PyErr state from the thread-local PyThreadState onto the python_error object, so that when this error object is stored onto the future and passed back to the calling cpu thread, python_engine's execute try-catch can then err.restore() the error state. Finally, the python_engine's execute would re-raise so that this is re-caught by the HANDLE_TH_ERRORS macro. Fixes https://github.com/pytorch/pytorch/issues/75750 Pull Request resolved: https://github.com/pytorch/pytorch/pull/113702 Approved by: https://github.com/albanD	2023-11-17 20:17:02 +00:00
Michael Voznesensky	7d98549ca9	retain_graph=True in compiled_autograd (#110367 ) Adds support for retain_graph=True - known as keep_graph_ internally in the autograd engine. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110367 Approved by: https://github.com/jansel	2023-10-06 08:22:10 +00:00
cyy	e9e93c5350	[Reland] Move torch::make_unique to std::make_unique (#109780 ) We can first try to move torch::make_unique to std::make_unique despite reverting of #108866 . Pull Request resolved: https://github.com/pytorch/pytorch/pull/109780 Approved by: https://github.com/ezyang	2023-09-21 18:30:21 +00:00
PyTorch MergeBot	525e4f42d0	Revert "replace torch::make_unique with std::make_unique (#108866 )" This reverts commit `03e35efbf7`. Reverted https://github.com/pytorch/pytorch/pull/108866 on behalf of https://github.com/clee2000 due to Sorry but I found more usages of `torch::make_unique` internally, I can go change all of these, but I'd prefer if that gets done before this gets merged ([comment](https://github.com/pytorch/pytorch/pull/108866#issuecomment-1722577925))	2023-09-17 21:57:30 +00:00
cyy	75b954b715	[4/N] Enable clang-tidy in torch/csrc/autograd (#109455 ) The PR enables clang-tidy checks in torch/csrc/autograd. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109455 Approved by: https://github.com/Skylion007	2023-09-17 17:11:50 +00:00
cyy	a14d30d8d1	[1/N] apply clang-tidy in torch/csrc/autograd (#109032 ) This PR begins a new series of patches for enabling clang-tidy checks in torch/csrc/autograd Pull Request resolved: https://github.com/pytorch/pytorch/pull/109032 Approved by: https://github.com/albanD, https://github.com/Skylion007	2023-09-15 23:28:43 +00:00
cyy	36b8ca4e48	[2/N] apply clang-tidy in torch/csrc/autograd (#109277 ) This PR follows the work of PR #109032. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109277 Approved by: https://github.com/albanD	2023-09-15 00:39:12 +00:00
cyy	03e35efbf7	replace torch::make_unique with std::make_unique (#108866 ) It should be safe to remove the old torch::make_unique functions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108866 Approved by: https://github.com/albanD	2023-09-14 20:52:26 +00:00
Jason Ansel	a01e795a6d	[Compiled Autograd] Fix bug with multithreading check (#106621 ) Fixes #106555 There was a bug where the multithreading check would fire because of the `compiled_autograd.disable()` calls in AotAutograd, even though compiled autograd was already disabled, so that call was doing nothing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106621 Approved by: https://github.com/yanboliang	2023-08-04 20:49:21 +00:00
Jason Ansel	ac6d8fb16e	[Compiled Autograd] Add eager autograd tests (#105808 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105808 Approved by: https://github.com/albanD, https://github.com/soulitzer	2023-07-28 15:59:35 +00:00
Jason Ansel	c902b84e0b	Compiled autograd (#103822 ) This branch: 1) converts the autograd tape into an FX graph; 2) caches that conversion using a "shadow" graph; 3) compiles and runs the generated FX graph instead of the normal autograd. What works currently: 1) Caching, capture, and initial integration; 2) Backwards hooks; 3) Inlining AotAutograd generated subgraphs; 4) torch.compiling the generated FX graph; 5) Auto-detecting dynamic shapes based on changes. Future work: 1) Larger scale testing; 2) Boxed calling convention, so memory can be freed incrementally; 3) Support hooks on SavedTensor; 4) Additional testing by running eager autograd tests under compiled_autograd.enable(). Pull Request resolved: https://github.com/pytorch/pytorch/pull/103822 Approved by: https://github.com/ezyang, https://github.com/albanD	2023-07-24 21:12:05 +00:00
soulitzer	39477f7ca9	Remove unnecessary seen check in get_current_graph_task_execution_order (#105487 ) https://github.com/pytorch/pytorch/pull/105353#discussion_r1266977015 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105487 Approved by: https://github.com/albanD, https://github.com/jansel	2023-07-18 23:49:45 +00:00
soulitzer	cf404a8ce4	Fix get_current_graph_task_execution_order accumulate_grads ordering (#105353 ) Fixes https://github.com/pytorch/pytorch/issues/105293 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105353 Approved by: https://github.com/albanD	2023-07-18 00:59:25 +00:00
Masaki Kozuki	07c60d11b3	replace `AT_ERROR(...)` with `TORCH_CHECK(false, ...)` (#104534 ) Merely cosmetic for `AT_ERROR` I found by chance, following `e9d2d74f0a/c10/util/Exception.h (L622)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/104534 Approved by: https://github.com/soulitzer	2023-07-03 22:43:19 +00:00
Aidyn-A	69eef5a4be	[CUDA12] set_device change (#94864 ) This PR adds a workaround for the CUDA 12 [`cudaSetDevice` change](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html#group__CUDART__DEVICE_1g159587909ffa0791bbe4b40187a4c6bb), which always creates a primary context on the target device. So operations like this: ```Python import torch; x = torch.randn(1, device="cuda:1") ``` would always create a primary context on device `cuda:1`, because a tensor is created on it, and on device `cuda:0`, because the destructor of the CUDA Device guard calls `cudaSetDevice(0)`. After this PR the CUDA Device guard will not call `cudaSetDevice(0)` if a primary context does not exist on `cuda:0`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94864 Approved by: https://github.com/malfet, https://github.com/atalman, https://github.com/ezyang	2023-04-10 17:31:12 +00:00
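
Several commits in this history ([2/N] through [4/N] Apply bugprone-unchecked-optional-access, and the c10::optional -> std::optional changes) concern how optional values are read. As a rough illustration only, and not code taken from any of these PRs, the sketch below shows the kind of guarded access that clang-tidy check is meant to encourage: test the optional before dereferencing it. The names `parse_device_index` and `device_index_or` are hypothetical and exist only for this example.

```cpp
#include <optional>
#include <string>

// Hypothetical helper (not a PyTorch API): parses a short, non-negative
// decimal string and returns std::nullopt for anything else.
std::optional<int> parse_device_index(const std::string& text) {
  if (text.empty() || text.size() > 4) {
    return std::nullopt;
  }
  int value = 0;
  for (char c : text) {
    if (c < '0' || c > '9') {
      return std::nullopt;
    }
    value = value * 10 + (c - '0');
  }
  return value;
}

// Guarded access: the optional is dereferenced only after has_value() has
// been checked, so an empty optional is never read.
int device_index_or(const std::string& text, int fallback) {
  const std::optional<int> index = parse_device_index(text);
  if (index.has_value()) {
    return *index;
  }
  return fallback;
}
```

With these definitions, `device_index_or("3", -1)` yields 3, while an empty or non-numeric string yields the fallback value.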

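The error-propagation fix in #113702 describes a stash-then-restore pattern: the error is captured where it occurs, carried on an object, and re-raised later by the caller. The sketch below is only a standard-C++ analogy built on `std::exception_ptr`, not the `python_error` implementation (which stashes Python error state). `StashedError`, `run_failing_callback`, and `report` are hypothetical names, and `persist`/`restore` are chosen merely to mirror the wording of the commit message.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical error holder for illustration only.
struct StashedError {
  std::exception_ptr saved;

  // Must be called from inside a catch block: captures the exception that is
  // currently being handled so it can outlive that block.
  void persist() { saved = std::current_exception(); }

  // Re-raises the captured exception, if there is one.
  void restore() const {
    if (saved) {
      std::rethrow_exception(saved);
    }
  }
};

// Simulates a callback that fails; the error is stashed instead of lost.
StashedError run_failing_callback() {
  StashedError err;
  try {
    throw std::runtime_error("callback failed");
  } catch (...) {
    err.persist();
  }
  return err;
}

// The caller restores the error and handles it in its own try-catch.
std::string report(const StashedError& err) {
  try {
    err.restore();
  } catch (const std::exception& e) {
    return e.what();
  }
  return "no error";
}
```

Without the `persist()` call in the catch block, the holder would stay empty and `report` would return "no error", which loosely mirrors the silent failure the commit message describes.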

271 Commits