pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
cyy	032135f8a2	[2/N] Turn inline static functions into static (#140068 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/140068 Approved by: https://github.com/ezyang	2024-11-09 03:31:24 +00:00
soulitzer	d6f340f66c	Determine autograd engine ready queue based on InputMetadata instead of InputBuffer (#135633 ) Thanks @awgu for raising this issue and the small repro From offline discussion with @albanD, in the case where a forward returns multiple outputs with different devices, we'd want to select the ready queue based on the device of the first one. Even though this is somewhat arbitrary, we prefer this over deciding which ready queue to push based on whichever input buffer's we happen to compute last, which can vary depending on more factors and thus be harder to reason about. This is in theory bc-breaking, but it seems unlikely that someone would depend on this behavior. Pull Request resolved: https://github.com/pytorch/pytorch/pull/135633 Approved by: https://github.com/albanD	2024-10-04 23:59:46 +00:00
Jane Xu	7f2d20e687	Run all autograd node post hooks (#134728 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134728 Approved by: https://github.com/albanD, https://github.com/soulitzer	2024-09-06 19:44:28 +00:00
cyy	929d2f8253	[3/N] Fix clang-tidy warnings in torch/csrc/autograd (#133389 ) Follows #133295 Pull Request resolved: https://github.com/pytorch/pytorch/pull/133389 Approved by: https://github.com/Skylion007	2024-08-16 00:57:54 +00:00
cyy	71efbf701d	[3/N] Change #include <c10/util/Optional.h> to #include <optional> (#130300 ) Follows #130236 Pull Request resolved: https://github.com/pytorch/pytorch/pull/130300 Approved by: https://github.com/ezyang	2024-07-09 13:32:57 +00:00
cyy	f4dcf2ae93	[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301 Approved by: https://github.com/ezyang, https://github.com/r-barnes	2024-07-08 07:03:53 +00:00
PyTorch MergeBot	846bb30e13	Revert "[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301 )" This reverts commit `bd72e28314`. Reverted https://github.com/pytorch/pytorch/pull/128301 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it fails XLA build `bd72e28314`. Please rebase your PR before relanding because I think the failure is hidden by an unrelated broken trunk XLA failure from your current base commit ([comment](https://github.com/pytorch/pytorch/pull/128301#issuecomment-2169035822))	2024-06-15 01:58:20 +00:00
cyy	bd72e28314	[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301 Approved by: https://github.com/ezyang	2024-06-14 23:21:01 +00:00
Jeff Daily	ae9a4fa63c	[ROCm] enforce ROCM_VERSION >= 6.0 (#125646 ) Remove any code relying on ROCM_VERSION < 6.0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125646 Approved by: https://github.com/albanD, https://github.com/eqy	2024-05-12 18:01:28 +00:00
Richard Barnes	98e5238ad8	[codemod][lowrisk] Remove unused exception parameter from caffe2/caffe2/image/image_input_op.h (#123056 ) Summary: `-Wunused-exception-parameter` has identified an unused exception parameter. This diff removes it. This: ``` try { ... } catch (exception& e) { // no use of e } ``` should instead be written as ``` } catch (exception&) { ``` If the code compiles, this is safe to land. Test Plan: Sandcastle Reviewed By: palmje Differential Revision: D55548497 Pull Request resolved: https://github.com/pytorch/pytorch/pull/123056 Approved by: https://github.com/Skylion007	2024-04-04 17:24:43 +00:00
Valentin Andrei	8bb3e0b643	[pytorch] Name the main and autograd threads for better debugging (#121170 ) The main thread and the autograd one are latency critical threads. They launch CPU/GPU/Accelerator kernels and if for some reason they get preempted, the rank can become a straggler in a distributed training application. By naming these threads we can debug performance issues that impact the latency sensitive threads. I used Kineto traces to verify if the thread names were propagated: <img width="851" alt="Screenshot 2024-03-04 at 3 07 43 PM" src="https://github.com/pytorch/pytorch/assets/23515689/68b4a09c-b8e5-4f14-a5c0-6593f866c03f"> Also: ``` nvidia-smi +-----------------------------------------------------------------------------+ \| Processes: \| \| GPU GI CI PID Type Process name GPU Memory \| \| ID ID Usage \| \|=============================================================================\| \| 0 N/A N/A 3065920 C ...me#python#py_version_3_10 1968MiB \| \| 1 N/A N/A 3065926 C ...me#python#py_version_3_10 1978MiB \| \| 2 N/A N/A 3065930 C ...me#python#py_version_3_10 2084MiB \| \| 3 N/A N/A 3065936 C ...me#python#py_version_3_10 2016MiB \| \| 4 N/A N/A 3065939 C ...me#python#py_version_3_10 1998MiB \| \| 5 N/A N/A 3065943 C ...me#python#py_version_3_10 2070MiB \| \| 6 N/A N/A 3065948 C ...me#python#py_version_3_10 2026MiB \| \| 7 N/A N/A 3065952 C ...me#python#py_version_3_10 2070MiB \| +-----------------------------------------------------------------------------+ [me@myhost ~]$ ps -T -p 3065920 PID SPID TTY TIME CMD 3065920 3065920 pts/14 00:01:04 pt_main_thread ... 3065920 3092181 pts/14 00:00:40 pt_autograd_d0 3065920 3092182 pts/14 00:00:00 pt_autograd_d1 3065920 3092183 pts/14 00:00:00 pt_autograd_d2 3065920 3092184 pts/14 00:00:00 pt_autograd_d3 3065920 3092185 pts/14 00:00:00 pt_autograd_d4 3065920 3092186 pts/14 00:00:00 pt_autograd_d5 3065920 3092187 pts/14 00:00:00 pt_autograd_d6 3065920 3092188 pts/14 00:00:00 pt_autograd_d7 ... ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/121170 Approved by: https://github.com/albanD	2024-03-05 22:15:39 +00:00
Rohan Potdar	f67c77c497	Update engine.cpp (#120773 ) Minor comment fix; `backward` and `grad` are flipped here. See https://pytorch.org/docs/stable/_modules/torch/autograd.html#backward Pull Request resolved: https://github.com/pytorch/pytorch/pull/120773 Approved by: https://github.com/albanD, https://github.com/janeyx99, https://github.com/soulitzer	2024-02-28 18:23:35 +00:00
albanD	ca777fbbb7	Add Accelerator device and shell hooks (#119329 ) This adds a concept of Accelerator that points to one of our devices. See DeviceAccelerator.h in this PR for details https://github.com/pytorch/pytorch/pull/119329/files#diff-83cc748bed5df1a453c272cc5ecc7e572d4eb694c5125384d8fbd17a0b5f50c8 It also adds scaffolding for shared C++ API to allow generic feature implementation. This PR in particular updates the autograd engine to use this generic API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119329 Approved by: https://github.com/ezyang, https://github.com/huydhn	2024-02-13 23:15:24 +00:00
PyTorch MergeBot	214f06ae3a	Revert "Add Accelerator device and shell hooks (#119329 )" This reverts commit `4b9568a360`. Reverted https://github.com/pytorch/pytorch/pull/119329 on behalf of https://github.com/huydhn due to Breaks internal build and requires OSS file update to fix it ([comment](https://github.com/pytorch/pytorch/pull/119329#issuecomment-1940278598))	2024-02-13 02:23:45 +00:00
Edward Z. Yang	482345d747	Refactor out shape test into InputMetadata::maybe_reduce (#119559 ) I'm going to gut this function shortly, and having it all on InputMetadata is convenient for this purpose. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/119559 Approved by: https://github.com/soulitzer	2024-02-12 19:27:48 +00:00
albanD	4b9568a360	Add Accelerator device and shell hooks (#119329 ) This adds a concept of Accelerator that points to one of our devices. See DeviceAccelerator.h in this PR for details https://github.com/pytorch/pytorch/pull/119329/files#diff-83cc748bed5df1a453c272cc5ecc7e572d4eb694c5125384d8fbd17a0b5f50c8 It also adds scaffolding for shared C++ API to allow generic feature implementation. This PR in particular updates the autograd engine to use this generic API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/119329 Approved by: https://github.com/ezyang	2024-02-09 18:54:28 +00:00
albanD	a6e16fe202	Fix global in header warning (#119380 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/119380 Approved by: https://github.com/janeyx99	2024-02-07 20:35:21 +00:00
garfield1997	fbf92500fb	enable privateuseone to perform streaming backward (#117111 ) Fixes #116957 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117111 Approved by: https://github.com/soulitzer	2024-01-30 15:13:31 +00:00
cyy	2f17a21b2b	[Reland] [13/N] Enable clang-tidy on headers of torch/csrc (#117088 ) Reland of #116560 and fixes the issued reported by #116695 Pull Request resolved: https://github.com/pytorch/pytorch/pull/117088 Approved by: https://github.com/albanD	2024-01-10 23:58:04 +00:00
PyTorch MergeBot	791db94c62	Revert "[13/N] Enable clang-tidy on headers of torch/csrc (#116560 )" This reverts commit `b0629cdd67`. Reverted https://github.com/pytorch/pytorch/pull/116560 on behalf of https://github.com/izaitsevfb due to Reverting, as it depends on #116353, which has to be reverted ([comment](https://github.com/pytorch/pytorch/pull/116560#issuecomment-1876033363))	2024-01-03 22:08:40 +00:00
cyy	b0629cdd67	[13/N] Enable clang-tidy on headers of torch/csrc (#116560 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116560 Approved by: https://github.com/Skylion007, https://github.com/albanD	2024-01-02 05:33:04 +00:00
cyy	bb2a1e9941	Enable readability-redundant-smartptr-get in clang-tidy (#116381 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/116381 Approved by: https://github.com/Skylion007	2023-12-26 06:05:15 +00:00
mantaionut	d521857411	Terminate handler (#101332 ) Fixes #50051. This PR is based on #50320 and I address the last feedback. On Windows it is enabled by default. Can be enabled or disabled via USE_CUSTOM_TERMINATE env variable. This PR adds support for overriding the terminate handler in order to log uncaught exceptions in the threads. If an exception is thrown and not caught, it will print <Unhandled exception caught in c10/util/AbortHandler.h> The point of doing this is that in issue #50051, exceptions were thrown but not logged. With this logging system it will be easier to debug it in the future. Pull Request resolved: https://github.com/pytorch/pytorch/pull/101332 Approved by: https://github.com/albanD, https://github.com/malfet	2023-12-12 17:55:27 +00:00
Scott Wolchok	165f4f6ccf	[PyTorch] Redirect c10::optional to std::optional (#101995 ) We have C++17 now! I am intentionally dropping the `c10::optional<c10::ArrayRef>` size optimization. It was intended to improve dispatch, but thanks to D34602980 / #70864 we don't use `optional<ArrayRef>` in function arguments anymore anyway. Differential Revision: [D46079028](https://our.internmc.facebook.com/intern/diff/D46079028/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/101995 Approved by: https://github.com/malfet, https://github.com/Skylion007, https://github.com/ezyang	2023-11-30 02:46:41 +00:00
soulitzer	c435b8c10a	Fix autograd engine callback error propagation from device thread (#113702 ) The existing try-catch doesn't work because it doesn't call err.persist(). This is in contrast to the try-catch for evaluate_function which does work because it calls into python_engine's thread_on_exception which calls persist. Calling persist on a python_error stashes the PyErr state from the thread-local PyThreadState onto the python_error object, so that when this error object is stored onto the future and passed back to the calling cpu thread, python_engine's execute try-catch can then err.restore() the error state. Finally, the python_engine's execute would re-raise so that this is re-caught by the HANDLE_TH_ERRORS macro. Fixes https://github.com/pytorch/pytorch/issues/75750 Pull Request resolved: https://github.com/pytorch/pytorch/pull/113702 Approved by: https://github.com/albanD	2023-11-17 20:17:02 +00:00
Michael Voznesensky	7d98549ca9	retain_graph=True in compiled_autograd (#110367 ) Adds support for retain_graph=True - known as keep_graph_ internally in the autograd engine. Pull Request resolved: https://github.com/pytorch/pytorch/pull/110367 Approved by: https://github.com/jansel	2023-10-06 08:22:10 +00:00
cyy	e9e93c5350	[Reland] Move torch::make_unique to std::make_unique (#109780 ) We can first try to move torch::make_unique to std::make_unique despite reverting of #108866 . Pull Request resolved: https://github.com/pytorch/pytorch/pull/109780 Approved by: https://github.com/ezyang	2023-09-21 18:30:21 +00:00
PyTorch MergeBot	525e4f42d0	Revert "replace torch::make_unique with std::make_unique (#108866 )" This reverts commit `03e35efbf7`. Reverted https://github.com/pytorch/pytorch/pull/108866 on behalf of https://github.com/clee2000 due to Sorry but I found more usages of `torch::make_unique` internally, I can go change all of these, but I'd prefer if that gets done before this gets merged ([comment](https://github.com/pytorch/pytorch/pull/108866#issuecomment-1722577925))	2023-09-17 21:57:30 +00:00
cyy	75b954b715	[4/N] Enable clang-tidy in torch/csrc/autograd (#109455 ) The PR enables clang-tidy checks in torch/csrc/autograd. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109455 Approved by: https://github.com/Skylion007	2023-09-17 17:11:50 +00:00
cyy	a14d30d8d1	[1/N] apply clang-tidy in torch/csrc/autograd (#109032 ) This PR begins a new series of patches for enabling clang-tidy checks in torch/csrc/augograd Pull Request resolved: https://github.com/pytorch/pytorch/pull/109032 Approved by: https://github.com/albanD, https://github.com/Skylion007	2023-09-15 23:28:43 +00:00
cyy	36b8ca4e48	[2/N] apply clang-tidy in torch/csrc/autograd (#109277 ) This PR follows the work of PR #109032. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109277 Approved by: https://github.com/albanD	2023-09-15 00:39:12 +00:00
cyy	03e35efbf7	replace torch::make_unique with std::make_unique (#108866 ) It should be safe to remove the old torch::make_unique functions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108866 Approved by: https://github.com/albanD	2023-09-14 20:52:26 +00:00
Jason Ansel	a01e795a6d	[Compiled Autograd] Fix bug with multithreading check (#106621 ) Fixes #106555 There was bug where the multithreading check would fire because of the `compiled_autograd.disable()` calls in AotAutograd, even though compiled autograd was already disabled, so that call was doing nothing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106621 Approved by: https://github.com/yanboliang	2023-08-04 20:49:21 +00:00
Jason Ansel	ac6d8fb16e	[Compiled Autograd] Add eager autograd tests (#105808 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/105808 Approved by: https://github.com/albanD, https://github.com/soulitzer	2023-07-28 15:59:35 +00:00
Jason Ansel	c902b84e0b	Compiled autograd (#103822 ) This branch: 1) converts the autograd tape into an FX graph 2) caches that conversion using a "shadow" graph 3) compiles and runs the generated FX graph instead of the normal autograd What works currently: 1) Caching, capture, and initial integration 2) Backwards hooks 3) Inlining AotAutograd generated subgraphs 4) torch.compiling the generated FX graph 5) Auto-detecting dynamic shapes based on changes Future work 1) Larger scale testing 1) Boxed calling convention, so memory can be freed incrementally 1) Support hooks on SavedTensor 1) Additional testing by running eager autograd tests under compiled_autograd.enable() Pull Request resolved: https://github.com/pytorch/pytorch/pull/103822 Approved by: https://github.com/ezyang, https://github.com/albanD	2023-07-24 21:12:05 +00:00
soulitzer	39477f7ca9	Remove unnecessary seen check in get_current_graph_task_execution_order (#105487 ) https://github.com/pytorch/pytorch/pull/105353#discussion_r1266977015 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105487 Approved by: https://github.com/albanD, https://github.com/jansel	2023-07-18 23:49:45 +00:00
soulitzer	cf404a8ce4	Fix get_current_graph_task_execution_order accumulate_grads ordering (#105353 ) Fixes https://github.com/pytorch/pytorch/issues/105293 Pull Request resolved: https://github.com/pytorch/pytorch/pull/105353 Approved by: https://github.com/albanD	2023-07-18 00:59:25 +00:00
Masaki Kozuki	07c60d11b3	replace `AT_ERROR(...)` with `TORCH_CHECK(false, ...)` (#104534 ) Merely cosmetic for `AT_ERROR` I found by chance, following `e9d2d74f0a/c10/util/Exception.h (L622)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/104534 Approved by: https://github.com/soulitzer	2023-07-03 22:43:19 +00:00
Aidyn-A	69eef5a4be	[CUDA12] set_device change (#94864 ) This PR adds workaround for CUDA 12 [`cudaSetDevice` change](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html#group__CUDART__DEVICE_1g159587909ffa0791bbe4b40187a4c6bb) which will always create primary context on target device. So operations like this: ```Python import torch x = torch.randn(1, device="cuda:1") ``` would always create primary context on on device `cuda:1` because it is creating a tensor on it and on device `cuda:0` because the destructor of CUDA Device guard calls `cudaSetDevice(0)`. After this PR the CUDA Device guard will not call `cudaSetDevice(0)` if primary context does not exist on `cuda:0`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94864 Approved by: https://github.com/malfet, https://github.com/atalman, https://github.com/ezyang	2023-04-10 17:31:12 +00:00
shibo	a3701b6740	fix backward bug for custom device (#98586 ) Fixes #ISSUE_NUMBER In the backward on some device , it may get an error to get device index because of exchange a new thread. So just set_device and check the device index in `setDevice` func may be better for some many kinds of devices. For CUDA, the device index check is also included in `setDevice` func.https://github.com/pytorch/pytorch/blob/master/c10/cuda/impl/CUDAGuardImpl.h#:~:text=%7D-,void%20setDevice(Device%20d)%20const%20override%20%7B,%7D,-void%20uncheckedSetDevice(Device ``` void setDevice(Device d) const override { TORCH_INTERNAL_ASSERT(d.is_cuda()); Device current_device = getDevice(); if (current_device != d) { C10_CUDA_CHECK(cudaSetDevice(d.index())); } } ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/98586 Approved by: https://github.com/albanD	2023-04-10 15:56:38 +00:00
PyTorch MergeBot	279ca5f9db	Revert "[CUDA12] set_device change (#94864 )" This reverts commit `c18be2b2ec`. Reverted https://github.com/pytorch/pytorch/pull/94864 on behalf of https://github.com/ezyang due to avoid affecting cuda 11	2023-04-05 14:53:00 +00:00
Aidyn-A	c18be2b2ec	[CUDA12] set_device change (#94864 ) This PR adds workaround for CUDA 12 [`cudaSetDevice` change](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html#group__CUDART__DEVICE_1g159587909ffa0791bbe4b40187a4c6bb) which will always create primary context on target device. So operations like this: ```Python import torch x = torch.randn(1, device="cuda:1") ``` would always create primary context on on device `cuda:1` because it is creating a tensor on it and on device `cuda:0` because the destructor of CUDA Device guard calls `cudaSetDevice(0)`. After this PR the CUDA Device guard will not call `cudaSetDevice(0)` if primary context does not exist on `cuda:0`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94864 Approved by: https://github.com/malfet, https://github.com/atalman, https://github.com/ezyang	2023-04-05 14:34:00 +00:00
Aidyn-A	c69b3b8d4f	[CUDA12] Autograd engine use current device only (#92354 ) This is a device agnostic version #91191. The reason of existence of this PR is device agnostic policy of autograd engine. Hence, the compile time `USE_CUDA` is not supported, so doing something like: `fa1ea9f9bc/torch/csrc/autograd/engine.cpp (L351-L357)` is not effective. In this PR a check upon CUDA devices in device registry is added such that threads set the same CUDA device. Pull Request resolved: https://github.com/pytorch/pytorch/pull/92354 Approved by: https://github.com/albanD, https://github.com/ngimel	2023-03-13 20:04:12 +00:00
cyy	d0e4ca233e	some reference and move fixes (#95942 ) This PR introduces some modifications: 1. We find out some const function parameters that can be passed by reference and add the reference. 2. We find more opportunists of passing by value and change them accordingly. 3. Some use-after-move errors are fixed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95942 Approved by: https://github.com/Skylion007	2023-03-10 03:44:09 +00:00
Kazuaki Ishizaki	69aa6b4bb9	fix typo in comments under torch/csrc/autograd (#96061 ) This PR fixes typos in comments of `.cpp` and `.h` files under `torch/csrc/autograd` directory Pull Request resolved: https://github.com/pytorch/pytorch/pull/96061 Approved by: https://github.com/soulitzer	2023-03-06 18:05:14 +00:00
Aaron Gokaslan	0247ed27cc	Apply Clang-Tidy readability-container-size-empty (#93236 ) Not only is this change usually shorter and more readable, it also can yield better performance. size() is not always a constant time operation (such as on LinkedLists), but empty() always is. Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236 Approved by: https://github.com/malfet	2023-01-29 23:28:19 +00:00
cyy	f172feae0d	More tidy fixes (#93069 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/93069 Approved by: https://github.com/Skylion007	2023-01-27 06:40:50 +00:00
cyy	e292ddff4e	More clang-tidy fixes (#92944 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/92944 Approved by: https://github.com/Skylion007	2023-01-25 19:11:51 +00:00
Kshiteej K	550f98332b	[fix] vmap and anomaly mode interaction (#92672 ) Fixes https://github.com/pytorch/functorch/issues/1049 Pull Request resolved: https://github.com/pytorch/pytorch/pull/92672 Approved by: https://github.com/albanD	2023-01-24 18:12:52 +00:00
Aaron Gokaslan	8c8cd9539d	Add missing moves to torch autograd (#92772 ) Applies some additional std::move functions to torch/csrc/autograd to opportunities that were found via static analysis. Pull Request resolved: https://github.com/pytorch/pytorch/pull/92772 Approved by: https://github.com/ezyang	2023-01-24 02:01:52 +00:00

1 2 3 4 5 ...

260 Commits