Commit Graph

305 Commits

Author SHA1 Message Date
soulitzer
12f742941d Warn if AccumulateGrad stream does not match producer node stream (#165065)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165065
Approved by: https://github.com/ngimel
2025-10-22 17:33:27 +00:00
PyTorch MergeBot
f975bd58af Revert "Warn if AccumulateGrad stream does not match producer node stream (#165065)"
This reverts commit a70ef954b9.

Reverted https://github.com/pytorch/pytorch/pull/165065 on behalf of https://github.com/izaitsevfb due to breaks lint ([comment](https://github.com/pytorch/pytorch/pull/165065#issuecomment-3391387386))
2025-10-10 17:29:29 +00:00
soulitzer
a70ef954b9 Warn if AccumulateGrad stream does not match producer node stream (#165065)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165065
Approved by: https://github.com/ngimel
ghstack dependencies: #162815
2025-10-10 16:46:01 +00:00
soulitzer
71aefd5595 [reland] Allow setting grad_dtype on leaf tensors (#164751)
ghstack-source-id: e44b3941530be83a630ec93f1478eec741ffca2e
Pull-Request-resolved: https://github.com/pytorch/pytorch/pull/162815

Fixes #ISSUE_NUMBER

Relanding due to internal weirdness. Separate PR to codev without ghstack.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164751
Approved by: https://github.com/albanD
2025-10-08 20:23:13 +00:00
PyTorch MergeBot
3ddf2018d0 Revert "Support setting grad_dtype on leaf tensors (#162815)"
This reverts commit dca73982c5.

Reverted https://github.com/pytorch/pytorch/pull/162815 on behalf of https://github.com/yangw-dev due to break internal test D83850533, see more details below ([comment](https://github.com/pytorch/pytorch/pull/162815#issuecomment-3367498501))
2025-10-03 23:14:28 +00:00
soulitzer
dca73982c5 Support setting grad_dtype on leaf tensors (#162815)
`grad_dtype` is a new attribute on Tensor to control gradient dtype:
- Access/setting is leaf-only.
- grad_dtype is respected (1) when assigning to .grad, and (2) in the engine after the previous node produces incoming gradients for AccumulateGrad (see the table below for details).
- Not setting grad_dtype preserves the current behavior; accessing it returns `t.dtype`.
- `grad_dtype` cannot be set when there is already a `.grad` present and the dtypes conflict.

| `grad_dtype` setting | Setting `.grad` manually | Incoming gradient from autograd engine |
|-----------------------|--------------------------|-----------------------------------------|
| **Default (tensor’s dtype)** | `.grad` must match tensor’s dtype | Engine casts incoming grad to tensor’s dtype |
| **Set to specific dtype** | `.grad` must match that dtype | Engine casts incoming grad to the specified dtype |
| **Set to `None`** | `.grad` may be any dtype | Engine does not cast; accepts incoming grad dtype as-is |
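
A minimal usage sketch of the semantics in the table above (the `grad_dtype` attribute and its exact behavior are assumed from this PR and may differ in released versions):

```python
import torch

w = torch.randn(3, requires_grad=True)  # float32 leaf

# Default: grad_dtype reads back as the tensor's dtype; the engine casts
# incoming gradients to float32.
(w * 2).sum().backward()
assert w.grad.dtype == torch.float32

# Specific dtype: the engine casts incoming gradients to bfloat16, and manual
# .grad assignment must also be bfloat16. (Clear .grad first, since grad_dtype
# cannot be changed while a conflicting .grad is present.)
w.grad = None
w.grad_dtype = torch.bfloat16
(w * 2).sum().backward()
assert w.grad.dtype == torch.bfloat16

# None: the engine does not cast, and .grad may hold any dtype.
w.grad = None
w.grad_dtype = None
w.grad = torch.zeros(3, dtype=torch.float16)
```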

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162815
Approved by: https://github.com/albanD
2025-10-02 23:09:07 +00:00
FFFrog
ab2ce3c50e [Code Clean] Replace std::runtime_error with TORCH_CHECK (#163264)
Related ISSUE: https://github.com/pytorch/pytorch/issues/148114
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163264
Approved by: https://github.com/albanD, https://github.com/cyyever
2025-09-25 11:28:51 +00:00
Simon Fan
c8205cb354 [autograd] match 0-dim gradients device type regardless of subclassness (#160165)
Not sure if there are some subclasses where outer.dim() == 0 but you wouldn't want to move it?

FIXES https://github.com/pytorch/pytorch/issues/160084
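
For reference, a small sketch of the 0-dim cross-device case this targets, shown with plain tensors (the change extends the same device-type matching to tensor subclasses); requires a CUDA device:

```python
import torch

# A 0-dim CPU leaf participating in a CUDA computation: its incoming gradient
# is produced on CUDA, and the engine moves it to match the leaf's device type.
if torch.cuda.is_available():
    s = torch.tensor(2.0, requires_grad=True)  # 0-dim CPU leaf
    x = torch.randn(8, device="cuda")
    (x * s).sum().backward()
    assert s.grad.device.type == "cpu"
```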

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160165
Approved by: https://github.com/ezyang, https://github.com/albanD
2025-08-11 17:57:32 +00:00
soulitzer
8bda95228f [autograd] Avoid creating and recording event when unnecessary (#157503)
Today, we always create and record an event in two places:
1) Upon seeing the first producer, we record an event on the producer stream. This event is waited for in two places: (a) when the engine goes to run the consumer, the consumer stream waits for it, and (b) prior to doing accumulation, the accumulation stream waits for it.

2) After doing accumulation, we record an event on the accumulation stream and wait for this event in a single place: when the engine goes to run the consumer.

We do not actually need to create and record the event in the cases where the first producer stream is the same as both the consumer stream and the accumulation stream, and where the accumulation stream is the same as the consumer stream.

Removing this unnecessary event create + record should save a few microseconds for each instance avoided.
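
For intuition, here is the producer/consumer pattern sketched with the public torch.cuda stream APIs (the engine does the equivalent internally in C++; requires a CUDA device):

```python
import torch

producer = torch.cuda.Stream()
consumer = torch.cuda.Stream()

with torch.cuda.stream(producer):
    grad = torch.randn(1 << 20, device="cuda") * 2  # gradient produced on the producer stream

# Only synchronize when the streams actually differ; when the producer stream is
# the same as the consumer (and accumulation) stream, the create/record/wait
# round-trip is pure overhead, which is what this change skips.
if producer != consumer:
    ev = torch.cuda.Event()
    ev.record(producer)      # mark the point where the producer's work ends
    consumer.wait_event(ev)  # order the consumer's work after that point

with torch.cuda.stream(consumer):
    out = grad.sum()
```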

Fixes https://github.com/pytorch/pytorch/issues/157407

----

Manual test plan:
- [x] @eqy to confirm perf is restored
- [x] Running the repro originally reported before/after the patch

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157503
Approved by: https://github.com/eqy
ghstack dependencies: #155715
2025-07-09 03:36:14 +00:00
Xuehai Pan
5b210bb3a6 [BE][9/16] fix typos in torch/ (torch/csrc/) (#156319)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156319
Approved by: https://github.com/albanD
ghstack dependencies: #156313, #156314, #156315, #156316, #156317
2025-06-23 02:57:50 +00:00
PyTorch MergeBot
1d3bca40ed Revert "[BE][9/16] fix typos in torch/ (torch/csrc/) (#156319)"
This reverts commit a23ccaa847.

Reverted https://github.com/pytorch/pytorch/pull/156319 on behalf of https://github.com/atalman due to export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_aot_eager [GH job link](https://github.com/pytorch/pytorch/actions/runs/15804799771/job/44548489912) [HUD commit link](c95f7fa874) ([comment](https://github.com/pytorch/pytorch/pull/156313#issuecomment-2994171213))
2025-06-22 12:31:56 +00:00
Xuehai Pan
a23ccaa847 [BE][9/16] fix typos in torch/ (torch/csrc/) (#156319)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156319
Approved by: https://github.com/albanD
ghstack dependencies: #156313, #156314, #156315, #156316, #156317
2025-06-22 08:43:49 +00:00
Simon Fan
5f2f343e1e [ca] suggest to disable compiled autograd for trace-time NotImplementedErrors (#156509)
Example:

```python
  File "/home/xmfan/core/a/pytorch/torch/autograd/graph.py", line 829, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: TorchDispatchMode not yet implemented for compiled autograd.
  You can disable compiled autograd for this operation by:
  1.  Relocating the unsupported autograd call outside the compiled region.
  2.  Wrapping the unsupported autograd call within a scope that disables compiled autograd.
  3.  Configuring the specific compilation unit to disable compiled autograd.
  4.  Globally disabling compiled autograd at the application's initialization.
```

No duplicate error messages for python side trace-time errors
```python
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xmfan/core/a/pytorch/torch/_dynamo/compiled_autograd.py", line 344, in begin_capture
    raise NotImplementedError(
NotImplementedError: Found tensor of type <class 'torch.nn.utils._expanded_weights.expanded_weights_impl.ExpandedWeight'>, which is not supported by FakeTensorMode. You can turn off compiled autograd by either:
1. Moving the unsupported autograd call outside of the torch.compile'd region.
2. Wrapping the unsupported autograd call in the torch._dynamo.compiled_autograd._disable() context manager.
3. Setting torch._dynamo.config.compiled_autograd=False for the torch.compile call containing the unsupported autograd call.
4. Setting torch._dynamo.config.compiled_autograd=False at the start of the program.
```
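
A minimal sketch of options 2 and 4 from the message above, using the entry points named in the error text:

```python
import torch
from torch._dynamo import compiled_autograd

x = torch.randn(4, requires_grad=True)
loss = (x * x).sum()

# Option 2: run the unsupported autograd call with compiled autograd disabled.
with compiled_autograd._disable():
    loss.backward()

# Option 4: disable compiled autograd globally at the start of the program.
torch._dynamo.config.compiled_autograd = False
```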

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156509
Approved by: https://github.com/jansel
ghstack dependencies: #156374
2025-06-21 18:33:46 +00:00
Simon Fan
17b38b850e [ca] Allow using compiled autograd context managers during backward runtime (#156120)
Added an invariant that nested compiled autograd context managers must exit before their parent context manager. This allows us to defer the thread check.

FIXES https://github.com/pytorch/pytorch/issues/152219

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156120
Approved by: https://github.com/jansel
ghstack dependencies: #155521, #155480
2025-06-18 03:01:15 +00:00
Sean McGovern
297805fd8f Typo fixes for "overridden" in comments and function names (#155944)
This word appears often in class descriptions and is not consistently spelled. Update comments and some function names to use the correct spelling consistently. Facilitates searching the codebase.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155944
Approved by: https://github.com/Skylion007
2025-06-14 03:37:38 +00:00
soulitzer
a060f3d272 Rewrite autograd producer consumer stream sync logic (#151079)
Also see previous work https://github.com/pytorch/pytorch/pull/142097

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151079
Approved by: https://github.com/albanD
2025-05-16 15:42:22 +00:00
PyTorch MergeBot
2c1912452d Revert "Rewrite autograd producer consumer stream sync logic (#151079)"
This reverts commit f78e4529a9.

Reverted https://github.com/pytorch/pytorch/pull/151079 on behalf of https://github.com/jeanschmidt due to Seems to have introduced regressions in internal signals, see [D74648937](https://www.internalfb.com/diff/D74648937) ([comment](https://github.com/pytorch/pytorch/pull/151079#issuecomment-2880176879))
2025-05-14 13:07:12 +00:00
Simon Fan
a80eb84a5f [ca] support higher order gradients (create_graph=True) (#153222)
Adds create_graph support when not compiling, or when compiling only with torch.compile(backend="eager").

Using a backend that uses AOTDispatch produces a post-dispatch AOT backward, whose double backward will be silently incorrect if the forward trace involved any ops that are not composite implicit.
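
For reference, the basic create_graph=True (double backward) pattern this enables, shown here in plain eager for clarity:

```python
import torch

x = torch.randn(4, requires_grad=True)
y = (x.sin() ** 2).sum()

# First-order gradient, keeping the graph so it can be differentiated again.
(g,) = torch.autograd.grad(y, x, create_graph=True)

# Second-order (double backward) gradient.
(gg,) = torch.autograd.grad(g.sum(), x)
```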

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153222
Approved by: https://github.com/jansel
ghstack dependencies: #153193
2025-05-13 16:42:09 +00:00
soulitzer
f78e4529a9 Rewrite autograd producer consumer stream sync logic (#151079)
Also see previous work https://github.com/pytorch/pytorch/pull/142097

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151079
Approved by: https://github.com/albanD
2025-05-12 21:07:16 +00:00
cyyever
f2cfeb23e5 [Environment Variable][7/N] Use thread-safe getenv functions (#140211)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140211
Approved by: https://github.com/ezyang, https://github.com/eqy
2025-04-24 01:06:29 +00:00
Simon Fan
dcb378cff2 [ca] support anomly mode nan checks with different semantics than eager (#149897)
see note in code

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149897
Approved by: https://github.com/jansel
ghstack dependencies: #149647, #149709, #149651
2025-03-27 05:05:34 +00:00
cyy
8fa81a6066 Enable misc-use-internal-linkage check and apply fixes (#148948)
Enables the clang-tidy rule [`misc-use-internal-linkage`](https://clang.llvm.org/extra/clang-tidy/checks/misc/use-internal-linkage.html). This new check was introduced in Clang-Tidy 18 and is available thanks to the recent update to Clang-Tidy 19.

The check marks functions and variables used only in their translation unit as static. As a result, undesired symbols are not leaked into other units, more link-time optimisations become possible, and the resulting binaries may be smaller.

The detected violations were mostly fixed by marking the symbols static. In other cases, the symbols were indeed consumed by other files, so their declaring headers were included. Some remaining declarations were wrong and have been fixed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148948
Approved by: https://github.com/Skylion007
2025-03-12 14:22:56 +00:00
Simon Fan
0a2da008f8 [ca] trace saved variable unpacking (#147242)
## Before

Previously, CA would always unpack all saved variables stored in the autograd graph before executing it. This meant that we couldn't capture unpack hooks as part of the CA graph, and they would fire out of order with respect to other backward hooks. For memory-saving APIs built on top of saved tensor hooks, like non-reentrant checkpointing and offloading, we couldn't achieve any savings because all activations would be recomputed/loaded and active at the same time, resulting in a no-op.

## After

We add unpack hooks into the CA graph so that they can be executed progressively. The python hook and hook input themselves are wrapped by non-traceable code, so CA polyfills the wrapping as:
```python
# Pseudocode for the polyfill: packed data plus an optional Python unpack hook.
class SavedVariable:
    def __init__(self, packed_data, hook=None):
        self.packed_data = packed_data
        self.hook = hook

    def unpack(self):
        # The hook call (if any) is now part of the CA graph, so it fires
        # progressively during backward instead of all up front.
        if self.hook:
            return self.hook(self.packed_data)
        return self.packed_data

# This approach won't directly work when we add support for Forward AD or double-backward.
```

When directly executing the CA graph (without torch.compile-ing it) under checkpointing/offloading, the memory profile is expected to stay the same as with the eager autograd engine. If an AOT backward is in the autograd graph, the memory profile is expected to be better than with the eager autograd engine, since we can now delay unpacking of saved activations until the AOT backward's execution.

All tests pass when running the CA graph directly; the remaining issues are in Dynamo.
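
For context, a sketch of the kind of memory-saving API this unblocks: non-reentrant checkpointing is built on saved-tensor pack/unpack hooks, and with this change CA can fire the unpack (recompute) hooks progressively instead of all up front:

```python
import torch
from torch.utils.checkpoint import checkpoint

def block(x):
    return torch.nn.functional.gelu(x @ x.t())

x = torch.randn(128, 128, requires_grad=True)

# Non-reentrant checkpointing stores pack/unpack hooks instead of activations;
# the activations are recomputed when the unpack hooks fire during backward.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```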

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147242
Approved by: https://github.com/jansel
2025-02-26 16:37:17 +00:00
PyTorch MergeBot
90e3a3d86d Revert "[ca] trace saved variable unpacking (#147242)"
This reverts commit 68ddca9449.

Reverted https://github.com/pytorch/pytorch/pull/147242 on behalf of https://github.com/wdvr due to failing tests in the slow workflow, see below ([comment](https://github.com/pytorch/pytorch/pull/147242#issuecomment-2683604547))
2025-02-26 00:40:16 +00:00
Simon Fan
68ddca9449 [ca] trace saved variable unpacking (#147242)
## Before

Previously, CA would always unpack all saved variables stored in the autograd graph before executing it. This meant that we couldn't capture unpack hooks as part of the CA graph, and they would fire out of order with respect to other backward hooks. For memory-saving APIs built on top of saved tensor hooks, like non-reentrant checkpointing and offloading, we couldn't achieve any savings because all activations would be recomputed/loaded and active at the same time, resulting in a no-op.

## After

We add unpack hooks into the CA graph so that they can be executed progressively. The python hook and hook input themselves are wrapped by non-traceable code, so CA polyfills the wrapping as:
```python
# Pseudocode for the polyfill: packed data plus an optional Python unpack hook.
class SavedVariable:
    def __init__(self, packed_data, hook=None):
        self.packed_data = packed_data
        self.hook = hook

    def unpack(self):
        # The hook call (if any) is now part of the CA graph, so it fires
        # progressively during backward instead of all up front.
        if self.hook:
            return self.hook(self.packed_data)
        return self.packed_data

# This approach won't directly work when we add support for Forward AD or double-backward.
```

When directly executing the CA graph (without torch.compile-ing it) under checkpointing/offloading, the memory profile is expected to stay the same as with the eager autograd engine. If an AOT backward is in the autograd graph, the memory profile is expected to be better than with the eager autograd engine, since we can now delay unpacking of saved activations until the AOT backward's execution.

All tests pass when running the CA graph directly; the remaining issues are in Dynamo.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147242
Approved by: https://github.com/jansel
2025-02-25 20:38:51 +00:00
PyTorch MergeBot
00dc5b10f6 Revert "[Environment Variable][7/N] Use thread-safe getenv functions (#140211)"
This reverts commit 2fd1b6b361.

Reverted https://github.com/pytorch/pytorch/pull/140211 on behalf of https://github.com/atalman due to Breaks executorch tests ([comment](https://github.com/pytorch/pytorch/pull/140211#issuecomment-2632202864))
2025-02-03 22:04:28 +00:00
cyy
2fd1b6b361 [Environment Variable][7/N] Use thread-safe getenv functions (#140211)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140211
Approved by: https://github.com/ezyang, https://github.com/eqy
2025-02-01 12:33:41 +00:00
PyTorch MergeBot
284f217011 Revert "[Environment Variable][7/N] Use thread-safe getenv functions (#140211)"
This reverts commit 97b3b73f3e.

Reverted https://github.com/pytorch/pytorch/pull/140211 on behalf of https://github.com/ZainRizvi due to Sorry but this is failing internally. @eqy @ezyang can you please help this get remerged? See D68779772. ([comment](https://github.com/pytorch/pytorch/pull/140211#issuecomment-2622504898))
2025-01-29 18:24:29 +00:00
cyyever
97b3b73f3e [Environment Variable][7/N] Use thread-safe getenv functions (#140211)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140211
Approved by: https://github.com/ezyang, https://github.com/eqy
2025-01-28 15:21:12 +00:00
rzou
ea141d8134 functional compiled autograd (#144707)
This PR squashes together the following commits:

https://github.com/pytorch/pytorch/pull/144115
https://github.com/pytorch/pytorch/pull/143417
https://github.com/pytorch/pytorch/pull/143405
https://github.com/pytorch/pytorch/pull/143387
https://github.com/pytorch/pytorch/pull/143304
https://github.com/pytorch/pytorch/pull/143296

This is a refactor of compiled autograd to use "functional autograd". The end goal is for compiled autograd's initial capture to stop specializing on Tensor metadata, thereby allowing compiled autograd to better handle Tensor subclasses.

For more information, please read the commit messages for each PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144707
Approved by: https://github.com/bdhirsh, https://github.com/xmfan, https://github.com/jansel
2025-01-27 05:20:56 +00:00
PyTorch MergeBot
6dd8283381 Revert "[compiled autograd] Proxy opaque nodes for built-in autograd nodes (#143296)"
This reverts commit 5531fafffe.

Reverted https://github.com/pytorch/pytorch/pull/143296 on behalf of https://github.com/izaitsevfb due to breaking internal tests T213390054 ([comment](https://github.com/pytorch/pytorch/pull/143296#issuecomment-2611224926))
2025-01-23 23:34:13 +00:00
cyy
29f52e3972 [2/N] Remove unnecessary once flag usage (#145057)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145057
Approved by: https://github.com/albanD
2025-01-23 09:48:46 +00:00
rzou
5531fafffe [compiled autograd] Proxy opaque nodes for built-in autograd nodes (#143296)
This PR is on the way to getting compiled autograd's initial capture to
stop specializing on Tensor metadata.

This PR changes compiled autograd's initial capture to proxy an opaque
(w.r.t. Dynamo) function into the graph for all built-in codegen'ed
autograd nodes and validate_outputs.

We changed each codegen'ed apply_with_saved (e.g.
MulBackward0::apply_with_saved) to call into Python to proxy a function
(compiled_autograd.ops.MulBackward0) into the graph. Then, we use the
node's InputMetadata to "guess" at the properties of the output Tensors
to create some new FakeTensors.

Some details:
- MulBackward0::apply_with_saved lives in libtorch_cpu, but needs to
  call into Python via libtorch_python. There is an indirection
  (PyCompilerInterface) to do this.
- MulBackward0::apply_with_saved passes a C++ function to Python. To make
  our lives easier, every codegen'ed apply_with_saved passes a C++
  function with the same signature
  `(variable_list, ivalue_list) -> variable_list`.
- We define how to pack arbitrary C++ types into IValue via a helper
  IValuePacker struct and codegen functional variants of each builtin
  C++ autograd node (e.g. MulBackward0_apply_functional_ivalue).

MulBackward0 before this PR:
https://gist.github.com/zou3519/a80381d5fa38e970e413fcd91b0530de

MulBackward0 after this PR:
https://gist.github.com/zou3519/0c2eee8b3d8d96232b51ef430b53c5b0

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143296
Approved by: https://github.com/jansel
2025-01-22 21:50:29 +00:00
cyy
843627b7b1 Remove unnecessary once flag usage (#143255)
Static variables in C++11 is guaranteed to be initialised exactly once, as mentioned [here](https://en.cppreference.com/w/cpp/language/storage_duration)
```
If multiple threads attempt to initialize the same static local variable concurrently,
the initialization occurs exactly once (similar behavior can be obtained for arbitrary
functions with std::call_once). Usual implementations of this feature use variants of
the double-checked locking pattern, which reduces runtime overhead for already-initialized
local statics to a single non-atomic boolean comparison.
```
Given that static c10::once_flag was used before, why not just use the associated function to initialise the related static variables? That is the motivation behind this PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143255
Approved by: https://github.com/albanD
2025-01-16 02:36:11 +00:00
Simon Fan
ab04f3aee1 [ca] set autograd graph task state (#143108)
GraphTask holds metadata needed for a single execution of backward(); it is 1:1 with backward calls, at least for compiled autograd. It is used for certain torch._C global autograd state APIs.

In SAC, we use torch._C._current_graph_task_id() as a dict key to store information during unpack hook execution: a5fb07af27/torch/utils/checkpoint.py (L1128)

If we don't set an active task, it will randomize the key, and will do its logic as if each unpacked tensor was from a different graph task
a5fb07af27/torch/utils/checkpoint.py (L1112-L1115)

The sketchy part of this PR is that in eager autograd, GraphTask is mutated during execution. But inspecting the struct, the mutation seems to only be used to communicate between autograd threads (created when multiple devices are involved) or for deprecated uses. We shouldn't run into the mutation case at all in compiled autograd. Also, only the graph task id is accessible from python hooks.
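
A small sketch of the API in question, using the same private hook SAC relies on (torch._C._current_graph_task_id(), which returns -1 when no graph task is active):

```python
import torch

class ShowTaskId(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x * 2

    @staticmethod
    def backward(ctx, grad_out):
        # Inside an eager backward there is an active GraphTask; without one
        # (the compiled autograd case this PR fixes), the id is -1 and
        # checkpoint falls back to a randomized key per unpack.
        print("graph task id:", torch._C._current_graph_task_id())
        return grad_out * 2

x = torch.randn(3, requires_grad=True)
ShowTaskId.apply(x).sum().backward()
print("outside backward:", torch._C._current_graph_task_id())  # -1
```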

FIXES https://github.com/pytorch/pytorch/issues/142862

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143108
Approved by: https://github.com/jansel, https://github.com/albanD
2024-12-13 03:10:48 +00:00
Richard Barnes
7667235a23 c10::optional -> std::optional (#142514)
Fixes issues introduced in https://github.com/pytorch/pytorch/pull/141348 and https://github.com/pytorch/pytorch/pull/139578

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142514
Approved by: https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2024-12-12 17:23:46 +00:00
cyy
f7b9533c3f [4/N] Apply bugprone-unchecked-optional-access (#142832)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142832
Approved by: https://github.com/albanD
2024-12-12 04:33:32 +00:00
cyy
7d98b3dcee [3/N] Apply bugprone-unchecked-optional-access (#142442)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142442
Approved by: https://github.com/albanD
2024-12-11 01:39:10 +00:00
cyy
b4c0973b59 [2/N] Apply bugprone-unchecked-optional-access (#141091)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141091
Approved by: https://github.com/Skylion007, https://github.com/albanD

Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>
2024-12-09 19:30:19 +00:00
rzou
215f5d77b5 [functional autograd] Refactor validate_outputs into a functional variant (#141348)
Today, validate_outputs is stateful (it depends on the autograd graph).
This PR refactors it into a stateless form that just depends on
InputMetadata.

Test Plan:
- new unittest
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141348
Approved by: https://github.com/soulitzer
ghstack dependencies: #141278
2024-12-04 18:06:31 +00:00
Simon Fan
db4e8a1d8a [ca] expose option to collect sizes as dynamic (#141153)
This is to address recompiles from eager nodes that saved dynamic activations

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141153
Approved by: https://github.com/jansel
ghstack dependencies: #141152
2024-11-22 19:26:27 +00:00
PyTorch MergeBot
614e727191 Revert "[Environment Variable][7/N] Use thread-safe getenv functions (#140211)"
This reverts commit cd942d00dd.

Reverted https://github.com/pytorch/pytorch/pull/140211 on behalf of https://github.com/izaitsevfb due to causes crash internally during test listing ([comment](https://github.com/pytorch/pytorch/pull/140211#issuecomment-2492328790))
2024-11-21 21:05:22 +00:00
cyyever
cd942d00dd [Environment Variable][7/N] Use thread-safe getenv functions (#140211)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140211
Approved by: https://github.com/ezyang, https://github.com/eqy
2024-11-21 00:25:20 +00:00
PyTorch MergeBot
4a18e26ff5 Revert "[Environment Variable][7/N] Use thread-safe getenv functions (#140211)"
This reverts commit a3cff4bbd4.

Reverted https://github.com/pytorch/pytorch/pull/140211 on behalf of https://github.com/ezyang due to One of these diffs had incorrect downstream optional handling, we must reaudit all of these diffs ([comment](https://github.com/pytorch/pytorch/pull/140211#issuecomment-2473709246))
2024-11-13 14:05:01 +00:00
cyy
a3cff4bbd4 [Environment Variable][7/N] Use thread-safe getenv functions (#140211)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140211
Approved by: https://github.com/ezyang, https://github.com/eqy
2024-11-12 18:49:51 +00:00
cyy
032135f8a2 [2/N] Turn inline static functions into static (#140068)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140068
Approved by: https://github.com/ezyang
2024-11-09 03:31:24 +00:00
soulitzer
d6f340f66c Determine autograd engine ready queue based on InputMetadata instead of InputBuffer (#135633)
Thanks @awgu for raising this issue and the small repro

From offline discussion with @albanD, in the case where a forward returns multiple outputs with different devices, we'd want to select the ready queue based on the device of the first one. Even though this is somewhat arbitrary, we prefer this over deciding which ready queue to push to based on whichever input buffer we happen to compute last, which can vary depending on more factors and thus be harder to reason about. This is in theory BC-breaking, but it seems unlikely that someone would depend on this behavior.
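
The queue selection itself is internal to the engine, but for reference, a hypothetical sketch of the scenario described above: a custom Function whose forward returns outputs on different devices (requires CUDA):

```python
import torch

class TwoDeviceOutputs(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Two outputs on different devices; the backward node's ready queue is
        # now chosen from the first output's InputMetadata (its device).
        return x * 2, (x * 3).cuda()

    @staticmethod
    def backward(ctx, g0, g1):
        return g0 * 2 + g1.cpu() * 3

if torch.cuda.is_available():
    x = torch.randn(3, requires_grad=True)
    a, b = TwoDeviceOutputs.apply(x)
    (a.sum() + b.sum().cpu()).backward()
```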

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135633
Approved by: https://github.com/albanD
2024-10-04 23:59:46 +00:00
Jane Xu
7f2d20e687 Run all autograd node post hooks (#134728)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134728
Approved by: https://github.com/albanD, https://github.com/soulitzer
2024-09-06 19:44:28 +00:00
cyy
929d2f8253 [3/N] Fix clang-tidy warnings in torch/csrc/autograd (#133389)
Follows #133295
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133389
Approved by: https://github.com/Skylion007
2024-08-16 00:57:54 +00:00
cyy
71efbf701d [3/N] Change #include <c10/util/Optional.h> to #include <optional> (#130300)
Follows #130236

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130300
Approved by: https://github.com/ezyang
2024-07-09 13:32:57 +00:00