3173 Commits

Author SHA1 Message Date
Yuki Kobayashi
f55f2f42a7 Add missing docstring for sym_ite (#154201)
`sym_ite` is listed in [the reference page](https://docs.pytorch.org/docs/stable/torch.html) but has no documentation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154201
Approved by: https://github.com/Skylion007
2025-05-26 15:59:21 +00:00
bobrenjc93
53ecb8159a Introduce statically_known_false (#154291)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154291
Approved by: https://github.com/mengluy0125
2025-05-24 14:23:55 +00:00
Svetlana Karslioglu
1ab2993345 Add a link to transformer_building_blocks tutorial (#154281)
Cross-link to https://docs.pytorch.org/tutorials/intermediate/transformer_building_blocks.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154281
Approved by: https://github.com/mikaylagawarecki
2025-05-24 02:50:24 +00:00
Svetlana Karslioglu
ec368a1903 Add sitemap (#154158)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154158
Approved by: https://github.com/albanD
2025-05-23 18:01:00 +00:00
Shangdi Yu
04a6fe7914 Update provenance tracking doc (#154062)
Summary: Update the doc to reflect the changes in https://github.com/pytorch/pytorch/pull/153584/files#diff-e0cdb58c0f84f56f20c5433339b6d83c470dcde47847e2328effea6bedd4cd27 and https://github.com/pytorch/tlparse/pull/110

Test Plan: CI

Differential Revision: D75155981

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154062
Approved by: https://github.com/svekars, https://github.com/desertfire
2025-05-23 17:09:52 +00:00
Anita Katahoire
996c4d803d Removing conda references from PyTorch Docs (#152702)
Addresses #148339

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152702
Approved by: https://github.com/svekars, https://github.com/albanD, https://github.com/atalman
2025-05-20 20:33:28 +00:00
Svetlana Karslioglu
7c9d94e9bb Redirect mobile_optimizer.rst to executorch (#153664)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153664
Approved by: https://github.com/byjlw, https://github.com/malfet
2025-05-20 18:13:45 +00:00
Mikayla Gawarecki
6383ddcfa4 Update serialization docs (#153631)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153631
Approved by: https://github.com/albanD
2025-05-19 20:22:07 +00:00
Angela Yi
b4fb801b2d [export] Move PT2 constants to torch::_export (#153206)
Test Plan:
`buck2 test //sigmoid/...`
https://www.internalfb.com/intern/testinfra/testrun/1970325119807758

Differential Revision: D74417085

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153206
Approved by: https://github.com/zhxchen17, https://github.com/dolpm
2025-05-17 08:21:59 +00:00
Anthony Shoumikhin
7d39e73c57 Fix more URLs (#153277)
Or ignore them.
Found by running the lint_urls.sh script locally with https://github.com/pytorch/pytorch/pull/153246

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153277
Approved by: https://github.com/malfet
2025-05-14 16:23:50 +00:00
angelayi
d51bc27378 [export] Make draft_export public (#153219)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153219
Approved by: https://github.com/pianpwk
2025-05-14 02:18:36 +00:00
Svetlana Karslioglu
f136046919 Clean up right nav (#153090)
- Move community and language binding links to the horizontal bar
- Add an intro to the community page.
- Fix the link in the ogp_image
- Fix the link in the version switcher
- Clean up unneeded links

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153090
Approved by: https://github.com/albanD
2025-05-12 21:00:45 +00:00
PyTorch MergeBot
fdc387ec7c Revert "refine fp32 precision api (#125888)"
This reverts commit 4c11b26158.

Reverted https://github.com/pytorch/pytorch/pull/125888 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause some failures on ROCm ([comment](https://github.com/pytorch/pytorch/pull/125888#issuecomment-2869274791))
2025-05-11 00:35:46 +00:00
haozhe.zhu
4c11b26158 refine fp32 precision api (#125888)
Based on the [conversation](https://github.com/pytorch/pytorch/issues/121791), we plan to drop "highest", "high", and "medium" as names for the fp32 internal computation data types. Instead, we will name the algorithm directly.

### Design Choice: Directly use algorithm names like "TF32", "BF16".
#### Pros
 - The names are more informative. 'tf32' is more informative than a simple "high".
 - Easier to extend to new algorithms like `tf32x3`
#### Cons
 - "HIGHEST, HIGH, MEDIUM" indicated the relative precision between different algorithms. However, we can add documentation that discusses this.

### We provide a layered structure for backends/operators.
('f32' is short for 'fp32_precision')
![image](https://github.com/user-attachments/assets/f89143e5-d6a1-4865-9351-9a50439f5067)

### The fp32 compute precision can be set to one of the following:
 - **"ieee"**: Not allowed to use any other internal computation data types.
 - **"tf32"**: Allowed to use tf32 as the internal computation data type.
 - **"bf16"**: Allowed to use bf16 as the internal computation data type.
 - **"none"**: Precision is not set. Can be overridden by its parent node.

### Overriding Precision Settings
A child node can be overridden by its parent node if it is set to the default.
The current default settings are:
```
backend = generic, op = all, precision setting = none
    backend = cuda, op = all, precision setting = none
        backend = cuda, op = conv, precision setting = tf32
        backend = cuda, op = rnn, precision setting = tf32
        backend = cuda, op = matmul, precision setting = none
    backend = matmul, op = all, precision setting = none
        backend = matmul, op = conv, precision setting = none
        backend = matmul, op = rnn, precision setting = none
        backend = matmul, op = matmul, precision setting = none
```
 - If the user sets `torch.backends.mkldnn.fp32_precision="bf16"`, its child nodes `torch.backends.mkldnn.matmul.fp32_precision` / `torch.backends.mkldnn.conv.fp32_precision` / `torch.backends.mkldnn.rnn.fp32_precision` will also be overridden to "bf16".
 - If the user sets `torch.backends.fp32_precision="bf16"`, `torch.backends.mkldnn.fp32_precision` and its child nodes will also be overridden to "bf16".
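
The override rule can be modelled with a small tree in plain Python. This is only a sketch of the idea, not PyTorch's implementation: the `PrecisionNode` class and its `effective` method are hypothetical names, and it assumes that only nodes left at "none" take their parent's value.

```python
from typing import Optional


class PrecisionNode:
    """Hypothetical node in the backend/operator precision tree."""

    def __init__(self, name: str, parent: Optional["PrecisionNode"] = None,
                 setting: str = "none") -> None:
        self.name = name
        self.parent = parent
        self.setting = setting

    def effective(self) -> str:
        """Return this node's own setting, or the nearest ancestor's if unset."""
        node: Optional["PrecisionNode"] = self
        while node is not None:
            if node.setting != "none":
                return node.setting
            node = node.parent
        return "none"


# A small slice of the tree, using illustrative node names.
generic = PrecisionNode("generic")
mkldnn = PrecisionNode("mkldnn", parent=generic)
mkldnn_matmul = PrecisionNode("mkldnn.matmul", parent=mkldnn)
cuda = PrecisionNode("cuda", parent=generic)
cuda_conv = PrecisionNode("cuda.conv", parent=cuda, setting="tf32")

mkldnn.setting = "bf16"
print(mkldnn_matmul.effective())  # unset child, so it follows mkldnn
print(cuda_conv.effective())      # explicitly set, so it keeps its own value
```

Resolving the value lazily by walking up the tree keeps each node's own setting intact, so resetting a parent to "none" restores the previous behaviour of its children.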

### Backward Compatibility
Since the new API gives users more fine-grained control, there will be some conflicts. For example, the previous `torch.backends.cudnn.allow_tf32` is not enough to represent the state where `torch.backends.cudnn.rnn.fp32_precision="ieee"` and `torch.backends.cudnn.conv.fp32_precision="tf32"`. Therefore, our goals for backward compatibility are:
 - If the user only uses the previous APIs, they will work as before.
 - If the user uses the **new** API to change the state to one that is **un-representable** by the old API, and then tries to access the state through the **old** API, we will raise a RuntimeError and point the user to the documentation.

### Test Plan
```
python test/test_cuda.py -k test_fp32_precision_with_tf32
python test/test_cuda.py -k test_fp32_precision_with_float32_matmul_precision
python test/test_cuda.py -k test_invalid_status_for_legacy_api
python test/test_mkldnn.py -k test_mlkdnn_get_set
python test/test_mkldnn.py -k test_generic_precision
python test/test_mkldnn.py -k test_invalid
python test/test_mkldnn.py -k test_default_use_parent
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125888
Approved by: https://github.com/jgong5, https://github.com/albanD

Co-authored-by: Jiang, Yanbing <yanbing.jiang@intel.com>
2025-05-10 11:13:04 +00:00
soulitzer
9d00f2b375 [autograd][docs] Add more details on why save_for_backward is important in extending autograd note (#153005)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153005
Approved by: https://github.com/albanD
2025-05-09 16:36:57 +00:00
Shangdi Yu
faff387bfd Mini tutorial for provenance tracking (#152211)
as title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152211
Approved by: https://github.com/svekars, https://github.com/eellison, https://github.com/desertfire
2025-05-09 01:41:04 +00:00
Wei Feng
5a8c9c3ab0 [FSDP2][Doc] add pointer to torchtitan (#153079)
<img width="838" alt="Screenshot 2025-05-08 at 10 51 05 AM" src="https://github.com/user-attachments/assets/4cf43a16-3801-424b-a74f-ede1d41ff052" />

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153079
Approved by: https://github.com/mori360
2025-05-08 22:22:07 +00:00
Yuxin Wu
2cf7fd0d2b Update docs of saved_tensors_hooks to avoid ref cycle (#153049)
Fixes #115255

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153049
Approved by: https://github.com/Skylion007, https://github.com/soulitzer
2025-05-07 18:54:56 +00:00
angelayi
60ecc560af [export] Add draft-export docs (#152637)
Sample page: https://docs-preview.pytorch.org/pytorch/pytorch/152637/draft_export.html

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152637
Approved by: https://github.com/zou3519, https://github.com/svekars
2025-05-07 01:12:45 +00:00
Ti-Tai Wang
5fa5017479 [ONNX] Suggest users setting dynamo=True when exporting (#152478)
Fixes #152025

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152478
Approved by: https://github.com/justinchuby
2025-05-06 23:18:11 +00:00
Laith Sakka
376529c78b consolidate guard_or_x and definitely_x (#152463)
`definitely_true` is almost the same as `guard_or_false`; the potential differences are not meaningful enough to justify the existence of both.
The same goes for `definitely_false`, which can be expressed with `guard_or_true` and `guard_or_false`.
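
As a rough illustration of why the two families overlap, here is a toy model in plain Python. It is an assumption-laden sketch rather than the PyTorch code: `None` stands for "cannot be decided statically", the four functions are simplified stand-ins that share names with the real ones, and guard installation is ignored entirely.

```python
from typing import Optional

# None models an expression whose truth value is unknown at trace time.
Tri = Optional[bool]


def guard_or_false(x: Tri) -> bool:
    """Return the known value, falling back to False when unknown."""
    return False if x is None else x


def guard_or_true(x: Tri) -> bool:
    """Return the known value, falling back to True when unknown."""
    return True if x is None else x


def definitely_true(x: Tri) -> bool:
    """True only when the expression is known to be True."""
    return x is True


def definitely_false(x: Tri) -> bool:
    """True only when the expression is known to be False."""
    return x is False


for x in (True, False, None):
    # In this toy model the pairs agree on every input.
    assert definitely_true(x) == guard_or_false(x)
    assert definitely_false(x) == (not guard_or_true(x))
```

In this simplified setting `definitely_true(x)` matches `guard_or_false(x)` and `definitely_false(x)` matches `not guard_or_true(x)`, which is one reading of the consolidation described above.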

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152463
Approved by: https://github.com/bobrenjc93
2025-05-02 18:08:11 +00:00
Huy Do
3f10091d3c Clean up conda usage in benchmark scripts (#152552)
Fixes https://github.com/pytorch/pytorch/issues/152123.

* Switch `benchmarks/dynamo/Makefile` to use uv.  Note that these scripts are only used locally, so it's kind of ok to keep conda here IMO.  But switching to uv is probably nicer to most folks.
* Delete some files that are outdated and not used anymore

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152552
Approved by: https://github.com/atalman, https://github.com/albanD
2025-04-30 21:27:29 +00:00
Svetlana Karslioglu
e58c73be44 Add latex settings (#152350)
- Fixes #147027
- Only lualatex can build our 3K pages PDF with reasonable quality, xelatex runs out of memory and pdflatex just fails.
- Move notes under the same toctree as python-api which is needed for the PDF but doesn't change how the HTML is generated.

This is the produced PDF:
[pytorch.pdf](https://github.com/user-attachments/files/19945450/pytorch.pdf)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152350
Approved by: https://github.com/albanD
2025-04-29 19:28:43 +00:00
Zizeng Meng
861945100e [Kineto] Enable OOM observer (#152160)
Summary:
# Context:
When a memory leak happens, it usually triggers the OOM in later iterations. The snapshot of a full iteration will be huge and hard to interpret.
On the CUDA side, an OOM observer is provided that generates a snapshot with the latest 1,500,000 entries when an OOM happens, for debugging.

In this diff, we want to implement the feature on the MTIA side.

Test Plan:
Run this test with last diff in the stack.
```
buck run @//mode/opt  kineto/libkineto/fb/mtia/integration_tests:mtia_memory_auto_trace_test
```

As shown, the memory_snapshot is generated when the OOM happens
Log: P1794792326
Snapshot: https://fburl.com/pytorch_memory_visualizer/lx73y6s3 {F1977402355}

Differential Revision: D71993315

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152160
Approved by: https://github.com/sraikund16
2025-04-27 15:56:44 +00:00
Anthony Shoumikhin
e2f9759bd0 Fix broken URLs (#152237)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152237
Approved by: https://github.com/huydhn, https://github.com/malfet
2025-04-27 09:56:42 +00:00
Dan Johnson
d22c4cc353 Add option to use mempool on OOM (#151487)
MemPool is a separate pool of memory handled by the caching allocator. This PR adds the option to let the caching allocator try to use this pool as a last resort instead of OOMing by associating a use_on_oom bool with each MemPool.

Usage:
Users can optionally specify a ``use_on_oom`` bool (which is False by default) during MemPool creation. If true, then the CUDACachingAllocator will be able to use memory in this pool as a last resort instead of OOMing.

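```
import torch

# `allocator` is assumed to be an allocator object created elsewhere;
# it is not defined in this snippet.
pool = torch.cuda.MemPool(allocator, use_on_oom=True)
with torch.cuda.use_mem_pool(pool):
    # torch.empty is used here because torch.randn does not support integer dtypes
    a = torch.empty(40 * 1024 * 1024, dtype=torch.uint8, device="cuda")
del a
# at the memory limit, this will succeed by using pool's memory in order to avoid the oom
b = torch.empty(40 * 1024 * 1024, dtype=torch.uint8, device="cuda")
```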

Testing:
```
python test/test_cuda.py -k test_mempool_limited_memory_with_allocator
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151487
Approved by: https://github.com/eqy, https://github.com/syed-ahmed, https://github.com/ngimel
2025-04-26 04:04:57 +00:00
Yu, Guangye
33c75cae0a Add torch.accelerator.device_index as accelerator's device switch context (#148864)
# Motivation
We propose adding support for the Python with statement on `torch.accelerator.device_index` to enable device switching functionality. This enhancement would simplify writing device-agnostic code and provide benefits across all accelerators. Its device-specific counterparts include [`torch.cuda.device`](00199acdb8/torch/cuda/__init__.py (L482)) and  [`torch.cuda._DeviceGuard`](00199acdb8/torch/cuda/__init__.py (L469)).

**Design Philosophy**
It accepts either an `int` or `None` as input. When `None` is passed, no device switch is performed. Supporting `None` is important for compatibility, as it's possible to encounter `None` values from `torch.device.index`.

With this PR, we can write:

```python
src = 0
dst = 1
# Set src to current device
torch.accelerator.set_device_index(src)
with torch.accelerator.device_index(dst):
    # Inside with statement, we set dst to current device
    assert torch.accelerator.get_device_index() == dst
# Here the current device should be src
assert torch.accelerator.get_device_index() == src
```
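
The `None` handling described under Design Philosophy can be pictured with a toy context manager in plain Python. This is a hypothetical stand-in and not the `torch.accelerator` implementation: the module-level `_current_device` variable and the three functions below only imitate the names used above.

```python
from contextlib import contextmanager
from typing import Iterator, Optional

# Hypothetical stand-in for the accelerator's current device index.
_current_device = 0


def get_device_index() -> int:
    return _current_device


def set_device_index(idx: int) -> None:
    global _current_device
    _current_device = idx


@contextmanager
def device_index(idx: Optional[int]) -> Iterator[None]:
    """Switch to `idx` for the duration of the block; do nothing for None."""
    if idx is None:
        yield
        return
    previous = get_device_index()
    set_device_index(idx)
    try:
        yield
    finally:
        # Restore the previous device even if the block raises.
        set_device_index(previous)


set_device_index(0)
with device_index(None):
    assert get_device_index() == 0  # None leaves the current device alone
with device_index(1):
    assert get_device_index() == 1
assert get_device_index() == 0
```

Accepting `None` directly means callers can pass a possibly missing device index without writing their own branch around the `with` statement.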

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148864
Approved by: https://github.com/albanD
2025-04-25 09:45:25 +00:00
Jane Xu
8a9c66bb70 Improve stable library apis per Scott's feedback (#152040)
Following 3 suggestions:
1. inline at::Tensor arg
2. use a unique_ptr of array vs std::vector
3. document the `std::optional<S>()` case

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152040
Approved by: https://github.com/swolchok, https://github.com/albanD
2025-04-24 20:51:03 +00:00
ILCSFNO
bd09d87fdb add Out Notes (#151306)
Fixes #150181
@albanD Could you please have a check?

Build locally without pytorch build:

![Developer-FAQ](https://github.com/user-attachments/assets/351a7e0b-588e-48ae-ad0a-03f427c86e89)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151306
Approved by: https://github.com/albanD
2025-04-24 20:25:09 +00:00
Svetlana Karslioglu
ff075d0815 Update docs dependencies for local build (#151796)
Fixes #151786

- Changed requirements.txt to a symlink to .ci/docker/requirements-docs.txt
- Updated README.md with better doc build instructions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151796
Approved by: https://github.com/malfet
2025-04-24 18:40:42 +00:00
Pian Pawakapan
2ee8de54b1 [dynamic shapes] user-code friendly statically_known_true, has_static_value (#151601)
Fixes #151480

Allows `statically_known_true` in user code, as well as introducing `has_static_value`, returning True if the input has a static bool/float/int value

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151601
Approved by: https://github.com/laithsakka, https://github.com/zou3519, https://github.com/jingsh
2025-04-24 02:53:59 +00:00
Kaiyu Shi
f39a1a43ee Fix typos in meta.rst (#151979)
### Fixes made:
- "allow you to the module" → corrected to "allows you to move the module"

- "allow" → changed to "allows" to agree with the singular subject "method"
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151979
Approved by: https://github.com/colesbury
2025-04-24 01:25:09 +00:00
Syed Tousif Ahmed
334aab0dea Updates NCCLConfig with QOS variable (#151821)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151821
Approved by: https://github.com/kwen2501
2025-04-23 00:03:49 +00:00
Scott Wolchok
2f74cffab2 Remove reinterpret_casts with undefined behavior from stable/library.h (#151595)
There is a list of valid uses of `reinterpret_cast` (see https://en.cppreference.com/w/cpp/language/reinterpret_cast), and the use here was not on the list, hence undefined behavior. Implement what we meant using memcpy, which is well-defined.

Differential Revision: [D73200791](https://our.internmc.facebook.com/intern/diff/D73200791/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151595
Approved by: https://github.com/janeyx99
2025-04-22 20:24:47 +00:00
zeshengzong
fa0f13b90b Fix doc requirements install error (#151787)
Fixes #151786

Make the versions in the docs requirements consistent with the versions in the [CI version file](https://github.com/pytorch/pytorch/blob/main/.ci/docker/requirements-docs.txt), which changed in #149331

### Test Result

![image](https://github.com/user-attachments/assets/f8646c03-116f-4f1c-b017-11b70995626b)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151787
Approved by: https://github.com/malfet
2025-04-22 18:33:44 +00:00
Svetlana Karslioglu
2fb1326483 Add dates to pages (#151602)
re: #150873
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151602
Approved by: https://github.com/albanD
2025-04-21 19:53:55 +00:00
Will Constable
bedefa46a9 Document non-pytorch CUDA memory allocation and how to query it (#150880)
This PR documents the fact that PyTorch does not have visibility into how every CUDA memory allocation happened - it only knows about allocations that went through the PyTorch CUDA allocator.

It also adds a code snippet showing how to use pynvml to query current GPU memory usage.

## Preview
Added a note at the top of "Understanding CUDA Memory Usage" doc:
<img width="732" alt="image" src="https://github.com/user-attachments/assets/69e28d2a-841a-4b1b-b886-e96fb5d76582" />

which links to a section below:
<img width="733" alt="image" src="https://github.com/user-attachments/assets/cab4f252-9ac2-4fc6-a45d-fdb958fc7dbc" />

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150880
Approved by: https://github.com/kwen2501, https://github.com/ngimel
2025-04-18 03:48:54 +00:00
Kashif Rasul
2ed2cb5805 add generalized pareto distribution (GPD) (#135968)
Add the GPD as a distribution class

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135968
Approved by: https://github.com/albanD

Co-authored-by: Alexander März <statmixedmlgit@gmail.com>
2025-04-17 18:51:02 +00:00
Svetlana Karslioglu
cd7bc60e11 Migrate to new theme (#149331)
- Migrate pytorch docs, cpp docs and functorch docs to the pytorch_sphinx_theme2
- Migrate index.rst to markdown and restructure to use high-level horizontal bar sections Python API, Developer Notes
- Added python-api.md which becomes the main container for the API docs. This file will be used to add all api references in the toctree. It would be great to have lint for this file: https://github.com/pytorch/pytorch/issues/150718
- Enabled mermaid sphinx extension and opengraph sphinx extension

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149331
Approved by: https://github.com/malfet, https://github.com/atalman, https://github.com/albanD
2025-04-16 21:35:19 +00:00
Pian Pawakapan
6dddd6520d [dynamic shapes] add sym_and, sym_or (#150456)
This has been pretty helpful for the size-oblivious rewrite. Wanted the variadic args version to avoid `sym_or(a, sym_or(b, sym_or(c, d)))` in favor of `sym_or(a, b, c, d)`. Happy to change this to ban the 1-arg version.

This is better than plain and/or because the whole symbolic expression gets preserved, and if we guard on it or defer as a runtime assert, we preserve all branches.
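
The variadic form can be sketched on plain Python bools as a left fold. This is a simplified illustration rather than the PyTorch implementation: the functions below only share names with the real `sym_and` / `sym_or` and do not handle symbolic values.

```python
import operator
from functools import reduce


def sym_or(x: bool, *others: bool) -> bool:
    """Fold `|` over all arguments, so no operand is skipped by short-circuiting."""
    return reduce(operator.or_, others, x)


def sym_and(x: bool, *others: bool) -> bool:
    """Fold `&` over all arguments, so no operand is skipped by short-circuiting."""
    return reduce(operator.and_, others, x)


a, b, c, d = False, False, True, False
# The flat call gives the same result as the nested spelling.
assert sym_or(a, b, c, d) == sym_or(a, sym_or(b, sym_or(c, d)))
assert sym_and(True, True, False) is False
```

Using `|` and `&` instead of `or` and `and` matters for the point made above: every operand takes part in the combined expression, whereas `or` and `and` stop at the first deciding value.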

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150456
Approved by: https://github.com/laithsakka
2025-04-14 18:18:06 +00:00
fzyzcjy
50abc1ecc4 Super tiny fix typo (#151212)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151212
Approved by: https://github.com/Skylion007
2025-04-14 16:47:40 +00:00
zeshengzong
5eebcb991a Add scripts to generate plots of LRSchedulers (#149189)
Fixes #92007

## Changes

- Add script to generate plots for `lr_scheduler`
- Add plots to `lr_scheduler` docs
- Add example section if it is missing in `lr_scheduler` docs

## Test Result

### LambdaLR

![image](https://github.com/user-attachments/assets/37fc0894-e2ec-48f2-a2d6-3514e51e1ea2)

### MultiplicativeLR

![image](https://github.com/user-attachments/assets/2122b3a0-a4ce-42c7-bb45-559c1fc73e0f)

### StepLR

![image](https://github.com/user-attachments/assets/47bc9d96-4b60-4586-a000-f213583bbe8f)

### MultiStepLR

![image](https://github.com/user-attachments/assets/c822b849-d5be-4b94-aa7a-0017a2c9ff15)

### ConstantLR

![image](https://github.com/user-attachments/assets/83107cdd-7b00-44a6-b09d-e8ee849b4a12)

### LinearLR

![image](https://github.com/user-attachments/assets/60190105-691a-4101-8966-5b0c396093a4)

### ExponentialLR

![image](https://github.com/user-attachments/assets/dfcbcbca-89e5-4a2f-b1bd-33e25d2405ec)

### PolynomialLR

![image](https://github.com/user-attachments/assets/7c3d4fce-c846-40a0-b62e-f3e81c7e08bd)

### CosineAnnealingLR

![image](https://github.com/user-attachments/assets/26712769-dde9-4faa-b61b-e23c51daef50)

### ChainedScheduler

![image](https://github.com/user-attachments/assets/20734a8b-e939-424f-b45a-773f86f020b1)

### SequentialLR

![image](https://github.com/user-attachments/assets/2cd3ed67-2a0a-4c42-9ad2-e0be090d3751)

### ReduceLROnPlateau

![image](https://github.com/user-attachments/assets/b77f641e-4810-450d-b2cd-8b3f134ea188)

### CyclicLR

![image](https://github.com/user-attachments/assets/29b8666f-41b3-45e4-9159-6929074e6108)

### OneCycleLR

![image](https://github.com/user-attachments/assets/d5b683ef-41e8-4ca8-9fe8-0f1e6b433866)

### CosineAnnealingWarmRestarts

![image](https://github.com/user-attachments/assets/1d45ea80-dea8-494d-a8ab-e9cfc94c55d6)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149189
Approved by: https://github.com/janeyx99
2025-04-14 09:53:38 +00:00
Tristan Rice
df4e5294a6 Reapply "ProcessGroupGloo: support lazy_init (#150801)" (#151031)
This reverts commit 73f3d6d9aa.

Reapplies #150801

Test plan:

See #150801

submodule

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151031
Approved by: https://github.com/fduwjj
2025-04-11 01:58:35 +00:00
Will Constable
c9a35c2a6e [C10D] Document object collectives limitations (#150815)
Adds louder warning labels in the doc page and docstring for object
collectives in hopes of raising awareness of several footgun issues
including accidental creation of cuda contexts by serializing and
sending 'device-local' gpu tensors over the object-* apis.

Preview:
<img width="902" alt="image" src="https://github.com/user-attachments/assets/e0c08c70-d8e5-4e15-b3e2-5cd563714f71" />

addresses #150798

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150815
Approved by: https://github.com/kwen2501
2025-04-10 22:48:39 +00:00
PyTorch MergeBot
73f3d6d9aa Revert "ProcessGroupGloo: support lazy_init (#150801)"
This reverts commit f237ee54bf.

Reverted https://github.com/pytorch/pytorch/pull/150801 on behalf of https://github.com/atalman due to failing internally ([comment](https://github.com/pytorch/pytorch/pull/150801#issuecomment-2793161239))
2025-04-10 13:44:31 +00:00
Yu, Guangye
6972255dad Document poison fork note for accelerator APIs (#147507)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147507
Approved by: https://github.com/sraikund16, https://github.com/kwen2501, https://github.com/albanD
2025-04-10 02:37:37 +00:00
Tristan Rice
f237ee54bf ProcessGroupGloo: support lazy_init (#150801)
This adds lazy initialization support to ProcessGroupGloo via `TORCH_GLOO_LAZY_INIT` or via `create_device(..., lazy_init=True)`

This is still a draft PR as there's one race condition when doing coalesced operations that needs to be fixed upstream in Gloo first. Depends on https://github.com/facebookincubator/gloo/pull/427 landing first

This also updates the gloo submodule to include the required changes.

Test plan:

added lazy init test variants

```
pytest -v test/distributed/test_c10d_gloo.py -k Lazy
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150801
Approved by: https://github.com/fduwjj
2025-04-09 19:29:50 +00:00
Antoine Broyelle
886d9acb0d [docs] Add 32-bit complex to the list of dtypes (#144590)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144590
Approved by: https://github.com/janeyx99
2025-04-09 13:10:21 +00:00
zeshengzong
c9c0f8eae3 Add plot for torch.nn.Threshold and torch.nn.GLU (#150171)
Fixes #150170

## Changes

- Add plot for `torch.nn.Threshold` and `torch.nn.GLU`
- Add example output so users can see the results more easily

## Test Result

![image](https://github.com/user-attachments/assets/f6c5bc46-f9b7-4db7-9797-e08d8423d1b3)

![image](https://github.com/user-attachments/assets/ad4e6c84-7b29-44f1-b7bd-9c81e4a92ef8)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150171
Approved by: https://github.com/albanD
2025-04-08 03:55:37 +00:00
ZhaoqiongZ
96f35f55e2 update get start xpu document for v2.7 (#150397)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150397
Approved by: https://github.com/guangyey, https://github.com/EikanWang, https://github.com/atalman

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2025-04-03 18:17:08 +00:00