soulitzer
b3861ac8e7
[reland] Warn if AccumulateGrad stream does not match producer node stream ( #166136 )
...
ghstack-source-id: 59641aa32dc6fd027abf3276017432b693aa71f8
Pull-Request-resolved: https://github.com/pytorch/pytorch/pull/165065
Fixes #ISSUE_NUMBER
Opening a new PR for codev
Pull Request resolved: https://github.com/pytorch/pytorch/pull/166136
Approved by: https://github.com/ngimel
2025-11-01 12:33:48 +00:00
Scott Wolchok
7d16fcf2df
Re-re-re-re-apply "C++-accessible Placements via pybind11 ( #163030 )" ( #166132 )
...
Was reverted (again!) due to a merge conflict that crept in sometime during the "export to github -> land internally -> merge on github" process.
D85096233
Pull Request resolved: https://github.com/pytorch/pytorch/pull/166132
Approved by: https://github.com/Skylion007 , https://github.com/ezyang , https://github.com/malfet
2025-10-27 21:19:32 +00:00
PyTorch MergeBot
75b8295868
Revert "Warn if AccumulateGrad stream does not match producer node stream ( #165065 )"
...
This reverts commit 12f742941d .
Reverted https://github.com/pytorch/pytorch/pull/165065 on behalf of https://github.com/clee2000 due to broke internal builds D85273204 usages of TORCH_API void add need to be updated? ([comment](https://github.com/pytorch/pytorch/pull/165065#issuecomment-3438061854 ))
2025-10-23 17:02:49 +00:00
Eddie Yan
e64a814ae7
[CUDA] Add experimental green context support for SM carveout ( #159104 )
...
Low-level PyTorch APIs should be usable/stable enough at this point but we might move the underlying driver API usage a bit from here...
Built on top of @drisspg 's branch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159104
Approved by: https://github.com/ngimel , https://github.com/malfet , https://github.com/kwen2501
Co-authored-by: drisspg <drisspguessous@gmail.com>
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2025-10-22 21:38:52 +00:00
soulitzer
12f742941d
Warn if AccumulateGrad stream does not match producer node stream ( #165065 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165065
Approved by: https://github.com/ngimel
2025-10-22 17:33:27 +00:00
Yuanyuan Chen
99c8640b5d
[1/N] Change C-style casts to static_cast or reinterpret_cast ( #165750 )
...
This series of changes tries to convert C-style casts into C++ alternatives.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165750
Approved by: https://github.com/Skylion007
2025-10-20 23:27:13 +00:00
PyTorch MergeBot
ab82456c16
Revert "[1/N] Change C-style casts to static_cast or reinterpret_cast ( #165750 )"
...
This reverts commit e1e8491b31 .
Reverted https://github.com/pytorch/pytorch/pull/165750 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](https://github.com/pytorch/pytorch/pull/165750#issuecomment-3422413890 ))
2025-10-20 14:51:58 +00:00
Yuanyuan Chen
e1e8491b31
[1/N] Change C-style casts to static_cast or reinterpret_cast ( #165750 )
...
This series of changes tries to convert C-style casts into C++ alternatives.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165750
Approved by: https://github.com/Skylion007
2025-10-20 04:36:19 +00:00
Yuanyuan Chen
032bed95cd
Various C++ code fixes in LSAN integration ( #165818 )
...
This PR extracts the C++ code fixes from #154584, which were made while enabling LSAN.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165818
Approved by: https://github.com/ezyang
2025-10-18 17:59:23 +00:00
Nikita Shulga
ce109b3f79
Add torch.backends.mkldnn.is_acl_available() method ( #165678 )
...
Reports whether PyTorch was compiled with the Arm Compute Library.
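A minimal sketch of how a caller might query this, assuming only the `torch.backends.mkldnn.is_acl_available()` name given in the commit title. The wrapper `acl_available` is a hypothetical helper, and the fallbacks for a missing PyTorch install or an older build without the method are assumptions, not part of the PR:

```python
def acl_available() -> bool:
    """Return True only if torch is importable and reports ACL support."""
    try:
        import torch
    except ImportError:
        # No PyTorch installed, so there is no ACL-enabled build to report.
        return False
    # Builds that predate this commit lack the method; treat that as False.
    probe = getattr(torch.backends.mkldnn, "is_acl_available", None)
    return bool(probe()) if callable(probe) else False


print("Arm Compute Library available:", acl_available())
```

Probing with `getattr` keeps the check usable across versions, at the cost of not distinguishing "not compiled with ACL" from "method not present".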
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165678
Approved by: https://github.com/Skylion007 , https://github.com/atalman , https://github.com/albanD
ghstack dependencies: #165583 , #165584 , #165676
2025-10-16 22:34:21 +00:00
PyTorch MergeBot
f975bd58af
Revert "Warn if AccumulateGrad stream does not match producer node stream ( #165065 )"
...
This reverts commit a70ef954b9 .
Reverted https://github.com/pytorch/pytorch/pull/165065 on behalf of https://github.com/izaitsevfb due to breaks lint ([comment](https://github.com/pytorch/pytorch/pull/165065#issuecomment-3391387386 ))
2025-10-10 17:29:29 +00:00
soulitzer
a70ef954b9
Warn if AccumulateGrad stream does not match producer node stream ( #165065 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165065
Approved by: https://github.com/ngimel
ghstack dependencies: #162815
2025-10-10 16:46:01 +00:00
FFFrog
5390324984
[CodeClean] Replace std::runtime_error with TORCH_CHECK ( #164129 )
...
As the title states.
**Changes**:
- torch/csrc/Module.cpp
- torch/csrc/utils.cpp
- torch/csrc/stable
- torch/lib/libshm
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164129
Approved by: https://github.com/albanD
2025-10-09 19:01:07 +00:00
Simon Layton
6a7f5c0d21
Add scaled_mm python API, test ( #164142 )
...
Summary:
* Add `torch.nn.functional.scaled_mm` as an abstraction around the C++
methods
* Wraps `torch._scaled_mm_v2` API by default, but user can force use of
the older `torch._scaled_mm` interface.
* Scaled MM tests now run on the new API
Test Plan:
`pytest test/test_scaled_matmul_cuda.py`
Signed-off-by: Simon Layton <simonlaytonmeta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164142
Approved by: https://github.com/drisspg
ghstack dependencies: #164141
2025-10-09 12:43:18 +00:00
Natalia Gimelshein
37c6087334
Add split-K control to cuBLAS reduced-precision settings ( #164766 )
...
## Summary
- add a CuBLASReductionOption enum so the CUDA context can track reduced-precision and split-K options
- extend the Python bindings, backend helpers, and docs to accept an optional allow_splitk argument for fp16/bf16 matmul controls
- update cuBLAS/cuBLASLt call sites plus dynamo guards and tests to respect the new combinations
## Testing
- python test/test_cuda.py TestCuda.test_cublas_allow_fp16_reduced_precision_reduction_get_set -v *(fails: ModuleNotFoundError: No module named 'psutil')*
------
https://chatgpt.com/codex/tasks/task_e_68e404623178832f8a3e1d34e1e175da
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164766
Approved by: https://github.com/malfet , https://github.com/albanD
2025-10-08 18:48:45 +00:00
PyTorch MergeBot
df640df68a
Revert "Reapply "C++-accessible Placements via pybind11 ( #163030 )" ( #164519 )"
...
This reverts commit 8c0bc879b9 .
Reverted https://github.com/pytorch/pytorch/pull/164519 on behalf of https://github.com/malfet due to Still breaks internal workflows ([comment](https://github.com/pytorch/pytorch/pull/164519#issuecomment-3378469432 ))
2025-10-07 19:46:17 +00:00
Scott Wolchok
8c0bc879b9
Reapply "C++-accessible Placements via pybind11 ( #163030 )" ( #164519 )
...
This makes Placement data representation available in C++ via pybind11. Reapply with fix for internal errors.
D83788896
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164519
Approved by: https://github.com/Skylion007 , https://github.com/ezyang
2025-10-06 23:19:14 +00:00
PyTorch MergeBot
331191ce4b
Revert "[BE] Make PyObjectSlot use a global PyInterpreter ( #162659 )"
...
This reverts commit 29cbcbac42 .
Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/izaitsevfb due to reverted internally, see [D83214133](https://www.internalfb.com/diff/D83214133 ) ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3369348172 ))
2025-10-05 21:39:57 +00:00
Yuanyuan Chen
5103ecc5d8
[1/N] Fix clang-tidy readability checks ( #164561 )
...
Check all `.cpp` files except `jit` files for readability thoroughly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164561
Approved by: https://github.com/Skylion007
2025-10-04 09:40:38 +00:00
Lakshay Garg
f006aee601
Speed up FP precision lookup ( #164044 )
...
This commit simplifies the precision lookup and setting logic
by reducing the number of branches and using a custom hash
function. Fixes #161822 . The issue described in #163709 still
persists. This is meant as a short term fix.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164044
Approved by: https://github.com/ngimel , https://github.com/eqy
2025-10-03 21:35:20 +00:00
PyTorch MergeBot
2a7c486750
Revert "Speed up FP precision lookup ( #164044 )"
...
This reverts commit 723ba21393 .
Reverted https://github.com/pytorch/pytorch/pull/164044 on behalf of https://github.com/yangw-dev due to broke internal build In file included from xplat/caffe2/aten/src/ATen/DeviceAccelerator.cpp:1: xplat/caffe2/aten/src/ATen/Context.h:502:38: error: shift count >= width of type [-Werror,-Wshift-count-overflow] 502 | return std::hash<size_t>{}((k1 << 32) | k2); ([comment](https://github.com/pytorch/pytorch/pull/164044#issuecomment-3363016702 ))
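The error quoted above points at `(k1 << 32) | k2`, which suggests a target where the shifted type is only 32 bits wide, so a hard-coded shift of 32 is out of range. The sketch below illustrates the general idea of deriving the shift from the word width instead. It is a Python simulation with hypothetical names (`pack_keys`, `unpack_keys`, `word_bits`), not the fix that was actually landed in PyTorch:

```python
def pack_keys(k1: int, k2: int, word_bits: int) -> int:
    """Pack two small non-negative keys into one word of `word_bits` bits."""
    half = word_bits // 2          # shift by half the word, never the full width
    limit = 1 << half
    if not (0 <= k1 < limit and 0 <= k2 < limit):
        raise ValueError("each key must fit in half a word")
    return (k1 << half) | k2


def unpack_keys(packed: int, word_bits: int) -> tuple[int, int]:
    """Recover the two keys from a value produced by pack_keys."""
    half = word_bits // 2
    return packed >> half, packed & ((1 << half) - 1)


# The same keys round-trip for both a 32-bit and a 64-bit word.
for bits in (32, 64):
    packed = pack_keys(3, 5, bits)
    assert packed < (1 << bits)
    assert unpack_keys(packed, bits) == (3, 5)
```

Because the shift is always half the word width, the packed value stays inside the word on either target, provided each key fits in half a word.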
2025-10-02 21:00:44 +00:00
PyTorch MergeBot
f6f7676756
Revert "C++-accessible Placements via pybind11 ( #163030 )"
...
This reverts commit 3e03deab6f .
Reverted https://github.com/pytorch/pytorch/pull/163030 on behalf of https://github.com/swolchok due to doesn't pass pyre ([comment](https://github.com/pytorch/pytorch/pull/163030#issuecomment-3362450379 ))
2025-10-02 18:25:24 +00:00
PyTorch MergeBot
c6329524d8
Revert "Add magic TORCH_MAKE_PYBIND_ENUM_FASTER macro ( #163527 )"
...
This reverts commit 50c0550f5a .
Reverted https://github.com/pytorch/pytorch/pull/163527 on behalf of https://github.com/swolchok due to breaking import torch in debug builds, see #164297 ([comment](https://github.com/pytorch/pytorch/pull/163527#issuecomment-3361919142 ))
2025-10-02 15:42:42 +00:00
Scott Wolchok
3e03deab6f
C++-accessible Placements via pybind11 ( #163030 )
...
This makes Placement data representation available in C++ via pybind11.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163030
Approved by: https://github.com/ezyang
2025-10-02 02:38:23 +00:00
Lakshay Garg
723ba21393
Speed up FP precision lookup ( #164044 )
...
This commit simplifies the precision lookup and setting logic
by reducing the number of branches and using a custom hash
function. Fixes #161822 . The issue described in #163709 still
persists. This is meant as a short term fix.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164044
Approved by: https://github.com/ngimel , https://github.com/eqy
2025-10-02 00:59:19 +00:00
Han Qi
b5c4f46bb9
Add functions to setup PrivateUse1 as a python backend device. ( #157859 )
...
Fixes #156052 and #156444 .
This PR sets up the PrivateUse1 key in Python so that it can be used as a Python backend for PyTorch.
This means that, after calling `setup_privateuseone_for_python_backend('npy')`, one can use a tensor subclass with that device to hold arbitrary Python data as "device data" and use `torch.library` to register ops that take that Tensor.
Changes done in this PR:
1. Register a vanilla device guard: I extended NoOpDeviceGuard to allow a device index of 0 and to not raise errors when event-related functions are accessed. Without those changes, calling backward raises errors. (The CPU backend uses NoOpDeviceGuard just fine, although there seems to be special treatment of CPU in the autograd engine.)
2. Tensor subclasses are allowed to omit `__torch_dispatch__` if the device is not CUDA or CPU. The comment on the check suggests it was there to avoid a segfault when calling into ops that expect a storage. Here we have a different device, so we will not call into those ops.
3. Add a Python function that invokes the other incantations to set up the PrivateUse1 backend.
This took inspiration from https://github.com/bdhirsh/pytorch_open_registration_example and https://github.com/tinygrad/tinygrad/blob/master/extra/torch_backend/wrapped_tensor.cpp ; great thanks to @bdhirsh and @geohot.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157859
Approved by: https://github.com/albanD
2025-10-01 21:32:59 +00:00
PyTorch MergeBot
410ed3006b
Revert "Add functions to setup PrivateUse1 as a python backend device. ( #157859 )"
...
This reverts commit 1310d6a1f9 .
Reverted https://github.com/pytorch/pytorch/pull/157859 on behalf of https://github.com/jeanschmidt due to introduce linting errors ([comment](https://github.com/pytorch/pytorch/pull/157859#issuecomment-3352140098 ))
2025-09-30 13:24:37 +00:00
Han Qi
1310d6a1f9
Add functions to setup PrivateUse1 as a python backend device. ( #157859 )
...
Fixes #156052 and #156444 .
This PR sets up the PrivateUse1 key in Python so that it can be used as a Python backend for PyTorch.
This means that, after calling `setup_privateuseone_for_python_backend('npy')`, one can use a tensor subclass with that device to hold arbitrary Python data as "device data" and use `torch.library` to register ops that take that Tensor.
Changes done in this PR:
1. Register a vanilla device guard: I extended NoOpDeviceGuard to allow a device index of 0 and to not raise errors when event-related functions are accessed. Without those changes, calling backward raises errors. (The CPU backend uses NoOpDeviceGuard just fine, although there seems to be special treatment of CPU in the autograd engine.)
2. Tensor subclasses are allowed to omit `__torch_dispatch__` if the device is not CUDA or CPU. The comment on the check suggests it was there to avoid a segfault when calling into ops that expect a storage. Here we have a different device, so we will not call into those ops.
3. Add a Python function that invokes the other incantations to set up the PrivateUse1 backend.
This took inspiration from https://github.com/bdhirsh/pytorch_open_registration_example and https://github.com/tinygrad/tinygrad/blob/master/extra/torch_backend/wrapped_tensor.cpp ; great thanks to @bdhirsh and @geohot.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157859
Approved by: https://github.com/albanD
2025-09-30 08:39:36 +00:00
Scott Wolchok
50c0550f5a
Add magic TORCH_MAKE_PYBIND_ENUM_FASTER macro ( #163527 )
...
See comment on the macro definition. In short, pybind11 3.x
added `py::native_enum`, and also had to add overhead for that new way
to bind enums on the critical path for calling functions that take
regular old `py::enum_`s as arguments (for example, `__eq__`).
Differential Revision: [D82873169](https://our.internmc.facebook.com/intern/diff/D82873169/ )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163527
Approved by: https://github.com/ezyang
2025-09-26 17:59:22 +00:00
PyTorch MergeBot
00059db034
Revert "[RELAND] Always build USE_DISTRIBUTED ( #160449 ) and Make distributed modules importable even when backend not built ( #159889 ) ( #162594 )"
...
This reverts commit 09cb34c1dc .
Reverted https://github.com/pytorch/pytorch/pull/162594 on behalf of https://github.com/malfet due to reverted internally and now can be safely reverted in OSS ([comment](https://github.com/pytorch/pytorch/pull/162594#issuecomment-3334176367 ))
2025-09-25 13:47:46 +00:00
Brian Hirsh
7d710403b0
Reapply "Make functionalization ViewMeta serializable with pickle. ( #143712 )" ( #163769 )
...
### Summary:
NOTE: This is a re-export of https://github.com/pytorch/pytorch/pull/161994 ; the changes between these two PRs are exclusively to the buck/build files
(Summary from #161994 )
Attempted rebase of https://github.com/pytorch/pytorch/pull/143712 .
This reverts commit 6c713ccb5e .
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames Lucaskabela
imported-using-ghimport
Test Plan: Imported from OSS
Differential Revision: D81524507
Pulled By: Lucaskabela
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163769
Approved by: https://github.com/dolpm
Co-authored-by: Brian Hirsh <hirsheybar@fb.com>
2025-09-25 10:27:37 +00:00
PaliC
29cbcbac42
[BE] Make PyObjectSlot use a global PyInterpreter ( #162659 )
...
This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process
Gonna ask for review from @huydhn as there are some changes to CI.
Testing: imported internally and the failed android build seems to work now!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659
Approved by: https://github.com/albanD , https://github.com/huydhn
2025-09-25 08:53:19 +00:00
Edward Yang
09cb34c1dc
[RELAND] Always build USE_DISTRIBUTED ( #160449 ) and Make distributed modules importable even when backend not built ( #159889 ) ( #162594 )
...
Summary:
Original: D81957844 and D81957923
Also, https://github.com/pytorch/pytorch/pull/162142 is patched in as well
#buildall
Test Plan:
sandcastle and oss ci
Rollback Plan:
Reviewed By: H-Huang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162594
Approved by: https://github.com/H-Huang , https://github.com/dcci
2025-09-22 21:12:18 +00:00
PyTorch MergeBot
edafc902d7
Revert "[BE] Make PyObjectSlot use a global PyInterpreter ( #162659 )"
...
This reverts commit d1993c27ae .
Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/wdvr due to reverted internally, please see D82771705 @PaliC ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3317110247 ))
2025-09-22 06:22:37 +00:00
PyTorch MergeBot
f0078941cf
Revert "[RELAND] Always build USE_DISTRIBUTED ( #160449 ) and Make distributed modules importable even when backend not built ( #159889 ) ( #162594 )"
...
This reverts commit 6c334885d4 .
Reverted https://github.com/pytorch/pytorch/pull/162594 on behalf of https://github.com/wdvr due to reverted internally - @ezyang see D82281294 ([comment](https://github.com/pytorch/pytorch/pull/162594#issuecomment-3317017530 ))
2025-09-22 05:39:07 +00:00
Sherlock Huang
033b7d1e1a
[Reland] Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available ( #163187 )
...
Reland of #160532
Summary:
This supports exporting a CUDA model on a CPU-only machine under fake tensor mode. Users commonly need to move sample inputs to the CUDA device with a .to("cuda:0") or .to("cuda") call, and this diff supports that.
I expect the following pattern to work:
```
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

# `module` and `sample_inputs` are supplied by the caller.
with FakeTensorMode(allow_non_fake_inputs=True):
    cuda_module = module.to("cuda:0")
    cuda_sample_inputs = tuple([x.to("cuda:0") for x in sample_inputs])
    with torch.no_grad():
        ep = torch.export.export(cuda_module, cuda_sample_inputs)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163016
Approved by: https://github.com/huydhn
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163187
Approved by: https://github.com/angelayi
2025-09-18 04:46:26 +00:00
Sahan Paliskara
d1993c27ae
[BE] Make PyObjectSlot use a global PyInterpreter ( #162659 )
...
This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process
Gonna ask for review from @huydhn as there are some changes to CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659
Approved by: https://github.com/albanD , https://github.com/huydhn
2025-09-17 16:40:55 +00:00
PyTorch MergeBot
79fd497423
Revert "[Reland] Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build ( #163016 )"
...
This reverts commit f1eb99e2e4 .
Reverted https://github.com/pytorch/pytorch/pull/163016 on behalf of https://github.com/jeffdaily due to broke rocm CI, see export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_fake_export_nonzero_cuda_float32 [GH job link](https://github.com/pytorch/pytorch/actions/runs/17787208381/job/50564369696 ) [HUD commit link](f1eb99e2e4 ) ([comment](https://github.com/pytorch/pytorch/pull/163016#issuecomment-3303707552 ))
2025-09-17 16:17:53 +00:00
Sherlock Huang
f1eb99e2e4
[Reland] Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build ( #163016 )
...
Reland of #160532
Summary:
This supports exporting a CUDA model on a CPU-only machine under fake tensor mode.
Users commonly need to move sample inputs to the CUDA device with a .to("cuda:0") or .to("cuda") call.
This diff supports that.
I expect the following pattern to work:
```
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

# `module` and `sample_inputs` are supplied by the caller.
with FakeTensorMode(allow_non_fake_inputs=True):
    cuda_module = module.to("cuda:0")
    cuda_sample_inputs = tuple([x.to("cuda:0") for x in sample_inputs])
    with torch.no_grad():
        ep = torch.export.export(cuda_module, cuda_sample_inputs)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163016
Approved by: https://github.com/huydhn
2025-09-17 05:01:33 +00:00
PyTorch MergeBot
4db203f875
Revert "[BE] Make PyObjectSlot use a global PyInterpreter ( #162659 )"
...
This reverts commit 05ee8114f8 .
Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/jeanschmidt due to seems to have introduced errors in linting see https://github.com/pytorch/pytorch/actions/runs/17750689989/job/50444910643 ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3298626136 ))
2025-09-16 12:52:57 +00:00
PaliC
05ee8114f8
[BE] Make PyObjectSlot use a global PyInterpreter ( #162659 )
...
This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process
Gonna ask for review from @huydhn as there are some changes to CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659
Approved by: https://github.com/albanD , https://github.com/huydhn
2025-09-16 00:37:09 +00:00
PyTorch MergeBot
9c93dc8123
Revert "Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build ( #160532 )"
...
This reverts commit a956c4ab1c .
Reverted https://github.com/pytorch/pytorch/pull/160532 on behalf of https://github.com/huydhn due to Reverted internally ([comment](https://github.com/pytorch/pytorch/pull/160532#issuecomment-3287745165 ))
2025-09-13 07:42:12 +00:00
Sherlock Huang
a956c4ab1c
Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build ( #160532 )
...
Summary:
This supports exporting a CUDA model on a CPU-only machine under fake tensor mode.
Users commonly need to move sample inputs to the CUDA device with a .to("cuda:0") or .to("cuda") call.
This diff supports that.
I expect the following pattern to work:
```
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

# `module` and `sample_inputs` are supplied by the caller.
with FakeTensorMode(allow_non_fake_inputs=True):
    cuda_module = module.to("cuda:0")
    cuda_sample_inputs = tuple([x.to("cuda:0") for x in sample_inputs])
    with torch.no_grad():
        ep = torch.export.export(cuda_module, cuda_sample_inputs)
```
Test Plan:
CI
Rollback Plan:
Differential Revision: D80181887
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160532
Approved by: https://github.com/henryoier , https://github.com/ezyang
2025-09-13 01:50:51 +00:00
Edward Yang
6c334885d4
[RELAND] Always build USE_DISTRIBUTED ( #160449 ) and Make distributed modules importable even when backend not built ( #159889 ) ( #162594 )
...
Summary:
Original: D81957844 and D81957923
Also, https://github.com/pytorch/pytorch/pull/162142 is patched in as well
#buildall
Test Plan:
sandcastle and oss ci
Rollback Plan:
Reviewed By: H-Huang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162594
Approved by: https://github.com/H-Huang , https://github.com/dcci
2025-09-12 10:54:42 +00:00
PyTorch MergeBot
6b59a19242
Revert "[RELAND] Always build USE_DISTRIBUTED ( #160449 ) and Make distributed modules importable even when backend not built ( #159889 ) ( #162594 )"
...
This reverts commit 6e8f17c580 .
Reverted https://github.com/pytorch/pytorch/pull/162594 on behalf of https://github.com/huydhn due to Reverted internally ([comment](https://github.com/pytorch/pytorch/pull/162594#issuecomment-3283985880 ))
2025-09-12 06:52:03 +00:00
Edward Yang
6e8f17c580
[RELAND] Always build USE_DISTRIBUTED ( #160449 ) and Make distributed modules importable even when backend not built ( #159889 ) ( #162594 )
...
Summary:
Original: D81957844 and D81957923
Also, https://github.com/pytorch/pytorch/pull/162142 is patched in as well
#buildall
Test Plan:
sandcastle and oss ci
Rollback Plan:
Reviewed By: H-Huang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162594
Approved by: https://github.com/H-Huang , https://github.com/dcci
2025-09-12 03:56:18 +00:00
Edward Yang
dda071587f
Revert "Make distributed modules importable even when backend not built ( #159889 )" ( #162568 )
...
This reverts commit a0d026688c .
Revert "Always build USE_DISTRIBUTED. (#160449 )"
This reverts commit d80297a684 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162568
Approved by: https://github.com/huydhn
2025-09-10 04:29:42 +00:00
Edward Yang
d80297a684
Always build USE_DISTRIBUTED. ( #160449 )
...
Signed-off-by: Edward Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab , https://github.com/albanD , https://github.com/dcci
2025-09-08 19:10:36 +00:00
PyTorch MergeBot
1e0656f063
Revert "Always build USE_DISTRIBUTED. ( #160449 )"
...
This reverts commit de893e96c7 .
Reverted https://github.com/pytorch/pytorch/pull/160449 on behalf of https://github.com/jeanschmidt due to internal changes breaks import checks, see [D81845053](https://www.internalfb.com/diff/D81845053 ) ([comment](https://github.com/pytorch/pytorch/pull/160449#issuecomment-3264887002 ))
2025-09-08 07:04:36 +00:00
Edward Yang
de893e96c7
Always build USE_DISTRIBUTED. ( #160449 )
...
Signed-off-by: Edward Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449
Approved by: https://github.com/wconstab , https://github.com/albanD , https://github.com/dcci
2025-09-05 20:15:11 +00:00