pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 00:20:18 +01:00

Author	SHA1	Message	Date
soulitzer	b3861ac8e7	[reland] Warn if AccumulateGrad stream does not match producer node stream (#166136 ) ghstack-source-id: 59641aa32dc6fd027abf3276017432b693aa71f8 Pull-Request-resolved: https://github.com/pytorch/pytorch/pull/165065 Fixes #ISSUE_NUMBER Opening a new PR for codev Pull Request resolved: https://github.com/pytorch/pytorch/pull/166136 Approved by: https://github.com/ngimel	2025-11-01 12:33:48 +00:00
Scott Wolchok	7d16fcf2df	Re-re-re-re-apply "C++-accessible Placements via pybind11 (#163030 )" (#166132 ) Was reverted (again!) due to a merge conflict that crept in sometime during the "export to github -> land internally -> merge on github" process. D85096233 Pull Request resolved: https://github.com/pytorch/pytorch/pull/166132 Approved by: https://github.com/Skylion007, https://github.com/ezyang, https://github.com/malfet	2025-10-27 21:19:32 +00:00
PyTorch MergeBot	75b8295868	Revert "Warn if AccumulateGrad stream does not match producer node stream (#165065 )" This reverts commit `12f742941d`. Reverted https://github.com/pytorch/pytorch/pull/165065 on behalf of https://github.com/clee2000 due to broke internal builds D85273204 usages of TORCH_API void add need to be updated? ([comment](https://github.com/pytorch/pytorch/pull/165065#issuecomment-3438061854))	2025-10-23 17:02:49 +00:00
Eddie Yan	e64a814ae7	[CUDA] Add experimental green context support for SM carveout (#159104 ) Low-level PyTorch APIs should be usable/stable enough at this point but we might move the underlying driver API usage a bit from here... Built on top of @drisspg 's branch Pull Request resolved: https://github.com/pytorch/pytorch/pull/159104 Approved by: https://github.com/ngimel, https://github.com/malfet, https://github.com/kwen2501 Co-authored-by: drisspg <drisspguessous@gmail.com> Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>	2025-10-22 21:38:52 +00:00
soulitzer	12f742941d	Warn if AccumulateGrad stream does not match producer node stream (#165065 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/165065 Approved by: https://github.com/ngimel	2025-10-22 17:33:27 +00:00
Yuanyuan Chen	99c8640b5d	[1/N] Change C-style casts to static_cast or reinterpret_cast (#165750 ) This series of changes try to cover C style casts into C++ alternatives. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165750 Approved by: https://github.com/Skylion007	2025-10-20 23:27:13 +00:00
PyTorch MergeBot	ab82456c16	Revert "[1/N] Change C-style casts to static_cast or reinterpret_cast (#165750 )" This reverts commit `e1e8491b31`. Reverted https://github.com/pytorch/pytorch/pull/165750 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](https://github.com/pytorch/pytorch/pull/165750#issuecomment-3422413890))	2025-10-20 14:51:58 +00:00
Yuanyuan Chen	e1e8491b31	[1/N] Change C-style casts to static_cast or reinterpret_cast (#165750 ) This series of changes try to cover C style casts into C++ alternatives. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165750 Approved by: https://github.com/Skylion007	2025-10-20 04:36:19 +00:00
Yuanyuan Chen	032bed95cd	Various C++ code fixes in LSAN integration (#165818 ) This PR extracts the C++ code fixes from #154584, which are fixes in enabling LSAN. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165818 Approved by: https://github.com/ezyang	2025-10-18 17:59:23 +00:00
Nikita Shulga	ce109b3f79	Add `torch.backends.mkldnn.is_acl_available()` method (#165678 ) That tells whether or not PyTorch was compiled with Arm Compute Library Pull Request resolved: https://github.com/pytorch/pytorch/pull/165678 Approved by: https://github.com/Skylion007, https://github.com/atalman, https://github.com/albanD ghstack dependencies: #165583, #165584, #165676	2025-10-16 22:34:21 +00:00
PyTorch MergeBot	f975bd58af	Revert "Warn if AccumulateGrad stream does not match producer node stream (#165065 )" This reverts commit `a70ef954b9`. Reverted https://github.com/pytorch/pytorch/pull/165065 on behalf of https://github.com/izaitsevfb due to breaks lint ([comment](https://github.com/pytorch/pytorch/pull/165065#issuecomment-3391387386))	2025-10-10 17:29:29 +00:00
soulitzer	a70ef954b9	Warn if AccumulateGrad stream does not match producer node stream (#165065 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/165065 Approved by: https://github.com/ngimel ghstack dependencies: #162815	2025-10-10 16:46:01 +00:00
FFFrog	5390324984	[CodeClean] Replace std::runtime_error with TORCH_CHECK (#164129 ) As the title stated. Changes: - torch/csrc/Module.cpp - torch/csrc/utils.cpp - torch/csrc/stable - torch/lib/libshm Pull Request resolved: https://github.com/pytorch/pytorch/pull/164129 Approved by: https://github.com/albanD	2025-10-09 19:01:07 +00:00
Simon Layton	6a7f5c0d21	Add scaled_mm python API, test (#164142 ) Summary: * Add `torch.nn.functional.scaled_mm` as an abstraction around the C++ methods * Wraps `torch._scaled_mm_v2` API by default, but user can force use of the older `torch._scaled_mm` interface. * Scaled MM tests now run on the new API Test Plan: `pytest test/test_scaled_matmul_cuda.py` Reviewers: Subscribers: Tasks: Tags: Signed-off-by: Simon Layton <simonlaytonmeta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/164142 Approved by: https://github.com/drisspg ghstack dependencies: #164141	2025-10-09 12:43:18 +00:00
Natalia Gimelshein	37c6087334	Add split-K control to cuBLAS reduced-precision settings (#164766 ) ## Summary - add a CuBLASReductionOption enum so the CUDA context can track reduced-precision and split-K options - extend the Python bindings, backend helpers, and docs to accept an optional allow_splitk argument for fp16/bf16 matmul controls - update cuBLAS/cuBLASLt call sites plus dynamo guards and tests to respect the new combinations ## Testing - python test/test_cuda.py TestCuda.test_cublas_allow_fp16_reduced_precision_reduction_get_set -v (fails: ModuleNotFoundError: No module named 'psutil') ------ https://chatgpt.com/codex/tasks/task_e_68e404623178832f8a3e1d34e1e175da Pull Request resolved: https://github.com/pytorch/pytorch/pull/164766 Approved by: https://github.com/malfet, https://github.com/albanD	2025-10-08 18:48:45 +00:00
PyTorch MergeBot	df640df68a	Revert "Reapply "C++-accessible Placements via pybind11 (#163030 )" (#164519 )" This reverts commit `8c0bc879b9`. Reverted https://github.com/pytorch/pytorch/pull/164519 on behalf of https://github.com/malfet due to Still breaks internal workflows ([comment](https://github.com/pytorch/pytorch/pull/164519#issuecomment-3378469432))	2025-10-07 19:46:17 +00:00
Scott Wolchok	8c0bc879b9	Reapply "C++-accessible Placements via pybind11 (#163030 )" (#164519 ) This makes Placement data representation available in C++ via pybind11. Reapply with fix for internal errors. D83788896 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164519 Approved by: https://github.com/Skylion007, https://github.com/ezyang	2025-10-06 23:19:14 +00:00
PyTorch MergeBot	331191ce4b	Revert "[BE] Make PyObjectSlot use a global PyInterpreter (#162659 )" This reverts commit `29cbcbac42`. Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/izaitsevfb due to reverted internally, see [D83214133](https://www.internalfb.com/diff/D83214133) ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3369348172))	2025-10-05 21:39:57 +00:00
Yuanyuan Chen	5103ecc5d8	[1/N] Fix clang-tidy readability checks (#164561 ) Check all `.cpp` files except `jit` files for readability thoroughly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164561 Approved by: https://github.com/Skylion007	2025-10-04 09:40:38 +00:00
Lakshay Garg	f006aee601	Speed up FP precision lookup (#164044 ) This commit simplifies the precision lookup and setting logic by reducing the number of branches and using a custom hash function. Fixes #161822. The issue described in #163709 still persists. This is meant as a short term fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164044 Approved by: https://github.com/ngimel, https://github.com/eqy	2025-10-03 21:35:20 +00:00
PyTorch MergeBot	2a7c486750	Revert "Speed up FP precision lookup (#164044 )" This reverts commit `723ba21393`. Reverted https://github.com/pytorch/pytorch/pull/164044 on behalf of https://github.com/yangw-dev due to broke internal build In file included from xplat/caffe2/aten/src/ATen/DeviceAccelerator.cpp:1: xplat/caffe2/aten/src/ATen/Context.h:502:38: error: shift count >= width of type [-Werror,-Wshift-count-overflow] 502 | return std::hash<size_t>{}((k1 << 32) | k2); ([comment](https://github.com/pytorch/pytorch/pull/164044#issuecomment-3363016702))	2025-10-02 21:00:44 +00:00
PyTorch MergeBot	f6f7676756	Revert "C++-accessible Placements via pybind11 (#163030 )" This reverts commit `3e03deab6f`. Reverted https://github.com/pytorch/pytorch/pull/163030 on behalf of https://github.com/swolchok due to doesn't pass pyre ([comment](https://github.com/pytorch/pytorch/pull/163030#issuecomment-3362450379))	2025-10-02 18:25:24 +00:00
PyTorch MergeBot	c6329524d8	Revert "Add magic TORCH_MAKE_PYBIND_ENUM_FASTER macro (#163527 )" This reverts commit `50c0550f5a`. Reverted https://github.com/pytorch/pytorch/pull/163527 on behalf of https://github.com/swolchok due to breaking import torch in debug builds, see #164297 ([comment](https://github.com/pytorch/pytorch/pull/163527#issuecomment-3361919142))	2025-10-02 15:42:42 +00:00
Scott Wolchok	3e03deab6f	C++-accessible Placements via pybind11 (#163030 ) This makes Placement data representation available in C++ via pybind11. Pull Request resolved: https://github.com/pytorch/pytorch/pull/163030 Approved by: https://github.com/ezyang	2025-10-02 02:38:23 +00:00
Lakshay Garg	723ba21393	Speed up FP precision lookup (#164044 ) This commit simplifies the precision lookup and setting logic by reducing the number of branches and using a custom hash function. Fixes #161822. The issue described in #163709 still persists. This is meant as a short term fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164044 Approved by: https://github.com/ngimel, https://github.com/eqy	2025-10-02 00:59:19 +00:00
Han Qi	b5c4f46bb9	Add functions to setup PrivateUse1 as a python backend device. (#157859 ) Fixes #156052 and #156444. This PR setup the privateuseone key in Python to be used as a python backend for pytorch. Meaning that, after calling `setup_privateuseone_for_python_backend('npy')`, one can use a subclass to with that device to hold arbitrary python data as "device data" and use `torch.library` to register ops that takes that Tensor. Changes done in this PR: 1. Register an vanilla Device Guard: I extended NoOpDeviceGuard to have allow device index of 0 and to not raise errors when event related functions are accessed. If I don't do those, when calling backward I would get errors. (CPU backend uses NoOpDeviceGuard just fine, although there seems to be special treatment of CPU in the autograd engine. 2. Tensor subclass allows not having `__torch_dispatch__` if the device is not CUDA or CPU. The comment of the check suggests it was to avoid segfault when calling into ops that expects a storage. Here we have a different device so will not call into those ops. 3. python function that invokes the other incantations to setup the privateusekey backend. This took inspiration of https://github.com/bdhirsh/pytorch_open_registration_example and https://github.com/tinygrad/tinygrad/blob/master/extra/torch_backend/wrapped_tensor.cpp; great thanks to @bdhirsh and @geohot. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157859 Approved by: https://github.com/albanD	2025-10-01 21:32:59 +00:00
PyTorch MergeBot	410ed3006b	Revert "Add functions to setup PrivateUse1 as a python backend device. (#157859 )" This reverts commit `1310d6a1f9`. Reverted https://github.com/pytorch/pytorch/pull/157859 on behalf of https://github.com/jeanschmidt due to introduce linting errors ([comment](https://github.com/pytorch/pytorch/pull/157859#issuecomment-3352140098))	2025-09-30 13:24:37 +00:00
Han Qi	1310d6a1f9	Add functions to setup PrivateUse1 as a python backend device. (#157859 ) Fixes #156052 and #156444. This PR setup the privateuseone key in Python to be used as a python backend for pytorch. Meaning that, after calling `setup_privateuseone_for_python_backend('npy')`, one can use a subclass to with that device to hold arbitrary python data as "device data" and use `torch.library` to register ops that takes that Tensor. Changes done in this PR: 1. Register an vanilla Device Guard: I extended NoOpDeviceGuard to have allow device index of 0 and to not raise errors when event related functions are accessed. If I don't do those, when calling backward I would get errors. (CPU backend uses NoOpDeviceGuard just fine, although there seems to be special treatment of CPU in the autograd engine. 2. Tensor subclass allows not having `__torch_dispatch__` if the device is not CUDA or CPU. The comment of the check suggests it was to avoid segfault when calling into ops that expects a storage. Here we have a different device so will not call into those ops. 3. python function that invokes the other incantations to setup the privateusekey backend. This took inspiration of https://github.com/bdhirsh/pytorch_open_registration_example and https://github.com/tinygrad/tinygrad/blob/master/extra/torch_backend/wrapped_tensor.cpp; great thanks to @bdhirsh and @geohot. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157859 Approved by: https://github.com/albanD	2025-09-30 08:39:36 +00:00
Scott Wolchok	50c0550f5a	Add magic TORCH_MAKE_PYBIND_ENUM_FASTER macro (#163527 ) See comment on the macro definition. In short, pybind11 3.x added `py::native_enum`, and also had to add overhead for that new way to bind enums on the critical path for calling functions that take regular old `py::enum_`s as arguments (for example, `__eq__`). Differential Revision: [D82873169](https://our.internmc.facebook.com/intern/diff/D82873169/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/163527 Approved by: https://github.com/ezyang	2025-09-26 17:59:22 +00:00
PyTorch MergeBot	00059db034	Revert "[RELAND] Always build USE_DISTRIBUTED (#160449 ) and Make distributed modules importable even when backend not built (#159889 ) (#162594 )" This reverts commit `09cb34c1dc`. Reverted https://github.com/pytorch/pytorch/pull/162594 on behalf of https://github.com/malfet due to reverted internally and now can be safely reverted in OSS ([comment](https://github.com/pytorch/pytorch/pull/162594#issuecomment-3334176367))	2025-09-25 13:47:46 +00:00
Brian Hirsh	7d710403b0	Reapply "Make functionalization `ViewMeta` serializable with pickle. (#143712 )" (#163769 ) ### Summary: NOTE: This is a re-export of https://github.com/pytorch/pytorch/pull/161994 ; the changes between these two PRs is exclusively to the buck/build files (Summary from #161994 ) Attempted rebase of https://github.com/pytorch/pytorch/pull/143712. This reverts commit `6c713ccb5e`. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames Lucaskabela imported-using-ghimport Test Plan: Imported from OSS Differential Revision: D81524507 Pulled By: Lucaskabela Pull Request resolved: https://github.com/pytorch/pytorch/pull/163769 Approved by: https://github.com/dolpm Co-authored-by: Brian Hirsh <hirsheybar@fb.com>	2025-09-25 10:27:37 +00:00
PaliC	29cbcbac42	[BE] Make PyObjectSlot use a global PyInterpreter (#162659 ) This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process Gonna ask for review from @huydhn as there are some changes to CI. Testing: imported internally and the failed android build seems to work now! Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659 Approved by: https://github.com/albanD, https://github.com/huydhn	2025-09-25 08:53:19 +00:00
Edward Yang	09cb34c1dc	[RELAND] Always build USE_DISTRIBUTED (#160449 ) and Make distributed modules importable even when backend not built (#159889 ) (#162594 ) Summary: Original: D81957844 and D81957923 Also, https://github.com/pytorch/pytorch/pull/162142 is patched in as well #buildall Test Plan: sandcastle and oss ci Rollback Plan: Reviewed By: H-Huang Pull Request resolved: https://github.com/pytorch/pytorch/pull/162594 Approved by: https://github.com/H-Huang, https://github.com/dcci	2025-09-22 21:12:18 +00:00
PyTorch MergeBot	edafc902d7	Revert "[BE] Make PyObjectSlot use a global PyInterpreter (#162659 )" This reverts commit `d1993c27ae`. Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/wdvr due to reverted internally, please see D82771705 @PaliC ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3317110247))	2025-09-22 06:22:37 +00:00
PyTorch MergeBot	f0078941cf	Revert "[RELAND] Always build USE_DISTRIBUTED (#160449 ) and Make distributed modules importable even when backend not built (#159889 ) (#162594 )" This reverts commit `6c334885d4`. Reverted https://github.com/pytorch/pytorch/pull/162594 on behalf of https://github.com/wdvr due to reverted internally - @ezyang see D82281294 ([comment](https://github.com/pytorch/pytorch/pull/162594#issuecomment-3317017530))	2025-09-22 05:39:07 +00:00
Sherlock Huang	033b7d1e1a	[Reland] Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available (#163187 ) Reland of #160532 Summary: To support exporting a cuda model on a CPU-only machine under fake tensor mode. User commonly need to move sample inputs to the cuda device with .to("cuda:0") or .to("cuda") call. This diff supports this. I expect the following pattern to work ``` with FakeTensorMode(allow_non_fake_inputs=True): cuda_module = module.to("cuda:0") cuda_sample_inputs = tuple([x.to("cuda:0") for x in sample_inputs]) with torch.no_grad(): ep = torch.export.export(cuda_module, cuda_sample_inputs) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/163016 Approved by: https://github.com/huydhn Pull Request resolved: https://github.com/pytorch/pytorch/pull/163187 Approved by: https://github.com/angelayi	2025-09-18 04:46:26 +00:00
Sahan Paliskara	d1993c27ae	[BE] Make PyObjectSlot use a global PyInterpreter (#162659 ) This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process Gonna ask for review from @huydhn as there are some changes to CI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659 Approved by: https://github.com/albanD, https://github.com/huydhn	2025-09-17 16:40:55 +00:00
PyTorch MergeBot	79fd497423	Revert "[Reland] Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build (#163016 )" This reverts commit `f1eb99e2e4`. Reverted https://github.com/pytorch/pytorch/pull/163016 on behalf of https://github.com/jeffdaily due to broke rocm CI, see export/test_export_opinfo.py::TestExportOnFakeCudaCUDA::test_fake_export_nonzero_cuda_float32 [GH job link](https://github.com/pytorch/pytorch/actions/runs/17787208381/job/50564369696) [HUD commit link](`f1eb99e2e4`) ([comment](https://github.com/pytorch/pytorch/pull/163016#issuecomment-3303707552))	2025-09-17 16:17:53 +00:00
Sherlock Huang	f1eb99e2e4	[Reland] Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build (#163016 ) Reland of #160532 Summary: To support exporting a cuda model on a CPU-only machine under fake tensor mode. User commonly need to move sample inputs to the cuda device with .to("cuda:0") or .to("cuda") call. This diff supports this. I expect the following pattern to work ``` with FakeTensorMode(allow_non_fake_inputs=True): cuda_module = module.to("cuda:0") cuda_sample_inputs = tuple([x.to("cuda:0") for x in sample_inputs]) with torch.no_grad(): ep = torch.export.export(cuda_module, cuda_sample_inputs) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/163016 Approved by: https://github.com/huydhn	2025-09-17 05:01:33 +00:00
PyTorch MergeBot	4db203f875	Revert "[BE] Make PyObjectSlot use a global PyInterpreter (#162659 )" This reverts commit `05ee8114f8`. Reverted https://github.com/pytorch/pytorch/pull/162659 on behalf of https://github.com/jeanschmidt due to seems to have introduced errors in linting see https://github.com/pytorch/pytorch/actions/runs/17750689989/job/50444910643 ([comment](https://github.com/pytorch/pytorch/pull/162659#issuecomment-3298626136))	2025-09-16 12:52:57 +00:00
PaliC	05ee8114f8	[BE] Make PyObjectSlot use a global PyInterpreter (#162659 ) This pr gets rid of the pyobj_interpreter_ variable from PyObjectSlot and saves a word in the process Gonna ask for review from @huydhn as there are some changes to CI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162659 Approved by: https://github.com/albanD, https://github.com/huydhn	2025-09-16 00:37:09 +00:00
PyTorch MergeBot	9c93dc8123	Revert "Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build (#160532 )" This reverts commit `a956c4ab1c`. Reverted https://github.com/pytorch/pytorch/pull/160532 on behalf of https://github.com/huydhn due to Reverted internally ([comment](https://github.com/pytorch/pytorch/pull/160532#issuecomment-3287745165))	2025-09-13 07:42:12 +00:00
Sherlock Huang	a956c4ab1c	Return NoOpDeviceGuardImpl in replace of CudaDeviceGuard when device is not available, or cpu-only build (#160532 ) Summary: To support exporting a cuda model on a CPU-only machine under fake tensor mode. User commonly need to move sample inputs to the cuda device with .to("cuda:0") or .to("cuda") call. This diff supports this. I expect the following pattern to work ``` with FakeTensorMode(allow_non_fake_inputs=True): cuda_module = module.to("cuda:0") cuda_sample_inputs = tuple([x.to("cuda:0") for x in sample_inputs]) with torch.no_grad(): ep = torch.export.export(cuda_module, cuda_sample_inputs) ``` Test Plan: CI Rollback Plan: Differential Revision: D80181887 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160532 Approved by: https://github.com/henryoier, https://github.com/ezyang	2025-09-13 01:50:51 +00:00
Edward Yang	6c334885d4	[RELAND] Always build USE_DISTRIBUTED (#160449 ) and Make distributed modules importable even when backend not built (#159889 ) (#162594 ) Summary: Original: D81957844 and D81957923 Also, https://github.com/pytorch/pytorch/pull/162142 is patched in as well #buildall Test Plan: sandcastle and oss ci Rollback Plan: Reviewed By: H-Huang Pull Request resolved: https://github.com/pytorch/pytorch/pull/162594 Approved by: https://github.com/H-Huang, https://github.com/dcci	2025-09-12 10:54:42 +00:00
PyTorch MergeBot	6b59a19242	Revert "[RELAND] Always build USE_DISTRIBUTED (#160449 ) and Make distributed modules importable even when backend not built (#159889 ) (#162594 )" This reverts commit `6e8f17c580`. Reverted https://github.com/pytorch/pytorch/pull/162594 on behalf of https://github.com/huydhn due to Reverted internally ([comment](https://github.com/pytorch/pytorch/pull/162594#issuecomment-3283985880))	2025-09-12 06:52:03 +00:00
Edward Yang	6e8f17c580	[RELAND] Always build USE_DISTRIBUTED (#160449 ) and Make distributed modules importable even when backend not built (#159889 ) (#162594 ) Summary: Original: D81957844 and D81957923 Also, https://github.com/pytorch/pytorch/pull/162142 is patched in as well #buildall Test Plan: sandcastle and oss ci Rollback Plan: Reviewed By: H-Huang Pull Request resolved: https://github.com/pytorch/pytorch/pull/162594 Approved by: https://github.com/H-Huang, https://github.com/dcci	2025-09-12 03:56:18 +00:00
Edward Yang	dda071587f	Revert "Make distributed modules importable even when backend not built (#159889 )" (#162568 ) This reverts commit `a0d026688c`. Revert "Always build USE_DISTRIBUTED. (#160449)" This reverts commit `d80297a684`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162568 Approved by: https://github.com/huydhn	2025-09-10 04:29:42 +00:00
Edward Yang	d80297a684	Always build USE_DISTRIBUTED. (#160449 ) Signed-off-by: Edward Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449 Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci	2025-09-08 19:10:36 +00:00
PyTorch MergeBot	1e0656f063	Revert "Always build USE_DISTRIBUTED. (#160449 )" This reverts commit `de893e96c7`. Reverted https://github.com/pytorch/pytorch/pull/160449 on behalf of https://github.com/jeanschmidt due to internal changes breaks import checks, see [D81845053](https://www.internalfb.com/diff/D81845053) ([comment](https://github.com/pytorch/pytorch/pull/160449#issuecomment-3264887002))	2025-09-08 07:04:36 +00:00
Edward Yang	de893e96c7	Always build USE_DISTRIBUTED. (#160449 ) Signed-off-by: Edward Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/160449 Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/dcci	2025-09-05 20:15:11 +00:00
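
The history above contains several land/revert/reland cycles, which makes it hard to see which commit for a given change is still in effect. The sketch below is one way to trace them, assuming each revert message contains the phrase `This reverts commit` followed by the short SHA, as the rows above do. `COMMITS` is a hand-copied subset of the rows for PR #165065 and its reland (messages shortened), and `live_commits` is a hypothetical helper, not part of any PyTorch tooling.

```python
import re

# Hand-copied subset of the log above, newest first: (short SHA, message).
COMMITS = [
    ("b3861ac8e7", "[reland] Warn if AccumulateGrad stream does not match "
                   "producer node stream (#166136)"),
    ("75b8295868", 'Revert "Warn if AccumulateGrad stream does not match '
                   'producer node stream (#165065)" '
                   "This reverts commit `12f742941d`."),
    ("12f742941d", "Warn if AccumulateGrad stream does not match "
                   "producer node stream (#165065)"),
    ("f975bd58af", 'Revert "Warn if AccumulateGrad stream does not match '
                   'producer node stream (#165065)" '
                   "This reverts commit `a70ef954b9`."),
    ("a70ef954b9", "Warn if AccumulateGrad stream does not match "
                   "producer node stream (#165065)"),
]

# Matches the revert phrase with or without backticks around the SHA.
REVERT_RE = re.compile(r"This reverts commit `?([0-9a-f]{7,40})`?")


def live_commits(commits):
    """Return SHAs of non-revert commits that no other commit reverted."""
    reverted = set()
    for _sha, message in commits:
        # findall also covers messages that revert more than one commit.
        reverted.update(REVERT_RE.findall(message))
    return [
        sha
        for sha, message in commits
        if sha not in reverted and not REVERT_RE.search(message)
    ]


print(live_commits(COMMITS))
```

For this subset only the reland `b3861ac8e7` survives. The sketch does not handle a revert of a revert, and it treats a reland as an unrelated new commit rather than linking it to the original PR.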


671 Commits
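
The revert of #164044 listed above quotes a build error, `shift count >= width of type`, on the expression `(k1 << 32) | k2` passed to `std::hash<size_t>`. One plausible reading, stated here as an assumption rather than a confirmed diagnosis, is that `size_t` is 32 bits wide on the failing target, so there is no room for a 32-bit shift. The Python model below only illustrates the information loss a 32-bit container would cause; in C++ the shift itself is not well-defined when the count is at least the type's width. `pack_keys` is an illustrative name, not a function from the PyTorch source.

```python
def pack_keys(k1: int, k2: int, width_bits: int) -> int:
    """Pack two keys as (k1 << 32) | k2, then truncate to width_bits bits."""
    mask = (1 << width_bits) - 1
    return ((k1 << 32) | k2) & mask


# In a 64-bit container, distinct (k1, k2) pairs stay distinct after packing.
a64 = pack_keys(1, 7, 64)
b64 = pack_keys(2, 7, 64)

# In a 32-bit container, k1 is shifted out entirely and the two keys collide.
a32 = pack_keys(1, 7, 32)
b32 = pack_keys(2, 7, 32)

print(hex(a64), hex(b64), a32, b32)
```

A portable variant of the packing idea would use an explicitly 64-bit type for the combined key, or combine the two hashes without relying on the width of `size_t`.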