pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
PyTorch MergeBot	c6329524d8	Revert "Add magic TORCH_MAKE_PYBIND_ENUM_FASTER macro (#163527 )" This reverts commit `50c0550f5a`. Reverted https://github.com/pytorch/pytorch/pull/163527 on behalf of https://github.com/swolchok due to breaking import torch in debug builds, see #164297 ([comment](https://github.com/pytorch/pytorch/pull/163527#issuecomment-3361919142))	2025-10-02 15:42:42 +00:00
Scott Wolchok	50c0550f5a	Add magic TORCH_MAKE_PYBIND_ENUM_FASTER macro (#163527 ) See comment on the macro definition. In short, pybind11 3.x added `py::native_enum`, and also had to add overhead for that new way to bind enums on the critical path for calling functions that take regular old `py::enum_`s as arguments (for example, `__eq__`). Differential Revision: [D82873169](https://our.internmc.facebook.com/intern/diff/D82873169/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/163527 Approved by: https://github.com/ezyang	2025-09-26 17:59:22 +00:00
PyTorch MergeBot	9a883007a2	Revert "Implement cuda graphs implementation of torch.cond and torch.while_loop (#140979 )" This reverts commit `c7515da7b0`. Reverted https://github.com/pytorch/pytorch/pull/140979 on behalf of https://github.com/huydhn due to This change has been reported to break internal code ([comment](https://github.com/pytorch/pytorch/pull/140979#issuecomment-2657361940))	2025-02-13 18:04:26 +00:00
Daniel Galvez	c7515da7b0	Implement cuda graphs implementation of torch.cond and torch.while_loop (#140979 ) This is a new PR for #130386 , which got stale and was closed. Since I force-pushed to that branch in order to rebase it on top of main, the PR can no longer be reopened, according to https://github.com/isaacs/github/issues/361 I fixed the possibly-not-warmed-up problem described here: https://github.com/pytorch/pytorch/pull/130386/files#r1690856534 Since starting this, torch.cond and torch.while_loop now apparently have support for backward passes. I will look into what it might take to support that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/140979 Approved by: https://github.com/eqy, https://github.com/eellison	2025-02-11 18:16:15 +00:00
cyy	f95c71867e	[9/N] Fix extra warnings brought by clang-tidy-17 (#139286 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/139286 Approved by: https://github.com/ezyang	2024-10-31 05:20:31 +00:00
Will Constable	df59084012	Drop GIL around cudart APIs (#132520 ) Noticed a hang where the stuck thread blocked on cudaHostUnregister call, probably due to an internal cuda deadlock caused by something else, but was holding the GIL at the time and blocked other python threads. As far as I can tell cudart APIs all do not require the GIL held nor are they marked as thread unsafe. Pull Request resolved: https://github.com/pytorch/pytorch/pull/132520 Approved by: https://github.com/LucasLLC, https://github.com/kirtiteja	2024-08-05 17:04:01 +00:00
cyy	3cd6a21e8f	[DeviceIndex][6/N] Use DeviceIndex in more places (#120133 ) This PR follows the series of patches beginning with #119142 and fixes various XPU and python related methods to use DeviceIndex. Pull Request resolved: https://github.com/pytorch/pytorch/pull/120133 Approved by: https://github.com/Skylion007	2024-02-21 06:24:23 +00:00
Nikita Shulga	dfd441a12c	[BE] Use nested namespaces in `torch/csrc/cuda` (#106928 ) 🤖 Generated by Copilot at 6b1dde1: _`namespace` syntax_ / _Simplified with C++17_ / _Code is more readable_ Pull Request resolved: https://github.com/pytorch/pytorch/pull/106928 Approved by: https://github.com/huydhn, https://github.com/izaitsevfb	2023-08-10 03:56:09 +00:00
Jianyu Huang	63b8ecc415	[CUDA12] Make PyTorch compatible with CUDA 12 (#91118 ) Fix the failure when building PyTorch from source code using CUDA 12 ``` In file included from /home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAFunctions.h:12, from /home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAStream.h:10, from /home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAGraphsC10Utils.h:3, from /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.h:5, from /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:2: /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp: In member function ‘void at::cuda::CUDAGraph::capture_end()’: /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:168:75: warning: converting to non-pointer type ‘long long unsigned int’ from NULL [-Wconversion-null] AT_CUDA_CHECK(cudaGraphInstantiate(&graph_exec_, graph_, NULL, NULL, 0)); ^ /home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAException.h:31:42: note: in definition of macro ‘C10_CUDA_CHECK’ C10_UNUSED const cudaError_t __err = EXPR; \ ^~~~ /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:168:5: note: in expansion of macro ‘AT_CUDA_CHECK’ AT_CUDA_CHECK(cudaGraphInstantiate(&graph_exec_, graph_, NULL, NULL, 0)); ^~~~~~~~~~~~~ /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:168:75: error: too many arguments to function ‘cudaError_t cudaGraphInstantiate(CUgraphExec_st*, cudaGraph_t, long long unsigned int)’ AT_CUDA_CHECK(cudaGraphInstantiate(&graph_exec_, graph_, NULL, NULL, 0)); ^ /home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAException.h:31:42: note: in definition of macro ‘C10_CUDA_CHECK’ C10_UNUSED const cudaError_t __err = EXPR; \ ^~~~ /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:168:5: note: in expansion of macro ‘AT_CUDA_CHECK’ AT_CUDA_CHECK(cudaGraphInstantiate(&graph_exec_, graph_, NULL, NULL, 0)); ^~~~~~~~~~~~~ In file included from 
/home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAStream.h:6, from /home/jianyuhuang/Work/Github/pytorch/c10/cuda/CUDAGraphsC10Utils.h:3, from /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.h:5, from /home/jianyuhuang/Work/Github/pytorch/aten/src/ATen/cuda/CUDAGraph.cpp:2: /usr/local/cuda/include/cuda_runtime_api.h:11439:39: note: declared here extern __host__ cudaError_t CUDARTAPI cudaGraphInstantiate(cudaGraphExec_t pGraphExec, cudaGraph_t graph, unsigned long long flags __dv(0)); ^~~~~~~~~~~~~~~~~~~~ ninja: build stopped: subcommand failed. ``` ``` /home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp: In function ‘void torch::cuda::shared::initCudartBindings(PyObject*)’: /home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:34:13: error: ‘cudaOutputMode_t’ was not declared in this scope py::enum_<cudaOutputMode_t>( ^~~~~~~~~~~~~~~~ /home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:34:13: note: suggested alternative: ‘cudaGraphNode_t’ py::enum_<cudaOutputMode_t>( ^~~~~~~~~~~~~~~~ cudaGraphNode_t /home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:34:29: error: template argument 1 is invalid py::enum_<cudaOutputMode_t>( ^ /home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:38:30: error: ‘cudaKeyValuePair’ was not declared in this scope .value("KeyValuePair", cudaKeyValuePair) ^~~~~~~~~~~~~~~~ /home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:39:21: error: ‘cudaCSV’ was not declared in this scope .value("CSV", cudaCSV); ^~~~~~~ /home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:39:21: note: suggested alternative: ‘cudart’ .value("CSV", cudaCSV); ^~~~~~~ cudart /home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:99:7: error: ‘cudaProfilerInitialize’ was not declared in this scope cudaProfilerInitialize); ^~~~~~~~~~~~~~~~~~~~~~ 
/home/jianyuhuang/Work/Github/pytorch/torch/csrc/cuda/shared/cudart.cpp:99:7: note: suggested alternative: ‘cudaProfilerStart’ cudaProfilerInitialize); ^~~~~~~~~~~~~~~~~~~~~~ cudaProfilerStart ninja: build stopped: subcommand failed. ``` After these fixes, we can see CUDA 12 is successfully built with OSS PyTorch instructions. USE_CUDA=1 python setup.py develop 2>&1 | tee compile.log Pull Request resolved: https://github.com/pytorch/pytorch/pull/91118 Approved by: https://github.com/ngimel, https://github.com/brad-mengchi	2022-12-20 10:58:53 +00:00
Richard Barnes	3ece9fb45d	Check all CUDA API calls for errors in torch/ (#81560 ) Summary: Original commit changeset: 0bb770d2cdb2 Original Phabricator Diff: D35194935 (`79e5b053b6`) Differential Revision: D35291874 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81560 Approved by: https://github.com/ezyang	2022-10-28 00:40:48 +00:00
Jeff Daily	263c05c918	[ROCm] work-around missing hipProfilerStart/Stop (#82778 ) ### Description cudaProfilerStart and cudaProfilerStop are deprecated but exposed by torch.cuda.cudart(). HIP has corresponding functions stubbed out, hipProfilerStart and hipProfilerStop, but they return hipErrorNotSupported. Profiling in HIP is supported, but not via these deprecated APIs. See https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__PROFILER__DEPRECATED.html. These functions are indirectly used by one or more unit tests that would otherwise pass if the non-functional HIP APIs were replaced with a dummy function. ### Testing Unskipped a related unit test, run by ciflow/trunk. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82778 Approved by: https://github.com/ezyang	2022-08-08 18:25:13 +00:00
Michael Suo	30fb2c4aba	[lint] autoformat test/cpp and torch/csrc Let's have some fun. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78828 Approved by: https://github.com/ezyang	2022-06-11 21:11:16 +00:00
Nikita Shulga	c40a009d66	Revert D35194935: Check all CUDA API calls for errors in torch/ Test Plan: revert-hammer Differential Revision: D35194935 (`79e5b053b6`) Original commit changeset: f5ec5be87cdf Original Phabricator Diff: D35194935 (`79e5b053b6`) fbshipit-source-id: 0bb770d2cdb29b8e724c0b6a125c748f363d3358 (cherry picked from commit 04e5a73da4a53b0ec296f3df2c85626d19290c1f)	2022-03-31 05:48:30 +00:00
Richard Barnes	79e5b053b6	Check all CUDA API calls for errors in torch/ (#74923 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74923 Test Plan: Sandcastle Reviewed By: malfet Differential Revision: D35194935 fbshipit-source-id: f5ec5be87cdf775eb9c99f8c3baed6b0366dda49 (cherry picked from commit 7284c4ed7d57261d4936055e0c1a3f8f911fb1f0)	2022-03-31 05:08:55 +00:00
Shubhorup Biswas	e505f06a79	More folders in clang-tidy (#74908 ) Summary: 1. Added folders torch/csrc/onnx and torch/csrc/cuda to clang-tidy. 2. Fixed clang-tidy violations in torch/csrc/cuda 3. Fixed(added Python import paths) in clang-tidy Python runner Fixes some of https://github.com/pytorch/pytorch/issues/62011 Pull Request resolved: https://github.com/pytorch/pytorch/pull/74908 Reviewed By: atalman Differential Revision: D35221843 Pulled By: shahofblah fbshipit-source-id: f0d1f066550b383aa48449b12d194009977c0bd8 (cherry picked from commit 830186a673f432c9f3558f3e9cf1cd4294fa0fb0)	2022-03-29 22:59:16 +00:00
Mike Ruberry	dc87cf5fe1	Fixes mem_get_info when querying on a device other than the current device (#69640 ) Summary: Also fixes the documentation failing to appear and adds a test to validate that op works with multiple devices properly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/69640 Reviewed By: ngimel Differential Revision: D32965391 Pulled By: mruberry fbshipit-source-id: 4fe502809b353464da8edf62d92ca9863804f08e	2021-12-08 23:04:30 -08:00
Pruthvi Madugundu	085e2f7bdd	[ROCm] Changes not to rely on CUDA_VERSION or HIP_VERSION (#65610 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65610 - Replace HIP_PLATFORM_HCC with USE_ROCM - Don't rely on CUDA_VERSION or HIP_VERSION; use USE_ROCM and ROCM_VERSION instead. - In the next PR - Will be removing the mapping from CUDA_VERSION to HIP_VERSION and CUDA to HIP in hipify. - HIP_PLATFORM_HCC is deprecated, so will add HIP_PLATFORM_AMD to support HIP host code compilation on gcc. cc jeffdaily sunway513 jithunnair-amd ROCmSupport amathews-amd Reviewed By: jbschlosser Differential Revision: D30909053 Pulled By: ezyang fbshipit-source-id: 224a966ebf1aaec79beccbbd686fdf3d49267e06	2021-09-29 09:55:43 -07:00
Emilio Castillo	f9ec86a6c6	External stream (#59527 ) Summary: Previous is https://github.com/pytorch/pytorch/issues/57781 We now add two CUDA bindings to avoid using ctypes, which fixes a Windows issue. However, we use ctypes to allocate the stream and create its pointer (we can do this with a 0-dim tensor too if it feels better). CC. ezyang rgommers ngimel mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/59527 Reviewed By: albanD Differential Revision: D29053062 Pulled By: ezyang fbshipit-source-id: 661e7e58de98b1bdb7a0871808cd41d91fe8f13f	2021-06-14 13:46:11 -07:00
Corey Lammie	b4b95fc87a	Expose `cudaMemGetInfo` (#58635 ) Summary: This PR resolves the second issue outlined in https://github.com/pytorch/pytorch/issues/58376, which has previously been discussed in https://github.com/pytorch/pytorch/issues/50722. `cudaMemGetInfo` is bound/exposed to the Python API. An example function call is provided below: ``` device_free, device_total = torch.cuda.mem_get_info(torch.device('cuda:0')) print(device_free, device_total) ``` In `CUDACachingAllocator.cpp`, in contrast to my initial PR, the newly defined function `std::pair<size_t, size_t> raw_cuda_mem_get_info(int device)` has been moved from the `CUDACaching` namespace to the `cuda` namespace. In addition, as suggested by ezyang, `det` has been removed from all function names. Pull Request resolved: https://github.com/pytorch/pytorch/pull/58635 Reviewed By: zou3519 Differential Revision: D28649093 Pulled By: ezyang fbshipit-source-id: d8b7c53e52cf73f35495d8651863c5bb408d7a6a	2021-05-25 14:58:35 -07:00
Edward Yang	da4033d32a	Make cudaHostRegister actually useful on cudart. (#45159 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45159 By default, pybind11 binds void* to be capsules. After a lot of Googling, I have concluded that this is not actually useful: you can't actually create a capsule from Python land, and our data_ptr() function returns an int, which means that the function is effectively unusable. It didn't help that we had no tests exercising it. I've replaced the void* with uintptr_t, so that we now accept int (and you can pass data_ptr() in directly). I'm not sure if we should make these functions accept ctypes types; unfortunately, pybind11 doesn't seem to have any easy way to do this. Fixes #43006 Also added cudaHostUnregister which was requested. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: lw Differential Revision: D23849731 Pulled By: ezyang fbshipit-source-id: 8a79986f3aa9546abbd2a6a5828329ae90fd298f	2020-09-23 11:05:44 -07:00
Luca Wehrstedt	c20426f86d	Fix torch.cuda.check_error type errors (#41330 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41330 `torch.cuda.check_error` is annotated as taking an `int` as argument but when running `torch.cuda.check_error(34)` one would get: ``` TypeError: cudaGetErrorString(): incompatible function arguments. The following argument types are supported: 1. (arg0: torch._C._cudart.cudaError) -> str Invoked with: 34 ``` Even if one explicitly cast the argument, running `torch.cuda.check_error(torch._C._cudart.cudaError(34))` would give: ``` AttributeError: 'str' object has no attribute 'decode' ``` This PR fixes both issues (thus allowing `check_error` to be called with an un-casted int) and adds a test. ghstack-source-id: 107628709 Test Plan: Unit tests Reviewed By: ezyang Differential Revision: D22500549 fbshipit-source-id: 9170c1e466dd554d471e928b26eb472a712da9e1	2020-07-14 00:47:14 -07:00
Edward Yang	940e678da9	Add back cudaHostRegister to cudart API. (#34665 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34665 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D20493861 Pulled By: ezyang fbshipit-source-id: 4215e3037a16be460f20cfc2859be5ee074128d3	2020-03-17 13:30:39 -07:00
Peter Bell	5fc5cf6571	Stop using ctypes to interface with CUDA libraries. (#33678 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/33016, Continuation of https://github.com/pytorch/pytorch/issues/31160 Pull Request resolved: https://github.com/pytorch/pytorch/pull/33678 Differential Revision: D20249187 Pulled By: ezyang fbshipit-source-id: 172ce4a0fee7fbe01436a421d1af22ef6173b6ed	2020-03-11 07:22:46 -07:00

23 Commits