Commit Graph

21 Commits

Author SHA1 Message Date
Yuanyuan Chen
0f0b4bf029 [1/N] Remove unused header inclusion (#165763)
This PR removes unused header inclusions in C++ files.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165763
Approved by: https://github.com/Skylion007
2025-10-18 05:23:11 +00:00
cyy
e9e1aacef8 Enable -Wunused on torch targets (#150077)
For GCC, ``-Wunused`` contains:
```
-Wunused-function
Warn whenever a static function is declared but not defined or a non-inline static function is unused.

-Wunused-label
Warn whenever a label is declared but not used.
To suppress this warning use the unused attribute.

-Wunused-parameter
Warn whenever a function parameter is unused aside from its declaration.
To suppress this warning use the unused attribute.

-Wunused-variable
Warn whenever a local variable or non-constant static variable is unused aside from its declaration.
To suppress this warning use the unused attribute.
```
For Clang, some of the diagnostics controlled by ``-Wunused`` are enabled by default:
```
Controls [-Wunused-argument](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-argument),
[-Wunused-but-set-variable](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-but-set-variable),
[-Wunused-function](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-function),
[-Wunused-label](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-label), [-Wunused-lambda-capture](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-lambda-capture),
[-Wunused-local-typedef](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-local-typedef),
[-Wunused-private-field](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-private-field),
[-Wunused-property-ivar](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-property-ivar),
[-Wunused-value](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-value), [-Wunused-variable](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-variable).
```
These checks are all useful. This PR aims to enable ``-Wunused`` without breaking code.
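
As a quick illustration (a hypothetical snippet, not code from this PR), the first two functions below trip several of these diagnostics, and the third shows the standard C++17 `[[maybe_unused]]` attribute playing the role of the "unused attribute" the GCC notes refer to:
```cpp
// -Wunused-function: a static function that is never called.
static int helper(int x) { return x + 1; }

// -Wunused-parameter on 'flags'; -Wunused-variable on 'tmp'.
int process(int value, int flags) {
  int tmp = value * 2;
  return value;
}

// The same code with the warnings deliberately suppressed.
int process_quiet(int value, [[maybe_unused]] int flags) {
  [[maybe_unused]] int tmp = value * 2;
  return value;
}
```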

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150077
Approved by: https://github.com/zou3519, https://github.com/wdvr
2025-05-02 07:14:19 +00:00
PyTorch MergeBot
6dadfc4457 Revert "Enable -Wunused on torch targets (#150077)"
This reverts commit 688adc9941.

Reverted https://github.com/pytorch/pytorch/pull/150077 on behalf of https://github.com/wdvr due to failing internally with use of undeclared identifier ([comment](https://github.com/pytorch/pytorch/pull/150077#issuecomment-2846499828))
2025-05-02 06:53:20 +00:00
cyy
688adc9941 Enable -Wunused on torch targets (#150077)
For GCC, ``-Wunused`` contains:
```
-Wunused-function
Warn whenever a static function is declared but not defined or a non-inline static function is unused.

-Wunused-label
Warn whenever a label is declared but not used.
To suppress this warning use the unused attribute.

-Wunused-parameter
Warn whenever a function parameter is unused aside from its declaration.
To suppress this warning use the unused attribute.

-Wunused-variable
Warn whenever a local variable or non-constant static variable is unused aside from its declaration.
To suppress this warning use the unused attribute.
```
For Clang, some of the diagnostics controlled by ``-Wunused`` are enabled by default:
```
Controls [-Wunused-argument](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-argument),
[-Wunused-but-set-variable](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-but-set-variable),
[-Wunused-function](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-function),
[-Wunused-label](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-label), [-Wunused-lambda-capture](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-lambda-capture),
[-Wunused-local-typedef](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-local-typedef),
[-Wunused-private-field](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-private-field),
[-Wunused-property-ivar](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-property-ivar),
[-Wunused-value](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-value), [-Wunused-variable](https://clang.llvm.org/docs/DiagnosticsReference.html#wunused-variable).
```
These checks are all useful. This PR aims to enable ``-Wunused`` without breaking code.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150077
Approved by: https://github.com/zou3519
2025-05-01 04:09:06 +00:00
Richard Barnes
fddabc6e0b C10_UNUSED to [[maybe_unused]] (#6357) (#138364)
Summary: Pull Request resolved: https://github.com/pytorch/executorch/pull/6357
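
The shape of the change is mechanical; a hedged sketch (the call site and helper here are hypothetical):
```cpp
int compute_flag() { return 1; }  // hypothetical helper

void example() {
  // Before: the PyTorch-specific macro from c10/macros/Macros.h:
  //   C10_UNUSED int rc = compute_flag();
  // After: the portable C++17 attribute:
  [[maybe_unused]] int rc = compute_flag();
}
```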

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138364
Approved by: https://github.com/Skylion007, https://github.com/eqy
2024-10-19 13:17:43 +00:00
cyy
8c860aef0d [Reland][Environment Variable][3/N] Use thread-safe getenv functions (#137942)
Reland of #137328, which was reverted so that a dependent PR could be reverted.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137942
Approved by: https://github.com/eqy
2024-10-15 07:47:24 +00:00
PyTorch MergeBot
df0c2f5cae Revert "[Environment Variable][3/N] Use thread-safe getenv wrapper (#137328)"
This reverts commit 25ac5652d0.

Reverted https://github.com/pytorch/pytorch/pull/137328 on behalf of https://github.com/clee2000 due to need to revert this in order to revert #133896, please rebase and reland, sorry for the churn ([comment](https://github.com/pytorch/pytorch/pull/137328#issuecomment-2412143739))
2024-10-14 20:22:26 +00:00
cyyever
25ac5652d0 [Environment Variable][3/N] Use thread-safe getenv wrapper (#137328)
Follows #124485

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137328
Approved by: https://github.com/eqy
2024-10-11 23:23:57 +00:00
PyTorch MergeBot
277ab8a4c0 Revert "[Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449)"
This reverts commit a56e057814.

Reverted https://github.com/pytorch/pytorch/pull/119449 on behalf of https://github.com/jeanschmidt due to Broken internal signals, @albanD please help get this sorted :) ([comment](https://github.com/pytorch/pytorch/pull/119449#issuecomment-2069716129))
2024-04-22 14:44:44 +00:00
cyy
a56e057814 [Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449)
This PR begins the effort to wrap the thread-unsafe getenv and set_env functions inside an RW mutex.
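A minimal sketch of the idea, assuming hypothetical wrapper names (`get_env`/`set_env`, not the actual c10 API): readers share the lock, writers take it exclusively.
```cpp
#include <cstdlib>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>

namespace {
// Single process-wide lock guarding the environment block.
std::shared_mutex& env_mutex() {
  static std::shared_mutex m;
  return m;
}
}  // namespace

// Readers take the lock in shared mode, so lookups may run concurrently.
std::optional<std::string> get_env(const char* name) {
  std::shared_lock lock(env_mutex());
  const char* value = std::getenv(name);
  if (value == nullptr) {
    return std::nullopt;
  }
  // Copy while still holding the lock: the pointer returned by getenv can
  // be invalidated by a concurrent setenv.
  return std::string(value);
}

// Writers take the lock exclusively (setenv is POSIX, not standard C++).
bool set_env(const char* name, const char* value) {
  std::unique_lock lock(env_mutex());
  return setenv(name, value, /*overwrite=*/1) == 0;
}
```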
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119449
Approved by: https://github.com/malfet, https://github.com/albanD
2024-04-19 13:39:41 +00:00
PyTorch MergeBot
61bc188f42 Revert "[Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449)"
This reverts commit b51f66c195.

Reverted https://github.com/pytorch/pytorch/pull/119449 on behalf of https://github.com/malfet due to Broke gcc9 builds ([comment](https://github.com/pytorch/pytorch/pull/119449#issuecomment-2064936414))
2024-04-18 18:53:59 +00:00
cyy
b51f66c195 [Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449)
This PR begins the effort to wrap the thread-unsafe getenv and set_env functions inside an RW mutex.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119449
Approved by: https://github.com/albanD
2024-04-18 13:35:48 +00:00
PyTorch MergeBot
f5049de242 Revert "[Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449)"
This reverts commit 5bef127c2e.

Reverted https://github.com/pytorch/pytorch/pull/119449 on behalf of https://github.com/PaliC due to your using TORCH_INTERNAL_ASSERT incorrectly ([comment](https://github.com/pytorch/pytorch/pull/119449#issuecomment-2062696010))
2024-04-17 23:44:00 +00:00
cyy
5bef127c2e [Environment Variable][1/N] Use thread-safe env variable API in c10 (#119449)
This PR begins the effort to wrap the thread-unsafe getenv and set_env functions inside an RW mutex.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119449
Approved by: https://github.com/albanD
2024-04-16 04:39:20 +00:00
cyy
8a3c241094 Remove unused header inclusion (#119667)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119667
Approved by: https://github.com/Skylion007
2024-02-12 05:36:25 +00:00
cyy
560c92c324 [DeviceIndex] Use DeviceIndex instead of int in CUDA wrappers (#119142)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119142
Approved by: https://github.com/ezyang
2024-02-08 23:00:56 +00:00
cyy
4a019047ad Enable nested namespace check in clang-tidy (#118506)
It is time to enable nested namespaces in the code.
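The check is presumably clang-tidy's `modernize-concat-nested-namespaces`; the rewrite it enforces looks like this:
```cpp
// Before: pre-C++17 nested namespace declarations.
namespace c10 { namespace cuda {
void warn_once();
}} // namespace c10::cuda

// After: the C++17 nested namespace definition.
namespace c10::cuda {
void warn_once();
} // namespace c10::cuda
```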

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118506
Approved by: https://github.com/albanD
2024-01-31 00:32:35 +00:00
Aidyn-A
69eef5a4be [CUDA12] set_device change (#94864)
This PR adds a workaround for the CUDA 12 [`cudaSetDevice` change](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html#group__CUDART__DEVICE_1g159587909ffa0791bbe4b40187a4c6bb), which always creates a primary context on the target device. As a result, operations like this:
```Python
import torch
x = torch.randn(1, device="cuda:1")
```
would always create a primary context on device `cuda:1` (because a tensor is created on it) and also on device `cuda:0` (because the destructor of the CUDA device guard calls `cudaSetDevice(0)`).
After this PR, the CUDA device guard will not call `cudaSetDevice(0)` if no primary context exists on `cuda:0`.
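A hedged sketch of the check (helper names here are hypothetical; the point is that querying the primary context's state, unlike `cudaSetDevice` on CUDA 12, does not create one):
```cpp
#include <cuda.h>
#include <cuda_runtime.h>

// True only if a primary context is already active on `device`.
static bool has_primary_context(int device) {
  CUdevice dev = 0;
  unsigned int flags = 0;
  int active = 0;
  if (cuDeviceGet(&dev, device) != CUDA_SUCCESS) {
    return false;
  }
  if (cuDevicePrimaryCtxGetState(dev, &flags, &active) != CUDA_SUCCESS) {
    return false;
  }
  return active != 0;
}

// Hypothetical device-guard destructor logic: restore the original device
// only when doing so will not create a fresh primary context on it.
static void restore_device(int original_device) {
  if (has_primary_context(original_device)) {
    cudaSetDevice(original_device);
  }
}
```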

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94864
Approved by: https://github.com/malfet, https://github.com/atalman, https://github.com/ezyang
2023-04-10 17:31:12 +00:00
Richard Barnes
eadbf762fc Fix CUDA error not getting captured by handler (#92227)
Fixes #91758. Still leaves functions on the hot path.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92227
Approved by: https://github.com/ngimel, https://github.com/malfet
2023-01-17 00:16:29 +00:00
Richard Barnes
6f749fd171 Fixes to DSA infra (#91835)
Differential Revision: D42397325

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91835
Approved by: https://github.com/soumith
2023-01-12 21:54:26 +00:00
Richard Barnes
ad188a227e Introduce CUDA Device Assertions Infrastructure (#84609)
Summary:
This diff introduces a set of changes that makes it possible for the host to get assertions from CUDA devices. This includes the introduction of

**`CUDA_KERNEL_ASSERT2`**

A preprocessor macro to be used within a CUDA kernel that, upon an assertion failure, writes the assertion message, file, line number, and possibly other information to UVM (Managed memory). Once this is done, the original assertion is triggered, which places the GPU in a Bad State requiring recovery. In my tests, data written to UVM appears there before the GPU reaches the Bad State and is still accessible from the host after the GPU is in this state.

Messages are written to a multi-message buffer which can, in theory, hold many assertion failures. I've done this as a precaution in case there are several, but I don't actually know whether that is possible, and a simpler design that holds only a single message may well be all that is necessary.

**`TORCH_DSA_KERNEL_ARGS`**

This preprocessor macro is added as an _argument_ to a kernel function's signature. It expands to supply the standardized names of all the arguments needed by `C10_CUDA_COMMUNICATING_KERNEL_ASSERTION` to handle device-side assertions. This includes, e.g., the name of the pointer to the UVM memory the assertion would be written to. This macro abstracts the arguments so there is a single point of change if the system needs to be modified.

**`c10::cuda::get_global_cuda_kernel_launch_registry()`**

This host-side function returns a singleton object that manages the host's part of the device-side assertions. Upon allocation, the singleton allocates sufficient UVM (Managed) memory to hold information about several device-side assertion failures. The singleton also provides methods for getting the current traceback (used to identify when a kernel was launched). To avoid consuming all the host's memory, the singleton stores launches in a circular buffer; a unique "generation number" is used to ensure that kernel launch failures map to their actual launch points (in case the circular buffer wraps before the failure is detected).
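
To make the wrap-around bookkeeping concrete, a hedged sketch (names and sizes are hypothetical, not the registry's real layout): a slot whose stored generation no longer matches the generation handed out at launch time has been overwritten, so its traceback must not be attributed to the failing kernel.
```cpp
#include <array>
#include <cstdint>
#include <mutex>
#include <optional>
#include <string>

struct LaunchRecord {
  uint64_t generation = 0;  // disambiguates entries once the buffer wraps
  std::string traceback;    // where the kernel was launched from
};

class LaunchRegistry {
 public:
  // Records a launch and returns its generation number, which is passed to
  // the kernel alongside the UVM pointer.
  uint64_t record(std::string traceback) {
    std::lock_guard<std::mutex> lock(mu_);
    const uint64_t gen = next_generation_++;
    buffer_[gen % buffer_.size()] = {gen, std::move(traceback)};
    return gen;
  }

  // Returns the traceback only if the slot still belongs to this launch,
  // i.e. the circular buffer has not wrapped past it.
  std::optional<std::string> lookup(uint64_t gen) const {
    std::lock_guard<std::mutex> lock(mu_);
    const LaunchRecord& slot = buffer_[gen % buffer_.size()];
    if (gen < next_generation_ && slot.generation == gen) {
      return slot.traceback;
    }
    return std::nullopt;  // the buffer wrapped; the launch point is lost
  }

 private:
  mutable std::mutex mu_;
  uint64_t next_generation_ = 0;
  std::array<LaunchRecord, 1024> buffer_;
};
```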

**`TORCH_DSA_KERNEL_LAUNCH`**

This host-side preprocessor macro replaces the standard
```
kernel_name<<<blocks, threads, shmem, stream>>>(args)
```
invocation with
```
TORCH_DSA_KERNEL_LAUNCH(kernel_name, blocks, threads, shmem, stream, args);
```
Internally, it fetches the UVM (Managed) pointer and generation number from the singleton and appends these to the standard argument list. It also checks that the kernel launches correctly. This abstraction on kernel launches can be modified to provide additional safety/logging.
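
Putting the pieces together, a hedged usage sketch (macro spellings follow the descriptions above; the real definitions live in the c10/cuda headers and may differ in detail):
```cpp
#include <c10/cuda/CUDADeviceAssertion.h>  // assumed home of the DSA macros

// TORCH_DSA_KERNEL_ARGS expands to the standardized assertion parameters
// (UVM buffer pointer, generation number, ...) in the kernel's signature.
__global__ void clamp_nonnegative(float* data, int n, TORCH_DSA_KERNEL_ARGS) {
  const int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    // On failure, records file/line/message in UVM, then triggers the
    // underlying device-side assert.
    CUDA_KERNEL_ASSERT2(data[i] == data[i]);  // reject NaNs, for example
    data[i] = data[i] < 0.0f ? 0.0f : data[i];
  }
}

void launch_clamp(float* data, int n, cudaStream_t stream) {
  const dim3 threads(256);
  const dim3 blocks((n + 255) / 256);
  // Appends the UVM pointer and generation number from the launch registry
  // and checks that the launch itself succeeded.
  TORCH_DSA_KERNEL_LAUNCH(clamp_nonnegative, blocks, threads, 0, stream, data, n);
}
```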

**`c10::cuda::c10_retrieve_device_side_assertion_info`**
This host-side function checks, when called, that no kernel assertions have occurred. If one has, it raises an exception with:
1. Information (file, line number) about which kernel was launched.
2. Information (file, line number, message) about the device-side assertion.
3. Information (file, line number) about where the failure was detected.

**Checking for device-side assertions**

Device-side assertions are most likely to be noticed by the host when a CUDA API call such as `cudaDeviceSynchronize` is made and fails with a `cudaError_t` indicating
> CUDA error: device-side assert triggered

Therefore, we rewrite `C10_CUDA_CHECK()` to include a call to `c10_retrieve_device_side_assertion_info()`. To make the code cleaner, most of the logic of `C10_CUDA_CHECK()` is now contained within a new function `c10_cuda_check_implementation()` to which `C10_CUDA_CHECK` passes the preprocessor information about filenames, function names, and line numbers. (In C++20 we can use `std::source_location` to eliminate macros entirely!)
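
In sketch form (the exact signature of the helper is an assumption here; see c10/cuda/CUDAException.h in the tree for the real one), the macro shrinks to a call-site capture:
```cpp
#include <cstdint>
#include <cuda_runtime.h>

// Assumed declaration of the out-of-line helper described above; on failure
// it consults the device-side assertion registry before raising.
void c10_cuda_check_implementation(
    int32_t err,
    const char* filename,
    const char* function_name,
    int line_number,
    bool include_device_assertions);

// Hedged sketch: the macro only captures where the check happened.
#define C10_CUDA_CHECK(EXPR)                      \
  do {                                            \
    const cudaError_t __err = (EXPR);             \
    c10_cuda_check_implementation(                \
        static_cast<int32_t>(__err),              \
        __FILE__,                                 \
        __func__,                                 \
        __LINE__,                                 \
        /*include_device_assertions=*/true);      \
  } while (0)
```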

# Notes on special cases

* Multiple assertions from the same block are recorded
* Multiple assertions from different blocks are recorded
* Launching kernels from many threads on many streams seems to be handled correctly
* If two process are using the same GPU and one of the processes fails with a device-side assertion the other process continues without issue
* Unverified: multiple assertions from separate kernels on different streams seem to be recorded, but we can't reproduce the test condition
* Unverified: multiple assertions from separate devices should all be shown upon exit, but we've been unable to generate a test that produces this condition

Differential Revision: D37621532

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84609
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-12-08 01:26:07 +00:00