Fixes #115331.
This is a temporary fix that raises the compile-time maximum number of GPUs to 120 until #119639 can be merged. Raising the parameter to 128 instead produces annoying compiler errors, because some range checks would become tautological (an `int8_t` is always < 128).
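To illustrate, here is a minimal sketch (the `is_valid` check is hypothetical, not the actual c10 code) of why a limit of 128 breaks builds while 120 does not:

```cpp
#include <cstdint>

using DeviceIndex = int8_t;  // c10::DeviceIndex is currently 8 bits wide

bool is_valid(DeviceIndex idx) {
  // With a limit of 128 the upper check is tautological: after integer
  // promotion, idx still has the value range [-128, 127], so `idx < 128`
  // is always true, and Clang flags it under
  // -Wtautological-constant-out-of-range-compare (fatal when warnings are
  // treated as errors). A limit of 120 keeps the check meaningful.
  return idx >= 0 && idx < 128;
}
```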
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121076
Approved by: https://github.com/albanD
Fixes #115331.
This PR increases the number of valid GPU devices from 64 to 512 in order to future-proof PyTorch for providers that offer [single nodes with a large device count](https://www.tensorwave.com/). Until now, `DeviceIndex` was an `int8_t`, so several changes were necessary:
- `DeviceIndex` changed to `int16_t`; updated consumers that assumed it was an `int8_t`.
- Updated bounds checking for `torch.device()` in the Python frontend. Until now, we allowed funny things like `torch.device('cpu', 200).index == -56`, which silently wraps the index (see the sketch after the footnote). I inserted checks that only allow values between 0 and `c10::Device::MAX_NUM_DEVICES - 1`.
- Updated the `ArgumentInfo` struct, as it hardcodes the device index as an 8-bit field [^1]. This might be a breaking change; I am not sure whether users rely on this.
- Introduced `c10::Device::MAX_NUM_DEVICES` as a replacement for the old `C10_COMPILE_TIME_MAX_GPUS`.
[^1]: This field was unsigned, so I suspect this has been broken the whole time: our default device index is -1, which always wrapped around to 255 when written to the `ArgumentInfo` struct. When I switched `DeviceIndex` to `int16_t`, it actually stayed 255 after unpacking from `ArgumentInfo`, because `DeviceIndex` was now wide enough that it no longer wrapped back to -1.
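To make the wraparound concrete, here is a minimal sketch (`PackedArg` is a hypothetical stand-in, not the actual `ArgumentInfo` layout):

```cpp
#include <cstdint>
#include <iostream>

// Hypothetical packed struct standing in for ArgumentInfo's layout.
struct PackedArg {
  unsigned device : 8;  // unsigned 8-bit field, as in the old struct
};

int main() {
  // The Python-frontend wraparound from the bullet above: 200 does not
  // fit into an int8_t and converts to -56.
  int8_t bad_index = 200;
  std::cout << int(bad_index) << "\n";  // prints -56

  PackedArg arg;
  arg.device = -1;  // the default device index wraps to 255 in the field

  int8_t narrow = arg.device;  // 255 converts back to -1 (only 8 bits wide)
  int16_t wide = arg.device;   // 255 fits in int16_t, so it stays 255
  std::cout << int(narrow) << " " << wide << "\n";  // prints: -1 255
}
```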
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119639
Approved by: https://github.com/cyyever, https://github.com/albanD, https://github.com/huydhn
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70851
This is a step towards OSS/fbcode convergence since OSS uses this file
in both CMake and Bazel.
ghstack-source-id: 147170896
Test Plan: Relying on the extensive internal CI tests for this.
Reviewed By: malfet
Differential Revision: D33299102
fbshipit-source-id: c650dd4755f8d696d5fce81c583d5c73782e3990
(cherry picked from commit 741ca140c8)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15692
It was leading to occasional crashes with dynamically linked CUDA because the runtime had already been destroyed.
Also, unique_ptr<T[]> is more suitable than deque<T> for this purpose.
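As a hedged sketch of that design choice (the names and pool size here are hypothetical, not the actual stream-pool code): when the element count is fixed up front, a single unique_ptr<T[]> allocation has stable addresses and trivial teardown, while a deque keeps per-block bookkeeping whose destructor also runs at static-destruction time.

```cpp
#include <memory>

struct Stream {
  // Per-stream CUDA state would live here.
};

constexpr int kStreamsPerPool = 32;  // hypothetical fixed pool size

// Before (sketch): a growable container for a pool whose size never changes.
// static std::deque<Stream> pool;

// After (sketch): one contiguous allocation with stable element addresses.
static std::unique_ptr<Stream[]> pool(new Stream[kStreamsPerPool]);

// If the elements must not touch the CUDA runtime during static
// destruction, the array can even be leaked deliberately:
//   pool.release();
```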
Reviewed By: Yangqing
Differential Revision: D13571988
fbshipit-source-id: 37eb26dfbe361c49160367b53f87bd037c6c0e46
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13900
Add c10 cuda library.
Right now, this is not used by anything; it only tests that the CUDA
headers are available (and not, e.g., that linking works).
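A check of that kind can be sketched as a compile-only test (a hypothetical file, not the actual c10 test): it builds exactly when the CUDA headers are on the include path, and it references no runtime symbols, so it says nothing about linking.

```cpp
// Hypothetical compile-only smoke test: it builds if and only if the CUDA
// headers are on the include path; no CUDA runtime symbols are referenced,
// so it does not exercise the linker.
#include <cuda.h>
#include <cuda_runtime_api.h>

int main() {
  // Using header-defined constants is enough for a compile-time check.
  static_assert(CUDA_VERSION > 0, "cuda.h defines CUDA_VERSION");
  cudaError_t err = cudaSuccess;  // enum from the headers, no linking needed
  return err == cudaSuccess ? 0 : 1;
}
```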
Extra changes:
- cmake/public/cuda.cmake is now correctly include-guarded, so you
  can include it multiple times without trouble.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Reviewed By: smessmer
Differential Revision: D13025313
fbshipit-source-id: fda85b4c35783ffb48ddd6bbb98dbd9154119d86