Commit Graph

14 Commits

Author SHA1 Message Date
PyTorch MergeBot
bdd942efd7 Revert "Increase C10_COMPILE_TIME_MAX_GPUS to 128 (#144138)"
This reverts commit 6cfc081675.

Reverted https://github.com/pytorch/pytorch/pull/144138 on behalf of https://github.com/albanD due to This seems to impact the caffe2 code ([comment](https://github.com/pytorch/pytorch/pull/144138#issuecomment-2590891200))
2025-01-14 19:04:12 +00:00
cyy
6cfc081675 Increase C10_COMPILE_TIME_MAX_GPUS to 128 (#144138)
To facilitate a possible future change of DeviceIndex to int16_t.
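For context, this macro bakes in a compile-time upper bound used to size static per-device state, which is why its value interacts with the width of `DeviceIndex`. A minimal sketch of the pattern (`kCompileTimeMaxGPUs` and `PerDeviceState` are illustrative stand-ins, not the actual c10 names):

```cpp
#include <array>

// Stand-in for C10_COMPILE_TIME_MAX_GPUS: a bound baked in at compile time.
constexpr int kCompileTimeMaxGPUs = 128;

struct PerDeviceState { /* streams, allocator bookkeeping, ... */ };

// One slot per possible device; a DeviceIndex must be able to address all of
// them, so the bound must stay representable in the index's integer type.
std::array<PerDeviceState, kCompileTimeMaxGPUs> per_device_state;
```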
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144138
Approved by: https://github.com/albanD
2025-01-10 23:53:19 +00:00
Tobias Ringwald
c4a1570864 Temporarily increased compile time limit of #GPUs to 120. (#121076)
Fixes #115331.

This is a temporary fix that raises the compile-time number of GPUs to 120 until #119639 can be merged. Changing the parameter to 128 leads to annoying errors, as some checks would become tautological (an `int8_t` is always < 128, so the upper-bound check could never fail).
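Concretely, with an `int8_t` index the upper-bound test degenerates; a minimal illustration (not the actual c10 check):

```cpp
#include <cstdint>

// After integer promotion, "index < 128" compares an int that can be at most
// 127 against the constant 128, so it is always true; clang flags this with
// -Wtautological-constant-out-of-range-compare.
bool is_valid_device_index(int8_t index) {
  return index >= 0 && index < 128;  // the second test can never fail
}
```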

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121076
Approved by: https://github.com/albanD
2024-03-05 11:39:14 +00:00
PyTorch MergeBot
a9d9077f12 Revert "Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639)"
This reverts commit 7c556428c7.

Reverted https://github.com/pytorch/pytorch/pull/119639 on behalf of https://github.com/kit1980 due to breaking internal builds, see D54286923 ([comment](https://github.com/pytorch/pytorch/pull/119639#issuecomment-1969634480))
2024-02-28 18:57:09 +00:00
Tobias Ringwald
7c556428c7 Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639)
Fixes #115331.

This PR increases the number of valid GPU devices to 512 (from 64) in order to future-proof PyTorch for providers that offer [single nodes with a large device count](https://www.tensorwave.com/). Until now, `DeviceIndex` was an `int8_t`, so multiple changes were necessary:

- `DeviceIndex` changed to `int16_t`. Updated consumers that assumed it was an `int8_t`.
- Updated bounds checking for `torch.device()` in the Python frontend. Right now, we allow funny things like `torch.device('cpu', 200).index == -56`, which relies on implementation-defined integer wraparound (see the sketch after the footnote). I inserted checks to only allow values between 0 and `c10::Device::MAX_NUM_DEVICES - 1`.
- Updated the `ArgumentInfo` struct, as it hardcodes the device index as an 8-bit field [^1]. This might be a breaking change; I'm not sure whether users rely on this.
- Introduced `c10::Device::MAX_NUM_DEVICES` as a replacement for the old `C10_COMPILE_TIME_MAX_GPUS`.

[^1]: This field was unsigned, so writing our default device index of -1 has always wrapped it around to 255 in the `ArgumentInfo` struct. While `DeviceIndex` was an `int8_t`, narrowing the value back on unpacking restored -1; once I switched `DeviceIndex` to `int16_t`, it stayed 255 after unpacking from `ArgumentInfo`, as the type was now wide enough that it didn't wrap back to -1.
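To make the bounds-checking bullet concrete, here is a minimal sketch of the narrowing that produced `index == -56` (illustrative only; the real path goes through the Python binding layer):

```cpp
#include <cstdint>
#include <iostream>

int main() {
  // Passing 200 where an int8_t index is expected narrows it modulo 256:
  // 200 - 256 = -56 (implementation-defined before C++20, well-defined since).
  int8_t index = static_cast<int8_t>(200);
  std::cout << static_cast<int>(index) << '\n';  // prints -56
}
```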
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119639
Approved by: https://github.com/cyyever, https://github.com/albanD, https://github.com/huydhn
2024-02-27 07:05:48 +00:00
PyTorch MergeBot
fff9d98e58 Revert "Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639)"
This reverts commit e0268821dd.

Reverted https://github.com/pytorch/pytorch/pull/119639 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think the Window failures are legit as they are failing now in trunk, i.e. 450339ab2d ([comment](https://github.com/pytorch/pytorch/pull/119639#issuecomment-1958428416))
2024-02-22 00:12:54 +00:00
Tobias Ringwald
e0268821dd Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639)
Fixes #115331.

This PR increases the number of valid GPU devices to 512 (from 64) in order to future-proof PyTorch for providers that offer [single nodes with a large device count](https://www.tensorwave.com/). Until now, `DeviceIndex` was an `int8_t`, so multiple changes were necessary:

- `DeviceIndex` changed to `int16_t`. Updated consumers that assumed it was an `int8_t`.
- Updated bounds checking for `torch.device()` in the Python frontend. Right now, we allow funny things like `torch.device('cpu', 200).index == -56`, which relies on implementation-defined integer wraparound. I inserted checks to only allow values between 0 and `c10::Device::MAX_NUM_DEVICES - 1`.
- Updated the `ArgumentInfo` struct, as it hardcodes the device index as an 8-bit field [^1]. This might be a breaking change; I'm not sure whether users rely on this.
- Introduced `c10::Device::MAX_NUM_DEVICES` as a replacement for the old `C10_COMPILE_TIME_MAX_GPUS`.

[^1]: This field was unsigned, so writing our default device index of -1 has always wrapped it around to 255 in the `ArgumentInfo` struct. While `DeviceIndex` was an `int8_t`, narrowing the value back on unpacking restored -1; once I switched `DeviceIndex` to `int16_t`, it stayed 255 after unpacking from `ArgumentInfo`, as the type was now wide enough that it didn't wrap back to -1.
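A minimal illustration of the wrap described in this footnote (`PackedArgumentInfo` is a hypothetical stand-in for the real struct):

```cpp
#include <cstdint>
#include <iostream>

// An unsigned 8-bit field cannot represent the default device index of -1.
struct PackedArgumentInfo {
  unsigned device : 8;
};

int main() {
  PackedArgumentInfo info;
  info.device = -1;  // wraps to 255 (modular arithmetic, well-defined for unsigned)
  std::cout << info.device << '\n';  // 255
  // Narrowing 255 back into an int8_t restored -1 on typical targets, which
  // masked the wrap; a 16-bit DeviceIndex holds 255 directly, so the value no
  // longer round-trips back to -1.
  std::cout << static_cast<int>(static_cast<int8_t>(info.device)) << '\n';  // -1
  std::cout << static_cast<int16_t>(info.device) << '\n';                   // 255
}
```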
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119639
Approved by: https://github.com/cyyever, https://github.com/albanD
2024-02-21 21:10:49 +00:00
albanD
54668ad6dc Cleanup max cuda device (#118779)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118779
Approved by: https://github.com/ezyang
2024-02-01 21:11:28 +00:00
Tobias Ringwald
77366ba637 Increased hardcoded limit for number of GPUs. (#115368)
Fixes #115331.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115368
Approved by: https://github.com/albanD
2023-12-18 18:39:19 +00:00
PyTorch MergeBot
722752fc28 Revert "Increased hardcoded limit for number of GPUs. (#115368)"
This reverts commit c039f01bd9.

Reverted https://github.com/pytorch/pytorch/pull/115368 on behalf of https://github.com/osalpekar due to This was reverted internally due to a release breakage ([comment](https://github.com/pytorch/pytorch/pull/115368#issuecomment-1854956224))
2023-12-14 01:28:01 +00:00
Tobias Ringwald
c039f01bd9 Increased hardcoded limit for number of GPUs. (#115368)
Fixes #115331.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115368
Approved by: https://github.com/albanD
2023-12-09 18:10:51 +00:00
Michael Dagitses
661d10aab4 use c10/macros/cmake_macros.h in fbcode build (#70851)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70851

This is a step towards OSS/fbcode convergence since OSS uses this file
in both CMake and Bazel.
ghstack-source-id: 147170896

Test Plan: Relying on the extensive CI internal tests for this.

Reviewed By: malfet

Differential Revision: D33299102

fbshipit-source-id: c650dd4755f8d696d5fce81c583d5c73782e3990
(cherry picked from commit 741ca140c8)
2022-01-19 20:56:12 +00:00
Dmytro Dzhulgakov
96ea2594d8 Don't call cudaStreamDestroy at destruction time (#15692)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15692

It was leading to occasional crashes with dynamically linked CUDA because the runtime was already destroyed.

Also, unique_ptr<T[]> is more suitable than deque<T> for this purpose.
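A sketch of the resulting pattern, assuming a simple fixed-size pool (names are illustrative, not the actual c10 code):

```cpp
#include <cuda_runtime.h>
#include <memory>

// A process-lifetime stream pool that deliberately never destroys its
// streams: calling cudaStreamDestroy from a static destructor can run after
// the dynamically linked CUDA runtime has already shut down and crash.
struct StreamPool {
  std::unique_ptr<cudaStream_t[]> streams;  // fixed size, so unique_ptr<T[]>
  int count;                                // fits better than deque<T>

  explicit StreamPool(int n) : streams(new cudaStream_t[n]), count(n) {
    for (int i = 0; i < n; ++i) {
      cudaStreamCreateWithFlags(&streams[i], cudaStreamNonBlocking);
    }
  }

  // Intentionally no cudaStreamDestroy here; the driver reclaims the streams
  // at process exit anyway.
  ~StreamPool() = default;
};
```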

Reviewed By: Yangqing

Differential Revision: D13571988

fbshipit-source-id: 37eb26dfbe361c49160367b53f87bd037c6c0e46
2019-01-11 12:36:41 -08:00
Edward Yang
928687bb24 Add c10 cuda library. (#13900)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13900

Add c10 cuda library.

Right now, this is not used by anything, and only tests whether the CUDA
headers are available (and not, e.g., that linking works).

Extra changes:
- cmake/public/cuda.cmake is now correctly include-guarded, so you
  can include it multiple times without trouble.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Reviewed By: smessmer

Differential Revision: D13025313

fbshipit-source-id: fda85b4c35783ffb48ddd6bbb98dbd9154119d86
2018-11-19 08:20:07 -08:00