Commit Graph

39 Commits

Author SHA1 Message Date
PyTorch MergeBot
bdd942efd7 Revert "Increase C10_COMPILE_TIME_MAX_GPUS to 128 (#144138)"
This reverts commit 6cfc081675.

Reverted https://github.com/pytorch/pytorch/pull/144138 on behalf of https://github.com/albanD due to This seems to impact the caffe2 code ([comment](https://github.com/pytorch/pytorch/pull/144138#issuecomment-2590891200))
2025-01-14 19:04:12 +00:00
cyy
6cfc081675 Increase C10_COMPILE_TIME_MAX_GPUS to 128 (#144138)
To facilitate further possible changes of DeviceIndex to int16_t.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144138
Approved by: https://github.com/albanD
2025-01-10 23:53:19 +00:00
Ashwin Hari
5f5778476a rename ort to maia (#123265)
Fixes #123264

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123265
Approved by: https://github.com/albanD
2024-04-23 00:33:25 +00:00
PyTorch MergeBot
a9d9077f12 Revert "Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639)"
This reverts commit 7c556428c7.

Reverted https://github.com/pytorch/pytorch/pull/119639 on behalf of https://github.com/kit1980 due to breaking internal builds, see D54286923 ([comment](https://github.com/pytorch/pytorch/pull/119639#issuecomment-1969634480))
2024-02-28 18:57:09 +00:00
Tobias Ringwald
7c556428c7 Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639)
Fixes #115331.

This PR increases the number of valid GPU devices to 512 (from 64) in order to future-proof PyTorch for providers that offer [single nodes with a large device count](https://www.tensorwave.com/). Until now, `DeviceIndex` was an `int8_t`, thus multiple changes were necessary:

- `DeviceIndex` changed to `int16_t`. Updated consumers that assume it to be an `int8_t`.
- Updated bounds checking for `torch.device()` in the Python frontend. Right now, we allow funny things like `torch.device('cpu', 200).index == -56`, which is undefined behavior. I inserted some checks to only allow values between 0 and `c10::Device::MAX_NUM_DEVICES - 1`.
- Updated the `ArgumentInfo` struct as it hardcodes the device index as 8 bit field [^1]. Might be a breaking change, not sure if users rely on this.
- Introduced `c10::Device::MAX_NUM_DEVICES` as a replacement for the old `C10_COMPILE_TIME_MAX_GPUS`

[^1]: This field was unsigned, so I guess this has also been undef behavior the whole time? Our default device index is -1, so this always wrapped around to 255 when written to the `ArgumentInfo` struct. When I switched the `DeviceIndex` to `int16_t`, it actually stayed 255 after unpacking from `ArgumentInfo` again, as the `DeviceIndex` was now wide enough that it didn't wrap back to -1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119639
Approved by: https://github.com/cyyever, https://github.com/albanD, https://github.com/huydhn
2024-02-27 07:05:48 +00:00
PyTorch MergeBot
fff9d98e58 Revert "Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639)"
This reverts commit e0268821dd.

Reverted https://github.com/pytorch/pytorch/pull/119639 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think the Window failures are legit as they are failing now in trunk, i.e. 450339ab2d ([comment](https://github.com/pytorch/pytorch/pull/119639#issuecomment-1958428416))
2024-02-22 00:12:54 +00:00
Tobias Ringwald
e0268821dd Increased compile time max GPUs to 512. Switched to int16_t DeviceIndex. (#119639)
Fixes #115331.

This PR increases the number of valid GPU devices to 512 (from 64) in order to future-proof PyTorch for providers that offer [single nodes with a large device count](https://www.tensorwave.com/). Until now, `DeviceIndex` was an `int8_t`, thus multiple changes were necessary:

- `DeviceIndex` changed to `int16_t`. Updated consumers that assume it to be an `int8_t`.
- Updated bounds checking for `torch.device()` in the Python frontend. Right now, we allow funny things like `torch.device('cpu', 200).index == -56`, which is undefined behavior. I inserted some checks to only allow values between 0 and `c10::Device::MAX_NUM_DEVICES - 1`.
- Updated the `ArgumentInfo` struct as it hardcodes the device index as 8 bit field [^1]. Might be a breaking change, not sure if users rely on this.
- Introduced `c10::Device::MAX_NUM_DEVICES` as a replacement for the old `C10_COMPILE_TIME_MAX_GPUS`

[^1]: This field was unsigned, so I guess this has also been undef behavior the whole time? Our default device index is -1, so this always wrapped around to 255 when written to the `ArgumentInfo` struct. When I switched the `DeviceIndex` to `int16_t`, it actually stayed 255 after unpacking from `ArgumentInfo` again, as the `DeviceIndex` was now wide enough that it didn't wrap back to -1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119639
Approved by: https://github.com/cyyever, https://github.com/albanD
2024-02-21 21:10:49 +00:00
cyy
99f222372b [5/N] Fixes clang-tidy warnings in c10/{core,util}/*.h (#115354)
This PR continues to fix clang-tidy warnings for headers in c10/core and c10/util.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115354
Approved by: https://github.com/Skylion007
2023-12-09 17:16:04 +00:00
cyy
d9fb7166d6 [BE] use DeviceIndex instead of int64_t for related device interfaces (#103068)
This PR unifies the device interfaces in aten/*cpp and torch/csrc/*cpp to use  **c10::DeviceIndex**.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103068
Approved by: https://github.com/malfet
2023-08-25 20:16:14 +00:00
Alan Ji
2d727c8c3f remove the duplicate method is_private_use1 in class Device (#107198)
In the `Device` class, there are two methods with similar functions called `is_private_use1` and `is_privateuseone`.
ddf36c82b8/c10/core/Device.h (L84-L87)

ddf36c82b8/c10/core/Device.h (L159-L162)
The former is not being utilized and therefore, this PR removes it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107198
Approved by: https://github.com/bdhirsh
2023-08-17 18:23:29 +00:00
Jun Luo
17a3141696 Support is_mtia() (#106396)
Summary: As title.

Test Plan: CI tests.

Reviewed By: yuhc

Differential Revision: D47937061

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106396
Approved by: https://github.com/yuhc, https://github.com/ezyang
2023-08-02 03:24:23 +00:00
Benson Ma
66a2600b6a [T153220354] Fix header inclusions in c10 (#1541) (#101846)
Summary:
This is a re-attempt to land the iwyu header changes, by taking the diff from [PR 100304](https://github.com/pytorch/pytorch/pull/100304), and adding the bare minimal changes to make the diff build corectly in the internal builds.

X-link: https://github.com/facebookresearch/pytorch3d/pull/1541

X-link: https://github.com/fairinternal/pytorch3d/pull/44

- Re-work D45769819 to fix header inclusions in c10

Test Plan:
```
buck2 build --no-remote-cache mode/dev-nosan //caffe2/c10/...

buck2 build --no-remote-cache mode/dev-nosan //deeplearning/fbgemm/fbgemm_gpu/...

buck2 build mode/dev-nosan //vision/fair/pytorch3d/pytorch3d:_C
```

Reviewed By: malfet

Differential Revision: D45920611

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101846
Approved by: https://github.com/malfet, https://github.com/Skylion007
2023-05-20 19:35:14 +00:00
PyTorch MergeBot
4eaaa08623 Revert "Fix header inclusions in c10 by iwyu (#100304)"
This reverts commit 6037ee8cc9.

Reverted https://github.com/pytorch/pytorch/pull/100304 on behalf of https://github.com/jeanschmidt due to Breaking meta internal builds and fbgemm builds ([comment](https://github.com/pytorch/pytorch/pull/100304#issuecomment-1543919257))
2023-05-11 12:37:35 +00:00
cyy
6037ee8cc9 Fix header inclusions in c10 by iwyu (#100304)
This work introduces include-what-you-use  support for c10 by a CMake option defaulting to off. We also remove some unused header inclusions and  fix a trivial inclusion error.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100304
Approved by: https://github.com/ezyang
2023-05-11 05:19:42 +00:00
PyTorch MergeBot
3271413e74 Revert "Fix header inclusions in c10 by iwyu (#100304)"
This reverts commit 39ec5fa722.

Reverted https://github.com/pytorch/pytorch/pull/100304 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, it is almost there but fails on Windows 39ec5fa722, which is in unstable mode after https://github.com/pytorch/pytorch/pull/100548 ([comment](https://github.com/pytorch/pytorch/pull/100304#issuecomment-1542975714))
2023-05-11 00:37:32 +00:00
cyy
39ec5fa722 Fix header inclusions in c10 by iwyu (#100304)
This work introduces include-what-you-use  support for c10 by a CMake option defaulting to off. We also remove some unused header inclusions and  fix a trivial inclusion error.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100304
Approved by: https://github.com/ezyang
2023-05-10 15:42:43 +00:00
shibo
da322ea874 Enable torch.jit.load for custom device (#99535)
Fixes #ISSUE_NUMBER
1、torch.jit.load for custom device
```
# custom device named `foo`
ts_model = torch.jit.script(mode.to(device="foo"))
ts_model.save("./ts.pt") # it is a script model on device `foo`

# and then we want to load it and run it
torch.jit.load("./ts.pt")
```
2、 add some extra key for custom device with `privateuse1`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99535
Approved by: https://github.com/albanD
2023-04-20 05:37:57 +00:00
ykddd
537c346117 feat(add method is_private_use1() in class Device) (#98123)
As the title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98123
Approved by: https://github.com/bdhirsh
2023-04-10 12:30:37 +00:00
Jun Luo
d47a4bf53f Align settings for new device key. (#98224)
Summary: As title.

Test Plan: All CI tests should pass.

Reviewed By: yuhc

Differential Revision: D44341331

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98224
Approved by: https://github.com/jackm321, https://github.com/ezyang
2023-04-04 08:39:11 +00:00
Kazuaki Ishizaki
64b8d20a5c Fix typos under c10 directory (#98079)
This PR fixes typos in comments and messages of files under `c10` directory

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98079
Approved by: https://github.com/Skylion007
2023-03-31 18:31:11 +00:00
Charlie West-Taylor
953f39578a Mark IPU device as not supports_as_strided (#89130)
Currently causes issues in calls to `.to`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89130
Approved by: https://github.com/albanD
2022-11-23 19:51:53 +00:00
Elias Ellison
2d93e1fada Add slow path for device
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77684

Approved by: https://github.com/ezyang
2022-05-24 21:56:01 +00:00
Kulin Seth
54c75e1e8f Add "mps" device to PyTorch framework.
Remove the "mlc" device for Mac platforms.

This commit will be followed up with:

* adding MPS runtime components
* PyTorch ops for MPS device

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76291
Approved by: https://github.com/albanD
2022-04-27 19:21:57 +00:00
Anthony Barbier
ce9e27a0fc Add new keys for Graphcore IPU (DispatchKey / Backend / DeviceType)
We need a key to register our out of tree backend: https://github.com/graphcore/poptorch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74763
Approved by: https://github.com/bdhirsh
2022-04-07 17:18:45 +00:00
Janet Yang
99db53eaa7 Jit save/load meta tensors (#73435)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73435

Add support for torch.jit.save and load for meta tensors to use in meta tensor based xl weights.

Test Plan:
```
buck test //caffe2/test:jit && -- -r .*save_load_meta_tensors.*
```

Reviewed By: houseroad

Differential Revision: D34479511

fbshipit-source-id: 117ccb12e9e427290a17297204508ec85495e3be
(cherry picked from commit ee9aaaf8208d6c9530c828a4b9f28cf2cca05630)
2022-03-10 19:48:29 +00:00
Sujoy Saraswati
c73f0e457e Tensor and device is_hpu methods (#65408)
Summary:
Add is_hpu() methods for Aten tensor and device

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65408

Reviewed By: malfet

Differential Revision: D31144227

Pulled By: wconstab

fbshipit-source-id: 115f4df4b8d54e6913dd51af7b6d4cacf6dd43c5
2021-09-23 18:42:45 -07:00
Alex Suhan
b176feec1e Add device and key for lazy tensors (#61621)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61621

Test Plan: CI

Reviewed By: mruberry

Differential Revision: D29912934

Pulled By: asuhan

fbshipit-source-id: 493c32063a3e756d93cbf1d876563a35eaafb537
2021-07-26 23:00:22 -07:00
Nicolas Weber
25e077bce1 [Issue 59296] added VE device (#59620)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/59296

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59620

Reviewed By: zou3519

Differential Revision: D29196830

Pulled By: ezyang

fbshipit-source-id: 7bb49f776dc755804a0ba0bc3a7dbdab9c93914e
2021-06-21 16:44:52 -07:00
Peter Bell
0e7b5ea6c0 nonzero: Default to transposed output strides (#59370)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46224

cc ailzhang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59370

Reviewed By: ezyang

Differential Revision: D29143842

Pulled By: ngimel

fbshipit-source-id: 5aa7a247b4a70cd816d0eed368ab4c445568c986
2021-06-16 22:50:38 -07:00
Scott Wolchok
44cc873fba [PyTorch] Autoformat c10 (#56830)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56830

Opt into formatting on GitHub and format everything. This is a trial run before turning on formatting for more and eventually all of the codebase.

Test Plan: CI

Reviewed By: zertosh

Differential Revision: D27979080

fbshipit-source-id: a80f0c48691c08ae8ca0af06377b87e6a2351151
2021-04-30 21:23:28 -07:00
cyy
d8730194e7 use device methods (#52899)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52899

Reviewed By: zou3519

Differential Revision: D26752203

Pulled By: albanD

fbshipit-source-id: eaef89377999b20655fe85d5a38ca7a2c5882de7
2021-03-02 20:14:23 -08:00
chengjun
4a8ef4525e Add new backend type for Intel heterogeneous computation platform. (#49786)
Summary:
Add a new device type 'XPU' ('xpu' for lower case) to PyTorch. Changes are needed for code related to device model and kernel dispatch, e.g. DeviceType, Backend and DispatchKey etc.

https://github.com/pytorch/pytorch/issues/48246

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49786

Reviewed By: mrshenli

Differential Revision: D25893962

Pulled By: ezyang

fbshipit-source-id: 7ff0a316ee34cf0ed6fc7ead08ecdeb7df4b0052
2021-01-20 08:15:18 -08:00
Brian Hirsh
f54ab8fbfe Revert "Revert D25003113: make validate debug-only in Device copy ctr" (#49123)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49123

This reverts commit 7a4a2df225.

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D25463531

Pulled By: bdhirsh

fbshipit-source-id: 7c7ecdc1d63ffd137b84a129887c424b2083a958
2020-12-14 07:33:37 -08:00
Supriya Rao
7a4a2df225 Revert D25003113: make validate debug-only in Device copy ctr
Test Plan: revert-hammer

Differential Revision:
D25003113 (4b26cafb8f)

Original commit changeset: e17e6495db65

fbshipit-source-id: fd636c954a97bd80892464feb974a11b9dd96899
2020-12-09 13:58:11 -08:00
Brian Hirsh
4b26cafb8f make validate debug-only in Device copy ctr (#47854)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47854

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D25003113

Pulled By: bdhirsh

fbshipit-source-id: e17e6495db65c48c7daf3429acbd86742286a1f3
2020-12-09 08:11:24 -08:00
Scott Wolchok
4c9eb57914 [PyTorch] Narrow Device to 2 bytes by narrowing DeviceType and DeviceIndex (#47023)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47023

DeviceType pretty clearly only needs 1 byte. DeviceIndex only needs 1 byte given that machines don't have anywhere near 255 GPUs in them as far as I know.
ghstack-source-id: 116901430

Test Plan: Existing tests, added assertion to catch if my assumption about DeviceIndex is incorrect

Reviewed By: dzhulgakov

Differential Revision: D24605460

fbshipit-source-id: 7c9a89027fcf8eebd623b7cdbf6302162c981cd2
2020-11-18 19:39:40 -08:00
Jeremy Lilley
abf55eb3a8 Pickler: convert std::stringstream cases. (#29351)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29351

When torch::save()ing a smallish tensor, we spend ~5% of the time
still in std::stringstream constructors.

This removes the last couple of cases. Benchmark shows ~5% improvement:
  TorchSaveSmallTensor Pre: 13.12us
  TorchSaveSmallTensor Post: 12.48us
ghstack-source-id: 93517928

Test Plan:
buck build mode/opt experimental/jeremyl/c2:
   buck-out/opt/gen/experimental/jeremyl/c2/SerializationBench  --bm_regex=TorchSaveSmallTensor

Differential Revision: D18365066

fbshipit-source-id: a3284bec004751cedae1cdadf27f969422faff8e
2019-11-08 14:26:40 -08:00
Sam Gross
dee11a92c1 Use Device instead of Backend in TensorIterator (#20690)
Summary:
This PR also moves Device::validate into the header file, which makes
statements like `Device d = kCPU` effectively free.

Device includes the device's index, so TensorIterator::compute_types
now implicitly checks that all CUDA inputs are on the same GPU.
Previously, this was done ad-hoc in places like TensorIterator::binary_op.

Note that zero-dim Tensor (scalars) are NOT required to be on the
same device as other inputs because they behave almost like Python numbers.
TensorIterator handles copying zero-dim Tensors to the common device.

Prior to this PR, TensorIterator would copy zero-dim Tensors between CPU
and GPU, but not between different GPUs (because Backend didn't encode
the GPU index). This removes that restriction.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20690

Differential Revision: D15414826

Pulled By: colesbury

fbshipit-source-id: 1d0ad1f7d663252af36dd4590bcda418c2f7a09f
2019-05-24 12:14:08 -07:00
Sebastian Messmer
d408324350 Move files to/from c10/core and c10/util (#15316)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15316

This starts cleaning up the files in c10 according to the module structure we decided on.

Move to c10/util:
- Half.h, Half-inl.h, Half.cpp, bitcasts.h

Move to c10/core:
- Device.h, Device.cpp
- DeviceType.h, DeviceType.cpp

i-am-not-moving-c2-to-c10

Reviewed By: dzhulgakov

Differential Revision: D13498493

fbshipit-source-id: dfcf1c490474a12ab950c72ca686b8ad86428f63
2019-01-10 16:22:22 -08:00