Summary:
Apple recently announced ML Compute, a new framework available in macOS Big Sur, which enables users to accelerate the training of neural networks on Mac hardware. This PR is the first in a series of PRs that will enable the integration with ML Compute. Most of the integration code will live in a separate subrepo named `mlc`.
The integration with `mlc` (ML Compute) will be very similar to that of xla. We rely on registering our ops through:
TORCH_LIBRARY_IMPL(aten, PrivateUse1, m) {
  m.impl_UNBOXED(<op_schema_name>, &customized_op_kernel);
  ...
}
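To illustrate the registration pattern above, here is a toy Python model of per-backend kernel registration. It is only a sketch and not PyTorch's dispatcher: the names `KERNELS`, `register_impl`, and `dispatch` are hypothetical, and the kernels are placeholders.

```python
# Toy model of per-backend kernel registration; NOT the real PyTorch dispatcher.
# All names here (KERNELS, register_impl, dispatch) are hypothetical.
KERNELS = {}  # maps (op_name, dispatch_key) -> kernel function


def register_impl(op_name, dispatch_key):
    """Decorator that records a kernel for one op under one dispatch key."""
    def decorator(fn):
        KERNELS[(op_name, dispatch_key)] = fn
        return fn
    return decorator


@register_impl("aten::add", "CPU")
def add_cpu(a, b):
    # Reference kernel operating on plain Python lists.
    return [x + y for x, y in zip(a, b)]


@register_impl("aten::add", "PrivateUse1")
def add_custom(a, b):
    # Stand-in for a backend-specific kernel (e.g. one backed by ML Compute).
    return [x + y for x, y in zip(a, b)]


def dispatch(op_name, dispatch_key, *args):
    """Look up the kernel registered for (op, key) and call it."""
    try:
        kernel = KERNELS[(op_name, dispatch_key)]
    except KeyError:
        raise NotImplementedError(
            f"{op_name} has no kernel for {dispatch_key}"
        ) from None
    return kernel(*args)
```

In this model, an op with no kernel registered under a given key fails at dispatch time instead of silently falling back to another backend.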
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50634
Reviewed By: malfet
Differential Revision: D26614213
Pulled By: smessmer
fbshipit-source-id: 3b492b346c61cc3950ac880ac01a82fbdddbc07b
Summary:
Add a new device type 'XPU' ('xpu' in lower case) to PyTorch. This requires changes to code related to the device model and kernel dispatch, e.g. DeviceType, Backend, and DispatchKey.
https://github.com/pytorch/pytorch/issues/48246
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49786
Reviewed By: mrshenli
Differential Revision: D25893962
Pulled By: ezyang
fbshipit-source-id: 7ff0a316ee34cf0ed6fc7ead08ecdeb7df4b0052
Summary:
ezyang,
I have added the changes to DispatchKey, DeviceType, and Backend needed to support an out-of-tree FPGA backend.
cc. tataetae
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38938
Differential Revision: D21748955
Pulled By: ezyang
fbshipit-source-id: fe76d9730818205961430d2a0e00727b5c547b32
Summary:
This PR updates the error message for an unrecognized torch device string so that it lists `xla` as an accepted device type prefix.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36446
Test Plan:
No logic changed. Verified that `xla` is accepted by `torch.device`:
```
import torch
device = torch.device("xla")
```
```
device = torch.device("unrecognized")
RuntimeError: Expected one of cpu, cuda, mkldnn, opengl, opencl, ideep, hip, msnpu, xla device type at start of device string: unrecognized
```
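The check behind this message can be sketched in plain Python. This is a minimal model, not the c10::Device parser: `DEVICE_TYPES` and `parse_device` are hypothetical names, the accepted list is copied from the error message above, and the wording of the index error is an assumption.

```python
# Sketch of device-string validation; a simplified model, not c10::Device.
# The accepted list mirrors the error message quoted above.
DEVICE_TYPES = (
    "cpu", "cuda", "mkldnn", "opengl", "opencl", "ideep", "hip", "msnpu", "xla",
)


def parse_device(device_string):
    """Split 'type' or 'type:index' and validate the type prefix."""
    device_type, sep, index_text = device_string.partition(":")
    if device_type not in DEVICE_TYPES:
        raise RuntimeError(
            "Expected one of " + ", ".join(DEVICE_TYPES)
            + " device type at start of device string: " + device_string
        )
    if not sep:
        return device_type, None  # no explicit index given
    if not index_text.isdecimal():
        # Hypothetical message; the real wording is not shown in this PR.
        raise RuntimeError("Invalid device index in device string: " + device_string)
    return device_type, int(index_text)
```

Building the message from the same tuple used for validation keeps the two in sync when a new device type is added.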
Differential Revision: D20993449
Pulled By: dahsh
fbshipit-source-id: 83afe4f913a650a655bfda9c2a64bf9e5aa27e16
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29351
When torch::save()ing a smallish tensor, we spend ~5% of the time
still in std::stringstream constructors.
This removes the last couple of cases. Benchmark shows ~5% improvement:
TorchSaveSmallTensor Pre: 13.12us
TorchSaveSmallTensor Post: 12.48us
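As a quick arithmetic check of the ~5% figure, using only the two timings above (variable names are illustrative):

```python
# Relative improvement computed from the benchmark timings quoted above.
pre_us = 13.12   # TorchSaveSmallTensor before the change, in microseconds
post_us = 12.48  # TorchSaveSmallTensor after the change, in microseconds

improvement = (pre_us - post_us) / pre_us
improvement_percent = round(improvement * 100, 1)  # about 4.9
```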
ghstack-source-id: 93517928
Test Plan:
buck build mode/opt experimental/jeremyl/c2:
buck-out/opt/gen/experimental/jeremyl/c2/SerializationBench --bm_regex=TorchSaveSmallTensor
Differential Revision: D18365066
fbshipit-source-id: a3284bec004751cedae1cdadf27f969422faff8e
Summary:
This PR also moves Device::validate into the header file, which makes
statements like `Device d = kCPU` effectively free.
Device includes the device's index, so TensorIterator::compute_types
now implicitly checks that all CUDA inputs are on the same GPU.
Previously, this was done ad-hoc in places like TensorIterator::binary_op.
Note that zero-dim Tensors (scalars) are NOT required to be on the
same device as other inputs because they behave almost like Python numbers.
TensorIterator handles copying zero-dim Tensors to the common device.
Prior to this PR, TensorIterator would copy zero-dim Tensors between CPU
and GPU, but not between different GPUs (because Backend didn't encode
the GPU index). This removes that restriction.
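The device rule described above can be sketched as follows. This is a simplified model under stated assumptions, not the TensorIterator code: `FakeTensor` and `common_device` are hypothetical, a device is modeled as a `(type, index)` tuple, and the fallback used when every operand is zero-dim is a choice made for this sketch.

```python
from collections import namedtuple

# Hypothetical stand-in for a tensor: only the fields this sketch needs.
# `device` is a (type, index) tuple such as ("cuda", 1).
FakeTensor = namedtuple("FakeTensor", ["device", "ndim"])


def common_device(operands):
    """Return the device shared by all non-scalar operands.

    Zero-dim operands are skipped, mirroring the rule that scalars need not
    live on the same device. If every operand is zero-dim, the first
    operand's device is used (an assumption of this sketch).
    """
    result = None
    for op in operands:
        if op.ndim == 0:
            continue  # scalars are copied to the common device later
        if result is None:
            result = op.device
        elif op.device != result:
            # Comparing full devices also catches cuda:0 vs cuda:1.
            raise RuntimeError(f"expected device {result} but got {op.device}")
    if result is None and operands:
        result = operands[0].device
    return result
```

Because the comparison includes the index, two non-scalar CUDA operands on different GPUs are rejected, while a zero-dim operand on another GPU is accepted.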
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20690
Differential Revision: D15414826
Pulled By: colesbury
fbshipit-source-id: 1d0ad1f7d663252af36dd4590bcda418c2f7a09f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15316
This starts cleaning up the files in c10 according to the module structure we decided on.
Move to c10/util:
- Half.h, Half-inl.h, Half.cpp, bitcasts.h
Move to c10/core:
- Device.h, Device.cpp
- DeviceType.h, DeviceType.cpp
i-am-not-moving-c2-to-c10
Reviewed By: dzhulgakov
Differential Revision: D13498493
fbshipit-source-id: dfcf1c490474a12ab950c72ca686b8ad86428f63