Commit Graph

52 Commits

Author SHA1 Message Date
Wei-Sheng Chin
bca75fe97a [MAIA] [Autocast] Enable autocast on MAIA device (#148511)
Fixes #148510.
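
A minimal usage sketch (assuming the out-of-tree MAIA extension is installed and has registered the `maia` device; the model and shapes here are illustrative):
```python
import torch

# Hedged sketch: requires the out-of-tree MAIA backend extension.
model = torch.nn.Linear(8, 8).to("maia")
x = torch.randn(2, 8, device="maia")
with torch.autocast(device_type="maia", dtype=torch.float16):
    y = model(x)  # eligible ops run in reduced precision under autocast
```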

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148511
Approved by: https://github.com/albanD
2025-03-18 03:46:22 +00:00
cyy
288df042c5 [1/N] Change static functions in headers to inline (#127727)
Static functions defined in headers get a separate copy in every translation unit that includes them; changing them to inline lets the linker merge the definitions, which may fix some tricky linking issues.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127727
Approved by: https://github.com/ezyang
2024-06-03 04:34:36 +00:00
Ashwin Hari
5f5778476a rename ort to maia (#123265)
Fixes #123264

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123265
Approved by: https://github.com/albanD
2024-04-23 00:33:25 +00:00
Pearu Peterson
70d4d109f2 Make SparseCsr a functionality dispatch key (#120703)
As in the title.

This enables meta and fake tensor support for sparse compressed tensors, in line with the existing meta/fake tensor support for sparse COO tensors.
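
A hedged sketch of what this enables, using the public CSR constructor:
```python
import torch

csr = torch.sparse_csr_tensor(
    torch.tensor([0, 2, 4]),             # crow_indices
    torch.tensor([0, 1, 0, 1]),          # col_indices
    torch.tensor([1.0, 2.0, 3.0, 4.0]),  # values
    size=(2, 2),
)
meta_csr = csr.to("meta")  # sketch: carries layout and shape, but no data
```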

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120703
Approved by: https://github.com/ezyang
2024-03-01 13:28:46 +00:00
shibo19
af50efca24 add nested/sparse/quantized tensor key for privateuse1 (#102696)
Add nested/sparse/quantized tensor keys for privateuse1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102696
Approved by: https://github.com/bdhirsh
2023-06-02 22:35:52 +00:00
shibo
da322ea874 Enable torch.jit.load for custom device (#99535)
1. `torch.jit.load` for a custom device:
```
# custom device named `foo`
ts_model = torch.jit.script(model.to(device="foo"))
ts_model.save("./ts.pt")  # a scripted model on device `foo`

# later, load and run it
loaded = torch.jit.load("./ts.pt")
```
2. Add some extra keys for custom devices using `privateuse1`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99535
Approved by: https://github.com/albanD
2023-04-20 05:37:57 +00:00
Jun Luo
d47a4bf53f Align settings for new device key. (#98224)
Summary: As title.

Test Plan: All CI tests should pass.

Reviewed By: yuhc

Differential Revision: D44341331

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98224
Approved by: https://github.com/jackm321, https://github.com/ezyang
2023-04-04 08:39:11 +00:00
Hangchen Yu
5a0fa04a49 Add MTIA DeviceType for Meta training and inference devices (#92232)
Summary: This adds a new MTIA DeviceType which is associated with the MTIA DispatchKey and will be used for the Meta in-house training and inference accelerators.

Test Plan: All CI should pass.

Differential Revision: D42526044

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92232
Approved by: https://github.com/ezyang
2023-01-16 12:20:23 +00:00
Edward Z. Yang
5ca24c60a1 Add Meta backend
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78803

Approved by: https://github.com/bdhirsh
2022-06-09 15:28:46 +00:00
Hangchen Yu
abb6fab0f4 Add new PrivateUse1 DeviceType for non-public devices (#77208)
Summary: The new PrivateUse1 DeviceType is associated with the PrivateUse1 DispatchKey, which can be used for non-public devices without introducing a new device type. Note that the stringified name of the PrivateUse1 device is "privateuseone".
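
A hedged sketch of the user-visible side (this assumes an out-of-tree backend has registered kernels under PrivateUse1):
```python
import torch

# "privateuseone" is the stringified name of the PrivateUse1 device.
dev = torch.device("privateuseone")
x = torch.empty(4, device=dev)  # dispatches to the out-of-tree backend's kernels
```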

Test Plan: All CI should pass.

Differential Revision: D35859437

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77208
Approved by: https://github.com/bdhirsh
2022-05-13 16:03:27 +00:00
Kulin Seth
54c75e1e8f Add "mps" device to PyTorch framework.
Remove the "mlc" device for Mac platforms.

This commit will be followed up with:

* adding MPS runtime components
* PyTorch ops for MPS device
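
A minimal usage sketch (the availability check shown here is part of the follow-up runtime work, so treat it as an assumption):
```python
import torch

if torch.backends.mps.is_available():
    x = torch.ones(3, device="mps")
    y = (x * 2).cpu()  # compute on the Apple GPU, copy back to CPU
```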


Pull Request resolved: https://github.com/pytorch/pytorch/pull/76291
Approved by: https://github.com/albanD
2022-04-27 19:21:57 +00:00
Anthony Barbier
ce9e27a0fc Add new keys for Graphcore IPU (DispatchKey / Backend / DeviceType)
We need a key to register our out of tree backend: https://github.com/graphcore/poptorch
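
A hedged sketch of how the out-of-tree backend is used (API names taken from the poptorch project; not part of this PR):
```python
import torch
import poptorch  # out-of-tree Graphcore backend (github.com/graphcore/poptorch)

model = torch.nn.Linear(4, 4)
ipu_model = poptorch.inferenceModel(model)  # compile for and run on the IPU
out = ipu_model(torch.randn(1, 4))
```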
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74763
Approved by: https://github.com/bdhirsh
2022-04-07 17:18:45 +00:00
Aaron Bockover
c78ab28441 Add support for the ONNX Runtime Eager Mode backend (#58248)
Summary:
This PR implements the necessary hooks/stubs/enums/etc for complete ONNX Runtime (ORT) Eager Mode integration. The actual extension will live out of tree at https://github.com/pytorch/ort.

We have been [working on this at Microsoft](https://github.com/microsoft/onnxruntime-pytorch/tree/eager-ort/torch_onnxruntime) for the last few months, and are finally ready to contribute the PyTorch core changes upstream (nothing major or exciting, just the usual boilerplate for adding new backends).

The ORT backend will allow us to ferry [almost] all torch ops into granular ONNX kernels that ORT will eagerly execute against any devices it supports (therefore, we only need a single ORT backend from a PyTorch perspective).
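
A hedged sketch of the intended eager-mode usage (the extension module name is an assumption; the actual extension lives at https://github.com/pytorch/ort):
```python
import torch
import torch_ort  # assumed name of the out-of-tree extension registering the "ort" device

x = torch.ones(4, device="ort")
y = x + x  # each op is dispatched eagerly to an ONNX Runtime kernel
```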

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58248

Reviewed By: astaff

Differential Revision: D30344992

Pulled By: albanD

fbshipit-source-id: 69082b32121246340d686e16653626114b7714b2
2021-08-20 11:17:13 -07:00
Alex Suhan
b176feec1e Add device and key for lazy tensors (#61621)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61621

Test Plan: CI

Reviewed By: mruberry

Differential Revision: D29912934

Pulled By: asuhan

fbshipit-source-id: 493c32063a3e756d93cbf1d876563a35eaafb537
2021-07-26 23:00:22 -07:00
Nicolas Weber
25e077bce1 [Issue 59296] added VE device (#59620)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/59296

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59620

Reviewed By: zou3519

Differential Revision: D29196830

Pulled By: ezyang

fbshipit-source-id: 7bb49f776dc755804a0ba0bc3a7dbdab9c93914e
2021-06-21 16:44:52 -07:00
Sujoy Saraswati
3c973de543 HABANA Device registration key and Autograd key addition (#57094)
Summary:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57094

Reviewed By: mruberry

Differential Revision: D28355895

Pulled By: wconstab

fbshipit-source-id: 5d8b5762a69f444f4fe7f476891150fa5483d893
2021-05-12 13:07:33 -07:00
Scott Wolchok
44cc873fba [PyTorch] Autoformat c10 (#56830)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56830

Opt into formatting on GitHub and format everything. This is a trial run before turning on formatting for more and eventually all of the codebase.

Test Plan: CI

Reviewed By: zertosh

Differential Revision: D27979080

fbshipit-source-id: a80f0c48691c08ae8ca0af06377b87e6a2351151
2021-04-30 21:23:28 -07:00
haozhe.zhu
ab20ba4427 Fix issue with dispatch key: AutogradXPU (#56336)
Summary:
Automatically add the dispatch key "AutogradXPU" for "xpu" tensors, and register a fallthrough for AutogradXPU.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56336

Reviewed By: heitorschueroff

Differential Revision: D27872125

Pulled By: ailzhang

fbshipit-source-id: c120c62becd577699f9aecb4c356c889bd37ad06
2021-04-20 12:09:59 -07:00
Sameer Deshmukh
5fb1142702 Add CSR (compressed sparse row) layout for sparse tensors (#50937)
Summary:
Implement compressed sparse row format. Derived from the GCS implementation at https://github.com/pytorch/pytorch/pull/44190
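
A short sketch of the layout added here (a minimal example, not taken from the PR itself):
```python
import torch

# CSR: a row-pointer array of length nrows + 1, plus per-nonzero column indices and values.
crow_indices = torch.tensor([0, 2, 4])
col_indices = torch.tensor([0, 1, 0, 1])
values = torch.tensor([1.0, 2.0, 3.0, 4.0])
csr = torch.sparse_csr_tensor(crow_indices, col_indices, values, size=(2, 2))
assert csr.layout == torch.sparse_csr
print(csr.to_dense())
```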

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50937

Reviewed By: mrshenli

Differential Revision: D27439865

Pulled By: ezyang

fbshipit-source-id: 3ba3dcb9679505b980ff6a5f513e913bbae2fb1d
2021-04-12 10:09:12 -07:00
Edward Yang
e0aebe241d Refactor tensor_new.cpp to use TensorOptions instead of DispatchKey (#54034)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54034

Fixes #53544

I had to touch a bunch of lines but the refactoring was fairly
mechanical.  Here's how it works.

The basic concept behind this PR is that tensor_new.cpp was previously
abusing DispatchKey when it actually meant TensorOptions.  The provided
DispatchKey argument to most of the constructor functions typically
comes from torch::tensors::get_default_dispatch_key();  it doesn't
really make sense for people to set the default dispatch key, but
this got grandfathered in due to the old API set_default_tensor_type
(where the "Type" concept got refactored into "DispatchKey" concept
over time).  See also #53124.  But the upshot is that, semantically,
what we refer to as the default dispatch key really is more like
torch.set_default_tensor_type(torch.Tensor) versus
torch.set_default_tensor_type(torch.cuda.Tensor): clearly the user
wants to do something about *construction* of the tensor, and
TensorOptions captures that exactly.
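
A hedged illustration of the user-facing semantics being described, using the (long-deprecated) default-tensor-type API:
```python
import torch

# Setting the "default tensor type" only affects how new tensors are *constructed*:
torch.set_default_tensor_type(torch.cuda.FloatTensor)  # deprecated API; needs CUDA
t = torch.tensor([1.0])  # lands on CUDA because of the construction default
```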

So, how exactly to translate from one to the other?
- Sources (things that used to PRODUCE DispatchKey)
  - Most top level functions take a DispatchKey as their argument.  I
    use the new function dispatchKeyToTensorOptions to convert it into
    a TensorOptions
  - typeIdWithDefault now produces a TensorOptions (probably could do
    with a rename, though I didn't)
- Sinks (things that used to CONSUME DispatchKey)
  - Previously, the function options() was typically used to convert the
    DispatchKey into a TensorOptions.  Now its replacement build_options
    just takes a TensorOptions and sets some extra fields on it.
    Irritatingly, I can't just replace
    `build_options(options, scalar_type, device)` with
    `options.dtype(scalar_type).device(device)` because the semantics
    are slightly different: if device is nullopt, we should preserve
    the usage of the device specified in options (what options.device()
    does is overwrite the device unconditionally; e.g., if device is
    nullopt, unset device from options)
  - The other major sink for DispatchKey was `internal_new_from_data`,
    but it turns out it only really extracts the device type from
    the dispatch key.  Now it just pulls out the device from
    TensorOptions.
- To actually do the translation of DispatchKey to TensorOptions, I
  introduce new functions dispatchKeyToLayout (replicating
  layout_from_backend--there are still a few uses of this function
  so I couldn't delete it) and dispatchKeyToDeviceType (replacing
  computeDeviceType)
- In all internal functions, whenever DispatchKey is taken as an argument,
  I instead take TensorOptions as an argument, and pass it along.
- Anywhere `legacyExtractDispatchKey(other.key_set())` equality was
  previously used, I now do `other.options().type_equal()`, which
  is the intended BC for doing "backend to backend" comparisons
- There are a few places in the sparse constructors where we allocated
  a tensor for values, and then read out the dispatch key from the
  result to allocate the keys.  As best as I can tell, this is totally
  equivalent to just passing in the options to both values and indices
  (the only difference is dtype, which is captured via a separate
  argument)

This refactor doesn't really go far enough: for example, there are now
functions that take both TensorOptions and ScalarType, when really
the TensorOptions can capture this all.  I kept it solely just
s/DispatchKey/TensorOptions/ to reduce the number of possible bugs;
also, a lot of this will be mooted by a proper fix to #53124.

Even with this limited refactor, the payoff is sweet.  I can delete:

- backendToCPU
- backendToXPU
- backendToCUDA
- backendToHIP
- backendToBackendOfDeviceType

The reason I can do this is because I can simply overwrite layout in TensorOptions
to do the conversion, rather than having to type out each backend case
explicitly.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D27109509

Pulled By: ezyang

fbshipit-source-id: 91d16cfbc390127770362ac04fb43f7e070077e9
2021-03-19 09:08:32 -07:00
Edward Yang
7e7533b2e2 Delete denseTypeIdWithDefault and toDense (#54016)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54016

I managed to convince myself that typeIdWithDefault was sufficient for
the sparse constructor case.  Here is the reasoning.

The surface reading of the use site of denseTypeIdWithDefault is
to convert what could be a sparse dispatch key into the dense version
so we can properly allocate underlying dense tensors for the sparse
constructor call.  But WHERE does this dispatch key come from?
Inspection of call sites reveals that dispatch key is provided by
torch::tensors::get_default_dispatch_key().  This key is NEVER
sparse, as that would correspond to setting sparse tensors to be
the default tensor via torch.set_default_tensor_type() (which is
forbidden, and even if it worked most of everything in PyTorch would
break).  That means that typeIdWithDefault is a sufficient replacement.

With denseTypeIdWithDefault removed, we can also delete toDense
as this was the sole use of that function.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D27109511

Pulled By: ezyang

fbshipit-source-id: c698eff0ab54c0c101fe9f55be3b7657584c4372
2021-03-17 12:28:55 -07:00
Edward Yang
99098c1d70 Delete dead Backend toSparse (#53116)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53116

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D26753226

Pulled By: ezyang

fbshipit-source-id: 2941876d546c39ee3913c2ffffdb0a0ea7360f0c
2021-03-03 11:22:03 -08:00
Lance Ware
fdd25f82c9 Update to replace AT_ERROR with TORCH_CHECK (#52711)
Summary:
Fixes #52699

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52711

Reviewed By: ailzhang

Differential Revision: D26654677

Pulled By: malfet

fbshipit-source-id: 97079250d144c9b1c69028f35e4a23a34481b2a5
2021-02-25 19:51:29 -08:00
Bel H
30cb6ac53c Introduce mlc device (ML Compute device) to PyTorch's device list (#50634)
Summary:
Apple recently announced ML Compute, a new framework available in macOS Big Sur, which enables users to accelerate the training of neural networks on Mac hardware. This PR is the first in a series of PRs that will enable the integration with ML Compute. Most of the integration code will live in a separate subrepo named `mlc`.
The integration with `mlc` (ML Compute) will be very similar to that of xla. We rely on registering our ops through:

TORCH_LIBRARY_IMPL(aten, PrivateUse1, m) {
 m.impl_UNBOXED(<op_schema_name>, &customized_op_kernel)
 ...
}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50634

Reviewed By: malfet

Differential Revision: D26614213

Pulled By: smessmer

fbshipit-source-id: 3b492b346c61cc3950ac880ac01a82fbdddbc07b
2021-02-24 22:39:11 -08:00
chengjun
4a8ef4525e Add new backend type for Intel heterogeneous computation platform. (#49786)
Summary:
Add a new device type 'XPU' ('xpu' for lower case) to PyTorch. Changes are needed for code related to device model and kernel dispatch, e.g. DeviceType, Backend and DispatchKey etc.
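
A hedged usage sketch (this assumes an out-of-tree XPU runtime such as intel_extension_for_pytorch is installed and registered):
```python
import torch
import intel_extension_for_pytorch  # out-of-tree runtime (assumption)

x = torch.ones(3, device="xpu")
y = x * 2
```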

https://github.com/pytorch/pytorch/issues/48246

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49786

Reviewed By: mrshenli

Differential Revision: D25893962

Pulled By: ezyang

fbshipit-source-id: 7ff0a316ee34cf0ed6fc7ead08ecdeb7df4b0052
2021-01-20 08:15:18 -08:00
Tao Xu
a277c097ac [iOS][GPU] Add Metal/MPSCNN support on iOS (#46112)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46112

### Summary

This PR adds the support of running torchscript models on iOS GPU via Metal (Inference only). The feature is currently in prototype state, API changes are expected. The tutorial and the documents will be added once it goes to beta.


- Users API

```
  auto module = torch::jit::load(model);
  module.eval();
  at::Tensor input = at::ones({1,3,224,224}, at::ScalarType::Float).metal();
  auto output = module.forward({input}).toTensor().cpu();
```
- Supported Models
    - Person Segmentation v106 (FB Internal)
    - Mobilenetv2

- Supported Operators
    - aten::conv2d
    - aten::addmm
    - aten::add.Tensor
    - aten::sub.Tensor
    - aten::mul.Tensor
    - aten::relu
    - aten::hardtanh
    - aten::hardtanh_
    - aten::sigmoid
    - aten::max_pool2d
    - aten::adaptive_avg_pool2d
    - aten::reshape
    - aten::t
    - aten::view
    - aten::log_softmax.int
    - aten::upsample_nearest2d.vec

- Supported Devices
    - Apple A9 and above
    - iOS 10.2 and above

- CMake scripts
    - `IOS_ARCH=arm64 ./scripts/build_ios.sh -DUSE_METAL=ON`

### Test Plan

- Circle CI

ghstack-source-id: 114155638

Test Plan:
1. Sandcastle CI
2. Circle CI

Reviewed By: dreiss

Differential Revision: D23236555

fbshipit-source-id: 98ffc48b837e308bc678c37a9a5fd8ae72d11625
2020-10-13 01:46:56 -07:00
Ailing Zhang
224232032c Move Autograd to an alias dispatch key (#43070)
Summary:
This PR moves `DispatchKey::Autograd` to an alias dispatch key mapping to `AutogradCPU, AutogradCUDA, AutogradXLA, AutogradOther, AutogradPrivate*` keys.

A few things are handled in this PR:
- Update alias dispatch key mapping and precompute dispatchTable logic
- Move `Autograd` key from `always_included` set to TensorImpl constructor.
- Update `dummyTensor` constructor to take `requires_grad` as optional argument so that it's closer to the real application in op_registration_test.
- Use `BackendSelect` key for both backend select before and after autograd layer. (1 liner in backend_select codegen)

A few planned followups ordered by priority:
- [cleanup] Update `test_dispatch.py` to include testing `Autograd`.
- [cleanup] Add Math alias key and move catchAll to Math. (to remove 2.2 in `computeDispatchTableEntryWithDebug`)
- [new feature] Add support for Math in native_functions.yaml
- [cleanup] Add iterator like functionality to DispatchKeySet
- [cleanup/large] Only add Autograd backend keys when tensor requires grad. (cc: ljk53 ?)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43070

Reviewed By: ezyang

Differential Revision: D23281535

Pulled By: ailzhang

fbshipit-source-id: 9ad00b17142e9b83304f63cf599f785500f28f71
2020-09-01 09:05:29 -07:00
Ailing Zhang
7cb8d68ae1 Rename XLAPreAutograd to AutogradXLA. (#43047)
Summary:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43047

Reviewed By: ezyang

Differential Revision: D23134326

Pulled By: ailzhang

fbshipit-source-id: 5fcbc23755daa8a28f9b03af6aeb3ea0603b5c9a
2020-08-17 10:47:43 -07:00
Dylan Bespalko
c767d65caf Added FPGA DispatchKey, DeviceType, Backend (#38938)
Summary:
ezyang,

I have added the changes to DispatchKey, DeviceType, Backend to support the out-of-tree FPGA.

cc. tataetae
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38938

Differential Revision: D21748955

Pulled By: ezyang

fbshipit-source-id: fe76d9730818205961430d2a0e00727b5c547b32
2020-06-03 07:28:14 -07:00
Ivan Kobzarev
b460465a18 [Mobile GPU][Integration] Vulkan backend integration (#36491)
Summary:
This PR contains the initial version of Vulkan (GPU) Backend integration.
The primary target environment is Android, but the desktop build is also supported.

## CMake
Introducing three cmake options:
USE_VULKAN:
The main switch; if it is off, the other options have no effect.
USE_VULKAN_WRAPPER:
ON - Vulkan is loaded at runtime as "libvulkan.so" via libdl, with every function call wrapped in vulkan_wrapper.h.
OFF - link against libvulkan.so directly.
USE_VULKAN_SHADERC_RUNTIME:
ON - the shader compilation library is linked, and shaders are compiled at runtime.
OFF - shaders are precompiled, and the shader compilation library is not included.

## Codegen
Shader codegen starts in cmake/VulkanCodegen.cmake.
If `USE_VULKAN_SHADERC_RUNTIME` is ON: `aten/src/ATen/native/vulkan/gen_glsl.py` embeds the GLSL shader source in the binary as `glsl.h`/`glsl.cpp`, to be compiled at runtime.
If `USE_VULKAN_SHADERC_RUNTIME` is OFF: `aten/src/ATen/native/vulkan/gen_spv.py` precompiles the shaders and embeds the SPIR-V bytecode in the binary as a uint32_t array in `spv.h`/`spv.cpp`.

All codegen results land in the build directory.

## Build dependencies
cmake/Dependencies.cmake
If the target platform is Android, the Vulkan library, headers, and Vulkan wrapper are taken from the ANDROID_NDK.
Desktop builds require the VULKAN_SDK environment variable, and all Vulkan dependencies are taken from it.
(Desktop builds have been tested only on Linux.)

## Pytorch integration:
Adding 'Vulkan" as new Backend, DispatchKey, DeviceType.
We are using Strided layout without supporting strides at the moment, but we plan to support them in the future.
Using OpaqueTensorImpl where OpaqueHandle is copyable VulkanTensor,
more details in comments in `aten/src/ATen/native/vulkan/Vulkan.h`
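
A hedged usage sketch, assuming a build with `USE_VULKAN=ON` and one of the ops listed below:
```python
import torch

x = torch.ones(1, 3, 8, 8)
vx = x.to("vulkan")                   # copy CPU -> Vulkan
y = torch.nn.functional.hardtanh(vx)  # runs the Vulkan kernel
print(y.cpu())                        # copy back to CPU to inspect
```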

Main code location: `aten/src/ATen/native/vulkan`
`aten/src/ATen/native/vulkan/VulkanAten.cpp` - the connection between ATen and the Vulkan API (Vulkan.h); converts at::Tensor to VulkanTensor.

`aten/src/ATen/native/vulkan/Vulkan.h` - the Vulkan API containing the VulkanTensor representation and functions to work with it. We plan to expose it so clients can write their own Vulkan ops.

`aten/src/ATen/native/vulkan/VulkanOps.cpp` - Vulkan operation implementations using the Vulkan.h API.

## GLSL shaders
Located in `aten/src/ATen/native/vulkan/glsl` as *.glsl files.
All shaders use Vulkan specialization constants (ids 1, 2, 3) for workgroup sizes.

## Supported operations
Supported at this point:
conv2d no-groups
conv2d depthwise
addmm
upsample nearest 2d
clamp
hardtanh

## Testing
`aten/src/ATen/test/vulkan_test.cpp` contains tests for:
copying from CPU to Vulkan and back
all supported operations
Desktop builds are supported, and testing can be done on a desktop that has a Vulkan-capable GPU or with an installed software implementation of Vulkan, such as https://github.com/google/swiftshader

## Vulkan execution
The initial implementation is trivial and waits on every operator's execution.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36491

Differential Revision: D21696709

Pulled By: IvanKobzarev

fbshipit-source-id: da3e5a770b1a1995e9465d7e81963e7de56217fa
2020-05-26 08:30:13 -07:00
Jerry Zhang
385165ec67 [reland][quant] QuantizedCUDA implementation (#36936) (#37081)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37081

Closes https://github.com/pytorch/pytorch/issues/30813

Relanding of https://github.com/pytorch/pytorch/pull/35463

1. Tensor quantization logic (quantize_*) is moved to aten/native/quantized. Previously all logic for tensor quantization lived in the aten/quantized/Quantizer.cpp file and had started to become complicated and hard to read. That problem should be addressed in a refactoring PR. Still, I reworked this partially because I had to add tensor quantization logic for CUDA, and it was natural to move everything to aten/native/quantized.
2. The requirements to run CUDA_tensor_apply* were eased to process any tensor that lives on the CUDA device (QuantizedCUDA included).
3. All quantized data types now have a default constructor; NVCC refuses to compile any gpu_kernel or CUDA_tensor_apply* without them.
4. Minor changes in many files to register the QuantizedCUDA backend.
5. test_quantized_tensor is extended to process the QuantizedCUDA backend where possible (a usage sketch follows).
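
A hedged usage sketch of per-tensor affine quantization on CUDA (the capability this change enables; values are illustrative):
```python
import torch

x = torch.rand(4, device="cuda")
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)
print(qx.is_quantized, qx.int_repr())  # quantized CUDA tensor and its raw uint8 values
```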

Test Plan: Imported from OSS

Differential Revision: D21206694

Pulled By: jerryzh168

fbshipit-source-id: c7433aad9c095a34c57e6dddd128b5c5d9292373
2020-04-24 10:21:59 -07:00
Mike Ruberry
4bbc49f53a Revert D21143025: [reland][quant] QuantizedCUDA implementation
Test Plan: revert-hammer

Differential Revision:
D21143025

Original commit changeset: 11405e2e8f87

fbshipit-source-id: ce471ec95c1fc6abff6d1bbdba11bef02f3a0d62
2020-04-21 20:36:12 -07:00
Jerry Zhang
97d3a8495d [reland][quant] QuantizedCUDA implementation (#36936)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36936

Closes https://github.com/pytorch/pytorch/issues/30813

Relanding of https://github.com/pytorch/pytorch/pull/35463

1. Tensor quantization logic (quantize_*) is moved to aten/native/quantized. Previously all logic for tensor quantization lived in the aten/quantized/Quantizer.cpp file and had started to become complicated and hard to read. That problem should be addressed in a refactoring PR. Still, I reworked this partially because I had to add tensor quantization logic for CUDA, and it was natural to move everything to aten/native/quantized.
2. The requirements to run CUDA_tensor_apply* were eased to process any tensor that lives on the CUDA device (QuantizedCUDA included).
3. All quantized data types now have a default constructor; NVCC refuses to compile any gpu_kernel or CUDA_tensor_apply* without them.
4. Minor changes in many files to register the QuantizedCUDA backend.
5. test_quantized_tensor is extended to process the QuantizedCUDA backend where possible.

Test Plan: Imported from OSS

Differential Revision: D21143025

Pulled By: jerryzh168

fbshipit-source-id: 11405e2e8f87e48fadc0a084c51db15f85ccb500
2020-04-21 13:18:52 -07:00
Alban Desmaison
49b10c58a3 Revert D20896697: [pytorch][PR] QuantizedCUDA implementation
Test Plan: revert-hammer

Differential Revision:
D20896697

Original commit changeset: 163554efa23d

fbshipit-source-id: e3e370ef7c8be68ea34368dfcc7a7efc9d1f8761
2020-04-19 12:41:51 -07:00
Aleksandr Fedorov
f6daa6220e QuantizedCUDA implementation (#35463)
Summary:
Closes https://github.com/pytorch/pytorch/issues/30813

1. Tensor quantization logic (quantize_*) is moved to aten/native/quantized. Previously all logic for tensor quantization lived in the aten/quantized/Quantizer.cpp file and had started to become complicated and hard to read. That problem should be addressed in a refactoring PR. Still, I reworked this partially because I had to add tensor quantization logic for CUDA, and it was natural to move everything to aten/native/quantized.
2. The requirements to run CUDA_tensor_apply* were eased to process any tensor that lives on the CUDA device (QuantizedCUDA included).
3. All quantized data types now have a default constructor; NVCC refuses to compile any gpu_kernel or CUDA_tensor_apply* without them.
4. Minor changes in many files to register the QuantizedCUDA backend.
5. test_quantized_tensor is extended to process the QuantizedCUDA backend where possible.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35463

Differential Revision: D20896697

Pulled By: jerryzh168

fbshipit-source-id: 163554efa23d11a2b10bbc2492439db4798eb26b
2020-04-19 08:33:16 -07:00
Edward Yang
dd64e738c5 Expunge TensorId from all DispatchKey names. (#36240)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36240

It's annoying, historical, and unnecessary (enum class is already
namespaced).  I did this codemod with:

```
git grep -l 'CPUTensorId' | xargs sed -i 's/CPUTensorId/CPU/g'
git grep -l 'CUDATensorId' | xargs sed -i 's/CUDATensorId/CUDA/g'
git grep -l 'VariableTensorId' | xargs sed -i 's/VariableTensorId/Autograd/g'
git grep -l 'HIPTensorId' | xargs sed -i 's/HIPTensorId/HIP/g'
git grep -l 'MSNPUTensorId' | xargs sed -i 's/MSNPUTensorId/MSNPU/g'
git grep -l 'XLATensorId' | xargs sed -i 's/XLATensorId/XLA/g'
git grep -l 'PrivateUse1_TensorId' | xargs sed -i 's/PrivateUse1_TensorId/PrivateUse1/g'
git grep -l 'PrivateUse2_TensorId' | xargs sed -i 's/PrivateUse2_TensorId/PrivateUse2/g'
git grep -l 'PrivateUse3_TensorId' | xargs sed -i 's/PrivateUse3_TensorId/PrivateUse3/g'
git grep -l 'AutocastTensorId' | xargs sed -i 's/AutocastTensorId/Autocast/g'
git grep -l '_PreAutogradTensorId' | xargs sed -i 's/_PreAutogradTensorId/_PreAutograd/g'
git grep -l 'TESTING_ONLY_GenericWrapperTensorId' | xargs sed -i 's/TESTING_ONLY_GenericWrapperTensorId/TESTING_ONLY_GenericWrapper/g'
git grep -l 'TESTING_ONLY_GenericModeTensorId' | xargs sed -i 's/TESTING_ONLY_GenericModeTensorId/TESTING_ONLY_GenericMode/g'
```

Then I did a git grep for remaining TensorId occurrences, and manually
killed those (mostly in codegen, and some docs that needed updating).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D20929255

Pulled By: ezyang

fbshipit-source-id: dc371b6aa6e6ea7c0a5660137c14debde806a09d
2020-04-13 23:33:44 -07:00
Ailing Zhang
48fd410e44 Try fix XLAPreAutograd with *_like functions. (#33848)
Summary:
In *_like functions we call
`globalLegacyTypeDispatch().initForDispatchKeySet(c10::detail::multi_dispatch_key_set(self, options))`, which reaches `dispatchKeyToBackend`; hence this change.
`self` has both `XLAPreAutograd` and `XLATensorId` in its key set.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33848

Differential Revision: D20135898

Pulled By: ailzhang

fbshipit-source-id: a8585f39f3fa77b53718f20d3144f4f2f3cb8e53
2020-02-27 15:28:40 -08:00
anjali411
016d73bd74 remove Complex CPU/CUDA backend enum keys (#33267)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33267

Test Plan: Imported from OSS

Differential Revision: D19907696

Pulled By: anjali411

fbshipit-source-id: 78cc55344313387c4b05bb003688915cee64e3be
2020-02-18 13:38:39 -08:00
Edward Yang
3d0a470d89 Rename DispatchKey::UndefinedTensorId to Undefined (#32728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32728

It doesn't have much to do with tensors anymore.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D19628093

Pulled By: ezyang

fbshipit-source-id: 4d57111cdf44ba347bec8a32bb5b4b47a83c1eaf
2020-01-30 11:47:40 -08:00
Pavel Belevich
62b06b9fae Rename TensorTypeId to DispatchKey (#32154)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32154

TensorTypeId -> DispatchKey
	c10/core/TensorTypeId.h -> c10/core/DispatchKey.h
	c10/core/TensorTypeId.cpp -> c10/core/DispatchKey.cpp
	TensorTypeId::* -> DispatchKey::*
	TensorTypeId type_id -> DispatchKey dispatch_key
		type_id -> dispatch_key
	TensorTypeId::NumTensorIds -> DispatchKey::NumDispatchKeys
	RealTensorTypeId -> RealDispatchKey
TensorTypeSet -> DispatchKeySet
	TensorTypeIds -> DispatchKeys
	c10/core/TensorTypeSet.h -> c10/core/DispatchKeySet.h
	c10/core/TensorTypeSet.cpp -> c10/core/DispatchKeySet.cpp
	type_set() -> key_set()
	type_set_ -> key_set_
	typeSet -> keySet
ExcludeTensorTypeIdGuard -> ExcludeDispatchKeyGuard
IncludeTensorTypeIdGuard -> IncludeDispatchKeyGuard
LocalTensorTypeSet -> LocalDispatchKeySet
	c10/core/impl/LocalTensorTypeSet.h -> c10/core/impl/LocalDispatchKeySet.h
	c10/core/impl/LocalTensorTypeSet.cpp -> c10/core/impl/LocalDispatchKeySet.cpp
	tls_local_tensor_type_set -> tls_local_dispatch_key_set
	tls_is_tensor_type_id_excluded -> tls_is_dispatch_key_excluded
	tls_set_tensor_type_id_excluded -> tls_set_dispatch_key_excluded
	tls_is_tensor_type_id_included -> tls_is_dispatch_key_included
	tls_set_tensor_type_id_included -> tls_set_dispatch_key_included
MultiDispatchTensorTypeSet -> MultiDispatchKeySet
	multi_dispatch_tensor_type_set -> multi_dispatch_key_set
tensorTypeIdToBackend -> dispatchKeyToBackend
backendToTensorTypeId -> backendToDispatchKey
initForTensorTypeSet -> initForDispatchKeySet
inferred_type_set -> inferred_key_set
computeTensorTypeId -> computeDispatchKey
PODLocalTensorTypeSet raw_local_tensor_type_set -> PODLocalDispatchKeySet raw_local_dispatch_key_set
get_default_tensor_type_id -> get_default_dispatch_key
inferred_type_id -> inferred_dispatch_key
actual_type_id -> actual_dispatch_key
typeSetToDispatchKey_ -> dispatchKeySetToDispatchKey_
get_type_id() -> get_dispatch_key()
legacyExtractTypeId -> legacyExtractDispatchKey
extractTypeId -> extractDispatchKey

Test Plan: Imported from OSS

Differential Revision: D19398900

Pulled By: pbelevich

fbshipit-source-id: 234ad19f93d33e00201b61e153b740a339035776
2020-01-15 11:16:08 -08:00
Edward Yang
58a0dee749 Replace open registration TensorTypeId with closed enum. (#25252)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25252

Our model going forward for extensions will be that you will have to
get an allocation of an ID in our system.  This is how things work
in practice today; we're just simplifying our underlying registration
since there is no need to have distributed registration.

There are some codemods in this diff:

```
codemod --extensions cpp,h,cc,cuh,py,in --exclude-paths=c10/core/TensorTypeId.h '([A-Za-z]+?)TensorId\(\)' 'TensorTypeId::\1TensorId'
codemod --extensions cpp,h,cc,cuh,py,in 'TensorTypeIds::undefined\(\)' 'TensorTypeId::UndefinedTensorId'
codemod --extensions cpp 'TensorType1\(\)' 'TensorTypeId::CPUTensorId'
codemod --extensions cpp 'TensorType2\(\)' 'TensorTypeId::CUDATensorId'
codemod --extensions cpp 'TensorType3\(\)' 'TensorTypeId::XLATensorId'
codemod --extensions cpp 'TensorType1' 'CPUTensorId'
codemod --extensions cpp 'TensorType2' 'CUDATensorId'
codemod --extensions cpp 'TensorType3' 'XLATensorId'
```

The main hand-written changes are in c10/core/TensorTypeId.h

Other manual fixes:

- aten/src/ATen/core/op_registration/op_registration.cpp - stop using
  std::string operator+
- aten/src/ATen/function_wrapper.py - handle a hardcoded TypeId() that
  wasn't caught by codemod
- torch/csrc/tensor/python_tensor.h - fix now incorrect forward declaration
  of TensorTypeId
- aten/src/ATen/core/op_registration/ - remove out-of-line registration

Differential Revision: D17072001

Test Plan: ossci and sandcastle

Pulled By: ezyang

fbshipit-source-id: c641515fd0604c045c54fbb1d6b1b950f45e89d1
2019-08-29 08:55:58 -07:00
Jerry Zhang
56fb5e03b5 refactor registerStoragePyTypeObject (#20467)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20467

for upcoming changes in Storage for QInt8

Reviewed By: ezyang

Differential Revision: D15330865

fbshipit-source-id: 2840e59c0bf088983f792fd724de41b3bb3dec55
2019-05-14 18:22:33 -07:00
Roy Li
189f30603c Make complex its own backend (#19275)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19275
ghimport-source-id: 73fd40b02152aed6f24225a88d7ffde7f700899e

Differential Revision: D14948582

Pulled By: li-roy

fbshipit-source-id: a1be6e57057defc74a007c5351c5edb2b9dcaf30
2019-04-21 21:16:10 -07:00
Jerry Zhang
33e7977154 move const defs of DeviceType to DeviceType.h (#19185)
Summary:
Stack:
  #19185 [c10][core][ez] move const defs of DeviceType to DeviceType.h (D14909415)

att
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19185

Differential Revision: D14909415

Pulled By: jerryzh168

fbshipit-source-id: 876cf999424d8394f5ff20e6750133a4e43466d4
2019-04-16 20:02:21 -07:00
Jerry Zhang
1c836e7bb9 Add Quantized Backend (#18546)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18546

We'll expose all combinations of the various ways of quantization in the top-level dispatch key; that is, we have AffineCPUTensor, PerChannelAffineCUDATensor, etc.

QTensor methods added (see the sketch below):
- is_quantized()
- item()
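
A hedged sketch of the QTensor surface mentioned above (values are illustrative):
```python
import torch

q = torch.quantize_per_tensor(torch.tensor([0.5]), scale=0.1, zero_point=0, dtype=torch.quint8)
assert q.is_quantized
print(q.item())  # dequantized Python number from a one-element QTensor
```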

Differential Revision: D14637671

fbshipit-source-id: 346bc6ef404a570f0efd34e8793056ad3c7855f5
2019-04-12 12:55:49 -07:00
jgong5
3ad710b837 Add MKL-DNN Tensor (#17748)
Summary:
This is a minimalist PR to add an MKL-DNN tensor, per the discussion in GitHub issue https://github.com/pytorch/pytorch/issues/16038

Ops on MKL-DNN tensors will be supported in follow-up PRs to speed up the imperative path.
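
A hedged sketch of the conversion surface (using the public to_mkldnn/to_dense pair):
```python
import torch

x = torch.randn(2, 2)
mk = x.to_mkldnn()    # opaque MKL-DNN tensor
back = mk.to_dense()  # convert back to a strided CPU tensor
```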
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17748

Reviewed By: dzhulgakov

Differential Revision: D14614640

Pulled By: bddppq

fbshipit-source-id: c58de98e244b0c63ae11e10d752a8e8ed920c533
2019-04-08 21:41:38 -07:00
Edward Yang
aed7c9bc96 Improve Backend comment. (#18567)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18567
ghimport-source-id: 1e50e611a3afcfae86828b7afe06c3fdc6a7bef7

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18567 Improve Backend comment.**

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Reviewed By: dzhulgakov

Differential Revision: D14666189

fbshipit-source-id: 64a41c4a998b1a59ff780d1ae06fa16e5ef3c7c4
2019-04-02 08:06:48 -07:00
Gregory Chanan
3a85f88efd Remove deviceTypeToBackend, which is underspecified. (#18135)
Summary:
There are multiple backends for a device type, so we just kill this function.
Also, kill a getNonVariableType instance which was also underspecified.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18135

Differential Revision: D14507474

Pulled By: gchanan

fbshipit-source-id: fc791a76d4b851b23d09a070725f3838621eb13d
2019-03-19 07:53:28 -07:00
Alex Şuhan
9811a4220d Add XLA / TPU device type, backend type and type id (#16763)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16763

Replicate the easy bits in https://github.com/pytorch/pytorch/pull/15153 with TPU / XLA instead of MSNPU. Also don't initialize the storage for XLA tensors for now.
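
A hedged usage sketch with the out-of-tree torch_xla package (API names taken from that project, not from this PR):
```python
import torch
import torch_xla.core.xla_model as xm  # out-of-tree package (assumption)

t = torch.zeros(2, 2, device=xm.xla_device())  # tensor on the XLA / TPU device
```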
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16585

Reviewed By: ezyang

Differential Revision: D13912118

Pulled By: gchanan

fbshipit-source-id: 4889177e2478768fb281ed075b71146d1d850bd9
2019-02-05 12:56:44 -08:00
Roy Li
7e642dfff3 Introduce backend extensions (overriding operators on custom backends)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15153

Reviewed By: gchanan

Differential Revision: D13445571

fbshipit-source-id: 62e2ebe0a6e81c4983b47cddb57ee5eb78e96708
2019-02-01 11:00:16 -08:00