Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72411
Fixes
```
caffe2/caffe2/operators/max_pool_with_index.cu(16): warning: type qualifier specified more than once
caffe2/caffe2/operators/max_pool_with_index.cu(28): warning: type qualifier specified more than once
caffe2/caffe2/operators/max_pool_with_index.cu(61): warning: type qualifier specified more than once
caffe2/caffe2/operators/max_pool_with_index.cu(62): warning: type qualifier specified more than once
caffe2/caffe2/operators/max_pool_with_index.cu(74): warning: type qualifier specified more than once
```
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D34034382
fbshipit-source-id: 2b73c55358632090baf673b32b800656ae874040
(cherry picked from commit ab3f3f9a79)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17764
Original commit changeset: f1923fdca4a1
Reverting the int8 ops fixes the original runtime regression.
We'll ignore the memory regression since it is flaky, see D14228484
Reviewed By: dzhulgakov
Differential Revision: D13885233
fbshipit-source-id: ccbe4b94acb44b7b4cb3ae4d73e3f6091e1e1195
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16273
Previously we had SetOutputSize, which accepted a partially initialized output Tensor and set it to the correct size.
This diff changes it to GetOutputSize, which returns the correct size instead.
e.g.
```
auto* Y = Output(0);
ConvPoolOp<Context>::SetOutputSize(X, Y, channels);
...
Y->mutable_data<T>...
```
-->
```
auto sizes = ConvPoolOp<Context>::GetOutputSize(X, channels);
auto* Y = Output(0, sizes, at::dtype<T>());
```
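The shape of this refactor can be sketched outside of Caffe2. The snippet below is a minimal illustration, not the actual ConvPoolOp code: `PoolParams` and `OutputDim` are hypothetical names, only a plain NCHW layout with symmetric padding is assumed, and the real operator handles more options.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the pooling parameters; the real
// ConvPoolOp carries more options than the three modeled here.
struct PoolParams {
  int kernel;
  int stride;
  int pad;
};

// Output extent of one spatial dimension:
// floor((in + 2 * pad - kernel) / stride) + 1.
inline int64_t OutputDim(int64_t in, const PoolParams& p) {
  assert(p.stride > 0 && in + 2 * p.pad >= p.kernel);
  return (in + 2 * p.pad - p.kernel) / p.stride + 1;
}

// "Get" style: takes the input shape (assumed NCHW) and returns the output
// shape, so the caller can create the output with its final size in one step
// instead of resizing a partially initialized tensor.
inline std::vector<int64_t> GetOutputSize(
    const std::vector<int64_t>& x_dims,
    int64_t channels,
    const PoolParams& p) {
  assert(x_dims.size() == 4);
  return {x_dims[0], channels, OutputDim(x_dims[2], p), OutputDim(x_dims[3], p)};
}
```

Returning the sizes keeps the size computation free of side effects, which is what allows the output to be requested with `Output(0, sizes, at::dtype<T>())` as shown above.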
Reviewed By: dzhulgakov
Differential Revision: D13736281
fbshipit-source-id: 64abce3dbaed0b375098463333dfd0ea5a3b1945
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13
Pull Request resolved: https://github.com/pytorch/translate/pull/166
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125
Closes https://github.com/pytorch/pytorch/pull/9125
Use inheritance for polymorphism, and remove template parameter
This changes the templating at call sites; the core implementations will change later.
Previously, the Caffe2 Tensor class was bound to a particular device/context at compile time. With this change, the device becomes a runtime property (stored inside the tensor), while the semantics are preserved. For example, one has to specify the device type in order to create a Tensor; there are no uninitialized tensors. More specifically, the changes are:
1. We added an extra argument *DeviceType* to most of the Tensor constructors, e.g. Tensor(DeviceType type).
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); changed. The second argument, the context, is passed in so that we can call the templated Copy function. Previously it could be in a different context from the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter, Blob::GetMutableTensor, that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. The Tensor type is no longer default-constructible (as we don't have unknown-device tensors), so some of the code handling STL containers needs to change.
Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.
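The points above can be illustrated with a small standalone sketch. It is not the Caffe2 implementation: the `Tensor` and `Blob` classes below are heavily simplified stand-ins, the exact signature of `GetMutableTensor` is an assumption, and `MakeTensors` is a hypothetical helper.

```cpp
#include <cassert>
#include <memory>
#include <vector>

enum class DeviceType { CPU, CUDA };

// Simplified stand-in for Tensor after this change: the device is a runtime
// field, and there is deliberately no default constructor.
class Tensor {
 public:
  explicit Tensor(DeviceType type) : type_(type) {}
  DeviceType GetDeviceType() const { return type_; }

 private:
  DeviceType type_;
};

// Simplified stand-in for a Blob that can only hold a Tensor.
class Blob {
 public:
  // Get-or-construct: reuse the stored tensor if one exists and its device
  // type matches; otherwise replace it with a new tensor on that device.
  Tensor* GetMutableTensor(DeviceType type) {
    if (!tensor_ || tensor_->GetDeviceType() != type) {
      tensor_ = std::make_unique<Tensor>(type);
    }
    return tensor_.get();
  }

 private:
  std::unique_ptr<Tensor> tensor_;
};

// Hypothetical helper: std::vector<Tensor>(n) no longer compiles because
// Tensor is not default-constructible, so each element is built with an
// explicit device type.
inline std::vector<Tensor> MakeTensors(int n, DeviceType type) {
  std::vector<Tensor> out;
  out.reserve(n);
  for (int i = 0; i < n; ++i) {
    out.emplace_back(type);
  }
  return out;
}
```

Under these assumptions, a call such as `blob.GetMutableTensor(DeviceType::CPU)` returns the same tensor on repeated calls and only constructs a new one when the requested device differs.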
Reviewed By: ezyang, houseroad
Differential Revision: D9024330
fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba
Summary:
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13
Pull Request resolved: https://github.com/pytorch/translate/pull/166
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125
Closes https://github.com/pytorch/pytorch/pull/9125
Use inheritance for polymorphism, and remove template parameter
This changes the templating at call sites; the core implementations will change later.
Previously, the Caffe2 Tensor class was bound to a particular device/context at compile time. With this change, the device becomes a runtime property (stored inside the tensor), while the semantics are preserved. For example, one has to specify the device type in order to create a Tensor; there are no uninitialized tensors. More specifically, the changes are:
1. We added an extra argument *DeviceType* to most of the Tensor constructors, e.g. Tensor(DeviceType type).
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); changed. The second argument, the context, is passed in so that we can call the templated Copy function. Previously it could be in a different context from the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter, Blob::GetMutableTensor, that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. The Tensor type is no longer default-constructible (as we don't have unknown-device tensors), so some of the code handling STL containers needs to change.
Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.
Reviewed By: xw285cornell
Differential Revision: D8121878
fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81
Summary:
The goal of this PR is to add infrastructure to convert (hipify) CUDA ops into [HIP](https://github.com/ROCm-Developer-Tools/HIP) ops at **compile** time.
Note that HIP ops, which are portable C++ code, can run on both AMD and NVIDIA platforms.
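To give a feel for what "hipify" means, here is a toy sketch of source-to-source renaming. It is not the infrastructure added by this PR: `ReplaceAll` and `HipifySketch` are hypothetical names, and the table covers only a few well-known CUDA runtime calls and their HIP counterparts.

```cpp
#include <string>
#include <utility>
#include <vector>

// Replace every occurrence of `from` in `text` with `to`.
inline void ReplaceAll(
    std::string& text, const std::string& from, const std::string& to) {
  for (std::size_t pos = text.find(from); pos != std::string::npos;
       pos = text.find(from, pos + to.size())) {
    text.replace(pos, from.size(), to);
  }
}

// Toy hipify: rename a handful of CUDA runtime identifiers to their HIP
// equivalents. A real converter has to cover far more names and also deal
// with headers, kernel launches, and libraries.
inline std::string HipifySketch(std::string source) {
  static const std::vector<std::pair<std::string, std::string>> kMap = {
      {"cudaMalloc", "hipMalloc"},
      {"cudaMemcpy", "hipMemcpy"},
      {"cudaFree", "hipFree"},
      {"cudaStream_t", "hipStream_t"},
  };
  for (const auto& kv : kMap) {
    ReplaceAll(source, kv.first, kv.second);
  }
  return source;
}
```

Because the replacement is purely textual, prefixes are rewritten as well, so `cudaMemcpyAsync` becomes `hipMemcpyAsync` in this sketch.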
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9322
Differential Revision: D8884707
Pulled By: bddppq
fbshipit-source-id: dabc6319546002c308c10528238e6684f7aef0f8
Summary: Add a check that, every time we register a caffe operator to CPU or GPU, documentation is added for that operator.
Reviewed By: dzhulgakov
Differential Revision: D5443110
fbshipit-source-id: 3793c3d29bea1228078cb30bdf8243ac0ab90664
Summary: Add a lint rule to check that, every time we register a caffe operator to CPU or GPU, documentation is added for that operator.
Reviewed By: dzhulgakov
Differential Revision: D5348078
fbshipit-source-id: c3fa22fc7ca8066d5fc8fa780b23d7867fd3380e
Summary:
Add cuDNN versions of dropout and LRN (for native fp16 support), and a port of Caffe's max pooling algorithm that uses an explicit mask to store locations (also supports fp16 storage).
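As a rough illustration of max pooling with an explicit mask, the following CPU reference sketch pools a single-channel 2D input and records where each maximum came from. It is not the CUDA kernel from this change: `PoolResult` and `MaxPoolWithIndex` are hypothetical names, padding is omitted, and plain `float` is assumed rather than fp16.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct PoolResult {
  std::vector<float> values;  // pooled maxima, row-major (out_h x out_w)
  std::vector<int32_t> mask;  // flat input index (h * width + w) of each maximum
};

// Max pooling over one row-major (height x width) plane, no padding.
inline PoolResult MaxPoolWithIndex(
    const std::vector<float>& in, int height, int width, int kernel, int stride) {
  assert(static_cast<int>(in.size()) == height * width);
  assert(kernel > 0 && stride > 0 && kernel <= height && kernel <= width);
  const int out_h = (height - kernel) / stride + 1;
  const int out_w = (width - kernel) / stride + 1;
  PoolResult r;
  r.values.reserve(out_h * out_w);
  r.mask.reserve(out_h * out_w);
  for (int oh = 0; oh < out_h; ++oh) {
    for (int ow = 0; ow < out_w; ++ow) {
      // Start from the window's top-left element, then scan the window.
      int best = (oh * stride) * width + ow * stride;
      for (int kh = 0; kh < kernel; ++kh) {
        for (int kw = 0; kw < kernel; ++kw) {
          const int idx = (oh * stride + kh) * width + (ow * stride + kw);
          if (in[idx] > in[best]) {
            best = idx;
          }
        }
      }
      r.values.push_back(in[best]);
      r.mask.push_back(best);
    }
  }
  return r;
}
```

Storing the location alongside the value means a backward pass can route each output gradient directly to `mask[i]` without rescanning the window.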
Closes https://github.com/caffe2/caffe2/pull/396
Reviewed By: akyrola
Differential Revision: D4990880
Pulled By: asaadaldien
fbshipit-source-id: a716acffb656843e9b31e3e6808bd2d8aa959d03