Summary:
Instead of a mixture of direct calls to library-provided atomicAdd overloads, such as `float atomicAdd(float*, float)`, and internally provided ones, such as `void atomicAdd(long*, long)`, abstract them behind a single API, `void gpuAtomicAdd(T*, T)`, in THCAtomics.cuh for the PyTorch backend.
The advantage of this approach is that it makes it easier to distinguish between the capabilities of different platforms (and their versions). Additionally, abstracting over void-returning atomicAdds allows us, in the future, to support fast HW instructions on platforms where those instructions do not return the previous value.
Call sites that do not satisfy the above conditions, because they are either highly platform-specific (the `__half2` atomicAdd fast path in one operator) or explicitly require the return value (some `int` atomicAdd invocations), are left untouched. The Caffe2 backend also remains untouched.
While here, add a bunch of includes of THCAtomics.cuh that were missing before.
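For illustration, a minimal sketch (not the actual THCAtomics.cuh implementation) of how a void-returning `gpuAtomicAdd` can wrap the native overload where one exists and fall back to a CAS loop where it does not:
```cpp
#include <cuda_runtime.h>

// Native float atomicAdd exists on all supported architectures; the wrapper
// simply discards the returned previous value.
__device__ void gpuAtomicAdd(float* address, float val) {
  atomicAdd(address, val);
}

// double atomicAdd is native only on sm_60+; older architectures emulate it
// with the classic compare-and-swap loop from the CUDA programming guide.
__device__ void gpuAtomicAdd(double* address, double val) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600
  atomicAdd(address, val);
#else
  auto* addr = reinterpret_cast<unsigned long long int*>(address);
  unsigned long long int old = *addr, assumed;
  do {
    assumed = old;
    old = atomicCAS(addr, assumed,
                    __double_as_longlong(val + __longlong_as_double(assumed)));
  } while (assumed != old);
#endif
}
```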
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31992
Differential Revision: D19330220
Pulled By: ezyang
fbshipit-source-id: d6ab73ec5168c77e328faeef6c6f48eefba00861
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30530
Switch some mentions of "C++11" in the docs to "C++14"
ghstack-source-id: 95812049
Test Plan: testinprod
Differential Revision: D18733733
fbshipit-source-id: b9d0490eb3f72bad974d134bbe9eb563f6bc8775
Summary:
`at::ArrayRef` / `torch::IntArrayRef` should be discouraged in user code, because users might not be aware that it doesn't own the underlying data, which already leads to memory-access bugs when they write code such as the following:
```cpp
auto expected_sizes = torch::IntArrayRef({2, 16, 6}); // The memory that represents `{2, 16, 6}` is released after this line
ASSERT_EQ(output.sizes(), expected_sizes); // `expected_sizes` is pointing to invalid memory region
```
This PR changes all usage of `at::ArrayRef` and `torch::IntArrayRef` to the corresponding `std::vector` version, so that users won't pick up the habit of using `ArrayRef` by looking at the test code.
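For contrast, a sketch of the owning pattern the tests move to (`output` is an illustrative tensor):
```cpp
std::vector<int64_t> expected_sizes = {2, 16, 6}; // the vector owns its storage
// Constructing a temporary ArrayRef view over a live vector is safe for the
// duration of the comparison.
ASSERT_EQ(output.sizes(), torch::IntArrayRef(expected_sizes));
```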
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27884
Differential Revision: D17921646
Pulled By: yf225
fbshipit-source-id: 461e79fc22b598aac230d36cc028085ce6cbe937
Summary:
This is a continuation of efforts to raise awareness of packed accessors in the docs.
A very simple example is added, along with a mention that the template can take more arguments.
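As a rough sketch of the kind of example being added (names and launch configuration are illustrative, assuming a contiguous 2-D float CUDA tensor):
```cpp
__global__ void increment(torch::PackedTensorAccessor<float, 2> acc) {
  // Inside the kernel the accessor indexes like a 2-D array.
  acc[blockIdx.x][threadIdx.x] += 1.0f;
}

// Host side: the template can also take more arguments, e.g. pointer traits
// and an index type:
//   tensor.packed_accessor<float, 2, torch::RestrictPtrTraits, int32_t>()
auto acc = tensor.packed_accessor<float, 2>();
increment<<<tensor.size(0), tensor.size(1)>>>(acc);
```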
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19464
Differential Revision: D15012564
Pulled By: soumith
fbshipit-source-id: a19ed536e016fae519b062d847cc58aef01b1b92
Summary:
Fixed a few C++ API callsites to work with v1.0.1.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16221
Differential Revision: D13759207
Pulled By: yf225
fbshipit-source-id: bd92c2b95a0c6ff3ba5d73cb249d0bc88cfdc340
Summary:
Fix submitted by huntzhan in https://github.com/pytorch/cppdocs/pull/4. The source is in this repo, so the patch has to be applied here.
soumith ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15701
Differential Revision: D13591302
Pulled By: goldsborough
fbshipit-source-id: 796957696fd560a9c5fb42265d7b2d018abaebe3
Summary:
Removes aten/README.md (and some other files dating from when aten was its own repo), and moves the documentation that is still current into a note called "Tensor Basics". I updated the text lightly but did not overhaul the content.
CC zdevito
ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13601
Differential Revision: D12934480
Pulled By: goldsborough
fbshipit-source-id: 012a4267b4d6f27e4d5d55d6fc66363ddca10b41