Commit Graph

130 Commits

Vitaly Fedyunin
c484cf43a0 Adding pin_memory kwarg to zeros, ones, empty, ... tensor constructors. (#18455)
Summary:
Make it possible to construct a pinned-memory tensor without creating a storage first and without calling the pin_memory() function. It is also faster, as the copy operation is unnecessary.

Supported functions:
```python
torch.rand_like(t, pin_memory=True)
torch.randn_like(t, pin_memory=True)
torch.empty_like(t, pin_memory=True)
torch.full_like(t, 4, pin_memory=True)
torch.zeros_like(t, pin_memory=True)
torch.ones_like(t, pin_memory=True)
torch.tensor([10,11], pin_memory=True)
torch.randn(3, 5, pin_memory=True)
torch.rand(3, pin_memory=True)
torch.zeros(3, pin_memory=True)
torch.randperm(3, pin_memory=True)
torch.empty(6, pin_memory=True)
torch.ones(6, pin_memory=True)
torch.eye(6, pin_memory=True)
torch.arange(3, 5, pin_memory=True)
```
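As an illustrative before/after (a minimal sketch; pinned memory requires a CUDA-enabled build):
```python
import torch

# Before: allocate, then copy into pinned memory (extra copy).
t_old = torch.empty(6).pin_memory()

# After this change: allocate directly in pinned memory, no copy.
t_new = torch.empty(6, pin_memory=True)
assert t_new.is_pinned()
```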

Part of the bigger `Remove Storage` plan.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18455

Reviewed By: ezyang

Differential Revision: D14672084

Pulled By: VitalyFedyunin

fbshipit-source-id: 9d0997ec00f59500ee018f8b851934d334012124
2019-04-02 08:48:19 -07:00
Mark Pare
fba89b2ae1 fix typo
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18653

Differential Revision: D14713920

Pulled By: ezyang

fbshipit-source-id: 170295a162dd23916c1dcc9330918d33277cc9ed
2019-04-02 07:51:30 -07:00
Vishwak Srinivasan
d859031ebf Rename btrifact* to lu (#18435)
Summary:
Changelog:

- Renames `btrifact` and `btrifact_with_info` to `lu` to remain consistent with other factorization methods (`qr` and `svd`).
- Now there is only one function and method named `lu`, which performs the LU decomposition. This function takes a `get_infos` kwarg, which, when set to True, includes an infos tensor in the returned tuple.
- Rename all tests, fix callsites
- Create a tentative alias for `lu` under the names `btrifact` and `btrifact_with_info`, and add a deprecation warning to discourage usage.
- Add the single batch version for `lu` so that users don't have to unsqueeze and squeeze for a single square matrix (see changes in determinant computation in `LinearAlgebra.cpp`)
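A minimal sketch of the renamed API (shapes are illustrative):
```python
import torch

A = torch.randn(3, 3)
LU, pivots = torch.lu(A)                         # replaces btrifact
LU, pivots, infos = torch.lu(A, get_infos=True)  # replaces btrifact_with_info
```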
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18435

Differential Revision: D14680352

Pulled By: soumith

fbshipit-source-id: af58dfc11fa53d9e8e0318c720beaf5502978cd8
2019-03-29 00:34:30 -07:00
Xiang Gao
2ba41c5550 Add some missing docs for tensor methods and attributes, new unittest to ensure tensors.rst no longer misses anything (#16057)
Summary:
This depends on https://github.com/pytorch/pytorch/pull/16039

This prevents people (reviewers, PR authors) from forgetting to add things to `tensors.rst`.

When something new is added to `_tensor_doc.py` or `tensor.py` but intentionally not to `tensors.rst`, people should manually whitelist it in `test_docs_coverage.py`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16057

Differential Revision: D14619550

Pulled By: ezyang

fbshipit-source-id: e1c6dd6761142e2e48ec499e118df399e3949fcc
2019-03-26 18:05:56 -07:00
vishwakftw
291746f110 Rename trtrs to triangular_solve (#18213)
Summary:
Changelog:
- Renames `trtrs` to `triangular_solve` to remain consistent with `cholesky_solve` and `solve`.
- Rename all tests, fix callsites
- Create a tentative alias for `triangular_solve` under the name `trtrs`, and add a deprecation warning to discourage usage.
- Move `isnan` to _torch_docs.py
- Remove unnecessary imports
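A minimal sketch of the renamed API (shapes illustrative; `upper=True` is the default):
```python
import torch

A = torch.randn(3, 3).triu()         # upper-triangular coefficient matrix
b = torch.randn(3, 2)
X, M = torch.triangular_solve(b, A)  # replaces the old trtrs; solves AX = b
```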
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18213

Differential Revision: D14566902

Pulled By: ezyang

fbshipit-source-id: 544f57c29477df391bacd5de700bed1add456d3f
2019-03-21 14:27:21 -07:00
Gao, Xiang
7e6220393f Cleanup arg{min, max} (#17103)
Summary:
Why do we need this workaround? `PythonArgParser` handles these two cases well.

The discussion started at https://github.com/pytorch/pytorch/pull/6201#issuecomment-378724406. The conclusion at that time by goldsborough was:

> Because we wanted to allow `dim=None` in Python and route to a different function. Essentially the problem was wanting to wrap the C++ function in Python. AFAIK there is no way of translating `dim=None` behavior into C++? So Richard and I came up with this strategy

Maybe at that time `PythonArgParser` was not powerful enough to handle the routing of two functions with the same name but different C++ signatures.

Will keep an eye on the CI.
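For reference, the two cases `PythonArgParser` now routes (a minimal sketch):
```python
import torch

t = torch.randn(2, 3)
t.argmax()       # dim=None: index into the flattened tensor
t.argmax(dim=1)  # per-row indices along dim 1
```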
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17103

Differential Revision: D14523503

Pulled By: VitalyFedyunin

fbshipit-source-id: cae3e2678062da2eccd93b51d4050578c7a9ab80
2019-03-20 16:28:27 -07:00
Vishwak Srinivasan
421b508d55 Rename gesv to solve (#18060)
Summary:
Changelog:

- Renames `gesv` to `solve` to remain consistent with `cholesky_solve`.
- Rename all tests, fix callsites
- Create a tentative alias for `solve` under the name `gesv`, and add a deprecation warning to discourage usage.
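A minimal sketch of the renamed API (shapes illustrative):
```python
import torch

A = torch.randn(3, 3)
b = torch.randn(3, 2)
X, LU = torch.solve(b, A)  # replaces the old gesv; solves AX = b
```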
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18060

Differential Revision: D14503117

Pulled By: zou3519

fbshipit-source-id: 99c16d94e5970a19d7584b5915f051c030d49ff5
2019-03-18 16:04:24 -07:00
Vishwak Srinivasan
3f1d0ee5d5 Deprecate torch.pstrf (#17866)
Summary:
Changelog:
- Add deprecation warning to torch.pstrf
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17866

Differential Revision: D14405527

Pulled By: soumith

fbshipit-source-id: 73f3b7d61c60eb57e4bffd08112e552ae3e6dfdc
2019-03-11 12:27:52 -07:00
zou3519
68c5c66800 Warn about memory overlaps on expanded tensors (#17576)
Summary:
Eventually we should remove these when we're certain that all our ops
handle memory overlaps correctly.
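A minimal sketch of the aliasing these warnings target (values illustrative):
```python
import torch

base = torch.zeros(1, 3)
t = base.expand(4, 3)  # all four rows alias base's single row of storage
base[0, 0] = 1.0
print(t[:, 0])         # every row observes the write: tensor([1., 1., 1., 1.])
```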
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17576

Differential Revision: D14349990

Pulled By: zou3519

fbshipit-source-id: c3a09f6113b9b1bf93e7f13c0b426c45b2cdf21f
2019-03-06 17:44:04 -08:00
Krishna Kalyan
59ece70201 Missing argument description (value) in scatter_ function documentation (#17467)
Summary: Update the docs to include the value parameter that was missing in the `scatter_` function.
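A minimal sketch of the scalar `value` variant the docs now describe (shapes illustrative):
```python
import torch

x = torch.zeros(3, 5)
index = torch.tensor([[0], [2], [4]])
x.scatter_(1, index, 1.5)  # writes the scalar value 1.5 at one column per row
```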

Differential Revision: D14209225

Pulled By: soumith

fbshipit-source-id: 5c65e4d8fbd93fcd11a0a47605bce6d57570f248
2019-02-25 14:39:26 -08:00
Gao, Xiang
722cbe3064 Move argsort to C++
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17099

Differential Revision: D14165671

Pulled By: ezyang

fbshipit-source-id: 3871de6874fe09871ebd9b8943c13c9af325bf33
2019-02-21 07:59:27 -08:00
Krishna Kalyan
9ebc433bda Clarification of Lerp operation on tensors (#17253)
Summary: Clarified in the docs that the `tensor` argument is used as `end`.
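A minimal sketch of the clarified call (values illustrative):
```python
import torch

start = torch.zeros(3)
end = torch.ones(3)
start.lerp(end, 0.5)  # the tensor argument is `end`: start + 0.5 * (end - start)
```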

Differential Revision: D14132212

Pulled By: ezyang

fbshipit-source-id: e9bca14d5079e5f7adfc18afcb1eec832ef86e9e
2019-02-19 11:27:02 -08:00
Xiang Gao
4fcab92d6c Move outplace ops to ATen (#16788)
Summary:
Based on https://github.com/pytorch/pytorch/pull/12413, with the following additional changes:

-  Inside `native_functions.yml`, move those outplace operators right next to their corresponding inplace operators, for convenience of checking whether they match when reviewing
- `matches_jit_signature: True` for them
- Add missing `scatter` with Scalar source
- Add missing `masked_fill` and `index_fill` with Tensor source.
- Add missing test for `scatter` with Scalar source
- Add missing test for `masked_fill` and `index_fill` with Tensor source by checking the gradient w.r.t source
- Add missing docs to `tensor.rst`

Differential Revision: D14069925

Pulled By: ezyang

fbshipit-source-id: bb3f0cb51cf6b756788dc4955667fead6e8796e5
2019-02-15 15:58:10 -08:00
ZhuBaohe
aae6b53c5b DOC: correct docstring for torch and torch.Tensor package (#16842)
Summary:
This PR is a simple fix for the mistake in the "tensor" and "torch.Tensor" docs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16842

Differential Revision: D14020300

Pulled By: ezyang

fbshipit-source-id: 3ab04f1223d6e60f8da578d04d759e385d23acbb
2019-02-10 14:37:29 -08:00
Edward Yang
6c04224cd8 Revert "Move outplace ops to ATen (#12413)" (#16731)
Summary:
This reverts commit f660d3ae19.

cc zasdfgbnm

Reasoning at https://github.com/pytorch/pytorch/pull/12413#issuecomment-460424129
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16731

Differential Revision: D13948022

Pulled By: ezyang

fbshipit-source-id: b10669cf03679e306850314b7b5b08bed0839e19
2019-02-04 19:30:04 -08:00
Xiang Gao
f660d3ae19 Move outplace ops to ATen (#12413)
Summary:
So that things like the snippet below are JITable and available in the C++ API:

```python
import torch

@torch.jit.script
def f(x, y, z):
    x.index_add(0, y, z)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12413

Differential Revision: D13899948

Pulled By: suo

fbshipit-source-id: b0006b4bee2d1085c813733e1037e2dcde4ce626
2019-01-31 16:09:45 -08:00
ParticularlyPythonicBS
16e2e4f29f added example to clear ambiguity in torch.Tensor.view (#16563)
Summary:
Added an example to the documentation of [torch.Tensor.view](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.view) to avoid the misunderstanding referenced in issue [#16560](https://github.com/pytorch/pytorch/issues/16560)
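A minimal sketch of the distinction such an example addresses (values illustrative): `view` reinterprets the same row-major storage, it does not transpose.
```python
import torch

x = torch.arange(6).view(2, 3)
x.view(3, 2)  # [[0, 1], [2, 3], [4, 5]] -- same element order, new shape
x.t()         # [[0, 3], [1, 4], [2, 5]] -- an actual transpose
```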
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16563

Differential Revision: D13885008

Pulled By: soumith

fbshipit-source-id: b7e7fbea1f16124bc4e679ae9c50ab619e1f043d
2019-01-30 16:53:31 -08:00
Derek Kim
19717224c5 Miscellaneous broken RSTs fixed (#16033)
Summary:
https://pytorch.org/docs/master/tensors.html#torch.Tensor.bernoulli_
https://pytorch.org/docs/master/torch.html#torch.addmm
https://pytorch.org/docs/master/distributed_deprecated.html#torch.distributed.deprecated.reduce_multigpu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16033

Differential Revision: D13671202

Pulled By: soumith

fbshipit-source-id: 276e10e610affe205376573e7f0f9894695d218d
2019-01-15 09:50:12 -08:00
vishwakftw
95febdfacc Add is_floating_point to docs (#15704)
Summary:
Fixes #15700 .

Changelog:

- Expose torch.*.is_floating_point to docs
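A minimal usage sketch of the newly documented method:
```python
import torch

torch.tensor([1.0]).is_floating_point()  # True
torch.tensor([1]).is_floating_point()    # False (int64)
```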

Differential Revision: D13580734

Pulled By: zou3519

fbshipit-source-id: 76edb4af666c08237091a2cebf53d9ba5e6c8909
2019-01-07 10:43:22 -08:00
vishwakftw
41e7e1bc40 Rename potrs to cholesky_solve (#15334)
Summary:
Changelog:
- Renames `potrs` to `cholesky_solve` to remain consistent with Tensorflow and Scipy (not really, they call their function chol_solve)
- The default argument for `upper` in `cholesky_solve` is False. This allows a seamless interface between `cholesky` and `cholesky_solve`, since the `upper` argument in both functions is the same.
- Rename all tests
- Create a tentative alias for `cholesky_solve` under the name `potrs`, and add a deprecation warning to discourage usage.
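A minimal sketch of the renamed API (the matrix construction is illustrative):
```python
import torch

A = torch.randn(3, 3)
A = A @ A.t() + 3 * torch.eye(3)  # make A positive definite
b = torch.randn(3, 2)
L = torch.cholesky(A)             # lower-triangular factor (upper=False default)
x = torch.cholesky_solve(b, L)    # replaces the old potrs; solves Ax = b
```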
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15334

Differential Revision: D13507724

Pulled By: soumith

fbshipit-source-id: b826996541e49d2e2bcd061b72a38c39450c76d0
2018-12-19 12:31:24 -08:00
Ailing Zhang
6ab2e7442d Autograd using torchscript (#14604)
Summary:
This PR enables autodiff to use the forward/backward graph compiled from Python code, instead of using symbolic gradients (modifying the original graph directly).

We put the map in a separate .h file for now to wait for the native_functions.yaml and derivatives.yaml merge. This should ideally go into native_functions.yaml eventually.

This PR should be enough to unblock us for now, we can start writing gradients for aten functions in python.

Differential Revision: D13494635

Pulled By: ailzhang

fbshipit-source-id: f8d51a15243ac46afd09d930c573ccdfcd9fdaaf
2018-12-18 19:10:57 -08:00
vishwakftw
fc30e2782c Remove deprecated info argument in btrifact (#14935)
Summary:
As specified in the title.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14935

Differential Revision: D13394449

Pulled By: soumith

fbshipit-source-id: 569d59414f3a1a43ea641bded4b5433eb53e3490
2018-12-09 15:59:30 -08:00
Sam Gross
c1c841a4e7 Changes based on @gchanan's review of #13420 (#14441)
Summary:
```
The most significant change is that this fixes the error message when
indexing an empty tensor with an out-of-bounds index. For example:

  x = torch.ones(10, 0)
  x[:, [3, 4]]
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14441

Differential Revision: D13226737

Pulled By: colesbury

fbshipit-source-id: d1c4a35a30e3217e3d1727d13f6b354a4a3b2a24
2018-11-30 11:03:20 -08:00
albanD
f80d34a1c8 Update Tensor doc (#14339)
Summary:
Add to the Tensor doc info about `.device`, `.is_cuda`, `.requires_grad`, `.is_leaf` and `.grad`.
Update the `register_backward_hook` doc with a warning stating that it does not work in all cases.
Add support in the `_add_docstr` function to add docstring to attributes.

There is an explicit cast here, but I am not sure how to handle it properly. The thing is that the doc field for getsetdescr is written as a const char * (as are all other doc fields in descriptor objects) in the CPython online documentation. But in the code, it is the only one that is not const.
I assumed here that it is a bug in the code, because it does not follow the doc or the convention of the other descriptors, and so I cast away the const.
EDIT: the online doc I was looking at is for 3.7, and in that version both the code and the doc are const. For older versions, both are non-const.
Please let me know if this should not be done. And if it should be done, whether there is a cleaner way to do it!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14339

Differential Revision: D13243266

Pulled By: ezyang

fbshipit-source-id: 75b7838f7cd6c8dc72b0c61950e7a971baefaeeb
2018-11-28 15:28:17 -08:00
Ailing Zhang
e387d945c2 allow empty index for scatter_* methods (#14077)
Summary:
Fixes #2027
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14077

Differential Revision: D13095788

Pulled By: ailzhang

fbshipit-source-id: ad2c8bbf83d36e07940782b9206fbdcde8905fd3
2018-11-19 09:50:21 -08:00
Wei Yang
5dd153b1c2 speed up torch.sparse_mask() cpu kernel (#13290)
Summary:
- `sparse_mask(D, S)` is useful to implement backward for `sparse_addmm()`
- the previous `sparse_mask(D, S)` CPU kernel was not parallelized
- this PR speeds up the CPU kernel for two separate cases:
  - `D.dim == S.sparse_dim`: simply parallelize the kernel
  - `D.dim > S.sparse_dim`: simply use the CUDA kernel implementation
- performance:

`D.dim == S.sparse_dim`
```
>>> nnz = 100000
>>> dims = [1000, 1000]
>>> I = torch.cat([torch.randint(0, dims[0], size=(nnz,)),
               torch.randint(0, dims[1], size=(nnz,))], 0).reshape(2, nnz)
>>> V = torch.randn(nnz)
>>> size = torch.Size(dims)

>>> S = torch.sparse_coo_tensor(I, V, size).coalesce()
>>> D = torch.randn(dims)

>>> %timeit D.sparse_mask(S)

======= before change =======
6.4 ms ± 684 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

======= after change =======
333 µs ± 89.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

`D.dim > S.sparse_dim`
```
>>> nnz = 100000
>>> dims = [1000, 1000, 2, 2]
>>> I = torch.cat([torch.randint(0, dims[0], size=(nnz,)),
               torch.randint(0, dims[1], size=(nnz,))], 0).reshape(2, nnz)
>>> V = torch.randn(nnz, dims[2], dims[3])
>>> size = torch.Size(dims)

>>> S = torch.sparse_coo_tensor(I, V, size).coalesce()
>>> D = torch.randn(dims)
%timeit D.sparse_mask(S)

======= before change =======
495 ms ± 41.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

======= after change =======
594 µs ± 68.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13290

Differential Revision: D12878336

Pulled By: weiyangfb

fbshipit-source-id: 10b5981af382f7c6095a42c0fee7297d6438ce37
2018-11-07 20:02:17 -08:00
Thomas Viehmann
f0ed927b62 Add diag_embed to ATen and torch (#12447)
Summary:
Fixes: #12160
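A minimal usage sketch (shapes illustrative):
```python
import torch

v = torch.randn(2, 3)
torch.diag_embed(v)  # shape (2, 3, 3): a batch of diagonal matrices
```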
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12447

Differential Revision: D12916234

Pulled By: SsnL

fbshipit-source-id: 512a04efb0c2e0a54295b857a61be66c3aae13da
2018-11-05 08:55:28 -08:00
Brian Vaughan
07f8b61cc6 Roll operator t32802531 (#13261)
Summary:
Adding a roll operator
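A minimal usage sketch (values illustrative):
```python
import torch

t = torch.arange(6).reshape(2, 3)
torch.roll(t, shifts=1, dims=1)  # circularly shift columns by one
```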
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13261

Differential Revision: D12922575

Pulled By: nairbv

fbshipit-source-id: ff05c075d9c484a615011192b023debf47da4017
2018-11-05 08:33:36 -08:00
vishwakftw
d714ecf879 Rename potrf to cholesky (#12699)
Summary:
This PR renames `potrf`, the function responsible for the Cholesky decomposition of positive definite matrices, to `cholesky`, as NumPy and TF do.

Billing of changes
- make potrf cname for cholesky in Declarations.cwrap
- modify the function names in ATen/core
- modify the function names in Python frontend
- issue warnings when potrf is called to notify users of the change

Reviewed By: soumith

Differential Revision: D10528361

Pulled By: zou3519

fbshipit-source-id: 19d9bcf8ffb38def698ae5acf30743884dda0d88
2018-11-01 15:10:55 -07:00
Doug Friedman
bc352ace7c dense.to_sparse() re: #8853 (#12171)
Summary:
Here is my stab at ```dense.to_sparse```
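A minimal usage sketch (values illustrative):
```python
import torch

d = torch.tensor([[0., 1.], [2., 0.]])
s = d.to_sparse()  # sparse COO tensor holding only the nonzeros
s.to_dense()       # round-trips back to the dense layout
```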
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12171

Differential Revision: D10859078

Pulled By: weiyangfb

fbshipit-source-id: 5df72f72ba4f8f10e283402ff7731fd535682664
2018-10-26 21:48:52 -07:00
Tongzhou Wang
46162ccdb9 Autograd indices/values and sparse_coo ctor (#13001)
Summary:
Reopen of #11253 after fixing a bug in index_select
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13001

Differential Revision: D10514987

Pulled By: SsnL

fbshipit-source-id: 399a83a1d3246877a3523baf99aaf1ce8066f33f
2018-10-24 10:00:22 -07:00
Jan Schlüter
373b5080da Warn that tensor.resize_() resets strides (#12816)
Summary:
As discussed in #1570, this adds a warning to the docstring of `tensor.resize_()` to prevent people from naively using it as an in-place view or reshape.

For your convenience, the updated docstring renders as follows:
![torch_resize_docstring](https://user-images.githubusercontent.com/629706/47148782-f1b57900-d2d1-11e8-9749-e9c7387113ed.png)

Fixes #1570.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12816

Differential Revision: D10457755

Pulled By: ezyang

fbshipit-source-id: dd4b3a821e8c76dc534d81c53084abdb336e690a
2018-10-18 22:47:30 -07:00
Thomas Viehmann
0521c47c91 Amend nondeterminism notes (#12217)
Summary:
include atomicAdd commentary as this is less well known

There is some discussion in #12207

Unfortunately, I cannot seem to get the `.. include::` directive working in `_tensor_docs.py` and `_torch_docs.py`. I could use a hint for that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12217

Differential Revision: D10419739

Pulled By: SsnL

fbshipit-source-id: eecd04fb7486bd9c6ee64cd34859d61a0a97ec4e
2018-10-16 23:59:26 -07:00
vishwakftw
0740a5d521 compute_uv for SVD (#12517)
Summary:
Adds a `compute_uv` argument that defaults to `True` for optionally computing the singular vectors during SVD.

Closes https://github.com/pytorch/pytorch/issues/12420 .
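A minimal sketch of the new argument (shapes illustrative):
```python
import torch

A = torch.randn(4, 3)
U, S, V = torch.svd(A)                         # default: compute_uv=True
_, S_only, _ = torch.svd(A, compute_uv=False)  # skips U and V; only S is meaningful
```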
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12517

Differential Revision: D10384554

Pulled By: SsnL

fbshipit-source-id: 704998a257afa815eda901b8ae830e8a661695be
2018-10-15 12:35:56 -07:00
Thomas Viehmann
0cf3c1ce66 Add copy= keyword to Tensor.to (#12571)
Summary:
Fixes: #12454
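A minimal sketch of the new keyword:
```python
import torch

t = torch.randn(3)
t.to(torch.float32)             # same dtype/device: returns t itself
t.to(torch.float32, copy=True)  # always returns a fresh copy
```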
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12571

Differential Revision: D10356994

Pulled By: SsnL

fbshipit-source-id: d87416078a5a8e5ffa690cd73c09fa6b4e16aa25
2018-10-12 02:10:44 -07:00
Doug Friedman
c2f8f5076c add narrow() support for sparse tensors re: #8853 (#11342)
Summary:
Couple questions:

1) I used the log1p implementation in #8969 as a guide, especially for testing. I'm not sure what the ```skipIfROCM``` annotation is for, so I'm unsure if I need it for my test.

2) I implemented the branching logic in the narrow function itself; is this the right place to do so? I noticed that there are a number of places where sparse-specific logic is handled with just an if statement in this file. Or should I implement a separate dispatch in native_functions.yml, as in the log1p PR?

And of course, I'm happy to make any other updates/changes that I may have missed as well. This is my first PR to the project.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11342

Differential Revision: D9978430

Pulled By: weiyangfb

fbshipit-source-id: e73dc20302ab58925afb19e609e31f4a38c634ad
2018-09-26 12:24:54 -07:00
Wei Yang
817e83fc01 fix PR #11061 (#11815)
Summary:
- fix PR https://github.com/pytorch/pytorch/pull/11061 by moving `detach_()` and `set_requires_grad()` to `torch.tensor_ctor()` and `tensor.new_tensor`, and also remove warnings and `args_requires_grad` from `internal_new_from_data`
- with this patch, the returned tensor from `tensor_ctor()` and `new_tensor` will be detached from the source tensor, and requires_grad will be set based on the input args
- `torch.as_tensor` retains its behavior as documented

gchanan apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11815

Differential Revision: D9932713

Pulled By: weiyangfb

fbshipit-source-id: 4290cbc57bd449954faadc597c24169a7b2d8259
2018-09-21 11:04:19 -07:00
Tongzhou Wang
24e958a0a7 Move bernoulli into ATen (#10273)
Summary:
+ https://github.com/pytorch/pytorch/issues/10236 : torch.bernoulli's out kwarg is broken
fixed by moving `bernoulli_out` to ATen
+ https://github.com/pytorch/pytorch/issues/9917 : BUG torch.bernoulli(p.expand(shape)) is broken
  fixed by moving all `bernoulli` ops in ATen to use the modern apply utils methods
+ https://github.com/pytorch/pytorch/issues/10357 : torch.bernoulli inconsistent gpu/cpu results
  fixed by adding CUDA asserts

In order to use `curand_uniform4`, I made some changes to `CUDAApplyUtils.cuh`. Specifically, I introduced an optional template parameter `int step` to the `CUDA_tensor_applyN` methods, representing that we want to process `step` values at a time for each of the `N` tensors.

The calling convention for `step = 1` (default) isn't changed. But if `step > 1`, the given lambda `op` must take in `int n` as its first argument, representing the number of valid values, because there may not be `step` valid values at the boundary. E.g., here is what the `bernoulli(self, p_tensor)` call looks like:
```cpp

  // The template argument `4` below indicates that we want to operate on four
  // elements at a time. See NOTE [ CUDA_tensor_applyN helpers ] for details.
  at::cuda::CUDA_tensor_apply2<scalar_t, prob_t, 4>(
      ret, p,
      [seeds] __device__(
          int n, scalar_t& v1, scalar_t& v2, scalar_t& v3, scalar_t& v4,
          const prob_t& p1, const prob_t& p2, const prob_t& p3, const prob_t& p4) {
        curandStatePhilox4_32_10_t state;
        curand_init(
            seeds.first,
            blockIdx.x * blockDim.x + threadIdx.x,
            seeds.second,
            &state);
        float4 rand = curand_uniform4(&state);
        switch (n) {
          case 4: {
            assert(0 <= p4 && p4 <= 1);
            v4 = static_cast<scalar_t>(rand.w <= p4);
          }
          case 3: {
            assert(0 <= p3 && p3 <= 1);
            v3 = static_cast<scalar_t>(rand.z <= p3);
          }
          case 2: {
            assert(0 <= p2 && p2 <= 1);
            v2 = static_cast<scalar_t>(rand.y <= p2);
          }
          case 1: {
            assert(0 <= p1 && p1 <= 1);
            v1 = static_cast<scalar_t>(rand.x <= p1);
          }
        }
      }
    );
```

Benchmarking on `torch.rand(200, 300, 400)` 20 times, each time with 20 loops:

post patch
```
➜  ~ numactl --cpunodebind 1 --membind 1 -- taskset -c 12,13,14,15,16,17,18,19,20,21,22,23 env CUDA_LAUNCH_BLOCKING=1 python bern.py
torch.bernoulli(x)
6.841588497161865 +- 0.05413117632269859
torch.bernoulli(xc)
0.05963418632745743 +- 0.0008014909108169377
x.bernoulli_()
0.4024486541748047 +- 0.0021550932433456182
xc.bernoulli_()
0.02167394384741783 +- 2.3818030967959203e-05

```

pre-patch
```
➜  ~ numactl --cpunodebind 1 --membind 1 -- taskset -c 12,13,14,15,16,17,18,19,20,21,22,23 env CUDA_LAUNCH_BLOCKING=1 python bern.py
torch.bernoulli(x)
12.394511222839355 +- 0.0966421514749527
torch.bernoulli(xc)
0.08970972150564194 +- 0.0038722590543329716
x.bernoulli_()
1.654480218887329 +- 0.02364428900182247
xc.bernoulli_()
0.058352887630462646 +- 0.003094920190051198

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10273

Differential Revision: D9831294

Pulled By: SsnL

fbshipit-source-id: 65e0655a36b90d5278b675d35cb5327751604088
2018-09-19 16:45:47 -07:00
Tongzhou Wang
3a39006d38 Fix some more doc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11531

Differential Revision: D9776541

Pulled By: SsnL

fbshipit-source-id: 8725485639ea6e9479b6ea95a49f5b75a9457db7
2018-09-11 16:26:55 -07:00
Tongzhou Wang
de460c7ad3 Improvements on conv/pool/fold/stft/ParamDict docs (#11106)
Summary:
Also fixes some incorrect formula rendering.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11106

Differential Revision: D9752433

Pulled By: SsnL

fbshipit-source-id: 535fc8498638e8b645757fc7535d8771992b7d21
2018-09-11 08:56:21 -07:00
Tongzhou Wang
d3f98b5ffc Add matrix power (#11421)
Summary:
vishwakftw Your patch needed some updates because the default native function dispatches changed from `[function, method]` to `[function]`. The CI was run before that change happened so it still shows green, but the internal test caught it.

I did some changes when rebasing and updating so I didn't just force push to your branch. Let's see if this passes CI and internal test. If it does, let me know if you want me to force push to your branch or use this PR instead.

Note to reviewers: patch was already approved at #10068 .

cc yf225
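A minimal usage sketch (values illustrative):
```python
import torch

A = torch.randn(3, 3)
torch.matrix_power(A, 3)  # A @ A @ A
torch.matrix_power(A, 0)  # identity matrix of matching size
```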
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11421

Differential Revision: D9733407

Pulled By: SsnL

fbshipit-source-id: cf2ed293bb9942dcc5158934ff4def2f63252599
2018-09-08 15:25:56 -07:00
vishwakftw
593d74061f Document torch.allclose (#11185)
Summary:
- Modify torch.autograd.gradcheck to use torch.allclose instead
- Expose doc strings

Closes #10355
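A minimal sketch of the newly documented function (tolerances illustrative):
```python
import torch

a = torch.tensor([1.0, 1e-8])
b = torch.tensor([1.0, 0.0])
torch.allclose(a, b)              # True under the default rtol/atol
torch.allclose(a, b, atol=1e-10)  # False with a tighter absolute tolerance
```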
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11185

Differential Revision: D9628016

Pulled By: soumith

fbshipit-source-id: 22a30622b9fe52e41b5b3540406137b59d8c5a75
2018-09-02 09:26:07 -07:00
Tongzhou Wang
e9eed8edb4 Add doc for Tensor.digamma_? (#11008)
Summary:
follow up for #10967

zou3519 vishwakftw
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11008

Differential Revision: D9559889

Pulled By: SsnL

fbshipit-source-id: a05d8fbad92a54bcdb93de6e62a7f94180da1d99
2018-08-29 14:11:16 -07:00
Tongzhou Wang
d043f83019 Add tests for Tensor.* nn.* F.* docs (#10311)
Summary:
Test only for existence for now. I had to skip a lot of them, so there is a FIXME in the test.

Also, I'm not testing torch.* because of a namespace issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10311

Differential Revision: D9196341

Pulled By: SsnL

fbshipit-source-id: 9c2ca1ffe660bc1cc664474993f8a21198525ccc
2018-08-14 11:39:46 -07:00
Vishwak Srinivasan
360c1bbd5b Add multivariate log-gamma (mvlgamma) (#9451)
Summary:
1. Add tests in test_cuda, test_torch
2. Add doc strings

Closes https://github.com/pytorch/pytorch/issues/9378 .
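A minimal usage sketch (inputs illustrative; elements must exceed (p - 1) / 2):
```python
import torch

x = torch.tensor([2.0, 3.0])
torch.mvlgamma(x, 2)  # multivariate log-gamma of order p = 2, elementwise
```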

Differential Revision: D8859746

Pulled By: ezyang

fbshipit-source-id: 939c309d90940a7aa08f53004c9e7b3b1c9cf54e
2018-07-24 12:10:10 -07:00
Tongzhou Wang
2a0018f2a8 Add scatter_add_ doc (#9630)
Summary:
fixes #4176 cc vishwakftw

I didn't do `:math:` and `\neg` because I am using double ticks so they render more similarly to `:attr:`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9630

Differential Revision: D8933022

Pulled By: SsnL

fbshipit-source-id: 31d8551f415b624c2ff66b25d886f20789846508
2018-07-20 08:41:05 -07:00
vishwakftw
52cc073212 Implement reshape_as (#9452)
Summary:
1. Added tests
2. Added doc string
3. Remove the redundant view_as definition from tensor.py

Closes #9416
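A minimal usage sketch (shapes illustrative):
```python
import torch

a = torch.randn(2, 3)
b = torch.randn(3, 2)
a.reshape_as(b)  # equivalent to a.reshape(b.shape)
```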
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9452

Differential Revision: D8851794

Pulled By: ezyang

fbshipit-source-id: 0aa0430dd0a174e1a5caddbc50a7e2c9eb7802bc
2018-07-17 08:54:42 -07:00
Alican Bozkurt
d017e1798f add erfc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9366
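A minimal usage sketch; erfc(x) = 1 - erf(x):
```python
import torch

torch.erfc(torch.tensor([0.0]))  # tensor([1.]) since erf(0) = 0
```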

Differential Revision: D8816768

Pulled By: soumith

fbshipit-source-id: 7d709f932cf156a2e7ec71c710837beb7f647d66
2018-07-12 08:32:02 -07:00
Xiang Gao
a615baa51f move unbind to ATen
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/8587
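A minimal usage sketch (shapes illustrative):
```python
import torch

t = torch.arange(6).reshape(2, 3)
a, b = torch.unbind(t, dim=0)  # two tensors of shape (3,)
```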

Differential Revision: D8764086

Pulled By: soumith

fbshipit-source-id: 7f311cf13c341040e1f2cf4a8f05723e32d38947
2018-07-08 16:46:35 -07:00
Vishwak Srinivasan
14cbd9adb8 Implement torch.pinverse : Pseudo-inverse (#9052)
Summary:
1. Used SVD to compute the pseudo-inverse.
2. Tests in test_autograd, test_cuda and test_torch
3. Doc strings in _torch_docs.py and _tensor_docs.py

Closes #6187
Closes https://github.com/pytorch/pytorch/pull/9052
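A minimal usage sketch (shape and tolerance illustrative):
```python
import torch

A = torch.randn(4, 3)
A_pinv = torch.pinverse(A)                    # Moore-Penrose pseudo-inverse via SVD
torch.allclose(A @ A_pinv @ A, A, atol=1e-5)  # defining property, approximately
```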

Reviewed By: soumith

Differential Revision: D8714628

Pulled By: SsnL

fbshipit-source-id: 7e006c9d138b9f49e703bd0ffdabe6253be78dd9
2018-07-05 09:11:24 -07:00