Commit Graph

18346 Commits

Author SHA1 Message Date
Tongzhou Wang
f051fbd4a8 Fix typo in test_dataloader
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21226

Differential Revision: D15592797

Pulled By: soumith

fbshipit-source-id: b9a83e574c7b10fb0d661332ab68e376409a4724
2019-06-01 10:30:14 -07:00
Natalia Gimelshein
d168a8533f compare scalar device with common device (#21236)
Summary:
I think there was a typo in #20690, here: https://github.com/pytorch/pytorch/pull/20690/files#diff-b47a50873394e38a005b4c1acd151957R130.
The original conditional was `common_backend == Backend::CUDA && op.tensor.type().backend() == Backend::CPU`; now it is `op.device.is_cuda() && op.tensor.device().is_cpu()`. It seems that `op.device` and `op.tensor.device()` should be the same, so this conditional is never true. This leads to spurious host-to-device (h2d) copies for operations between CUDA tensors and CPU scalars, because CPU scalars are now sent to the GPU instead of being passed to the lambdas directly.
Unfortunately, I don't know how to test this change, because functionally everything was fine after #20690; it was just a performance regression.
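As a hedged illustration (assuming a CUDA build; the tensor names are made up), this is the kind of op the regression affected:

```python
import torch

# A CUDA tensor combined with a 0-dim CPU scalar tensor. With the typo, the
# CPU scalar was first copied to the GPU (a spurious h2d copy) instead of
# being handed to the CUDA lambda directly; the result was always correct,
# only slower.
x = torch.randn(1024, device="cuda")
s = torch.tensor(2.0)  # 0-dim CPU scalar tensor
y = x * s
```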

cc colesbury
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21236

Differential Revision: D15592754

Pulled By: soumith

fbshipit-source-id: 105bfecc61c222cfdb7294a03c9ecae3cc7f5817
2019-06-01 10:24:31 -07:00
Hans Lee
41b17e2458 Fix wrong type hints for Tensor.is_cuda, is_leaf (#21192)
Summary:
`Tensor.is_cuda` and `Tensor.is_leaf` are not predicate functions but `bool` attributes. This patch fixes the type hints in `torch/__init__.pyi` for those attributes.

```diff
- def is_cuda(self) -> bool: ...
+ is_cuda: bool
- def is_leaf(self) -> bool: ...
+ is_leaf: bool
```
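A quick runtime check of what the corrected stubs describe (a minimal sketch, not part of the patch):

```python
import torch

t = torch.randn(3)
# These are plain bool attributes at runtime, matching the fixed .pyi stubs:
assert isinstance(t.is_cuda, bool)  # attribute, not a method
assert isinstance(t.is_leaf, bool)
```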
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21192

Differential Revision: D15592766

Pulled By: soumith

fbshipit-source-id: 8c4ecd6939df8b8a8a19e1c9db6d40193bca7e4a
2019-06-01 10:04:52 -07:00
peter
be7fc40621 Fix sccache not being used on Windows (#21248)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/21167.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21248

Differential Revision: D15592742

Pulled By: soumith

fbshipit-source-id: 4add002698c13301f142526cd783c866d345bf5e
2019-06-01 09:47:39 -07:00
James Reed
619261d7a7 Add file-line info for jit.load and string frontend (#21217)
Summary:
This makes file-line reporting also work for things loaded using `torch.jit.load()`, as well as for the string frontend (via `CompilationUnit`).
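A minimal sketch of the string-frontend path (the function body here is made up):

```python
import torch

# Compile TorchScript from a source string; with this change, the printed
# graph should carry file-line information for this path too.
cu = torch.jit.CompilationUnit('''
def foo(x):
    return x + 1
''')
print(cu.foo.graph)
```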
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21217

Differential Revision: D15590838

Pulled By: jamesr66a

fbshipit-source-id: 6b6a12574bf9eca0b83f24f0b50535fda5863243
2019-05-31 23:43:15 -07:00
Owen Anderson
b663eec119 Lazily build error strings in schema matching using replay. (#21241)
Summary:
Saves ~20% (5.3s -> 4.3s) loading DenseNet on my laptop.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21241

Differential Revision: D15590338

fbshipit-source-id: 2c8aebc829d4ea46f358d74d396cc44f5f57fcf5
2019-05-31 23:34:20 -07:00
Sovvo
5bc7c1f83d fix contribution and governance links (#21243)
Summary:
Updated web links in the contribution_guide and governance documentation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21243

Differential Revision: D15591065

Pulled By: soumith

fbshipit-source-id: fdcfc518605a08a2ac35a10c146122d7d0a3f609
2019-05-31 21:02:13 -07:00
Chunli Fu
85786bea7d Export feature length information for onnxifi operator (#21110)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21110

Export feature length information for onnxifi operator

Reviewed By: ipiszy

Differential Revision: D15548138

fbshipit-source-id: 460118648bb4467c096f79dea524060c9524f23d
2019-05-31 20:25:34 -07:00
Mingzhe Li
516ea33f6a add PT maxpool and avgpool ops to the benchmark suite (#21200)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21200

This diff adds MaxPool1d/2d/3d and AvgPool1d/2d/3d to the benchmark suite.

Reviewed By: hl475

Differential Revision: D15541980

fbshipit-source-id: 394d136ee94a16ee24285939323ca5fe317e99d3
2019-05-31 19:35:29 -07:00
Mingzhe Li
dceea73460 add PT conv and convtranspose ops to the benchmark suite (#21199)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21199

This diff adds Conv1d, ConvTranspose1d, Conv2d, ConvTranspose2d, Conv3d, and ConvTranspose3d operators to the benchmark suite.

Reviewed By: hl475

Differential Revision: D15520817

fbshipit-source-id: 5512afec2be8a1036fbcd170f70265c7e455fcde
2019-05-31 19:35:25 -07:00
Mingzhe Li
2d75d31398 add PT linear op to the benchmark suite (#21204)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21204

as title

Reviewed By: hl475

Differential Revision: D15484743

fbshipit-source-id: 7094a983e370e1c3952021146b58b844874b7d5e
2019-05-31 19:35:22 -07:00
Mingzhe Li
00b3e69211 add PT batchnorm op to the benchmark suite (#21201)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21201

as title

Reviewed By: hl475

Differential Revision: D15482581

fbshipit-source-id: d93713a35be41e76d077df419cb24585f69d72eb
2019-05-31 19:35:18 -07:00
Mingzhe Li
ed1078bde3 migrate matmul operator to the new interface (#21198)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21198

as title

Reviewed By: hl475

Differential Revision: D15325768

fbshipit-source-id: a5d7c6837cd09445e75846660d12807dd26af6cc
2019-05-31 19:35:15 -07:00
Michael Suo
c8dc707fee avoid multiple writes to files on export (#21186)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21186
ghimport-source-id: 2f62fed50e0d74f4162b74b6a2f44b8baa376316

Differential Revision: D15581527

Pulled By: suo

fbshipit-source-id: b1150cfa47d8df6f217f048c742a5ba9fa7f7935
2019-05-31 19:14:46 -07:00
Junjie Bai
4c19421f16 Register gradient op with engine (#21205)
Summary:
cc dreiss
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21205

Differential Revision: D15578948

Pulled By: bddppq

fbshipit-source-id: ef285174e8637daef624c8088ebd903a70582345
2019-05-31 18:48:47 -07:00
James Reed
daa1e2de1a Add file:line:graph to graph printout (#21180)
Summary:
Example:

```
import torch

@torch.jit.script
def foo(x):
    y = torch.neg(x)
    return x - y

print(foo.graph.debug_str())
```

```
graph(%x : Tensor):
  %2 : int = prim::Constant[value=1]()
  %y : Tensor = aten::neg(%x) # demo.py:5:9
  %3 : Tensor = aten::sub(%x, %y, %2) # demo.py:6:12
  return (%3)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21180

Differential Revision: D15583548

Pulled By: jamesr66a

fbshipit-source-id: 0c6dc2fb7555c01dde9c563b78422ef234b2681b
2019-05-31 18:14:18 -07:00
Aapo Kyrola
678dc44d4c use _sparse_coo_tensor_unsafe in coalesce for speedup (#21214)
Summary:
Studied why sparse tensor coalesce was slow: issue #10757.

Using nvprof and a simple benchmark, I determined that the bulk of the time was spent in ``kernelTransformReduceInnermostDimIndex``, which is called when a sparse tensor is constructed with sparse_coo_tensor and sanity-checks the minimum and maximum indices. However, we do not need this sanity check here, because coalescing the tensor cannot change these min/max values.

On my benchmark with 1 million non-zeros, the runtime of coalesce improved about 10x, from 0.52s to 0.005 sec.
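A rough sketch of such a benchmark (assuming CUDA; shapes and sizes are illustrative, not the ones used above):

```python
import time
import torch

# Build a sparse COO tensor with ~1M non-zeros and time coalesce().
n = 1_000_000
indices = torch.randint(0, 10_000, (2, n), device="cuda")
values = torch.randn(n, device="cuda")
t = torch.sparse_coo_tensor(indices, values, (10_000, 10_000))

torch.cuda.synchronize()
start = time.time()
t.coalesce()
torch.cuda.synchronize()
print(f"coalesce: {time.time() - start:.4f} s")
```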
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21214

Reviewed By: bddppq

Differential Revision: D15584338

Pulled By: akyrola

fbshipit-source-id: a08378baa018dbd0b45d7aba661fc9aefd3791e0
2019-05-31 17:10:05 -07:00
Yinghai Lu
9e5f1db66b Reuse common options between ONNXIFI and TVM transformations (#21163)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21163

These two backend transformations share some common traits, so we want to reuse the data structures/code as much as possible.

Reviewed By: hlu1

Differential Revision: D15561177

fbshipit-source-id: 35f5d63b2b5b3657f4ba099634fd27c3af545f1b
2019-05-31 17:01:36 -07:00
Mikhail Zolotukhin
b12a5f6155 schema_matching.cpp: mark internal functions as static. (#21140)
Summary:
Some of the functions are only used in this file - mark them `static`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21140

Differential Revision: D15578076

Pulled By: Krovatkin

fbshipit-source-id: 71ae67baabebd40c38ecb9292b5b8202ad2b9fc1
2019-05-31 16:40:16 -07:00
Mingzhe Li
668dbcc41b migrate intraop benchmarks to the new interface (#21202)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21202

Migrate Ilia's op benchmarks to the new interface

Reviewed By: hl475

Differential Revision: D15322577

fbshipit-source-id: 8e75d51e7ddacbd56896c55f2996a9358491d83e
2019-05-31 16:19:04 -07:00
Mingzhe Li
c62d476206 migrate add operator to the new interface (#21152)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21152

Migrate existing add benchmark to use the new op front-end

Reviewed By: zheng-xq

Differential Revision: D15325524

fbshipit-source-id: 34e969e1bd289913d881c476711bce9f8ac18a29
2019-05-31 16:19:00 -07:00
Jerry Zhang
fd19d06db4 remaining use of t.quantize_linear (#21219)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21219

att

Differential Revision: D15583802

fbshipit-source-id: 742e8b799d67485b2d48b1458839f3f3b000f200
2019-05-31 16:05:44 -07:00
Hong Xu
4dbeb87e52 PyTorch Dockerfile should update submodules recursively.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21216

Differential Revision: D15584114

Pulled By: bddppq

fbshipit-source-id: dbe0c3a54024a90fcd2c6689f8b9689ed0cd639b
2019-05-31 14:56:57 -07:00
Elias Ellison
0aeb971622 conditionally defined var better error message (#20911)
Summary:
I will do loops in a follow-up after some other changes I am working on have landed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20911

Differential Revision: D15497205

Pulled By: eellison

fbshipit-source-id: 8cac197c6a6045b27b552cbb39e6fc86ca747b18
2019-05-31 14:32:03 -07:00
davidriazati
2f4824b2fb Add support for recursive compilation on Modules (#20708)
Summary:
Following on #19747, this implements most of the `torch.jit.script()` changes laid out in #20939.

Still to do:
* Accessing a method from Python does not add it as a `ScriptMethod` (so only `export`ed methods and `forward` are compiled)
* Calling a method other than `forward` on a submodule doesn't work

Pull Request resolved: https://github.com/pytorch/pytorch/pull/20708

Pulled By: driazati

Differential Revision: D15560490

fbshipit-source-id: cc7ef3a1c2772eff9beba5f3e66546d2b7d7198a
2019-05-31 14:27:16 -07:00
Sebastian Messmer
834d678eb8 Remove old custom op implementation (#21085)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21085

Now that torch::jit::RegisterOperators() always passes through to torch::RegisterOperators() (see diffs stacked below this), we can remove the old custom op implementation.

Reviewed By: dzhulgakov

Differential Revision: D15542261

fbshipit-source-id: ef437e6c71950e58fdd237d6abd035826753c2e4
2019-05-31 13:51:14 -07:00
Sebastian Messmer
384d828ea5 Add aliasAnalysis to torch::RegisterOperators() (#21084)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21084

- Now AliasAnalysisKind can be set using the torch::RegisterOperators() API
- This also allows us to remove the last place in torch::jit::RegisterOperators that didn't use c10 yet.

Reviewed By: dzhulgakov

Differential Revision: D15542097

fbshipit-source-id: ea127ecf051a5c1e567e035692deed44e04faa9e
2019-05-31 13:51:07 -07:00
Sebastian Messmer
80556761c8 c10::OperatorOptions (#21181)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21181

Implement c10::OperatorOptions as a class to store metadata about operators.
This is meant to replace torch::jit::OperatorOptions.

Reviewed By: dzhulgakov

Differential Revision: D15569897

fbshipit-source-id: 95bf0bf917c1ef2bdf32702405844e1a116d9a64
2019-05-31 13:51:00 -07:00
Sebastian Messmer
b91e0d14a7 registration options should only be callable on rvalues (#21079)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21079

They're invalidating *this, so they shouldn't be callable on non-rvalues.

Reviewed By: dzhulgakov

Differential Revision: D15541583

fbshipit-source-id: a2a9dafb29af03477486ea2ce9029399f557c728
2019-05-31 13:50:54 -07:00
Owen Anderson
181792176d Implement various AliasAnalysis operations directly on top of MemoryLocations. (#21203)
Summary:
This reduces DenseNet load time by about 25% (down to 5.3s on my laptop) and gets AliasAnalysis out of the profile top hits entirely.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21203

Differential Revision: D15578155

fbshipit-source-id: ddbb1ad25c9540b5214702830084aa51cc6fd3cb
2019-05-31 13:38:32 -07:00
Thor Johnsen
e098878d75 Cuda persistent softmax (#20827)
Summary:
Adds persistent CUDA kernels that speed up Softmax applied over the fast dimension, i.e. torch.nn.Softmax(dim=-1) and torch.nn.LogSoftmax(dim=-1). When the size is <= 1024, this code is 2-10x faster than the current code; the speedup is higher for smaller sizes. This code works for half, float, and double tensors with 1024 or fewer elements in the fast dimension. Numerical accuracy is on par with the current code, i.e. relative error is ~1e-8 for float tensors and ~1e-17 for double tensors, computed against the CPU code.

The attached image shows kernel time in us for torch.nn.Softmax(dim=-1) applied to a half precision tensor of shape [16384,n], n is plotted along the horizontal axis. Similar uplifts can be seen for the backward pass and for LogSoftmax.

![image](https://user-images.githubusercontent.com/41591019/58212822-b63ebb00-7cb5-11e9-910d-1fc7d8585d58.png)
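A sketch of the benchmarked configuration (assuming CUDA; the iteration count is chosen arbitrarily):

```python
import time
import torch

# Softmax over the fast (last) dimension of a half-precision [16384, n]
# tensor with n <= 1024, the regime covered by the persistent kernels.
x = torch.randn(16384, 1024, device="cuda", dtype=torch.half)
softmax = torch.nn.Softmax(dim=-1)

torch.cuda.synchronize()
start = time.time()
for _ in range(100):
    softmax(x)
torch.cuda.synchronize()
print(f"{(time.time() - start) / 100 * 1e6:.1f} us per iteration")
```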
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20827

Differential Revision: D15582509

Pulled By: ezyang

fbshipit-source-id: 65805db37487cebbc4ceefb1a1bd486d24745f80
2019-05-31 13:20:15 -07:00
Tao Xu
052bab7069 Move legacy TH functions(sinh,cosh) to TensorIterator + Vec256 (#21115)
Summary:
This is a follow-up on James's PR: https://github.com/pytorch/pytorch/pull/19041. The idea is to replace the legacy `sinh` / `cosh` ops that are dispatched to TH with the operations defined in `Vec256` for better performance.

benchmark (from James's script):

```python
import torch, time
ops = ['sinh', 'cosh']
x = torch.rand(1024, 1024)
NITER = 10000

print('op', 'time per iter (ms)', 'gops/s', 'GB/s', sep='\t')
for op in ops:
    s = time.time()
    for i in range(NITER):
        getattr(x, op)()
    elapsed_sec = ((time.time() - s) / NITER)
    print(op, elapsed_sec * 1000, (1024*1024/elapsed_sec)/1e9, (1024*1024*4*2) / elapsed_sec / 1e9, sep='\t')
```
code on master:

```
op	time per iter (ms)	gops/s	GB/s
sinh	3.37614369392395	0.3105839369002935	2.484671495202348
cosh	3.480502033233643	0.3012714803748572	2.4101718429988574
```
after change (on Macbook pro 2018):

```
op	time per iter (ms)	gops/s	GB/s
sinh	0.8956503868103027	1.1707425301677301	9.365940241341841
cosh	0.9392147302627564	1.1164390487217428	8.931512389773943
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21115

Reviewed By: ljk53

Differential Revision: D15574580

Pulled By: xta0

fbshipit-source-id: 392546a0df11ed4f0945f2bc84bf5dea2750b60e
2019-05-31 12:06:26 -07:00
Jerry Zhang
7f960a9c01 remove quantize_linear from Tensor method (#21196)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21196

we'll add `quantize(quantizer)` as a Tensor method later, when we expose `quantizer` in the Python frontend.
Python:
```
torch.quantize_linear(t, ...)
```
C++:
```
at::quantize_linear(t, ...)
```

Differential Revision: D15577123

fbshipit-source-id: d0abeea488418fa9ab212f84b0b97ee237124240
2019-05-31 12:01:10 -07:00
Jongsoo Park
c185145d8c remove dependency to caffe2::math and eigen (#21169)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21169

We should minimize dependencies in perfkernels (we were including Eigen header files only in .cc files not compiled with AVX or AVX2 options, but it is better to be very strict, because it is easy to introduce illegal-instruction errors in perfkernels).

Reviewed By: salexspb

Differential Revision: D15563839

fbshipit-source-id: d4b1bca22d7f2e6f20f23664d4b99498e5984586
2019-05-31 11:55:16 -07:00
Brennan Vincent
8c927b208c improve test_docs_coverage error messages (#21029)
Summary:
Most important fix: Correct "tensor.rst" to "tensors.rst"

Secondary fix: some minor English spelling/grammar fixes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21029

Differential Revision: D15523230

Pulled By: umanwizard

fbshipit-source-id: 6052d8609c86efa41a4289cd3a099b2f1037c810
2019-05-31 11:13:39 -07:00
davidriazati
e13b483f58 Fix weak module cuda() _flat_weights bug (#21107)
Summary:
Dynamically creating a type at runtime was messing up the MRO and has been causing many other problems. I think it's best to delete it. This causes a regression, since
```python
self.linear = nn.Linear(10, 10)
isinstance(self.linear, nn.Linear)
```
will now be `False` again, but this will be fixed once recursive script mode is the default (#20939)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21107

Pulled By: driazati

Differential Revision: D15560549

fbshipit-source-id: 7bd6b958acb4f353d427d66196bb4ee577ecb1a6
2019-05-31 10:35:30 -07:00
Mingzhe Li
0223d3744a introduce a new interface to add op [C2 changes] (#21148)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21148

The diff modifies the interface for Caffe2 operators in the benchmark suite

Reviewed By: zheng-xq

Differential Revision: D15433888

fbshipit-source-id: c264a95906422d7a26c10b1f9836ba8b35e36b53
2019-05-31 09:21:07 -07:00
Mingzhe Li
31089b02ce introduce a new interface to add op [core changes] (#21147)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21147

This diff introduces a new interface to add PT/C2 operators to the benchmark suite.

The following steps are needed to add a new operator (see the sketch after this list):
1. Specify the input shapes and args for an operator in configs.
2. Create a PT/C2 benchmark class that includes an `init` method (create tensors), a `forward` method (specify the operator to be tested), and a `backward` method (gradient of an op).
3. Call generate_pt_test/generate_c2_test to create test cases based on the configs.
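A hedged sketch of these steps for a PT op; the helper names (`op_bench.config_list`, `TorchBenchmarkBase`, `generate_pt_test`, `benchmark_runner`) follow the operator_benchmark suite and may differ slightly at this revision:

```python
import operator_benchmark as op_bench
import torch

# Step 1: specify input shapes/args in configs.
add_configs = op_bench.config_list(
    attrs=[[64, 64], [256, 256]],
    attr_names=["M", "N"],
    tags=["short"],
)

# Step 2: a PT benchmark class with init (create tensors) and forward
# (the operator under test).
class AddBenchmark(op_bench.TorchBenchmarkBase):
    def init(self, M, N):
        self.x = torch.rand(M, N)
        self.y = torch.rand(M, N)

    def forward(self):
        return torch.add(self.x, self.y)

# Step 3: generate test cases from the configs.
op_bench.generate_pt_test(add_configs, AddBenchmark)

if __name__ == "__main__":
    op_bench.benchmark_runner.main()
```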

Reviewed By: zheng-xq

Differential Revision: D15250380

fbshipit-source-id: 1025a7cf60d2427baa0f3f716455946d3d3e6a27
2019-05-31 09:21:04 -07:00
Edward Yang
012069ca8f Revert D15454048: Move THCTensor_{normal, normal_means, normal_stddevs, normal_means_stddevs} to ATen
Differential Revision:
D15454048

Original commit changeset: 8bfc57bf015b

fbshipit-source-id: 98c562ab4cf7a00e9041b2aa50eb7fb0f0c48f69
2019-05-31 07:49:22 -07:00
Edward Yang
dc8f306b8e Revert D15454052: Move THCTensor_(cauchy) to ATen
Differential Revision:
D15454052

Original commit changeset: 4f4d33ec11cf

fbshipit-source-id: 832a738796e6b6bdf969a44bb2cdcf171cbd5f77
2019-05-31 07:49:18 -07:00
Ailing Zhang
be9ce6318e remove import torchvision when testing torch.hub (#21132)
Summary:
This should pass once https://github.com/pytorch/vision/pull/971 is merged.
To remove torchvision as a baseline, we just compare against the sum of all `param.sum()` values in the pretrained resnet18 model, which means we only need to update the number manually when the pretrained weights change, which is generally rare.
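A sketch of the baseline check described above (the recorded comparison value is hypothetical):

```python
import torch

# Load the pretrained resnet18 via torch.hub and sum all parameters; the
# result is compared against a manually recorded constant in the test.
model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
total = sum(p.sum().item() for p in model.parameters())
print(total)  # update the recorded constant only if the pretrained weights change
```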
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21132

Differential Revision: D15563078

Pulled By: ailzhang

fbshipit-source-id: f28c6874149a1e6bd9894402f6847fd18f38b2b7
2019-05-31 07:38:30 -07:00
Edward Yang
e161360b62 Revert D15558784: [reland][pt1][quant] remove quantize_linear from Tensor method
Differential Revision:
D15558784

Original commit changeset: 0b194750c423

fbshipit-source-id: d180a7f76bb05ad7470f17bc3d2bd614fab16529
2019-05-31 06:20:05 -07:00
Sebastian Messmer
5fcd37bd8f List (#21164)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21164

Write a List type to be used in operator kernels. This abstracts away from the concrete list type used (e.g. `std::vector` vs `SmallVector`) and allows us to change these implementation details without breaking the kernel API. Also, this class allows for handling `List<bool>`, which would not work with `ArrayRef`, because `vector<bool>` is a bitset and can't be converted to `ArrayRef<bool>`.

Reviewed By: ezyang

Differential Revision: D15476434

fbshipit-source-id: 5855ae36b45b70437f996c81580f34a4c91ed18c
2019-05-31 04:15:39 -07:00
Jerry Zhang
f91f24764e remove quantize_linear from Tensor method (#21156)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21156

we'll add `quantize(quantizer)` as a Tensor method later, when we expose `quantizer` in the Python frontend.
Python:
```
torch.quantize_linear(t, ...)
```
C++:
```
at::quantize_linear(t, ...)
```

Differential Revision: D15558784

fbshipit-source-id: 0b194750c423f51ad1ad5e9387a12b4d58d969a9
2019-05-30 22:02:12 -07:00
Jerry Zhang
0a0ff83124 replace num_bits with quant_min and quant_max (#21097)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21097

att

Differential Revision: D15547166

fbshipit-source-id: 60bc7f7d82c424558b67881627fb74f1eff515af
2019-05-30 20:57:57 -07:00
Jerry Zhang
277bf69fa0 Add torch.load/torch.save for QTensor (#20830)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20830

att

Reviewed By: dzhulgakov

Differential Revision: D15340701

fbshipit-source-id: 677038c8101f66dec4856c2eccf9f9e394012226
2019-05-30 20:52:19 -07:00
vishwakftw
eb4d43df3b Make CUDA triu / tril support batches of size > 65535 (#21067)
Summary:
In the previous implementation of triu / tril, we passed the batch size in the 2nd dimension of the grid. This is limited to 65535, which means that performing triu / tril on a tensor with batch size > 65535 would throw an error. This PR removes the dependence on the 2nd dimension and the corresponding non-contiguity constraints (see the example after the changelog).

Changelog:
- Compute offset, row, and col in the kernel
- Use the 1st dimension of the grid alone
- Remove unnecessary contiguity checks on tensors as a result of this change
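A small example of the newly supported case (assuming CUDA and enough memory; the shape is illustrative):

```python
import torch

# Batch size above the old 65535 grid-dimension limit now works.
x = torch.randn(70000, 8, 8, device="cuda")
y = torch.triu(x)
```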
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21067

Differential Revision: D15572501

Pulled By: ezyang

fbshipit-source-id: 93851cb661918ce794d43eeb12c8a38762e1358c
2019-05-30 20:16:11 -07:00
Michael Suo
057ddab766 on import, register class before defining it (#21182)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21182
ghimport-source-id: 2457a4306c0a72888bb8359a267fcd12b43f103a

Differential Revision: D15571334

Pulled By: suo

fbshipit-source-id: 26ca9dddb25df1b1eac2e17c70f682e20e08cb6d
2019-05-30 20:09:01 -07:00
Syed Tousif Ahmed
d6438c956b Move THCTensor_(cauchy) to ATen (#20622)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20622
ghimport-source-id: b100d6cededf6f2c2020c3d7961271f16497bbdc

Differential Revision: D15454052

Pulled By: ezyang

fbshipit-source-id: 4f4d33ec11cf36b91c67759bd27252d1e457cff1
2019-05-30 18:13:16 -07:00
Syed Tousif Ahmed
26d16ae515 Move THCTensor_{normal, normal_means, normal_stddevs, normal_means_stddevs} to ATen (#20621)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20621
ghimport-source-id: f461d7f1eb6b5a8306dd8175cbb0a7fcc9f64c76

Differential Revision: D15454048

Pulled By: ezyang

fbshipit-source-id: 8bfc57bf015b85f57ed99a54176926386aab4e34
2019-05-30 18:01:31 -07:00