Commit Graph

31000 Commits

Xiao Wang
fe4f90c40b Cusolver inverse check info (#46625)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46557

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46625

Reviewed By: zou3519

Differential Revision: D24438577

Pulled By: ngimel

fbshipit-source-id: d00e6eb2eae4aa39ca6ecf5914fe9cf37c24b906
2020-10-21 21:46:33 -07:00
Yi Wang
adffd8eb6b Add const to the first arg 'grad' of Reducer::copy_grad_to_bucket (#46501)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46501

Gradients in this method will not be modified.
ghstack-source-id: 114851646

Test Plan: waitforbuildbot

Reviewed By: pritamdamania87

Differential Revision: D24374300

fbshipit-source-id: a2941891008f9f197a5234b50260218932d2d37d
2020-10-21 21:34:31 -07:00
Brian Hirsh
db83ddcb86 small doc fix (#46599)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46599

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D24426181

Pulled By: bdhirsh

fbshipit-source-id: d0900d5c43574c80f1bf614824eafd21ba6a9caf
2020-10-21 20:17:31 -07:00
Rahul Nambiar
adbb50ea67 Enabling alias annotation checks for all operations during autograd tests (#46601)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46601

* except excluded tests and magic methods.

https://github.com/pytorch/pytorch/issues/38731

Previously, we'd only run these tests for in-place operations. Since this covers a lot more tests, fixed the following issues that came up when running them -
- Updated the schema of conj() to reflect existing behaviour.
- Updated the deepEquals method in check_alias_annotation.cpp to reuse the overloaded == operator. The previous implementation did not cover all types of IValues.
- Corrected the order in which inputs are passed during autograd testing of 'view' & 'reshape'.
- Subbed out aten::ger with the func it's aliased to, aten::outer, for testing. The alias annotation checking code doesn't handle aliased operators properly.
ghstack-source-id: 114830903

Test Plan: Ran all tests in test:jit and verified they pass.

Reviewed By: eellison

Differential Revision: D24424955

fbshipit-source-id: 382d7e2585911b81b1573f21fff1d54a5e9a2054
2020-10-21 20:01:57 -07:00
Ailing Zhang
33e82c0269 Update error message to include link to readme. (#46613)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46613

Test Plan: CI

Reviewed By: ezyang

Differential Revision: D24430852

fbshipit-source-id: 811e4d10508d47ef830d2b8445f11592f342461f
2020-10-21 19:38:19 -07:00
Jerry Zhang
13decddae2 [reland][quant] Add FixedQParamsFakeQuantize module (#45538) (#46657)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46657

This is used to simulate fake quantize operation for ops with fixed quantization parameters
e.g. hardsigmoid
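The fixed-qparams idea can be sketched in plain Python (a minimal illustration, not the module's actual API; the scale/zero_point values below are assumptions mirroring what fixed-qparams ops such as hardsigmoid typically use):

```python
# Hypothetical sketch: quantize-then-dequantize with fixed parameters,
# simulating the rounding/clamping error a quantized op would introduce.
def fake_quantize(x, scale=1.0 / 256, zero_point=0, qmin=0, qmax=255):
    """Return x after a round trip through the fixed quantized range."""
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))        # clamp to the quantized range
    return (q - zero_point) * scale

assert fake_quantize(0.5) == 0.5       # exactly representable (128/256)
assert fake_quantize(2.0) == 255 / 256  # clamped to the top of the range
```

Because the parameters are fixed rather than observed, training sees the exact quantization error the op will have at inference time.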

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D24451406

fbshipit-source-id: 26cc140c00f12bdec9a8f9dc880f4c425f4d4074
2020-10-21 16:47:11 -07:00
Jerry Zhang
746febdeac [quant][graphmode][fx] Add additional_object_mapping argument to convert (#46338)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46338

Should we merge quantized module and quantized operator configurations?

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D24317435

fbshipit-source-id: 3575251fe9d80a6628b8c3243c2ed92ea5e921e3
2020-10-21 16:39:07 -07:00
Mingzhe Li
8908f6ad8e [op-bench] modify import path of configs (#46679)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46679

The current way of importing configs causes a runtime error when a single benchmark is launched directly with buck (e.g. `/buck-out/gen/caffe2/benchmarks/operator_benchmark/pt/conv_test.par`). This diff fixes that issue.
ghstack-source-id: 114857978

Test Plan: waitforsandcastle

Reviewed By: vkuzo

Differential Revision: D24459631

fbshipit-source-id: 29df17e66962a8604dbb7b8b9106713c3c19bed5
2020-10-21 16:15:11 -07:00
Nikita Shulga
6011b36080 Fix type qualifiers ignored on return type warning (#46668)
Summary:
This fixes following warning:
```
../aten/src/ATen/cpu/vec256/vec256_float_neon.h:262:3: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
  262 |   const float operator[](int idx) const {
      |   ^~~~~
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46668

Reviewed By: seemethere, janeyx99

Differential Revision: D24454206

Pulled By: malfet

fbshipit-source-id: 8ba86a6d6c144f236a76bcef7ce794def7ea131f
2020-10-21 15:49:28 -07:00
Lee Newberg
e02a3e190e DOC: Building libtorch using CMake (#44196)
Summary:
I am adding documentation for building the C++-only libtorch.so without invoking Python in the build and install process.  This works on my Ubuntu 20.04 system and is designed to be operating system agnostic.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44196

Reviewed By: zou3519

Differential Revision: D24421066

Pulled By: malfet

fbshipit-source-id: e77c222703353ff7f7383fb88f7bce705f88b7bf
2020-10-21 14:29:36 -07:00
Ivan Murashko
ff0e20b384 Config inheritance was added for pytorch project (#46584)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46584

The diff enables clang-tidy config inheritance for pytorch project.

Reviewed By: suo

Differential Revision: D24418191

fbshipit-source-id: 5cc0cf2d564236cedc4333af9324387d6d7a55cc
2020-10-21 14:06:35 -07:00
Ansley Ussery
475b4e30e6 Allow for source code comments at any level of indentation (#46548)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46548

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D24434778

Pulled By: ansley

fbshipit-source-id: e24ed73d497381e02ef1155622641027ae34770a
2020-10-21 13:49:42 -07:00
Xiaodong Wang
e3b2bfa2a3 [pytorch] Early return in nn.EmbeddingBag when weight is empty (#46572)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46572

When `num_samples == 0`, the grid size becomes zero. Although CUDA just silently proceeds, `cudaGetLastError()` will complain with `Error: invalid configuration argument`. So the failure actually surfaces at some later call site, which becomes really hard to debug.
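The early-return guard described here can be sketched as follows (hypothetical wrapper names; the real fix lives in the CUDA kernel launch path of nn.EmbeddingBag):

```python
# Sketch: bail out before launching work with a zero-sized grid, so a
# sticky error state cannot poison unrelated later error checks.
def launch_kernel(num_samples, kernel):
    """Run `kernel` only when there is work to do."""
    if num_samples == 0:
        return []          # a zero-sized grid launch would leave an
                           # "invalid configuration" error behind
    return kernel(num_samples)

assert launch_kernel(0, lambda n: list(range(n))) == []
assert launch_kernel(3, lambda n: list(range(n))) == [0, 1, 2]
```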

Reviewed By: jianyuh

Differential Revision: D24409874

fbshipit-source-id: ca54de13b1ab48204bbad265e3f55b56b94a1a2f
2020-10-21 13:44:56 -07:00
Joel Lamy-Poirier
caed29a069 fix-process-group-counter (#46563)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46561

A minimal fix to issue https://github.com/pytorch/pytorch/issues/46561. Increment the global variable `_group_count` at the same time as the others so the global state remains consistent in case of a failure.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46563

Reviewed By: zou3519

Differential Revision: D24422354

Pulled By: mrshenli

fbshipit-source-id: 32493cc2001d21ad366c396d16c303936959434e
2020-10-21 13:03:53 -07:00
Xiang Gao
ce04e527b4 Bump up windows cudnn version (#46436)
Summary:
Fixes #{issue number}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46436

Reviewed By: zou3519

Differential Revision: D24421785

Pulled By: ezyang

fbshipit-source-id: 5aab2ae673e9ae07344a5f3bf0dc374a91dd12b2
2020-10-21 12:30:12 -07:00
Andrei Vukolov
c3c249aa0b Workaround to pay attention for CUDA version (#46535)
Summary:
Added a workaround for the case when NVCC tries to compile the object for the sm_30 GPU compute capability, to avoid the error message stating that the `__ldg` intrinsic is not defined.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46535

Reviewed By: zou3519

Differential Revision: D24422445

Pulled By: ezyang

fbshipit-source-id: 66e8eb1cbe42d848cfff46d78720d72100e628f8
2020-10-21 12:00:47 -07:00
Hugo van Kemenade
09896eda14 Fix version comparisons for Python 3.6, 3.10 and 4 (#32389)
Summary:
There's some code which uses `six.PY3`, similar to:

```python
if six.PY3:
    print("Python 3+ code")
else:
    print "Python 2 code"
```

Where:

```python
PY3 = sys.version_info[0] == 3
```

When run on Python 4, this will run the Python 2 code! Instead, use `six.PY2` and avoid `six.PY3`.

 ---

Similarly, there's some `sys.version_info[0] == 3` checks, better done as `sys.version_info[0] >= 3`.

 ---

Also, it's better to avoid comparing the `sys.version` string, as it makes assumptions that each version component is exactly one character long, which will break in Python 3.10:

```pycon
>>> sys.version
'3.8.1 (v3.8.1:1b293b6006, Dec 18 2019, 14:08:53) \n[Clang 6.0 (clang-600.0.57)]'
>>> sys.version < "3.3"
False
>>> fake_v3_10 = '3.10.1 (v3.8.1:1b293b6006, Dec 18 2019, 14:08:53) \n[Clang 6.0 (clang-600.0.57)]'
>>> fake_v3_10 < "3.3"
True
```

 ---

Finally, I think the intention here is to skip when the Python version is < 3.6:

```python
unittest.skipIf(sys.version_info[0] < 3 and sys.version_info[1] < 6, "dict not ordered")
```

However, it will really skip for Python 0.0-0.5, 1.0-1.5 and 2.0-2.5. It's best to compare to the `sys.version_info` tuple and not `sys.version_info[1]`:

```python
    unittest.skipIf(sys.version_info < (3, 6), "dict not ordered")
```

 ---

Found using https://github.com/asottile/flake8-2020:
```console
$ pip install -U flake8-2020
$ flake8 --select YTT
```
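The pitfalls above boil down to a small sketch (pure illustration of the points made in this commit message):

```python
import sys

# Lexicographic string comparison misorders two-digit version components:
assert "3.10.1" < "3.3"              # wrong answer from string compare
assert (3, 10, 1) > (3, 3)           # tuple comparison gets it right

# Prefer comparing against the sys.version_info tuple directly:
def is_at_least(major, minor):
    """True when the running interpreter is at least major.minor."""
    return sys.version_info >= (major, minor)

assert is_at_least(3, 0)
```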

Pull Request resolved: https://github.com/pytorch/pytorch/pull/32389

Reviewed By: zou3519

Differential Revision: D24424662

Pulled By: ezyang

fbshipit-source-id: 1266c4dbcc8ae4d2e2e9b1d7357cba854562177c
2020-10-21 11:52:50 -07:00
Jithun Nair
65da50c099 Apply hip vs hipcc compilation flags correctly for building extensions (#46273)
Summary:
Fixes issues when building certain PyTorch extensions where the cpp files do NOT compile if flags such as `__HIP_NO_HALF_CONVERSIONS__` are defined.
cc jeffdaily

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46273

Reviewed By: zou3519

Differential Revision: D24422463

Pulled By: ezyang

fbshipit-source-id: 7a43d1f7d59c95589963532ef3bd3c68cb8262be
2020-10-21 11:40:40 -07:00
Ollin Boer Bohan
ac4ee0ef5d Fix typo in docs for interpolate (#46589)
Summary:
Removes a spurious backtick in [the docs for `torch.nn.functional.interpolate`](https://pytorch.org/docs/stable/nn.functional.html?highlight=grid_sample#torch.nn.functional.interpolate)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46589

Reviewed By: zou3519

Differential Revision: D24422550

Pulled By: ezyang

fbshipit-source-id: c1e6b7de4584b2a3f68b458801a33b3fc71c1944
2020-10-21 11:31:53 -07:00
Negin Raoof
96bc7faa50 [ONNX] Export var, var_mean and std_mean ops (#45678)
Summary:
Adding export for var, var_mean and std_mean ops

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45678

Reviewed By: houseroad

Differential Revision: D24398811

Pulled By: bzinodev

fbshipit-source-id: bf51422a9e035d521156c0fa6e77898aac83a380
2020-10-21 11:23:54 -07:00
Ivan Yashchuk
6de619e4a4 Allow converting parameters of nn.Module to complex dtypes (#44788)
Summary:
This PR makes it possible to cast the parameters of nn.Module to complex dtypes.
The following code works with the proposed changes.
```python
In [1]: import torch
In [2]: lin = torch.nn.Linear(5, 1).to(torch.complex64)
In [3]: lin(torch.zeros(3, 5, dtype=torch.complex64))
Out[3]:
tensor([[-0.1739+0.j],
        [-0.1739+0.j],
        [-0.1739+0.j]], grad_fn=<AddmmBackward>)
```
Fixes https://github.com/pytorch/pytorch/issues/43477.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44788

Reviewed By: zou3519

Differential Revision: D24307225

Pulled By: anjali411

fbshipit-source-id: dacc4f5c8c9a99303f74d1f5d807cd657b3b69b5
2020-10-21 08:54:59 -07:00
Howard Huang
611f028168 Add Batch-Updating Parameter Server Example to CI Tests (#46510)
Summary:
Resolves one item in https://github.com/pytorch/pytorch/issues/46321

This PR sets up DistExamplesTest which will be used as the class to implement future tests for examples. This class is run as part of CI tests. It also creates a dist_examples folder and includes the [batch server example](https://github.com/pytorch/examples/blob/master/distributed/rpc/batch/parameter_server.py) which is slightly modified to allow to be tested.

Run test:
pytest test/distributed/rpc/test_tensorpipe_agent.py -k test_batch_updating_parameter_server -vs
pytest test/distributed/rpc/test_process_group_agent.py -k test_batch_updating_parameter_server -vs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46510

Reviewed By: mrshenli

Differential Revision: D24379296

Pulled By: H-Huang

fbshipit-source-id: 1c102041e338b022b7a659a51894422addc0e06f
2020-10-21 08:46:46 -07:00
Brian Hirsh
cf3d7a2660 first cut of adding a dangling impl test. fix #45165 (#46484)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46484

Test Plan: Imported from OSS

Reviewed By: ezyang, izdeby

Differential Revision: D24392625

Pulled By: bdhirsh

fbshipit-source-id: a6ab9c53e3e580e5713e08b20682ee6f8ed3bd84
2020-10-21 08:39:40 -07:00
Gao, Xiang
62e714c9d9 Delete CUDAUnaryOps.cpp (#46280)
Summary:
This file is no longer used

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46280

Reviewed By: ezyang

Differential Revision: D24392749

Pulled By: heitorschueroff

fbshipit-source-id: 677e1ba8664e3c53448a962f8a5d05e806961c2d
2020-10-21 08:31:34 -07:00
Shen Li
cebe87fe3a Revert D24379422: [py][vulkan] Add is_vulkan to py api, add vulkan to device type parsing
Test Plan: revert-hammer

Differential Revision:
D24379422 (e8fbe54cf5)

Original commit changeset: afab89bb9e17

fbshipit-source-id: 743c77e453239f10c155c67490cba5a42ab42f58
2020-10-21 08:23:05 -07:00
Sebastian Messmer
8328630315 avoid inlining kernel lambdas on mobile (#46249)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46249

This saves 15kb of binary size on iOS and increases binary size on Android x86 by 30kb. It also reduces size a bit for Android ARM. I've talked to Martin and we should land this since Android binary size is much less important because of Voltron.

ghstack-source-id: 114177627

Test Plan: bsb

Reviewed By: ezyang

Differential Revision: D23057150

fbshipit-source-id: 43bd62901b81daf08ed96de561d711357689178f
2020-10-21 03:27:21 -07:00
Mehdi Mirzazadeh
8357e2edc3 Back out "Revert D24269034: [fx] Refactor Tracer so that find_module and root args creation could be overridden by implementations" (#46573)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46573

Original commit changeset: 7dd709b585f8
ghstack-source-id: 114730143

Test Plan: Verified on circleci that previously broken test is fixed.

Reviewed By: zdevito

Differential Revision: D24413096

fbshipit-source-id: 439568c631c4556b8ed6af20fcaa4b1375e554cf
2020-10-20 22:17:36 -07:00
Ivan Kobzarev
e8fbe54cf5 [py][vulkan] Add is_vulkan to py api, add vulkan to device type parsing (#46511)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46511

Test Plan: Imported from OSS

Reviewed By: AshkanAliabadi

Differential Revision: D24379422

Pulled By: IvanKobzarev

fbshipit-source-id: afab89bb9e17c50934083598262bbe14ea82e893
2020-10-20 20:04:24 -07:00
lixinyu
a651b876a7 preserve non-dense or overlapping tensor's layout in *_like functions (#46046)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46046

*_like functions are used in PyTorch to create a new tensor with the same shape as the input tensor. But we don't always preserve the layout permutation of the tensor. The current behavior is that, for a dense and non-overlapping tensor, its layout permutation is preserved. For example, passing a channels-last contiguous tensor t with shape/stride (2, 4, 3, 2)/(24, 1, 8, 4) to empty_like(t) will create a new tensor with exactly the same shape/stride as the input tensor t. However, if the input tensor is non-dense or has overlap, we simply create a contiguous tensor based on the input tensor's shape, so the tensor layout permutation is lost.

This PR preserves the layout permutation for non-dense or overlapping tensors. The stride propagation rule used in this PR is exactly the same as the one used in TensorIterator. The behavior changes are listed below:

| code                                                                                                                                                                                           | old                                                   | new                                                  |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------|------------------------------------------------------|
| #strided tensors<br>a=torch.randn(2,3,8)[:,:,::2].permute(2,0,1)<br>print(a.stride())<br>print(a.exp().stride())<br>print((a+a).stride())<br>out = torch.empty(0)<br>torch.add(a,a,out=out)<br>print(out.stride()) | (2, 24, 8) <br>(6, 3, 1) <br>(1, 12, 4) <br>(6, 3, 1) | (2, 24, 8)<br>(1, 12, 4)<br>(1, 12, 4)<br>(1, 12, 4) |
| #memory dense tensors<br>a=torch.randn(3,1,1).as_strided((3,1,1), (1,3,3))<br>print(a.stride(), (a+torch.randn(1)).stride())<br>a=torch.randn(2,3,4).permute(2,0,1)<br>print(a.stride())<br>print(a.exp().stride())<br>print((a+a).stride())<br>out = torch.empty(0)<br>torch.add(a,a,out=out)<br>print(out.stride())                                                                                                                                                                                               |  (1, 3, 3) (1, 1, 1)<br>(1, 12, 4)<br>(6, 3, 1)<br>(1, 12, 4)<br>(6, 3, 1)                                                       |  (1, 3, 3) (1, 3, 3)<br>(1, 12, 4)<br>(1, 12, 4)<br>(1, 12, 4)<br>(1, 12, 4) |
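The stride propagation rule behind the "new" column can be sketched independently of torch (an illustration of the rule, not the TensorIterator code): order dimensions by ascending input stride, then assign dense strides in that order.

```python
def propagate_strides(shape, strides):
    """Dense output strides that preserve the input's dim permutation."""
    # Innermost (smallest-stride) dimension first; ties broken stably.
    order = sorted(range(len(shape)), key=lambda d: (strides[d], shape[d]))
    out = [0] * len(shape)
    acc = 1
    for d in order:
        out[d] = acc
        acc *= shape[d]
    return tuple(out)

# Mirrors the table: a permuted randn(2,3,8)[:,:,::2] tensor with shape
# (4, 2, 3) and strides (2, 24, 8) gets dense strides (1, 12, 4).
assert propagate_strides((4, 2, 3), (2, 24, 8)) == (1, 12, 4)
```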

This is to solve the non-dense tensor layout problem in #45505

TODO:
- [x] Fix all the BC broken test cases in pytorch
- [ ] Investigate if any fb internal tests are broken

This change will cover all kinds of non-dense tensors.

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D24288970

Pulled By: glaringlee

fbshipit-source-id: 320fd4e0d1a810a12abfb1441472298c983a368d
2020-10-20 19:49:49 -07:00
Ashkan Aliabadi
2181449068 Revert D24004795: [quant] Add FixedQParamsFakeQuantize module
Test Plan: revert-hammer

Differential Revision:
D24004795 (253918ec55)

Original commit changeset: fc4797f80842

fbshipit-source-id: 663169e90a2f58e5a89e4d382291ae41c24d0fee
2020-10-20 19:40:21 -07:00
Daya Khudia
f47231bf0e [caffe2][dnnlowp] Remove openmp usage in quantize dnnlowp op
Summary: It creates CPU overload issues when OpenMP gets enabled and OMP_NUM_THREADS=1 is not set.

Test Plan: buck test //caffe2/caffe2/quantization/server:quantize_dnnlowp_op_test

Reviewed By: jspark1105

Differential Revision: D24437305

fbshipit-source-id: 426209fc33ce0d4680c478f584716837ee62cb5e
2020-10-20 19:33:56 -07:00
Ashkan Aliabadi
6cd8b5e9a7 Provide CMake option to enable Vulkan API. (#46503)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46503

Test Plan: Imported from OSS

Reviewed By: IvanKobzarev

Differential Revision: D24379144

Pulled By: AshkanAliabadi

fbshipit-source-id: 8d8c57f96bbac2a44615828a3474c912704f3a85
2020-10-20 18:45:52 -07:00
Ashkan Aliabadi
3e041b503f Add Vulkan job dispatch and flush. (#46008)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46008

Test Plan: Imported from OSS

Reviewed By: IvanKobzarev

Differential Revision: D24291507

Pulled By: AshkanAliabadi

fbshipit-source-id: a3d02e76708a38e49398bb71e31bb2ad676d01af
2020-10-20 18:41:29 -07:00
Pritam Damania
cb3c1d17e4 Promote -Wcast-function-type to an error in builds. (#46356)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46356

Adding the flag `-Werror=cast-function-type` to ensure we don't allow
any invalid casts (ex: PyCFunction casts).

For more details see: https://github.com/pytorch/pytorch/issues/45419
ghstack-source-id: 114632980

Test Plan: waitforbuildbot

Reviewed By: albanD

Differential Revision: D24319759

fbshipit-source-id: 26ce4650c220e8e9dd3550245f214c7e6c21a5dc
2020-10-20 18:09:06 -07:00
Yanan Cao
42a70dc5a8 Implement all communication APIs in DistributedC10d new frontend (#46053)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46053

Reviewed By: wanchaol

Differential Revision: D24300487

Pulled By: gmagogsfm

fbshipit-source-id: 0d0b01c4f9d9e1d59dd17d7606ce47d54d61951d
2020-10-20 17:52:07 -07:00
Jerry Zhang
253918ec55 [quant] Add FixedQParamsFakeQuantize module (#45538)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45538

This is used to simulate fake quantize operation for ops with fixed quantization parameters
e.g. hardsigmoid

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D24004795

fbshipit-source-id: fc4797f80842daacd3b3584c5b72035774634edd
2020-10-20 17:43:25 -07:00
Lillian Johnson
f83cf2dab3 [JIT] adding torch.jit.isinstance support (#46062)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46062

Adds support for torch.jit.isinstance in both eager and script mode

Example use:

```
import torch
from typing import Any, List

class TestModule(torch.nn.Module):
    def __init__(self):
        super(TestModule, self).__init__()

    def call(self, input1: str, input2: str) -> str:
        return input1

    def forward(self, input: Any) -> None:
        if torch.jit.isinstance(input, List[str]):
            for el in input:
                print(el)

TestModule().forward(["1","2"])
scripted_module = torch.jit.script(TestModule())
scripted_module(["1", "2"])
```

Test Plan: Imported from OSS

Reviewed By: bertmaher, zou3519

Differential Revision: D24264415

Pulled By: Lilyjjo

fbshipit-source-id: 039c95bddd854c414027ac8332832e6bc830b5b9
2020-10-20 16:47:49 -07:00
Ansley Ussery
fdc5261a20 Support %-based string formatting (#45976)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45976

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D24374215

Pulled By: ansley

fbshipit-source-id: 2005fe7f09dc8d3c44c4bfdccab6b4dc46a5e517
2020-10-20 16:13:36 -07:00
Jerry Zhang
f9446cb15a [quant][refactor] Remove register api and rename get_*_mapping to get_default_*_mapping (#46337)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46337

We plan to pass the mappings around instead of using a global registration api, to keep
the mappings local to the transformations the user is performing

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D24317436

fbshipit-source-id: 81569b88f05eeeaa9595447e482a12827aeb961f
2020-10-20 15:53:47 -07:00
Ashkan Aliabadi
4f5b55f722 Revert D24395956: [pytorch][PR] Replace flatten tensors with flatten loops.
Test Plan: revert-hammer

Differential Revision:
D24395956 (2f51ddb81f)

Original commit changeset: f3792903f206

fbshipit-source-id: ef70713f0f67f577b09674219631d22440ceec31
2020-10-20 15:42:23 -07:00
Pritam Damania
2b221a9599 Remove PyCFunction casts as much as possible. (#46227)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46227

Follow up from https://github.com/pytorch/pytorch/issues/45419, in
this PR I've removed as many PyCFunction casts as I could from the codebase.

The only ones I didn't remove were the ones with `METH_VARARGS | METH_KEYWORDS`
which have 3 parameters instead of 2 and had to be casted. Example: `
{"copy_", (PyCFunction)(void(*)(void))THPStorage_(copy_), METH_VARARGS |
METH_KEYWORDS, nullptr},`
ghstack-source-id: 114632704

Test Plan: waitforbuildbot

Reviewed By: albanD

Differential Revision: D24269435

fbshipit-source-id: 025cfd43a9a2a3e59f6b2951c1a78749193d77cf
2020-10-20 15:01:51 -07:00
Hao Lu
1a3ea46dbf [StaticRuntime] Threading model (#46219)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46219

- Refactor StaticRuntime and group common data structures, the jit graph, and the script module into a separate struct `InferenceModule`:
```
struct InferenceModule {
  explicit InferenceModule(const torch::jit::Module& m);
  explicit InferenceModule(std::shared_ptr<torch::jit::Graph> g);
  torch::jit::Module module;
  std::shared_ptr<torch::jit::Graph> graph;
  std::unique_ptr<c10::FunctionSchema> schema;

  std::unordered_map<Value*, size_t> value_to_reg;
  std::vector<size_t> input_regs; // inputs to the graph
  std::vector<size_t> output_regs; // outputs of the graph
  std::vector<size_t> internals;
};
```
which is stored in the PyTorchPredictor, as well as the static runtime, and shared across threads. Then this is what's left inside the Static Runtime:
```
  mutable std::vector<IValue> reg_;
  // The nodes we need to run
  std::vector<ProcessedNode> nodes_;
```
`reg_` holds all the weights and activations, which differ across threads at runtime. `nodes_` holds the op nodes and input/output registers, and is the same across threads for now. We could potentially put other stateful data structures in it, so I kept it inside the static runtime. It could be easily moved into the `InferenceModule` if we decide not to put anything else into `ProcessedNode`.

- Added StaticRuntimeOptions so we can toggle certain optimizations on/off, for testing and benchmarking. `cleanup_activations` is an example.

- Integration with PyTorchPredictor. Added a lockfree stack in the PyTorchPredictor to hold all the static runtime instances. Benchmark shows that the `push` and `pop` combo takes about 80 ns, which is quite acceptable.

This diff focuses on threading model only. Benchmarks will be separate.
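The predictor-side push/pop pooling can be sketched in Python (an illustration of the pattern only; the actual implementation is a C++ lock-free stack, and all names here are hypothetical):

```python
import threading

class RuntimePool:
    """Pool of runtime instances shared across request threads."""
    def __init__(self, make_runtime):
        self._make = make_runtime
        self._stack = []
        self._lock = threading.Lock()   # stand-in for the lock-free stack

    def run(self, *args):
        # Pop an idle instance (or create one), run, then push it back.
        with self._lock:
            rt = self._stack.pop() if self._stack else self._make()
        try:
            return rt(*args)
        finally:
            with self._lock:
                self._stack.append(rt)

pool = RuntimePool(lambda: (lambda x: x * 2))
assert pool.run(3) == 6
```

Each thread gets exclusive use of a runtime (and its private `reg_`) for the duration of a call, while the shared `InferenceModule` stays read-only.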

Reviewed By: bwasti

Differential Revision: D24237078

fbshipit-source-id: fd0d6347f02b4526ac17dec1f731db48424bade1
2020-10-20 14:37:30 -07:00
Xiang Gao
e18a8aba95 Add CUDA 11.1 docker build (#46283)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46283

Reviewed By: ezyang

Differential Revision: D24346026

Pulled By: malfet

fbshipit-source-id: f69558f35527833b867a7352c78b4e8ebc370db3
2020-10-20 13:35:31 -07:00
Jane Xu
187e23397c Remove non-existent trusty image references (#46594)
Summary:
Simplifies some parts of build.sh and removes old references in the code to non-existent trusty images.

There are other parts of the code where trusty is referenced for travis (most of them in third party directories) and I did not touch those. https://github.com/pytorch/pytorch/search?q=trusty

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46594

Reviewed By: seemethere

Differential Revision: D24426796

Pulled By: janeyx99

fbshipit-source-id: 428c52893d2d35c1ddd1fd2e65a4b6575f260492
2020-10-20 12:54:45 -07:00
Raghavan Raman
2f51ddb81f Replace flatten tensors with flatten loops. (#46539)
Summary:
This diff changes `TensorExprKernel::generateStmt` to use flatten loops instead of flatten tensors.

Checked all tests on CPU as well as CUDA.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46539

Reviewed By: nickgg

Differential Revision: D24395956

Pulled By: navahgar

fbshipit-source-id: f3792903f2069bda37b571c9f0a840e6fb02f189
2020-10-20 12:16:18 -07:00
Facebook Community Bot
9c02e2112e Automated submodule update: FBGEMM (#46578)
Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: 23cb1db72b

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46578

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: YazhiGao

Differential Revision: D24415308

fbshipit-source-id: c353dcf86cfd833a571a509930a17d09277a73e4
2020-10-20 11:43:01 -07:00
Kurt Mohler
e6ed887908 Add view test for tensor_split (#46427)
Summary:
Fulfills Mike's suggestion here: https://github.com/pytorch/pytorch/pull/44868#discussion_r505095018

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46427

Reviewed By: ezyang

Differential Revision: D24355107

Pulled By: mruberry

fbshipit-source-id: bddef2f9c2c41b5c5ac47a17d5ecdda580072e99
2020-10-20 09:56:37 -07:00
Shen Li
5003fd189c Add an option to getWriteableTensorData to avoid copy CUDA tensor to CPU (#46524)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46524

Test Plan: Imported from OSS

Reviewed By: wanchaol

Differential Revision: D24392794

Pulled By: mrshenli

fbshipit-source-id: 21bf81dfc6c1d81689f8278d81f4c8776bc76ec1
2020-10-20 08:54:58 -07:00
acxz
5e0bfd7455 [Build] [CMake] [ROCm] find hsa-runtime64 properly (#45550)
Summary:
Properly Fixes https://github.com/pytorch/pytorch/issues/44384
similar in vein to https://github.com/pytorch/pytorch/issues/42064

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45550

Reviewed By: ezyang

Differential Revision: D24412674

Pulled By: malfet

fbshipit-source-id: f3d056c7069cb9d8a7d4174b604b9e3fbb14180b
2020-10-20 08:38:32 -07:00
Jane Xu
35a35c3498 Move Open MPI installation to Ubuntu CUDA Docker images (#46569)
Summary:
Instead of installing Open MPI for build and test jobs with environment *-xenial-cuda*, install Open MPI into the relevant Docker images. This would save time and remove duplication in our scripts.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46569

Reviewed By: walterddr

Differential Revision: D24409534

Pulled By: janeyx99

fbshipit-source-id: 6152f2f5daf63744d907dd234bc12d2a5ec58f3d
2020-10-20 08:31:35 -07:00