Commit Graph

27862 Commits

Author SHA1 Message Date
Eli Uriegas
b31f58de6f
Revert "[release/1.6] .circleci: Don't use SCCACHE for windows release builds (#42024)"
This reverts commit 994b37b36e.
2020-07-24 15:18:09 -07:00
eellison
29fe90e2a2
[release/1.6] [JIT] Dont include view ops in autodiff graphs (#42029)
* Dont include view ops in autodiff graphs

* skip view ops in autodiff testing

* two more tests

* appease calng format

* Pacify clang-format

Co-authored-by: eellison <eellison@fb.com>
Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
2020-07-24 13:41:32 -07:00
Nikita Shulga
35ad2d8586 Revert "[jit] fix tuple alias analysis (#41992)"
This reverts commit 8aa878fc93.
2020-07-24 13:32:00 -07:00
Eli Uriegas
994b37b36e
[release/1.6] .circleci: Don't use SCCACHE for windows release builds (#42024)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-07-24 11:15:26 -07:00
Michael Suo
8aa878fc93
[jit] fix tuple alias analysis (#41992)
Previously when analyzing a TupleConstruct, we ignored the aliasing
information of the inputs and simply marked all elements of the returned
tuple as wildcards. But since we can fully reason about the contents of
a tuple statically, we should be able to assign them aliasing
information.

This analysis was not only incomplete but produced incorrect results,
since if `a` is not a wildcard, `a noalias wilcard`. So if we looked at
`tuple(a)` and reported the aliasing info as `tuple(wildcard)`, then
`tuple[0] noalias a`, which is...wrong.
2020-07-24 08:05:20 -07:00
gchanan
7c7c9c3aa6
scatter/gather - check that inputs are of the same dimensionality (#41890)
Co-authored-by: Nikita Vedeneev <nik@quansight.com>
2020-07-22 18:33:07 -07:00
Richard Zou
a2922f589d
[1.6.0] Mark torch.set_deterministic and torch.is_deterministic as experimental (#41870)
This PR:
- renames `torch.set_deterministic` to `torch._set_deterministic`
- renames `torch.is_deterministic` to `torch._is_deterministic`
- Modifies the docstrings for both to indicate that the feature is not
yet complete.

We would like to do this because this feature is experimental and the
docstrings before this PR are misleading.

This PR does not have an accompanying change in master. That is because
there still is discussion over what the eventual state of the feature
should be: https://github.com/pytorch/pytorch/issues/15359. I expect
that there will be a better plan for this once 1.7 rolls around.

Test Plan:
- wait for CI
2020-07-22 18:32:47 -07:00
Jessica Lin
8acfecaecb
[1.6] Add optimizer_for_mobile doc into python api root doc (#41491)
* Add optimizer_for_mobile doc into python api root doc

* Apply suggestions from code review

Remove all references to `optimization_blacklist` as it's missing in 1.6

Co-authored-by: Nikita Shulga <nshulga@fb.com>
2020-07-22 17:37:45 -07:00
anjali411
860e18a61b Update torch.set_default_dtype doc (#41263)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41263

Test Plan: Imported from OSS

Differential Revision: D22482989

Pulled By: anjali411

fbshipit-source-id: 2aadfbb84bbab66f3111970734a37ba74d817ffd
2020-07-22 14:50:15 -07:00
anjali411
8f804baaa9 Doc note for complex (#41252)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41252

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D22553266

Pulled By: anjali411

fbshipit-source-id: f6dc409da048496d72b29b0976dfd3dd6645bc4d
2020-07-22 14:49:51 -07:00
anjali411
a395e0903e Autograd Doc for Complex Numbers (#41012)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41012

Test Plan: Imported from OSS

Differential Revision: D22476911

Pulled By: anjali411

fbshipit-source-id: 7da20cb4312a0465272bebe053520d9911475828
2020-07-22 14:40:52 -07:00
Edward Z. Yang
2ca55430d2
Add reference documentation for torch/library.h (#41470) (#41602)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41470

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D22577426

Pulled By: ezyang

fbshipit-source-id: 4bfe5806061e74181a74d161c868acb7c1ecd1e4
2020-07-22 11:10:16 -07:00
Nikita Shulga
b8e77a42bd
Add CUDA11 build and test (#40452) (#41543)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40452

Differential Revision: D22316007

Pulled By: malfet

fbshipit-source-id: 94f4b4ba2a46ff3d3042ba842a615f8392cdc350

Co-authored-by: Gao, Xiang <qasdfgtyuiop@gmail.com>
2020-07-22 09:53:22 -07:00
gchanan
4081fdd3df
Revert "port masked_select from TH to ATen and optimize perf on CPU (#33269)" (#41829)
This reverts commit fe66bdb498.

This also makes a sense to THTensorEvenMoreMath because sumall was removed, see THTensor_wrap.
2020-07-22 09:52:30 -07:00
Ashkan Aliabadi
cefb9e0cd6
Update pthreadpool to pthreadpool:029c88620802e1361ccf41d1970bd5b07fd6b7bb. (#40524) (#41190)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40524

Reviewed By: ezyang

Differential Revision: D22215742

Pulled By: AshkanAliabadi

fbshipit-source-id: ef594e0901337a92b21ddd44e554da66c723eb7c
2020-07-10 09:11:32 -07:00
Luca Wehrstedt
d9e9e0087a
[v1.6] [RPC docs] Remove mention of TensorPipe's SHM and CMA backends as they're not built (#41229)
Summary:
In short, we messed up. The SHM and CMA backends of TensorPipe are Linux-specific and thus they are guarded by a #ifdef in the agent's code. Due to a mishap with CMake (due the fact that TensorPipe has two CMake files, one for PyTorch and a "standalone" one) we were not correctly propagating some flags and these #ifdefs were always false. This means that these two backends have always been disabled and have thus never been covered by our OSS CI. It would be irresponsible to enable them now in v1.6, so instead we remove any mention of them from the docs.

Note that this is perhaps not as bad as it sounds. These two backends were providing higher performance (latency) when the two endpoints were on the same machine. However, I suspect that most RPC users will only do transfers across machines, for which SHM and CMA wouldn't have played any role.

Original PR against master: #41200 (merged as dde3d5f4a8)

Test Plan: Docs only
2020-07-10 09:02:08 -07:00
Nikita Shulga
43d746305c
Preserve CUDA gencode flags (#41212)
Summary:
Add `torch._C._cuda_getArchFlags()` that returns list of architecture `torch_cuda` were compiled with
Add `torch.cuda.get_arch_list()` and `torch.cuda.get_gencode_flags()` methods that returns architecture list and gencode flags PyTorch were compiled with
Print warning if some of GPUs is not compatible with any of the CUBINs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41173

Differential Revision: D22459998

Pulled By: malfet

fbshipit-source-id: 65d40ae29e54a0ba0f3f2da11b821fdb4d452d95
2020-07-09 17:34:50 -07:00
Negin Raoof
9409e03903
[ONNX][1.6] Update interpolate recompute_scale_factor default (#41117)
* Update interpolate recompute_scale_factor default

* Update upsampling.h

* Update functional.py
2020-07-09 17:24:53 -07:00
Tongzhou Wang
c9a1853d2f
[1.6] Make IterableDataset DataLoader.__len__ warning clearer (#41185)
* make IterableDataset DataLoader.__len__ warning clearer

* typo
2020-07-09 14:07:58 -07:00
Nikita Shulga
7fa9b2923b
quantizer.cpp: fix cuda memory pinning (#41139) (#41194)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41139

Fixes the test case in https://github.com/pytorch/pytorch/issues/41115
by using PyTorch's CUDA allocator instead of the old Caffe2 one.

Test Plan:
run the test case from the issue:
https://gist.github.com/vkuzo/6d013aa1645cb986d0d4464a931c779b

let's run CI and see what it uncovers

Imported from OSS

Reviewed By: malfet

Differential Revision: D22438787

fbshipit-source-id: 0853b0115d198a99c43e6176aef34ea951bf5c2e

Co-authored-by: Vasiliy Kuznetsov <vasiliy@fb.com>
2020-07-09 14:06:11 -07:00
Nikita Shulga
40bf15a8ac
Remove copy_ warnings for angle and abs for complex tensors (#41152) (#41191)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41152

fixes https://github.com/pytorch/pytorch/issues/40838

Test Plan: Imported from OSS

Differential Revision: D22444357

Pulled By: anjali411

fbshipit-source-id: 2879d0cffc0a011c624eb8e00c7b64bd33522cc3

Co-authored-by: anjali411 <chourdiaanjali123@gmail.com>
2020-07-09 13:41:15 -07:00
Ailing
c164fc4d7f
Patch #40883 to 1.6 release. (#41033) 2020-07-09 10:25:39 -07:00
Nikita Shulga
e0b7480f34 Revert "make IterableDataset DataLoader.__len__ warning clearer (#41183)"
This reverts commit 89d7f194d8.
2020-07-09 08:05:24 -07:00
Tongzhou Wang
89d7f194d8
make IterableDataset DataLoader.__len__ warning clearer (#41183) 2020-07-09 08:00:00 -07:00
mrshenli
59bb44a8e8
Add a link in RPC doc page to point to PT Distributed overview (#41108) (#41156)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41108

Test Plan: Imported from OSS

Differential Revision: D22440751

Pulled By: mrshenli

fbshipit-source-id: 9e7b002091a3161ae385fdfcc26484ae8fc243bb
2020-07-09 07:49:10 -07:00
Mike Ruberry
8f4d01d9f1
Disables unary op casting to output dtype (#41097) (#41160)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/41047.

Some CPU kernel implementations don't call `cast_outputs()`, so when CPU temporaries were created to hold their outputs they weren't copied back to the out parameters correctly. Instead of fixing that issue, for simplicity this PR disables the behavior. The corresponding test in test_type_promotion.py is expanded with more operations to verify that unary ops can no longer have out arguments with different dtypes than their inputs (except in special cases like torch.abs which maps complex inputs to float outputs and torch.deg2rad which is secretly torch.mul).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41097

Differential Revision: D22422352

Pulled By: mruberry

fbshipit-source-id: 8e61d34ef1c9608790b35cf035302fd226fd9421

Co-authored-by: Mike Ruberry <mruberry@devfair044.maas>
2020-07-08 22:06:48 -07:00
Rohan Varma
77ffb25925
Add guard for non-default stream in DDP's autograd engine callback (#40115) (#41151)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40115

Closes https://github.com/pytorch/pytorch/issues/37790
Closes https://github.com/pytorch/pytorch/issues/37944

A user may wish to run DDP's forward + backwards step under a non-default CUDA stream such as those created by `with torch.cuda.Stream(stream)`. In this case, the user should be responsible for synchronizing events on this stream with other streams used in the program (per the documentation at https://pytorch.org/docs/stable/notes/cuda.html#cuda-semantics), but currently DDP has a bug which causes DDP under non-default streams to fail.

If a user does the following:
```
model = DDP(...)
loss = model(inptut).sum()
loss.backward()
grad = model.module.weight.grad()
average = dist.all_reduce(grad)
```

There is a chance that `average` and `grad` will not be equal. This is because the CUDA kernels corresponding to the  `all_reduce` call may run before `loss.backward()`'s kernels are finished. Specifically, in DDP we copy the allreduced gradients back to the model parameter gradients in an autograd engine callback, but this callback runs on the default stream. Note that this can also be fixed by the application synchronizing on the current stream, although this should not be expected, since the application is not using the current stream at all.

This PR fixes the issue by passing the current stream into DDP's callback.

Tested by adding a UT `test_DistributedDataParallel_non_default_stream` that fails without this PR
ghstack-source-id: 106481208

Differential Revision: D22073353

fbshipit-source-id: 70da9b44e5f546ff8b6d8c42022ecc846dff033e
2020-07-08 21:08:17 -07:00
Nikita Shulga
af9600b1f5
[Caffe2] Move in-header virtual function implementation to .cc files (#41090)
* Move OperatorSchema default inference function implementations to .cc… (#40845)

Summary:
… file

This prevents implementation of those functions(as lambdas) to be embedded as weak symbol into every shared library that includes this header.

Combination of this and https://github.com/pytorch/pytorch/pull/40844 reduces size of `libcaffe2_module_test_dynamic.so` from 500kb to 50Kb.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40845

Differential Revision: D22334779

Pulled By: malfet

fbshipit-source-id: 64706918fc2947350a58c0877f294b1b8b085455

* Move `OperatorBase::AddRelatedBlobInfo` implementation to .cc file (#40844)

Summary:
If virtual function is implemented in header file, it's implementation will be included as a weak symbol to every shared library that includes this header along with all of it's dependencies.

This was one of the reasons why size of libcaffe2_module_test_dynamic.so  was 500Kb (AddRelatedBlobInfo implementation pulled a quarter of libprotobuf.a with it)

Combination of this and https://github.com/pytorch/pytorch/issues/40845 reduces size of `libcaffe2_module_test_dynamic.so` from 500kb to 50Kb.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40844

Differential Revision: D22334725

Pulled By: malfet

fbshipit-source-id: 836a4cbb9f344355ddd2512667e77472546616c0
2020-07-07 21:17:11 -07:00
Nikita Shulga
83262b1ba1
torch._six.PY37 should be true for Python-3.8 as well (#40868) (#41091)
Summary:
Right now it is used to check whether `math.remainder` exists, which is the case for both Python-3.7 and 3.8
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40868

Differential Revision: D22343454

Pulled By: malfet

fbshipit-source-id: 6b6d4869705b64c4b952309120f92c04ac7e39fd
2020-07-07 17:15:01 -07:00
Nikita Shulga
f862a6ba4d
Remove unused Logger in get_matching_activations (#41023) (#41087)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41023

Remove Logger in get_matching_activations since it's not used.
ghstack-source-id: 107237046

Test Plan:
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_linear_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_submodule_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_functional_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_linear_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_functional_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_linear_dynamic'

Differential Revision: D22394957

fbshipit-source-id: 7d59e0f35e9f4c304b8487460d48236ee6e5a872

Co-authored-by: Haixin Liu <haixin@fb.com>
2020-07-07 16:09:37 -07:00
Nikita Shulga
f3c1ea7455
[PyTorch Numeric Suite] Remove unnecessary Logger in input arguments (#40890) (#41086)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40890

Remove unnecessary Logger in input arguments and simplify the API.
ghstack-source-id: 107110487

Test Plan:
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_lstm_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_weights_linear_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_submodule_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_functional_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_stub_linear_dynamic'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_conv_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_linear_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_functional_static'
buck test mode/dev caffe2/test:quantization -- 'test_compare_model_outputs_linear_dynamic'

Differential Revision: D22345477

fbshipit-source-id: d8b4eb3d6cb3049aa3296dead8ba29bf5467bd1c

Co-authored-by: Haixin Liu <haixin@fb.com>
2020-07-07 16:09:11 -07:00
mrshenli
2ed3ad2891
fix autodoc for torch.distributed.launch (#40963) (#41089)
Summary:
The doc for `torch.distributed.launch` is missing since v1.2.0 (see issue https://github.com/pytorch/pytorch/issues/36386) because PR https://github.com/pytorch/pytorch/issues/22501 added some imports at the first line.
542ac74987/torch/distributed/launch.py (L1-L5)
I move it below the docstring to make the autodoc in Sphinx work normally.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40963

Differential Revision: D22380816

Pulled By: mrshenli

fbshipit-source-id: ee8406785b9a198bbf3fc65e589854379179496f

Co-authored-by: Xin Yao <yaox12@outlook.com>
2020-07-07 14:23:31 -07:00
Jerry Zhang
a857af50a4
[quant][graphmode][fix] cloning schema in insert_observers (#40624) (#40934)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40624

Previously we didn't clone schema, so the default schema is used, this is
causing issue for some models

Test Plan: Imported from OSS

Differential Revision: D22259519

fbshipit-source-id: e2a393a54cb18f55da0c7152a74ddc22079ac350
2020-07-07 13:27:36 -07:00
Jerry Zhang
d0045e5520
Some fixes for graph mode quantization (#40935)
* [quant] aten::repeat work for quantized tensor (#40644)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40644

Test Plan: Imported from OSS

Differential Revision: D22268558

fbshipit-source-id: 3bc9a129bece1b547c519772ecc6b980780fb904

* [quant][graphmode][fix] remove unsupported ops in the list (#40653)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40653

(Note: this ignores all push blocking failures!)

Test Plan: Imported from OSS

Differential Revision: D22271413

fbshipit-source-id: a01611b5d90849ac673fa5a310f910c858e907a3
2020-07-07 13:26:27 -07:00
Jerry Zhang
0406b69b79
[quant][graphmode][fix] Fold conv bn (#40865) (#40970)
* [quant][graphmode][fix] Fold conv bn (#40865)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40865

1. applied filter for the module types
2. removed the assumption that the conv bn are immediate child of parent module

Test Plan:
python test/test_quantization.py TestQuantizeJitPasses

Imported from OSS

Differential Revision: D22338074

fbshipit-source-id: 64739a5e56c0a74249a1dbc2c8454b88ec32aa9e

* [quant][graphmode][fix] Print the node in error message (#40889)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40889

Test Plan: Imported from OSS

Differential Revision: D22348266

fbshipit-source-id: eed2ece5c94fcfaf187d6770bed4a7109f0c0b4a
2020-07-07 13:25:39 -07:00
Jerry Zhang
6220cc4380
[quant][graphmode][fix] dequantize propagation for {add/mul}_scalar + aten::repeat (#40933)
* [quant][graphmode][fix] dequantize propagation for {add/mul}_scalar (#40596)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40596

Previously the fusion patterns for {add/mul}_scalar is inconsistent since the op pattern
produces a non-quantized tensor and the op replacement graph produces a quantized tensor

Test Plan: Imported from OSS

Differential Revision: D22251072

fbshipit-source-id: e16eb92cf6611578cca1ed8ebde961f8d0610137

* [quant][graphmode] Support quantization for `aten::apend` (#40743)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40743

`aten::append` modifies input inplace and the output is ignored, these ops are not
supported right now, so we'll need to first make `aten::append` non-inplace
by change
```
ignored = aten::append(list, x)
```
to
```
x_list = aten::ListConstruct(x)
result = aten::add(list, x_list)
```
and then quantize the aten::add instead.

Test Plan:
TestQuantizeJitOps.test_general_shape_ops

Imported from OSS

Differential Revision: D22302151

fbshipit-source-id: 931000388e7501e9dd17bec2fad8a96b71a5efc5
2020-07-07 13:25:02 -07:00
mcarilli
eaf3f2fd34
Added index_put to promotelist (#41036)
* Added index_put to promotelist

* docstring

Co-authored-by: Michael Carilli <mcarilli@nvidia.com>
2020-07-07 13:00:32 -07:00
eellison
c35b4c770b
Bucket of shape analysis fixes (#41044)
* [JIT] fix unfold shape analysis (#40749)

Summary:
unfold on a 0-dimensioned tensor returns a 1-dim tensor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40749

Differential Revision: D22361481

Pulled By: eellison

fbshipit-source-id: 621597e5f97f6e39953eb86f8b85bb4142527a9f

* shape analysis fix for default dtype'

ghstack-source-id: 723aa27c2685417715a0891f5ca1ae885d4c9832
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40938

* fix grad thrashing of shape analysis

ghstack-source-id: dd8742b1da52d17e9d6ab6c81ff0b27520b09417
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40939

Co-authored-by: Elias Ellison <eellison@fb.com>
2020-07-07 12:59:47 -07:00
Eli Uriegas
11baccf1b5
[release/1.6] .circleci: Output binary sizes, store binaries (#41075)
We need an easy to way to quickly visually grep binary sizes from builds
and then have a way to test out those binaries quickly.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
(cherry picked from commit 66813515d4dec66f319442ba967c64b87c0286cd)
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
2020-07-07 11:27:00 -07:00
raghuramank100
f0f0cbdd4a
Docstring changes for dynamic quantized classes (#40931) (#41032)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40931

Fix docstrings for dynamic quantized Linear/LSTM and associated classes
ghstack-source-id: 107064446

Test Plan: Docs show up in correctly

Differential Revision: D22360787

fbshipit-source-id: 8e357e081dc59ee42fd7f12ea5079ce5d0cc9df2
2020-07-06 21:37:53 -07:00
Mikhail Zolotukhin
11b70b0041
[JIT] Switch executor from Simple to Legacy. (#41017)
* properly skip legacy tests regardless of the default executor (#40381)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40381

Differential Revision: D22173938

Pulled By: Krovatkin

fbshipit-source-id: 305fc4484977e828cc4cee6e053a1e1ab9f0d6c7

* [JIT] Switch executor from Simple to Legacy.

This is done for 1.6 only in order to recover performance regressions
caused by the Legacy->Simple switch that was done in 1.5. On master we
still plan to use Simple executor and fix the performance issues in 1.7
without falling back to the Legacy executor.

Co-authored-by: Nikolay Korovaiko <korovaikon@gmail.com>
2020-07-06 21:35:02 -07:00
James Reed
01e9562313
[1.6 cherrypick] Fix delegating to jit.load from torch.load (#41013) 2020-07-06 16:55:00 -07:00
Nick Korovaiko
3f13c9a2c8
infer tensor properties based on an input tensor rather than defaults for xxx_like ctors (#40895) (#41016)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40895

Reviewed By: eellison

Differential Revision: D22358878

Pulled By: Krovatkin

fbshipit-source-id: 2db2429aa89c180d8e52a6bb1265308483da46a2
2020-07-06 16:52:59 -07:00
Nick Korovaiko
63a94c021a
shape inference of undefined for prim::grad (#40866) (#41015)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40866

Reviewed By: pbelevich

Differential Revision: D22358988

Pulled By: Krovatkin

fbshipit-source-id: 7118d7f8d4eaf056cfb71dc0d588d38b1dfb0fc7
2020-07-06 16:51:37 -07:00
Nick Korovaiko
2b175ba909
update requires_gard on loop inputs correctly (master) (#40926) (#41014)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40926

Reviewed By: eellison

Differential Revision: D22359471

Pulled By: Krovatkin

fbshipit-source-id: 823e87674e2d2917f075255ec926e0485972f4e2
2020-07-06 16:30:14 -07:00
Ashkan Aliabadi
8c3f662224
Update FP16 to FP16:4dfe081cf6bcd15db339cf2680b9281b8451eeb3. (#40956) 2020-07-06 06:59:41 -07:00
Ashkan Aliabadi
0ffdd5aa1d
Update cpuinfo to cpuinfo:63b254577ed77a8004a9be6ac707f3dccc4e1fd9. (#40955) 2020-07-06 06:59:30 -07:00
Ashkan Aliabadi
d53427c541
Update FXdiv to FXdiv:b408327ac2a15ec3e43352421954f5b1967701d1. (#40954) 2020-07-06 06:59:17 -07:00
Ashkan Aliabadi
b44b1d868e
Update psimd to psimd:072586a71b55b7f8c584153d223e95687148a900 (#40953) 2020-07-06 06:59:01 -07:00
Ashkan Aliabadi
9184c9832e
Re-apply PyTorch pthreadpool changes (#40951)
* Re-apply PyTorch pthreadpool changes

Summary:
This re-applies D21232894 (b9d3869df3) and D22162524, plus updates jni_deps in a few places
to avoid breaking host JNI tests.

Test Plan: `buck test @//fbandroid/mode/server //fbandroid/instrumentation_tests/com/facebook/caffe2:host-test`

Reviewed By: xcheng16

Differential Revision: D22199952

fbshipit-source-id: df13eef39c01738637ae8cf7f581d6ccc88d37d5

* Enable XNNPACK ops on iOS and macOS.

Test Plan: buck run aibench:run_bench -- -b aibench/specifications/models/pytorch/pytext/pytext_mobile_inference.json --platform ios --framework pytorch --remote --devices D221 (9788a74da8)AP-12.0.1

Reviewed By: xta0

Differential Revision: D21886736

fbshipit-source-id: ac482619dc1b41a110a3c4c79cc0339e5555edeb

* Respect user set thread count. (#40707)

Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40707

Test Plan: Imported from OSS

Differential Revision: D22318197

Pulled By: AshkanAliabadi

fbshipit-source-id: f11b7302a6e91d11d750df100d2a3d8d96b5d1db

* Fix and reenable threaded QNNPACK linear (#40587)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40587

Previously, this was causing divide-by-zero only in the multithreaded
empty-batch case, while calculating tiling parameters for the threads.
In my opinion, the bug here is using a value that is allowed to be zero
(batch size) for an argument that should not be zero (tile size), so I
fixed the bug by bailing out right before the call to
pthreadpool_compute_4d_tiled.

Test Plan: TestQuantizedOps.test_empty_batch

Differential Revision: D22264414

Pulled By: dreiss

fbshipit-source-id: 9446d5231ff65ef19003686f3989e62f04cf18c9

* Fix batch size zero for QNNPACK linear_dynamic (#40588)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40588

Two bugs were preventing this from working.  One was a divide by zero
when multithreading was enabled, fixed similarly to the fix for static
quantized linear in the previous commit.  The other was computation of
min and max to determine qparams.  FBGEMM uses [0,0] for [min,max] of
empty input, do the same.

Test Plan: Added a unit test.

Differential Revision: D22264415

Pulled By: dreiss

fbshipit-source-id: 6ca9cf48107dd998ef4834e5540279a8826bc754

Co-authored-by: David Reiss <dreiss@fb.com>
2020-07-06 06:58:25 -07:00