Commit Graph

12368 Commits

Author SHA1 Message Date
James Reed
9bc8f071a3 [WIP] Move torch.fx into its own target (#46658)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46658

ghstack-source-id: 115213192

Test Plan: waitforsadcastle

Reviewed By: zdevito, vkuzo

Differential Revision: D24374723

fbshipit-source-id: 2b5708001f5df2ffb21ea5e586e26030653ccdcf
2020-10-29 17:03:08 -07:00
Leon Gao
7190155408 [Transposed Conv]add ConvTranspose3d with FBGEMM as backend (#46608)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46608

Introduce a frontend API for quantized transposed convolution with FBGEMM as the only backend.
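A rough usage sketch of the new module (module name `torch.nn.quantized.ConvTranspose3d` and default output qparams assumed; shapes are illustrative):

```python
import torch
import torch.nn.quantized as nnq

# quantized transposed 3d convolution, FBGEMM backend (sketch)
torch.backends.quantized.engine = "fbgemm"
m = nnq.ConvTranspose3d(16, 33, kernel_size=3, stride=2)

x = torch.randn(1, 16, 4, 8, 8)
qx = torch.quantize_per_tensor(x, scale=1.0, zero_point=0, dtype=torch.quint8)
out = m(qx)  # a quantized tensor
```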
ghstack-source-id: 115289210

Test Plan: https://www.internalfb.com/intern/testinfra/testconsole/testrun/7599824394184104/

Reviewed By: z-a-f

Differential Revision: D24369831

fbshipit-source-id: b8babd3ddbe0df8f4c8bc652bb745f85e0813797
2020-10-29 16:18:43 -07:00
Nikolay Korovaiko
71c0133e23 enable PE everywhere but mobile (#47001)
Summary:
enable PE everywhere but mobile

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47001

Reviewed By: eellison

Differential Revision: D24596252

Pulled By: Krovatkin

fbshipit-source-id: 3e3093a43287e1ff838cb03ec0e53c11c82c8dd2
2020-10-29 14:22:56 -07:00
Basil Hosmer
377a09c8e8 reland fast TypeMeta/ScalarType conversion (#45544)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45544

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D24006482

Pulled By: bhosmer

fbshipit-source-id: 5da2401ab40bbf58da27a5d969e00bcee7562ed6
2020-10-29 14:07:39 -07:00
shubhambhokare1
1ea14e30f5 [ONNX] Enable NoneType inputs to export API (#45792)
Summary:
Enables the use of NoneType arguments in the inputs tuple of the export API
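A hedged sketch of what this enables (the module, file name, and shapes below are hypothetical):

```python
import torch

class Net(torch.nn.Module):
    def forward(self, x, y=None):
        # optional second input
        return x + y if y is not None else x * 2

# NoneType entry in the inputs tuple passed to the export API
torch.onnx.export(Net(), (torch.randn(2, 3), None), "net.onnx")
```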

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45792

Reviewed By: heitorschueroff

Differential Revision: D24312784

Pulled By: bzinodev

fbshipit-source-id: 1717e856b56062add371af7dc09cdd9c7b5646da
2020-10-29 13:56:52 -07:00
Wang Xu
c556d4550c fix_combine_two_partition_size (#47053)
Summary:
Fix `combine_two_partitions` in Partitioner.py so that the combined partition's used memory size is recalculated after merging two partitions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47053

Reviewed By: gcatron

Differential Revision: D24624270

Pulled By: scottxu0730

fbshipit-source-id: a0e2a8486e012d02ea797d6ba36ab304d27cc93f
2020-10-29 13:40:44 -07:00
BowenBao
129b41226e [ONNX] Support nd mask index in opset >= 11 (#45252)
Summary:
Fixes below pattern for opset >= 11

`return tensor[tensor > 0]`

where rank of `tensor` > 1.
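A hedged sketch of the now-exportable pattern (module, file name, and shapes are hypothetical):

```python
import torch

class MaskIndex(torch.nn.Module):
    def forward(self, tensor):
        # boolean mask index where `tensor` has rank > 1
        return tensor[tensor > 0]

torch.onnx.export(
    MaskIndex(),
    (torch.randn(3, 4),),
    "mask_index.onnx",
    opset_version=11,
)
```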

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45252

Reviewed By: VitalyFedyunin

Differential Revision: D24116945

Pulled By: bzinodev

fbshipit-source-id: 384026cded1eb831bb5469e31ece4fcfb6ae8f2a
2020-10-29 13:32:59 -07:00
kshitij12345
1d233d7d1f [fix] torch.nn.functional.embedding -> padding_idx behavior (#46714)
Summary:
Reference https://github.com/pytorch/pytorch/issues/46585

Fix for the second snippet in the mentioned issue.
```python
predefined_weights = torch.rand(10, 3)
result = torch.nn.functional.embedding(torch.LongTensor([1,2,0]), predefined_weights, padding_idx=0)
```
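For context, a minimal sketch of the documented `padding_idx` semantics this snippet exercises (not necessarily the exact regression from the issue):

```python
import torch

predefined_weights = torch.rand(10, 3, requires_grad=True)
out = torch.nn.functional.embedding(
    torch.LongTensor([1, 2, 0]), predefined_weights, padding_idx=0
)
out.sum().backward()
# entries equal to padding_idx contribute no gradient to that weight row
assert torch.all(predefined_weights.grad[0] == 0)
```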

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46714

Reviewed By: VitalyFedyunin

Differential Revision: D24593352

Pulled By: albanD

fbshipit-source-id: 655b69d9ec57891871e26feeda2aa0dcff73beba
2020-10-29 13:29:00 -07:00
Rohan Varma
d850b5c98c Fix DDP issue where parameters share same grad_accumulator (#46755)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46755

As reported in https://github.com/pytorch/pytorch/issues/41324, there is a bug in DDP when `find_unused_parameters=True` and 2 or more parameters share the same gradient accumulator.

In the reducer, we currently keep a mapping of grad accumulator to index and populate it with map[accumulator] = index, but this overwrites indices when the accumulator is the same. To fix this, switch the mapping values to a vector of indices to hold all such indices that share the same accumulator.
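One common way to end up with parameters sharing a gradient accumulator is weight tying; a hypothetical sketch of such a module (the DDP wrap is left as a comment since it needs an initialized process group):

```python
import torch.nn as nn

class TiedLinear(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 8, bias=False)
        self.decoder = nn.Linear(8, 8, bias=False)
        self.decoder.weight = self.encoder.weight  # tied: one tensor, one grad accumulator

    def forward(self, x):
        return self.decoder(self.encoder(x))

# e.g. (requires an initialized process group):
# ddp = nn.parallel.DistributedDataParallel(
#     TiedLinear().to(rank), device_ids=[rank], find_unused_parameters=True
# )
```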
ghstack-source-id: 115453567

Test Plan: Added UT

Reviewed By: pritamdamania87

Differential Revision: D24497388

fbshipit-source-id: d32dfa9c5cd0b7a8df13c7873d5d28917b766640
2020-10-29 12:23:06 -07:00
Xiong Wei
74d730c0b5 implement NumPy-like functionality column_stack, row_stack (#46313)
Summary:
Related https://github.com/pytorch/pytorch/issues/38349

This PR implements `column_stack` as the composite ops of `torch.reshape` and `torch.hstack`, and makes `row_stack` as the alias of `torch.vstack`.
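A short sketch of the intended behavior and the approximate composite, following the NumPy semantics:

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

torch.column_stack((a, b))                        # tensor([[1, 4], [2, 5], [3, 6]])
torch.hstack([t.reshape(-1, 1) for t in (a, b)])  # same result for 1-D inputs

torch.row_stack((a, b))                           # alias of torch.vstack((a, b))
```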

Todo

- [x] docs
- [x] alias pattern for `row_stack`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46313

Reviewed By: ngimel

Differential Revision: D24585471

Pulled By: mruberry

fbshipit-source-id: 62fc0ffd43d051dc3ecf386a3e9c0b89086c1d1c
2020-10-29 12:14:39 -07:00
tmanlaibaatar
fee585b5a3 Correctly mark unannotated NamedTuple field to be inferred TensorType (#46969)
Summary:
If there is no annotation given, we want to show users that the type is inferred

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46969

Test Plan:
Added a new test case that throws an error with the expected error message

Fixes https://github.com/pytorch/pytorch/issues/46326

Reviewed By: ZolotukhinM

Differential Revision: D24614450

Pulled By: gmagogsfm

fbshipit-source-id: dec555a53bfaa9cdefd3b21b5142f5e522847504
2020-10-29 12:07:40 -07:00
mfkasim91
6eaa324c9f Implement torch.igamma (#46183)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/41637
This is the regularized lower incomplete gamma function, equivalent to SciPy's `gammainc` and TensorFlow's `igamma`.
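A usage sketch (the SciPy cross-check is left as a comment and assumes SciPy is installed):

```python
import torch

a = torch.tensor([1.0, 2.0, 4.0])
x = torch.tensor([0.5, 2.0, 4.0])
p = torch.igamma(a, x)   # regularized lower incomplete gamma P(a, x), elementwise

# from scipy.special import gammainc
# gammainc(a.numpy(), x.numpy())   # should match p up to float tolerance
```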

cc fritzo mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46183

Reviewed By: gchanan

Differential Revision: D24479126

Pulled By: mruberry

fbshipit-source-id: fdf8ea289fe4ca1b408810732192411e948fcdfe
2020-10-29 11:40:18 -07:00
Nikita Shulga
2b6a720eb1 Update pybind to 2.6.0 (#46415)
Summary:
Preserve PYBIND11 configuration options in `torch._C._PYBIND11_COMPILER_TYPE` and use them when building extensions

Also, use f-strings in `torch.utils.cpp_extension`

"Fixes" https://github.com/pytorch/pytorch/issues/46367

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46415

Reviewed By: VitalyFedyunin

Differential Revision: D24605949

Pulled By: malfet

fbshipit-source-id: 87340f2ed5308266a46ef8f0317316227dab9d4d
2020-10-29 10:53:47 -07:00
Ivan Yashchuk
f629fbe235 Added torch.linalg.tensorsolve (#46142)
Summary:
This PR adds `torch.linalg.tensorsolve` function that matches `numpy.linalg.tensorsolve`.

Ref https://github.com/pytorch/pytorch/issues/42666.
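A usage sketch mirroring the NumPy documentation example (shapes chosen to match that example):

```python
import torch

A = torch.eye(2 * 3 * 4).reshape(2 * 3, 4, 2, 3, 4)   # (6, 4, 2, 3, 4)
B = torch.randn(2 * 3, 4)                              # (6, 4)

X = torch.linalg.tensorsolve(A, B)                     # (2, 3, 4)
assert torch.allclose(torch.tensordot(A, X, dims=X.ndim), B)
```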

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46142

Reviewed By: izdeby

Differential Revision: D24539400

Pulled By: mruberry

fbshipit-source-id: 6e38364fe0bc511e739036deb274d9307df119b2
2020-10-29 10:29:28 -07:00
Rohan Varma
ecdbea77bc Fix DDP documentation (#46861)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46861

Noticed that in the DDP documentation:
https://pytorch.org/docs/master/generated/torch.nn.parallel.DistributedDataParallel.html?highlight=distributeddataparallel
some examples used `torch.nn.DistributedDataParallel`; fix them to read `torch.nn.parallel.DistributedDataParallel`.
ghstack-source-id: 115453703

Test Plan: ci

Reviewed By: pritamdamania87, SciPioneer

Differential Revision: D24534486

fbshipit-source-id: 64b92dc8a55136c23313f7926251fe825a2cb7d5
2020-10-29 09:13:47 -07:00
James Reed
d0df29ac22 [FX] Put inf and nan in globals instead of with an import string (#47035)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47035

Chillee thought the `from math import inf, nan` string at the top of `.code` was annoying, so here's an alternative: put those values in `globals` before we `exec`.
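A minimal sketch of the technique, independent of FX internals (the generated code string below is hypothetical):

```python
import math

code = "def forward(x):\n    return x * inf"    # generated code refers to `inf` with no import
namespace = {"inf": math.inf, "nan": math.nan}  # seed the exec globals instead
exec(compile(code, "<fx-generated>", "exec"), namespace)

forward = namespace["forward"]
print(forward(2.0))   # inf
```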

Test Plan: Imported from OSS

Reviewed By: dzhulgakov

Differential Revision: D24611278

Pulled By: jamesr66a

fbshipit-source-id: c25ef89e649bdd3e79fe91aea945a30fa7106961
2020-10-29 00:35:41 -07:00
Yi Wang
cab32d9cdf [RPC Framework] Support remote device format "<workername>/<device>" (#46773)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46773

Changed the constructor of RemoteModule to accept a `remote_device` arg in the following format:
"<workername>/<device>" (e.g., "trainer0/cpu", "ps0/cuda:0")

This arg merges the original `on` and `device` args.
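A hedged sketch of the new constructor argument (requires an initialized RPC framework; the worker name and module arguments are hypothetical):

```python
import torch.nn as nn
from torch.distributed.nn.api.remote_module import RemoteModule

# "<workername>/<device>" replaces the separate `on` and `device` args
remote_linear = RemoteModule("trainer0/cuda:0", nn.Linear, args=(20, 30))
```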

Original PR issue: RemoteDevice Format #46554
ghstack-source-id: 115448051

Test Plan: buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule

Reviewed By: pritamdamania87

Differential Revision: D24482562

fbshipit-source-id: 5acfc73772576a4b674df27625bf560b8f8e67c1
2020-10-29 00:14:56 -07:00
Kshiteej K
5c8aad1141 [numpy] torch.cos, torch.tan : promote integer inputs to float (#46706)
Summary:
References https://github.com/pytorch/pytorch/issues/42515

cc: mruberry
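A short sketch of the NumPy-like promotion this enables (assuming the default float dtype is float32):

```python
import torch

x = torch.arange(4)        # integer dtype (torch.int64)
torch.cos(x).dtype         # torch.float32: integer inputs are promoted to the default float dtype
torch.tan(x).dtype         # torch.float32
```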

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46706

Reviewed By: izdeby

Differential Revision: D24537262

Pulled By: mruberry

fbshipit-source-id: e57377a625814a3f34a765ce6bfd63a33c02a5d9
2020-10-28 22:02:52 -07:00
Nikita Shulga
42a51148c1 Use f-strings in torch.utils.cpp_extension (#47025)
Summary:
Plus two minor fixes to `torch/csrc/Module.cpp`:
 - Use iterator of type `Py_ssize_t` for array indexing in `THPModule_initNames`
 - Fix clang-tidy warning of unneeded defaultGenerator copy by capturing it as `const auto&`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47025

Reviewed By: samestep

Differential Revision: D24605907

Pulled By: malfet

fbshipit-source-id: c276567d320758fa8b6f4bd64ff46d2ea5d40eff
2020-10-28 21:32:33 -07:00
Wang Xu
a86b3438eb add support for different memory sizes on size_based_partition (#46919)
Summary:
WIP: add support for different memory sizes in size_based_partition, so that it can handle logical devices with different memory sizes. Compared to the original size_based_partition, the new one also supports partition-to-logical-device mapping: multiple partitions can be mapped onto one device if its memory size allows. A unit test, test_different_size_partition, is also added.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46919

Reviewed By: gcatron, VitalyFedyunin

Differential Revision: D24603511

Pulled By: scottxu0730

fbshipit-source-id: 1ba37338ae054ad846b425fbb7e631d3b6c500b6
2020-10-28 21:11:41 -07:00
Jerry Zhang
c2a3951352 [quant][graphmode][fx] Remove inplace option for convert_fx (#46955)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46955

Initially we were thinking of adding an `invalidate_quantized_float_parameters` option to free the memory of the quantized floating point parameters, but it turns out we do a module swap just like in eager mode for the modules that are quantized, so the old floating point module will not be referenced after quantization. Therefore this feature is only needed for functionals; since most people use quantization with modules, we may not need it.

We'll revisit this once there is a need for it.

Test Plan: Imported from OSS

Reviewed By: supriyar

Differential Revision: D24579400

fbshipit-source-id: fbb0e567405dc0604a2089fc001573affdade986
2020-10-28 21:07:19 -07:00
Rohan Varma
c7183c9878 Fix object-based collectives API to use torch.cuda.current_device instead of (#46897)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46897

These APIs implicitly assumed that the GPU for a rank == the rank index, but
that is not necessarily true. For example, the first GPU could be used for a
different purpose, and rank 0 could use GPU 1, rank 1 GPU 2, etc. Thus, we
mandate that the user specify the device to use via `torch.cuda.set_device()`
before making calls to this API. This expectation should be okay since we
clearly document it, and we expect the user to set this for
DistributedDataParallel as well.

Also adds/tidies up some documentation.
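A hedged sketch of the expected call pattern (assumes an already-initialized NCCL process group):

```python
import torch
import torch.distributed as dist

rank = dist.get_rank()
torch.cuda.set_device(rank)   # or whichever GPU this rank actually owns

gathered = [None for _ in range(dist.get_world_size())]
dist.all_gather_object(gathered, {"rank": rank, "payload": [1, 2, 3]})
```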
ghstack-source-id: 115359633

Test Plan: Modified unittests

Reviewed By: divchenko

Differential Revision: D24556177

fbshipit-source-id: 7e826007241eba0fde3019180066ed56faf3c0ca
2020-10-28 18:12:50 -07:00
Michael Suo
dc8176356e Various cleanups to ir_emitter and friends (#46686)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46686

I was trying to page this code back in after a while and some things
stuck out as unnecessarily confusing.

1. Improve documentation of closures and fork stuff to be more accurate
to how we use them today.
2. Change `prim::LocalVariableScope` to `prim::ListComprehension`. It is
only ever used for a list comprehensions, and in general the nodes
emitted by `ir_emitter` should correspond to concrete operations or
language features rather than semantic constraints.
3. Change the somewhat mysterious "inputs" and "attributes" argument
names throughout the codebase to be the more obvious "args" and "kwargs"
that they generally represent (I think "inputs" and "attributes" come
from the AST naming).

Test Plan: Imported from OSS

Reviewed By: navahgar, jamesr66a

Differential Revision: D24464197

Pulled By: suo

fbshipit-source-id: 1f4b1475b58b5690a0b204e705caceff969533b4
2020-10-28 16:28:05 -07:00
Nikita Vedeneev
cd26d027b3 [doc] Fix info on the shape of pivots in torch.lu + more info on what and how they encode permutations. (#46844)
Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46844

Reviewed By: VitalyFedyunin

Differential Revision: D24595538

Pulled By: ezyang

fbshipit-source-id: 1bb9c0310170124c3b6e33bd26ce38c22b36e926
2020-10-28 14:56:31 -07:00
James Reed
bf08814b73 [FX] Kill functional transforms name (#47004)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47004

Test Plan: Imported from OSS

Reviewed By: zdevito

Differential Revision: D24597581

Pulled By: jamesr66a

fbshipit-source-id: 9213d58f4a53ea55e97e6ca0572fdcf5e271bdc3
2020-10-28 11:59:28 -07:00
David Reiss
23bce17baa Add inputsSize to Python IR, like outputsSize (#46779)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46779

Test Plan: Used it in some notebooks.

Reviewed By: suo

Differential Revision: D24574005

Pulled By: dreiss

fbshipit-source-id: 78ba7a2bdb859fef5633212b73c7a3eb2cfbc380
2020-10-28 11:35:39 -07:00
James Reed
069232a574 [FX] Fix corner case in name sanitization (#46958)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46958

Test Plan: Imported from OSS

Reviewed By: dzhulgakov

Differential Revision: D24580474

Pulled By: jamesr66a

fbshipit-source-id: 2f8d252998c72e1e79d6a5f7766c2d51a271cc83
2020-10-28 10:22:33 -07:00
Richard Barnes
353e7f940f Ensure kernel launches are checked (#46474)
Summary:
Caffe2 and Torch currently do not have a consistent mechanism for determining whether a kernel has launched successfully. The result is difficult-to-detect or silent errors. This diff provides functionality to fix that; subsequent diffs on the stack fix the identified issues.

Kernel launch errors may arise if launch parameters (number of blocks, number of threads, shared memory, or stream id) are specified incorrectly for the hardware, or for other reasons. Interestingly, unless these launch errors are specifically checked for, CUDA will silently fail and return garbage answers which can affect downstream computation. Therefore, catching launch errors is important.

Launches are currently checked by placing
```
AT_CUDA_CHECK(cudaGetLastError());
```
somewhere below the kernel launch. This is bad for two reasons.
1. The check may be performed at a site distant to the kernel launch, making debugging difficult.
2. The separation of the launch from the check means that it is difficult for humans and static analyzers to determine whether the check has taken place.

This diff defines a macro:
```
#define TORCH_CUDA_KERNEL_LAUNCH_CHECK() AT_CUDA_CHECK(cudaGetLastError())
```
which clearly indicates the check.

This diff also introduces a new test which analyzes code to identify kernel launches and determines whether the line immediately following the launch contains `TORCH_CUDA_KERNEL_LAUNCH_CHECK();`.

A search of the Caffe2 codebase identifies 104 instances of `AT_CUDA_CHECK(cudaGetLastError());` while the foregoing test identifies 1,467 launches which are not paired with a check. Visual inspection indicates that few of these are false positives, highlighting the need for some sort of static analysis system.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46474

Test Plan:
The new test is run with:
```
buck test //caffe2/test:kernel_launch_checks -- --print-passing-details
```
And should be launched automatically with the other land tests. (TODO: Is it?)

The test is currently set up only to provide warnings but can later be adjusted to require checks.

Otherwise, I rely on the existing test frameworks to ensure that changes resulting from reorganizing existing launch checks don't cause regressions.

Reviewed By: ngimel

Differential Revision: D24309971

Pulled By: r-barnes

fbshipit-source-id: 0dc97984a408138ad06ff2bca86ad17ef2fdf0b6
2020-10-28 09:27:48 -07:00
frgfm
c886c7f6dd fix: Fixed typing of bool in _ConvNd (#46828)
Summary:
Hello there 👋

I do believe there is a typo in the typing of the `bool` argument of the `_ConvNd` constructor.
The typing of the attribute is correct, but the constructor argument, while annotated the same way, is not the value that will be assigned to `self.bias`.

This PR simply corrects that.
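A minimal sketch of the distinction, heavily simplified from `torch.nn.modules.conv` (not the actual implementation):

```python
from typing import Optional
import torch
from torch import Tensor
from torch.nn.parameter import Parameter

class _ConvNd(torch.nn.Module):
    bias: Optional[Tensor]                      # the attribute holds a Parameter or None

    def __init__(self, out_channels: int, bias: bool) -> None:   # the argument is just a flag
        super().__init__()
        if bias:
            self.bias = Parameter(torch.empty(out_channels))
        else:
            self.register_parameter("bias", None)
```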

Any feedback is welcome!

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46828

Reviewed By: izdeby

Differential Revision: D24550435

Pulled By: ezyang

fbshipit-source-id: ab10f1a5b29a912cb23fc321a51e78b04a8391e3
2020-10-28 08:08:53 -07:00
Jerry Zhang
cd8ed93287 [quant][graphmode][fx][api] Remove inplace option from prepare_fx (#46954)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46954

Test Plan: Imported from OSS

Reviewed By: supriyar

Differential Revision: D24579401

fbshipit-source-id: adce623ce819fa220f7bb08d1ff3beaa69850621
2020-10-28 08:00:12 -07:00
Alban Desmaison
46b252b83a Revert D24262885: [pytorch][PR] Added foreach_zero_ API
Test Plan: revert-hammer

Differential Revision:
D24262885 (8e37dcb1f3)

Original commit changeset: 144c283dd009

fbshipit-source-id: 451b202e23bc1fcb11b20d26c11d9a1329789d22
2020-10-28 06:48:59 -07:00
Bram Wasti
ddbdbce623 [jit] Prevent caching of graph attribute. (#46960)
Summary:
`graph` is automatically cached even when the underlying graph changes -- this PR hardcodes a fix to that.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46960

Reviewed By: mrshenli

Differential Revision: D24582185

Pulled By: bwasti

fbshipit-source-id: 16aeeba251830886c92751dd5c9bda8699d62803
2020-10-27 23:56:52 -07:00
Jerry Zhang
d92bf921db [quant][graphmode][fx] Remove inplace option for fuse_fx (#46953)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46953

Test Plan: Imported from OSS

Reviewed By: supriyar

Differential Revision: D24579402

fbshipit-source-id: 5e0b8abf682287ab3c7dd54c2fc2cf309295e147
2020-10-27 22:34:11 -07:00
Yi Wang
e299393fd5 [Gradient Compression] Provide 2 default C++ comm hooks (#46701)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46701

Provide 2 built-in implementations of C++ comm hook.

Original PR issue: C++ DDP Communication Hook https://github.com/pytorch/pytorch/issues/46348
ghstack-source-id: 115319061

Test Plan: waitforbuildbot

Reviewed By: pritamdamania87

Differential Revision: D24382504

fbshipit-source-id: 1c1ef56620f91ab37a1707c5589f1d0eb4455bb3
2020-10-27 21:43:15 -07:00
Yi Wang
e077a2a238 [Gradient Compression] Add CppCommHook subclass for supporting the C++ API of communication hook. (#46566)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46566

Only provides an interface. Some built-in implementations will be provided in a follow-up commit.

Original PR issue: C++ DDP Communication Hook https://github.com/pytorch/pytorch/issues/46348
ghstack-source-id: 115319038

Test Plan: waitforbuildbot

Reviewed By: pritamdamania87

Differential Revision: D24379460

fbshipit-source-id: 8382dc4185c7c01d0ac5b3498e1bead785bccec5
2020-10-27 21:43:12 -07:00
Jerry Zhang
998b9b9e68 [quant][graphmode][fx] custom_module support static/dynamic/weight_only quant (#46786)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46786

Previously we only supported static quant; this PR adds support for other types of quantization.

Note that QAT is actually orthogonal to these quant types; this refers to the convert step, where we convert the observed module to a quantized module.

For QAT, the user will provide a CustomModule -> FakeQuantizedCustomModule mapping in prepare_custom_config_dict,
and a FakeQuantizedCustomModule -> static/dynamic/weight_only quantized CustomModule mapping in convert_custom_config_dict.

Test Plan: Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D24514701

fbshipit-source-id: 2918be422dd76093d67a6df560aaaf949b7f338c
2020-10-27 21:41:33 -07:00
Jerry Zhang
5a8198eb3c [quant][graphmode][fx][fix] scalar as first input for add/mul (#46751)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46751

Currently we assume the first input for add/mul is a Node (Tensor), but that might not be the case

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_quantized_add
python test/test_quantization.py TestQuantizeFxOps.test_quantized_mul
python test/test_quantization.py TestQuantizeFxOps.test_quantized_add_relu
python test/test_quantization.py TestQuantizeFxOps.test_quantized_mul_relu

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D24494456

fbshipit-source-id: ef5e23ba60eb22a57771791f4934306b25c27c01
2020-10-27 19:59:28 -07:00
Yang Wang
810c68fb1d [OpBench] fix jit tracing with quantized op/tensor by enabling _compare_tensors_internal to compare quantized tensors (#46772)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46772

When running `buck run caffe2/benchmarks/operator_benchmark/pt:qactivation_test -- --use_jit`, I encountered the following error P146518683. The error was traced down to the fact that `torch.allclose` does not work with quantized tensors (the error was triggered by this particular multiplication https://fburl.com/diffusion/8vw647o6, since native mul cannot work with a float scalar and a quantized tensor).

Minimal example to reproduce:
```
(Pdb) input = torch.ones(5)
(Pdb) aa = torch.quantize_per_tensor(input, scale=1.0, zero_point=0, dtype=torch.quint8)
(Pdb) bb = torch.quantize_per_tensor(input, scale=1.0, zero_point=0, dtype=torch.quint8)
(Pdb) torch.allclose(aa, bb)
Comparison exception: 	promoteTypes with quantized numbers is not handled yet; figure out what the correct rules should be, offending types: QUInt8 Float
```

Here the proposed fix is to compare quantized tensors strictly within `_compare_tensors_internal`.

The other two possible fixes are:
1. convert quantized tensors to float tensors first before sending them to `torch.allclose`
2. change `torch.allclose` to handle quantized tensors.
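A sketch of what such a strict comparison could look like (not the exact code in `_compare_tensors_internal`):

```python
import torch

x = torch.ones(5)
aa = torch.quantize_per_tensor(x, scale=1.0, zero_point=0, dtype=torch.quint8)
bb = torch.quantize_per_tensor(x, scale=1.0, zero_point=0, dtype=torch.quint8)

# compare quantization parameters and the underlying integer representation directly
same = (
    aa.qscheme() == bb.qscheme()
    and aa.q_scale() == bb.q_scale()
    and aa.q_zero_point() == bb.q_zero_point()
    and torch.equal(aa.int_repr(), bb.int_repr())
)
```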

Test Plan: buck run caffe2/benchmarks/operator_benchmark/pt:qactivation_test -- --use_jit

Reviewed By: kimishpatel

Differential Revision: D24506723

fbshipit-source-id: 6426ea2a88854b4fb89abef0edd2b49921283796
2020-10-27 18:53:13 -07:00
iurii zdebskyi
8e37dcb1f3 Added foreach_zero_ API (#46215)
Summary:
Added the `foreach_zero_(TensorList)` API.

Tested via unit tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46215

Reviewed By: zhangguanheng66

Differential Revision: D24262885

Pulled By: izdeby

fbshipit-source-id: 144c283dd00924083096d6d92eb9085cbd6097d3
2020-10-27 18:03:34 -07:00
James Reed
67c1dc65a3 [FX] Fix handling of inf and nan literals (#46894)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46894

Test Plan: Imported from OSS

Reviewed By: zdevito

Differential Revision: D24555136

Pulled By: jamesr66a

fbshipit-source-id: 22765a4d9d373711e9e6d7b1d3898080ecbcf2f5
2020-10-27 17:55:35 -07:00
Vasiliy Kuznetsov
8066e89f64 quant: fix bug with copy.deepcopy of FX prepared quantization models (#46895)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46895

Bug: models after the FX graph mode quant prepare step lost information,
such as the extra attributes defined in `Quantizer.save_state`,
if the user performed `copy.deepcopy` on them.  The information was lost
because `GraphModule` does not copy attributes which are not present on
`nn.Module` by default.

Fix: define a custom `__deepcopy__` method on observed models and
whitelist the attributes we care about.

This is needed because users sometimes run `copy.deepcopy` on their
models during non-quantization related preparations, and we should make
sure that quantization related state survives these calls.
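A hedged sketch of the workflow this protects, using the FX graph mode quantization API of this era (qconfig dict format assumed):

```python
import copy
import torch
import torch.nn as nn
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

model = nn.Sequential(nn.Conv2d(3, 3, 1), nn.ReLU()).eval()
prepared = prepare_fx(model, {"": get_default_qconfig("fbgemm")})
prepared(torch.randn(1, 3, 8, 8))            # calibration

prepared_copy = copy.deepcopy(prepared)      # quantization state must survive this
quantized = convert_fx(prepared_copy)
```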

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_deepcopy
python test/test_quantization.py TestQuantizeFx.test_standalone_module
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D24556035

fbshipit-source-id: f7a6b28b6d2225fa6189016f967f175f6733b124
2020-10-27 16:05:35 -07:00
Guilherme Leobas
717e6d8081 add type annotations to comm.py (#46736)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46735

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46736

Reviewed By: albanD

Differential Revision: D24565554

Pulled By: mrshenli

fbshipit-source-id: 4e40e4232ebf256af228f9c742ea4d28c626c616
2020-10-27 14:27:06 -07:00
kshitij12345
21e60643c0 [numpy] torch.log{2,10} : promote integer inputs to float (#46810)
Summary:
References https://github.com/pytorch/pytorch/issues/42515

cc: mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46810

Reviewed By: izdeby

Differential Revision: D24536187

Pulled By: mruberry

fbshipit-source-id: b7dd7678d4e996f3dea0245c65055654e02be459
2020-10-27 13:07:44 -07:00
Wang Xu
8640905088 add sparse_nn_partition (#46390)
Summary:
WIP: This PR adds sparse_nn_partition to the Partitioner class. It includes logical device assignment for all DAG nodes. The basic idea is to run size_based_partition separately for embedding nodes and non-embedding nodes. A unit test is also added in test_fx_experimental.py.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46390

Reviewed By: gcatron

Differential Revision: D24555415

Pulled By: scottxu0730

fbshipit-source-id: 8772af946d5226883759a02a1c827cfdfce66097
2020-10-27 00:11:58 -07:00
Raghavan Raman
4b6e307191 Replace flatten tensors with flatten loops. (#46737)
Summary:
This is the second attempt at replacing flattened tensors with flattened loops in `TensorExprKernel::generateStmt`. The first attempt (https://github.com/pytorch/pytorch/pull/46539) resulted in a build failure due to an exception thrown during inlining.

The build failure occurred because there was an inline step that was supposed to operate on the unflattened tensors. That step was necessary earlier because every flattened tensor had a corresponding unflattened tensor which had to be inlined; it is no longer needed since we do not keep two tensors (flattened and unflattened) anymore. Removed this inline.

Checked Python and C++ tests on CPU as well as CUDA.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46737

Reviewed By: anjali411, izdeby

Differential Revision: D24534529

Pulled By: navahgar

fbshipit-source-id: 8b131a6be076fe94ed369550d9f54d3879fdfefd
2020-10-27 00:01:20 -07:00
Jerry Zhang
6b50ccc41c [quant][graphmode][fx] Support sigmoid/hardsigmoid/tanh in qat (#46738) (#46871)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46871

Test Plan:
Imported from OSS

Reviewed By: vkuzo

Differential Revision: D24547180

fbshipit-source-id: d2eb9aa74c6e5436204376b1a2ebcc6188d3562f
2020-10-26 23:52:07 -07:00
Jack Montgomery
60eded6c0f Add single element tuple output from to_backend/to_glow (#5029)
Summary:
Pull Request resolved: https://github.com/pytorch/glow/pull/5029

Support single element tuples in to_backend

Test Plan: new unit test for to_glow

Reviewed By: andrewmillspaugh

Differential Revision: D24539869

fbshipit-source-id: fb385a7448167b2b948e70f6af081bcf78f338dc
2020-10-26 22:29:04 -07:00
Linbin Yu
37da6d26ff add fburl link to error message (#46795)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46795

add an fburl link to the error message for missing ops so users can debug it themselves.

Test Plan: fburl.com/missing_ops

Reviewed By: iseeyuan

Differential Revision: D24519992

fbshipit-source-id: d2d16db7e9d9c84ce2c4600532eb253c30b31971
2020-10-26 21:05:49 -07:00
Jeffrey Wan
9858b012ec Fix TripletMarginWithDistanceLoss example code (#46853)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/45210

Removes `requires_grad=True` from all the `randint` calls
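For context, integer tensors cannot require gradients, which is presumably why the flag had to be dropped; a quick sketch:

```python
import torch

torch.randn(3, requires_grad=True)                # fine: floating point
# torch.randint(0, 10, (3,), requires_grad=True)  # raises: only floating point tensors can require gradients
```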

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46853

Reviewed By: izdeby

Differential Revision: D24549483

Pulled By: soulitzer

fbshipit-source-id: c03576571ed0b2dbb281870f29a28eb6f6209c65
2020-10-26 21:02:54 -07:00
Yi Wang
a6cd294c9b [Gradient Compression] Refactor CommHookInterface and PythonCommHook. (#46512)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46512

1. Merge 1-line PythonCommHook constructor into the header for simplicity.
2. Move the implementation of PythonCommHook destructor from the header file to cpp file.
3. Rename processFuture method as parseHookResult for readability.
4. Simplify some comments.

Original PR issue: C++ DDP Communication Hook https://github.com/pytorch/pytorch/issues/46348
ghstack-source-id: 115161086

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_ddp_comm_hook_allreduce_hook_nccl

buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_ddp_comm_hook_sparse_gradients

buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_ddp_comm_hook_allreduce_with_then_hook_nccl

buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_ddp_comm_hook_future_passing_gpu_gloo

Reviewed By: jiayisuse

Differential Revision: D24374282

fbshipit-source-id: c8dbdd764bca5b3fa247708f1218cb5ff3e321bb
2020-10-26 18:07:58 -07:00