Commit Graph

681 Commits

Author SHA1 Message Date
HDCharles
28a5cd9480 [ao] fixing public v private for quantize_jit.py (#86024)
Summary: just needed to add __all__

Test Plan: python test/test_public_bindings.py

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86024
Approved by: https://github.com/jerryzh168
2022-10-05 22:11:43 +00:00
HDCharles
14db44ad72 [ao] fixing public v private for quantize.py (#86023)
Summary: just needed to add __all__

Test Plan: python test/test_public_bindings.py

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86023
Approved by: https://github.com/jerryzh168
2022-10-05 19:40:42 +00:00
HDCharles
c21caff876 [ao] correctly set public v private for fake_quantize.py (#86022)
Summary: the biggest issue was that the constructors for the fake_quantize
classes use custom partials that live in the observer module, so the module
for these needed to be set correctly in the constructor classmethod

Test Plan: python test/test_public_bindings.py

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86022
Approved by: https://github.com/jerryzh168
2022-10-05 19:30:50 +00:00
Zafar
adb12438c1 [AO] Cubic sparsity level scheduler (#85232)
The scheduler updates the levels of sparsity based on https://arxiv.org/abs/1710.01878.

 ## Implementation

The update rule is defined as:

$$
\begin{aligned}
s_t &= s_f + (s_i - s_f)\left( 1 - \frac{t - t_0}{n\Delta t} \right)^3 \\
\text{for } t &\in \left\{ t_0,\ t_0 + \Delta t,\ \dots,\ t_0 + n\Delta t \right\}
\end{aligned}
$$

There is a minor difference compared to the original paper. By providing the `initially_zero` argument, one can set the level of sparsity before step $t_0$: if `False`, the sparsity level before $t_0$ is set to $s_i$; otherwise it is 0.
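
A minimal standalone sketch of the update rule above (parameter names are illustrative, not the scheduler's actual API):

```python
def cubic_sparsity_level(t, s_i, s_f, t_0, dt, n, initially_zero=False):
    # Before the schedule starts: either 0 or the initial level s_i.
    if t < t_0:
        return 0.0 if initially_zero else s_i
    # After the last update step the level stays at the final value s_f.
    if t >= t_0 + n * dt:
        return s_f
    # Cubic interpolation between s_i and s_f at the update steps.
    return s_f + (s_i - s_f) * (1 - (t - t_0) / (n * dt)) ** 3

# Example: ramp sparsity from 0% to 90% over 10 updates of size 1, starting at step 2.
levels = [cubic_sparsity_level(t, 0.0, 0.9, 2, 1, 10) for t in range(15)]
```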

 ## Tests

```
python test/test_ao_sparsity.py -- TestCubicScheduler
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85232
Approved by: https://github.com/junesg, https://github.com/jerryzh168
2022-10-04 22:44:15 +00:00
Xia, Weiwen
4b86a9359a [Quant] Make x86 backend default when querying qconfig (#85461)
This PR is a follow-up of #84329 [[Quant] Add unified x86 quant backend](https://github.com/pytorch/pytorch/pull/84329)
It makes the `x86` backend the default when querying `qconfig`. Users get x86's qconfig/qconfig_mappings if the backend is not specified.
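
A hedged sketch of what this looks like from the user side (whether x86 is already the default in a given release follows this PR's description and is an assumption here):

```python
import torch
from torch.ao.quantization import get_default_qconfig, get_default_qconfig_mapping

# Explicitly requesting the unified backend:
qconfig_x86 = get_default_qconfig("x86")
mapping_x86 = get_default_qconfig_mapping("x86")

# After this change, omitting the backend argument is expected to resolve to x86
# (previously it resolved to fbgemm):
qconfig_default = get_default_qconfig()
```
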
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85461
Approved by: https://github.com/jgong5, https://github.com/vkuzo
2022-09-30 23:44:45 +00:00
andrewor14
24fc680ee4 [Quant] Enable XNNPACK ops in QNNPACK BackendConfig (#85863)
**Summary:** This commit enforces the following constraints on the
QNNPACK BackendConfig:

- `quant_min_lower_bound` = -127 for qint8 weight
- `quant_max_upper_bound` = 127 for qint8 weight
- `scale_min_lower_bound` = 2 ** -12 for qint8 activations and weight

These constraints will enable users to use this BackendConfig with
faster XNNPACK quantized ops. They are also consistent with the
existing settings in `default_symmetric_qnnpack_qconfig` and its
per_channel and QAT variants. For more detail on why these exact
values were chosen, please see the description of
https://github.com/pytorch/pytorch/pull/74396.

Note that there are currently no restrictions on the qscheme in
DTypeConfig. This should be added in the future to further enforce
the restriction that the weights must be quantized with either
per_tensor_symmetric or per_channel_symmetric.

Existing default QConfigs such as `get_default_qconfig("qnnpack")`
and `get_default_qat_qconfig("qnnpack")` will continue to be
supported, but only for the existing dtypes, e.g. quint8 activations
for weighted ops like linear and conv. In the future, we should
revisit whether to enable XNNPACK ops using these QConfigs as well.
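
A sketch of how the stated bounds map onto the `DTypeWithConstraints` API (shown in the BackendConfig constraints commit further down this log); the exact DTypeConfigs used in the QNNPACK BackendConfig may differ:

```python
import torch
from torch.ao.quantization.backend_config import DTypeConfig, DTypeWithConstraints

qint8_act = DTypeWithConstraints(
    dtype=torch.qint8,
    scale_min_lower_bound=2 ** -12,
)
qint8_weight = DTypeWithConstraints(
    dtype=torch.qint8,
    quant_min_lower_bound=-127,
    quant_max_upper_bound=127,
    scale_min_lower_bound=2 ** -12,
)
# Weighted-op config with qint8 activations and weights, matching the bounds above.
xnnpack_weighted_op_dtype_config = DTypeConfig(
    input_dtype=qint8_act,
    output_dtype=qint8_act,
    weight_dtype=qint8_weight,
    bias_dtype=torch.float,
)
```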

**Test Plan:**

python test/test_quantization.py TestQuantizeFx.test_qnnpack_backend_config

**Reviewers:** jerryzh168, vkuzo

**Subscribers:** jerryzh168, vkuzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85863
Approved by: https://github.com/jerryzh168
2022-09-30 22:53:38 +00:00
Digant Desai
071f875046 [quant] Fix per channel weight observer (#85883)
Summary: `per_channel_weight_observer_range_neg_127_to_127` now correctly uses `PerChannelMinMaxObserver` instead of `MinMaxObserver`

Test Plan:
Adds a new test `quantization.core.test_top_level_apis` to instantiate and run `forward()` on all `default` observers
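
A minimal sketch of such a check for the fixed observer (illustrative only; this assumes the observer partial is importable from `torch.ao.quantization.observer`):

```python
import torch
from torch.ao.quantization.observer import per_channel_weight_observer_range_neg_127_to_127

obs = per_channel_weight_observer_range_neg_127_to_127()
obs(torch.randn(8, 16))                      # weight with 8 output channels
scale, zero_point = obs.calculate_qparams()
# With PerChannelMinMaxObserver there is one qparam pair per channel.
assert scale.numel() == 8 and zero_point.numel() == 8
```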

Differential Revision: D39916482

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85883
Approved by: https://github.com/salilsdesai
2022-09-30 22:02:44 +00:00
Xia, Weiwen
3a3e2002d8 [Quant] Add unified x86 quant backend (#84329)
## Description

Implement a unified quantization backend 'X86' for x86 platforms. It combines the advantages of FBGEMM and ONEDNN: it selects kernels during weight prepacking and hides the details from end users. It will be the default backend in place of FBGEMM.

For details, please refer to this RFC: [[RFC] Unified quantization backend for x86 CPU platforms](https://github.com/pytorch/pytorch/issues/83888)
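
A hedged usage sketch (selecting the backend explicitly is shown here; whether it is already the default engine in a given build is an assumption):

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping

# Select the unified x86 backend; the concrete kernels (fbgemm vs onednn)
# are then chosen internally during weight prepacking.
torch.backends.quantized.engine = "x86"
qconfig_mapping = get_default_qconfig_mapping("x86")
```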

## Validation
**Correctness**
Covered by UT

**Accuracy**
By running torchvision models on imagenet, no accuracy difference is found between FBGEMM and the unified X86 backend:
[torchvision_accuracy_comparison_fbgemm_vs_x86.xlsx](https://github.com/pytorch/pytorch/files/9598114/torchvision_accuracy_comparison_fbgemm_vs_x86.xlsx)

**Performance**
Depends on https://github.com/pytorch/pytorch/pull/84470 which improves performance.
For early PoC results, please refer to https://github.com/pytorch/pytorch/files/9399202/unified_qengine_poc_performance_bechmark.xlsx

With the two PRs combined, we collected some data on an Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz.
Method: run multiple instances with 4 cores per instance across the whole socket, using JeMalloc and Intel OMP.
Models/throughput | fbgemm | x86 | improvement
-- | -- | -- | --
wide_resnet101_2 | 173.5675 | 241.815 | 39.32%
resnext101_32x8d | 174.365 | 339.8175 | 94.89%
resnet50 | 573.155 | 1174.14 | 104.86%
vgg19_bn | 260.335 | 337.92 | 29.80%
vgg19 | 257.935 | 333.265 | 29.21%
inception_v3 | 601.1175 | 1309.33 | 117.82%
densenet161 | 296.645 | 435.5625 | 46.83%
mnasnet1_0 | 1216.7 | 4057.515 | 233.49%
squeezenet1_0 | 1220.085 | 5153.3875 | 322.38%
alexnet | 2294.91 | 2624.6375 | 14.37%
fbnetc_100 | 976.2825 | 3110.1825 | 218.57%
shufflenet_v2_x0_5 | 1555.76 | 3026.125 | 94.51%
spnasnet_100 | 1059.065 | 3502.0975 | 230.68%
pytorch-unet | 192.76 | 246.77 | 28.02%
acgan | 257.32 | 333.7325 | 29.70%
cgan | 7790.6925 | 7803.1025 | 0.16%
sgan | 257.565 | 338.8875 | 31.57%
se_resnet50 | 492.3725 | 916.5175 | 86.14%
vggm | 300.2875 | 316.2075 | 5.30%

Environment:
- PyTorch version: 1.13.0a0+gitcdd625b
- Is debug build: False
- CUDA used to build PyTorch: None
- ROCM used to build PyTorch: N/A
- OS: Ubuntu 20.04.3 LTS (x86_64)
- GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
- Clang version: Could not collect
- CMake version: version 3.22.5
- Libc version: glibc-2.31
- Python version: 3.9.12 (main, Jun  1 2022, 11:38:51)  [GCC 7.5.0] (64-bit runtime)
- Python platform: Linux-5.11.0-27-generic-x86_64-with-glibc2.31
- Is CUDA available: False
- CUDA runtime version: No CUDA
- GPU models and configuration: No CUDA
- Nvidia driver version: No CUDA
- cuDNN version: No CUDA
- HIP runtime version: N/A
- MIOpen runtime version: N/A
- Is XNNPACK available: True

Versions of relevant libraries:
- [pip3] intel-extension-for-pytorch==1.13.0+cpu
- [pip3] numpy==1.23.3
- [pip3] pytorch-widedeep==0.3.7
- [pip3] torch==1.13.0a0+git48b423b
- [pip3] torchvision==0.14.0a0+ebb68f3
- [conda] blas                      1.0                         mkl
- [conda] intel-extension-for-pytorch 1.13.0+cpu               pypi_0    pypi
- [conda] mkl                       2021.4.0           h06a4308_640
- [conda] mkl-include               2022.1.0                 pypi_0    pypi
- [conda] mkl-service               2.4.0            py39h7f8727e_0
- [conda] mkl-static                2022.1.0                 pypi_0    pypi
- [conda] mkl_fft                   1.3.1            py39hd3c417c_0
- [conda] mkl_random                1.2.2            py39h51133e4_0
- [conda] numpy                     1.23.3                   pypi_0    pypi
- [conda] numpy-base                1.22.3           py39hf524024_0
- [conda] torch                     1.13.0a0+git48b423b          pypi_0    pypi
- [conda] torchvision               0.14.0a0+ebb68f3          pypi_0    pypi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84329
Approved by: https://github.com/jerryzh168
2022-09-29 00:44:40 +00:00
zaf
d542aab5c1 [quant][ao_migration] nn.intrinsic migration to ao (#84842)
All quantization-related modules are being migrated to `torch.ao`. This migrates `nn.intrinsic.modules`. Please see the [tracker](https://github.com/pytorch/pytorch/issues/81667) for the timeline.
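
An illustrative import check under the migration (sketch; the old `torch.nn.intrinsic` path is expected to keep working as an alias of the new location):

```python
import torch.ao.nn.intrinsic as ao_intrinsic
import torch.nn.intrinsic as nn_intrinsic

# Both paths should expose the same fused module classes during the migration window.
print(ao_intrinsic.ConvReLU2d, nn_intrinsic.ConvReLU2d)
```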

Differential Revision: [D39419733](https://our.internmc.facebook.com/intern/diff/D39419733/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39419733/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84842
Approved by: https://github.com/jerryzh168
2022-09-28 23:54:29 +00:00
andrewor14
4ca125a9e1 [Quant][fx] Add quant and scale ranges to BackendConfig (#85200)
**Summary:** This commit adds the following constraints to
BackendConfig:

    quant_min_lower_bound
    quant_max_upper_bound
    scale_min_lower_bound
    scale_max_upper_bound

This is motivated by QNNPACK constraints on qint8 weight
values and the min scale value. Actually enforcing these
constraints in the QNNPACK BackendConfig will follow in a
future commit.

Today, users can also specify the above constraints through
QConfigs, and these settings may not necessarily match the
ones specified in the BackendConfig. In this case, we will
handle the discrepancy as follows:

(1) Require QConfig quant ranges to fall within the backend's
(2) Require QConfig min scale value (eps) >= backend's
(3) Require QConfig to specify quant range if the backend
    specified one
(4) Require QConfig to specify min scale value (eps) if the
    backend specified one

Public API changes:

* Previous API, still supported after this commit:
```
dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)
```
* New API:
```
dtype_config = DTypeConfig(
    input_dtype=DTypeWithConstraints(
        dtype=torch.quint8,
        quant_min_lower_bound=0,
        quant_max_upper_bound=127,
        scale_min_lower_bound=2 ** -12,
    ),
    output_dtype=DTypeWithConstraints(
        dtype=torch.quint8,
        quant_min_lower_bound=0,
        quant_max_upper_bound=127,
        scale_min_lower_bound=2 ** -12,
    ),
    weight_dtype=DTypeWithConstraints(
        dtype=torch.qint8,
        quant_min_lower_bound=-128,
        quant_max_upper_bound=127,
        scale_min_lower_bound=2 ** -12,
    ),
    bias_dtype=torch.float,
)
```
* Additionally, the following `DTypeConfig` attributes
have new types with helper getters:
```
# These have type DTypeWithConstraints
dtype_config.input_dtype
dtype_config.output_dtype
dtype_config.weight_dtype

# These return Optional[torch.dtype]
dtype_config.get_input_dtype()
dtype_config.get_output_dtype()
dtype_config.get_weight_dtype()
```

Note that scale_max is currently not used because there is
no existing mechanism to enforce this on the observer. In the
future, we can validate this as well if there is a use case.

**Test Plan:**

python test/test_quantization.py TestBackendConfig.test_dtype_with_constraints

python test/test_quantization.py TestQuantizeFx.test_backend_config_scale_min

python test/test_quantization.py TestQuantizeFx.test_backend_config_quantization_range

**Reviewers:** jerryzh168, vkuzo

**Subscribers:** jerryzh168, vkuzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85200
Approved by: https://github.com/jerryzh168
2022-09-28 00:33:29 +00:00
andrewor14
2e81710366 [Quant] Add initial Executorch BackendConfig (#85527)
Summary: This commit adds the initial BackendConfig for backends
PyTorch lowers to through the Executorch stack. This initial
version is only intended to cover the following set of ops:

    quantized::linear_dynamic,
    quantized::add,
    quantized::batch_norm2d,
    quantized::conv2d.new,
    quantized::linear,
    quantized::conv2d_relu.new,
    aten::relu_,
    aten::_adaptive_avg_pool2d,
    aten::_reshape_alias_copy,
    aten::squeeze.dim,
    aten::permute

For now, the `BackendPatternConfig` for each of these ops is
the same as the ones for the corresponding ops in the FBGEMM
`BackendConfig`, though this may change in the future.
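
A hedged sketch of plugging such a BackendConfig into the FX flow; the getter name `get_executorch_backend_config` is assumed from the naming convention of the other backend config getters in this log:

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx
from torch.ao.quantization.backend_config import get_executorch_backend_config  # assumed name

model = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU()).eval()
example_inputs = (torch.randn(1, 16),)
backend_config = get_executorch_backend_config()

prepared = prepare_fx(model, get_default_qconfig_mapping(), example_inputs,
                      backend_config=backend_config)
prepared(*example_inputs)                      # calibrate
quantized = convert_fx(prepared, backend_config=backend_config)
```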

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85527
Approved by: https://github.com/jerryzh168
2022-09-23 21:24:59 +00:00
andrewor14
034f2b4d23 [Quant][fx] Enable FX static quantization for LSTM (#85068)
**Summary:** This commit enables the custom module LSTM path for
FX graph mode static quantization. This has the same flow as eager
mode, which was already previously supported:

```
     torch.nn.LSTM
           | (prepare_fx)
           v
torch.ao.nn.quantizable.LSTM
           | (convert_fx)
           v
 torch.ao.nn.quantized.LSTM
```

The main reason why custom module LSTM is not supported in FX
graph mode quantization today is because its inputs and outputs
are nested tuples, and existing constructs such as observers,
"quantize" nodes, and "dequantize" nodes do not understand how
to handle complex structures.

Note that the approach taken in this commit is only intended to
be a short-term solution highly tailored to the input and output
formats of custom module LSTM. In the future, for the longer-term
solution, we should design a more general QConfig that allows users
to specify complex input and output formats, and enable FX graph
mode quantization to understand arbitrary nested structures and
automatically infer how to transform the graph accordingly.

**Context:**

Today, in FX graph mode static quantization, custom modules are
assumed to have quantized inputs and quantized outputs, with the
exact dtypes derived from the associated QConfig (default quint8).
Since custom modules are currently not handled through the reference
model flow, their observer replacement logic is a little different
from that of normal operators:

```
# (1) Original model
input -> custom_module -> output

# (2) Observed model (after prepare)
input -> obs0 -> custom_module -> obs1 -> output

# (3) Quantized model (after convert)
input -> quant -> quantized_custom_module -> dequant -> output
```

In the last step, input observers are replaced with "quantize"
and output observers are replaced with "dequantize", in contrast
to other non-custom-module patterns where observers are replaced
with "quantize-dequantize" pairs instead. Note that, conceptually,
the output observer `obs1` is really just a DeQuantStub, since no
observation is actually needed.

**Custom module LSTM:**

The reason why custom module LSTM cannot be handled in the same
way is because, unlike other custom modules, its inputs and outputs
are nested tuples instead of single tensors. This is how the existing
custom module code would try to handle LSTMs:

```
# (1) Original model
# input format: (input, (hidden0, hidden1))
# output format:  (output, (hidden0, hidden1))
 input -> lstm -> output
hidden0 -/    \-> hidden0
hidden1 -/    \-> hidden1

# (2) Observed model (after prepare)
 input -> obs0 -> lstm -> obs1  # fails
        hidden0 -/  # missing observer
        hidden1 -/  # missing observer
```

However, this fails today because 1) we assume there is only one input
to the custom module, and so we never end up quantizing `hidden0` and
`hidden1`, and 2) the output observer `obs1` is fed a tuple, which it
does not understand how to handle.

**Short-term fix:**

This commit addresses the above by specifically handling the input
and output structures used by custom module LSTM. For the inputs,
we manually insert observers for `hidden0` and `hidden1` to ensure
all input tensors are quantized.

For the outputs, we split the tuple into its internal nodes, attach
a DeQuantStub to each node, and recombine these DeQuantStubs
according to the original structure. Finally, we must also reroute
consumers of the original LSTM tuple (and its internal nodes, e.g.
`lstm[0]`) to these DeQuantStubs:

```
# (1) Original model
 input -> lstm -> output -> linear0
hidden0 -/    \-> hidden0 -> linear1
hidden1 -/    \-> hidden1 -> linear2

# (2) Observed model (after prepare)
 input -> obs0 -> lstm -> output -> dqstub -> linear0 -> obs3
hidden0 -> obs1 -/    \-> hidden0 -> dqstub -> linear1 -> obs4
hidden1 -> obs2 -/    \-> hidden1 -> dqstub -> linear2 -> obs5

# (3) Reference model (after convert)
 input -> quant -> qlstm -> output -> dequant -> linear0 -> quant -> dequant
hidden0 -> quant -/    \-> hidden0 -> dequant -> linear1 -> quant -> dequant
hidden1 -> quant -/    \-> hidden1 -> dequant -> linear2 -> quant -> dequant

# (4) Quantized model (after lowering)
 input -> quant -> qlstm -> output -> quantized_linear0 -> dequant
hidden0 -> quant -/    \-> hidden0 -> quantized_linear1 -> dequant
hidden1 -> quant -/    \-> hidden1 -> quantized_linear2 -> dequant
```

Note that we choose to insert DeQuantStubs here instead of observers
because these will ultimately be replaced by "dequantize" nodes. This
matches the general custom module behavior, where output observers
are replaced only with "dequantize" nodes (as opposed to the normal
"quantize-dequantize" pair), since custom module outputs are assumed
to already be quantized. Using DeQuantStubs instead of observers also
simplifies the "dequantize" insertion logic. In the future, we should use
DeQuantStubs in place of output observers for custom modules in general.

**Test plan:**
python test/test_quantization.py TestQuantizeFx.test_static_lstm
python test/test_quantization.py TestQuantizeFx.test_static_lstm_consume_tuple

**Reviewers:** jerryzh168, vkuzo

**Subscribers:** jerryzh168, vkuzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85068
Approved by: https://github.com/jerryzh168
2022-09-23 13:53:39 +00:00
Jerry Zhang
4523ac7aa1 [quant][docs][ez] Fix formatting for qconfig_mapping (#85306)
Summary:
att

Test Plan:
visual inspection of generated docs

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85306
Approved by: https://github.com/vkuzo, https://github.com/andrewor14
2022-09-22 02:09:36 +00:00
Zafar
0336308be5 [AO] Callable norm function for sparsifier (#85236)
The `WeightNormSparsifier` currently only supports the L2-norm. This change allows users to specify the function that is applied to compute the norm. In addition, the L1-norm is added via an `.abs` function.

 ## Implementation details

- The functions referred to as "norms" are not strictly norms. For example, the L2-norm of `x` is computed as `F.avg_pool(x * x, ...)`. Similarly, the L1-norm of `x` is computed as `F.avg_pool(x.abs(), ...)`.
- When passing a callable function for the norm, the above assumption must hold: `F.avg_pool(norm_fn(x), ...)` will be applied.

 ## Example:

```python
>>> # L3-norm
>>> l3 = lambda T: T * T * T
>>> sparsifier = WeightNormSparsifier(norm=l3)
>>>
>>> # L0-norm
>>> l0 = lambda T: torch.logical_or(torch.zeros(T.shape), T != 0).to(T.dtype)
>>> sparsifier = WeightNormSparsifier(norm=l0)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85236
Approved by: https://github.com/jcaip
2022-09-21 22:46:25 +00:00
Jerry Zhang
2c285f3e9b [quant][docs] README for FX Graph Mode Quantization (#85070)
Summary:
This is a developer-oriented design doc/README for FX Graph Mode Quantization. The goal is to help new developers
get familiar with the high-level algorithm of FX Graph Mode Quantization and ramp up quickly.

Test Plan:
no test needed

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85070
Approved by: https://github.com/vkuzo
2022-09-21 16:13:44 +00:00
Vasiliy Kuznetsov
09965957cd quantization: align observer dtype with reference model spec (#85345)
Summary:

Before this PR, the `dtype` attribute of observers was not clearly
defined. It originally meant `interface_dtype` in the eager mode
workflow, which is how the codebase used it before this PR.

In the new reference model spec, `dtype` attribute of an observer
represents the `dtype` value which needs to be passed into a `quantize`
function in the reference model spec. This PR aligns the codebase
to this definition of dtype.  In detail:
1. change util functions to interpret `dtype` using the reference model definition
2. change `prepare` to interpret `dtype` using the reference model definition
3. change observers for dynamic quantization to interpret `dtype` using the reference
   model definition.

A future PR (left out of this one to keep LOC small) will deprecate the
`compute_dtype` field and instead expose `is_dynamic` on observers.
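
A small illustration of this definition (sketch only): `obs.dtype` is the dtype handed to the quantize call in the reference model, not an "interface" dtype:

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver

obs = MinMaxObserver(dtype=torch.quint8)
obs(torch.randn(4, 4))
scale, zero_point = obs.calculate_qparams()
# Under the reference model spec, obs.dtype is what a reference "quantize" node uses:
q = torch.quantize_per_tensor(torch.randn(4, 4), float(scale), int(zero_point), obs.dtype)
```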
"

Test plan:

```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Differential Revision: [D39675209](https://our.internmc.facebook.com/intern/diff/D39675209)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85345
Approved by: https://github.com/z-a-f, https://github.com/jerryzh168
2022-09-21 06:34:26 +00:00
Feisi Fu
d8eae6283d Rename 'torch/ao/nn/quantized._reference' to 'torch/ao/nn/quantized/reference'. (#84974)
Currently, the path for reference modules contains `_`, which means it is private (https://github.com/pytorch/pytorch/tree/master/torch/ao/nn/quantized/_reference), but we would like to make it public, since the reference module is now enabled by default in the fx graph mode quantization flow and will be added to the eager mode flow as well in the future.

To make '_reference' public, it should satisfy the [public API rules](https://github.com/pytorch/pytorch/wiki/Public-API-definition-and-documentation).
In the first commit (preparing '_reference' to be public), I:
1. added `__all__` to public modules and packages;
2. made functions that are only used within the file where they are defined private by prefixing their names with `_`.

Fixes #83090. (we rename the 'torch/ao/nn/quantized/_reference', because of migration #81667.)

This is a dup for the #84786.
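
An illustrative check of the new public path after the rename (sketch; the new path is assumed to expose the same classes as the old `_reference` package):

```python
# New public location
from torch.ao.nn.quantized.reference.modules import Linear as RefLinear
print(RefLinear)
```
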
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84974
Approved by: https://github.com/andrewor14, https://github.com/z-a-f
2022-09-16 17:49:07 +00:00
Jerry Zhang
44c30c5d1c [quant][docs] Add example for the error message for fixed qparam ops (#84666)
Summary:
att, since an example makes it clearer what the user needs to do

Test Plan:
local test for the error message

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84666
Approved by: https://github.com/vkuzo, https://github.com/andrewor14
2022-09-14 03:43:00 +00:00
Jesse Cai
d6b2f5c643 [Quant][fx] Remove remove_quant_dequant_pairs and fix tests (#84203)
Summary:
- `remove_quant_dequant_pairs` removes ops when a `quant` is followed by a `dequant`
- It looks like the quantized implementation of `layer_norm` only supports float weights, so we updated the default qconfig to avoid quantizing the weight param.
- Fixes the broken test `test_norm_weight_bias`. This was the only test that broke, because the default qconfig dict we pass in quantizes the weight. I just pulled the native qconfig object and converted it to a dict.
- Adds in qconfig and backend config support for layernorm

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
```

Reviewers:

Subscribers:

Tasks: Fixes https://github.com/pytorch/pytorch/issues/83110

Tags: quant, fx

Differential Revision: [D39395141](https://our.internmc.facebook.com/intern/diff/D39395141)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84203
Approved by: https://github.com/jerryzh168
2022-09-12 16:32:15 +00:00
Vasiliy Kuznetsov
1dabb51a16 quant: add extra_repr to HistogramObserver (#84760)
Summary:

Adds `extra_repr` to `HistogramObserver`. This is useful when debugging
PTQ models because it allows one to quickly check whether a `HistogramObserver`
has received data or not.

Test plan:
```
>>> import torch
>>> obs = torch.ao.quantization.HistogramObserver()
>>> obs(torch.randn(1, 3, 224, 224))
  ...
>>> print(obs)
// before - hard to tell if observer has seen data
HistogramObserver()
// after
HistogramObserver(min_val=-4.778339862823486, max_val=4.311892986297607)
>>>
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84760
Approved by: https://github.com/andrewor14
2022-09-09 21:21:03 +00:00
Jerry Zhang
214a6500e3 [quant][docs] Additonal fixes for quantize_fx docs (#84587)
Summary:
Some more clarifications for the arguments, including linking to object docs (QConfigMapping, BackendConfig) and adding types
in the doc

Test Plan:
```
cd docs
make html
```
and

visual inspection for the generated docs

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84587
Approved by: https://github.com/vkuzo
2022-09-09 15:23:23 +00:00
Zafar
521d1071f8 [quant] Subpackage import in nn.quantized (#84141)
Some of the subpackages were not imported in `torch.nn.quantized`,
which caused some specific cases to fail.
For example, `from torch.nn.quantized import dynamic` would work,
but `import torch; torch.nn.quantized.dynamic` would fail.
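
A minimal reproduction of the two access patterns (after the fix both are expected to work):

```python
import torch

from torch.nn.quantized import dynamic          # worked before the fix
print(torch.nn.quantized.dynamic.Linear)        # raised AttributeError before the fix
```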

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84141
Approved by: https://github.com/andrewor14
2022-09-01 11:35:03 +00:00
Jesse Cai
eabe34cc40 [Quant] Remove warnings from using torch.tensor(value) (#84277)
Summary:

I think zafar made an earlier pull for these changes [here](ce0786add2), but they didn't seem to make it through the migration.
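
For context, the warning being removed comes from calling `torch.tensor` on an existing tensor (a generic PyTorch behavior, not specific to the quantization code paths touched here):

```python
import torch

val = torch.tensor(0.5)
# torch.tensor(existing_tensor) emits a UserWarning recommending clone().detach();
# these are the warning-free equivalents:
copy_a = val.clone().detach()
copy_b = val.detach().clone()
```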

Test Plan:
```
python test/test_quantization.py
```

Reviewers:

Subscribers:

Tasks: https://github.com/pytorch/pytorch/issues/73566

Tags: quant

Differential Revision: [D39145070](https://our.internmc.facebook.com/intern/diff/D39145070)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84277
Approved by: https://github.com/z-a-f
2022-08-30 22:10:14 +00:00
Jesse Cai
d144594512 [Quant][fx] Remove WEIGHT_INDEX_DICT and BIAS_INDEX_DICT (Part 2) (#83853)
Summary:
- Finishes the second part of https://github.com/pytorch/pytorch/pull/83263
- Removes WEIGHT_INDEX_DICT and BIAS_INDEX_DICT from utils.py
- Moves two functions, `node_arg_is_weight` and `node_arg_is_bias`, from prepare.py into utils.py;
convert.py and _equalize.py now use node_arg_is_weight instead of the dictionaries
- Adds in quantization support for `F.groupnorm`.

Add in missing BackendPatternConfigs for layernorm, instancenorm, and groupnorm

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 2b157e0dc4f1553be1f4813b4693db952e6fc558
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83848

Fixes #83093
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83853
Approved by: https://github.com/jerryzh168, https://github.com/andrewor14
2022-08-29 18:08:36 +00:00
Kimish Patel
eebdcb5a2e [Pytorch][quantization][ondevice] Add a wrapper API for server side prep (#83742)
for ondevice quantization

Summary:
This diff just wraps the existing API for on-device quantization

Test Plan:
test/quantization/jit/test_ondevice_quantization.py

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D38868647](https://our.internmc.facebook.com/intern/diff/D38868647)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83742
Approved by: https://github.com/jerryzh168
2022-08-29 17:55:26 +00:00
Kimish Patel
5c7e801c50 [pytorch][on device quant] Finalize method for ondevice quant (#83571)
Summary:
After inserting quant dequant nodes in the graph, we need
1. Insert packed param creation and quantized op
2. Create a packed_params attribute in the top module. For this we need a
graph that is inlined except for the calculate_qparams method calls. But they
can be inlined too, so perhaps we need to make sure no other call methods
exist.
3. Insert SetAttr for the packed param
4. Insert GetAttr for the packed param
5. Use GetAttr output for quantized op where applicable, e.g.
linear_dynamic

The above is added to the quantize_<method-name> method created in the previous
step. Once the above steps are done, clone the method into
quantized_<method-name>.

Modify quantize_<method-name>:
1. Remove all outputs from the method.
2. Run dce
3. Remove all inputs from the method except self.

Modify quantized_<method-name>:
1. Remove all packed_param setAttr nodes.
2. Run dce.

This should result in removal of all nodes that generate packed param.

Test Plan: To be written

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D38771416](https://our.internmc.facebook.com/intern/diff/D38771416)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83571
Approved by: https://github.com/jerryzh168
2022-08-29 17:53:11 +00:00
Kimish Patel
446afb5f9f [On Device Quantization][pytorch]Make insert_quant_dequant support ondevice ptq (#83570)
Summary:
This diff adds a way to:
- clone previously observed method
- Add calls to observer's calculate_qparams methods
- Extract the scale and zero point
- Use them to insert quant dequant nodes

Now for forward method we have
- observe_forward
- quantize_forward

observe_forward is used post-training to record observer statistics. In the
case of dynamic PTQ this requires just running that method once to
update the weight observer statistics.

The quantize_forward method uses the observer
statistics to calculate quantization parameters and applies them to the quant
dequant ops.

Subsequent diffs will replace dequant + op with their quantized op
counterparts and replace quantize ops with the relevant packed params class
where possible.

Test Plan:
To be written

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D38771419](https://our.internmc.facebook.com/intern/diff/D38771419)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83570
Approved by: https://github.com/jerryzh168
2022-08-29 17:51:00 +00:00
Kimish Patel
9189edb3b3 [Quantization][Pytorch] On device quantization support part 1 (#83568)
Summary:
To support on-device quantization, this diff introduces observer
insertion. Specifically, observers are inserted by adding a new method with
the prefix observe_.

The intent is that post-training, this method will be run to record
statistics.

Test Plan:
test_ondevice_quantization.py

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D38771417](https://our.internmc.facebook.com/intern/diff/D38771417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83568
Approved by: https://github.com/jerryzh168
2022-08-29 17:22:30 +00:00
zaf
2f04ba2c7c [quant][ao_migration] torch.nn.qat → torch.ao.nn.qat (#78716)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [X] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [X] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [X] [Current PR] `torch.nn.qat` → `torch.ao.nn.qat`
    - [X] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [X] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- None

Differential Revision: [D36861197](https://our.internmc.facebook.com/intern/diff/D36861197/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861197/)!

Differential Revision: [D36861197](https://our.internmc.facebook.com/intern/diff/D36861197)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78716
Approved by: https://github.com/jerryzh168
2022-08-25 16:50:38 +00:00
zaf
29e83b6599 [quant][ao_migration] torch.nn.quantizable → torch.ao.nn.quantizable (#78717)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [X] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [X] [Current PR] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- `torch/ao/nn/__init__.py` → Changing the imports to lazy.

Differential Revision: [D36861090](https://our.internmc.facebook.com/intern/diff/D36861090/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861090/)!

Differential Revision: [D36861090](https://our.internmc.facebook.com/intern/diff/D36861090)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78717
Approved by: https://github.com/jerryzh168
2022-08-25 16:50:37 +00:00
zaf
b1455f9424 [quant][ao_migration] torch.nn.quantized._reference → torch.ao.nn.quantized._reference (#78715)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [X] [Current PR] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- None

Differential Revision: [D36860927](https://our.internmc.facebook.com/intern/diff/D36860927/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860927/)!

Differential Revision: [D36860927](https://our.internmc.facebook.com/intern/diff/D36860927)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78715
Approved by: https://github.com/jerryzh168
2022-08-25 16:50:36 +00:00
zaf
d32a762147 [quant][ao_migration] torch.nn.quantized.dynamic → torch.ao.nn.quantized.dynamic (#78714)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- [Documentation](docs/source/quantization-support.rst) @vkuzo
- [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10
- [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo
- [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a
- [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a

Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)!

Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714
Approved by: https://github.com/jerryzh168
2022-08-25 16:50:34 +00:00
zaf
c92e5ac95b [quant][ao_migration] torch.nn.quantized.modules → torch.ao.nn.quantized.modules (#78713)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] [Current PR] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- Documentation @vkuzo
  - docs/source/conf.py
  - docs/source/quantization.rst
- [quantize_fx](torch/ao/quantization/quantize_fx.py) @jerryzh168
- [common test routine](test/quantization/ao_migration/common.py) @HDCharles
- JIT stuff @jamesr66a
  - torch/csrc/jit/passes/hoist_conv_packed_params.cpp
  - torch/csrc/jit/passes/quantization/helper.h
  - torch/csrc/jit/serialization/import_source.cpp

Differential Revision: [D38926012](https://our.internmc.facebook.com/intern/diff/D38926012/)

Differential Revision: [D38926012](https://our.internmc.facebook.com/intern/diff/D38926012)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78713
Approved by: https://github.com/jerryzh168
2022-08-25 16:50:33 +00:00
XiaobingSuper
31f151767b add qscheme check for quantization observer (#80126)
Motivation: each quantization observer only supports a limited set of qschemes, so we need to do this check at initialization rather than at run time. For example, for a MinMaxObserver constructed with the qscheme set to **torch.per_channel_affine**, there will be a runtime error when running the calibration step:

```
AttributeError: 'MinMaxObserver' object has no attribute 'ch_axis'
```
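
A sketch of the intended behavior after the check is added (the exact exception type raised at construction is an assumption):

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver

try:
    # An unsupported qscheme should now be rejected at construction time,
    # instead of failing later during calibration with the 'ch_axis' error above.
    obs = MinMaxObserver(qscheme=torch.per_channel_affine)
except (NotImplementedError, ValueError) as e:
    print("rejected at init:", e)
```
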
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80126
Approved by: https://github.com/jerryzh168
2022-08-25 10:03:19 +00:00
Sergii Dymchenko
591222f5d9 Fix use-dict-literal lint (#83718)
Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR should do the change for every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor.
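
The pattern being fixed, for reference:

```python
# Flagged by pylint's use-dict-literal check:
opts = dict(backend="fbgemm", reduce_range=True)
# Preferred:
opts = {"backend": "fbgemm", "reduce_range": True}
```
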
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718
Approved by: https://github.com/albanD
2022-08-24 00:26:46 +00:00
Vasiliy Kuznetsov
58170fb8aa Remove DBR quantization from the codebase (#83642)
Summary:

DBR quantization is a no-go for now because it does not align well with
PyTorch 2.0 plans and we do not want to build yet another tracing system.

Deleting it from the codebase for now since there are no plans to develop
this in the near future. We can bring it back at a later time if necessary.

Test plan:

CI

Differential Revision: [D38839556](https://our.internmc.facebook.com/intern/diff/D38839556)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83642
Approved by: https://github.com/andrewor14, https://github.com/jerryzh168
2022-08-23 15:18:40 +00:00
Jerry Zhang
a419e483b2 [quant][fx] Add support for quantized matmul (#83885)
Summary:
att, probably missed the op during migration to the reference flow
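
A hedged end-to-end sketch of the newly supported pattern (which quantized op matmul ultimately lowers to is not spelled out here):

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class MatMul(torch.nn.Module):
    def forward(self, x, y):
        return torch.matmul(x, y)

m = MatMul().eval()
example_inputs = (torch.randn(4, 4), torch.randn(4, 4))
prepared = prepare_fx(m, get_default_qconfig_mapping(), example_inputs)
prepared(*example_inputs)            # calibrate
quantized = convert_fx(prepared)     # matmul now goes through the reference/quantized flow
print(quantized.graph)
```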

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_qmatmul

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83885
Approved by: https://github.com/andrewor14
2022-08-23 05:46:25 +00:00
Andrew Or
b8496eb411 [Quant] Separate FBGEMM/QNNPACK BackendConfigs (#83566)
Summary: Previously we used a single BackendConfig
(get_native_backend_config) for both the FBGEMM and QNNPACK
backends. However, these two backends have subtle differences
in terms of their requirements that cannot be satisfied using
a single BackendConfig. Therefore, this commit is the first step
towards decoupling the two backends. The real change in
functionality will come in a future commit after DTypeConfig
supports quant_min/quant_max and scale_min/scale_max. Existing
uses of `get_native_backend_config` should not be affected.

Public facing changes:
```
from torch.ao.quantization.backend_config import (
    get_fbgemm_backend_config,
    get_qnnpack_backend_config,
)
fbgemm_backend_config = get_fbgemm_backend_config()
qnnpack_backend_config = get_qnnpack_backend_config()
```

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Reviewers: jerryzh168

Subscribers: jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83566
Approved by: https://github.com/jerryzh168
2022-08-22 16:44:10 +00:00
PyTorch MergeBot
6a9c02339d Revert "[quant][ao_migration] torch.nn.quantized.modules → torch.ao.nn.quantized.modules (#78713)"
This reverts commit 432f037498.

Reverted https://github.com/pytorch/pytorch/pull/78713 on behalf of https://github.com/janeyx99 due to Reverting for breaking (trunk-only) ios build
2022-08-22 07:32:37 +00:00
PyTorch MergeBot
b1a7b67529 Revert "[quant][ao_migration] torch.nn.quantized.dynamic → torch.ao.nn.quantized.dynamic (#78714)"
This reverts commit e6fb97d8ae.

Reverted https://github.com/pytorch/pytorch/pull/78714 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted
2022-08-22 07:30:48 +00:00
PyTorch MergeBot
355d343fa8 Revert "[quant][ao_migration] torch.nn.quantized._reference → torch.ao.nn.quantized._reference (#78715)"
This reverts commit a7344e52b9.

Reverted https://github.com/pytorch/pytorch/pull/78715 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted
2022-08-22 07:29:15 +00:00
PyTorch MergeBot
e9dd4d5adf Revert "[quant][ao_migration] torch.nn.quantizable → torch.ao.nn.quantizable (#78717)"
This reverts commit e0876feb49.

Reverted https://github.com/pytorch/pytorch/pull/78717 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted
2022-08-22 07:26:44 +00:00
PyTorch MergeBot
4cbb1986fe Revert "[quant][ao_migration] torch.nn.qat → torch.ao.nn.qat (#78716)"
This reverts commit 7cd2fa1d38.

Reverted https://github.com/pytorch/pytorch/pull/78716 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted
2022-08-22 07:23:24 +00:00
zaf
7cd2fa1d38 [quant][ao_migration] torch.nn.qat → torch.ao.nn.qat (#78716)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [X] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [X] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [X] [Current PR] `torch.nn.qat` → `torch.ao.nn.qat`
    - [X] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [X] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- None

Differential Revision: [D36861197](https://our.internmc.facebook.com/intern/diff/D36861197/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861197/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78716
Approved by: https://github.com/jerryzh168
2022-08-22 05:33:23 +00:00
zaf
e0876feb49 [quant][ao_migration] torch.nn.quantizable → torch.ao.nn.quantizable (#78717)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [X] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [X] [Current PR] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- None

Differential Revision: [D36861090](https://our.internmc.facebook.com/intern/diff/D36861090/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861090/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78717
Approved by: https://github.com/jerryzh168
2022-08-22 05:31:48 +00:00
zaf
a7344e52b9 [quant][ao_migration] torch.nn.quantized._reference → torch.ao.nn.quantized._reference (#78715)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [X] [Current PR] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- None

Differential Revision: [D36860927](https://our.internmc.facebook.com/intern/diff/D36860927/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860927/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78715
Approved by: https://github.com/jerryzh168
2022-08-22 05:29:23 +00:00
zaf
e6fb97d8ae [quant][ao_migration] torch.nn.quantized.dynamic → torch.ao.nn.quantized.dynamic (#78714)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- [Documentation](docs/source/quantization-support.rst) @vkuzo
- [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10
- [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo
- [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a
- [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a

Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714
Approved by: https://github.com/jerryzh168
2022-08-22 05:22:00 +00:00
zaf
432f037498 [quant][ao_migration] torch.nn.quantized.modules → torch.ao.nn.quantized.modules (#78713)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] [Current PR] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- Documentation @vkuzo
  - docs/source/conf.py
  - docs/source/quantization.rst
- [quantize_fx](torch/ao/quantization/quantize_fx.py) @jerryzh168
- [common test routine](test/quantization/ao_migration/common.py) @HDCharles
- JIT stuff @jamesr66a
  - torch/csrc/jit/passes/hoist_conv_packed_params.cpp
  - torch/csrc/jit/passes/quantization/helper.h
  - torch/csrc/jit/serialization/import_source.cpp

Differential Revision: [D36860145](https://our.internmc.facebook.com/intern/diff/D36860145/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78713
Approved by: https://github.com/jerryzh168
2022-08-22 01:38:55 +00:00
Jerry Zhang
13f42069a8 [quant][fx][refactor] Rename qconfig_utils.py to qconfig_mapping_utils.py in torch/ao/quantization/fx (#83369)
Summary:
As titled; it seems more appropriate to name it qconfig_mapping_utils. We probably also want to move
the functions in torch/ao/quantization/qconfig_mapping_utils.py to torch/ao/quantization/fx/qconfig_mapping_utils.py as well.

Test Plan:
python test/test_quantization.py TestQuantizeFx

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83369
Approved by: https://github.com/andrewor14
2022-08-19 21:36:05 +00:00
Daniel Recoskie
7453019e79 Remove duplicate_dequantize_node and remove_extra_dequantize (#83611)
Summary: removed duplicate_dequantize_node and remove_extra_dequantize

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels

Reviewers: jerryzh168

Subscribers:

Tasks:

Tags:

Differential Revision: [D38841052](https://our.internmc.facebook.com/intern/diff/D38841052)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83611
Approved by: https://github.com/jerryzh168
2022-08-19 16:59:55 +00:00
vspenubarthi
88e0165d08 [ao] Added Equalization QConfig generation to ModelReport class (#83698)
Summary: This adds the capability to generate a QConfigMapping based on
the suggestions of the ModelReport API for the user to use. The only
dependency of this feature is that calibration is run before the
QConfigMapping is generated; there is no dependency on report
generation, other than that the observers cannot be removed before
this is called. This maps module fqns to EqualizationQConfigs instead of regular
QConfigs.

Example Usage (after calibration):

```
quantization_mapping = mod_report.generate_qconfig_mapping()
equalization_mapping = mod_report.generate_equalization_mapping()

prepared_model = quantize_fx.prepare_fx(model, quantization_mapping, example_input, _equalization_config=equalization_mapping)

quantized_model = quantize_fx.convert_fx(prepared_model)
```

This was tested by ensuring that the suggestions generated in the
QConfigMapping are:
1. Correct according to the set backend and data passed through
2. Able to be prepared and converted as a proper config (i.e., a valid config)
The test for this is a part of the TestFxModelReportClass test suite.

Test Plan: python test/test_quantization.py TestFxModelReportClass.test_equalization_mapping_generation

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83698
Approved by: https://github.com/jerryzh168
2022-08-19 02:16:01 +00:00
Jerry Zhang
784c47fbee [quant][fx][refactor] Move ObservationType to backend_config.py (#83368)
Summary:
Now that we have a separate file to define BackendConfig related classes, we can move ObservationType to that file as well.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83368
Approved by: https://github.com/andrewor14
2022-08-19 01:22:10 +00:00
vspenubarthi
5e715be17e [ao] Added Quantization QConfig generation to ModelReport class (#83688)
Summary: This adds the capability to generate a QConfigMapping based on
the suggestions of the ModelReport API for the user to use. The only
dependency of this feature is that calibration is run before the
QConfigMapping is generated; there is no dependency on report
generation, other than that the observers cannot be removed before
this is called.

Example Usage (after calibration):
```
mapping = mod_report.generate_qconfig_mapping()

prepared_model = quantize_fx.prepare_fx(model, mapping, example_input)

quantized_model = quantize_fx.convert_fx(prepared_model)
```

This was tested by ensuring that the suggestions generated in the
QConfigMapping are:
1. Correct according to the set backend and data passed through
2. Able to be prepared and converted as a proper config (is a valid
config)

The test for this is a part of the TestFxModelReportClass test suite.

Test Plan: python test/test_quantization.py TestFxModelReportClass.test_qconfig_mapping_generation

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83688
Approved by: https://github.com/jerryzh168
2022-08-18 23:12:05 +00:00
zaf
78c8a0d752 [quant][ao_migration] torch.nn.quantized.functional → torch.ao.nn.quantized.functional (#78712)
Context: In order to avoid cluttering the `torch.nn` namespace,
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
  - [X] [Current PR] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
  - [ ] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
  - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
  - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
  - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
  - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
  - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
  - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
  - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
    - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
    - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

The majority of the files are just moved to the new location.
However, specific files need to be double-checked:

- [Documentation](docs/source/quantization-support.rst) @vkuzo
- [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10

Differential Revision: [D36792967](https://our.internmc.facebook.com/intern/diff/D36792967/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36792967/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78712
Approved by: https://github.com/jerryzh168
2022-08-18 17:51:54 +00:00
Daniel Recoskie
ea2183f0ea removed duplicate_quantize_dynamic_node (#83459)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83459
Approved by: https://github.com/jerryzh168
2022-08-17 21:26:12 +00:00
Jerry Zhang
3586af8adc [quant] Remove unused quantize handler definitions (#83360)
Summary:
`CommonQuantizeHandler` was added previously to make some of the refactoring toward the reference quantized model flow easier. Now that we have
fully migrated to the reference quantized model flow, it is no longer needed, so we can remove it.

Also updated some comments

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83360
Approved by: https://github.com/andrewor14
2022-08-16 23:13:01 +00:00
Jesse Cai
d4bd88b64b [Quant][fx] Remove WEIGHT_INDEX_DICT and BIAS_INDEX_DICT (#83263)
Summary:

This change adds input_type_to_index mappings to the backend patterns for `nn.functional.linear`, `nn.functional.conv1d`, `nn.functional.conv2d`, and `nn.functional.conv3d`.

This lets us remove `WEIGHT_INDEX_DICT` and `BIAS_INDEX_DICT` from `prepare.py`.
Instead we pass around `backend_config` and check whether an arg is the weight/bias against that config.
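
A hedged sketch of the idea; `_set_input_type_to_index` is assumed to be the (private) BackendPatternConfig setter used for this, and the indices follow the `F.linear(input, weight, bias)` signature:

```
import torch.nn.functional as F
from torch.ao.quantization.backend_config import BackendPatternConfig

# In F.linear(input, weight, bias), weight is positional arg 1 and bias is arg 2.
# Declaring this on the pattern lets prepare_fx consult the BackendConfig
# instead of the hard-coded WEIGHT_INDEX_DICT / BIAS_INDEX_DICT.
linear_config = BackendPatternConfig(F.linear)
linear_config._set_input_type_to_index({"weight": 1, "bias": 2})
```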

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Reviewers:
@andrewor14

Subscribers:

Tasks:

Tags: quant, fx

Differential Revision: [D38705516](https://our.internmc.facebook.com/intern/diff/D38705516)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83263
Approved by: https://github.com/andrewor14
2022-08-15 14:23:22 +00:00
joncrall
4618371da5 Integrate xdoctest - Rebased (#82797)
This is a new version of #15648 based on the latest master branch.

Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR.

In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.)
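
For context, a toy example of how the skip directive is placed inside a failing doctest (the reason string is illustrative):

```
def add(a, b):
    """
    Example:
        >>> # xdoctest: +SKIP("segfaults on some CI configurations")
        >>> add(1, 2)
        3
    """
    return a + b
```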

Fixes https://github.com/pytorch/pytorch/issues/71105

@ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797
Approved by: https://github.com/ezyang
2022-08-12 02:08:01 +00:00
Jerry Zhang
bce1540f1f [quant][fx] Add more detailed docs for prepare_fx/prepare_qat_fx/convert_fx (#83132)
Summary:
att

Test Plan:
visual inspection of generated docs page
https://pytorch.org/docs/stable/quantization-support.html

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83132
Approved by: https://github.com/andrewor14
2022-08-11 16:20:30 +00:00
vspenubarthi
a2ca89331f [ao] Create framework for ModelReport Qconfig Generation (#83091)
Summary: This creates the framework in the ModelReport API for the
generation of QConfigs by the ModelReport instance based on suggestions.
This functionality will eventually be added into the report generation
or be something that complements it, but for now it will be an
independent call for API stability and to be able to better modularize
the features as it stabilizes.

This also adds the framework for the relevant test function and a note
in the README about what future changes are planned for this new method in
the ModelReport API.

Test Plan: python test/test_quantization.py TestFxModelReportClass.test_qconfig_generation

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83091
Approved by: https://github.com/HDCharles
2022-08-11 00:11:50 +00:00
vspenubarthi
888c1a143f [ao] Added some additional / future tasks for ModelReport API to README (#83088)
Summary: I added some additional tasks for further improving the
ModelReport API to the README. These are tasks that I will try to
complete in the next few weeks, but they can also help provide future
direction later.

Test Plan: No code added

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83088
Approved by: https://github.com/andrewor14
2022-08-10 17:13:25 +00:00
macandro96
03abcf2317 [ao][sparsity] Data Sparsity with Post Training Quantization (#82759)
Implementation of `post_training_sparse_quantize`, which takes in a model
and applies sparsification and quantization only to `embeddings` & `embeddingbags`.
The quantization step can happen before or after sparsification, depending on the `sparsify_first` argument.
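
A minimal sketch of the intended usage; the import paths, the sparsifier class, and the exact signature are assumptions for illustration, while the model/sparsifier/`sparsify_first` arguments come from the description above:

```
import torch.nn as nn
# hypothetical import locations for the utility and sparsifier described above
from torch.ao.sparsity._experimental.data_sparsifier.data_norm_sparsifier import DataNormSparsifier
from torch.ao.sparsity._experimental.data_sparsifier.quantization_utils import post_training_sparse_quantize

# toy model containing an embedding bag; only embeddings/embeddingbags are touched
model = nn.Sequential(nn.EmbeddingBag(1000, 16), nn.Linear(16, 1))

# sparsify the embedding weights first, then quantize them
post_training_sparse_quantize(model, DataNormSparsifier, sparsify_first=True)
```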

Test Plan:
```python test/test_ao_sparsity.py TestQuantizationUtils```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82759
Approved by: https://github.com/z-a-f
2022-08-10 16:51:35 +00:00
Yixin Bao
2e1929709d Back out "[Quant][fx] Remove dequant-quant around getitem" (#83147)
Differential Revision: D38566988

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83147
Approved by: https://github.com/soumith
2022-08-10 09:41:34 +00:00
Sergii Dymchenko
a0b3854548 Change seperate -> separate (#83056)
One instance was caught by Meta-internal "exact-word-misspell" linter in D38505529.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83056
Approved by: https://github.com/huydhn, https://github.com/seemethere
2022-08-09 23:11:34 +00:00
Andrew Or
782f3489c6 [Quant][fx][bc-breaking] Integrate BackendConfig with quantization flow (part 2) (#82557)
This is part 2 of the effort to replace `backend_config_dict` with
a python config object, a more formal and robust API that leads to
better user experience. This commit integrates the `BackendConfig`
implemented in part 1 (https://github.com/pytorch/pytorch/pull/81469)
with the existing FX graph mode quantization flow.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

BC-breaking Notes:

Before:
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.backend_config import ObservationType
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

dtype_config = {
    "input_dtype": torch.quint8,
    "output_dtype": torch.quint8
    "weight_dtype": torch.qint8,
    "bias_dtype": torch.float,
}

backend_config_dict = {
    "name": "my_backend",
    "configs": [{
        "pattern": torch.nn.Linear,
        "observation_type": ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT,
        "dtype_configs": [dtype_config],
        "root_module": torch.nn.Linear,
        "reference_quantized_module": torch.nn.quantized._reference.Linear,
        "qat_module": torch.nn.qat.Linear,
    }]
}

m = MyModel()
qconfig_mapping = get_default_qconfig_mapping()
example_inputs = (torch.rand(3, 3),)
m = prepare_fx(
    m, qconfig_mapping, example_inputs,
    backend_config_dict=backend_config_dict)
m = convert_fx(m, backend_config_dict=backend_config_dict)
```

After:
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.backend_config import (
    BackendConfig,
    BackendPatternConfig,
    DTypeConfig,
    ObservationType,
)
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)

backend_config = BackendConfig("my_backend").set_backend_pattern_config(
    BackendPatternConfig(torch.nn.Linear)
        .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT)
        .add_dtype_config(dtype_config)
        .set_root_module(torch.nn.Linear)
        .set_reference_quantized_module(torch.nn.quantized._reference.Linear)
        .set_qat_module(torch.nn.qat.Linear))

m = MyModel()
qconfig_mapping = get_default_qconfig_mapping()
example_inputs = (torch.rand(3, 3),)
m = prepare_fx(m, qconfig_mapping, example_inputs, backend_config=backend_config)
m = convert_fx(m, backend_config=backend_config)
```

Reviewers: jerryzh168

Subscribers: jerryzh168, supriyar

Differential Revision: [D38471932](https://our.internmc.facebook.com/intern/diff/D38471932)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82557
Approved by: https://github.com/jerryzh168
2022-08-08 18:55:50 +00:00
asl3
b91ff5e361 [quant] Remove unneeded lines from APoT linear (#82909)
### Summary
Remove unnecessary lines from APoT linear module

### Test Plan
Run unit tests with:` python /pytorch/test/quantization/core/experimental/test_linear.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82909
Approved by: https://github.com/jerryzh168
2022-08-08 11:30:24 +00:00
vspenubarthi
86437b8631 [ao] Updated ModelReportVisualizer per-channel line plot (#82918)
Summary: Before, the line plot for the ModelReportVisualizer used to
plot a different line for each channel. However, for models that have a
lot of channels, this can get really hard to read and parse and doesn't
provide much valuable information.

Now, we just have a single value per module that is the average of the
500 channels.

We also considered plotting 3 lines (a min line, a max line, and an
average line) but the issue was that large outliers could result in one
of the lines completely messing up the scale and the other two not being
visible. As a result, it made sense to do an average and let the user
use the report data to generate the other two if they wished to do so.

This was tested visually in an ipynb notebook.

Test Plan: Tested visually in an ipynb notebook

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82918
Approved by: https://github.com/jerryzh168
2022-08-06 02:55:43 +00:00
vspenubarthi
da5272ef3b [ao] Fix per-channel histogram visualization in ModelReportVisualizer (#82917)
Summary: There was an issue with per-channel visualizations in the
ModelReportVisualizer: in specific scenarios in which there were
only per-channel features for a module, it would fail to
get the channel-by-channel info.

After digging through the code, the core reason was a for loop that was
enumerating on the `tensor_table` (tensor level info) even in the
scenario in which we only had per-channel info.

This was fixed, and tested in a Bento to ensure expected functionality.

Test Plan: Tested visually

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82917
Approved by: https://github.com/jerryzh168
2022-08-06 02:36:55 +00:00
vspenubarthi
2e74b51a4e [ao] Added ModelReportVisualizer info to README for ModelReport (#82796)
Summary: This adds information on how the ModelReportVisualizer
integrates into the ModelReport API to the README file for the
ModelReport folder. It updates the high-level usage flow, includes
information on the API and some of the important public methods and what
they do, updates the folder structure to include the new
`model_report_visualizer.py` file, and updates the tests section
to highlight that there are high-level tests for the
ModelReportVisualizer as well.

There really aren't any direct tests for this since it's just updates to
a README, but the tests for the ModelReportVisualizer are relevant and
were run to make sure table generation was still occurring properly.

Test Plan: python test/test_quantization.py TestFxModelReportVisualizer

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82796
Approved by: https://github.com/jerryzh168
2022-08-05 19:29:50 +00:00
vspenubarthi
5ca098fe38 [ao] Changed ratio of channels needed for input-weight rec (#82795)
Summary: After working on a tutorial and spending more time
experimenting with the input-weight equalization recommendation feature,
I realized that requiring half of the channels to benefit from
input-weight equalization was too high a bar, and that it should be a bit more lenient.
Based on the example I played around with in an internal tutorial, I
found that a threshold somewhere in the 0.3 - 0.4 range made more sense. In the
future, more in-depth testing and experimenting with more models may
help further fine-tune this fraction of channels that would benefit.

Test Plan: python test/test_quantization.py TestFxDetectInputWeightEqualization

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82795
Approved by: https://github.com/jerryzh168
2022-08-05 16:34:30 +00:00
vspenubarthi
95c7fc395b [ao] Fix punctuation issue with Dynamic Static Report (#82794)
Summary: This fixes a punctuation issue with the Dynamic Static Detector,
which was missing a period when suggesting the use of a dynamic quantize per
tensor layer.

Quick grammar fix; no other changes to code.

Test Plan: python test/test_quantization.py TestFxModelReportDetectDynamicStatic

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82794
Approved by: https://github.com/jerryzh168
2022-08-05 16:33:59 +00:00
Andrew Or
8f38f6773a [Quant][fx] Remove dequant-quant around getitem (#82675)
Summary: https://github.com/pytorch/pytorch/issues/82480 saw
unnecessary dequant-quant pairs around the getitem op, which led
to significant slowdowns. This commit simply removes this pair in
the lowering step, since getitem already handles quantized inputs.

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_getitem_no_dequant_quant

Reviewers: jerryzh168

Subscribers: jerryzh168, supriyar

Tasks: https://github.com/pytorch/pytorch/issues/82480

Differential Revision: [D38427508](https://our.internmc.facebook.com/intern/diff/D38427508)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82675
Approved by: https://github.com/jerryzh168
2022-08-05 15:01:51 +00:00
asl3
34103a3033 Refactor quant levels visualization (#82790)
### Summary
Refactors quantization levels visualization function to include alpha qparam in parameters of `float_to_apot` function call (due to `float_to_apot` function update). Also adds additional detail to the documentation for `quant_levels_visualization`.

### Test Plan
Print visualization by calling `quant_levels_visualization` function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82790
Approved by: https://github.com/jerryzh168
2022-08-04 22:28:50 +00:00
Hao Li
aa40503954 Add Custom Module Support List (#82606)
Summary:
Add a global custom module support list for the users to specify the modules they want the equalization process to support.

To use this list, import it from the _equalize.py file and append modules to it.
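
A hedged sketch of the registration step (the list name below is an assumption; check `_equalize.py` for the actual symbol):

```
import torch.nn as nn
# assumed name/location of the global support list added by this PR
from torch.ao.quantization.fx._equalize import CUSTOM_MODULE_SUPP_LIST

class MyEqualizableLinear(nn.Linear):
    pass

# make the equalization process aware of the custom module
CUSTOM_MODULE_SUPP_LIST.append(MyEqualizableLinear)
```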

Unittest passed to check global support list:

https://pxl.cl/28RKG

Test Plan: buck1 test mode/dev //on_device_ai/odai/tests/transforms:test_transforms -- --exact 'on_device_ai/odai/tests/transforms:test_transforms - test_custom_support_list (on_device_ai.odai.tests.transforms.test_input_weight_for_turing.TestInputWeight)'

Reviewed By: jerryzh168

Differential Revision: D38264244

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82606
Approved by: https://github.com/HDCharles
2022-08-03 17:48:51 +00:00
asl3
4680047001 Modify LinearAPoT matrix multiplication bitshift to support all k (#82409)
### Summary
This PR modifies the bitshift implementation of matrix multiplication for LinearAPoT in `bitshift_mul` to support all input values of k. It also fixes the row/col dimension assignment for the `mat_mul` method.

### Test Plan
Run unit tests with: `python test/quantization/core/experimental/test_linear.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82409
Approved by: https://github.com/dzdang
2022-07-28 20:40:26 +00:00
asl3
a1e6325149 Implement linear module for APoT quantization (#82105)
### Summary
Implement linear module to support APoT quantization. Use the bitshifting method discussed in the APoT paper https://arxiv.org/pdf/1909.13144.pdf to multiply PoT terms in the APoT quantized weight tensor with the uniformly quantized activation tensor, to demonstrate an alternative to matrix multiplication.

Multiplication using bitshifting for PoT:

<img width="340" alt="Screen Shot 2022-07-25 at 12 44 26 PM" src="https://user-images.githubusercontent.com/68875504/180831050-ff849bca-8eb0-4b69-9b7f-c6c94a4cdfb5.png">

### Test Plan
Run unit tests with: `python /pytorch/test/quantization/core/experimental/test_linear.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82105
Approved by: https://github.com/HDCharles
2022-07-28 13:09:59 +00:00
asl3
13ad4739a6 [quant] Implement PTQ for APoT FakeQuant (#81040)
### Summary:
This PR implements PTQ for APoT FakeQuant. It runs models (Resnet-18 pre-trained model, ImageNet dataset) to compare accuracy metrics for different qconfig settings of uniform vs. APoT quantized activation and weight.

According to the collected accuracy stats, model #2 (uniform activation and APoT weight) appears to have a slight improvement in accuracy compared to model #1 (uniform activation and uniform weight) for 8-bit and significant improvement for 4-bit (see "Accuracy Stats" section below).

### Test Plan:
Run models with: `python test/quantization/core/experimental/fx_graph_mode_apot.py`

### Accuracy Stats:
8-bit (Uniform int8, APoT b = 8 k = 2)

**Model #1:** Uniform activation, uniform weight (FX Graph Mode quantized)
Evaluation accuracy on test dataset: 64.43% (Top-1), 85.62% (Top-5)

**Model #2:** Uniform activation, APoT weight (FX Graph Mode quantized)
Evaluation accuracy on test dataset: 64.51% (Top-1), 85.78% (Top-5)

**Model #3:** APoT activation, APoT weight (FX Graph Mode quantized)
Evaluation accuracy on test dataset: 64.32% (Top-1), 85.78% (Top-5)

4-bit (Uniform int4, APoT b = 4 k = 2)

**Model #1:** Uniform activation, uniform weight (FX Graph Mode quantized)
Evaluation accuracy on test dataset: 45.63% (Top-1), 71.96% (Top-5)

**Model #2:** Uniform activation, APoT weight (FX Graph Mode quantized)
Evaluation accuracy on test dataset: 64.24% (Top-1), 85.56% (Top-5)

**Model #3:** APoT activation, APoT weight (FX Graph Mode quantized)
Evaluation accuracy on test dataset: 45.40% (Top-1), 76.21% (Top-5)

**Full Precision model (FX Graph Mode quantized)**
Evaluation accuracy on test dataset: 69.76% (Top-1), 89.08% (Top-5)

**Eager mode quantized model**
Evaluation accuracy on test dataset: 69.49% (Top-1), 88.90% (Top-5)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81040
Approved by: https://github.com/jerryzh168
2022-07-28 07:21:31 +00:00
HDCharles
8d82367f52 [ao][sparsity][fx] make sparse prepare->quant prepare compose (#81993)
Summary: The primary issue was that fusion and matching had to be
updated to handle parametrized modules

Test Plan: python test/test_ao_sparsity.py TestFxComposability

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81993
Approved by: https://github.com/jerryzh168
2022-07-27 22:09:29 +00:00
macandro96
e0e3a98555 [ao][sparsity] README for base data scheduler class (#82131)
The readme file contains an overview of the base data scheduler.
It consists of code snippets and instructions on how to create your own custom
data scheduler and how to use it while training a model.

Test Plan: None
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82131
Approved by: https://github.com/z-a-f
2022-07-27 20:44:38 +00:00
HDCharles
8533951f09 [ao][sparsity][fx] make quant prepare -> sparse prepare compose (#81992)
Summary: sparse_prepare automatically composes with quantized prepare
even in cases with fusion. However, the convert step needed to be updated to handle parametrized
modules.

Test Plan: python test/test_ao_sparsity.py TestFxComposability

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81992
Approved by: https://github.com/jerryzh168
2022-07-27 17:14:13 +00:00
macandro96
ad788662b1 [ao][sparsity] README for activation sparsifier (#81814)
The README contains an introduction and details on the activation sparsifier. It also contains
code snippets and examples on using the activation sparsifier.

Test Plan: None
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81814
Approved by: https://github.com/z-a-f
2022-07-27 17:04:45 +00:00
macandro96
7af2baffce [ao][sparsity] README for data sparsifier lightning callbacks (#81813)
The README contains instructions on using the lightning callbacks to sparsify the
model during and post training.

Test Plan: None
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81813
Approved by: https://github.com/z-a-f
2022-07-27 16:50:01 +00:00
macandro96
0d0bd0e3c6 [ao][sparsity] README for BaseDataSparsifier (#82130)
The readme file contains an overview of the base data sparsifier and its implementation details.
It also consists of code snippets and instructions on how to create your own custom
data sparsifier.

Test Plan: None
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82130
Approved by: https://github.com/z-a-f
2022-07-27 16:39:54 +00:00
macandro96
7391dec96a [ao][sparsity] Bug Fix: Retain mask and config while replacing data in data sparsifier (#82129)
Bug: The config and mask were being recreated while replacing data on the data sparsifier.

Fix: Introduced an argument `reuse_mask` which, when set to `True`, uses the old mask. If a new config is not
specified, the data sparsifier by default uses the old config with the new data.
Also added unit tests to check this bug.

Test Plan:
```python test/test_ao_sparsity.py TestBaseDataSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82129
Approved by: https://github.com/z-a-f
2022-07-27 16:37:28 +00:00
macandro96
85a9e7367c [ao][sparsity] Store mask as sparse coo during serialization for the activation sparsifier (#82181)
The stored mask is dumped as `torch.sparse_coo` while serializing. While restoring the state,
the mask is converted to a dense tensor again.
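
A minimal sketch of the round trip described above:

```
import torch

mask = (torch.rand(4, 8) > 0.5).float()
stored = mask.to_sparse()      # saved as a torch.sparse_coo tensor during serialization
restored = stored.to_dense()   # converted back to a dense tensor when the state is restored
assert torch.equal(mask, restored)
```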

Test Plan:
```python test/test_ao_sparsity.py TestActivationSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82181
Approved by: https://github.com/z-a-f
2022-07-26 23:01:35 +00:00
macandro96
18e8bc9b72 [ao][sparsity] Store mask as sparse coo during serialization for the data sparsifier (#82180)
The stored mask is dumped as `torch.sparse_coo` while serializing. While restoring the state,
the mask is converted to a dense tensor again.

Test Plan:
```python test/test_ao_sparsity.py TestBaseDataSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82180
Approved by: https://github.com/z-a-f
2022-07-26 22:59:03 +00:00
asl3
a01fb5392f Modify APoT dequantize method (#82126)
### Summary
Modify APoT dequantize method to correctly add dequantized values to result numpy array and retain original tensor dimensions

### Test Plan
Run unit tests with: `python test/quantization/core/experimental/test_quantizer.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82126
Approved by: https://github.com/HDCharles
2022-07-26 20:26:51 +00:00
vspenubarthi
04fc3e4c04 [ao] Add histogram visualization capability to ModelReportVisualizer (#81975)
Summary: This adds the capability to visualize the histogram in the
ModelReportVisualizer. You can visualize the histogram of a single
feature for a single layer (for example, if you want to see the
distribution of some data across all channels), or for some feature
across multiple layers of a similar kind. All channel data is merged
together to plot one large distribution. The user gets to decide the
number of bins the histogram has, and it will create that many equally
spaced bins.

Expected Usage
```
mod_rep_visualizer.generate_histogram_visualization(<feature_name>, <module_name>)
```

You can also filter the modules so that only modules with a certain
substring will have their features represented in the plot.

> **This is intended to be used in a `.ipynb` style notebook**

The tests for this were just visual inspection for two reasons:
    1.) This method does not return anything, it just generates the
    visualization plot
    2.) All the data to create the plot visualization is gotten from
    `generate_filtered_tables`, which is already tested, so testing all that
    for this again would be redundant.

Example Image outputs are pasted below in the PR thread.

Test Plan: Visual Test

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81975
Approved by: https://github.com/jerryzh168
2022-07-26 19:57:43 +00:00
vspenubarthi
d7d04fd38c [ao] Add line plot visualization capability to ModelReportVisualizer (#81974)
Summary: This adds the capability to visualize the line plot in the
ModelReportVisualizer. You can visualize line plots of a single feature,
and this feature can either be a per-tensor or per-channel feature. If
the feature is per tensor, then the idx of the module is plotted on the
x axis and the values of the feature on the y axis. If the feature is per
channel, then **an** idx of the module (the first one) will be the value on
the x axis and the corresponding feature val on the y axis, and there
will be a separate line for each channel, and a legend denoting which
line belongs to which channel.

Expected Usage
```
mod_rep_visualizer.generate_plot_visualization(<feature_name>)
```

You can also filter the modules so that only modules with a certain
substring will have their features represented in the plot.

> **This is intended to be used in a `.ipynb` style notebook**

The tests for this were just visual inspection for two reasons:
    1.) This method does not return anything, it just generates the
    visualization plot
    2.) All the data to create the plot visualization is gotten from
    `generate_filtered_tables` which is already tested, so testing all that
    for this again would be redundant.

Example Image outputs are pasted below in the PR thread.

Test Plan: Visual Test

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81974
Approved by: https://github.com/jerryzh168
2022-07-26 19:55:26 +00:00
vspenubarthi
eac20a45fa [ao] Added table visualization capability to ModelReportVisualizer (#81973)
Summary: This adds the capability to visualize the table of information
in the ModelReportVisualizer. This allows the user to filter based on
module name pattern match or feature name pattern match and the
implemented method `generate_table_visualization` prints out the table
in a string format that is easy to parse.

Expected Usage
```
mod_rep_visualizer.generate_table_visualization()
```
Can also pass in optional filters as well if needed.

The tests for this were just visual inspection for two reasons:
1.) This method does not return anything, it just generates the
visualization
2.) All the data to create the table visualization is gotten from
`generate_filtered_tables` which is already tested, so testing all that
for this again would be redundant.

Example Printed Output
```
Tensor Level Information
  idx  layer_fqn        input_activation_global_max    input_activation_global_min    input_weight_channel_axis    input_weight_threshold    outlier_detection_channel_axis    outlier_detection_ratio_threshold    outlier_detection_reference_percentile    weight_global_max    weight_global_min
-----  -------------  -----------------------------  -----------------------------  ---------------------------  ------------------------  --------------------------------  -----------------------------------  ----------------------------------------  -------------------  -------------------
    1  block1.linear                        1.9543                        -1.33414                            1                       0.5                                 1                                  3.5                                      0.95             0.380521           -0.568476
    2  block2.linear                        1.81486                        0                                  1                       0.5                                 1                                  3.5                                      0.95             0.521438           -0.0256195

 Channel Level Information
  idx  layer_fqn        channel    constant_batch_counts    input_activation_per_channel_max    input_activation_per_channel_min    input_weight_channel_comparison_metrics  input_weight_equalization_recommended      outlier_detection_batches_used  outlier_detection_is_sufficient_batches      outlier_detection_percentile_ratios  outliers_detected      weight_per_channel_max    weight_per_channel_min
-----  -------------  ---------  -----------------------  ----------------------------------  ----------------------------------  -----------------------------------------  ---------------------------------------  --------------------------------  -----------------------------------------  -------------------------------------  -------------------  ------------------------  ------------------------
    1  block1.linear          0                        0                            1.9543                             -1.33414                                    0.956912  True                                                                    1  True                                                                     1.77489  False                                0.300502                -0.568476
    2  block1.linear          1                        0                            1.14313                            -0.756184                                   1.04378   True                                                                    1  True                                                                     2.07887  False                                0.336131                -0.261025
    3  block1.linear          2                        0                            0.653274                           -0.937748                                   1.10837   True                                                                    1  True                                                                     1.00712  False                                0.380521                -0.183536
    4  block2.linear          0                        0                            1.81486                             0                                          0.542731  True                                                                    1  True                                                                     1.78714  False                                0.13552                 -0.0256195
    5  block2.linear          1                        0                            1.72578                             0                                          0.505475  True                                                                    1  True                                                                     1.40475  False                                0.485536                 0.352621
    6  block2.linear          2                        0                            1.7284                              0                                          0.909304  True                                                                    1  True                                                                     1.40392  False                                0.521438                 0.0906605
```

Test Plan: Visual Test

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81973
Approved by: https://github.com/jerryzh168
2022-07-26 19:51:45 +00:00
vspenubarthi
b9ed224a1b [ao] Added filtered table generation capability to ModelReportVisualizer (#81673)
Summary: This adds the ability to generate and display the collected
statistics in a table format for the ModelReportVisualizer. The output
of this is a dictionary containing two keys, mapping to a tensor stats
table and channel stats table respectively.

The two ways you can filter are by module_fqn, which only includes modules
whose fqn contains the `module_fqn_filter` substring, or by feature filter, which only includes
features that contain the `feature_filter` substring.

Expected Use:
```
table_dict = mod_rep_visualizer.generate_filtered_tables()

tensor_table = table_dict[ModelReportVisualizer.TABLE_TENSOR_KEY]
channel_table = table_dict[ModelReportVisualizer.TABLE_CHANNEL_KEY]
```

Headers for the Tensor level info:
```
         idx  layer_fqn  feature_1   feature_2   feature_3   .... feature_n
        ----  ---------  ---------   ---------   ---------        ---------
```

Headers for the channel level info:
```
         idx  layer_fqn  channel  feature_1   feature_2   feature_3   .... feature_n
        ----  ---------  -------  ---------   ---------   ---------        ---------
```

The reason we split this up into two tables is that, with a design
where everything is in one table, it is ambiguous and easy to mix up
whether a tensor-level stat is actually a tensor-level stat or a
per-channel stat, since we would have a row for each channel.

Also changed some of the framework to abstract the generation of the
tables out of the actual visualization, making the API much easier for the
user to digest and parse.

Test Plan: python test/test_quantization.py TestFxModelReportVisualizer.test_generate_table

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81673
Approved by: https://github.com/jerryzh168
2022-07-26 19:42:08 +00:00
macandro96
fc78976921 [ao][sparsity] README for data sparsifier benchmarking (#81781)
The README contains the results of the benchmarking exercise and areas of future work.
It also contains instructions to run the benchmarking scripts to reproduce the results.
Also, contains other information such as requirements, machine config etc.

Test Plan: None
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81781
Approved by: https://github.com/z-a-f
2022-07-26 16:57:09 +00:00
macandro96
58c330fcbb [ao][sparsity] Data Sparsifier Benchmarking: Forward time evaluation of the sparse dlrm model with torch.sparse (#81780)
The objective is to check if introducing torch sparse coo in the sparse dlrm model improves the inference time
over different sparsity levels.
The ```evaluate_forward_time.py``` script makes use of the ```sparse_model_metadata.csv``` file dumped by
```evaluate_disk_savings.py```. It records the forward time for the sparse dlrm model with and without sparse coo
tensors and dumps it into a csv file, ```dlrm_forward_time_info.csv```.

**Results**: The dlrm model with sparse coo tensor is slower (roughly 2x).

After running `evaluate_memory_savings.py`, run: `python evaluate_forward_time.py --raw_data_file=<path_to_raw_data_txt_file> --processed_data_file=<path_to_kaggleAdDisplayChallenge_processed.npz> --sparse_model_metadata=<path_to_sparse_model_metadata_csv>`

Dependencies: DLRM Repository (https://github.com/facebookresearch/dlrm)

Test Plan: None
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81780
Approved by: https://github.com/z-a-f
2022-07-26 16:43:45 +00:00
macandro96
eca21fbd17 [ao][sparsity] Data Sparsifier Benchmarking: Model quality evaluation of the sparsified DLRM model (#81779)
The objective is to perform evaluation of the model quality after sparsifying the embeddings of the dlrm model.
The ```evaluate_model_metrics.py``` script makes use of the ```sparse_model_metadata.csv``` file dumped by
```evaluate_disk_savings.py```. The model metrics such as accuracy, auc, f1, etc. are calculated on the test dataset
for various sparsity levels, block shapes and norms available on the metadata csv file.

**Results**: The model accuracy decreases slowly with sparsity levels. Even at 90% sparsity levels, the model accuracy decreases only by 2%.

After running `evaluate_memory_savings.py`, run: `python evaluate_model_metrics.py --raw_data_file=<path_to_raw_data_txt_file> --processed_data_file=<path_to_kaggleAdDisplayChallenge_processed.npz> --sparse_model_metadata=<path_to_sparse_model_metadata_csv>`

Dependencies: DLRM Repository (https://github.com/facebookresearch/dlrm)

Test Plan: None
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81779
Approved by: https://github.com/z-a-f
2022-07-25 20:40:32 +00:00
macandro96
1223e94469 [ao][sparsity] Data Sparsifier Benchmarking: Evaluating disk savings of DLRM model (#81778)
The objective is to sparsify the embeddings of the dlrm model and observe the disk savings.
The model is sparsified and dumped to disk and then zipped.
The embeddings are pruned to different sparsity levels (0.0 - 1.0), for multiple block shapes ((1,1) and (1,4))
and optimization functions (L1, L2).
The user trying to reproduce the results is required to clone the dlrm repository and copy the files to the dlrm directory,
then train the dlrm model as per the instructions on the github page and run this script.

**Results**: Introducing sparsity in the embeddings reduces file size after compression. The compressed model size goes
down from 1.9 GB to 150 MB after 100% sparsity.

Dependencies: DLRM Repository (https://github.com/facebookresearch/dlrm)

After Setup, Run: `python evaluate_disk_savings.py --model_path=<path_to_model_checkpoint> --sparsified_model_dump_path=<path_to_dump_sparsified_models>`

Test Plan: None
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81778
Approved by: https://github.com/z-a-f
2022-07-25 20:38:46 +00:00
Andrew Or
194255bb56 [Quant][fx] Implement BackendConfig (part 1) (#81469)
Summary: Following https://github.com/pytorch/pytorch/pull/78452
and https://github.com/pytorch/pytorch/pull/79066, this commit
is part 1 of the broader effort to replace `backend_config_dict`
with a python config object, a more formal and robust API that
leads to better user experience. Note that there is no change in
behavior in this commit by itself. A future commit (part 2) will
replace all existing usages of `backend_config_dict` with the
`BackendConfig` object added in this commit.

Test Plan:
python test/test_quantization.py TestBackendConfig

Reviewers: jerryzh168

Subscribers: jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81469
Approved by: https://github.com/jerryzh168
2022-07-24 00:34:48 +00:00
macandro96
1ba63e5a56 [ao][sparsity] Serialization support (#80890)
Implemented dumping and loading of state_dicts, along with the __get_state__ and __set_state__ functions.
The hook and layer are removed from the data_groups dictionary before serializing.

In the future, functions might have to be treated differently before serializing. Currently, they are
treated similarly to other types while serializing.

Test Plan:
```python test/test_ao_sparsity.py TestActivationSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80890
Approved by: https://github.com/z-a-f
2022-07-22 21:57:56 +00:00
macandro96
aa23447904 [ao][sparsity] Implementation of squash_mask() (#80889)
Unregisters the aggregate hook that was applied earlier and registers sparsification hooks.
The sparsification hook will apply the mask to the activations before they are fed into the
attached layer.

Test Plan:
```python test/test_ao_sparsity.py TestActivationSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80889
Approved by: https://github.com/z-a-f
2022-07-22 21:54:24 +00:00
macandro96
6b3bf3d6d9 [ao][sparsity] Implementation of step() and update_mask() (#80888)
The step() internally calls the update_mask() function for each layer.
The update_mask() applies reduce_fn and mask_fn to compute the sparsification mask.
Note:
    the reduce_fn and mask_fn are called for each feature and dim over the data

Test Plan:
```python test/test_ao_sparsity.py TestActivationSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80888
Approved by: https://github.com/z-a-f
2022-07-22 21:50:10 +00:00
macandro96
5fe3a1669c [ao][sparsity] Implementation of register_layer() and get_mask() (#80887)
The register_layer() attaches a pre-forward hook to the layer to aggregate
activations over time. The mask shape is also inferred here.

The get_mask() returns the computed mask associated with the attached layer.
The mask is
    - a torch tensor if features for that layer is None.
    - a list of torch tensors, one for each feature, otherwise

Test Plan:
```python test/test_ao_sparsity.py TestActivationSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80887
Approved by: https://github.com/z-a-f
2022-07-22 21:47:34 +00:00
macandro96
f87d8c2f62 [ao][sparsity] Basic implementation of activation sparsifier (#80886)
The Activation sparsifier class aims to sparsify/prune activations in a neural
network. The idea is to attach the sparsifier to a layer (or layers) and it
zeroes out the activations based on the mask_fn (or sparsification function)
input by the user.
The mask_fn is applied once all the inputs are aggregated and reduced i.e.
mask = mask_fn(reduce_fn(aggregate_fn(activations)))

Note::
    The sparsification mask is computed on the input **before it goes through the attached layer**.
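
A toy sketch of that pipeline; the three functions mirror the user-supplied aggregate_fn/reduce_fn/mask_fn from the description above and are not fixed library signatures:

```
import torch

def aggregate_fn(acc, new):
    # accumulate activations across forward passes
    return new if acc is None else acc + new

def reduce_fn(agg):
    # reduce the aggregate, e.g. average over the batch dimension
    return agg.mean(dim=0)

def mask_fn(reduced, threshold=0.1):
    # zero out positions whose reduced activation magnitude is small
    return (reduced.abs() > threshold).float()

acc = None
for batch in (torch.randn(8, 16), torch.randn(8, 16)):
    acc = aggregate_fn(acc, batch)

# mask = mask_fn(reduce_fn(aggregate_fn(activations)))
mask = mask_fn(reduce_fn(acc))
```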

Test Plan:
```python test/test_ao_sparsity.py TestActivationSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80886
Approved by: https://github.com/HDCharles
2022-07-22 21:43:33 +00:00
vspenubarthi
75aab6540e [ao] Update DynamicStatic Detector to account for Conv (#81972)
Summary: This updates the DynamicStatic Detector to also provide insight
into whether Conv layers should use dynamic or static quantization.
Before, this was not included because dynamic quantization is
not currently supported for Conv layers. This adds a check for Conv layers, and
if dynamic is recommended, it will also give a disclaimer that it is not
currently supported but will be in the future.
Test Plan: python test/test_quantization.py TestFxModelReportDetectDynamicStatic

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81972
Approved by: https://github.com/jerryzh168
2022-07-22 21:00:29 +00:00
vspenubarthi
0cacaf070f [ao] Fix to InputWeightEqualization detector to handle Conv groups (#81971)
Summary: The current implementation of the InputWeightEqualization
detector broke when it was tested on MobileNetV2. The reason for
this is that it wasn't able to properly handle groups in Conv layers,
and there also had to be some minor reshaping of the weights to handle
this as well.

In addition, the output was correspondingly tuned so that instead of
giving one output for each channel on each layer, it gives a single
suggestion per module, lets the user know how many of the channels
could benefit from input-weight equalization, and suggests it if that
is more than half.

There was also the realization that the test class didn't do a good job
of testing different dimensions for the batch vs. height vs. width, so
this was updated to be more comprehensive as well.

Test Plan: python test/test_quantization.py TestFxDetectInputWeightEqualization

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81971
Approved by: https://github.com/jerryzh168
2022-07-22 20:56:15 +00:00
macandro96
e66986421d [ao][sparsity] Training-aware data sparsity callback for lightning (#80371)
This callback aims to sparsify the model inside the lightning module after training.
**Note that the model is copied and then sparsified, so the existing model is not modified**

The sparsified model can be used for comparison and can be accessed using
<callback_obj>.sparsified

Test Plan:
```python torch/ao/sparsity/_experimental/data_sparsifier/lightning/tests/test_callbacks.py TestTrainingAwareCallback```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80371
Approved by: https://github.com/z-a-f
2022-07-21 16:41:43 +00:00
macandro96
eecf34fbe7 [ao][sparsity] Post training data sparsifier callback for lightning (#80370)
Lightning callback that enables post-training sparsity.

This callback aims to sparsify the model inside the lightning module after training.
**Note that the model is copied and then sparsified, so the existing model is not modified**

The sparsified model can be used for comparison and can be accessed using <callback_obj>.sparsified

Test Plan
```python torch/ao/sparsity/_experimental/data_sparsifier/lightning/tests/test_callbacks.py TestPostTrainingCallback```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80370
Approved by: https://github.com/z-a-f
2022-07-21 16:39:13 +00:00
Weiwen Xia
2edd6aaeaa Add prelu op and module for quantized CPU backend (#73491)
Add prelu op and module for quantized CPU backend.
The PR includes:
- Quantized version of prelu op
- Native prelu kernel for quantized CPU
- Prelu modules in `nn` and `nn.quantized`
- FX support for prelu
- Unit tests
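
A hedged sketch of exercising the new support through FX graph mode quantization (this is the standard prepare/convert flow; the only PReLU-specific part is that, after this PR, the module is expected to be lowered to its quantized counterpart):

```
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

model = nn.Sequential(nn.Linear(4, 4), nn.PReLU()).eval()
example_inputs = (torch.randn(1, 4),)

prepared = prepare_fx(model, get_default_qconfig_mapping("fbgemm"), example_inputs)
prepared(*example_inputs)          # calibration pass
quantized = convert_fx(prepared)
```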
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73491
Approved by: https://github.com/jerryzh168
2022-07-20 07:48:15 +00:00
vspenubarthi
589e8a1da5 [ao] Get feature and module names from ModelReportVisualizer class (#81647)
Summary: Added the functionality to be able to get the feature names and
module_fqns from the ModelReportVisualizer class. The purpose of this
addition is so that users can see the exact set of module_fqns or
feature names that they can filter based on, and use this information to
perform their filtering.

Test Plan: python test/test_quantization.py
TestFxModelReportVisualizer.test_get_modules_and_features

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81647
Approved by: https://github.com/andrewor14
2022-07-20 03:03:03 +00:00
vspenubarthi
1d3935a77d [ao] Add method in ModelReport to generate visualizer (#81589)
Summary: We created a ModelReportVisualizer class, and the primary
way it is envisioned to be accessed is:

```
model_report_visualizer = model_reporter.generate_visualizer()
```

This method only works after reports have been generated and it takes in
the generated reports and reformats them to be ordered by module, into
the format required by the ModelReportVisualization. It then generates
the visualizer instance and returns that to the user.

Test Plan: python test/test_quantization.py TestFxModelReportClass.test_generate_visualizer

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81589
Approved by: https://github.com/andrewor14
2022-07-20 02:58:52 +00:00
vspenubarthi
d0ce1fbbe2 [ao] Created Skeleton for ModelReportVisualizer class (#81523)
Summary: This introduces the skeleton for the ModelReportVisualizer
class. This class helps visualize the information generated by the
ModelReport class `generate_report()` output. This class aims to provide
visualizations in a table, plot (line graph) and histogram view.

This also introduces an empty test class for testing visualizations. As
implementations start occurring for this class, tests will also be
appropriately added.

This includes the high level descriptions for each of the methods as
well. Expected use cases will be added to the class description in a
future commit as that gets finalized.

Test Plan: python test/test_quantization.py TestFxModelReportVisualizer

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81523
Approved by: https://github.com/andrewor14
2022-07-20 02:39:14 +00:00
vspenubarthi
8a6d1289d8 [ao] Revised ModelReport API to take in model at initialization (#81588)
Summary: Currently, the ModelReport API only takes in detectors at the
beginning, and for each of its methods you have to pass in the model
each time, which doesn't really make sense because:

1. you will always want to be working on the same model
2. passing in a different model could break things, so it is more
fault-tolerant if we keep the model internally and make calls on it

Therefore, the model will now be passed in at initialization, and will
just be used for the rest of the operations via the internally stored reference.

All the ModelReport tests have been adjusted to account for this, and
this change must pass all the tests to ensure a successful API
transition.

If you wish to see how the updated API looks, the Expected Usage in the
ModelReport class description has been updated to reflect the changes.

The README has also been updated with these changes as well.

Test Plan: python test/test_quantization.py TestFxModelReportClass

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81588
Approved by: https://github.com/jerryzh168
2022-07-19 16:11:46 +00:00
vspenubarthi
e907a8d966 [ao] Updated dict keys of detectors to have consistent naming scheme (#81587)
Summary: Currently, all the detectors have pretty accurate naming
schemes that give an idea of what they do. However, since there are now
more and more detectors being developed, there is a need to make sure
that the naming scheme for detectors' keys is consistent.

This updates the keys of the returned dictionaries to better
highlight whether something is an activation stat or weight stat, etc.

Test Plan:

python test/test_quantization.py TestFxModelReportDetector

python test/test_quantization.py TestFxModelReportObserver

python test/test_quantization.py TestFxModelReportDetectDynamicStatic

python test/test_quantization.py TestFxModelReportClass

python test/test_quantization.py TestFxDetectInputWeightEqualization

python test/test_quantization.py TestFxDetectOutliers

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81587
Approved by: https://github.com/jerryzh168
2022-07-19 08:50:30 +00:00
asl3
368018530e [quant] Implement forward and backward autograd functions for fake quantize (#81438)
### Summary:
This PR implements custom autograd functions for forward and backward to be used in APoT fake quantization. The implementation follows this doc about custom autograd functions: https://pytorch.org/tutorials/beginner/examples_autograd/polynomial_custom_function.html

### Test Plan:
Run tests with: `python test/quantization/core/experimental/test_fake_quantize.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81438
Approved by: https://github.com/jerryzh168
2022-07-19 02:15:30 +00:00
vspenubarthi
8a3f88b5e0 [ao] Standardized InputWeightEqualizationDetector output to single level (#81586)
Summary: Currently the InputWeightEqualizationDetector has a
multi-layered output.

Example
```
{'block1.linear': {'channel_axis_selected': 1,
                   'channel_comparison_metrics': tensor([0.8736, 0.6594, 0.2916], grad_fn=<DivBackward0>),
                   'input_range_info': {'global_max': tensor(9.),
                                        'global_min': tensor(-10.),
                                        'per_channel_max': tensor([9., 9., 9.]),
                                        'per_channel_min': tensor([-10., -10., -10.])},
                   'input_weight_equalization_recommended': [True,
                                                             False,
                                                             False],
                   'threshold': 0.8,
                   'weight_range_info': {'global_max': tensor(0.5618, grad_fn=<UnbindBackward0>),
                                         'global_min': tensor(-0.2211, grad_fn=<UnbindBackward0>),
                                         'per_channel_max': tensor([0.3764, 0.5618, 0.2894], grad_fn=<NotImplemented>),
                                         'per_channel_min': tensor([-0.2211,  0.2213,  0.2228], grad_fn=<NotImplemented>)}},
}
```

With all the levels, it can be hard to parse the information for
anything, especially the planned visualization feature where the data
has to be reorganized. Therefore, to make it standardized across all
detectors, all outputs will be limited to one level.

The new format is:
```
{'block1.linear': { 'channel_axis_selected': 1,
                    'channel_comparison_metrics': tensor([0.5705, 0.9457, 0.8891], grad_fn=<DivBackward0>),
                    'activation_global_max': tensor(9.),
                    'activation_global_min': tensor(-10.),
                    'activation_per_channel_max': tensor([9., 9., 9.]),
                    'activation_per_channel_min': tensor([-10., -10., -10.]),
                    'input_weight_equalization_recommended': [False, True, True],
                    'threshold': 0.8,
                    'weight_global_max': tensor(0.4258, grad_fn=<UnbindBackward0>),
                    'weight_global_min': tensor(-0.4958, grad_fn=<UnbindBackward0>),
                    'weight_per_channel_max': tensor([0.1482, 0.3285, 0.4258], grad_fn=<NotImplemented>),
                    'weight_per_channel_min': tensor([-0.1517, -0.4958, -0.3027], grad_fn=<NotImplemented>)},
}
```

The README will also be updated to reflect this change.

Test Plan: python test/test_quantization.py TestFxDetectInputWeightEqualization

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81586
Approved by: https://github.com/jerryzh168
2022-07-19 01:00:40 +00:00
vspenubarthi
2ddb722bc6 [ao] Standardize PerChannelDetector Output to be single level (#81585)
Summary: Currently, the PerChannelDetector has a multi-layered output.

Example:
```
{'backend': 'qnnpack',
 'per_channel_status': {'block1.linear': {'per_channel_supported': True,
                                          'per_channel_used': False},
                        'block2.linear': {'per_channel_supported': True,
                                          'per_channel_used': False}}}
```

The issue with this is that for future features such as
visualizations, where we need to traverse this dictionary, the variable
number of nesting levels makes it hard to work with.

This changes the output format of the PerChannelDetector to have a
standard format.

Ex.)
```
{'block1.linear': {'backend': 'qnnpack',
                   'per_channel_supported': True,
                   'per_channel_used': False},
 'block2.linear': {'backend': 'qnnpack',
                   'per_channel_supported': True,
                   'per_channel_used': False}}
```

Test Plan: python test/test_quantization.py TestFxModelReportDetector

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81585
Approved by: https://github.com/HDCharles
2022-07-18 22:16:08 +00:00
vspenubarthi
845792db3c [ao] Fix for extra lines after return in Outlier Detector (#81499)
Summary: Two lines were accidentally added after a return statement in
the OutlierDetector insertion code, likely from an odd merge. They were
harmless but were caught by neither the linter, the tests, nor me. This
removes those two lines.

Test Plan: python test/test_quantization.py TestFxDetectOutliers

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81499
Approved by: https://github.com/kit1980
2022-07-15 00:10:59 +00:00
vspenubarthi
0f3c8c939f [ao] Added README for ModelReport functionality (#81369)
Summary: This adds a README for the ModelReport functionality that
contains an overview of the class, what it does,
and how it works, an example of usage, information on how to implement a
new detector (since this is how core functionality is added), folder
structure information, and finally information on tests and where they
are located.

The ModelReport class is still in development and will, in the future,
get additional features such as visualizations, and the README will be
updated with this information as it is added.

Test Plan: Just a new README, no code is added; the README will be
reviewed for accuracy, ease of use, and readability.

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81369
Approved by: https://github.com/jerryzh168
2022-07-14 19:17:52 +00:00
vspenubarthi
8f743d7a70 [ao] Updated detector observer insert args to be vars not strings (#81382)
Summary: Before, the determine_observer_insert_points() function of
every detector used hard-coded strings as the keys of the dictionary
returned to the ModelReport instance, and those same hard-coded keys
were used to extract information from it. Since all detectors used the
same string keys, these were made default variables at the top of the
detector.py file, and all detectors now use those. The same variables
are imported and used in the ModelReport file as well. This way, there
is less chance of an error caused by incorrectly typed strings.

The test plan primarily exercises the ModelReport class because it also
uses the new string variables and is the primary caller of each detector
instance's determine_observer_insert_points().

Test Plan: python test/test_quantization.py TestFxModelReportClass

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81382
Approved by: https://github.com/jerryzh168
2022-07-14 19:17:28 +00:00
Jerry Zhang
446edadd95 [quant][fx] Follow up fixes for qconfig validations for fixedqparams ops (#81010)
Summary:
This adds a few things on top of https://github.com/pytorch/pytorch/pull/80184,
1). node.target was assumed to be "tanh", torch.nn.Tanh, etc.; this PR handles that properly
2). adds FixedQParamsFakeQuantize support
3). extends the comparison function _partial_wrapper_equals to work with FakeQuantize.with_args(observer=...)

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D37735193](https://our.internmc.facebook.com/intern/diff/D37735193)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81010
Approved by: https://github.com/andrewor14
2022-07-14 18:06:23 +00:00
Andrew Or
c657c3d3ab [Quant][fx] Rename convert_to_reference to convert_to_reference_fx (#81326)
Summary: This commit renames the convert_to_reference function to
convert_to_reference_fx, which is more descriptive and matches
prepare_fx and prepare_qat_fx better.

Test Plan:
python test/test_quantization.py TestQuantizeFx

Reviewers: jerryzh168

Subscribers: jerryzh168

Differential Revision: [D37787876](https://our.internmc.facebook.com/intern/diff/D37787876)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81326
Approved by: https://github.com/jerryzh168
2022-07-13 22:18:46 +00:00
vspenubarthi
a25df29cc4 [ao] Updated ModelReport function calls to show not dependent on Fx GraphMode (#81252)
Summary: Before, all the function calls for the ModelReport object
appeared to depend on the Fx Graph Mode workflow. In reality, the only
requirement is that the model be a traceable GraphModule. This change
also helps keep the ModelReport class as detached from the Fx workflow
as possible so that it can be used as a more general-purpose tool in the
future.

This updates all the references so they no longer state that an Fx
Graph Mode workflow is required, making them more general, since all
that is really needed is a traceable model.

Test Plan: python test/test_quantization.py TestFxModelReportClass

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81252
Approved by: https://github.com/jerryzh168
2022-07-13 20:24:37 +00:00
vspenubarthi
5eec908700 [ao] Update ModelReport class with class usage in description. (#81251)
Summary: This adds an example usage description to the ModelReport
class so that people can see how it can be used directly in the class
documentation, without having to consult external sources. The example
usage shows how the class can be used with the QuantizationTracer, a
deliberate choice to illustrate that there is no strict requirement to
use this tool only with the Fx Graph Mode workflow.

Test Plan: python test/test_quantization.py TestFxModelReportClass

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81251
Approved by: https://github.com/jerryzh168
2022-07-13 20:21:37 +00:00
vspenubarthi
6366c99e5b [ao] Added Collab link for Outlier Detector ratio val choice (#81250)
Summary: A huge part of the work for the Outlier detector was figuring
out what a good nth percentile to compare against the 100th percentile
was, while also figuring out what a good comparison ratio would be. This
commit adds a link to a Colab notebook to the documentation of the
function so that people can see the calculations used to determine those
values and understand that they are not chosen arbitrarily.

At a high level, this Colab contains work that includes:
- Figuring out whether to use interpolation or lower as the rule for
finding quantile between two indices
- Figuring out what a good value for reference_percentile is
- Figuring out what a good value for ratio_threshold is

Test Plan: python test/test_quantization.py TestFxDetectOutliers

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81250
Approved by: https://github.com/jerryzh168
2022-07-13 20:19:24 +00:00
vspenubarthi
9c298fff2e [ao] Added constant channel check to Outlier Detector (#81249)
Summary: The current Outlier detector does a good job of finding whether
data distributions passing through layers have outliers. However,
suppose we have a completely constant channel. The outlier detector
would not detect it as an outlier, but that is still something we want
to highlight because a constant channel usually is a result of a bad
configuration or something really wrong with the data.

To address this there are two additions to the outlier detector that
this commit makes:
- The first is to add whether there are any constant batches at all and
let the user know in the text report
- The second is to let the user know the number of total constant
batches found for each channel, so they can figure out if there are any
unnecessary channels present.

The existing outlier detector tests were modified to do a quick check
for this feature.
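
A minimal sketch of the kind of constant-channel check described above, assuming batched activations with channels on dim 1 (this is not the detector's actual code):

```python
import torch

def count_constant_channels(batch, channel_dim=1, eps=0.0):
    # a channel is "constant" when its values never vary within the batch
    x = batch.movedim(channel_dim, 0).flatten(1)
    per_channel_range = x.max(dim=1).values - x.min(dim=1).values
    constant = per_channel_range <= eps
    return int(constant.sum()), constant

# example: channel 0 is constant, channel 1 is not
data = torch.stack([torch.zeros(8), torch.arange(8.0)]).unsqueeze(0)  # (1, 2, 8)
num_constant, which_channels = count_constant_channels(data)
```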

Test Plan: python test/test_quantization.py TestFxDetectOutliers

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81249
Approved by: https://github.com/andrewor14
2022-07-13 20:16:33 +00:00
vspenubarthi
229762dcd9 [ao] Added statistical threshold arg in Outlier Detector (#81174)
Summary: The outlier detector has a feature that notifies the user when
fewer than the whole set of batches that passed through were used in the
outlier calculation, which mainly happens as a result of 0-errors. This
changes the code so that instead of comparing against a hard-coded value
like 30 as before, the user can pass in an optional fractional value;
if the ratio of batches used falls below that value, the detector alerts
the user.
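
A hypothetical helper illustrating the check (the argument names here are assumed, not taken from the detector):

```python
def should_alert_on_batches_used(batches_used, total_batches, fraction_threshold=None):
    """Alert when the fraction of batches actually used in the outlier
    calculation falls below the user-supplied fractional threshold."""
    if fraction_threshold is None or total_batches == 0:
        return False
    return (batches_used / total_batches) < fraction_threshold

# e.g. only 12 of 100 batches were usable, user asked to be warned below 30%
alert = should_alert_on_batches_used(12, 100, fraction_threshold=0.3)  # True
```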

Test Plan: python test/test_quantization.py TestFxDetectOutliers

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81174
Approved by: https://github.com/andrewor14
2022-07-13 20:13:46 +00:00
vspenubarthi
893d763276 [ao] Implemented Outlier Detection Report Generation (#80937)
Summary: This adds the implementation for the report generation for the
Outlier Detector class. This includes both the generation of a
dictionary containing each module that had an observer attached and any
relevant stats collected by the observer that can help shed light on
outlier-relevant data or computed metrics. It also includes a string
denoting specific modules that had outliers and gives a bit of insight
into what channels they are contained in.

This contains both the implementation for the report generation for the
outlier detector as well as a test class to test the report generation
functionality.

Test Plan: python test/test_quantization.py TestFxDetectOutliers

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80937
Approved by: https://github.com/andrewor14
2022-07-12 19:56:33 +00:00
zaf
55d1b376ea [ao][sparsity] Vectorized WeightNormSparsifier (#80059)
The previous implementation was using loops to compute the sparsity within a block in a mask, as well as across the mask blocks. This implements the vectorized version.

## Vectorization:

A high level overview of the vectorization procedure falls into a two step process:

### Tensor-level masking

A tensor-level masking is a mask generation routine that has a granularity of `sparse_block_shape`. That means that only patches of that shape can be considered sparse/dense. To vectorize:

1. Reshape the data such that one of the dimensions represents the patches of sparse_block_shape.
2. Create a mask of the same shape as the reshaped data
3. Find the smallest `k` elements in the data, given the dimension of the sparse "patches". `k` represents a derived parameter specifying the sparsity level.
4. Apply the 0/1 to the patches in the mask
5. Reshape the mask back to the original dimensions

Note: because the shape of the mask might not be a multiple of the sparse_block_shape, we nudge the shape of the mask and truncate it afterwards.

### Block-level masking

A block-level masking is a mask generation routine that concerns itself only with sparsity within a patch of shape `sparse_block_shape`. This is useful when block sparsity allows partial block sparsification.

To vectorize:

Overall, the block-level masking follows the same routine as the tensor-level algorithm described above. One distinction is that when reshaping the data/mask tensors we aim to create a dimension that captures the internals of each patch. For example, if `sparse_block_shape` is `(2, 2)`, we want to reshape the data/mask into `(2, 2, -1)`. That allows us to sort the internal elements on the last axis and zero out the ones selected by the sparsity logic.
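
A minimal, self-contained sketch of the tensor-level idea (assuming the weight shape is an exact multiple of `sparse_block_shape`; the real sparsifier nudges and truncates the mask as noted above):

```python
import torch

def block_mask_sketch(weight, sparsity_level=0.5, sparse_block_shape=(1, 4)):
    h, w = weight.shape
    bh, bw = sparse_block_shape
    # 1. view the tensor as a grid of (bh x bw) patches
    blocks = weight.reshape(h // bh, bh, w // bw, bw).permute(0, 2, 1, 3)
    # 2./3. one score per patch; the k patches with the smallest norm get zeroed
    scores = blocks.reshape(h // bh, w // bw, -1).norm(dim=-1)
    k = int(round(sparsity_level * scores.numel()))
    mask = torch.ones_like(scores)
    if k > 0:
        _, idx = scores.flatten().topk(k, largest=False)
        mask.view(-1)[idx] = 0
    # 4./5. broadcast the per-patch mask back to the original shape
    return mask.repeat_interleave(bh, dim=0).repeat_interleave(bw, dim=1)

mask = block_mask_sketch(torch.randn(8, 16), sparsity_level=0.5, sparse_block_shape=(2, 4))
```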

Differential Revision: [D37352494](https://our.internmc.facebook.com/intern/diff/D37352494/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D37352494/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80059
Approved by: https://github.com/jerryzh168
2022-07-12 19:16:44 +00:00
PyTorch MergeBot
caee732aa1 Revert "[quant][fx] Support keyword arguments for functional linear (#79095)"
This reverts commit d71fb40d98.

Reverted https://github.com/pytorch/pytorch/pull/79095 on behalf of https://github.com/jerryzh168 due to broken master
2022-07-09 21:45:01 +00:00
Jerry Zhang
d71fb40d98 [quant][fx] Support keyword arguments for functional linear (#79095)
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/78117
Fixes: https://github.com/pytorch/pytorch/issues/73463

This PR adds a normalization pass that normalizes all the args to keyword args in positional order, and fixes the lowering code that previously
used only node.args so that it now uses both args and kwargs.

Also tried to add a test for F.conv2d, but since conv2d matches multiple schemas we do an extra schema match, and because we are using symbolic values
in `transform`, we don't get a schema match, so F.conv2d still fails with runtime errors. We can resolve this issue later when there is a need.

Another thing I'm considering is to do the normalization with real inputs instead of symbolic inputs, and to rely on inspect.signature rather than operator_schemas (which is based on torchscript).
I tried this briefly but didn't get too far; it looks like we cannot get the Python signature for `torch._C._nn.linear`. It might be possible to fix that as well, but it will need follow-up discussions.

The goal for this PR is just to introduce normalization in our codebase so that we can adapt some downstream code to this, and also fix the F.linear issue.
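
A toy illustration of what normalizing positional args to keyword args looks like on an FX graph (this is not the pass added in this PR, and it hard-codes the parameter names instead of consulting operator schemas):

```python
import torch
import torch.fx as fx
import torch.nn.functional as F

class M(torch.nn.Module):
    def forward(self, x, w, b):
        return F.linear(x, w, b)           # called with positional args only

traced = fx.symbolic_trace(M())
PARAM_NAMES = ("input", "weight", "bias")  # schema parameter names for F.linear
for node in traced.graph.nodes:
    if node.op == "call_function" and node.target is F.linear:
        new_kwargs = dict(zip(PARAM_NAMES, node.args))
        new_kwargs.update(node.kwargs)
        node.args, node.kwargs = (), new_kwargs
traced.recompile()
# downstream lowering code can now look up node.kwargs["weight"] etc.
```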

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_normalize_args

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D37163228](https://our.internmc.facebook.com/intern/diff/D37163228)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79095
Approved by: https://github.com/andrewor14
2022-07-09 20:01:09 +00:00
Zafar
68ec793cfd [ao] Moving the sparsity/experimental to sparsity/_experimental (#81149)
The experimental code in the sparsity package does not have a
user-facing API and should reside under the private package. This
involves the pruner and the base_sparsifier.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/81149
Approved by: https://github.com/macandro96
2022-07-09 03:00:11 +00:00
Andrew Or
8fab682e47 [Quant][fx][bc-breaking] Do not move models to CPU in convert (#80555)
Summary: Previously, we automatically moved the model to CPU in
torch.ao.quantization.fx.convert to work around the issue where
certain functions called by convert expect CPU arguments. This
commit pushes this responsibility to the caller since it is the
user's decision of which device to use.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

BC-breaking Notes:

Before:
```
model = resnet18(...)
model = prepare_fx(model, qconfig_mapping, example_inputs)
... # calibrate
model = convert_fx(model)
```
After:
```
model = resnet18(...)
model.cpu()
model = prepare_fx(model, qconfig_mapping, example_inputs)
... # calibrate
model = convert_fx(model)
```

Reviewers: jerryzh168

Subscribers: jerryzh168

Differential Revision: [D37528830](https://our.internmc.facebook.com/intern/diff/D37528830)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80555
Approved by: https://github.com/jerryzh168
2022-07-08 19:23:57 +00:00
vspenubarthi
6a7ed56d79 [ao] Added OutlierDetector observer insert implementation (#80880)
Summary: This adds the implementation for observer insertion point
selection for the OutlierDetector. For this detector, the insertion
points are to insert a ModelReportObserver before any leaf level module
to study the distribution of data that passes into the module to detect
outliers.

This commit contains the implementation of the observer insertion as
well as the relevant test case. Some code from the
InputWeightEqualization was abstracted and made more modular so the same
helper function could be used for multiple outlier class tests.

As part of this work, testing was done to determine what a good default
ratio threshold and reference percentile would be, and that work (based
on a normal distribution) was then analyzed to find good parameters.

We still want to keep thresholds and reference percentile as something
the user can input because these were based on a normal distribution,
and they can definitely vary depending on the type of data a user has.

Test Plan: python test/test_quantization.py TestFxDetectOutliers

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80880
Approved by: https://github.com/andrewor14
2022-07-08 15:36:20 +00:00
Salil Desai
5c12cd224f [PyTorch Edge] Add qnnpack bcsr matrix unpacking and use unpacking in Linear module (#80475)
Having unpacking removes the need to store the original dense weights in the python Linear module

Differential Revision: [D34699287](https://our.internmc.facebook.com/intern/diff/D34699287/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D34699287/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80475
Approved by: https://github.com/qihqi
2022-07-07 15:32:21 +00:00
Salil Desai
eaf817df3a [PyTorch Edge] Add serialization/deserialization of Sparse Quantize Linear Packed Params (#80474)
Packed Params are serialized/deserialized in sparse form

Differential Revision: [D34392761](https://our.internmc.facebook.com/intern/diff/D34392761/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D34392761/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80474
Approved by: https://github.com/qihqi
2022-07-07 15:30:02 +00:00
Salil Desai
523b081a64 [PyTorch Edge] Remove Original Weight Tensor from QNNPack Sparse Quantized Linear Packed Params (#80473)
We plan to add serialization/deserialization without the original weight tensor, so we no longer need to store it

Differential Revision: [D34617321](https://our.internmc.facebook.com/intern/diff/D34617321/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80473
Approved by: https://github.com/qihqi
2022-07-07 15:11:44 +00:00
macandro96
daf00e843a [ao][sparsity] Bug Fix: data norm sparsifier not working with 1D tensors/parameters (#80465)
Issue:
Previously, the L1/L2 norm data sparsifier did not support
1D tensors or parameters.

Fix:
If the tensor is 1D, unsqueeze it so it looks 2D and
perform the rest as usual. Also added some 1D tensors to the
unit test to cover this issue.
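
A minimal sketch of the 1D handling described in the fix (names are illustrative, not the sparsifier's actual code):

```python
import torch

def as_2d(data):
    """Unsqueeze 1D data so the existing 2D masking logic applies unchanged."""
    was_1d = data.dim() == 1
    return (data.unsqueeze(0), was_1d) if was_1d else (data, was_1d)

tensor_1d = torch.randn(16)
data_2d, was_1d = as_2d(tensor_1d)           # shape (1, 16)
# ... run the usual 2D norm-based sparsification on data_2d ...
mask = torch.ones_like(data_2d)
mask = mask.squeeze(0) if was_1d else mask   # restore the original 1D shape
```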

Test Plan:
```python test/test_ao_sparsity.py TestNormDataSparsifiers```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80465
Approved by: https://github.com/z-a-f
2022-07-06 21:04:19 +00:00
macandro96
ec594dd305 [ao][sparsity] Bug fix: data not correctly attached to the sparsifier (#80394)
Issue:
Previously, the data was not "attached" to the data sparsifier: the
data sparsifier created a copy of the actual data inside its container. So,
when the data was modified outside of the sparsifier, the changes were not reflected
in the sparsifier.

Fix:
Use register_buffer() instead of nn.Parameter(..) to store the data inside the container.
Also, added a unit-test to reference this issue.
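
A toy container illustrating why register_buffer keeps the sparsifier in sync with external modifications (not the actual data sparsifier code):

```python
import torch
import torch.nn as nn

class ToyContainer(nn.Module):
    def __init__(self, name, data):
        super().__init__()
        self.register_buffer(name, data)   # stores the tensor itself, no copy

emb_weights = torch.zeros(4, 8)
container = ToyContainer("emb", emb_weights)
emb_weights.add_(1.0)                            # modified outside the sparsifier
assert torch.equal(container.emb, emb_weights)   # the container sees the change
```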

Test Plan:
```python test/test_ao_sparsity.py TestBaseDataSparsifier```
```python test/test_ao_sparsity.py TestNormDataSparsifiers```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80394
Approved by: https://github.com/z-a-f
2022-07-06 20:57:32 +00:00
Vasiliy Kuznetsov
ce0786add2 fx quant: fix warning in util function when cloning tensors (#80883)
Summary:

Some of the util functions in FX graph mode quantization throw warnings
such as:

```
/Users/vasiliy/pytorch/torch/ao/quantization/fx/utils.py:410: UserWarning: To copy construct from
a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().
requires_grad_(True), rather than torch.tensor(sourceTensor).
```

This PR fixes the warnings by moving the code to the recommended syntax if the
value is a tensor.
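
A minimal sketch of the recommended pattern, assuming the util may receive either a tensor or a plain Python value:

```python
import torch

def to_tensor(value):
    if isinstance(value, torch.Tensor):
        return value.clone().detach()      # recommended syntax, no UserWarning
    return torch.tensor(value)             # plain scalars/lists are fine as-is

scale = to_tensor(torch.tensor(0.1))       # tensor input
zero_point = to_tensor(0)                  # non-tensor input
```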

Test plan:

```
python test/test_quantization.py -k test_conv_linear_reference
// warning appeared before this PR and disappeared after this PR
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80883
Approved by: https://github.com/jerryzh168
2022-07-06 12:44:10 +00:00
Jiaxu Zhu
280f4704b7 [torch.fx] Check node type before fetching .users (#80166)
Summary:
As titled: currently this fails when `node` is actually a constant instead of an `fx.Node`.
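
A minimal sketch of the kind of type guard described above (the helper name is assumed):

```python
import torch.fx as fx

def node_users(maybe_node):
    # only fx.Node objects carry a .users mapping; constants do not
    if isinstance(maybe_node, fx.Node):
        return list(maybe_node.users)
    return []
```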

Test Plan: existing unit tests

Differential Revision: D37389003

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80166
Approved by: https://github.com/jerryzh168
2022-07-05 23:32:22 +00:00
asl3
5b493ba18b [quant] Refactor quantize clamping into float_to_apot util function (#80885)
### Summary:
This PR moves the clamping functionality from `quantize` to `float_to_apot` util function to align with the uniform quantize workflow in the codebase.

### Test Plan:
Run unit tests with:
python pytorch/test/quantization/core/experimental/test_quantizer.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80885
Approved by: https://github.com/dzdang
2022-07-05 19:28:37 +00:00
vspenubarthi
e5162dcfa7 [ao] Added framework for ModelReport Outlier Detector (#80743)
Summary: This adds the class framework for the ModelReport
OutlierDetector. This detector will be in charge of looking at
activation data and figuring out whether there are significant outliers
present in them. It will average this data across batches to make a
recommendation / warning if significant outliers are found.

This commit contains just the class framework and a base test class.
Implementations will follow in following commits.

Test Plan: python test/test_quantization.py TestFxDetectOutliers

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80743
Approved by: https://github.com/HDCharles
2022-07-01 01:03:31 +00:00
PyTorch MergeBot
b64096a264 Revert "Add prelu op and module for quantized CPU backend (#73491)"
This reverts commit 3a6d6bc3cc.

Reverted https://github.com/pytorch/pytorch/pull/73491 on behalf of https://github.com/malfet due to Broke Windows builds, see 3a6d6bc3cc
2022-06-30 12:54:39 +00:00
Weiwen Xia
3a6d6bc3cc Add prelu op and module for quantized CPU backend (#73491)
Add prelu op and module for quantized CPU backend.
The PR includes:
- Quantized version of prelu op
- Native prelu kernel for quantized CPU
- Prelu modules in `nn` and `nn.quantized`
- FX support for prelu
- Unit tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73491
Approved by: https://github.com/jerryzh168
2022-06-30 06:50:22 +00:00
Jerry Zhang
1a7e560ade [quant] Refactor quantization tracer to a separate file (#80268)
Summary:
att, since we need to reuse the tracer in some other places

Test Plan:
python test/test_quantization.py TestQuantizeFx

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D37435748](https://our.internmc.facebook.com/intern/diff/D37435748)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80268
Approved by: https://github.com/vkuzo
2022-06-30 00:49:57 +00:00
HDCharles
fa6b6842e1 [ao][sparsity] removing leading '.' from fqn in utils (#79774)
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #79774
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79774
Approved by: https://github.com/z-a-f
2022-06-30 00:00:56 +00:00
HDCharles
3f1dc7ec00 [quant] Create default custom modules for LSTM and MHA (#79960)
Summary:
Currently we expect users to provide custom modules for LSTM and MHA. However, since we almost always ask users to use those modules in the custom context, it is better to make this behavior the default. In this case we try to align with the base quantization API: if the user specifies a custom_config_dict then that is used; if the value is left as None then the default is used. If a user would like to both use the default and modify it, they have to do so manually; the default is accessible via get_default_custom_config_dict.
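
A toy sketch of the fallback behavior described above (the default dict contents here are placeholders, not the real default custom config):

```python
_TOY_DEFAULT_CUSTOM_CONFIG = {"float_to_observed_custom_module_class": {}}  # placeholder contents

def resolve_custom_config_dict(custom_config_dict=None):
    # None means "use the default", mirroring the base quantization API
    if custom_config_dict is None:
        return _TOY_DEFAULT_CUSTOM_CONFIG
    return custom_config_dict
```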

Additionally, the NS which uses prepare to insert custom observers for
its purposes had to be slightly modified to pass in an empty
custom_config_dict in order to avoid modifying the custom modules.

due to weird CI issues with previous PR,
previous discussion can be found: https://github.com/pytorch/pytorch/pull/71192

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79960
Approved by: https://github.com/z-a-f
2022-06-30 00:00:46 +00:00
Andrew Or
c44317704a [Quant][fx] Add default configs for fixed qparams ops (#80184)
Summary: This commit adds qconfigs with special observers for fixed
qparams ops in get_default_qconfig_mapping and
get_default_qat_qconfig_mapping. For correctness, we also require
users to use these special observers if we detect these fixed
qparams ops in prepare.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo

Differential Revision: [D37396379](https://our.internmc.facebook.com/intern/diff/D37396379)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80184
Approved by: https://github.com/jerryzh168
2022-06-29 23:07:26 +00:00
Andrew Or
17104d3d7f [Quant][fx][bc-breaking] Replace is_reference with convert_to_reference (#80091)
Summary: This PR removes the is_reference flag from the  existing
convert_fx API and replaces it with a new convert_to_reference
function. This separates (1) converting the prepared model to a
reference model from (2) lowering the reference model to a quantized
model, enabling users to call their custom lowering function for
custom backends. For the native fbgemm backend, for example, the
following are equivalent:

```
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

prepared = prepare_fx(model, ...)
quantized = convert_fx(prepared, ...)
```

```
from torch.ao.quantization.fx import lower_to_fbgemm
from torch.ao.quantization.quantize_fx import (
    prepare_fx,
    convert_to_reference
)

prepared = prepare_fx(model, ...)
reference = convert_to_reference(prepared, ...)
quantized = lower_to_fbgemm(reference, ...)
```

Note that currently `lower_to_fbgemm` takes in two other arguments
that are difficult for users to provide. A future commit will remove
these arguments to make the helper function more user friendly.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo

Differential Revision: [D37359946](https://our.internmc.facebook.com/intern/diff/D37359946)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80091
Approved by: https://github.com/jerryzh168
2022-06-29 23:01:27 +00:00
asl3
5070f5d18f [quant] Implement APoT fake quantization (#79845)
### Summary:
This PR implements APoT fake quantization for the purpose of quantization aware training. This implements `calculate_qparams` and `forward `methods to be used in fake quantization.

### Test Plan:
Run unit tests with: `python pytorch/test/quantization/core/experimental/test_fake_quantize.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79845
Approved by: https://github.com/dzdang
2022-06-28 18:15:26 +00:00
zaf
cb5ef130b6 [ao][sparsity] Fixing failing internal pruner tests (#80111)
After a recent change in the base_sparsifier API, the internal pruner started failing. This adapts the test cases to the change:

1. Changed `module_groups` to `groups`
2. Changed the fusion logic from taking care of the whole fused module to handling the submodules individually.

Differential Revision: [D37364801](https://our.internmc.facebook.com/intern/diff/D37364801/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D37364801/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80111
Approved by: https://github.com/macandro96
2022-06-28 04:38:58 +00:00
Andrew Or
8aedd8fb25 [Quant][fx] Hide equalization_config from prepare APIs (#80164)
Summary: This PR hides the equalization_config argument from
prepare_fx. This is a private API that we do not wish to expose
to users and have to maintain backward compatibility for.

Test Plan:
python test/test_quantization.py TestEqualizeFx

Reviewers: jerryzh168

Subscribers: jerryzh168

Differential Revision: [D37394353](https://our.internmc.facebook.com/intern/diff/D37394353)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80164
Approved by: https://github.com/jerryzh168
2022-06-28 04:20:34 +00:00