Commit Graph

8 Commits

Author SHA1 Message Date
Xia, Weiwen
3a3e2002d8 [Quant] Add unified x86 quant backend (#84329)
## Description

Implement a unified quantization backend 'X86' for x86 platforms. It combines the advantages of FBGEMM and ONEDNN: it selects kernels during weight prepacking and hides the details from end users. It will become the default backend in place of FBGEMM.

For details, please refer to this RFC: [[RFC] Unified quantization backend for x86 CPU platforms](https://github.com/pytorch/pytorch/issues/83888)
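
As a minimal usage sketch (not part of this PR's diff), a user would opt into the new backend the same way as the existing ones; the snippet below assumes the standard FX-mode quantization APIs (`prepare_fx`/`convert_fx`) and that the backend is exposed under the name "x86" as described above:

```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

# Select the unified backend by name; the FBGEMM-vs-oneDNN kernel choice
# then happens internally at weight-prepacking time.
torch.backends.quantized.engine = "x86"

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
example_inputs = (torch.randn(1, 3, 224, 224),)

qconfig_mapping = get_default_qconfig_mapping("x86")
prepared = prepare_fx(model, qconfig_mapping, example_inputs)
prepared(*example_inputs)  # calibrate with representative data
quantized = convert_fx(prepared)
```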

## Validation
**Correctness**
Covered by unit tests.

**Accuracy**
Running torchvision models on ImageNet, we found no accuracy difference between FBGEMM and the unified X86 backend:
[torchvision_accuracy_comparison_fbgemm_vs_x86.xlsx](https://github.com/pytorch/pytorch/files/9598114/torchvision_accuracy_comparison_fbgemm_vs_x86.xlsx)

**Performance**
Depends on https://github.com/pytorch/pytorch/pull/84470, which improves performance.
For early PoC results, please refer to https://github.com/pytorch/pytorch/files/9399202/unified_qengine_poc_performance_bechmark.xlsx

With the two PRs combined, we collected the following data on an Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz.
Method: run multiple instances with 4 cores per instance on a whole socket, using JeMalloc and Intel OMP (a simplified measurement sketch follows the table).
Model | FBGEMM throughput | X86 throughput | Improvement
-- | -- | -- | --
wide_resnet101_2 | 173.5675 | 241.815 | 39.32%
resnext101_32x8d | 174.365 | 339.8175 | 94.89%
resnet50 | 573.155 | 1174.14 | 104.86%
vgg19_bn | 260.335 | 337.92 | 29.80%
vgg19 | 257.935 | 333.265 | 29.21%
inception_v3 | 601.1175 | 1309.33 | 117.82%
densenet161 | 296.645 | 435.5625 | 46.83%
mnasnet1_0 | 1216.7 | 4057.515 | 233.49%
squeezenet1_0 | 1220.085 | 5153.3875 | 322.38%
alexnet | 2294.91 | 2624.6375 | 14.37%
fbnetc_100 | 976.2825 | 3110.1825 | 218.57%
shufflenet_v2_x0_5 | 1555.76 | 3026.125 | 94.51%
spnasnet_100 | 1059.065 | 3502.0975 | 230.68%
pytorch-unet | 192.76 | 246.77 | 28.02%
acgan | 257.32 | 333.7325 | 29.70%
cgan | 7790.6925 | 7803.1025 | 0.16%
sgan | 257.565 | 338.8875 | 31.57%
se_resnet50 | 492.3725 | 916.5175 | 86.14%
vggm | 300.2875 | 316.2075 | 5.30%
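
The numbers above come from a dedicated multi-instance harness; the loop below is only a simplified single-instance sketch of how such throughput (samples/second) can be measured. The model choice, batch size, and iteration counts here are arbitrary, not the settings used for the table.

```
import time
import torch
import torchvision

def measure_throughput(model, inputs, warmup=20, iters=100):
    # Warm up, then time `iters` forward passes and report samples/second.
    with torch.no_grad():
        for _ in range(warmup):
            model(inputs)
        start = time.time()
        for _ in range(iters):
            model(inputs)
        elapsed = time.time() - start
    return iters * inputs.shape[0] / elapsed

model = torchvision.models.resnet50().eval()
inputs = torch.randn(32, 3, 224, 224)
print(f"fp32 throughput: {measure_throughput(model, inputs):.1f} img/s")
```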

Environment:
- PyTorch version: 1.13.0a0+gitcdd625b
- Is debug build: False
- CUDA used to build PyTorch: None
- ROCM used to build PyTorch: N/A
- OS: Ubuntu 20.04.3 LTS (x86_64)
- GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
- Clang version: Could not collect
- CMake version: version 3.22.5
- Libc version: glibc-2.31
- Python version: 3.9.12 (main, Jun  1 2022, 11:38:51)  [GCC 7.5.0] (64-bit runtime)
- Python platform: Linux-5.11.0-27-generic-x86_64-with-glibc2.31
- Is CUDA available: False
- CUDA runtime version: No CUDA
- GPU models and configuration: No CUDA
- Nvidia driver version: No CUDA
- cuDNN version: No CUDA
- HIP runtime version: N/A
- MIOpen runtime version: N/A
- Is XNNPACK available: True

Versions of relevant libraries:
- [pip3] intel-extension-for-pytorch==1.13.0+cpu
- [pip3] numpy==1.23.3
- [pip3] pytorch-widedeep==0.3.7
- [pip3] torch==1.13.0a0+git48b423b
- [pip3] torchvision==0.14.0a0+ebb68f3
- [conda] blas                      1.0                         mkl
- [conda] intel-extension-for-pytorch 1.13.0+cpu               pypi_0    pypi
- [conda] mkl                       2021.4.0           h06a4308_640
- [conda] mkl-include               2022.1.0                 pypi_0    pypi
- [conda] mkl-service               2.4.0            py39h7f8727e_0
- [conda] mkl-static                2022.1.0                 pypi_0    pypi
- [conda] mkl_fft                   1.3.1            py39hd3c417c_0
- [conda] mkl_random                1.2.2            py39h51133e4_0
- [conda] numpy                     1.23.3                   pypi_0    pypi
- [conda] numpy-base                1.22.3           py39hf524024_0
- [conda] torch                     1.13.0a0+git48b423b          pypi_0    pypi
- [conda] torchvision               0.14.0a0+ebb68f3          pypi_0    pypi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84329
Approved by: https://github.com/jerryzh168
2022-09-29 00:44:40 +00:00
Jerry Zhang
4523ac7aa1 [quant][docs][ez] Fix formatting for qconfig_mapping (#85306)
Summary:
As titled: fix formatting for the qconfig_mapping docs.

Test Plan:
visual inspection of generated docs

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85306
Approved by: https://github.com/vkuzo, https://github.com/andrewor14
2022-09-22 02:09:36 +00:00
Jesse Cai
d6b2f5c643 [Quant][fx] Remove remove_quant_dequant_pairs and fix tests (#84203)
Summary:
- `remove_quant_dequant_pairs` removes ops when a `quant` is followed by a `dequant`
- The quantized implementation of `layer_norm` only supports float weights, so the default qconfig is updated to avoid quantizing the weight param (a configuration sketch follows this list).
- Fixes the broken test `test_norm_weight_bias`. This was the only test that broke, because the default qconfig dict we pass in quantizes the weight; the fix pulls the native qconfig object and converts it to a dict.
- Adds qconfig and backend config support for layer_norm.
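
As a hypothetical illustration of the "keep the weight in float" configuration described above (the observer choices below are illustrative, not necessarily the ones this PR installs in the defaults):

```
import torch
from torch.ao.quantization import QConfig, QConfigMapping, default_observer
from torch.ao.quantization.observer import PlaceholderObserver

# Quantize LayerNorm activations but leave the weight in float, since the
# quantized layer_norm kernel only supports float weights.
layer_norm_qconfig = QConfig(
    activation=default_observer,
    weight=PlaceholderObserver.with_args(dtype=torch.float),
)
qconfig_mapping = QConfigMapping().set_object_type(torch.nn.LayerNorm, layer_norm_qconfig)
```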

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
```

Reviewers:

Subscribers:

Tasks: Fixes https://github.com/pytorch/pytorch/issues/83110

Tags: quant, fx

Differential Revision: [D39395141](https://our.internmc.facebook.com/intern/diff/D39395141)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84203
Approved by: https://github.com/jerryzh168
2022-09-12 16:32:15 +00:00
Jerry Zhang
214a6500e3 [quant][docs] Additional fixes for quantize_fx docs (#84587)
Summary:
Some more clarifications for the arguments, including linking to the object docs (QConfigMapping, BackendConfig) and adding types in the docs.

Test Plan:
```
cd docs
make html
```
and

visual inspection for the generated docs

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84587
Approved by: https://github.com/vkuzo
2022-09-09 15:23:23 +00:00
Jerry Zhang
446edadd95 [quant][fx] Follow up fixes for qconfig validations for fixedqparams ops (#81010)
Summary:
This adds a few things on top of https://github.com/pytorch/pytorch/pull/80184:
1). node.target was previously assumed to be "tanh", torch.nn.Tanh, etc.; this PR handles it properly
2). adds FixedQParamsFakeQuantize support
3). extends the comparison function _partial_wrapper_equals to work with FakeQuantize.with_args(observer=...) (a small sketch follows this list)
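
For illustration only, a QAT-style qconfig for a fixed-qparams op such as torch.nn.Sigmoid might be built like the following; the observer and fake-quant names here are assumptions based on the public API, not copied from this PR:

```
import torch
from torch.ao.quantization import QConfig, default_weight_fake_quant
from torch.ao.quantization.fake_quantize import FixedQParamsFakeQuantize
from torch.ao.quantization.observer import default_fixed_qparams_range_0to1_observer

# Activation fake-quant built from a fixed-qparams observer; two such
# with_args(observer=...) partials are what _partial_wrapper_equals
# now has to compare correctly.
sigmoid_act_fq = FixedQParamsFakeQuantize.with_args(
    observer=default_fixed_qparams_range_0to1_observer
)
sigmoid_qat_qconfig = QConfig(activation=sigmoid_act_fq, weight=default_weight_fake_quant)
```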

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D37735193](https://our.internmc.facebook.com/intern/diff/D37735193)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81010
Approved by: https://github.com/andrewor14
2022-07-14 18:06:23 +00:00
Andrew Or
c44317704a [Quant][fx] Add default configs for fixed qparams ops (#80184)
Summary: This commit adds qconfigs with special observers for fixed
qparams ops in get_default_qconfig_mapping and
get_default_qat_qconfig_mapping. For correctness, we also require
users to use these special observers if we detect these fixed
qparams ops in prepare.
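
For example, after this change the default mapping carries dedicated entries for these ops, which can be inspected roughly as below (assuming the object_type_qconfigs attribute exposed by QConfigMapping; the exact observers installed are determined by the mapping itself):

```
import torch
from torch.ao.quantization import get_default_qconfig_mapping

qconfig_mapping = get_default_qconfig_mapping("fbgemm")
# Fixed-qparams ops such as Sigmoid/Tanh get their own qconfigs with
# special (fixed-qparams) observers rather than inheriting the global one.
print(qconfig_mapping.object_type_qconfigs[torch.nn.Sigmoid])
print(qconfig_mapping.object_type_qconfigs[torch.nn.Tanh])
```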

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo

Differential Revision: [D37396379](https://our.internmc.facebook.com/intern/diff/D37396379)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80184
Approved by: https://github.com/jerryzh168
2022-06-29 23:07:26 +00:00
Andrew Or
61a1eef7fc [Quant][fx] Add get_default_qconfig_mapping
Summary: This follows https://github.com/pytorch/pytorch/pull/78452,
which replaced the qconfig_dict with QConfigMapping. This PR
additionally replaces get_default_*qconfig_dict with
get_default_*qconfig_mapping. For backward compatibility, we
deprecate the old functions instead of removing them.
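
A rough before/after sketch; the model and inputs are placeholders, and the deprecated function is shown commented out since its availability depends on your PyTorch version:

```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx

model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()).eval()
example_inputs = (torch.randn(1, 4),)

# New: get_default_qconfig_mapping returns a QConfigMapping object.
prepared = prepare_fx(model, get_default_qconfig_mapping("fbgemm"), example_inputs)

# Old style (deprecated by this PR, still accepted for now):
#   from torch.ao.quantization import get_default_qconfig_dict
#   prepared = prepare_fx(model, get_default_qconfig_dict("fbgemm"), example_inputs)
```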

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo, supriyar

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79618

Approved by: https://github.com/jerryzh168
2022-06-16 16:10:14 +00:00
Andrew Or
c7b4eec233 [Quant][fx][bc-breaking] Replace qconfig_dict with a config object (#78452)
**Summary:** Previously, FX graph mode quantization configurations
were specified through a dictionary of qconfigs. However, this
API was not in line with other core APIs in PyTorch. This commit
replaces this dictionary with a config object that users will
create and pass to prepare and convert. This leads to better
type safety and better user experience in notebook settings
due to improved auto completion.

The new API is as follows:

```
from torch.ao.quantization import QConfigMapping
from torch.ao.quantization.quantize_fx import prepare_fx

qconfig_mapping = (QConfigMapping()
    .set_global(qconfig)
    .set_object_type(torch.nn.Linear, qconfig)
    .set_module_name_regex("foo.*bar", qconfig)
    .set_module_name("mod", qconfig))

prepare_fx(model, qconfig_mapping)
```

For backwards compatibility, `prepare_fx`, `prepare_qat_fx`,
and `convert_fx` will continue to accept qconfig_dicts, which
will be converted to `QConfigMapping` objects internally.
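
For reference, the legacy dict form corresponding to the QConfigMapping above looks roughly like this (same `qconfig` and `model` placeholders as in the snippet above):

```
qconfig_dict = {
    "": qconfig,                                   # global, like set_global()
    "object_type": [(torch.nn.Linear, qconfig)],   # like set_object_type()
    "module_name_regex": [("foo.*bar", qconfig)],  # like set_module_name_regex()
    "module_name": [("mod", qconfig)],             # like set_module_name()
}

prepare_fx(model, qconfig_dict)  # still accepted, triggers a deprecation warning
```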

Note that this commit does not modify existing tests to use the
new API; they will continue to pass in qconfig_dict as before,
which still works but triggers a deprecation warning. This will
be handled in a future commit.

**Test Plan:**
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

**Reviewers:** jerryzh168, vkuzo

**Subscribers:** jerryzh168, vkuzo

Differential Revision: D36747998

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78452
Approved by: https://github.com/jerryzh168
2022-05-30 18:30:07 +00:00