Commit Graph

301 Commits

Xia, Weiwen
a6d3da1835 [Quant] Add int8 linear op impl for quantization PT2E with Inductor. Input is an int8 CPU tensor; weight is an int8 MkldnnCPU tensor. (#105818)
**Summary**
Add a new onednn qlinear op for quantization PT2E with Inductor. Input is an int8 CPU tensor and weight is an int8 MkldnnCPU tensor.

**Test plan**
python test/test_quantization.py -k test_qlinear_pt2e

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105818
Approved by: https://github.com/jgong5, https://github.com/leslie-fang-intel, https://github.com/jerryzh168
2023-08-27 08:13:12 +00:00
leslie-fang-intel
9319dd1c7c [Quant][Inductor] Enable the lowering of quantized maxpool2d (#105906)
**Summary**
Enable the `dq-maxpool2d-q` pattern match and lower into `torch.ops.quantized.max_pool2d`.
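
For context, a minimal sketch of the pattern and its lowered form, assuming per-tensor quantization; the `torch.ops.quantized.max_pool2d` argument layout shown is illustrative:

```python
import torch
import torch.nn.functional as F

x_q = torch.quantize_per_tensor(
    torch.randn(1, 3, 8, 8), scale=0.1, zero_point=0, dtype=torch.quint8
)

# Pattern before lowering: dequantize -> max_pool2d -> quantize
y_ref = torch.quantize_per_tensor(
    F.max_pool2d(x_q.dequantize(), kernel_size=2),
    scale=0.1, zero_point=0, dtype=torch.quint8,
)

# After the pattern match, the three nodes collapse into one quantized op
y_q = torch.ops.quantized.max_pool2d(x_q, [2, 2], [2, 2], [0, 0], [1, 1], False)
```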

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_qmaxpool2d
python -m pytest test_quantized_op.py -k test_max_pool2d_pt2e
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105906
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639
2023-08-26 08:36:47 +00:00
leslie-fang-intel
8ef057255d [Quant][PT2E] Enable qconv for quantization 2.0 export (#104580)
**Summary**
Enable the `qconv1d/2d/3d`, `qconv2d_relu`, `qconv2d_add`, and `qconv2d_add_relu` operators for quantization 2.0 export with the oneDNN library.

**Test Plan**
```
python -u -m pytest -s -v test_quantized_op.py -k test_qconv1d_pt2e
python -u -m pytest -s -v test_quantized_op.py -k test_qconv2d_pt2e
python -u -m pytest -s -v test_quantized_op.py -k test_qconv3d_pt2e
python -u -m pytest -s -v test_quantized_op.py -k test_qconv2d_relu_pt2e
python -u -m pytest -s -v test_quantized_op.py -k test_qconv2d_add_pt2e
python -u -m pytest -s -v test_quantized_op.py -k test_qconv2d_add_relu_pt2e
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104580
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-08-25 17:34:45 +00:00
vasiliy
61fe49b8ed pt2: make aot_eager backend handle basic float8 operations (#107783)
Summary:

Reland of https://github.com/pytorch/pytorch/pull/107642 with a fix for tests on Windows.

Makes the aot_eager backend of torch.compile handle basic float8 operations.

This is useful for float8 training UX.
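
A minimal sketch of the kind of program this enables, assuming the float8 dtypes added earlier in this log:

```python
import torch

def f8_roundtrip(x):
    # basic float8 ops: casting to and from float8
    return x.to(torch.float8_e5m2).to(torch.float32)

compiled = torch.compile(f8_roundtrip, backend="aot_eager")
out = compiled(torch.randn(4, 4))
```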

Test Plan:

```
python test/test_quantization.py -k test_pt2_traceable_aot_eager
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107783
Approved by: https://github.com/albanD
2023-08-23 18:10:53 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it so that it stays that way. :)
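
For reference, a sketch of the quadratic pattern RUF017 flags and a linear alternative (one of the fixes ruff suggests):

```python
import functools
import operator

lists = [[1, 2], [3], [4, 5]]

# Quadratic: sum() builds a new intermediate list for every element of `lists`
flat_slow = sum(lists, [])

# Linear: the accumulator list is mutated in place instead of copied
flat_fast = functools.reduce(operator.iadd, lists, [])
```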

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
5025fb9213 Revert "pt2: make aot_eager backend handle basic float8 operations (#107642)"
This reverts commit 24147a8e1c.

Reverted https://github.com/pytorch/pytorch/pull/107642 on behalf of https://github.com/huydhn due to Sorry for reverting this, but it is failing the Windows CPU test in trunk. The Windows failures on your PR look related, I think ([comment](https://github.com/pytorch/pytorch/pull/107642#issuecomment-1688999380))
2023-08-22 22:17:36 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was probably accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
vasiliy
24147a8e1c pt2: make aot_eager backend handle basic float8 operations (#107642)
Summary:

Makes the aot_eager backend of torch.compile handle basic float8 operations.

This is useful for float8 training UX.

Test Plan:

```
python test/test_quantization.py -k test_pt2_traceable_aot_eager
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107642
Approved by: https://github.com/albanD
2023-08-22 18:57:14 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it so that it stays that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
Jason Lu
bc88028e8e Back out "Reland "Make adding buffers more like adding parameters (#104069)" (#106224)" (#106743)
Summary:
Original commit changeset: 81319beb97f3

Original Phabricator Diff: D47961182

Test Plan: revert to maintain backward compatibility with the legacy ads_dper3 production package. Read details in S357822.

Reviewed By: atuljangra

Differential Revision: D48131623

@diff-train-skip-merge
(D48131623 landed internally)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106743
Approved by: https://github.com/malfet
2023-08-08 15:27:34 +00:00
Sergii Dymchenko
af37608276 Remove duplicate ops tests in test_quantized_op.py (#106398)
The duplicates are after https://github.com/pytorch/pytorch/pull/94170
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106398
Approved by: https://github.com/izaitsevfb, https://github.com/malfet, https://github.com/jerryzh168
2023-08-02 02:37:36 +00:00
Mikayla Gawarecki
d8e5f2aa6d Reland "Make adding buffers more like adding parameters (#104069)" (#106224)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106224
Approved by: https://github.com/atalman, https://github.com/albanD
2023-07-31 17:18:56 +00:00
vasiliy
8b34fa5e9b add basic cuda support for float8 dtypes (#105807)
Summary:

Ensures that creating tensors, copying, filling with zeroes, and checking for NaN work on CUDA for the `float8` dtypes. This should be enough for float8 emulation on CUDA.

Note that I skipped the mul test - it's less trivial to add (it needs a new C++ macro), and there is no use case for it yet. We can follow up on that in the future.
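
A minimal sketch of the covered surface, assuming a CUDA device is available:

```python
import torch

x = torch.randn(4, device="cuda").to(torch.float8_e5m2)  # creation / casting
y = torch.empty_like(x)
y.copy_(x)             # copying
y.zero_()              # filling with zeroes
print(torch.isnan(y))  # nan check
```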

Test Plan:

```
python test/test_quantization.py TestFloat8Dtype
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105807
Approved by: https://github.com/ezyang, https://github.com/jerryzh168, https://github.com/albanD
2023-07-25 03:43:36 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow-up in the pyupgrade series, converting more strings to f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`
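
For context, the kind of rewrite flynt performs (illustrative names):

```python
rank, world_size = 0, 8

# before
msg = "rank {} of {}".format(rank, world_size)

# after flynt
msg = f"rank {rank} of {world_size}"
```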

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
PaliC
9760ea58a3 fix lint (#105675)
Forward fix for the lint issues introduced by https://github.com/pytorch/pytorch/pull/104242.
We are forward-fixing because that PR contains Meta-internal changes that would be tricky to revert smoothly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105675
Approved by: https://github.com/jerryzh168, https://github.com/albanD, https://github.com/atalman
2023-07-20 18:42:25 +00:00
Amadeusz Skrzypczak
b64bd4a5dd Add torch.float8_e5m2 and torch.float8_e4m3 data types (#104242)
Proposal of two float8 variants, e5m2 and e4m3, based on https://arxiv.org/pdf/2209.05433.pdf

Hide all Float8 operator implementations behind `#if !defined(C10_MOBILE)` guard to keep Android build size almost unchanged

TODO:
 - Refactor duplicated code
 - Cleanup unbalanced pragma pop in dtype utils
 - Add native implementation on the CUDA side
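
For reference, the two variants split their eight bits as noted below; the round-trip sketch uses only `torch.float8_e5m2`, one of the names this PR adds:

```python
import torch

# e5m2: 1 sign bit, 5 exponent bits, 2 mantissa bits -> wider dynamic range
# e4m3: 1 sign bit, 4 exponent bits, 3 mantissa bits -> more precision

x = torch.tensor([0.1, 1.0, 3.14])
# the coarse 2-bit mantissa is visible after a round trip through e5m2
print(x.to(torch.float8_e5m2).to(torch.float32))
```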

Co-authored-by: Nikita Shulga <nshulga@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104242
Approved by: https://github.com/albanD
2023-07-20 16:09:11 +00:00
PyTorch MergeBot
f2b15772ff Revert "Add torch.float8_e5m2 and torch.float8_e4m3 data types (#104242)"
This reverts commit a9804130e5.

Reverted https://github.com/pytorch/pytorch/pull/104242 on behalf of https://github.com/PaliC due to breaks lint (run lintrunner and remerge) ([comment](https://github.com/pytorch/pytorch/pull/104242#issuecomment-1644150284))
2023-07-20 15:37:53 +00:00
Amadeusz Skrzypczak
a9804130e5 Add torch.float8_e5m2 and torch.float8_e4m3 data types (#104242)
Proposal of two float8 variants, e5m2 and e4m3, based on https://arxiv.org/pdf/2209.05433.pdf

Hide all Float8 operator implementations behind `#if !defined(C10_MOBILE)` guard to keep Android build size almost unchanged

TODO:
 - Refactor duplicated code
 - Cleanup unbalanced pragma pop in dtype utils
 - Add native implementation on the CUDA side

Co-authored-by: Nikita Shulga <nshulga@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104242
Approved by: https://github.com/albanD
2023-07-20 09:45:45 +00:00
Andrey Talman
c6653b65d8 Back out "Make adding buffers more like adding parameters (#104069)" (#105581)
Summary:
D47537831 is breaking pyper tests: https://fb.workplace.com/groups/802176577445480/posts/1018902842439518/

with `TypeError: register_buffer() takes 3 positional arguments but 4 were given`

Original commit changeset: d4b4069fbd38

Original Phabricator Diff: D47537831

Test Plan:
```
buck2 run //caffe2/torch/fb/training_toolkit/integration_tests/training_lifecycle/cogwheel_tests/pyper_release_v2:cogwheel_smallworld_inline_cvr_infer_pyper_pyper__canary_offline_training-launcher -- --run-harness-in-tupperware --build-fbpkg ads_dper3 --build-fbpkg training_platform
```

Reviewed By: atalman

Differential Revision: D47600140

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105581
Approved by: https://github.com/mikaylagawarecki
2023-07-20 03:39:53 +00:00
Justin Chu
73e1455327 [BE] Enable ruff's UP rules and autoformat test/ (#105434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105434
Approved by: https://github.com/albanD
2023-07-19 20:36:06 +00:00
ekamiti
32d422f335 Make adding buffers more like adding parameters (#104069)
Add semantics for creating a buffer object similar to creating a parameter. This is done by introducing a new `Buffer` class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same, as the `register_buffer` method has not been changed. The `persistent` parameter on the `Buffer` type indicates whether a buffer object should be persistent or not. The other non-test changes get the new `Buffer` type recognized by inductor and dynamo. The remaining test changes make sure that the `Buffer` type can be used as a drop-in replacement for `register_buffer`, since it just leads to `register_buffer` being called. The new functionality still allows normal tensors to be used as buffers, so these changes are intended to be backwards compatible.
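
A sketch of the intended usage per the description, assuming `Buffer` is exposed as `torch.nn.Buffer` (note the back-outs earlier in this log):

```python
import torch
import torch.nn as nn

class RunningStats(nn.Module):
    def __init__(self):
        super().__init__()
        # equivalent to: self.register_buffer("mean", torch.zeros(10), persistent=False)
        self.mean = nn.Buffer(torch.zeros(10), persistent=False)

m = RunningStats()
print([name for name, _ in m.named_buffers()])  # ['mean']
```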

Fixes #35735

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104069
Approved by: https://github.com/mikaylagawarecki
2023-07-17 17:59:05 +00:00
HDCharles
8176cd8c0f [ao] fixing quantized prelu workflow (#103455)
Summary: https://github.com/pytorch/pytorch/issues/100654 noticed that prelu
was not running its observers when the quantization flow was run. This bug
is now fixed, and the relevant prelu tests now check for it. Also added a
corrected observer for PReLU to qconfig_mapping.

Test Plan: python test/test_quantization.py TestStaticQuantizedModule.test_prelu

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103455
Approved by: https://github.com/jerryzh168
2023-06-23 16:45:40 +00:00
Omkar Salpekar
ae1ed27756 [codemod][numpy] replace np.str with str (#103931)
Summary:
`np.str` was removed in numpy 1.20.0. It was an alias for the builtin `str`, so the replacement is safe.

The whole change is mechanical, generated using the following one-liner:
```
fbgr -sl 'np\.str\b' | xargs perl -pi -e 's,\bnp\.str\b,str,g'
```

Test Plan: sandcastle

Differential Revision: D46586144

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103931
Approved by: https://github.com/huydhn
2023-06-21 18:16:42 +00:00
leslie-fang-intel
488a4303a5 Enable quantized_max_pool3d (#101654)
**Summary**
Enable the `quantized_max_pool3d` kernel to fix issue https://github.com/pytorch/pytorch/issues/101386.

**Test Plan**
```
clear && python -u -m pytest -s -v test_quantized_op.py -k test_max_pool3d
clear && python -u -m pytest -s -v test_quantized_op.py -k test_max_pool3d_nhwc
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101654
Approved by: https://github.com/albanD, https://github.com/jgong5, https://github.com/mingfeima
2023-05-23 00:45:38 +00:00
Aaron Gokaslan
616208b4fe [BE]: Cleanup deprecated stdlib imports (UP006,UP035) (#101361)
Automated fix to clean up some deprecated/useless Python imports.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101361
Approved by: https://github.com/zou3519
2023-05-15 14:32:41 +00:00
Aaron Gokaslan
e2a3817dfd [BE] Enable C419 rule for any all shortcircuiting (#99890)
Apparently https://github.com/pytorch/pytorch/pull/78142 made torch.JIT allow simple generator expressions, which lets us enable rules that replace unnecessary list comprehensions with generators in any/all calls. This was originally part of #99280, but I split it off into this PR so that it can be easily reverted should anything break.
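
For reference, the rewrite C419 performs:

```python
nums = [1, -2, 3]

# before: materializes the whole list before any() can short-circuit
has_neg = any([n < 0 for n in nums])

# after C419: lazy generator expression, short-circuits on the first hit
has_neg = any(n < 0 for n in nums)
```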

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99890
Approved by: https://github.com/justinchuby, https://github.com/kit1980, https://github.com/malfet
2023-04-25 15:02:13 +00:00
Justin Chu
79c9e82e27 Fix flake8 lint errors reported by ruff - take 2 (#99798)
Replaces #99784. This PR is pure autofix.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99798
Approved by: https://github.com/Skylion007, https://github.com/kit1980
2023-04-23 23:09:51 +00:00
Vasiliy Kuznetsov
cdab1d676c pt2e short term quant: respect qmin/qmax for linear weight (#96232)
Summary:

Makes the `nnqr.Linear` module respect the qmin/qmax attributes of weight observer.  This is to unblock some customer teams who are depending on non-default values of these attributes.
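
A sketch of a weight observer configured with a non-default range, the attributes the module now respects (assuming the standard `MinMaxObserver` kwargs):

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver

obs = MinMaxObserver(
    dtype=torch.qint8,
    qscheme=torch.per_tensor_symmetric,
    quant_min=-127,  # non-default qmin
    quant_max=127,   # non-default qmax
)
obs(torch.randn(8, 4))  # observe some "weight" values
scale, zero_point = obs.calculate_qparams()
```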

Test plan:

```
python test/test_quantization.py -k TestReferenceQuantizedModule.test_linear_decomposed
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96232
Approved by: https://github.com/andrewor14
2023-03-10 04:46:20 +00:00
vasiliy
dc70e8175f Add various uninterpreted bit tensor data types (try 2) (#95860)
Summary:

This is a retry of https://github.com/pytorch/pytorch/pull/94992 which was reverted due to CI issues.

This PR adds a set of uninterpreted data types to PyTorch which can be used to implement experimental functionality out of core (think fp8, int4, int16 quant, etc.).
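
A hedged sketch, assuming the `torch.bits8` name from the `TestBits` suite; these dtypes carry no arithmetic semantics:

```python
import torch

raw = torch.arange(4, dtype=torch.uint8)
bits = raw.view(torch.bits8)  # reinterpret storage; no math is defined on it
print(bits.dtype)
```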

@bypass-github-export-checks

Test Plan:

```
python test/test_quantization.py -k TestBits
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95860
Approved by: https://github.com/atalman
2023-03-04 03:35:59 +00:00
leslie-fang-intel
d1a168f176 equal_quantized_cpu requires both inputs to be quantized tensors (#95875)
**Summary**
Fix the issue https://github.com/pytorch/pytorch/issues/95291: `equal_quantized_cpu` requires that both inputs be quantized tensors.

**Test Plan**
```
python -m pytest test_quantization.py -k test_quantized_equal
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95875
Approved by: https://github.com/vkuzo, https://github.com/jgong5
2023-03-03 05:33:23 +00:00
Jacob Szwejbka
fc324d3485 [quant][pt2e] Add support for dynamic quantization with symmetric quant for input (#94854)
Summary:
Previously we assumed asymmetric quantization for dynamic quantization; this diff adds support for symmetric quantization
of the input in dynamic quantization.
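
A sketch of the difference using the standard int8 formulas; the exact rounding in the ExecuTorch lowering may differ:

```python
import torch

x = torch.randn(16)

# asymmetric (previous assumption): zero_point shifts to cover [min, max]
scale_a = (x.max() - x.min()) / 255.0
zp_a = int((-128 - (x.min() / scale_a)).round().clamp(-128, 127))

# symmetric (added here): range centered at zero, zero_point fixed at 0
scale_s = x.abs().max() / 127.0
zp_s = 0
```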

Test Plan: buck run executorch/exir/tests:quant_lowering_custom_backend_pass -- "executorch.exir.tests.test_quant_lowering_custom_backend_pass.TestQuantLoweringCustomBackendPass.test_quantized_linear_dynamic"

Reviewed By: digantdesai

Differential Revision: D43134794

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94854
Approved by: https://github.com/digantdesai
2023-02-28 19:39:31 +00:00
leslie-fang-intel
d89bfa16e7 [quant] add serialization method for quantized hardswish (#94486)
**Summary**
Fix the issue https://github.com/pytorch/pytorch/issues/91877. The root cause is that the serialization and deserialization methods for `state_dict` were not enabled for `QuantizedHardswish`. This PR adds these methods.

**Test plan**
```
python -m pytest quantization/core/test_quantized_module.py -k test_hard_swish
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94486
Approved by: https://github.com/jgong5, https://github.com/vkuzo
2023-02-24 04:43:27 +00:00
PyTorch MergeBot
3bafecf719 Revert "Add various uninterpreted bit tensor data types (#94992)"
This reverts commit 9dbfca7840.

Reverted https://github.com/pytorch/pytorch/pull/94992 on behalf of https://github.com/atalman due to breaks libtorch windows nightly builds see: https://github.com/pytorch/pytorch/pull/95406
2023-02-23 23:54:23 +00:00
Yuxin Wu
9bb2fe3eae fix numpy1.24 deprecations in unittests (#93997)
Fixes https://github.com/pytorch/pytorch/issues/91329

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93997
Approved by: https://github.com/ngimel, https://github.com/jerryzh168
2023-02-18 00:59:09 +00:00
vasiliy
9dbfca7840 Add various uninterpreted bit tensor data types (#94992)
Summary:

This PR adds a set of uninterpreted data types to PyTorch which can be used to implement experimental functionality out of core (think fp8, int4, int16 quant, etc.).

Note: this is a copy-pasta of https://github.com/pytorch/pytorch/pull/89990 with a bug fix for clang9; easier to just put up another PR since I'm not sure how commandeering works with Meta-only changes.

@bypass-github-export-checks

Test Plan:

```
python test/test_quantization.py -k TestBits
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94992
Approved by: https://github.com/angelayi
2023-02-18 00:04:30 +00:00
Xuehai Pan
046e88a291 [BE] [3/3] Rewrite super() calls in test (#94592)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Some cases where the rewrite would change the semantics are kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94592
Approved by: https://github.com/ezyang, https://github.com/seemethere
2023-02-12 22:20:53 +00:00
Aaron Gokaslan
67d9790985 [BE] Apply almost all remaining flake8-comprehension checks (#94676)
Applies the remaining flake8-comprehension fixes and checks. This change replaces all remaining unnecessary generator expressions with list/dict/set comprehensions, which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as `set(a for a in b)`, resolving them into just the `set` call.
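
For reference, the shape of the autofix:

```python
b = [1, 2, 2, 3]

# before: useless generator passed to set()
s = set(a for a in b)

# after: resolved into just the set call (or a set comprehension)
s = set(b)
```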

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676
Approved by: https://github.com/ezyang
2023-02-12 01:01:25 +00:00
Aaron Gokaslan
3ce1ebb6fb Apply some safe comprehension optimizations (#94323)
Optimize away unnecessary collection casts, unnecessary calls to list, tuple, and dict, and simplify calls to the sorted builtin. This should strictly improve speed and readability.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94323
Approved by: https://github.com/albanD
2023-02-07 23:53:46 +00:00
Aaron Gokaslan
8fce9a09cd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308)
Apply parts of pyupgrade to torch (starting with the safest changes).
This PR only does two things: removes the need to inherit from object and removes unused future imports.
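
For reference, the two rewrites in question:

```python
# before (Python 2 idioms):
#   from __future__ import absolute_import
#   class Foo(object): ...

# after pyupgrade: the future import is deleted and `object` is implicit
class Foo:
    ...
```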

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-07 21:10:56 +00:00
Vasiliy Kuznetsov
f15ab8a7f2 AO migration: replace torch internal callsites (#94170)
Summary:

Do the following renames:
`torch.quantization` -> `torch.ao.quantization`
`torch.nn.quantized` -> `torch.ao.nn.quantized`
`torch.nn.quantizable` -> `torch.ao.nn.quantizable`
`torch.nn.qat` -> `torch.ao.nn.qat`
`torch.nn.intrinsic` -> `torch.ao.nn.intrinsic`

And then, do
`torch.ao.nn.quantized._reference` -> `torch.ao.nn.quantized.reference` to clean up the aftermath of https://github.com/pytorch/pytorch/pull/84974

Then, manually update `test/test_module_init.py` to fix hanging whitespace due to the replace.

Run this script to do the replacements: https://gist.github.com/vkuzo/7f7afebf8c31b9ba48306223e68a1c82

This is for https://github.com/pytorch/pytorch/issues/81667

Test plan: CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94170
Approved by: https://github.com/jerryzh168
2023-02-07 02:32:23 +00:00
Xia, Weiwen
eea752f853 [Quant][ONEDNN] Fix weight reorder issue for grouped convolution (#91934)
**Summary**
For onednn quant backend only.
QConv weight may be reordered to another blocked format if the input shape changes at runtime. It was a bug that group info was not retained for such reordering, which could lead to a wrong weight shape after reordering. This PR fixes that bug.

**Test plan**
python test/test_quantization.py -k test_conv_reorder_issue_onednn

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91934
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-02-01 07:43:53 +00:00
leslie-fang-intel
0f802eedc2 [Quant][FX] Lower QConvAddReLU2d for onednn backend (#91155)
**Summary**
Add quantization mappings for QConvAddReLU2d for int8 inference with the onednn backend. The fusion and lowering are supported only in FX mode.

**Test plan**
```
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_onednn
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_by_default
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_lowering
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91155
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-02-01 01:18:52 +00:00
leslie-fang-intel
e77f28a03d [Quant] Add fused ConvAddReLU2d module for onednn backend (#91154)
**Summary**
Post-op fusion can reduce data movement overhead and improve inference performance. This PR adds a fused ConvAddReLU2d module for the onednn backend, to be used for int8 inference with that backend. Calling this module with other quantization backends throws an error.
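
A sketch of the float-space computation that the fused module executes as a single int8 kernel:

```python
import torch
import torch.nn.functional as F

conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
x = torch.randn(1, 3, 8, 8)
extra = torch.randn(1, 8, 8, 8)

# conv -> add -> relu, the three-op pattern ConvAddReLU2d fuses
out = F.relu(conv(x) + extra)
```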

**Test plan**
```
python -m pytest test_quantization.py -k test_conv2d_add_relu
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91154
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-02-01 01:16:23 +00:00
leslie-fang-intel
ef4118e435 [Quant][FX] Lower QConvAdd2d for onednn backend (#91153)
**Summary**
Add quantization mappings for QConvAdd2d for int8 inference with the onednn backend. The fusion and lowering are supported only in FX mode.

**Test plan**
```
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_onednn
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_by_default
python -m pytest test_quantization.py -k test_fuse_conv_bn_add_relu_lowering
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91153
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-02-01 01:14:12 +00:00
leslie-fang-intel
53c3555a6a [Quant] Add fused ConvAdd2d module for onednn backend (#91152)
**Summary**
Post-op fusion can reduce data movement overhead and improve inference performance. This PR adds a fused `ConvAdd2d` module for the onednn backend, to be used for int8 inference with that backend. Calling this module with other quantization backends throws an error.

**Test plan**
```
python -m pytest test_quantization.py -k test_conv2d_add
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91152
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-02-01 01:11:25 +00:00
leslie-fang-intel
21c7c7c72f [Quant] Use the true src zero point to query and create conv pd (#90818)
**Summary**
Previously, we used `DNNL_RUNTIME_S32_VAL` as the `zero point` for `src` in both weight prepack and convolution forward, to ensure the same blocked format of weight was used. The problem is that `DNNL_RUNTIME_S32_VAL` may query out a different blocked weight format than the true `zero point` for `src` would, which sends oneDNN convolution down the `jit` path instead of the `brgconv` path. Now we use the true `zero point` for `src` to create the pd, and reorder the weight if a different blocked format than the prepacked one is queried.

**Test Plan**
```
python -m pytest quantization/core/test_quantized_op.py::TestQuantizedConv::test_conv_transpose_reorder_issue_onednn
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90818
Approved by: https://github.com/Xia-Weiwen, https://github.com/jgong5, https://github.com/jerryzh168
2023-01-31 01:23:41 +00:00
leslie-fang-intel
a71d9a928f [Quant] Add fused conv2d_add_relu op for onednn backend (#90364)
**Summary**
Post-op fusion can reduce data movement overhead and improve inference performance. This PR adds a fused conv2d_add_relu op for the onednn backend, to be used for int8 inference with that backend. Calling this op with other quantization backends throws an error.

**Test Plan**
```
python -m pytest test_quantization.py::TestQuantizedConv
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90364
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-01-31 01:20:50 +00:00
leslie-fang-intel
a62fc09a1f [Quant] Add fused conv2d_add op for onednn backend (#90262)
**Summary**
Post-op fusion can reduce data movement overhead and improve inference performance. This PR adds a fused `conv2d_add` op for the onednn backend, to be used for int8 inference with that backend. Calling this op with other quantization backends throws an error.

**Test Plan**
```
python -m pytest test_quantization.py::TestQuantizedConv
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90262
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2023-01-28 06:30:29 +00:00
Jerry Zhang
61457671a5 [quant][fx][be] Remove _input_output_observed from backend_config (#92589)
Summary:
This is no longer needed; we can use dtype to decide whether an observer is needed or not.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92589
Approved by: https://github.com/jcaip
2023-01-27 22:17:05 +00:00
Kwanghoon An
a2e0f8e529 [ FL-gradient quantization] Adding QNN unpack feature (#92714)
Summary: We are adding a new feature for quantized gradient computation that enables the backward() function for QNNPACK.

Test Plan: buck2 test //caffe2/test/quantization:quantization -- test_qlinear_qnnpack_free_memory_and_unpack

Differential Revision: D40927291

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92714
Approved by: https://github.com/digantdesai, https://github.com/jianyuh
2023-01-27 05:37:03 +00:00