Commit Graph

71 Commits

Sergii Dymchenko
f51f6aa387 Fix non-existing parameters in docstrings (#90505)
Continuation after https://github.com/pytorch/pytorch/pull/90163.

Here is a script I used to find all the non-existing arguments in the docstrings (the script can give false positives in presence of *args/**kwargs or decorators):

_Edit:_
I've realized that the indentation of the last `break` matters: in the version I originally ran, it sat
outside the `if` block, so the script only gave output for a function if its first docstring argument
was wrong. The listing below shows the `break` correctly indented; I'll create a separate PR if the
corrected script finds more issues.

```python
import ast
import os

import docstring_parser

for root, dirs, files in os.walk('.'):
    for name in files:
        if root.startswith("./.git/") or root.startswith("./third_party/"):
            continue
        if name.endswith(".py"):
            full_name = os.path.join(root, name)
            with open(full_name, "r") as source:
                tree = ast.parse(source.read())
                for node in ast.walk(tree):
                    if isinstance(node, ast.FunctionDef):
                        # Collect every kind of parameter the function accepts
                        # (copy the list so the AST node is not mutated in place).
                        all_node_args = list(node.args.args)
                        if node.args.vararg is not None:
                            all_node_args.append(node.args.vararg)
                        if node.args.kwarg is not None:
                            all_node_args.append(node.args.kwarg)
                        if node.args.posonlyargs is not None:
                            all_node_args.extend(node.args.posonlyargs)
                        if node.args.kwonlyargs is not None:
                            all_node_args.extend(node.args.kwonlyargs)
                        args = [a.arg for a in all_node_args]
                        # Parse the docstring and keep only the leading identifier
                        # of each documented argument name.
                        docstring = docstring_parser.parse(ast.get_docstring(node))
                        doc_args = [a.arg_name for a in docstring.params]
                        clean_doc_args = []
                        for a in doc_args:
                            clean_a = ""
                            for c in a.split()[0]:
                                if c.isalnum() or c == '_':
                                    clean_a += c
                            if clean_a:
                                clean_doc_args.append(clean_a)
                        doc_args = clean_doc_args
                        # Report a function (once) if any documented argument is
                        # not an actual parameter.
                        for a in doc_args:
                            if a not in args:
                                print(full_name, node.lineno, args, doc_args)
                                break

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90505
Approved by: https://github.com/malfet, https://github.com/ZainRizvi
2022-12-09 21:43:09 +00:00
HDCharles
cee396fa07 [ao][ns] PNP demo for exposing arbitrary model transforms (#90153)
Adds a way to use arbitrary prepare and convert functions with PNP.

Note: this is a recreation of https://github.com/pytorch/pytorch/pull/89892, which was reverted because the landing did not sync between GitHub and fbcode.

python test/test_quantization.py TestFxNumericSuiteNShadows.test_custom_functions_and_tracer

Differential Revision: [D41723892](https://our.internmc.facebook.com/intern/diff/D41723892/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90153
Approved by: https://github.com/vkuzo
2022-12-06 04:24:54 +00:00
HDCharles
9013c92a9f [ao] making QConfigMapping print in a user friendly way (#89932)
Summary: Added `__repr__` to QConfigMapping and QConfigMultiMapping, loosely based on the `__repr__` of BaseSparsifier.

Example output:

```
>>> import torch
>>> print(torch.ao.quantization.qconfig_mapping.get_default_qconfig_mapping())
QConfigMapping (
 global_qconfig
  QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
 object_type_qconfigs
  reshape: QConfig(activation=<class 'torch.ao.quantization.observer.ReuseInputObserver'>, weight=<class 'torch.ao.quantization.observer.NoopObserver'>)
  <class 'torch.nn.modules.conv.Conv1d'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <class 'torch.nn.modules.conv.Conv2d'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <class 'torch.nn.modules.conv.Conv3d'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <class 'torch.nn.modules.conv.ConvTranspose1d'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <class 'torch.nn.modules.conv.ConvTranspose2d'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <class 'torch.nn.modules.conv.ConvTranspose3d'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <class 'torch.nn.modules.linear.Linear'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <built-in method conv1d of type object at 0x7f08b99497e0>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <built-in method conv2d of type object at 0x7f08b99497e0>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <built-in method conv3d of type object at 0x7f08b99497e0>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <built-in method conv_transpose1d of type object at 0x7f08b99497e0>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <built-in method conv_transpose2d of type object at 0x7f08b99497e0>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <built-in method conv_transpose3d of type object at 0x7f08b99497e0>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <built-in function linear>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <class 'torch.nn.modules.activation.ReLU'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <function relu at 0x7f08ad57bc10>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <built-in method relu of type object at 0x7f08b99497e0>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <class 'torch.nn.modules.batchnorm.BatchNorm1d'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <class 'torch.nn.modules.batchnorm.BatchNorm2d'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <class 'torch.nn.modules.batchnorm.BatchNorm3d'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=functools.partial(<class 'torch.ao.quantization.observer.PerChannelMinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_channel_symmetric){})
  <function layer_norm at 0x7f08ad57fca0>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=<class 'torch.ao.quantization.observer.PlaceholderObserver'>)
  <class 'torch.nn.modules.normalization.LayerNorm'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.HistogramObserver'>, reduce_range=True){}, weight=<class 'torch.ao.quantization.observer.PlaceholderObserver'>)
  <class 'torch.nn.modules.activation.Hardsigmoid'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.00390625, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <function hardsigmoid at 0x7f08ad57f670>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.00390625, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  hardsigmoid: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.00390625, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  hardsigmoid_: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.00390625, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <class 'torch.nn.modules.activation.Sigmoid'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.00390625, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <built-in method sigmoid of type object at 0x7f08b99497e0>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.00390625, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  sigmoid: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.00390625, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  sigmoid_: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.00390625, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <class 'torch.nn.modules.activation.Softmax'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.00390625, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <class 'torch.nn.modules.activation.Tanh'>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.0078125, zero_point=128, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  <built-in method tanh of type object at 0x7f08b99497e0>: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.0078125, zero_point=128, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  tanh: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.0078125, zero_point=128, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
  tanh_: QConfig(activation=functools.partial(<class 'torch.ao.quantization.observer.FixedQParamsObserver'>, scale=0.0078125, zero_point=128, dtype=torch.quint8, quant_min=0, quant_max=255){}, weight=functools.partial(<class 'torch.ao.quantization.observer.MinMaxObserver'>, dtype=torch.qint8, qscheme=torch.per_tensor_symmetric){})
 module_name_regex_qconfigs
  OrderedDict()
 module_name_qconfigs
  OrderedDict()
 module_name_object_type_order_qconfigs
  OrderedDict()
)
```
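
A minimal sketch of how such a `__repr__` can be structured (illustrative only; the attribute names below match the field labels shown in the output above, but the actual implementation may differ):

```python
class _QConfigMappingReprSketch:
    """Hypothetical mixin illustrating the repr format above: one header line,
    then one indented block per mapping field."""

    def __repr__(self) -> str:
        lines = [f"{type(self).__name__} ("]
        lines.append(" global_qconfig")
        lines.append(f"  {self.global_qconfig}")
        for field in (
            "object_type_qconfigs",
            "module_name_regex_qconfigs",
            "module_name_qconfigs",
            "module_name_object_type_order_qconfigs",
        ):
            mapping = getattr(self, field)
            lines.append(f" {field}")
            if not mapping:
                # Empty fields print as e.g. "OrderedDict()".
                lines.append(f"  {mapping!r}")
            else:
                lines.extend(f"  {key}: {qconfig}" for key, qconfig in mapping.items())
        lines.append(")")
        return "\n".join(lines)
```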

Test Plan: python test/test_quantization.py TestFXNumericSuiteNShadows.test_qconfig_multi_mapping_repr
python test/test_quantization.py TestQuantizeFx.test_qconfig_mapping_repr

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89932
Approved by: https://github.com/vkuzo
2022-12-02 05:24:47 +00:00
andrewor14
d80056312a [Quant][fx][bc-breaking] Rename fx/*patterns.py (#89872)
Summary: This commit renames fx/quantization_patterns.py
to fx/quantize_handler.py, and fx/fusion_patterns.py to
fx/fuse_handler.py. This is because these files contain
only QuantizeHandler and FuseHandler respectively, so the
new names are more descriptive. A future commit will
further break BC by removing all the empty *QuantizeHandler
classes.

BC-breaking notes:

The following classes under the
`torch.ao.quantization.fx.quantization_patterns` namespace
are migrated to the `torch.ao.quantization.fx.quantize_handler`
namespace:
```
QuantizeHandler
BinaryOpQuantizeHandler
CatQuantizeHandler
ConvReluQuantizeHandler
LinearReLUQuantizeHandler
BatchNormQuantizeHandler
EmbeddingQuantizeHandler
RNNDynamicQuantizeHandler
DefaultNodeQuantizeHandler
FixedQParamsOpQuantizeHandler
CopyNodeQuantizeHandler
GeneralTensorShapeOpQuantizeHandler
CustomModuleQuantizeHandler
StandaloneModuleQuantizeHandler
```

The following classes under the
`torch.ao.quantization.fx.fusion_patterns` namespace are
migrated to the `torch.ao.quantization.fx.fuse_handler`
namespace:
```
DefaultFuseHandler
FuseHandler
```
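
For downstream code, the rename is just an import-path update; roughly (assuming the classes are re-exported unchanged from the new modules):

```python
# Before (old module names):
#   from torch.ao.quantization.fx.quantization_patterns import QuantizeHandler
#   from torch.ao.quantization.fx.fusion_patterns import DefaultFuseHandler

# After this commit, the same classes live in the renamed modules:
from torch.ao.quantization.fx.quantize_handler import QuantizeHandler
from torch.ao.quantization.fx.fuse_handler import DefaultFuseHandler
```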

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89872
Approved by: https://github.com/jerryzh168
2022-12-01 17:37:07 +00:00
andrewor14
2bce6d09ee [Quant][fx][bc-breaking] Remove backend_config_utils.py (#89810)
Summary: Previously, under torch/ao/quantization we had
backend_config/utils.py and fx/backend_config_utils.py, which
was confusing. This commit deletes the latter and moves
everything in it to more suitable util files.

BC-breaking note: The following public APIs under the
`torch.ao.quantization.fx.backend_config_utils` namespace
are removed in this commit.

```
get_quantize_handler_cls
get_fusion_pattern_to_fuse_handler_cls
get_native_quant_patterns
get_pattern_to_quantize_handlers
```

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89810
Approved by: https://github.com/jerryzh168
2022-11-29 18:01:40 +00:00
PyTorch MergeBot
9d209e7834 Revert "[ao] making _is_activation_post_process private (#87520)"
This reverts commit 45c62a3377.

Reverted https://github.com/pytorch/pytorch/pull/87520 on behalf of https://github.com/bigfootjon due to Diff reverted internally
2022-11-21 16:48:26 +00:00
HDCharles
45c62a3377 [ao] making _is_activation_post_process private (#87520)
Summary: The same function existed in both observer and quantize; it is now
consolidated into a single function. Note that the two definitions were slightly
different; the consolidated definition is maximally inclusive so that the name of
the function is more accurate.
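
A rough sketch of what a maximally inclusive check can look like (illustrative; the actual consolidated helper may differ):

```python
import torch
from torch.ao.quantization.fake_quantize import FakeQuantizeBase
from torch.ao.quantization.observer import ObserverBase

def _is_activation_post_process(module: torch.nn.Module) -> bool:
    # Maximally inclusive: anything derived from the observer or fake-quantize
    # base classes counts as an activation post-process module.
    return isinstance(module, (ObserverBase, FakeQuantizeBase))
```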

Test Plan: python test/test_public_bindings.py
python test/test_quantization.py

Differential Revision: [D40709276](https://our.internmc.facebook.com/intern/diff/D40709276)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87520
Approved by: https://github.com/jcaip
2022-11-16 21:31:57 +00:00
HDCharles
f7a04f310b [ao][ns] Replacing List[QConfigMapping] in PNP (#86922)
Summary: Added QConfigMultiMapping which is essentially a
List[QConfigMapping] with set methods and dedicated handling to
avoid unwanted matches and improve UX.

Note: the `from __future__ import annotations` line caused strange errors when the
QConfigMultiMapping class was put in _numeric_suite_fx.py, so the class was moved.
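
Conceptually it behaves like a thin wrapper around a list of QConfigMappings; a hypothetical minimal version (not the actual API) could look like this:

```python
from typing import List

from torch.ao.quantization.qconfig_mapping import QConfigMapping

class MultiMappingSketch:
    """Hypothetical illustration: hold several QConfigMappings and apply
    set-style calls to each, so one object describes N qconfigs per layer."""

    def __init__(self, qconfig_mappings: List[QConfigMapping]):
        self.qconfig_mappings = list(qconfig_mappings)

    def set_global(self, qconfigs):
        # One qconfig per inner mapping, e.g. [default_qconfig, default_dynamic_qconfig].
        for mapping, qconfig in zip(self.qconfig_mappings, qconfigs):
            mapping.set_global(qconfig)
        return self
```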

Test Plan: python test/test_quantization.py TestFxNumericSuiteNShadows

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86922
Approved by: https://github.com/vkuzo
2022-10-26 18:56:53 +00:00
Jerry Zhang
761ca20dd8 [quant][be] Rename qconfig_map to node_name_to_qconfig (#86861)
Summary:
As titled: with the introduction of QConfigMapping, this name became very confusing, so it is
renamed to something clearer.

Test Plan:
python test/test_quantization.py TestQuantizeFx

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86861
Approved by: https://github.com/vkuzo
2022-10-15 00:08:36 +00:00
zaf
efccb6401c [quant][ao_migration] nn.intrinsic.qat migration to ao (#86171)
All quantization-related modules are being migrated to `torch.ao`. This migrates the `nn.intrinsic.qat`. Please, see the [tracker](https://github.com/pytorch/pytorch/issues/81667) for the timeline.

```
python test/test_quantization.py TestAOMigrationNNIntrinsic
```

Differential Revision: [D39419993](https://our.internmc.facebook.com/intern/diff/D39419993/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39419993/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86171
Approved by: https://github.com/jerryzh168
2022-10-07 17:29:42 +00:00
Vasiliy Kuznetsov
237316aa1d PNP: early FX numeric suite tool to quantize each layer N times (#80521)
Summary:

This PR is an early prototype of a tool to quantize each layer of a model
N times, with N qconfigs each. We follow the design agreed upon in
https://fburl.com/gdoc/e1gaq3ih .

Current API:

```
m = M().eval()
example_input = (torch.randn(2, 2),)
qconfig_mappings = [
    QConfigMapping().set_global(torch.quantization.default_qconfig),
    QConfigMapping().set_global(torch.quantization.default_dynamic_qconfig),
]
backend_config = get_native_backend_config()

msp = prepare_n_shadows_model(
    m, example_input, qconfig_mappings, backend_config)

for _ in range(2):
    msp(*example_input)

msq = convert_n_shadows_model(msp)
msq(*example_input)

results = extract_results_n_shadows_model(msq)
print_comparisons_n_shadows_model(results)

# example output

subgraph_idx    ref_node_name      best_idx        1        2
--------------  ---------------  ----------  -------  -------
subgraph_0      fc1                       2  42.0834  42.6279
subgraph_1      fc2                       2  43.7259  50.0593
```

Test plan:

```
python test/test_quantization.py -k test_n_shadows
```

Differential Revision: [D37650332](https://our.internmc.facebook.com/intern/diff/D37650332)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80521
Approved by: https://github.com/jerryzh168, https://github.com/andrewor14
2022-10-06 02:30:45 +00:00
Jesse Cai
d144594512 [Quant][fx] Remove WEIGHT_INDEX_DICT and BIAS_INDEX_DICT (Part 2) (#83853)
Summary:
- Finishes the second part of https://github.com/pytorch/pytorch/pull/83263
- Removes WEIGHT_INDEX_DICT and BIAS_INDEX_DICT from utils.py (a simplified sketch of the old dict-based check is shown below)
- Moves two functions, `node_arg_is_weight` and `node_arg_is_bias`, from prepare.py into utils.py; convert.py and _equalize.py now use `node_arg_is_weight` instead of the dictionaries
- Adds in quantization support for `F.groupnorm`
- Adds missing BackendPatternConfigs for layernorm, instancenorm, and groupnorm
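
For context, the removed dictionaries encoded which positional argument of a functional op is the weight (or bias); a simplified sketch of that old dict-based check (illustrative contents, not the exact dictionaries):

```python
import torch.nn.functional as F
from torch.fx import Node

# Simplified stand-in for the removed WEIGHT_INDEX_DICT-style lookup:
# map a functional op to the positions of its weight argument.
_WEIGHT_INDEX = {F.conv1d: [1], F.conv2d: [1], F.conv3d: [1], F.linear: [1]}

def node_arg_is_weight_old_style(node: Node, arg) -> bool:
    # True if `arg` sits at a weight position of the node's target op.
    indices = _WEIGHT_INDEX.get(node.target, [])
    return any(i < len(node.args) and node.args[i] is arg for i in indices)
```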

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

ghstack-source-id: 2b157e0dc4f1553be1f4813b4693db952e6fc558
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83848

Fixes #83093
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83853
Approved by: https://github.com/jerryzh168, https://github.com/andrewor14
2022-08-29 18:08:36 +00:00
zaf
2f04ba2c7c [quant][ao_migration] torch.nn.qat → torch.ao.nn.qat (#78716)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [X] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [X] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [X] [Current PR] `torch.nn.qat` → `torch.ao.nn.qat`
    - [X] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [X] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- None
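
For user code, the migration is an import-path change; for example (the old `torch.nn.qat` path is expected to keep working during the transition, but that is an assumption here, not something this commit message states):

```python
# New location after the migration:
from torch.ao.nn.qat import Linear as QATLinear

# Previously this class was imported from the old namespace:
#   from torch.nn.qat import Linear as QATLinear
```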

Differential Revision: [D36861197](https://our.internmc.facebook.com/intern/diff/D36861197/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861197/)!

Differential Revision: [D36861197](https://our.internmc.facebook.com/intern/diff/D36861197)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78716
Approved by: https://github.com/jerryzh168
2022-08-25 16:50:38 +00:00
zaf
d32a762147 [quant][ao_migration] torch.nn.quantized.dynamic → torch.ao.nn.quantized.dynamic (#78714)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- [Documentation](docs/source/quantization-support.rst) @vkuzo
- [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10
- [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo
- [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a
- [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a

Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)!

Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714
Approved by: https://github.com/jerryzh168
2022-08-25 16:50:34 +00:00
zaf
c92e5ac95b [quant][ao_migration] torch.nn.quantized.modules → torch.ao.nn.quantized.modules (#78713)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] [Current PR] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- Documentation @vkuzo
  - docs/source/conf.py
  - docs/source/quantization.rst
- [quantize_fx](torch/ao/quantization/quantize_fx.py) @jerryzh168
- [common test routine](test/quantization/ao_migration/common.py) @HDCharles
- JIT stuff @jamesr66a
  - torch/csrc/jit/passes/hoist_conv_packed_params.cpp
  - torch/csrc/jit/passes/quantization/helper.h
  - torch/csrc/jit/serialization/import_source.cpp

Differential Revision: [D38926012](https://our.internmc.facebook.com/intern/diff/D38926012/)

Differential Revision: [D38926012](https://our.internmc.facebook.com/intern/diff/D38926012)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78713
Approved by: https://github.com/jerryzh168
2022-08-25 16:50:33 +00:00
Vasiliy Kuznetsov
58170fb8aa Remove DBR quantization from the codebase (#83642)
Summary:

DBR quantization is a no-go for now because it does not align well with
PyTorch 2.0 plans and we do not want to build yet another tracing system.

Deleting it from the codebase for now since there are no plans to develop
this in the near future. We can bring it back at a later time if necessary.

Test plan:

CI

Differential Revision: [D38839556](https://our.internmc.facebook.com/intern/diff/D38839556)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83642
Approved by: https://github.com/andrewor14, https://github.com/jerryzh168
2022-08-23 15:18:40 +00:00
PyTorch MergeBot
6a9c02339d Revert "[quant][ao_migration] torch.nn.quantized.modules → torch.ao.nn.quantized.modules (#78713)"
This reverts commit 432f037498.

Reverted https://github.com/pytorch/pytorch/pull/78713 on behalf of https://github.com/janeyx99 due to Reverting for breaking (trunk-only) ios build
2022-08-22 07:32:37 +00:00
PyTorch MergeBot
b1a7b67529 Revert "[quant][ao_migration] torch.nn.quantized.dynamic → torch.ao.nn.quantized.dynamic (#78714)"
This reverts commit e6fb97d8ae.

Reverted https://github.com/pytorch/pytorch/pull/78714 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted
2022-08-22 07:30:48 +00:00
PyTorch MergeBot
4cbb1986fe Revert "[quant][ao_migration] torch.nn.qat → torch.ao.nn.qat (#78716)"
This reverts commit 7cd2fa1d38.

Reverted https://github.com/pytorch/pytorch/pull/78716 on behalf of https://github.com/janeyx99 due to sorry, reverting so https://github.com/pytorch/pytorch/pull/78713 could be cleanly reverted
2022-08-22 07:23:24 +00:00
zaf
7cd2fa1d38 [quant][ao_migration] torch.nn.qat → torch.ao.nn.qat (#78716)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [X] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [X] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [X] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [X] [Current PR] `torch.nn.qat` → `torch.ao.nn.qat`
    - [X] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [X] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- None

Differential Revision: [D36861197](https://our.internmc.facebook.com/intern/diff/D36861197/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36861197/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78716
Approved by: https://github.com/jerryzh168
2022-08-22 05:33:23 +00:00
zaf
e6fb97d8ae [quant][ao_migration] torch.nn.quantized.dynamic → torch.ao.nn.quantized.dynamic (#78714)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [X] [Current PR] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- [Documentation](docs/source/quantization-support.rst) @vkuzo
- [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10
- [BC test](test/quantization/bc/test_backward_compatibility.py) @vkuzo
- [IR emitter](torch/csrc/jit/frontend/ir_emitter.cpp) @jamesr66a
- [JIT serialization](torch/csrc/jit/serialization/import_source.cpp) @IvanKobzarev @jamesr66a

Differential Revision: [D36860660](https://our.internmc.facebook.com/intern/diff/D36860660/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36860660/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78714
Approved by: https://github.com/jerryzh168
2022-08-22 05:22:00 +00:00
zaf
432f037498 [quant][ao_migration] torch.nn.quantized.modules → torch.ao.nn.quantized.modules (#78713)
Context: In order to avoid the cluttering of the `torch.nn` namespace
the quantized modules namespace is moved to `torch.ao.nn`.

The list of the `nn.quantized` files that are being migrated:

- [ ] `torch.nn.quantized` → `torch.ao.nn.quantized`
    - [X] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional`
    - [X] [Current PR] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules`
    - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic`
    - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference`
- [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable`
- [ ] `torch.nn.qat` → `torch.ao.nn.qat`
    - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules`
    - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic`
- [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic`
    - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules`
    - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat`
    - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized`
        - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules`
        - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic`

Majority of the files are just moved to the new location.
However, specific files need to be double checked:

- Documentation @vkuzo
  - docs/source/conf.py
  - docs/source/quantization.rst
- [quantize_fx](torch/ao/quantization/quantize_fx.py) @jerryzh168
- [common test routine](test/quantization/ao_migration/common.py) @HDCharles
- JIT stuff @jamesr66a
  - torch/csrc/jit/passes/hoist_conv_packed_params.cpp
  - torch/csrc/jit/passes/quantization/helper.h
  - torch/csrc/jit/serialization/import_source.cpp

Differential Revision: [D36860145](https://our.internmc.facebook.com/intern/diff/D36860145/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78713
Approved by: https://github.com/jerryzh168
2022-08-22 01:38:55 +00:00
Weiwen Xia
2edd6aaeaa Add prelu op and module for quantized CPU backend (#73491)
Add prelu op and module for quantized CPU backend.
The PR includes:
- Quantized version of prelu op
- Native prelu kernel for quantized CPU
- Prelu modules in `nn` and `nn.quantized`
- FX support for prelu
- Unit tests
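
With the FX support from this PR, a model containing nn.PReLU should go through the standard FX graph mode flow; a rough sketch (the exact `prepare_fx` signature has varied across releases, so treat this as illustrative):

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

class SmallModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, 1)
        self.prelu = nn.PReLU()

    def forward(self, x):
        return self.prelu(self.conv(x))

model = SmallModel().eval()
example_inputs = (torch.randn(1, 3, 8, 8),)
# Insert observers, run calibration data, then lower; with this PR's support,
# PReLU is swapped to its quantized counterpart during convert.
prepared = prepare_fx(model, get_default_qconfig_mapping(), example_inputs)
prepared(*example_inputs)
quantized = convert_fx(prepared)
```
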
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73491
Approved by: https://github.com/jerryzh168
2022-07-20 07:48:15 +00:00
PyTorch MergeBot
b64096a264 Revert "Add prelu op and module for quantized CPU backend (#73491)"
This reverts commit 3a6d6bc3cc.

Reverted https://github.com/pytorch/pytorch/pull/73491 on behalf of https://github.com/malfet due to Broke Windows builds, see 3a6d6bc3cc
2022-06-30 12:54:39 +00:00
Weiwen Xia
3a6d6bc3cc Add prelu op and module for quantized CPU backend (#73491)
Add prelu op and module for quantized CPU backend.
The PR includes:
- Quantized version of prelu op
- Native prelu kernel for quantized CPU
- Prelu modules in `nn` and `nn.quantized`
- FX support for prelu
- Unit tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73491
Approved by: https://github.com/jerryzh168
2022-06-30 06:50:22 +00:00
HDCharles
3f1dc7ec00 [quant] Create default custom modules for LSTM and MHA (#79960)
Summary:
Currently we expect users to provide custom modules for LSTM and MHA. However, since we almost always ask users to use those modules in the custom context, it is better to make this behavior the default. In this case we try to align with the base quantization API: if the user specifies a custom_config_dict, that is used; if the value is left as None, the default is used. If a user would like to both use the default and modify it, they have to do so manually; the default is accessible via get_default_custom_config_dict.

Additionally, the NS which uses prepare to insert custom observers for
its purposes had to be slightly modified to pass in an empty
custom_config_dict in order to avoid modifying the custom modules.

Due to weird CI issues with the previous PR, the previous discussion can be found at
https://github.com/pytorch/pytorch/pull/71192
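
For reference, the manual configuration that this change now provides by default looked roughly like the following (the key names and module classes are a best-effort recollection of that era's FX custom-config format, so treat them as assumptions):

```python
import torch.ao.nn.quantizable as nnqa
import torch.nn as nn

# Roughly what users previously had to pass by hand so that LSTM and
# MultiheadAttention were handled as custom modules; with this change an
# equivalent default is used whenever no custom_config_dict is supplied.
prepare_custom_config_dict = {
    "float_to_observed_custom_module_class": {
        "static": {
            nn.LSTM: nnqa.LSTM,
            nn.MultiheadAttention: nnqa.MultiheadAttention,
        }
    }
}
```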

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79960
Approved by: https://github.com/z-a-f
2022-06-30 00:00:46 +00:00
Vasiliy Kuznetsov
d4aa204f11 ns for fx: further fixes for kwargs-only
Summary:

This is a follow-up to https://github.com/pytorch/pytorch/pull/78181,
apparently that PR did not fix all errors in a Meta model using
the NS shadow APIs.

We do not have an OSS repro, so putting the PR up so we can test in fbcode.

Test plan:

```
python test/test_quantization.py -k FXNumericSuite -f
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79015

Approved by: https://github.com/dzdang
2022-06-08 15:29:48 +00:00
Vasiliy Kuznetsov
53e05ad4b2 ns for fx: remove restriction on nodes with no args and only kwargs
Summary:

Removes the restriction from NS for FX on handling nodes which have
no positional arguments, such as `F.linear(input=x, weight=w, bias=b)`.

In order to achieve this, we delete all places in the code which
were doing things like

```
node.args[0]
```

And replace them with

```
_get_normalized_nth_input(node, gm, 0)
```

The `_get_normalized_nth_input` function is a best effort way to
get the n'th normalized input.

This is needed because some FX tools output nodes normalized to
be kwargs only, and we need to be able to handle this in NS.
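
A best-effort helper of that shape can be sketched as follows (a simplified illustration, not the exact NS implementation, which presumably also consults the GraphModule for schema-based normalization):

```python
import torch.fx

def get_normalized_nth_input_sketch(node: torch.fx.Node, idx: int):
    # Prefer the positional argument if it exists.
    if idx < len(node.args):
        return node.args[idx]
    # Otherwise fall back to kwargs in insertion order, to handle nodes that
    # FX tools have normalized to kwargs-only form,
    # e.g. F.linear(input=x, weight=w, bias=b).
    kwargs_in_order = list(node.kwargs.values())
    return kwargs_in_order[idx - len(node.args)]
```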

Test plan:

```
python test/test_quantization.py -k test_linear_kwargs_shadow
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78181

Approved by: https://github.com/z-a-f, https://github.com/hx89
2022-05-25 17:00:39 +00:00
Vasiliy Kuznetsov
b49a4be7ac ns for fx: remove duplicated BNReLU mappings
Summary:

These mappings are already defined for `BatchNorm{n}d` as the root
node, we don't need to specify them again. Removing to clean
up the code.

Test plan:

```
python test/test_quantization.py -k FXNumericSuite
python test/test_quantization.py -k FXGraphMatcher
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76993

Approved by: https://github.com/jerryzh168
2022-05-13 20:41:42 +00:00
Vasiliy Kuznetsov
d8479098a6 ns for fx: remove quantized ReLU6 from mapping
Summary:

This module is no longer swapped by FX graph mode quantization,
because it can take quantized inputs. Removing it from NS for FX
mappings.

Test plan:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76992

Approved by: https://github.com/jerryzh168
2022-05-13 20:38:31 +00:00
Vasiliy Kuznetsov
6a33b80191 ns for fx: remove GroupNorm from mapping
Summary:

GroupNorm quantization is defined but it looks like FX graph
mode quantization does not have it enabled.

Removing it from NS for FX.

Test plan:

```
python test/test_quantization.py -k FXGraphMatcher
python test/test_quantization.py -k FXNumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76991

Approved by: https://github.com/jerryzh168
2022-05-13 20:33:27 +00:00
Vasiliy Kuznetsov
6e05f76089 ns for fx: clean up linear in relationship mapping
Summary:

More cleanups in mappings:
1. makes the `nnqatd.Linear` entry be looked up dynamically
2. moves the `NonDynamicallyQuantizableLinear` down and marks it as edge case

Test plan:

```
python test/test_quantization.py -k FXGraphMatcher
python test/test_quantization.py -k FXNumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76990

Approved by: https://github.com/jerryzh168
2022-05-13 20:27:53 +00:00
Vasiliy Kuznetsov
9265cc2097 ns for fx: make torch.ops.quantized.dropout mapping dynamic
Summary:

Instead of hardcoding the relationship between `F.dropout` and `toq.dropout`,
read it from the mapping.

The mapping itself might need to be in the lowering file, but that's a separate
issue.

Test plan:

```
python test/test_quantization.py -k FXGraphMatcher
python test/test_quantization.py -k FXNumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76989

Approved by: https://github.com/jerryzh168
2022-05-13 20:26:06 +00:00
Vasiliy Kuznetsov
cc59641acb ns for fx: remove torch.ops.quantized.cat
Summary:

FX graph mode quantization no longer uses `torch.ops.quantized.cat`,
instead `torch.cat` can use quantized inputs.

This PR removes the outdated mapping from NS for FX.

Test plan:

```
python test/test_quantization.py -k FXNumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76988

Approved by: https://github.com/jerryzh168
2022-05-13 20:24:16 +00:00
Vasiliy Kuznetsov
20b75e3e5f ns for fx: clean up convtranspose mappings
Summary:

Fixes a couple of problems with `ConvTranspose` in NS mappings:
1. deletes the dynamic versions, as they do not work yet
2. deletes `ConvTranspose3d`, as it's not swapped yet in the quantization workflow
3. removes a duplicate set

Test plan:

```
python test/test_quantization.py -k FXGraphMatcher
python test/test_quantization.py -k FXNumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76980

Approved by: https://github.com/jerryzh168
2022-05-13 20:22:42 +00:00
Vasiliy Kuznetsov
0e067e4cc9 ns for fx to backend_config_dict [2/x]: native lowering mappings
Summary:

NS for FX mappings were originally hardcoded, because quantization op
mappings were not easily reusable. Now that we have `backend_config_dict`,
we can start moving NS for FX to use them and delete the hardcoded mappings.

This PR deletes the hardcoded mappings from NS about the lowering step,
and instead reads them from the lowering configs.

Note: for now, there is no way to configure the tool to use lowering
configs from a different lowering pass. That may be added at some
future point, but it's not important now.

Test plan:

```
python test/test_quantization.py -k FXNumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76978

Approved by: https://github.com/jerryzh168
2022-05-13 20:21:01 +00:00
Vasiliy Kuznetsov
5419236946 ns for fx to backend_config_dict [1/x]: fused and qat modules
Summary:

NS for FX mappings were originally hardcoded, because quantization op
mappings were not easily reusable. Now that we have `backend_config_dict`,
we can start moving NS for FX to use them and delete the hardcoded mappings.

This first PR deletes the hardcoded mappings for `nni` and `nniqat` modules,
and instead reads these mappings from `backend_config_dict`.

Future PRs will incrementally move more of the mappings over.

Test plan:

```
python test/test_quantization.py -k FXNumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76958

Approved by: https://github.com/jerryzh168
2022-05-13 20:14:39 +00:00
Vasiliy Kuznetsov
3a8752db86 ns for fx: skip shadowing ops if copy subgraph is not implemented (#76663)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76663

Subgraph copy does not handle all edge cases. It's high eng time
to handle them all, and currently an unhandled edge case crashes
the script.

This PR adds a function to check if the subgraph copy is supported,
and skips shadowing if it is not supported. This way the model
can still go through the shadowing APIs without an exception.

Test Plan:
```
python test/test_quantization.py -k FXNumericSuite
```

Reviewed By: hx89

Differential Revision: D36069304

Pulled By: vkuzo

fbshipit-source-id: 6b38b8d8e43396a4cf2373b247223a19d451d096
(cherry picked from commit e2322ca0635c51a4701e60fa90f77915a3c46d0f)
2022-05-05 13:19:53 +00:00
Vasiliy Kuznetsov
d3e338935a ns for fx: skip shadowing for torch.cat, and also for nodes with only kwargs (#76561)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76561

User model had syntax like `torch.cat(tensors=[x])`. This PR fixes two errors
to unbreak this in NS shadow model:
1. skip nodes which only have kwargs (instead of throwing an exception)
2. explicitly skip shadowing of `torch.cat` (since it's not supported anyways)

Test Plan:
```
python test/test_quantization.py -k test_op_with_only_kwargs_skips_shadowing
python test/test_quantization.py -k test_op_mul_add_cat_skips_shadowing
```

Reviewed By: hx89

Differential Revision: D36017356

Pulled By: vkuzo

fbshipit-source-id: 0da4840a62c2dac183f8294c2cec4fce262474b3
(cherry picked from commit 88409c1576e7f690708957b2baa285fc7961e9d6)
2022-05-05 13:19:53 +00:00
Vasiliy Kuznetsov
e155e2584a ns for fx: skip operator.add and operator.mul when shadowing (#76504)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76504

Shadowing for add and mul is not implemented, this PR fixes the skipping
logic to also skip the `operator.add` and `operator.mul` flavor of these
operators.

Test Plan:
```
python test/test_quantization.py -k test_mul_add_skips_shadowing
```

Reviewed By: dzdang

Differential Revision: D35985997

Pulled By: vkuzo

fbshipit-source-id: f832e54a5461d3b182df4bb905357d6c66742e98
(cherry picked from commit 93ae9592f68873865ebfdc438bffb1c9486dd1c1)
2022-05-03 05:58:46 +00:00
Vasiliy Kuznetsov
385e5ba561 ns for fx: more meaningful error message when creating shadow model (#76468)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76468

This makes the error message when copying an unsupported node more verbose.
This is useful to debug where specifically in a user model this is failing.

Test Plan:
1. hardcode this condition to hit
2. run NS tests
3. verify the exception now prints details about the offending node

Reviewed By: jerryzh168

Differential Revision: D35978652

Pulled By: vkuzo

fbshipit-source-id: 9cc93dfa46469bf6ef60aa38d4011041b6709df9
(cherry picked from commit c6e382c2a69aba6ba66740f238bc14446521a433)
2022-05-03 05:58:46 +00:00
Vasiliy Kuznetsov
35545d85dc fx quant: add quantized Softmax workflow integration (#75106)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75106

In https://github.com/pytorch/pytorch/pull/75017 a quantized softmax
kernel was added. This PR adds the FX graph mode quantization workflow
integration to swap `nn.Softmax` to `nnq.Softmax`.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_fixed_qparams_ops
```

Reviewed By: kimishpatel, andrewor14

Differential Revision: D35324817

Pulled By: vkuzo

fbshipit-source-id: 710ae3bedf8a6ad1dc411cd9808fdd0ce743e757
(cherry picked from commit d67603c0fbb1d3469d97bd538cec38aa8b03324b)
2022-04-20 21:54:26 +00:00
Jerry Zhang
74454bdb46 [quant][fx] Move backend_config folder to torch.ao.quantization
Summary:
Following https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md, we implemented the backend configuration for the fbgemm/qnnpack backend. It currently lives under the fx folder, but we'd like to use it for all workflows, including eager, fx graph, and define-by-run quantization, so this PR moves it to the torch.ao.quantization namespace where it can be shared.
This PR also moves some fx-specific utility functions to fx/backend_config_utils.py; some files are kept in the fx folder (quantize_handler.py and fuse_handler.py).

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestAOMigrationQuantization
python test/test_quantization.py TestAOMigrationQuantizationFx


Pull Request resolved: https://github.com/pytorch/pytorch/pull/75823

Approved by: https://github.com/vkuzo
2022-04-19 15:38:57 +00:00
Vasiliy Kuznetsov
f1f185f6f9 ns for fx: fix bug to enable again on torchvision models
Summary:

The tests were disabled by https://github.com/pytorch/pytorch/pull/61687, and this
specific behavior broke at some point while the tests were disabled.

The issue was that:
1. `torch.add` is present in these models
2. In the common codepath of comparing fp32 to int8, torch.ops.quantized.add was already filtered out because it did not have a dtype specified
3. In the less common codepath of comparing fp32 to fp32, torch.add was eligible for shadowing, but the logic was broken

This PR fixes (3) by disabling shadowing on ops which do not support it, by op type.
The support may be built later, if needed.

Test plan:

```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_resnet18
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_mobilenet_v2
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75472

Approved by: https://github.com/jerryzh168
2022-04-13 19:44:46 +00:00
Vasiliy Kuznetsov
ae3210420e ns for fx: fix issue with shadowing nodes of unknown dtype
Summary:

In https://github.com/pytorch/pytorch/pull/61687, a couple of FX Numeric Suite
tests were disabled.

This PR reenables one of these tests. We update the dtype inference logic
of NS to always return a specific type instead of sometimes returning
"fp32 or int8". When the type cannot be deduced by the current logic,
we do not shadow the node.

As a better version of dtype inference becomes available in FX Graph Mode Quantization,
we could migrate this code to use it.

Future PRs in the stack will unbreak other things to enable NS for FX to
work on torchvision again.

Test plan:

```
python test/test_quantization.py -k NumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75471

Approved by: https://github.com/jerryzh168
2022-04-13 19:44:46 +00:00
Jerry Zhang
f83d047338 [quant][fx] Use native backend_config_dict in prepare
Summary:
Previously we were still relying on the registration mechanism to get the default quantize handlers;
now that we have moved all registrations to backend_config_dict, we can get all quant patterns directly from backend_config_dict.

This PR enables using the native backend_config_dict everywhere in prepare when backend_config_dict is None; we'll
make similar changes in convert as well.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75469

Approved by: https://github.com/vkuzo
2022-04-12 17:05:31 +00:00
Jerry Zhang
dd667b6e97 [quant][fx] Move all fusion registrations to backend_config_dict (#75318)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75318

This PR moves the registrations for fusion patterns to backend_config_dict

Also fixed one issue in the numeric suite graph matcher: since (torch.nn.ReLU, torch.nn.BatchNorm3d)
now appears in quant patterns (previously only in fusion patterns), we need to make sure (torch.nn.ReLU, (torch.nn.BatchNorm3d, torch.nn.Conv3d))
can match before (torch.nn.ReLU, torch.nn.BatchNorm3d). Previously, it looks like (torch.nn.ReLU, (torch.nn.BatchNorm3d, torch.nn.Conv3d)) was never
really matched, since `end_node_matches_reversed_fusion` expects a flattened pattern like (torch.nn.ReLU, torch.nn.BatchNorm3d, torch.nn.Conv3d).
For now we manually flatten this pattern, but in the future we might want to use the matching function `is_match` under torch.ao.quantization.fx.match_utils
to do this matching.
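
The manual flattening mentioned above is just a recursive unrolling of the nested pattern tuple; a small sketch of that transformation (illustrative, not the exact helper used):

```python
import torch.nn as nn

def flatten_pattern(pattern):
    """Turn a nested reversed-fusion pattern such as
    (nn.ReLU, (nn.BatchNorm3d, nn.Conv3d)) into the flattened form
    (nn.ReLU, nn.BatchNorm3d, nn.Conv3d) expected by
    end_node_matches_reversed_fusion."""
    flat = []
    for item in pattern:
        if isinstance(item, tuple):
            flat.extend(flatten_pattern(item))
        else:
            flat.append(item)
    return tuple(flat)

assert flatten_pattern((nn.ReLU, (nn.BatchNorm3d, nn.Conv3d))) == (
    nn.ReLU,
    nn.BatchNorm3d,
    nn.Conv3d,
)
```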

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs

Imported from OSS

Reviewed By: vkuzo, andrewor14

Differential Revision: D35423788

fbshipit-source-id: a54093ccebae9c59aeee9399669ddb2c48bfb9aa
(cherry picked from commit 6a55ea8eb2740cedafb9972888fedf68e927586d)
2022-04-09 05:08:37 +00:00
Jerry Zhang
53f7233004 [quant][fx] Move all binary op configs to backend_config_dict (#75241)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75241

A previous PR enabled operator.add in backend_config_dict; this PR moves the rest of the binary ops to backend_config_dict.
Some ops are left over which are not needed (previously fp16 ops); we will move them in a following PR.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
 python test/test_quantization.py TestFXNumericSuiteCoreAPIs

Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D35403589

fbshipit-source-id: 663703b310944a6b7c5ade6d07a4d938a6ca082b
(cherry picked from commit 5a76ce031872c4fed5fcab5bb3c84a9394b01118)
2022-04-06 16:11:13 -07:00
Jerry Zhang
9817875729 [quant][fx] Add support for BinaryOpQuantizeHandler in backend_config_dict (#74882)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74882

This PR adds support for ops like add/mul in backend_config_dict. These ops have a different
observation_type based on the number of tensor inputs: when the number of tensor inputs is 1,
we share the output observer with the input; otherwise the output gets a new observer.
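
That rule amounts to a small dispatch on the number of tensor inputs; a sketch using the backend-config observation types (assuming the ObservationType enum from the later public backend_config package; the actual config entries may express this differently):

```python
from torch.ao.quantization.backend_config import ObservationType

def observation_type_for_binary_op(num_tensor_args: int) -> ObservationType:
    # One tensor input (e.g. x + scalar): reuse the input's observer for the output.
    if num_tensor_args == 1:
        return ObservationType.OUTPUT_SHARE_OBSERVER_WITH_INPUT
    # Two tensor inputs (e.g. x + y): the output gets its own observer.
    return ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT
```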

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo, andrewor14

Differential Revision: D35236032

fbshipit-source-id: 7077f3ccee8a5d8d19b40107cf8ff16cceafc535
(cherry picked from commit a6f7a37f99fc727269d022d35cc5c0157b70c656)
2022-04-06 06:42:03 +00:00
Andrew Or
ee9335a608 [Quant][fx] Define native backend_config_dict for linear and conv (#74636)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74636

This commit changes how quantization patterns for linear
and conv are set up in prepare. Previously, these were set up
through ConvReluQuantizeHandler and LinearReLUQuantizeHandler.
After this commit, however, they are set up through the
corresponding entries in the native backend_config_dict,
rendering the above quantize handlers unnecessary.
In future commits, we will do the same for the remaining ops.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: jerryzh168, ngimel

Differential Revision: D35225680

fbshipit-source-id: 4a79f63a11fce46701eb17aaf3619c1e827d72a4
(cherry picked from commit 475f599821cd32d3ba71ba086885ecdc4cbee755)
2022-04-04 14:07:15 +00:00