Commit Graph

15 Commits

Author SHA1 Message Date
Yuanyuan Chen
315ffdc1e4 [4/N] Apply ruff UP035 rule to python code (#164206)
Follows #164104
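For reference, a minimal sketch of the kind of import UP035 (deprecated-import) flags; the names are illustrative, not necessarily ones touched by this PR:

```
# Flagged by UP035: these members are deprecated imports from typing
from typing import Callable, Sequence

# Preferred: import them from their current home
from collections.abc import Callable, Sequence
```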

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164206
Approved by: https://github.com/albanD
2025-10-01 19:05:53 +00:00
Aaron Orenstein
9e0437a04a PEP585 update - torch/ao/quantization (#145140)
See #145101 for details.
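For context, a minimal sketch of the PEP 585 change applied by this series; the function is hypothetical:

```
# Before: generic aliases imported from typing
from typing import Dict, List, Tuple

def index(items: List[str]) -> Dict[str, Tuple[int, int]]:
    ...

# After (PEP 585): builtin generics, no typing imports needed
def index(items: list[str]) -> dict[str, tuple[int, int]]:
    ...
```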

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145140
Approved by: https://github.com/bobrenjc93
2025-01-19 10:20:00 +00:00
bobrenjc93
a55977f763 Migrate from Tuple -> tuple in torch/ao (#144265)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144265
Approved by: https://github.com/aorenste
2025-01-10 00:12:06 +00:00
Aaron Gokaslan
12e95aa4ee [BE]: Apply PERF401 autofixes from ruff (#140980)
* Automatically applies ruff rule PERF401, which turns loops that build lists into equivalent list comprehensions; these are faster and do not leak the loop variables into the enclosing scope (see the sketch below).
* List comprehensions not only often have better typing, but are 50+% faster than equivalent for loops in per-iteration overhead. They also preserve length information and are easier for the interpreter to optimize.
* Manually went back and made mypy happy after the change.
* Also fixed style lints in files covered by flake8 but not by pyfmt.
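A minimal sketch of the rewrite PERF401 performs (illustrative, not taken from this PR):

```
# Before: building a list with an explicit loop
squares = []
for x in range(10):
    squares.append(x * x)

# After: the equivalent list comprehension; avoids the repeated
# append lookup/call and keeps x out of the enclosing scope
squares = [x * x for x in range(10)]
```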

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-11-20 17:52:07 +00:00
Xuehai Pan
2ce734cee9 [BE] enable UFMT for torch/ao/quantization/ (#128863)
Part of #123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128863
Approved by: https://github.com/ezyang
ghstack dependencies: #128861, #128862
2024-07-25 04:17:54 +00:00
Aaron Orenstein
62bcdc0ac9 Flip default value for mypy disallow_untyped_defs [4/11] (#127841)
See #127836 for details.
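For context, a hedged sketch of the two standard ways this kind of flip is expressed; see #127836 for what this series actually does:

```
# mypy.ini (sketch): make untyped defs an error by default
[mypy]
disallow_untyped_defs = True
```

Modules that are not yet annotated can then opt out individually with mypy's inline configuration comment at the top of the file:

```
# mypy: allow-untyped-defs
```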

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127841
Approved by: https://github.com/oulgen
2024-06-08 18:36:48 +00:00
andrewor14
691a44f403 [Quant][fx][bc-breaking] Add simpler BackendConfig pattern format (#90698)
Summary: The existing BackendConfig fusion pattern
uses a "reversed nested tuple" format that is highly
unintuitive. For example,
```
linear-relu -> (nn.ReLU, nn.Linear)
conv-bn-relu -> (nn.ReLU, (nn.BatchNorm2d, nn.Conv2d))
```
This pattern format also complicates the signatures
of the user-specified "fuser methods", which must
accept arguments in reverse nested order to match
the patterns:
```
def fuse_linear_relu(is_qat, relu, linear):
    ...

def fuse_conv_bn_relu(is_qat, relu, bn_conv):
    (bn, conv) = bn_conv
    ...
```
Instead, this commit introduces a new pattern format that
simply specifies the ops in forward order with no nesting:
```
linear-relu -> (nn.Linear, nn.ReLU)
conv-bn-relu -> (nn.Conv2d, nn.BatchNorm2d, nn.ReLU)

def fuse_linear_relu(is_qat, linear, relu):
    ...

def fuse_conv_bn_relu(is_qat, conv, bn, relu):
    ...
```
Note that the legacy "reversed nested tuple" format is still
used internally, since it is more general. In the
future, we should replace it with the format used in
the subgraph rewriter in `torch.fx`, and simplify the
existing pattern matching code to handle the new
format added in this commit.

BC-breaking Notes:

Before:
```
import torch.nn as nn
import torch.ao.nn.intrinsic as nni
from torch.ao.quantization.backend_config import BackendPatternConfig

def fuse_conv_bn_relu(is_qat, relu, bn_conv):
    (bn, conv) = bn_conv
    return nni.ConvBnReLU2d(conv, bn, relu)

config = BackendPatternConfig((nn.ReLU, (nn.BatchNorm2d, nn.Conv2d))) \
    .set_dtype_configs(...) \
    .set_fuser_method(fuse_conv_bn_relu) \
    .set_fused_module(nni.ConvBnReLU2d)
```

After:
```
def fuse_conv_bn_relu(is_qat, conv, bn, relu):
    return nni.ConvBnReLU2d(conv, bn, relu)

config = BackendPatternConfig((nn.Conv2d, nn.BatchNorm2d, nn.ReLU)) \
    .set_dtype_configs(...) \
    .set_fuser_method(fuse_conv_bn_relu) \
    .set_fused_module(nni.ConvBnReLU2d)
```

OR (for backward-compatibility)

```
def fuse_conv_bn_relu(is_qat, relu, bn_conv):
    (bn, conv) = bn_conv
    return nni.ConvBnReLU2d(conv, bn, relu)

config = BackendPatternConfig() \
    ._set_pattern_complex_format((nn.ReLU, (nn.BatchNorm2d, nn.Conv2d))) \
    .set_dtype_configs(...) \
    .set_fuser_method(fuse_conv_bn_relu) \
    .set_fused_module(nni.ConvBnReLU2d) \
    ._set_use_legacy_pattern_format(True)
```

Before:
```
backend_config.configs  # returns Dict[Pattern, BackendPatternConfig]
```

After:
```
backend_config.configs  # returns List[BackendPatternConfig]
```

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestBackendConfig

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo

Differential Revision: [D41954553](https://our.internmc.facebook.com/intern/diff/D41954553)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90698
Approved by: https://github.com/vkuzo, https://github.com/jerryzh168
2022-12-14 22:44:29 +00:00
andrewor14
61801799a0 [Quant][bc-breaking] Remove overwrite_output_observer (#88620)
Summary: When the BackendConfig was first introduced,
`overwrite_output_observer` and `overwrite_output_fake_quantize`
were added to ensure fixed qparams ops like `torch.nn.Sigmoid`
and `torch.nn.Tanh` used the correct observers and fake quantize
modules. However, this is hacky because the BackendConfig should
not set the observer constructors themselves; it should instead
specify only requirements on the observers.

Later, https://github.com/pytorch/pytorch/pull/80184 added the
correct observers to `get_default_qconfig_mapping` along with
validation logic that throws an error if incorrect observers
were specified. With this change, we no longer need to overwrite
the observers from the BackendConfig, since we expect the user to
pass in the correct observers for these ops.

This commit removes these overwrite observer settings in the
BackendConfig. Instead, we represent the observer constraints for
fixed qparams ops through the existing DTypeWithConstraints
mechanism. Note, however, that to be consistent with other
DTypeWithConstraints checks, we no longer throw an error if an
incorrect observer is specified, but simply ignore the offending
QConfig and log a warning instead. This is the BC-breaking part
of the change.
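For illustration, a hedged sketch of how a fixed qparams constraint can be expressed through DTypeWithConstraints; the field names follow the torch.ao.quantization.backend_config API of this era, and the sigmoid-style values (scale 2**-8, zero_point 0) are illustrative:

```
import torch
from torch.ao.quantization.backend_config import (
    DTypeConfig,
    DTypeWithConstraints,
)

# Constrain the observer for a fixed qparams op like nn.Sigmoid: a QConfig
# whose observer cannot produce exactly these qparams is ignored with a
# warning instead of raising an error.
fixed_qparams_dtype = DTypeWithConstraints(
    dtype=torch.quint8,
    scale_exact_match=2**-8,
    zero_point_exact_match=0,
)

dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=fixed_qparams_dtype,
)
```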

BC-breaking notes:

```
from torch.ao.quantization.qconfig import default_qconfig
from torch.ao.quantization.quantize_fx import prepare_fx

model = ModelWithFixedQParamsOps()
qconfig_mapping = QConfigMapping().set_global(default_qconfig)
example_inputs = ...
prepare_fx(model, qconfig_mapping, example_inputs)
```

Before this commit, running the above leads to an exception
because the wrong observers are used for fixed qparams ops.
After this commit, the above will only encounter a warning,
and the fixed qparams ops will not be quantized. In both cases,
switching to `get_default_qconfig_mapping` will cause the
fixed qparams ops to be quantized.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88620
Approved by: https://github.com/jerryzh168
2022-11-16 18:44:12 +00:00
Jerry Zhang
ecf277abec [quant][improvement] Check the fixedqparam op qconfig based on backend_config (#87425)
Summary:
Previously we hardcoded the supported observers for fixed qparams ops. This PR changes that to take the information from the BackendConfig instead,
which allows users to customize the support for fixed qparams ops, as sketched below.
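A hedged sketch of the customization this enables, reusing the BackendConfig API shown elsewhere in this log; the op and dtypes are illustrative:

```
import torch
from torch.ao.quantization.backend_config import (
    BackendConfig,
    BackendPatternConfig,
    DTypeConfig,
    ObservationType,
)

# Declare backend support for a fixed qparams op such as nn.Sigmoid,
# rather than relying on a hardcoded observer list.
sigmoid_config = (
    BackendPatternConfig(torch.nn.Sigmoid)
    .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT)
    .add_dtype_config(DTypeConfig(input_dtype=torch.quint8, output_dtype=torch.quint8))
)

backend_config = BackendConfig("my_backend").set_backend_pattern_config(sigmoid_config)
```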

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_change_backend_config_for_fixed_qparam_ops

Unlinked from the internal diff since it's too hard to land.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87425
Approved by: https://github.com/andrewor14
2022-10-28 23:38:40 +00:00
HDCharles
25476f2e4b [ao] fixing public v private for quantization_types (#86031)
Summary: The main problem with this was that the various objects
defined simply as 'Any' should theoretically be public, but making
them public either A) results in an error about the module being
'typing' rather than whatever module it should be, or B) requires
setting the module manually, thereby changing the module for the
original 'Any' class.

Note: QuantizeHandler has a similar issue, since it is also defined
simply as 'Any'.

Pattern was defined in multiple places, which was causing issues, so
I moved it to a single place, given the note at the top of
quantization_types.py indicating these definitions should be moved
to utils at some point anyway.

Finally, I changed any references to these objects to point at the
correct locations. Note: I didn't see any fb-internal references to
NodePattern or QuantizerCls that would cause issues.
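A small, hedged illustration of the aliasing problem described above (exact `__module__` behavior on typing special forms varies across Python versions):

```
from typing import Any

# These "types" are really just Any, so they are all the same object:
Pattern = Any
QuantizeHandler = Any
assert Pattern is QuantizeHandler is Any

# A) the alias reports typing as its module, not this file's module
print(Pattern.__module__)  # 'typing'

# B) overwriting __module__ here would change what is reported for the
# original Any everywhere, since there is no distinct object to modify
```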

Test Plan: python test/test_public_bindings.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86031
Approved by: https://github.com/jerryzh168
2022-10-12 20:06:30 +00:00
Jerry Zhang
214a6500e3 [quant][docs] Additional fixes for quantize_fx docs (#84587)
Summary:
Some more clarifications for the arguments, including linking to the
object docs (QConfigMapping, BackendConfig) and adding types to the doc.

Test Plan:
```
cd docs
make html
```
and

visual inspection of the generated docs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84587
Approved by: https://github.com/vkuzo
2022-09-09 15:23:23 +00:00
Sergii Dymchenko
591222f5d9 Fix use-dict-literal lint (#83718)
Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR applies the change to every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor.
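The change in a nutshell (illustrative):

```
# Flagged by use-dict-literal
kwargs = dict()

# Preferred: the literal is idiomatic and slightly faster, since it
# avoids a global name lookup and a function call
kwargs = {}
```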
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718
Approved by: https://github.com/albanD
2022-08-24 00:26:46 +00:00
Andrew Or
782f3489c6 [Quant][fx][bc-breaking] Integrate BackendConfig with quantization flow (part 2) (#82557)
This is part 2 of the effort to replace `backend_config_dict` with
a Python config object, a more formal and robust API that leads to
a better user experience. This commit integrates the `BackendConfig`
implemented in part 1 (https://github.com/pytorch/pytorch/pull/81469)
with the existing FX graph mode quantization flow.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

BC-breaking Notes:

Before:
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.backend_config import ObservationType
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

dtype_config = {
    "input_dtype": torch.quint8,
    "output_dtype": torch.quint8
    "weight_dtype": torch.qint8,
    "bias_dtype": torch.float,
}

backend_config_dict = {
    "name": "my_backend",
    "configs": [{
        "pattern": torch.nn.Linear,
        "observation_type": ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT,
        "dtype_configs": [dtype_config],
        "root_module": torch.nn.Linear,
        "reference_quantized_module": torch.nn.quantized._reference.Linear,
        "qat_module": torch.nn.qat.Linear,
    }]
}

m = MyModel()
qconfig_mapping = get_default_qconfig_mapping()
example_inputs = (torch.rand(3, 3),)
m = prepare_fx(
    m, qconfig_mapping, example_inputs,
    backend_config_dict=backend_config_dict)
m = convert_fx(m, backend_config_dict=backend_config_dict)
```

After:
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.backend_config import (
    BackendConfig,
    BackendPatternConfig,
    DTypeConfig,
    ObservationType,
)
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

dtype_config = DTypeConfig(
    input_dtype=torch.quint8,
    output_dtype=torch.quint8,
    weight_dtype=torch.qint8,
    bias_dtype=torch.float,
)

backend_config = BackendConfig("my_backend").set_backend_pattern_config(
    BackendPatternConfig(torch.nn.Linear)
        .set_observation_type(ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT)
        .add_dtype_config(dtype_config)
        .set_root_module(torch.nn.Linear)
        .set_reference_quantized_module(torch.nn.quantized._reference.Linear)
        .set_qat_module(torch.nn.qat.Linear))

m = MyModel()
qconfig_mapping = get_default_qconfig_mapping()
example_inputs = (torch.rand(3, 3),)
m = prepare_fx(m, qconfig_mapping, example_inputs, backend_config=backend_config)
m = convert_fx(m, backend_config=backend_config)
```

Reviewers: jerryzh168

Subscribers: jerryzh168, supriyar

Differential Revision: [D38471932](https://our.internmc.facebook.com/intern/diff/D38471932)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82557
Approved by: https://github.com/jerryzh168
2022-08-08 18:55:50 +00:00
Vasiliy Kuznetsov
c15fca1137 quant doc: improve rendered documentation for backend_config_dict
Summary:

This improves the documentation page for backend_config_dict to render
the configurations in a human-readable format, such as

```
{
  'pattern': torch.nn.modules.pooling.AdaptiveAvgPool1d,
  'dtype_configs': [
    {
      'input_dtype': torch.quint8,
      'output_dtype': torch.quint8,
    },
    {
      'input_dtype': torch.float16,
      'weight_dtype': torch.float16,
      'bias_dtype': torch.float16,
      'output_dtype': torch.float16,
    },
  ],
  'observation_type': ObservationType.OUTPUT_SHARE_OBSERVER_WITH_INPUT,
},
```

The results are also now sorted alphabetically by the normalized name of
the root op in the pattern.

A couple of utility functions are created to help with this. If in the future
we convert backend_config_dict to use typed objects, we can move this logic
to the objects at that time.

Test plan:

```
cd docs
make html
cd build
python -m http.server
// renders correctly, example: https://gist.github.com/vkuzo/76adfc7c89e119c59813a733fa2cd56f
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77535

Approved by: https://github.com/andrewor14
2022-05-18 11:46:07 +00:00
Jerry Zhang
74454bdb46 [quant][fx] Move backend_config folder to torch.ao.quantization
Summary:
Following https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md, we implemented
the backend configuration for the fbgemm/qnnpack backends. It currently lives under the fx folder, but we'd like to use it for all the
different workflows, including eager mode, fx graph mode, and define-by-run quantization, so this PR moves it to the torch.ao.quantization
namespace where it can be shared by the different workflows.
It also moves some fx-specific utility functions to fx/backend_config_utils.py; some files are kept in the fx folder (quantize_handler.py and fuse_handler.py).

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestAOMigrationQuantization
python test/test_quantization.py TestAOMigrationQuantizationFx

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75823

Approved by: https://github.com/vkuzo
2022-04-19 15:38:57 +00:00