Summary:
Generally wildcard imports are bad for the reasons described here: https://www.flake8rules.com/rules/F403.html
This PR replaces wildcard imports with an explicit list of imported items where possible, and adds a `# noqa: F403` comment in the other cases (mostly re-exports in `__init__.py` files).
This is a prerequisite for https://github.com/pytorch/pytorch/issues/55816, because currently [`tools/codegen/dest/register_dispatch_key.py` simply fails if you sort its imports](https://github.com/pytorch/pytorch/actions/runs/742505908).
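For illustration, a minimal sketch of the two patterns applied by this PR, using `os.path` as a stand-in module (not an actual file touched by the change):
```python
# Before: wildcard import, flagged by flake8 as F403
# from os.path import *

# Preferred replacement: an explicit list of the names actually used
from os.path import dirname, join

# Re-export case (e.g. a package __init__.py that intentionally re-exports):
# from .submodule import *  # noqa: F403
```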
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55838
Test Plan: CI. You can also run `flake8` locally.
Reviewed By: jbschlosser
Differential Revision: D27724232
Pulled By: samestep
fbshipit-source-id: 269fb09cb4168f8a51fd65bfaacc6cda7fb87c34
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55710
In the current code, there is an edge case which leads to an error
after the prepare step:
1. have a pattern like this:
```
user_func_unmatched_to_qhandler -> node_matched_to_copy_node_qhandler
```
2. the user function returns a type which is not observable (i.e. not a
Tensor)
3. if this is run through `prepare_fx`, calibrating it with data leads
to a runtime error, because observers cannot observe non-tensor types.
This PR fixes the issue. If a node matched to `CopyNodeQuantizeHandler`
is after an unmatched node, we delete the observer.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_no_obs_between_unmatched_node_and_copy_node
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27686811
fbshipit-source-id: 320be41b1f383c6352ff89fb39a9f480822a3bb2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55429
Previously we special-cased the copy operator in the normal insert-observer code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27609972
fbshipit-source-id: 378f6aa70f18c0b477b62b6efe236648748aae7e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55388
Temporarily revert D27314678 (c57541ce06); it appears to cause a perf regression that makes quantization of some models take too long to complete tests.
Reviewed By: houseroad
Differential Revision: D27583809
fbshipit-source-id: e9c088ccbfd3bfb3a1d4c7eafee3eca29ee7717b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54644
Previously we special-cased the copy operator in the normal insert-observer code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27314678
fbshipit-source-id: d36870ceb3717bc01eaeaa6f3f1532ad562cbaf1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53586
Previously a value could only be quantized to one dtype; this PR adds support for quantizing one value
in the FX graph with multiple dtypes, e.g. quantizing first to int8 and then to float16.
We might do some follow-up PRs to clean up the hacks and refactor the code.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_multiple_qconfigs_single_value
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26912676
fbshipit-source-id: ae3653fd67f05870a3a9e808f491871826c555d5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54860
Currently we insert a quantize_per_tensor op when we encounter a quantizable input,
so if that input has multiple uses and not all of them are quantizable, we need to add a dequantize op
before the non-quantizable ops.
In this pass, for a quantize_per_tensor -> dequantize sequence we fold the pair away,
since it is a no-op (a rough sketch of the idea follows below, after the internal measurements).
[internal only][pyper]
Before this change we had redundant dequantize nodes in the graph
Example 1x inline_cvr graph https://www.internalfb.com/intern/everpaste/?handle=GODBxAlUMzGHD6 (98143776f5)MSACpHKKu9qjorbsIXAAAz
FC layers -> 37
quantize_per_tensor -> 30
dequantize -> 49
After this change
https://www.internalfb.com/intern/everpaste/?handle=GAl0uQnOlDNmpLoSAB-GZqRxu9wMbsIXAAAz
FC layers -> 37
quantize_per_tensor -> 30
dequantize -> 39
We remove 10 extra dequantize nodes from the graph.
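For illustration, a rough sketch of the folding idea as an FX graph pass; this is not the pass added in this PR, and it ignores dtype and multi-use corner cases:
```python
import torch
from torch.fx import GraphModule

def fold_quant_dequant(gm: GraphModule) -> GraphModule:
    # For each dequantize fed directly by a quantize_per_tensor, rewire the
    # dequantize's users to the original float input; the q -> dq pair is a no-op.
    for node in list(gm.graph.nodes):
        if node.op == "call_method" and node.target == "dequantize":
            quant = node.args[0]
            if quant.op == "call_function" and quant.target == torch.quantize_per_tensor:
                node.replace_all_uses_with(quant.args[0])
                gm.graph.erase_node(node)
    gm.graph.eliminate_dead_code()  # drops quantize nodes left without users
    gm.recompile()
    return gm
```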
Test Plan:
python test/test_quantization.py test_fold_quant_dequant
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27390506
fbshipit-source-id: 56e6fb8496171246eccf4bd45eb8bebd87fcb740
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54654
Fixes a bug where disabling quantization on potential fusion patterns
would lead to errors in the `convert` function. For example:
1. have a model with add-relu
2. disable quantization for the part of the model containing add-relu
3. run prepare and convert; the convert step would fail because
intermediate nodes were missing from `env`.
The fix is to add handling for this edge case. If quantization is
disabled, we manually copy the nodes for multi-node fusion patterns.
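For illustration, a minimal sketch of the setup this fixes, assuming the qconfig_dict format of FX graph mode quantization at this point in time (module names are placeholders):
```python
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class Sub(torch.nn.Module):
    def forward(self, x, y):
        return torch.nn.functional.relu(x + y)  # add-relu fusion pattern

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.sub = Sub()

    def forward(self, x, y):
        return self.sub(x, y)

qconfig_dict = {
    "": get_default_qconfig("fbgemm"),
    "module_name": [("sub", None)],  # disable quantization for the add-relu region
}
m = prepare_fx(M().eval(), qconfig_dict)
m(torch.randn(1, 3), torch.randn(1, 3))
m = convert_fx(m)  # previously this step could fail with nodes missing from `env`
```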
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_fusion_pattern_unquantized
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27318454
fbshipit-source-id: 27c1fd1cb7c9711a8e8d338200971c428dae8f98
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53614
Ensures that every subclass of `QuantizeHandler` has a clear name. This
prevents ambiguous names like `Cat`, which looks like a module but is
really a quantize handler.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26914784
fbshipit-source-id: 6dca7e27975c09f422f8e36f1d2b709bf3eaaadf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53196
Before this PR, code patterns like this did not work:
```
x = some_quant_layer(x)
x = torch.stack([x, ...])
x = torch.sum(x, ...)
```
The reason this did not work is that `torch.sum` is treated as
"quantized" because of the newly added fp16 support, even though it is
not actually "quantized" for models where fp16 is not used. We may
need to adjust the concept of "quantized vs non-quantized" into a
"dtype" for the longer-term fix.
The current PR is a hacky fix to unblock. We need to clean things
up before this is landable.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_quant_sum
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26783960
fbshipit-source-id: 3be7c3c1eaa2b8fcb99a105e1b0004c9ffd3a1c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53187
Before this diff, if we had code like
```
x = any_quant_layer(...)
x_size0 = x.size(0)
torch._assert(x_size0 == 1, "expected size 1")
```
```
The convert code would try to insert a dequantize after `x_size0`,
because it was a descendant of a quantized node and it was needed
for a non-quantized operation. Since the actual type of the `size`
function output is an integer, this does not make sense.
For now, this is fixed as a one-off to unblock a customer. In the
future, we may need to think more deeply about all the functions which
can return non-quantized types from quantized tensors and make sure
they are all covered.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_assert_on_size_after_quant_layer
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26780690
fbshipit-source-id: 44cc25c9179d460efb3f110d40b73d854d676af5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53120
Currently there is a pattern which is not handled correctly by
FX graph mode quantization:
```
def forward(self, x):
    ndim = x.ndim
    # or add, mul, div, etc.
    x = torch.sub(x, ndim)
    return x
```
The reason this does not work is as follows:
1. x.ndim becomes a getattr node
2. the real world type of x.ndim is an integer, but this is not known from the graph (yet)
3. binary ops such as `torch.sub` require quantization of inputs
4. the framework inserts an observer to observe the output of `ndim`
5. the observer fails because `ndim` is not a Tensor
For now, we add a bandaid to unblock some teams; none of this is meant to
land as-is. We will have to think of a better fix that is landable (TBD).
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_getattr_with_nontensor_result
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26756180
fbshipit-source-id: c0e498766b22c23df74fbb5aaeaa237c4c944263
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53585
Previously an fp16_static CopyNode would be marked as unquantized because of
an incorrect check of whether a Node is statically quantized.
This PR fixes that.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26912677
fbshipit-source-id: 4ddb538714c5ba2db28430de5e1cf2931baf1993
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53330
Fixes a condition check for fixed-qparam ops; previously we were including CopyNodes as well.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_fixed_qparams_ops_fp16
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26836867
fbshipit-source-id: 8c486155244f852e675a938c3f4237f26505671c
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50002
The last commit adds tests for 3d conv with the `SubModelFusion` and `SubModelWithoutFusion` classes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50003
Reviewed By: mrshenli
Differential Revision: D26325953
Pulled By: jerryzh168
fbshipit-source-id: 7406dd2721c0c4df477044d1b54a6c5e128a9034
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53166
Context: for FX modules that consist of scriptmodules, calling
delattr(module, 'qconfig') throws an attribute error. We will follow up
with a separate issue/repro to fix this problem.
This PR adds a temporary flag to the convert_fx API to preserve the qconfig attributes on the converted model.
We will remove this flag once we reach a conclusion on calling delattr on scriptmodules.
Test Plan:
python test/test_quantization.py test_preserve_qconfig
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26771518
fbshipit-source-id: 9fd72816576856ffb4aa11f8fde08303d1df10a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52651
Merging them for easier extensions to fp16 and more binary ops
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26600118
fbshipit-source-id: a1816e593cf3065afe87d2e6e44cdace13bf6aeb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52534
Currently linear_dynamic_fp16 has a signature that's tied to fbgemm/qnnpack
We'll need to produce a pattern equivalent to linear_dynamic_fp16 to support extensions
to other backends
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_dynamic_fp16
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26557726
fbshipit-source-id: 270c9f781f73c79416a092b7831294cabca84b0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52179
Rename debug to reference. We'll use this to produce a reference quantized model
that can be used as a common interface between PyTorch quantized models and backends.
Test Plan:
python test/test_quantization.py TestQuantizeFx
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26424656
fbshipit-source-id: a0299b023f6ba7d98f5750724c517b0ecb987b35
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51259
Store the FQN of the module that is using the packed weights (the quantized op)
In the case of fusion we update the scope mapping to store the module path of the fused node.
Test Plan:
python test/test_quantization.py test_packed_weight_fused_op
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26117964
fbshipit-source-id: 9d929997baafb1c91063dd9786a451b0040ae461
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51171
Following up on the previous PR, this PR registers the scale and zero_point for quantize_per_tensor
as buffers in the module.
Currently the dtype is still stored as an attribute (not registered as a buffer) since we can only register tensor types.
Test Plan:
python test/test_quantization.py test_qparams_buffers
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26092964
fbshipit-source-id: a54d914db7863402f2b5a3ba2c8ce8b27c18b47b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51166
Currently scale and zero_point values are stored as constants in the graph.
This prevents these values from being updated in the graph and also does not allow saving
these values to state_dict.
After this PR we store scale/zero_point values for quantized ops as buffers in the root module
and create get_attr nodes for them in the graph.
We also use the FQN of the module where the quantized ops are present to name these attributes so
that they can be uniquely identified and mapped to quantized ops.
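Roughly, the idea is the following (illustrative buffer names, not the actual helper code from this PR):
```python
import torch

# Before: scale/zero_point appear as constants in the graph, e.g.
#   torch.quantize_per_tensor(x, 0.0213, 128, torch.quint8)
#
# After: register them as buffers on the root module, named using the FQN of
# the module that owns the quantized op, and read them via get_attr nodes:
root = torch.nn.Module()
root.register_buffer("layer1_input_scale_0", torch.tensor(0.0213))
root.register_buffer("layer1_input_zero_point_0", torch.tensor(128))

# In the FX graph (pseudocode for the rewritten quantize call):
#   scale = graph.create_node("get_attr", "layer1_input_scale_0")
#   zp    = graph.create_node("get_attr", "layer1_input_zero_point_0")
#   q     = graph.create_node("call_function", torch.quantize_per_tensor,
#                             (x, scale, zp, torch.quint8))
```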
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qparams_buffers
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26092965
fbshipit-source-id: b549b2d3dccb45c5d38415ce95a09c26f5bd590b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51086
Previously we only supported getting the scope for call_module nodes and a custom qconfig dict for call_module.
This PR extends the Scope class to record the scope for all node types.
For call_function qconfig, if module_name is specified it takes precedence over the function qconfig.
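A minimal sketch of the precedence described above, assuming the standard qconfig_dict keys for function-level and module-level qconfigs (the module name is a placeholder):
```python
import torch.nn.functional as F
from torch.quantization import get_default_qconfig

qconfig_dict = {
    "": get_default_qconfig("fbgemm"),
    # function-level qconfig for F.linear call_function nodes
    "object_type": [(F.linear, get_default_qconfig("fbgemm"))],
    # module-level qconfig; for call_function nodes whose scope is `sub`,
    # this entry takes precedence over the object_type entry above
    "module_name": [("sub", None)],
}
```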
Test Plan:
python test/test_quantization.py test_qconfig_for_call_func
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26077602
fbshipit-source-id: 99cdcdedde2280e51812db300e17d4e6d8f477d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50173
Previously we did not set the qconfig for call_method nodes correctly, since doing so requires us to know
the scope (the module path of the module whose forward graph contains the node) of the node. This
PR modifies the QuantizationTracer to record the scope information and build a map from call_method
Node to module path, which is used when we construct qconfig_map.
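For illustration, a sketch of the kind of model this affects; the call_method node produced by `x.chunk` inside the submodule now picks up the qconfig configured for that submodule via the recorded scope (module name and qconfig choices are placeholders):
```python
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx

class Sub(torch.nn.Module):
    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)  # traces to a call_method node
        return torch.cat([x1, x2], dim=1)

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.sub = Sub()

    def forward(self, x):
        return self.sub(x)

# the call_method nodes inside `sub` now resolve to the "sub" entry below
qconfig_dict = {"": get_default_qconfig("fbgemm"), "module_name": [("sub", None)]}
prepared = prepare_fx(M().eval(), qconfig_dict)
```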
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qconfig_for_call_method
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25818132
fbshipit-source-id: ee9c5830f324d24d7cf67e5cd2bf1f6e0e46add8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50058
This PR adds support for {input/output}_quantized_idxs for standalone modules.
If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module expects float
input and produces float output, and quantizes the input and dequantizes the output internally.
If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module expects quantized
input and produces quantized output; the input is quantized in the parent module, and the output is dequantized
in the parent module as well. This is similar to existing quantized modules like nn.quantized.Conv2d.
For more details, please see the test case.
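A hedged sketch of how the idxs might be passed for a standalone module, assuming they live in the standalone module's own prepare_custom_config_dict entry (the names are placeholders):
```python
from torch.quantization import get_default_qconfig

qconfig = get_default_qconfig("fbgemm")

# standalone module expects quantized input 0 and produces quantized output 0
standalone_prepare_config = {
    "input_quantized_idxs": [0],
    "output_quantized_idxs": [0],
}

prepare_custom_config_dict = {
    "standalone_module_name": [
        ("standalone", {"": qconfig}, standalone_prepare_config),
    ],
}
```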
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25768910
fbshipit-source-id: 96c21a3456cf192c8f1400afa4e86273ee69197b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49754
This PR adds support for {input/output}_quantized_idxs for standalone modules.
If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module expects float
input and produces float output, and quantizes the input and dequantizes the output internally.
If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module expects quantized
input and produces quantized output; the input is quantized in the parent module, and the output is dequantized
in the parent module as well. This is similar to existing quantized modules like nn.quantized.Conv2d.
For more details, please see the test case.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D25684692
fbshipit-source-id: 900360e01c0e35b26fe85f4a887dc1fd6f7bfb66
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49719
We find there are multiple use cases for standalone modules: one use case requires the standalone module
to produce a module that takes a float Tensor as input and outputs a float Tensor, while the other needs to
produce a module that takes a quantized Tensor as input and outputs a quantized Tensor.
This is similar to `quantized_input_idxs` and `quantized_output_idxs`, so we want to nest
prepare_custom_config_dict in the standalone module configuration. For maximum flexibility we also
include qconfig_dict for the standalone module, in case the user needs a special qconfig_dict for
the standalone module in the future.
Changed from
```python
prepare_custom_config_dict = {
    "standalone_module_name": ["standalone_module"],
    "standalone_module_class": [StandaloneModule]
}
```
to
```python
prepare_custom_config_dict = {
    "standalone_module_name": [("standalone_module", qconfig_dict1, prepare_custom_config_dict1)],
    "standalone_module_class": [(StandaloneModule, qconfig_dict2, prepare_custom_config_dict2)]
}
```
The entries in the config are:
1. name/module_class
2. optional qconfig_dict, when it is None, we'll use {"": qconfig} where qconfig is the one from parent qconfig_dict
3. optional prepare_custom_config_dict, when it is None, we'll use default value of prepare_custom_config_dict for prepare API (None)
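For reference, a hedged end-to-end usage sketch of the new format (the model, names, and qconfig choices below are placeholders):
```python
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx

class Standalone(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x)

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.standalone_module = Standalone()

    def forward(self, x):
        return self.standalone_module(x)

qconfig_dict1 = {"": get_default_qconfig("fbgemm")}
prepare_custom_config_dict = {
    # (name, optional qconfig_dict, optional prepare_custom_config_dict)
    "standalone_module_name": [("standalone_module", qconfig_dict1, None)],
}
prepared = prepare_fx(
    M().eval(),
    {"": get_default_qconfig("fbgemm")},
    prepare_custom_config_dict=prepare_custom_config_dict,
)
```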
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D25675704
fbshipit-source-id: 0889f519a3e55a7a677f0e2db4db9a18d87a93d4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49688
Adds more type annotations to the FX quantize convert code, fixing things as they
are uncovered by mypy.
Test Plan:
```
mypy torch/quantization
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25667231
fbshipit-source-id: 262713c6ccb050a05e3119c0457d0335dde82d25
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49628
Ensures that linear bias is not observed in an `F.linear` call. This should
be a small speedup in PTQ, and will change numerics (in a good way) for
QAT if someone is using `F.linear`.
Note: the implementation is slightly more verbose compared to conv
because bias is a keyword argument in Linear.
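A small sketch of the case being handled; after this PR only the input and weight should get observers, not the bias (the model is a placeholder):
```python
import torch
import torch.nn.functional as F
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.randn(4, 4))
        self.b = torch.nn.Parameter(torch.randn(4))

    def forward(self, x):
        # bias is a keyword argument, hence the more verbose handling vs. conv
        return F.linear(x, self.w, bias=self.b)

prepared = prepare_fx(M().eval(), {"": get_default_qconfig("fbgemm")})
# self.b should not be wrapped in an observer after prepare
```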
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_linear_functional_bias_not_observed
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25653532
fbshipit-source-id: c93501bf6b55cbe4a11cfdad6f79313483133a39
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49623
(not ready for review)
Ensures that conv bias is not observed in an `F.conv{n}d` call.
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25652856
fbshipit-source-id: 884f87be1948d3e049a557d79bec3c90aec34340
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49621
This adds support for configuring a qconfig for a call_method, e.g. x.chunk; this will help work around
a problem in our internal model.
TODO: since call_method is also a string and we flatten the qconfig, we might need to resolve the namespace conflict between
call_method and module_name.
TODO: add scope support to set the qconfig for call_method correctly with the original qconfig.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25651828
fbshipit-source-id: 82d66b121d37c8274fd481b6a2e9f9b54c5ca73d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49420
Before: if an output was marked as quantized, it could end up not actually
quantized if the previous node was not quantized.
After: if an output was marked as quantized, it will be quantized
regardless of the quantization status of the previous node.
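For illustration, the prepare-time config this refers to (assuming the `output_quantized_idxs` key; it plugs into prepare_fx's prepare_custom_config_dict as in the other sketches in this log):
```python
# mark graph output 0 as quantized; after this PR that output is quantized
# regardless of whether the node producing it was quantized
prepare_custom_config_dict = {"output_quantized_idxs": [0]}
```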
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_quant_output_always_observed
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25566834
fbshipit-source-id: 84755a1605fd3847edd03a7887ab9f635498c05c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49382
Fixes an edge case: if the input to the graph is quantized and the
first node does not need activation observation, we make sure that
no observer is inserted.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_int8_input_no_unnecessary_fq
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25551041
fbshipit-source-id: a6cba235c63ca7f6856e4128af7c1dc7fa0085ea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49239
Context: the existing implementation of `quantized_input_idxs` is convert-only.
Therefore, observers are inserted between the input and the first
quantized node. This is a problem during QAT, because the initial
input is a fake_quant, and it starts with scale=1 and zp=0. This does
not match the quantization parameters of the graph input, which can
lead to incorrect numerics.
Fix: do not insert an observer for a quantized input.
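For illustration, a sketch of the config this affects, assuming `input_quantized_idxs` is read from prepare_custom_config_dict (see the PR below in this log that moves it there); the model is a placeholder:
```python
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 3, 3)

    def forward(self, x):
        return self.conv(x)

# graph input 0 arrives already quantized (int8)
prepare_custom_config_dict = {"input_quantized_idxs": [0]}
prepared = prepare_fx(
    M().eval(),
    {"": get_default_qconfig("fbgemm")},
    prepare_custom_config_dict=prepare_custom_config_dict,
)
# after this PR, no observer / fake_quant is inserted for that quantized input
```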
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25499486
fbshipit-source-id: 303b49cc9d95a9fd06fef3b0859c08be34e19d8a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49238
Moves the `input_quantized_idxs` and `output_quantized_idxs` options
from the convert config to the prepare config. This is done because
these operations are related to placing observers, which is numerics
changing during QAT.
The next PR will adjust the behavior of `input_quantized_idxs` in
prepare in QAT to prevent placing a fake_quant at the input if the
input is marked quantized. Placing a fake_quant there can lead to
numerical inaccuracies during calibration, as it would start with
scale=1 and zp=0, which may be different from the quantization
parameters of the incoming quantized input.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25498762
fbshipit-source-id: 17ace8f803542155652b310e5539e1882ebaadc6