Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52179
Rename debug to reference. We'll use this to produce a reference quantized model
that can be used as a common interface between PyTorch quantized models and backends.
Test Plan:
python test/test_quantization.py TestQuantizeFx
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26424656
fbshipit-source-id: a0299b023f6ba7d98f5750724c517b0ecb987b35
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51086
Previously we only supported getting the scope for call_module nodes and applying a custom qconfig dict to call_module nodes.
This PR extends the Scope class to record the scope for all node types.
For a call_function qconfig, if module_name is specified, it takes precedence over the function qconfig.
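The precedence rule above can be sketched in plain Python. This is only an illustration: `qconfig_for_call_function` and its arguments are hypothetical names, strings stand in for real qconfig objects, and the actual lookup in the FX quantization code may be structured differently.

```python
import operator
from typing import Any, Callable, Dict, Optional


def qconfig_for_call_function(
    func: Callable,
    module_path: str,
    function_qconfigs: Dict[Callable, Any],
    module_name_qconfigs: Dict[str, Any],
    global_qconfig: Optional[Any] = None,
) -> Optional[Any]:
    """Hypothetical helper: choose the qconfig for a call_function node.

    `module_path` is the recorded scope of the node, i.e. the path of the
    module whose forward contains the call.
    """
    # A module_name entry for the enclosing module takes precedence.
    if module_path in module_name_qconfigs:
        return module_name_qconfigs[module_path]
    # Otherwise fall back to the qconfig registered for the function itself.
    if func in function_qconfigs:
        return function_qconfigs[func]
    return global_qconfig


# Strings stand in for real qconfig objects.
function_qconfigs = {operator.add: "function_qconfig"}
module_name_qconfigs = {"sub": "module_qconfig"}

# The add inside module "sub" picks up the module_name qconfig.
print(qconfig_for_call_function(operator.add, "sub", function_qconfigs, module_name_qconfigs))
# The add inside module "other" falls back to the function qconfig.
print(qconfig_for_call_function(operator.add, "other", function_qconfigs, module_name_qconfigs))
```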
Test Plan:
python test/test_quantization.py test_qconfig_for_call_func
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26077602
fbshipit-source-id: 99cdcdedde2280e51812db300e17d4e6d8f477d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50173
Previously we did not set the qconfig for call_method nodes correctly, since doing so requires us to know
the scope (the module path of the module whose forward graph contains the node) of the node. This
PR modifies the QuantizationTracer to record the scope information and build a map from call_method
Node to module path, which will be used when we construct qconfig_map.
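A rough sketch of the bookkeeping described above, written without torch. `ScopeRecorder` and its methods are hypothetical names used only for illustration; the real QuantizationTracer hooks into FX tracing and may record scope differently.

```python
from typing import Dict, List


class ScopeRecorder:
    """Hypothetical stand-in for the scope bookkeeping done while tracing."""

    def __init__(self) -> None:
        # "" denotes the root module.
        self._stack: List[str] = [""]
        # Maps a call_method node name to the path of the enclosing module.
        self.node_to_module_path: Dict[str, str] = {}

    def enter_module(self, name: str) -> None:
        parent = self._stack[-1]
        self._stack.append(f"{parent}.{name}" if parent else name)

    def exit_module(self) -> None:
        self._stack.pop()

    def record_call_method(self, node_name: str) -> None:
        self.node_to_module_path[node_name] = self._stack[-1]


recorder = ScopeRecorder()
recorder.record_call_method("flatten_1")  # called in the root forward
recorder.enter_module("block")
recorder.enter_module("inner")
recorder.record_call_method("relu_1")     # called in block.inner's forward
recorder.exit_module()
recorder.record_call_method("view_1")     # called in block's forward
recorder.exit_module()

print(recorder.node_to_module_path)
```

The resulting map is what a qconfig lookup keyed by module path would consult for each call_method node.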
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qconfig_for_call_method
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25818132
fbshipit-source-id: ee9c5830f324d24d7cf67e5cd2bf1f6e0e46add8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50058
This PR adds support for {input/output}_quantized_idxs for standalone modules.
If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module expects float
input and produces float output; it quantizes the input and dequantizes the output internally.
If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module expects quantized
input and produces quantized output; the input is quantized in the parent module, and the output is dequantized
in the parent module as well. This is similar to current quantized modules like nn.quantized.Conv2d.
For more details, please see the test case.
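As a reading aid, the two configurations described above can be restated as a small function that reports where each boundary conversion happens. `boundary_conversions` and the labels it returns are hypothetical and are not part of the quantization API.

```python
from typing import Dict, List


def boundary_conversions(
    num_inputs: int,
    num_outputs: int,
    input_quantized_idxs: List[int],
    output_quantized_idxs: List[int],
) -> Dict[str, Dict[int, str]]:
    """Hypothetical helper: say which side quantizes inputs and dequantizes outputs."""
    # An index listed as quantized means the tensor crosses the boundary in
    # quantized form, so the conversion is the parent module's job.
    quantize_in = {
        i: "parent" if i in input_quantized_idxs else "standalone"
        for i in range(num_inputs)
    }
    dequantize_in = {
        i: "parent" if i in output_quantized_idxs else "standalone"
        for i in range(num_outputs)
    }
    return {"quantize_input_in": quantize_in, "dequantize_output_in": dequantize_in}


# Float interface: the standalone module converts internally.
print(boundary_conversions(1, 1, [], []))
# Quantized interface: the parent module handles both conversions.
print(boundary_conversions(1, 1, [0], [0]))
```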
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25768910
fbshipit-source-id: 96c21a3456cf192c8f1400afa4e86273ee69197b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49754
This PR adds support for {input/output}_quantized_idxs for standalone modules.
If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module expects float
input and produces float output; it quantizes the input and dequantizes the output internally.
If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module expects quantized
input and produces quantized output; the input is quantized in the parent module, and the output is dequantized
in the parent module as well. This is similar to current quantized modules like nn.quantized.Conv2d.
For more details, please see the test case.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D25684692
fbshipit-source-id: 900360e01c0e35b26fe85f4a887dc1fd6f7bfb66
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49719
We find there are multiple use cases for standalone modules: one use case requires the standalone module
to produce a module that takes a float Tensor as input and outputs a float Tensor; the other needs to
produce a module that takes a quantized Tensor as input and outputs a quantized Tensor.
This is similar to `input_quantized_idxs` and `output_quantized_idxs`, so we want to nest
prepare_custom_config_dict in the standalone module configuration. For maximum flexibility we also
include a qconfig_dict for the standalone module, in case the user needs a special qconfig_dict for
the standalone module in the future.
Changed from
```python
prepare_custom_config_dict = {
    "standalone_module_name": ["standalone_module"],
    "standalone_module_class": [StandaloneModule],
}
```
to
```python
prepare_custom_config_dict = {
    "standalone_module_name": [("standalone_module", qconfig_dict1, prepare_custom_config_dict1)],
    "standalone_module_class": [(StandaloneModule, qconfig_dict2, prepare_custom_config_dict2)],
}
```
The entries in the config are:
1. name/module_class
2. optional qconfig_dict, when it is None, we'll use {"": qconfig} where qconfig is the one from parent qconfig_dict
3. optional prepare_custom_config_dict, when it is None, we'll use default value of prepare_custom_config_dict for prepare API (None)
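The defaulting rules in the list above can be sketched as follows. `normalize_standalone_entry` is a hypothetical helper written for illustration, and strings stand in for real qconfig objects; the actual handling inside prepare may differ.

```python
from typing import Any, Dict, Optional, Tuple


def normalize_standalone_entry(
    entry: Tuple[Any, Optional[Dict], Optional[Dict]],
    parent_qconfig: Any,
) -> Tuple[Any, Dict, Optional[Dict]]:
    """Hypothetical helper: fill in defaults for one standalone module entry."""
    key, qconfig_dict, prepare_custom_config_dict = entry
    if qconfig_dict is None:
        # Fall back to a global config that reuses the parent's qconfig.
        qconfig_dict = {"": parent_qconfig}
    # A None prepare_custom_config_dict is passed through unchanged, since
    # None is also the default value for the prepare API.
    return key, qconfig_dict, prepare_custom_config_dict


print(normalize_standalone_entry(("standalone_module", None, None), "parent_qconfig"))
print(normalize_standalone_entry(("standalone_module", {"": "special"}, {"input_quantized_idxs": [0]}), "parent_qconfig"))
```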
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D25675704
fbshipit-source-id: 0889f519a3e55a7a677f0e2db4db9a18d87a93d4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49238
Moves the `input_quantized_idxs` and `output_quantized_idxs` options
from the convert config to the prepare config. This is done because
these options affect where observers are placed, which changes
numerics during QAT.
The next PR will adjust the behavior of `input_quantized_idxs` in
prepare in QAT to prevent placing a fake_quant at the input if the
input is marked quantized. Placing a fake_quant there can lead to
numerical inaccuracies during calibration, as it would start with
scale=1 and zp=0, which may be different from the quantization
parameters of the incoming quantized input.
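A small arithmetic sketch of the problem described above, using plain Python in place of real fake_quant modules. The incoming quantization parameters (scale=0.05, zero point 128) and the 8-bit unsigned range are assumptions chosen only for illustration.

```python
def quantize(x: float, scale: float, zero_point: int, qmin: int = 0, qmax: int = 255) -> int:
    """Affine quantization to an integer in [qmin, qmax]."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    return (q - zero_point) * scale


def fake_quantize(x: float, scale: float, zero_point: int) -> float:
    """Quantize then dequantize, which is what a fake_quant simulates."""
    return dequantize(quantize(x, scale, zero_point), scale, zero_point)


# Assumed parameters of the incoming quantized input.
incoming_scale, incoming_zp = 0.05, 128

# The value as it arrives from the already-quantized input (about 0.35).
x = dequantize(quantize(0.37, incoming_scale, incoming_zp), incoming_scale, incoming_zp)

# A fake_quant with matching parameters leaves the value unchanged.
matched = fake_quantize(x, incoming_scale, incoming_zp)

# A fake_quant starting at scale=1 and zp=0 rounds the value to 0.0.
mismatched = fake_quantize(x, 1.0, 0)

print(x, matched, mismatched)
```

With mismatched parameters the small activation is wiped out entirely, which is the kind of inaccuracy that skipping the extra fake_quant avoids.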
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25498762
fbshipit-source-id: 17ace8f803542155652b310e5539e1882ebaadc6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48671
A standalone module might be called separately, so it's better to use float
as the interface.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25256184
fbshipit-source-id: e209492a180ce1f81f31c8d6057956a74bad20b1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48624
Before this PR, there was an assumption that all graph inputs
and outputs are in floating point, with some exceptions for
`standalone_module`.
This PR adds an option to specify either inputs or outputs
as being quantized.
This is useful for incremental migrations of models using Eager mode.
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25231833
fbshipit-source-id: 9f9da17be72b614c4c334f5c588458b3e726ed17
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48331
Enables mypy to not ignore type errors in FX quantization files. Fixes the easy
typing errors inline, and comments out the harder errors to be fixed at a later time.
After this PR, mypy runs without errors on `torch/quantization`.
Test Plan:
```
> mypy torch/quantization/
Success: no issues found in 25 source files
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25133348
fbshipit-source-id: 0568ef9405b292b80b3857eae300450108843e80
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48038
nn.ReLU works for both float and quantized input, so we don't want to define an nn.quantized.ReLU
that does the same thing as nn.ReLU; similarly for nn.quantized.functional.relu.
This also removes the numerical inconsistency for models that quantize nn.ReLU independently in QAT mode.
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25000462
fbshipit-source-id: e3609a3ae4a3476a42f61276619033054194a0d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47415
nn.ReLU works for both float and quantized input, so we don't want to define an nn.quantized.ReLU
that does the same thing as nn.ReLU; similarly for nn.quantized.functional.relu.
This also removes the numerical inconsistency for models that quantize nn.ReLU independently in QAT mode.
Test Plan: Imported from OSS
Reviewed By: z-a-f
Differential Revision: D24747035
fbshipit-source-id: b8fdf13e513a0d5f0c4c6c9835635bdf9fdc2769
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47032
These are not top-level APIs and are not supposed to be called directly by users.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24610602
fbshipit-source-id: c5510f06b05499387d70f23508470b676aea582c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46955
Initially we were thinking of adding an `invalidate_quantized_float_parameters` option to free the memory
of quantized floating point parameters, but it turns out we will do a module swap, just like in eager mode, for the modules
that are quantized, so the old floating point module will not be referenced after quantization. Therefore this feature
is only needed for functionals. Since most people use quantization with modules, we may not need this.
We'll revisit if we find there is a need for it.
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D24579400
fbshipit-source-id: fbb0e567405dc0604a2089fc001573affdade986
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46786
Previously we only supported static quantization; this PR adds support for other types of quantization.
Note that QAT is actually orthogonal to these quantization types; they refer to the convert step, where we
convert the observed module to a quantized module.
For QAT, the user will provide a CustomModule -> FakeQuantizedCustomModule mapping in prepare_custom_config_dict
and a FakeQuantizedCustomModule -> static/dynamic/weight_only quantized CustomModule mapping in convert_custom_config_dict.
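A minimal sketch of how the two mappings above might chain together. The dictionary key names and the placement of the QAT mapping under "static" are assumptions, not confirmed by this commit and to be checked against quantize_fx.py; the classes are empty placeholders and `quantized_class_for` is a hypothetical helper.

```python
class CustomModule:
    """Placeholder for the user's float custom module."""


class FakeQuantizedCustomModule:
    """Placeholder for the observed / fake-quantized version."""


class StaticQuantizedCustomModule:
    """Placeholder for the statically quantized version."""


# Assumed key names; check quantize_fx.py for the actual spelling.
prepare_custom_config_dict = {
    "float_to_observed_custom_module_class": {
        "static": {CustomModule: FakeQuantizedCustomModule},
    },
}
convert_custom_config_dict = {
    "observed_to_quantized_custom_module_class": {
        "static": {FakeQuantizedCustomModule: StaticQuantizedCustomModule},
    },
}


def quantized_class_for(float_cls: type, quant_type: str) -> type:
    """Hypothetical helper: follow float -> observed -> quantized for one quant type."""
    observed = prepare_custom_config_dict["float_to_observed_custom_module_class"][quant_type][float_cls]
    return convert_custom_config_dict["observed_to_quantized_custom_module_class"][quant_type][observed]


print(quantized_class_for(CustomModule, "static").__name__)
```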
Test Plan: Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D24514701
fbshipit-source-id: 2918be422dd76093d67a6df560aaaf949b7f338c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46346
Allow users to provide additional fusion/quant patterns for FX graph mode.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24317437
fbshipit-source-id: 719927cce50c74dffa4f848bd5c98995c944a26a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46345
Allow user to add more fusion mappings
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24317439
fbshipit-source-id: 3b144bbc305e41efbdf3e9fb25dbbeaad9e86c6a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46298
Allow the user to specify a list of qualified names for non-traceable submodules,
or the types of the non-traceable submodules.
See quantize_fx.py for the API.
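A rough sketch of how such a check could look. The key names `non_traceable_module_name` and `non_traceable_module_class`, the helper `is_non_traceable`, and the use of `isinstance` for the type match are all assumptions for illustration; quantize_fx.py is the authority on the real API.

```python
from typing import Any, Dict


class NonTraceableModule:
    """Placeholder for a module class that symbolic tracing should not enter."""


class OrdinaryModule:
    """Placeholder for a regular, traceable module class."""


def is_non_traceable(module_path: str, module: Any, prepare_custom_config_dict: Dict[str, Any]) -> bool:
    """Hypothetical helper: treat a submodule as a leaf if its name or type is listed."""
    names = prepare_custom_config_dict.get("non_traceable_module_name", [])
    classes = tuple(prepare_custom_config_dict.get("non_traceable_module_class", []))
    return module_path in names or isinstance(module, classes)


config = {
    "non_traceable_module_name": ["encoder.custom_op"],
    "non_traceable_module_class": [NonTraceableModule],
}

print(is_non_traceable("encoder.custom_op", OrdinaryModule(), config))  # listed by name
print(is_non_traceable("decoder.block", NonTraceableModule(), config))  # listed by type
print(is_non_traceable("decoder.block", OrdinaryModule(), config))      # not listed
```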
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24294210
fbshipit-source-id: eb1e309065e3dfbf31e63507aaed73587f0dae29
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45920
See docs for new way of defining custom modules
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24145856
fbshipit-source-id: 488673fba503e39e8e303ed5a776fe36899ea4e3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46095
Adds logging on usage of public quantization APIs. This only works in FB codebase
and is a no-op in OSS.
Test Plan: The test plan is fb-only
Reviewed By: raghuramank100
Differential Revision: D24220817
fbshipit-source-id: a2cc957b5a077a70c318242f4a245426e48f75e5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45919
As discussed with the JIT team, we'll run symbolic trace in the quantization functions.
prepare_fx now takes the original PyTorch model (torch.nn.Module) instead of a `GraphModule` as input.
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D24145857
fbshipit-source-id: 2b7a4ca525a7a8c23a26af54ef594c6a951e4024
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45672
This PR merges all quantization modes and exposes only the following top-level functions:
```
prepare_fx
prepare_qat_fx
convert_fx
```
Test Plan:
Imported from OSS
Reviewed By: z-a-f
Differential Revision: D24053439
fbshipit-source-id: 03d545e26a36bc22a73349061b751eeb35171e64
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45292
This PR merges all quantization modes and exposes only the following top-level functions:
```
prepare_fx
prepare_qat_fx
convert_fx
```
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23913105
fbshipit-source-id: 4e335286d6de225839daf51d1df54322d52d68e5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44856
Support following format of qconfig_dict
```python
qconfig_dict = {
    # optional, global config
    "": qconfig?,
    # optional, used for module and function types
    # could also be split into module_types and function_types if we prefer
    "object_type": [
        (nn.Conv2d, qconfig?),
        (F.add, qconfig?),
        ...,
    ],
    # optional, used for module names
    "module_name": [
        ("foo.bar", qconfig?),
        ...,
    ],
    # optional, matched in order, first match takes precedence
    "module_name_regex": [
        ("foo.*bar.*conv[0-9]+", qconfig?),
        ...,
    ],
    # priority (in increasing order): global, object_type, module_name_regex, module_name
    # qconfig == None means fusion and quantization should be skipped for anything
    # matching the rule
}
```
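The priority order stated in the comments above can be sketched as a lookup function. This is an illustration only: `resolve_qconfig` is a hypothetical helper, strings stand in for qconfig objects, `module_name` is matched exactly, and regexes are applied with `re.match`, all of which are assumptions that may not reflect the real matching rules.

```python
import re
from typing import Any, Dict, Optional


class Conv2d:
    """Placeholder standing in for nn.Conv2d."""


class Linear:
    """Placeholder standing in for nn.Linear."""


def resolve_qconfig(qconfig_dict: Dict[str, Any], module_name: str, object_type: Any) -> Optional[Any]:
    """Hypothetical helper: check rules from highest to lowest priority."""
    # 1. module_name has the highest priority.
    for name, qconfig in qconfig_dict.get("module_name", []):
        if name == module_name:
            return qconfig
    # 2. module_name_regex, matched in order; the first match wins.
    for pattern, qconfig in qconfig_dict.get("module_name_regex", []):
        if re.match(pattern, module_name):
            return qconfig
    # 3. object_type (module class or function).
    for obj_type, qconfig in qconfig_dict.get("object_type", []):
        if obj_type is object_type:
            return qconfig
    # 4. The global config has the lowest priority.
    return qconfig_dict.get("")


qconfig_dict = {
    "": "global_qconfig",
    "object_type": [(Conv2d, "conv_qconfig")],
    "module_name_regex": [("foo.*bar.*conv[0-9]+", "regex_qconfig")],
    # None means: skip fusion and quantization for this module.
    "module_name": [("foo.bar", None)],
}

print(resolve_qconfig(qconfig_dict, "foo.bar", Conv2d))          # None (skipped)
print(resolve_qconfig(qconfig_dict, "foo.x.bar.conv1", Conv2d))  # regex_qconfig
print(resolve_qconfig(qconfig_dict, "head", Conv2d))             # conv_qconfig
print(resolve_qconfig(qconfig_dict, "head", Linear))             # global_qconfig
```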
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23751304
fbshipit-source-id: 5b98f4f823502b12ae2150c93019c7b229c49c50
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43910
Adds a debug function to get a representation of all nodes in the
graph, such as
```
name           op       target          args                kwargs
x              plchdr   x               ()                  {}
linear_weight  gt_prm   linear.weight   ()                  {}
add_1          cl_fun   <bi_fun add>    (x, linear_weight)  {}
linear_1       cl_mod   linear          (add_1,)            {}
relu_1         cl_meth  relu            (linear_1,)         {}
sum_1          cl_fun   <bi_meth sum>   (relu_1,)           {'dim': -1}
topk_1         cl_fun   <bi_meth topk>  (sum_1, 3)          {}
```
using only the Python standard library. This is useful for printing the internal state of
graphs when working on FX code.
Has some on-by-default logic to shorten things so that node reprs for
toy models and unit tests fit into 80 chars.
I'm flexible on the function name and location; I care more that this is
accessible from both inside PT and from debug scripts which
are not checked in.
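A standard-library-only sketch of this kind of table printer. It is not the function added by this PR: `format_node_table`, the abbreviation map (inferred from the sample output above), and the truncation rule are all assumptions for illustration.

```python
from typing import List, Sequence

# Abbreviations inferred from the sample output; the real mapping may differ.
OP_ABBREVIATIONS = {
    "placeholder": "plchdr",
    "call_function": "cl_fun",
    "call_module": "cl_mod",
    "call_method": "cl_meth",
}


def format_node_table(
    rows: Sequence[Sequence[str]],
    header: Sequence[str] = ("name", "op", "target", "args", "kwargs"),
    max_cell_width: int = 20,
) -> str:
    """Hypothetical helper: render node rows as left-aligned, shortened columns."""

    def render(cell: str) -> str:
        text = OP_ABBREVIATIONS.get(cell, str(cell))
        if len(text) > max_cell_width:
            # Truncate long cells so toy graphs stay within a narrow terminal.
            text = text[: max_cell_width - 3] + "..."
        return text

    table: List[List[str]] = [[render(c) for c in row] for row in [header, *rows]]
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]
    lines = [
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in table
    ]
    return "\n".join(lines)


rows = [
    ("x", "placeholder", "x", "()", "{}"),
    ("linear_1", "call_module", "linear", "(add_1,)", "{}"),
    ("relu_1", "call_method", "relu", "(linear_1,)", "{}"),
]
print(format_node_table(rows))
```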
Test Plan:
see
https://gist.github.com/vkuzo/ed0a50e5d6dc7442668b03bb417bd603 for
example usage
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23435029
fbshipit-source-id: 1a2df797156a19cedd705e9e700ba7098b5a1376
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43901
Add APIs similar to those for eager mode and TorchScript graph mode:
- fuse_fx
- quantize_fx (for both post training static and qat)
- quantize_dynamic_fx (for post training dynamic)
- prepare_fx (for both post training static and qat)
- prepare_dynamic_fx (for post training dynamic)
- convert_fx (for all modes)
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23432430
fbshipit-source-id: fc99eb75cbecd6ee7a3aa6c8ec71cd499ff7e3c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43581
Add APIs similar to those for eager mode and TorchScript graph mode:
- fuse_fx
- quantize_fx (for both post training static and qat)
- quantize_dynamic_fx (for post training dynamic)
- prepare_fx (for both post training static and qat)
- prepare_dynamic_fx (for post training dynamic)
- convert_fx (for all modes)
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23385091
fbshipit-source-id: b789e54e1a0f3af6b026fd568281984e253e0433