Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68343
Refactors `AutoQuantizationState._get_input_args_quant_dequant_info` to
rely on less internal state: the state is now passed in as arguments,
making the function side-effect free, and the function is moved to the
utils file. This will help with a future refactor to cache this info at
runtime.
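The shape of the refactor can be pictured with a small pure-Python sketch (all names below are simplified stand-ins, not the actual DBR internals): instead of a method reading `self` state, the state is passed in explicitly, so the result depends only on the arguments.

```python
# Before (simplified): the method reads instance state, so its result
# cannot be cached based on its arguments alone.
class AutoQuantizationStateBefore:
    def __init__(self, idx_to_op_info):
        self.idx_to_op_info = idx_to_op_info

    def _get_input_args_info(self, op_idx):
        return self.idx_to_op_info[op_idx]

# After (simplified): a side-effect-free function in a utils file; the
# relevant state arrives as an argument, so the output depends only on
# the inputs, which makes caching straightforward.
def get_input_args_info(idx_to_op_info, op_idx):
    return idx_to_op_info[op_idx]
```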
Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
```
Reviewed By: jerryzh168
Differential Revision: D32463760
Pulled By: vkuzo
fbshipit-source-id: bdd50b0772f128755f9b734b5eeb0a9f4bc4970b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68342
Before this PR, `get_quantized_op` required the current callable.
After this PR, `get_quantized_op` only requires `seen_op_info`.
The signature was changed slightly to return `None` if the original
callable does not need replacement for quantization.
This will make it easier to make performance improvements in a
future PR.
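The new contract can be sketched in plain Python (the op table and the `SeenOpInfo` field here are invented for illustration): returning `None` signals that the original callable should be left as-is.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SeenOpInfo:
    # invented stand-in for the op info recorded during tracing
    type: str

# hypothetical mapping from traced op type to its quantized counterpart
_OP_TO_QUANTIZED_OP = {"add": "quantized_add", "mul": "quantized_mul"}

def get_quantized_op(seen_op_info: SeenOpInfo) -> Optional[str]:
    # None means the original callable needs no replacement
    return _OP_TO_QUANTIZED_OP.get(seen_op_info.type)
```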
Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
```
Reviewed By: jerryzh168
Differential Revision: D32463768
Pulled By: vkuzo
fbshipit-source-id: 5db2c4199f6c0529817f4c058f81fd1d32b9fa9f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68341
Before this PR, `get_func_output_obs_type` used information from the
incoming op and its arguments, which made it hard to cache.
This PR refactors `get_func_output_obs_type` to only use information
collected during tracing. This will make it easier to make performance
improvements in a future PR.
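Once the function depends only on tracing-time information, memoizing it becomes straightforward; a hypothetical sketch of that future caching (the type and op names are invented):

```python
import functools
from dataclasses import dataclass

@dataclass(frozen=True)
class SeenOpInfo:
    # frozen so instances are hashable and can serve as cache keys
    type: str

@functools.lru_cache(maxsize=None)
def get_func_output_obs_type(seen_op_info: SeenOpInfo):
    # depends only on info collected during tracing, so the result
    # can safely be computed once per distinct traced op
    return "int8" if seen_op_info.type in ("add", "mul") else None
```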
Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
```
Reviewed By: jerryzh168
Differential Revision: D32463755
Pulled By: vkuzo
fbshipit-source-id: 25a220de652f0285685d43aedf7392082104b26c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68251
Before this PR, DBR quantization recalculated scale and zero_point
in the converted model every time they were needed, which is slow.
This PR adds a pass to the convert function that goes through every
observer in the model and caches its scale and zero_point.
Note: restricting this to only the observers which correspond to int8
operations is left for a future PR.
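A minimal sketch of the caching idea, using made-up observer objects rather than the real PyTorch observers: the qparams are computed once during convert instead of on every inference.

```python
class FakeObserver:
    # stand-in for a real observer; tracks the value range it has seen
    def __init__(self, min_val, max_val):
        self.min_val, self.max_val = min_val, max_val

    def calculate_qparams(self):
        # toy uint8 affine-quantization math, for illustration only
        scale = (self.max_val - self.min_val) / 255.0
        zero_point = round(-self.min_val / scale)
        return scale, zero_point

def cache_qparams(observers):
    # one pass at convert time: compute and store each observer's
    # scale and zero_point so inference can just look them up
    return {name: obs.calculate_qparams() for name, obs in observers.items()}
```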
Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
```
Reviewed By: VitalyFedyunin
Differential Revision: D32463769
Pulled By: vkuzo
fbshipit-source-id: d1d2e598e2bccc1958e5023096b451d69dc34e29
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67776
This adds barebones `add_loggers` and `extract_logger_info` APIs
to analyze intermediate activations of models quantized with
dynamic tracing. The API generally matches the NS for FX tool,
with some omissions. For now, this is moving fast to help us
debug real models; the API will be fully aligned with NS for FX
in future PRs, before it is marketed to users.
Note: the current approach couples Numeric Suite with the quantization
logic. This is not the best for composability, and may be changed
at a future time.
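The logging idea can be pictured with a plain-Python sketch (the `Logger` class and `extract_logger_info` signature here are illustrative, not the actual DBR Numeric Suite API): a wrapper records each callable's outputs so float and quantized runs can be compared afterwards.

```python
class Logger:
    # records the intermediate activations of a wrapped callable
    def __init__(self, fn, name):
        self.fn, self.name, self.stats = fn, name, []

    def __call__(self, *args, **kwargs):
        out = self.fn(*args, **kwargs)
        self.stats.append(out)
        return out

def extract_logger_info(loggers):
    # collect the recorded activations, keyed by logger name
    return {lg.name: lg.stats for lg in loggers}
```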
Test Plan:
```
python test/test_quantization.py TestAutoTracing.test_numeric_suite
```
Differential Revision: D32231332
Reviewed By: jerryzh168
Pulled By: vkuzo
fbshipit-source-id: 8adfb50cd8b7836c391669afe2e2ff6acae6d40a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64676
We implement a working eager mode quantization flow which uses
tracing and `__torch_function__` and `torch.nn.Module.__call__` overrides to automate the model modifications needed for quantization. Partial program capture (instead of full program capture) is used, allowing this scheme to target a wide variety of user programs. Control flow over quantizeable ops is not supported, but general control flow is supported.
In particular:
* `auto_trace.py` contains the machinery to override `__torch_function__` and `torch.nn.Module.__call__` and call hooks before and after each quantizeable module or function
* `quantization_state.py` contains the state needed to use the hooks to implement quantization logic such as adding quants/dequants, observers, etc.
* please see `README.md` for more details
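A stripped-down, torch-free analogue of the interception machinery (all names below are invented): hooks run before and after each intercepted call, which is where the quantization logic from `quantization_state.py` would insert observers and quants/dequants.

```python
class AutoTracer:
    # runs user hooks around every intercepted callable, analogous to
    # the __torch_function__ / __call__ overrides in auto_trace.py
    def __init__(self, before_hook, after_hook):
        self.before_hook = before_hook
        self.after_hook = after_hook

    def intercept(self, fn):
        def wrapped(*args, **kwargs):
            # the before hook may rewrite args (e.g. insert a quant)
            args, kwargs = self.before_hook(fn, args, kwargs)
            out = fn(*args, **kwargs)
            # the after hook may observe or rewrite the output
            return self.after_hook(fn, out)
        return wrapped
```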
Test Plan:
```
python test/test_quantization.py TestAutoTracing
python test/test_quantization.py TestAutoTracingModels
```
Differential Revision: D31992281
Reviewed By: HDCharles
Pulled By: vkuzo
fbshipit-source-id: 6d40e855f3c96b9a4b637a0e677388a7b92f7967