Commit Graph

6 Commits

Author SHA1 Message Date
Vasiliy Kuznetsov
2755cf457c dbr quant: refactor AutoQuantizationState._get_input_args_quant_dequant_info (#68343)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68343

Refactors `AutoQuantizationState._get_input_args_quant_dequant_info` to
use less internal state, makes the function side-effect free by passing
the state in via arguments, and moves the function to the utils file.

This will help with a future refactor to cache this info at runtime.
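
The refactor pattern can be sketched in simplified form (all names below are hypothetical stand-ins, not the actual PyTorch internals): a method that reads instance state and mutates it becomes a pure helper that receives that state explicitly, so its output depends only on its arguments and is safe to cache.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical stand-in for the per-op info collected during tracing.
@dataclass
class SeenOpInfo:
    input_tensor_ids: List[int]

# Before: a method coupled to internal state, with a side effect.
class AutoQuantizationState:
    def __init__(self, seen_op_info: SeenOpInfo):
        self.seen_op_info = seen_op_info
        self.cache = {}

    def _get_input_args_quant_dequant_info(self):
        # reads self.seen_op_info and mutates self.cache
        info = [(tid, 0.1, 0) for tid in self.seen_op_info.input_tensor_ids]
        self.cache["input_args_info"] = info
        return info

# After: a pure utility function; the state is passed in and nothing is
# mutated, so the result can be cached at runtime.
def get_input_args_quant_dequant_info(
    seen_op_info: SeenOpInfo,
) -> List[Tuple[int, float, int]]:
    return [(tid, 0.1, 0) for tid in seen_op_info.input_tensor_ids]
```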

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
```

Reviewed By: jerryzh168

Differential Revision: D32463760

Pulled By: vkuzo

fbshipit-source-id: bdd50b0772f128755f9b734b5eeb0a9f4bc4970b
2021-11-20 15:17:02 -08:00
Vasiliy Kuznetsov
57472ec414 dbr quant: refactor get_quantized_op to only use seen_op_info (#68342)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68342

Before this PR, `get_quantized_op` required the current callable.

After this PR, `get_quantized_op` only requires `seen_op_info`.
The signature was changed slightly to return `None` if the original
callable does not need replacement for quantization.

This will make it easier to make performance improvements in a
future PR.
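
The new contract can be illustrated with a minimal sketch (the op names and the replacement map below are hypothetical; the real code maps torch callables to their quantized counterparts):

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-ins for framework ops.
def conv2d(x): return x + 1
def quantized_conv2d(x): return x + 2
def relu(x): return max(x, 0)  # needs no int8 replacement

class SeenOpInfo:
    """Simplified record of an op captured during tracing."""
    def __init__(self, type_: Callable):
        self.type = type_

QUANT_REPLACEMENT_MAP: Dict[Callable, Callable] = {conv2d: quantized_conv2d}

def get_quantized_op(seen_op_info: SeenOpInfo) -> Optional[Callable]:
    # New signature: only seen_op_info is needed; None means the original
    # callable needs no replacement for quantization.
    return QUANT_REPLACEMENT_MAP.get(seen_op_info.type)
```

Returning `None` rather than echoing the original callable lets the caller skip any swap work for ops that already run correctly on quantized inputs.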

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
```

Reviewed By: jerryzh168

Differential Revision: D32463768

Pulled By: vkuzo

fbshipit-source-id: 5db2c4199f6c0529817f4c058f81fd1d32b9fa9f
2021-11-20 15:16:59 -08:00
Vasiliy Kuznetsov
9cf4779ec9 dbr quant: refactor get_func_output_obs_type to only use seen_op_info (#68341)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68341

Before this PR, `get_func_output_obs_type` used information from the
incoming op and its arguments, which made it hard to cache.

This PR refactors `get_func_output_obs_type` to only use information
collected during tracing. This will make it easier to make performance
improvements in a future PR.
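
A simplified sketch of the idea (the enum values and fields below are illustrative, not the exact PyTorch definitions): because the decision depends only on `seen_op_info`, it can be computed once after tracing and cached.

```python
import enum

class FuncOutputObsType(enum.Enum):
    NONE = 0
    NEW_OBS = 1

class SeenOpInfo:
    """Simplified record of info collected during tracing."""
    def __init__(self, type_name: str, output_is_quantized: bool):
        self.type_name = type_name
        self.output_is_quantized = output_is_quantized

def get_func_output_obs_type(seen_op_info: SeenOpInfo) -> FuncOutputObsType:
    # Decided purely from traced info (no live op or arguments needed),
    # so the result is cacheable per seen op.
    if seen_op_info.output_is_quantized:
        return FuncOutputObsType.NEW_OBS
    return FuncOutputObsType.NONE
```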

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
```

Reviewed By: jerryzh168

Differential Revision: D32463755

Pulled By: vkuzo

fbshipit-source-id: 25a220de652f0285685d43aedf7392082104b26c
2021-11-20 15:16:56 -08:00
Vasiliy Kuznetsov
ed6ef0eec4 dbr quantization: inline scale and zp (#68251)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68251

Before this PR, DBR quantization used to recalculate scale and zero_point
in the converted model every time it was needed, which is slow.
This PR creates a pass during the convert function to go through every
observer in the model and cache its scale and zero_point.

Note: restricting this to only the observers which correspond to int8
operations is left for a future PR.
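
The convert-time pass can be sketched as follows (a hypothetical simplification: real PyTorch observers track min/max statistics and expose `calculate_qparams()`, which returns the scale and zero_point):

```python
class MinMaxObserver:
    """Hypothetical simplification of a PyTorch observer module."""
    def __init__(self, min_val: float, max_val: float):
        self.min_val, self.max_val = min_val, max_val
        self.scale = None
        self.zero_point = None

    def calculate_qparams(self):
        # affine quantization parameters for a uint8 range of [0, 255]
        scale = (self.max_val - self.min_val) / 255.0
        zero_point = round(-self.min_val / scale)
        return scale, zero_point

def inline_qparams(observers):
    # convert-time pass: compute scale/zero_point once per observer and
    # cache them, instead of recalculating on every forward
    for obs in observers:
        obs.scale, obs.zero_point = obs.calculate_qparams()
```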

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
```

Reviewed By: VitalyFedyunin

Differential Revision: D32463769

Pulled By: vkuzo

fbshipit-source-id: d1d2e598e2bccc1958e5023096b451d69dc34e29
2021-11-20 15:16:51 -08:00
Vasiliy Kuznetsov
ca499567d2 barebones numeric suite for quantization with dynamic tracing (#67776)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67776

This adds a barebones `add_loggers` and `extract_logger_info` API
to analyze intermediate activations of models using quantization
with dynamic tracing.  The API generally matches the NS for FX tool,
with some omissions.  For now, this is moving fast to help us debug
real models; the API will be fully aligned in future PRs before it is
marketed to users.

Note: the current approach couples Numeric Suite with the quantization
logic. This is not the best for composability, and may be changed
at a future time.
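
The shape of the API can be illustrated with a toy, framework-free sketch (the function names mirror the PR's `add_loggers`/`extract_logger_info`, but everything else here is hypothetical): loggers are interleaved after each layer, record intermediate activations during a run, and their captures are collected at the end.

```python
class Logger:
    """Hypothetical logger that records each activation it sees."""
    def __init__(self, name):
        self.name = name
        self.stats = []

    def __call__(self, x):
        self.stats.append(x)
        return x  # pass-through: logging must not change the computation

def add_loggers(layers):
    # interleave a logger after each (name, fn) layer
    logged, wrapped = {}, []
    for name, fn in layers:
        logger = Logger(name)
        logged[name] = logger
        wrapped.append((fn, logger))
    return wrapped, logged

def extract_logger_info(logged):
    # collect captured intermediate activations keyed by layer name
    return {name: logger.stats for name, logger in logged.items()}

def run(wrapped, x):
    for fn, logger in wrapped:
        x = logger(fn(x))
    return x
```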

Test Plan:
```
python test/test_quantization.py TestAutoTracing.test_numeric_suite
```
Differential Revision: D32231332

Reviewed By: jerryzh168

Pulled By: vkuzo

fbshipit-source-id: 8adfb50cd8b7836c391669afe2e2ff6acae6d40a
2021-11-20 15:15:48 -08:00
Vasiliy Kuznetsov
4466ba8f30 Working POC of define-by-run quantization (#64676)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64676

We implement a working eager-mode quantization flow which uses tracing
and overrides of `__torch_function__` and `torch.nn.Module.__call__` to
automate the model modifications needed for quantization.  Partial
program capture (instead of full program capture) is used, allowing this
scheme to target a wide variety of user programs.  Control flow over
quantizeable ops is not supported, but general control flow is supported.

In particular:
* `auto_trace.py` contains the machinery to override `__torch_function__` and `torch.nn.Module.__call__` and call hooks before and after each quantizeable module or function
* `quantization_state.py` contains the state needed to use the hooks to implement quantization logic such as adding quants/dequants, observers, etc.
* please see `README.md` for more details
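
The hook mechanism can be illustrated without PyTorch (a hypothetical, heavily simplified dispatcher standing in for the `__torch_function__` override): quantizeable ops are wrapped with before/after hooks, everything else passes through untouched.

```python
QUANTIZEABLE = {"conv", "linear"}  # illustrative set of quantizeable ops

class QuantizationState:
    """Simplified stand-in for the per-module quantization state."""
    def __init__(self):
        self.calls = []

    def op_prepare_before_hook(self, op_name, x):
        self.calls.append(("before", op_name))
        return x  # a real implementation would observe/quantize inputs here

    def op_prepare_after_hook(self, op_name, y):
        self.calls.append(("after", op_name))
        return y  # ...and observe outputs here

def dispatch(state, op_name, fn, x):
    # mirrors the override: hooks wrap quantizeable ops only, so general
    # control flow in the user program is unaffected
    if op_name in QUANTIZEABLE:
        x = state.op_prepare_before_hook(op_name, x)
        y = fn(x)
        return state.op_prepare_after_hook(op_name, y)
    return fn(x)
```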

Test Plan:
```
python test/test_quantization.py TestAutoTracing
python test/test_quantization.py TestAutoTracingModels
```

Differential Revision: D31992281

Reviewed By: HDCharles

Pulled By: vkuzo

fbshipit-source-id: 6d40e855f3c96b9a4b637a0e677388a7b92f7967
2021-11-11 06:25:24 -08:00