Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73274
As noticed in https://discuss.pytorch.org/t/calibration-of-model-in-post-training-static-quantization-using-fx-api/143661/6
and related to https://github.com/pytorch/pytorch/issues/72698: when using FX quantization, if an op like view was used in a
model and its index parameters were passed to the op as variables rather than
hard-coded values, FX would mistakenly insert observers for them, leading to an
error when the observer tried to perform tensor-only operations on a
non-tensor. To fix this, an API was added for specifying the non-tensor
arguments of various ops, enabling better dtype propagation.
NON_TENSOR_ARG_DICT is a nested dict whose outer keys are named tuples
containing matching parameters for ops with non-tensor args; the
inner dict's keys are dtypes and its values are lists of the argument indices that
take such dtypes. Alternatively, instead of a list, the inner dict
value can also be a function that takes the node as an argument and
returns the list of arg indices.
Theoretically this API could support arbitrary functions, but the current
implementation is limited to simpler ones, given that the particular
issue this fixes appears to be rare.
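For illustration only, a rough sketch of the shape such a mapping could take (the key type and the entries below are hypothetical, not the actual contents of NON_TENSOR_ARG_DICT):
```
from collections import namedtuple

import torch

# Hypothetical key type: matches ops by module type / function / Tensor method name.
NonTensorPattern = namedtuple("NonTensorPattern", ["module_type", "function", "method"])

# Illustrative entries only: each pattern maps to {dtype: arg indices}, where the
# value may instead be a function of the node that returns the indices.
NON_TENSOR_ARG_DICT = {
    # Tensor.view takes integer sizes in every arg position after the input tensor.
    NonTensorPattern(None, None, "view"): {
        torch.int: lambda node: list(range(1, len(node.args))),
    },
    # Tensor.transpose takes two integer dims at arg indices 1 and 2.
    NonTensorPattern(None, None, "transpose"): {torch.int: [1, 2]},
}
```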
Note: although torch.unsqueeze and torch.transpose are listed in
quantization_patterns.py, those ops appear to be untraceable by FX. I've
included tests for their cases, but fixing that issue is beyond the scope
of this PR.
Test Plan:
python test/test_quantization.py test_non_reference_size
...
python test/test_quantization.py test_non_reference_<op>
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D34410122
fbshipit-source-id: fc09949ca8a2d6473876a4b6c214eb91e9a9dae2
(cherry picked from commit 3a1375d677b7c98d62b1f5c839645698c39b32b9)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69720
This function is also useful for DBR quant, moving it from FX utils
to common utils.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeDBR
```
Reviewed By: jerryzh168
Differential Revision: D33003473
Pulled By: vkuzo
fbshipit-source-id: 20360682c69d614a645c14fc29d3ee023d6b2623
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65033
1. Move the file:
```
hg mv caffe2/torch/quantization/fx caffe2/torch/ao/quantization/fx
hg mv caffe2/torch/quantization/quantize_fx.py caffe2/torch/ao/quantization/quantize_fx.py
```
2. Create new files
```
touch caffe2/torch/quantization/quantize_fx.py
touch caffe2/torch/quantization/fx/__init__.py
```
3. Import things in the new files
4. Add tests to test/quantization/ao_migration/test_quantization_fx.py;
this is needed because we have some fx imports in quantize_fx and fx/*.py
Test Plan: buck test mode/dev //caffe2/test:quantization
Reviewed By: vkuzo, z-a-f
Differential Revision: D30949749
fbshipit-source-id: 9e5d4d039c8a0a0820bc9040e224f0d2c26886d3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64445
AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates the quantize.py from torch.quantization to torch.ao.quantization.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.
Test Plan: `buck test mode/dev //caffe2/test:quantization`
Reviewed By: HDCharles
Differential Revision: D30734870
fbshipit-source-id: dc204f3cc46bff2cc81c95159eab9d333b43bb4b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64086
AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly.
This migrates the `quantize.py` from torch.quantization to `torch.ao.quantization`.
At this point both locations will be supported. Eventually the torch.quantization will be deprecated.
Test Plan: `buck test mode/opt //caffe2/test:quantization`
Reviewed By: jerryzh168, raghuramank100
Differential Revision: D30055886
fbshipit-source-id: 8ef7470f9fa640c0042bef5bb843e7a05ecd0b9f
Summary:
This PR enables GPU-only quantization, best used with is_reference since
there are not many GPU kernels for quantized ops as of now.
This PR mainly changes how qconfigs and their observer constructors operate once they
are attached to a module's qconfig. The function add_module_to_qconfig_obs_ctr takes the observer constructors on the original
qconfig and configures them so that, when invoked, the created observer will
be on whatever device the module occupies. (Once observers are created,
module.to(device) already moves any observers.) To do this,
a new method and a few small changes were added to the _PartialWrapper class that
our observers already use to create constructors (without changing the
existing functionality). These changes work in
concert with changes to the prepare flow so that when the qconfigs are
propagated to the modules (in quantize.py and qconfig_utils.py), they are configured using add_module_to_qconfig_obs_ctr.
Ideally this would work on other models, but the is_reference support for
a lot of modules isn't there yet; those tests should be added in a
future PR.
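A minimal sketch of the idea, with a hypothetical wrapper name (the real change lives in _PartialWrapper / add_module_to_qconfig_obs_ctr):
```
import torch

def with_module_device(observer_ctr, module: torch.nn.Module):
    """Hypothetical helper: wrap an observer constructor so that the observer
    it creates ends up on the device of `module`'s parameters/buffers."""
    def make_observer():
        observer = observer_ctr()
        devices = {p.device for p in module.parameters()} | {
            b.device for b in module.buffers()
        }
        if len(devices) == 1:
            observer.to(next(iter(devices)))
        return observer
    return make_observer

# Usage sketch:
# linear = torch.nn.Linear(4, 4).cuda()
# make_obs = with_module_device(torch.quantization.MinMaxObserver.with_args(), linear)
# obs = make_obs()  # observer created on the same device as `linear`
```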
Test Plan:
python test/test_quantization.py TestQuantizeFxModels.test_static_gpu_convert_basic
python test/test_quantization.py TestQuantizeFxModels.test_switch_device_prepare_convert
python test/test_quantization.py TestQuantizeFxModels.test_prepare_serialize_switch_device_convert
python test/test_quantization.py TestQuantizeFx.test_qconfig_precedence
Reviewed By: vkuzo
Differential Revision: D29684114
fbshipit-source-id: 19fefb8e1998eaf212723e836276ccf39467f2e7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59963
When converting, before quantizing the nodes, we call
`update_obs_for_equalization()` and `convert_eq_obs()`.
`update_obs_for_equalization`:
1. For each InputEqualizationObserver, we find the corresponding
WeightEqualizationObserver.
2. For nn.Linear layers, we create an instance of the
WeightEqualizationObserver and run forward on the observer with the given
weights.
3. Calculate the equalization scale between the
InputEqualizationObserver and WeightEqualizationObserver.
`convert_eq_obs`:
For every InputEqualizationObserver, we will do the following:
1. Create a node (ex. `x0_activation_post_process_scale`) containing the
equalization scale constant.
2. Create another node containing a `mul` operator multiplying the
equalization scale and the input.
3. Remove the current InputEqualizationObserver node, and replace it
with the `mul` node.
For every WeightEqualizationObserver, we will do the following:
1. Get the next equalization scale (we may need this for equalizing
connected linear layers).
2. Scale the weights by multiplying them by the reciprocal of the
current equalization scale and by the next equalization scale.
Currently, this supports models with `nn.Linear` layers, but does not
support connected linear layers.
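For reference, a sketch of the scale computation in step 3, using the standard range-ratio formula (observer bookkeeping omitted; the min/max arguments are assumed to be per-column range tensors):
```
import torch

def calculate_equalization_scale(x_min, x_max, w_col_min, w_col_max):
    # Balance the per-column range of the input against the per-column range of
    # the weight. Multiplying the input by this scale (and the weight columns by
    # its reciprocal) equalizes the two ranges.
    return torch.sqrt((w_col_max - w_col_min) / (x_max - x_min))
```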
Test Plan:
`python test/test_quantization.py
TestEqualizeFx.test_input_weight_equalization_convert`
Original Model:
```
.LinearModule(
(linear): Linear(in_features=2, out_features=2, bias=True)
)
```
Graph after `prepare_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_process_0 : [#users=1] = call_module[target=x_equalization_process_0](args = (%x,), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_00](args = (%x_equalization_process_0,), kwargs = {})
%linear : [#users=1] = call_module[target=linear](args = (%x_activation_post_process_0,), kwargs = {})
%linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
return linear_activation_post_process_0
```
Graph after equalization functions:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_process_0_scale : [#users=1] = get_attr[target=x_equalization_process_0_scale]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_process_0_scale), kwargs = {})
%x_activation_post_process_0 : [#users=1] = call_module[target=x_activation_post_process_00](args = (%mul,), kwargs = {})
%linear : [#users=1] = call_module[target=linear](args = (%x_activation_post_process_0,), kwargs = {})
%linear_activation_post_process_0 : [#users=1] = call_module[target=linear_activation_post_process_0](args = (%linear,), kwargs = {})
return linear_activation_post_process_0
```
Graph after `convert_fx`:
```
graph():
%x : [#users=1] = placeholder[target=x]
%x_equalization_process_0_scale : [#users=1] = get_attr[target=x_equalization_process_0_scale]
%mul : [#users=1] = call_function[target=torch.mul](args = (%x, %x_equalization_process_0_scale), kwargs = {})
%linear_input_scale_0 : [#users=1] = get_attr[target=linear_input_scale_0]
%linear_input_zero_point_0 : [#users=1] = get_attr[target=linear_input_zero_point_0]
%quantize_per_tensor : [#users=1] = call_function[target=torch.quantize_per_tensor](args = (%mul, %linear_input_scale_0, %linear_input_zero_point_0, torch.quint8), kwargs = {})
%linear : [#users=1] = call_module[target=linear](args = (%quantize_per_tensor,), kwargs = {})
%dequantize : [#users=1] = call_method[target=dequantize](args = (%linear,), kwargs = {})
return dequantize
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D29135358
fbshipit-source-id: 2d00056729041318463de61841483490b6bfeee5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59040
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724870
fbshipit-source-id: c0f748711b825cd46bdfcc05c054c77a41e8207a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59039
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724874
fbshipit-source-id: bd984716b2da1d6879c3e92fa827574783a41567
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59032
To remove Quantizer class and split prepare and convert functions to different files
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724868
fbshipit-source-id: 6df639f20076b480812b6dcf0fc7d2c87ca29d8b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59028
Previously we had an env and a quant_env in convert, which was a bit confusing;
in this PR we merge them into a single Dict[str, Tuple[Node, torch.dtype]].
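For reference, the merged environment amounts to a single mapping like the sketch below (the alias name is illustrative):
```
from typing import Dict, Tuple

import torch
from torch.fx import Node

# One environment instead of env + quant_env: for each original node name, the
# corresponding node in the converted graph together with the dtype it carries.
ConvertEnv = Dict[str, Tuple[Node, torch.dtype]]
```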
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28724863
fbshipit-source-id: 722a682c70d300a6ccd2b988786a1ac2d45e880e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58461
Improves the logic which calculates whether a node has any tensors
in its arguments by terminating the recursion early when possible.
In a future PR, we should probably ditch this entire approach and switch to
using dtype propagation.
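A minimal sketch of the early-termination idea (the `produces_tensor` helper is an assumption standing in for the real dtype bookkeeping):
```
from torch.fx import Node

def any_arg_is_tensor(node: Node, produces_tensor) -> bool:
    """Return True as soon as one tensor-producing argument is found,
    instead of exhaustively visiting the whole argument tree."""
    for arg in node.args:
        candidates = arg if isinstance(arg, (list, tuple)) else [arg]
        for a in candidates:
            if isinstance(a, Node) and (
                produces_tensor(a) or any_arg_is_tensor(a, produces_tensor)
            ):
                return True  # terminate the recursion early
    return False
```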
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28499455
fbshipit-source-id: bedd844022b90e1fcb7d7a3cb4cc65440dc9cc59
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57375
Skip observing the input for masked_fill. Currently we don't have a way to
query the type of a Proxy in GraphModule; once we have the functionality to annotate types,
we'll need to annotate the Proxy as a boolean Tensor to remove this hack.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_boolean_tensor
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28126003
fbshipit-source-id: 2989766370a607579b3ea07ca36cdc2ce35893cc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57173
If getitem is followed by an unmatched node, we'll remove the observer after it.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_getitem
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28068805
fbshipit-source-id: e79f8ec3e8fd61d348b8a7069ab0bb434d737c30
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55529
x.shape outputs a non-Tensor; add handling for it to the all_node_args_have_no_tensors function
to avoid inserting an observer for the getattr "shape" node.
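An example of the pattern this covers, sketched as a small module (any module whose forward reads `x.shape` would do):
```
import torch

class UsesShape(torch.nn.Module):
    def forward(self, x):
        # `x.shape` traces to a getattr node whose output is a torch.Size,
        # not a Tensor, so no observer should be inserted for it.
        n = x.shape[0]
        return x.reshape(n, -1)
```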
Test Plan: Imported from OSS
Reviewed By: wat3rBro
Differential Revision: D27628145
fbshipit-source-id: 4729294ab80c0a1e72440396d31e7e82257b1092
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55429
Previously we special-cased the copy operator in the normal observer-insertion code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27609972
fbshipit-source-id: 378f6aa70f18c0b477b62b6efe236648748aae7e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55388
temporarily revert D27314678 (c57541ce06); it appears to cause a perf regression that makes quantization of some models take too long to complete tests.
Reviewed By: houseroad
Differential Revision: D27583809
fbshipit-source-id: e9c088ccbfd3bfb3a1d4c7eafee3eca29ee7717b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54644
Previously we special-cased the copy operator in the normal observer-insertion code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27314678
fbshipit-source-id: d36870ceb3717bc01eaeaa6f3f1532ad562cbaf1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53586
Previously one value could only be quantized to one dtype; this PR adds support for quantizing one value
in the FX graph with multiple dtypes, e.g. first quantizing to int8 and then to float16.
We might do some follow-up PRs to clean up the hacks and refactor the code.
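Roughly, the converted graph then computes something like the sketch below for a value that gets an int8 qconfig followed by a float16 one (the qparams are illustrative):
```
import torch

x = torch.randn(4)
# First quantize the value to int8 ...
xq = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.qint8)
# ... then dequantize and convert the same value to float16.
x_fp16 = xq.dequantize().to(torch.float16)
```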
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_multiple_qconfigs_single_value
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26912676
fbshipit-source-id: ae3653fd67f05870a3a9e808f491871826c555d5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54859
This is applicable to the case where a call_function linear op is one of the users of a quantize op.
In order to map the qparams of quantize_per_tensor to the qparams of the linear operator
that consumes it, we need to use the FQN of the module containing the linear op for the qparams of quantize_per_tensor.
Test Plan:
python test/test_quantization.py test_qparams_fqn
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27390505
fbshipit-source-id: a47af0e5ac016f2b2df74fbdf45afe99dc04be46
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54624
Previously we were creating getattr nodes for dtype and axis.
The FX convention is that primitive types are embedded as literals in args/kwargs.
With this change we won't see getattr nodes in the graph anymore for dtype/axis.
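For illustration, tracing a module that uses a dtype shows the literal embedded directly in the node's args (printed output shown approximately):
```
import torch
from torch.fx import symbolic_trace

class M(torch.nn.Module):
    def forward(self, x):
        return x.to(torch.float16)

# The traced graph embeds torch.float16 directly in the call args, roughly:
#   %to : ... = call_method[target=to](args = (%x, torch.float16), kwargs = {})
print(symbolic_trace(M()).graph)
```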
Test Plan:
python test/test_quantization.py TestQuantizeFx
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27306898
fbshipit-source-id: a7c91c7cb21ee96015c7f8830b38d943ada65358
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53187
Before this diff, if we had code like
```
x = any_quant_layer(...)
x_size0 = x.size(0)
torch._assert(x_size0 == 1, "expected size 1 at dim 0")
```
The convert code would try to insert a dequantize after `x_size0`,
because it was a descendant of a quantized node and it was needed
for a non-quantized operation. Since the actual type of the `size`
function output is an integer, this does not make sense.
For now, this is fixed as a one-off to unblock a customer. In the
future, we may need to think more deeply about all the functions which
can return non-quantized types from quantized tensors and make sure
they are all covered.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_assert_on_size_after_quant_layer
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26780690
fbshipit-source-id: 44cc25c9179d460efb3f110d40b73d854d676af5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53120
Currently there is a pattern which is not handled correctly by
FX graph mode quantization:
```
def forward(self, x):
    ndim = x.ndim
    # or add, mul, div, etc
    x = torch.sub(x, ndim)
    return x
```
The reason this does not work is as follows:
1. x.ndim becomes a getattr node
2. the real world type of x.ndim is an integer, but this is not known from the graph (yet)
3. binary ops such as `torch.sub` require quantization of inputs
4. the framework inserts an observer to observe the output of `ndim`
5. the observer fails because `ndim` is not a Tensor
For now, we hack in a bandaid to unblock some teams; none of this is intended as a final design.
We will have to think of a better, landable fix (TBD).
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_getattr_with_nontensor_result
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26756180
fbshipit-source-id: c0e498766b22c23df74fbb5aaeaa237c4c944263
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52534
Currently linear_dynamic_fp16 has a signature that's tied to fbgemm/qnnpack.
We'll need to produce a pattern equivalent to linear_dynamic_fp16 to support extensions
to other backends.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_dynamic_fp16
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26557726
fbshipit-source-id: 270c9f781f73c79416a092b7831294cabca84b0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51171
Following up on the previous PR, this PR makes scale and zero_point for quantize_per_tensor be
registered as buffers in the module.
Currently the dtype is still stored as an attribute (not registered as a buffer) since we can only register tensor types.
Test Plan:
python test/test_quantization.py test_qparams_buffers
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26092964
fbshipit-source-id: a54d914db7863402f2b5a3ba2c8ce8b27c18b47b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51166
Currently scale and zero_point values are stored as constants in the graph.
This prevents these values from being updated in the graph and also does not allow saving
them to the state_dict.
After this PR we store scale/zero_point values for quantized ops as buffers in the root module
and create get_attr nodes for them in the graph.
We also use the FQN of the module where the quantized ops are present to name these attributes so
that they can be uniquely identified and mapped to quantized ops.
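A minimal sketch of the scheme (the attribute names follow the FQN-based convention visible in the graph dumps above, but the helper itself is hypothetical):
```
import torch

def register_qparams(root: torch.nn.Module, fqn: str, scale: float, zero_point: int):
    # Buffer names cannot contain dots, so flatten the FQN first.
    name = fqn.replace(".", "_")
    # Stored as buffers, the qparams show up in state_dict and can be updated;
    # the graph then references them through get_attr nodes such as
    # %linear_input_scale_0 / %linear_input_zero_point_0.
    root.register_buffer(f"{name}_input_scale_0", torch.tensor([scale]))
    root.register_buffer(f"{name}_input_zero_point_0",
                         torch.tensor([zero_point], dtype=torch.int64))
```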
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qparams_buffers
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26092965
fbshipit-source-id: b549b2d3dccb45c5d38415ce95a09c26f5bd590b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50058
This PR adds support for {input/output}_quantized_idxs for standalone modules.
If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module expects float
input and produces float output, and quantizes the input and dequantizes the output internally.
If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module expects quantized
input and produces quantized output; the input is quantized in the parent module, and the output is dequantized
in the parent module as well. This is similar to current quantized modules like nn.quantized.Conv2d.
For more details, please see the test case.
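Conceptually, the boundary behaviour is selected with a config like the sketch below (exact config plumbing omitted; see the test case for the real usage):
```
# Standalone module consumes/produces float tensors; quantize/dequantize
# happen inside the standalone module itself.
float_interface = {"input_quantized_idxs": [], "output_quantized_idxs": []}

# Standalone module consumes/produces quantized tensors at index 0; the parent
# module quantizes the input and dequantizes the output at the boundary.
quantized_interface = {"input_quantized_idxs": [0], "output_quantized_idxs": [0]}
```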
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25768910
fbshipit-source-id: 96c21a3456cf192c8f1400afa4e86273ee69197b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48331
Enables mypy to not ignore type errors in FX quantization files. Fixes the easy
typing errors inline, and comments out the harder errors to be fixed at a later time.
After this PR, mypy runs without errors on `torch/quantization`.
Test Plan:
```
> mypy torch/quantization/
Success: no issues found in 25 source files
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25133348
fbshipit-source-id: 0568ef9405b292b80b3857eae300450108843e80
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47068
Filter the dtype config before performing the quantization in linear
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D24627907
fbshipit-source-id: 162fa47b3fcf6648049f8bc0438e41ee97ac19e9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46786
Previously we only supported static quant; this PR adds support for other types of quantization.
Note that QAT is actually orthogonal to these quant types; this refers to the convert step where we
convert the observed module to a quantized module.
For QAT, the user provides a CustomModule -> FakeQuantizedCustomModule mapping in prepare_custom_config_dict
and a FakeQuantizedCustomModule -> static/dynamic/weight_only quantized CustomModule mapping in convert_custom_config_dict.
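Sketched in config form (the class names come from the description above; the dict keys follow the custom-config convention and the exact nesting may differ by release):
```
import torch

# Placeholder classes standing in for a user's custom modules.
class CustomModule(torch.nn.Module): ...
class FakeQuantizedCustomModule(torch.nn.Module): ...
class StaticQuantCustomModule(torch.nn.Module): ...

prepare_custom_config_dict = {
    "float_to_observed_custom_module_class": {
        "qat": {CustomModule: FakeQuantizedCustomModule},
    },
}
convert_custom_config_dict = {
    "observed_to_quantized_custom_module_class": {
        # one of "static" / "dynamic" / "weight_only"
        "static": {FakeQuantizedCustomModule: StaticQuantCustomModule},
    },
}
```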
Test Plan: Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D24514701
fbshipit-source-id: 2918be422dd76093d67a6df560aaaf949b7f338c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45672
This PR merges all quantization modes and will only expose the following top-level functions:
```
prepare_fx
prepare_qat_fx
convert_fx
```
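A minimal post-training static quantization flow with the merged API (module paths predate the later torch.ao migration described above, and exact signatures have changed across releases):
```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()).eval()
qconfig_dict = {"": get_default_qconfig("fbgemm")}

prepared = prepare_fx(model, qconfig_dict)   # insert observers
prepared(torch.randn(2, 4))                  # calibrate
quantized = convert_fx(prepared)             # convert to quantized ops
```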
Test Plan:
Imported from OSS
Reviewed By: z-a-f
Differential Revision: D24053439
fbshipit-source-id: 03d545e26a36bc22a73349061b751eeb35171e64