Commit Graph

114 Commits

Author SHA1 Message Date
Jerry Zhang
9df605133e [bc-breaking] reference option for linear produces a pattern instead of a reference linear module (#61892)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61892

This PR changes is_reference=True for linear to produce a pattern consisting of dequant - float linear - quant instead of a reference linear module. This is useful for future transformations to custom backends, and it also helps simplify the implementation of
convert in the future.
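
For illustration, a minimal sketch of the shape of the emitted pattern (illustrative names, not the exact FX node names):

```
# Hedged sketch of the "dequant - float linear - quant" pattern described above.
import torch
import torch.nn.functional as F

def reference_linear_pattern(qx, weight, bias, out_scale, out_zero_point):
    x = qx.dequantize()                    # dequant
    y = F.linear(x, weight, bias)          # float linear
    return torch.quantize_per_tensor(      # quant
        y, out_scale, out_zero_point, torch.quint8)
```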

Test Plan:
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29810657

fbshipit-source-id: 949615bbc017bc454d81c8a6b2bdec53badaab19
2021-07-27 09:49:20 -07:00
Jerry Zhang
2d7c1e3fa8 [bc-breaking] Produce quantization pattern for add_scalar and mul_scalar (#61859)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61859

BC-breaking note:
Previously we did not add an observer/fake_quant for the output of add/mul for tensor - scalar operations;
in this PR we add an observer/fake_quant instance (the same one as the input's) to correctly model
the behavior of the quantized add_scalar and mul_scalar ops (since quantized add/mul scalar assume the
output quantized tensor has the same quantization parameters as the input).
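
A conceptual sketch of the shared-observer idea (not the internal pass): attaching the same observer instance to both the input and the output guarantees identical qparams on both ends:

```
import torch
from torch.quantization import MinMaxObserver

obs = MinMaxObserver()   # one shared instance
x = torch.randn(2, 2)
obs(x)                   # observes the input of add(x, 3.0)
obs(x + 3.0)             # the same instance observes the output
scale, zero_point = obs.calculate_qparams()
# input and output are then quantized with the same (scale, zero_point)
```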

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_add
python test/test_quantization.py TestQuantizeFxOps.test_mul

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29770859

fbshipit-source-id: f43fcbfecd04c392467770b22c481bbbdaf43c25
2021-07-27 02:46:00 -07:00
Jerry Zhang
457a3fb6d1 [bc-breaking][quant][graphmode][fx] Produce dequant - fp_op - quant pattern for copy nodes (#61763)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61763

This PR changes the is_reference=True option for convert_fx to produce a dequant - fp_op - quant
pattern for copy nodes such as the maxpool op.

Before the PR:
```
def forward(self, x):
    maxpool2d_input_scale_0 = self.maxpool2d_input_scale_0
    maxpool2d_input_zero_point_0 = self.maxpool2d_input_zero_point_0
    quantize_per_tensor = torch.quantize_per_tensor(x, maxpool2d_input_scale_0, maxpool2d_input_zero_point_0, torch.quint8);  x = maxpool2d_input_scale_0 = maxpool2d_input_zero_point_0 = None
    maxpool2d = self.maxpool2d(quantize_per_tensor);  quantize_per_tensor = None
    dequantize = maxpool2d.dequantize();  maxpool2d = None
    return dequantize
```

After (we expand the maxpool2d that works with quantized input to a "dequant - maxpool2d - quant" pattern):
```
def forward(self, x):
    maxpool2d_input_scale_0 = self.maxpool2d_input_scale_0
    maxpool2d_input_zero_point_0 = self.maxpool2d_input_zero_point_0
    quantize_per_tensor = torch.quantize_per_tensor(x, maxpool2d_input_scale_0, maxpool2d_input_zero_point_0, torch.quint8);  x = maxpool2d_input_scale_0 = maxpool2d_input_zero_point_0 = None
    dequantize = quantize_per_tensor.dequantize();  quantize_per_tensor = None
    maxpool2d = self.maxpool2d(dequantize);  dequantize = None
    maxpool2d_output_scale_0 = self.maxpool2d_output_scale_0
    maxpool2d_output_zero_point_0 = self.maxpool2d_output_zero_point_0
    quantize_per_tensor_1 = torch.quantize_per_tensor(maxpool2d, maxpool2d_output_scale_0, maxpool2d_output_zero_point_0, torch.quint8);  maxpool2d = maxpool2d_output_scale_0 = maxpool2d_output_zero_point_0 = None
    dequantize_1 = quantize_per_tensor_1.dequantize();  quantize_per_tensor_1 = None
    return dequantize_1
```

Note that the call to self.maxpool2d is expanded to:
```
    dequantize = quantize_per_tensor.dequantize();  quantize_per_tensor = None
    maxpool2d = self.maxpool2d(dequantize);  dequantize = None
    maxpool2d_output_scale_0 = self.maxpool2d_output_scale_0
    maxpool2d_output_zero_point_0 = self.maxpool2d_output_zero_point_0
    quantize_per_tensor_1 = torch.quantize_per_tensor(maxpool2d, maxpool2d_output_scale_0, maxpool2d_output_zero_point_0, torch.quint8);  maxpool2d = maxpool2d_output_scale_0 = maxpool2d_output_zero_point_0 = None
```

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_copy_node_has_shared_actpp_instance
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29728900

fbshipit-source-id: cf2c7f1f6659e3ba97cbb7c6204dd13983da10bd
2021-07-25 19:49:13 -07:00
Jerry Zhang
cc263ef795 [bc-breaking][quant][graphmode][fx] Add observer/fake_quant for copy nodes (#61687)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61687

Previously we did not insert an observer/fake_quant for the output of copy nodes (e.g. maxpool).
But to produce reference patterns we need to insert an observer/fake_quant for the output and later convert that to a quantize
node.

Model:
```
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.maxpool2d = torch.nn.MaxPool2d(kernel_size=3)

    def forward(self, x):
        x = self.maxpool2d(x)
        return x
```
Result of prepare:

Before:
```
def forward(self, x):
    x_activation_post_process_0 = self.x_activation_post_process_0(x);  x = None
    maxpool2d = self.maxpool2d(x_activation_post_process_0);  x_activation_post_process_0 = None
    return maxpool2d
```

After:
```
def forward(self, x):
    x_activation_post_process_0 = self.x_activation_post_process_0(x);  x = None
    maxpool2d = self.maxpool2d(x_activation_post_process_0);  x_activation_post_process_0 = None
    maxpool2d_activation_post_process_0 = self.maxpool2d_activation_post_process_0(maxpool2d);  maxpool2d = None
    return maxpool2d_activation_post_process_0
```

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29715566

fbshipit-source-id: 817df9b2933a35cad5331d8d8ce1c5ba0752e9df
2021-07-23 21:29:37 -07:00
Charles David Hernandez
32d0c3e8ee Support for reference convert_fx working on gpu
Summary:
This PR enables GPU-only quantization, best used with is_reference since
there are not many GPU kernels for quantized ops as of now.

This PR mainly changes how qconfigs and their observer constructors operate once they
are attached to a module's qconfig. The function add_module_to_qconfig_obs_ctr takes the observer constructors on the original
qconfig and configures them so that, when invoked, the created observer will
be on whatever device the module occupies. (Once observers are created,
module.to(device) is already set up so that it moves any observers.) To do this,
a new method and a few small changes were added to the _PartialWrapper class that
our observers already use to create constructors (without changing the
existing functionality). These changes work in
concert with changes to the prepare flow such that when the qconfigs are
propagated to the modules (in quantize.py and qconfig_utils.py) they are configured using add_module_to_qconfig_obs_ctr.
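
A minimal sketch of the idea, with make_device_aware as a hypothetical stand-in for what add_module_to_qconfig_obs_ctr configures on _PartialWrapper:

```
import torch
from torch.quantization import MinMaxObserver

def make_device_aware(obs_ctr, module):
    # wrap an observer constructor so the created observer lands on the
    # same device as the module's parameters (CPU if it has none)
    def ctr():
        params = list(module.parameters())
        device = params[0].device if params else torch.device("cpu")
        return obs_ctr().to(device)
    return ctr

lin = torch.nn.Linear(4, 4).to("cuda" if torch.cuda.is_available() else "cpu")
obs = make_device_aware(MinMaxObserver, lin)()  # observer on lin's device
```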

Ideally this would work on other models as well, but the is_reference support for
a lot of modules isn't there yet; those tests should be added in a
future PR.

Test Plan:
python test/test_quantization.py TestQuantizeFxModels.test_static_gpu_convert_basic

python test/test_quantization.py TestQuantizeFxModels.test_switch_device_prepare_convert

python test/test_quantization.py TestQuantizeFxModels.test_prepare_serialize_switch_device_convert

python test/test_quantization.py TestQuantizeFx.test_qconfig_precedence

Reviewed By: vkuzo

Differential Revision: D29684114

fbshipit-source-id: 19fefb8e1998eaf212723e836276ccf39467f2e7
2021-07-23 10:30:38 -07:00
Vasiliy Kuznetsov
a70505cdbd ns for fx: support comparing fp32 vs fp32_prepared, except shadowed (#61129)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61129

Adds support for comparing an fp32 model (without quantization) to an
fp32 model prepared with quantization. The main missing feature was
handling conv-bn fusion, since this fusion for PTQ happens outside
of quantization patterns.

Adds testing for this case for comparing weights and comparing
activations.

Adds a TODO for also handling this for shadow activations; we need to
first stop removing observers in graph passes before we can add
this support, which will be in a future PR.
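
To make the new capability concrete, a minimal sketch of the comparison; the NS for FX import path is an assumption (it has moved across releases), so treat this as illustrative:

```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx
from torch.quantization._numeric_suite_fx import extract_weights  # assumed path

m = torch.nn.Sequential(torch.nn.Conv2d(3, 3, 1),
                        torch.nn.BatchNorm2d(3)).eval()
mp = prepare_fx(m, {"": get_default_qconfig("fbgemm")})  # fuses conv-bn
# fp32 model (without quantization) vs fp32 model prepared with quantization:
results = extract_weights("fp32", m, "fp32_prepared", mp)
```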

Test Plan:
```
python test/test_quantization.py TestFXGraphMatcherModels.test_mobilenet_v2
python test/test_quantization.py TestFXGraphMatcherModels.test_mobilenet_v2_qat
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_compare_activations_conv
```

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D29520009

fbshipit-source-id: f63484a998f1424bd9cacf5d823b82b2edfea1ae
2021-07-17 20:52:23 -07:00
Jerry Zhang
4a3eea9a6a [quant][graphmode][fx] Produce reference linear module in convert (#60152)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60152

Test Plan:
python test/test_quantization.py TestQuantizeFx

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29188263

fbshipit-source-id: f7bbbef5d4d747eadf7a627a4e77a5ec9bb0bc94
2021-06-20 20:08:12 -07:00
Jerry Zhang
2293ab4e53 [quant][graphmode][fx] Refactor convert for linear to use get_static_module_mapping and get_dynamic_module_mapping (#60151)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60151

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29188264

fbshipit-source-id: d2b77ffcf4b7446fc6c43248e43218092d2a6aea
2021-06-20 19:41:16 -07:00
Jerry Zhang
47d727fe1b [quant][graphmode][fx] Produce conv reference static quant modules (#60138)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60138

Test Plan:
python test/test_quantization.py TestQuantizeFx

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29184791

fbshipit-source-id: 971a40012dbba0cf687c62a3a4af9358513c253b
2021-06-20 19:25:45 -07:00
Jerry Zhang
a029422cae [quant][graphmode][fx][refactor] Change the env map to add dtype as a key (#60054)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60054

Previously, env in convert was Dict[str, Tuple[Node, torch.dtype]], that is, at a given time each node could only have one dtype. This causes a problem for the following case:
```
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, 1)

    def forward(self, x):
        x = self.conv(x)
        x1 = x.expand_as(x)
        x2 = torch.add(x, x1)
        return x2
```

Observed graph:
```
def forward(self, x):
    x = self.activation_post_process_0(x)
    x = self.conv(x)
    x = self.activation_post_process_1(x)
    x1 = x.expand_as(x)
    x1 = self.activation_post_process_2(x1)
    x2 = torch.add(x, x1)
    x2 = self.activation_post_process_3(x2)
    return x2
```

Quantized graph (broken):
```
def forward(self, x):
    x = torch.quantize_per_tensor(x, ...)
    x = self.conv(x)  # quantized conv
    x = torch.dequantize(x)
    x1 = x.expand_as(x)
    x1 = torch.quantize_per_tensor(x1, ...)
    # Error: x is dequantized
    x2 = torch.ops.quantized.add(x, x1)
    return x2
```

Currently env is a map from a node name in the observed graph to the Node in the quantized graph. The problem is that following the quantized conv we have two operators: one expects float input (expand_as) and the other expects quantized input (quantized add). In the quantized graph, ideally, expand_as should consume the dequantized output and quantized add should consume the quantized output:
```
quantized_conv - dequantize - expand_as
  \ ------- quantized_add
```

But currently in env, each node must either be quantized or not quantized. Therefore we need to change env to include the dtype as well: env: Dict[str, Dict[dtype, Node]], e.g. {'x': {torch.float: dequantized_node, torch.quint8: quantized_node}}. When we load from env, we also need to provide the dtype of the Node that we want to load. We can have a separate pass to figure out this information for each node.
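
A small sketch of the new env shape and lookup (Node objects illustrative):

```
from typing import Dict
import torch
from torch.fx import Node

Env = Dict[str, Dict[torch.dtype, Node]]

def load_from_env(env: Env, name: str, dtype: torch.dtype) -> Node:
    # callers now state which dtype copy of the value they need
    return env[name][dtype]
```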

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D29149408

fbshipit-source-id: c9e4b7d65444ab6a6f573929bae1db5037629892
2021-06-18 13:31:43 -07:00
Jerry Zhang
3218d890dd [quant][graphmode][fx][fix] Fix support for custom module (#59041)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59041

Static quantization support for custom modules was removed in a previous refactor
(https://github.com/pytorch/pytorch/pull/57519) since it was not covered by the test case.
This PR re-enables the test case and fixes the support.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724866

fbshipit-source-id: 1974675b88b56a2173daf86965d6f3fb7ebd783b
2021-06-01 22:31:15 -07:00
Jerry Zhang
06af7618e7 [quant][graphmode][fx][refactor] Remove Quantizer class from convert (QuantizeHandler) (#59040)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59040

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724870

fbshipit-source-id: c0f748711b825cd46bdfcc05c054c77a41e8207a
2021-06-01 22:00:49 -07:00
Jerry Zhang
50e6ee3ca2 [quant][graphmode][fx][refactor] Remove Quantizer class from quantize_node (#59039)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59039

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724874

fbshipit-source-id: bd984716b2da1d6879c3e92fa827574783a41567
2021-06-01 21:40:08 -07:00
Jerry Zhang
20348fb32e [quant][graphmode][fx][refactor] Remove find_matches from Quantizer class (#59037)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59037

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724865

fbshipit-source-id: 6c6824d0af7dd47d4c111d6a08e373bc65f33e08
2021-06-01 16:07:07 -07:00
Jerry Zhang
e4b2684331 [quant][graphmode][fx][refactor] Remove patterns from Quantizer class (#59033)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59033

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724861

fbshipit-source-id: 97b38e851b6bf581510a24636b1d8d6f1d977f5a
2021-06-01 13:44:08 -07:00
Jerry Zhang
83892c1861 [quant][graphmode][fx][refactor] Remove node_name_to_scope from Quantizer (#59032)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59032

To remove Quantizer class and split prepare and convert functions to different files

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724868

fbshipit-source-id: 6df639f20076b480812b6dcf0fc7d2c87ca29d8b
2021-06-01 13:26:09 -07:00
Jerry Zhang
3826f7e8e0 [quant][graphmode][fx][refactor] Remove quantized_graph from Quantizer (#59031)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59031

Trying to remove Quantizer class and split prepare and convert code

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724871

fbshipit-source-id: dad0332ba271c4cfb6ec1e8f2036443149b5bea4
2021-06-01 13:01:54 -07:00
Jerry Zhang
1b4586ee20 [quant][fx][graphmode][refactor] Remove modules from Quantizer (#59030)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59030

Trying to remove Quantizer class and split prepare and convert code

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724875

fbshipit-source-id: d6610c1d5eb7755331252be9e348a230abf4175c
2021-06-01 12:42:28 -07:00
Jerry Zhang
10fc42eacc [quant][graphmode][fx] Merge quant_env and env (#59028)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59028

Previously we had an env and a quant_env in convert, which was a bit confusing;
in this PR we merge them into a single Dict[str, Tuple[Node, torch.dtype]].

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28724863

fbshipit-source-id: 722a682c70d300a6ccd2b988786a1ac2d45e880e
2021-06-01 09:21:38 -07:00
Adnios
09a8f22bf9 Add mish activation function (#58648)
Summary:
See issue: https://github.com/pytorch/pytorch/issues/58375

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58648

Reviewed By: gchanan

Differential Revision: D28625390

Pulled By: jbschlosser

fbshipit-source-id: 23ea2eb7d5b3dc89c6809ff6581b90ee742149f4
2021-05-25 10:36:21 -07:00
Jerry Zhang
f29e75c4dc [reland][quant][fx][graphmode][refactor] Remove qconfig_map from Quantizer (#58455) (#58756)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58756

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: supriyar

Differential Revision: D28607564

fbshipit-source-id: 979cf165941bb3a9044d03077a170b5ea64dc36a
2021-05-24 14:57:45 -07:00
Horace He
21a9334034 Revert D28497967: [quant][fx][graphmode][refactor] Remove qconfig_map from Quantizer
Test Plan: revert-hammer

Differential Revision:
D28497967 (1cf8f7a439)

Original commit changeset: 421ce3d86fad

fbshipit-source-id: b1b290be47d847ab0e0128e3ae89f528578550ee
2021-05-20 20:56:12 -07:00
Jerry Zhang
1cf8f7a439 [quant][fx][graphmode][refactor] Remove qconfig_map from Quantizer (#58455)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58455

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28497967

fbshipit-source-id: 421ce3d86fadd3d92f4120b850b0167270509189
2021-05-20 20:34:47 -07:00
Jerry Zhang
4668d09ca6 [quant][graphmode][fx] Quantize the output of statically quantized fp16 op in QuantizeHandler (#58445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58445

Previously the output of a statically quantized fp16 operator was not quantized in QuantizeHandler, which is inconsistent with
the behavior of static int8 operators, and it also does not work well with reference functions. This PR
changes the fp16 static QuantizeHandler to quantize the output (a call to to(torch.float16)) in the QuantizeHandler, which also
makes future support for reference functions easier.
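
A minimal sketch of what the fp16 static "quantize" step amounts to (function names illustrative):

```
import torch

def fp16_static_wrapper(float_op, x):
    y = float_op(x.to(torch.float32))  # run the op in fp32
    return y.to(torch.float16)         # statically "quantize" the output to fp16
```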

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D28495830

fbshipit-source-id: 2140eab8ab2dd08f6570d9e305485e3029e1f47d
2021-05-20 16:03:42 -07:00
Emily Shen
07da584dbd Fix KeyError returned by _maybe_get_last_node_only_observer (#58443)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58443

Test Plan: arc lint

Reviewed By: vkuzo

Differential Revision: D28494119

fbshipit-source-id: 05abf4e12051afc237096812fb0ee08a8b9447f9
2021-05-18 12:41:19 -07:00
Vasiliy Kuznetsov
4f50fdc2a3 fx quant: refactor observer insertion
Summary:
tl;dr: rewrites FX graph mode quantization observer insertion to be easier to understand and extend.
The key conceptual difference from before is:
* before: for each node, observers are always inserted at the output of the current node, even if they are needed for the next node. This is hard to reason about.
* after: for each node, observers are inserted at the inputs (if needed, as calculated from the dtype of the argument and the dtype of the current node) and at the output (if needed for the type of pattern and qconfig). No knowledge of future nodes is needed to insert observers for the current node.

This allows us to significantly simplify various things:
* all new observers needed for a node are inserted together.  This makes it easier to understand and debug things.  We add an invariant that node X will never change any observers inserted by any preceding or subsequent node, so to debug an issue the user can just understand what is happening for node X, without having to understand what happens before or after it.
* all the state tracking of activation_post_process_map and activation_post_process_indices is removed; instead, observers are looked up by graph traversals.
* since there is no longer a need for overlapping graph passes that mutate each other's intermediate state, it is easier to understand the rules for inserting observers and to create new rules in the future; see the sketch below.
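
A conceptual sketch of the new per-node rule; the two predicates and the observer-naming helper are hypothetical stand-ins for the real dtype, pattern, and qconfig checks (new_obs_name is assumed to register a fresh observer submodule on gm and return its qualified name):

```
import torch.fx as fx

def insert_observers(gm: fx.GraphModule, needs_input_obs, needs_output_obs,
                     new_obs_name):
    for node in list(gm.graph.nodes):
        # input observers: decided only from this node and its args
        for arg in node.all_input_nodes:
            if needs_input_obs(arg, node):
                with gm.graph.inserting_before(node):
                    obs = gm.graph.call_module(new_obs_name(gm), (arg,))
                node.replace_input_with(arg, obs)
        # output observer: decided from the pattern type and qconfig
        if needs_output_obs(node):
            with gm.graph.inserting_after(node):
                obs = gm.graph.call_module(new_obs_name(gm), (node,))
            node.replace_all_uses_with(obs)
            obs.replace_input_with(obs, node)  # the observer must keep consuming node
    gm.recompile()
```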

Test Plan:
```
# all OSS tests pass
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Differential Revision: D28241864

Reviewed By: jerryzh168

Pulled By: vkuzo

fbshipit-source-id: 950d58972d26362808564cc0a2dfb30413a3734d
2021-05-15 09:51:33 -07:00
Vasiliy Kuznetsov
7c3a30fd79 fx quant: remove matching hack for binary qhandler (#57470)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57470

Removes the earlier hack where patterns originally matched
to BinaryOpQuantizeHandler were switched to CopyHandler. After this PR,
each pattern can be matched to only one type of QuantizeHandler, or
to nothing.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D28152909

fbshipit-source-id: afc285e770bd7eb0518c90e3ee4874c421e78bbc
2021-05-04 16:38:56 -07:00
Vasiliy Kuznetsov
643f41be61 fx quant: remove FixedQParamsOpQuantizeHandler from quantize.py (#57393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57393

Moves the information on whether the output is quantized based
on the inputs to live on the qhandler object.  This allows us to remove
FixedQParamsOpQuantizeHandler from quantize.py, further reducing
the coupling between handler objects and the quantization pass.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: astaff

Differential Revision: D28132414

fbshipit-source-id: 5c28524b47c00f618d3a38657376abae9e6ffe7c
2021-05-02 20:13:10 -07:00
Vasiliy Kuznetsov
2bd158386a fx quant: move input_output_observed to qhandler (#57388)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57388

It's a bit confusing to have this be a decorator. It's simpler to
just expose it as a function on qhandler.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D28129411

fbshipit-source-id: f7316f285e8546c67e8d8cf753462b2c2abb2636
2021-05-02 20:13:08 -07:00
Vasiliy Kuznetsov
1b20eeb138 fx quant: move output obs logic to QuantizeHandler (#57377)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57377

Moves the logic which determines
1. whether a pattern instance's output should be observed
2. whether a pattern instance's output should be marked as observed based on its inputs
3. whether to override the activation specified in the qconfig

from `quantize.py` to `quantization_patterns.py`.  This makes
the code easier to read and reduces the coupling between `Quantizer`
and `QuantizeHandler` instances.

Note: there are some further cleanups which would be good after this one
- leaving those for future PRs.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D28126896

fbshipit-source-id: 94c80a9c7307452783348d65b402acc84983e3f6
2021-05-02 20:13:07 -07:00
Jerry Zhang
096089abcb [quant][graphmode][fx] Produce torch.cat instead of torch.ops.quantized.cat (#54924)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54924

Previously we were producing torch.ops.quantized.cat, which takes inputs, dequantizes them,
and requantizes with new qparams. This PR changes that to produce torch.cat directly; torch.cat
assumes all inputs share the same qparams, and it produces a quantized Tensor with
the same qparams as the inputs (the previous PR makes sure all inputs and the output of cat share
the same observer/fakequant instance).

Using torch.cat is expected to be more efficient since it does not introduce extra quant/dequant.
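
An eager-mode illustration of the invariant this relies on: torch.cat over per-tensor quantized inputs that share qparams yields an output carrying those same qparams:

```
import torch

qx = torch.quantize_per_tensor(torch.randn(2, 3), 0.1, 0, torch.quint8)
qy = torch.quantize_per_tensor(torch.randn(2, 3), 0.1, 0, torch.quint8)
qz = torch.cat([qx, qy], dim=0)
assert qz.q_scale() == qx.q_scale() and qz.q_zero_point() == qx.q_zero_point()
```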

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_cat

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27416528

fbshipit-source-id: 896c280abec2903c29d597c655729666583ff0dd
2021-04-21 10:58:09 -07:00
Sam Estep
75024e228c Add lint for unqualified type: ignore (#56290)
Summary:
The other half of https://github.com/pytorch/pytorch/issues/56272.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56290

Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI runs (before this PR was finished) failed:

- https://github.com/pytorch/pytorch/runs/2384511062
- https://github.com/pytorch/pytorch/actions/runs/765036024

Reviewed By: seemethere

Differential Revision: D27867219

Pulled By: samestep

fbshipit-source-id: e648f07b6822867e70833e23ddafe7fb7eaca235
2021-04-21 08:07:23 -07:00
Charles David Hernandez
6e1fc5cef8 [quant] added dq->op->q quantization patterns for GELU and softmax ops (#56004)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56004

Added reference pattern support for GELU, softmax, and bmm for int dtypes. For GELU and softmax, this consisted of adding reference patterns to the default node handler for int dtypes. Note that the GELU and softmax patterns are not registered by default since they do not have a proper quantized kernel, which means they would either add unnecessary dequant and quant ops to the network or simply error out. This can be circumvented with custom qconfig usage, as in test_gelu_reference.

bmm was added within binary ops, along with some significant changes to how that code is structured. Theoretically the reference pattern used for bmm could be applied to other dtypes. This was not enabled because of issues relating to line 1323 in quantize.py. In essence, the prepare step does not know whether an op will use a reference pattern or not, so for ops that are supported with one dtype in reference mode and another dtype normally, this has the potential to cause issues. This is difficult to get around without the is_reference flag being available in the prepare step, or the discussed changes around separating
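
For reference, a minimal sketch of what a dq -> op -> q pattern amounts to for GELU (illustrative, not the exact emitted graph):

```
import torch
import torch.nn.functional as F

def reference_gelu(qx, out_scale, out_zero_point):
    x = qx.dequantize()                    # dq
    y = F.gelu(x)                          # float op
    return torch.quantize_per_tensor(      # q
        y, out_scale, out_zero_point, torch.quint8)
```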

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_gelu_reference
python test/test_quantization.py TestQuantizeFxOps.test_gelu_normal
python test/test_quantization.py TestQuantizeFxOps.test_softmax_reference
python test/test_quantization.py TestQuantizeFxOps.test_softmax_normal
python test/test_quantization.py TestQuantizeFxOps.test_silu_reference
python test/test_quantization.py TestQuantizeFxOps.test_bmm_int_reference
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxModels

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D27818340

fbshipit-source-id: de65be0797035463cd2d1b0e4677d1a87f69143c
2021-04-20 13:26:15 -07:00
Vasiliy Kuznetsov
8fc1ca0d22 fx quant: fix prepacking for F.conv1d (#55311)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55311

Before this PR, `F.conv1d` was matched by FX graph mode quant patterns,
but the prepacking was happening inline.  There was also a bug with an
argument type mismatch.

This PR fixes both issues and adds a test. Thanks jerryzh168 for the
code tip.
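
A minimal end-to-end sketch of the now-working path, using the public FX graph mode API (shapes illustrative):

```
import torch
import torch.nn.functional as F
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(4, 2, 3))
        self.bias = torch.nn.Parameter(torch.zeros(4))

    def forward(self, x):
        return F.conv1d(x, self.weight, self.bias)

m = M().eval()
mp = prepare_fx(m, {"": get_default_qconfig("fbgemm")})
mp(torch.randn(1, 2, 8))   # calibrate
mq = convert_fx(mp)        # F.conv1d is lowered with a prepacked weight
out = mq(torch.randn(1, 2, 8))
```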

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_functional_not_reference
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D27575422

fbshipit-source-id: 42301e23cb101a9e64e46800813bc771317e233e
2021-04-14 09:04:28 -07:00
Jerry Zhang
c96b5b2a20 [quant][graphmode][fx][fix] Fix fp16 reference patterns for linear (#55727)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55727

The number of dequantize ops for the fp16 reference pattern for linear was incorrect before; this
PR fixes the problem.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27713390

fbshipit-source-id: 72b8d4cda0bdcea74abe27a76f918d1b47819b01
2021-04-13 23:19:45 -07:00
Jerry Zhang
4d449f915f [quant][graphmode][fx] Separate handling Copy operator to a helper function (#54644) (#55429)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55429

Previously we special-cased the copy operator in the normal observer-insertion code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.

Test Plan:
Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27609972

fbshipit-source-id: 378f6aa70f18c0b477b62b6efe236648748aae7e
2021-04-08 22:12:24 -07:00
Bradley Davis
8eaa4a97b7 Back out "[quant][graphmode][fx] Separate handling Copy operator to a helper function" (#55388)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55388

Temporarily revert D27314678 (c57541ce06); it appears to cause a perf regression that makes quantization of some models take too long to complete tests.

Reviewed By: houseroad

Differential Revision: D27583809

fbshipit-source-id: e9c088ccbfd3bfb3a1d4c7eafee3eca29ee7717b
2021-04-06 14:20:36 -07:00
Jerry Zhang
c57541ce06 [quant][graphmode][fx] Separate handling Copy operator to a helper function (#54644)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54644

Previously we special-cased the copy operator in the normal observer-insertion code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27314678

fbshipit-source-id: d36870ceb3717bc01eaeaa6f3f1532ad562cbaf1
2021-03-31 17:50:32 -07:00
Jerry Zhang
c0d6dbdce4 [quant][fx][graphmode][refactor] Change activation_post_process_map to track the observer name instead (#54643)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54643

A refactor needed for future changes.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27314677

fbshipit-source-id: 972fbfb506f86da13f8817b3eaa5e6d0ad16ffe1
2021-03-31 17:50:30 -07:00
Jerry Zhang
55544cb13a [quant][graphmode][fx] Add support for one value being quantized with different qconfigs (#53586)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53586

Previously one value could only be quantized to one dtype. This PR adds support for quantizing one value
in the fx graph with multiple dtypes, e.g. first quantizing to int8 and then to float16.

We might do some followup PRs to clean up the hacks and refactor the code.
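
As an illustration, a hypothetical qconfig_dict mixing int8 and fp16 qconfigs so that one value feeding two consumers is quantized with two dtypes (module names are illustrative; float16_static_qconfig is assumed to be available in torch.quantization):

```
import torch
from torch.quantization import get_default_qconfig, float16_static_qconfig

qconfig_dict = {
    "module_name": [
        ("linear1", get_default_qconfig("fbgemm")),  # int8
        ("linear2", float16_static_qconfig),         # fp16
    ]
}
```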

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_multiple_qconfigs_single_value

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26912676

fbshipit-source-id: ae3653fd67f05870a3a9e808f491871826c555d5
2021-03-31 17:48:50 -07:00
Vasiliy Kuznetsov
4884a6ab51 fx quant: clean up names of quantize handlers (#53614)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53614

Ensures that every subclass of `QuantizeHandler` has a clear name.  This
prevents ambiguous names like `Cat`, which look like a module but are
really a quantize handler.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26914784

fbshipit-source-id: 6dca7e27975c09f422f8e36f1d2b709bf3eaaadf
2021-03-12 07:43:53 -08:00
Vasiliy Kuznetsov
279b5372ab [not for land] fix fx quant for quant_layer -> stack -> sum (#53196)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53196

Before this PR, code patterns like this did not work:

```
x = some_quant_layer(x)
x = torch.stack([x, ...])
x = torch.sum(x, ...)
```

The reason this did not work is that `torch.sum` is treated as
"quantized" because of the newly added fp16 support, even though it is
not actually "quantized" for models where fp16 is not used.  We may
need to adjust the concept of "quantized vs non-quantized" into a
"dtype" for the longer-term fix.

The current PR is a hacky fix to unblock.  We need to clean things
up before this is landable.

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_quant_sum
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26783960

fbshipit-source-id: 3be7c3c1eaa2b8fcb99a105e1b0004c9ffd3a1c1
2021-03-12 07:43:50 -08:00
Vasiliy Kuznetsov
ccab6680d5 [not for land yet] hacky fix for x.ndim followed by sub (#53120)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53120

Currently there is a pattern which is not handled correctly by
FX graph mode quantization:

```
def forward(self, x):
    ndim = x.ndim
    # or add, mul, div, etc
    x = torch.sub(x, ndim)
    return x
```

The reason this does not work is as follows:
1. x.ndim becomes a getattr node
2. the real world type of x.ndim is an integer, but this is not known from the graph (yet)
3. binary ops such as `torch.sub` require quantization of inputs
4. the framework inserts an observer to observe the output of `ndim`
5. the observer fails because `ndim` is not a Tensor

For now, we apply a band-aid to unblock some teams; none of this is for
land.  We will have to think of a better fix which is landable (TBD).

Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_getattr_with_nontensor_result
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26756180

fbshipit-source-id: c0e498766b22c23df74fbb5aaeaa237c4c944263
2021-03-12 07:42:12 -08:00
Jerry Zhang
7484c56fa3 [quant][graphmode][fx] Fix a condition check for CopyNode (#53585)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53585

Previously fp16_static CopyNode would be marked as unquantized because of
an incorrect condition check of whether a Node is statically quantized or not.
This PR fixes that.

Test Plan: Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26912677

fbshipit-source-id: 4ddb538714c5ba2db28430de5e1cf2931baf1993
2021-03-11 09:32:20 -08:00
hyperfraise
f9185973d1 [quantization] Add some support for 3d operations (#50003)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50002

The last commit adds tests for 3d conv with the `SubModelFusion` and `SubModelWithoutFusion` classes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50003

Reviewed By: mrshenli

Differential Revision: D26325953

Pulled By: jerryzh168

fbshipit-source-id: 7406dd2721c0c4df477044d1b54a6c5e128a9034
2021-03-10 16:40:35 -08:00
Jerry Zhang
46bd76fdec [quant][graphmode][fx][fp16] Add fp16 support for silu (#52865)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52865

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_silu

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26672270

fbshipit-source-id: a6a6ab58c347a56f0ded612b2e0a3e2230a91d9e
2021-03-02 02:11:29 -08:00
Jerry Zhang
d40b501cfc [quant][graphmode][fx][fp16] Add fp16 support for sigmoid (#52863)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52863

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_fixed_qparams_ops_fp16

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26672273

fbshipit-source-id: 30d5befe2a24081ac12ac773df4d2bd26d2d0192
2021-03-02 02:11:21 -08:00
Jerry Zhang
3fb324f05b [quant][graphmode][fx][fp16] Add fp16 support for layer_norm (#52862)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52862

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_layer_norm

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26672272

fbshipit-source-id: 4cfdce986efa98db7dc58bf2a62b650e45a69ed0
2021-03-02 02:11:17 -08:00
Jerry Zhang
fc6fdade9f [quant][graphmode][fx][fp16] Add fp16 support for torch.sum (#52811)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52811

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_sum

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26655619

fbshipit-source-id: 642e0de47d0da7bd1abe1e981819de33e84c32f3
2021-03-02 02:11:13 -08:00
Jerry Zhang
97c51d5d5d [quant][graphmode][fx][fp16] Add fp16 support for div (#52810)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52810

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_div

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D26655620

fbshipit-source-id: e46cb895ba456e99e4433bd6037229b8248a1b28
2021-03-02 02:11:08 -08:00