Summary:
Generally wildcard imports are bad for the reasons described here: https://www.flake8rules.com/rules/F403.html
This PR replaces wildcard imports with an explicit list of imported items where possible, and adds a `# noqa: F403` comment in the other cases (mostly re-exports in `__init__.py` files).
This is a prerequisite for https://github.com/pytorch/pytorch/issues/55816, because currently [`tools/codegen/dest/register_dispatch_key.py` simply fails if you sort its imports](https://github.com/pytorch/pytorch/actions/runs/742505908).
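For illustration, a minimal sketch of the two patterns applied by this PR, using `os.path` as a stand-in module (not an actual file touched by the change):
```python
# Before: wildcard import, flagged by flake8 as F403
# from os.path import *

# Preferred replacement: an explicit list of the names actually used
from os.path import dirname, join

# Re-export case (e.g. a package __init__.py that intentionally re-exports):
# from .submodule import *  # noqa: F403
```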
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55838
Test Plan: CI. You can also run `flake8` locally.
Reviewed By: jbschlosser
Differential Revision: D27724232
Pulled By: samestep
fbshipit-source-id: 269fb09cb4168f8a51fd65bfaacc6cda7fb87c34
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55710
In the current code, there is an edge case which leads to an error
after the prepare step:
1. have a pattern like this:
```
user_func_unmatched_to_qhandler -> node_matched_to_copy_node_qhandler
```
2. the user function returns a type which is not observable (i.e. not a
Tensor)
3. if this is run through `prepare_fx`, calibrating it with data leads
to a runtime error, because observers cannot observe non-tensor types.
This PR fixes the issue. If a node matched to `CopyNodeQuantizeHandler`
is after an unmatched node, we delete the observer.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_no_obs_between_unmatched_node_and_copy_node
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27686811
fbshipit-source-id: 320be41b1f383c6352ff89fb39a9f480822a3bb2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55429
Previously we special-cased the copy operator in the normal insert-observer code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27609972
fbshipit-source-id: 378f6aa70f18c0b477b62b6efe236648748aae7e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55388
Temporarily revert D27314678 (c57541ce06); it appears to cause a perf regression that makes quantization of some models take too long to complete tests.
Reviewed By: houseroad
Differential Revision: D27583809
fbshipit-source-id: e9c088ccbfd3bfb3a1d4c7eafee3eca29ee7717b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54644
Previously we special-cased the copy operator in the normal insert-observer code; this PR splits the
special-case logic into a separate function and keeps the rest of the code clean.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27314678
fbshipit-source-id: d36870ceb3717bc01eaeaa6f3f1532ad562cbaf1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53586
Previously a value could only be quantized to one dtype; this PR adds support for quantizing one value
in the FX graph with multiple dtypes, e.g. quantizing first to int8 and then to float16.
We might do some follow-up PRs to clean up the hacks and refactor the code.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_multiple_qconfigs_single_value
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26912676
fbshipit-source-id: ae3653fd67f05870a3a9e808f491871826c555d5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54860
Currently we insert a quantize_per_tensor op when we encounter a quantizable input,
so if that input has multiple uses and not all of them are quantizable, we need to add a dequantize op
before the non-quantizable ops.
In this pass, for a quantize_per_tensor -> dequantize sequence we fold the pair away,
since it is a no-op (a rough sketch of the idea follows below, after the internal measurements).
[internal only][pyper]
Before this change we had redundant dequantize nodes in the graph
Example 1x inline_cvr graph https://www.internalfb.com/intern/everpaste/?handle=GODBxAlUMzGHD6 (98143776f5)MSACpHKKu9qjorbsIXAAAz
FC layers -> 37
quantize_per_tensor -> 30
dequantize -> 49
After this change
https://www.internalfb.com/intern/everpaste/?handle=GAl0uQnOlDNmpLoSAB-GZqRxu9wMbsIXAAAz
FC layers -> 37
quantize_per_tensor -> 30
dequantize -> 39
We remove 10 extra dequantize nodes from the graph.
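For illustration, a rough sketch of the folding idea as an FX graph pass; this is not the pass added in this PR, and it ignores dtype and multi-use corner cases:
```python
import torch
from torch.fx import GraphModule

def fold_quant_dequant(gm: GraphModule) -> GraphModule:
    # For each dequantize fed directly by a quantize_per_tensor, rewire the
    # dequantize's users to the original float input; the q -> dq pair is a no-op.
    for node in list(gm.graph.nodes):
        if node.op == "call_method" and node.target == "dequantize":
            quant = node.args[0]
            if quant.op == "call_function" and quant.target == torch.quantize_per_tensor:
                node.replace_all_uses_with(quant.args[0])
                gm.graph.erase_node(node)
    gm.graph.eliminate_dead_code()  # drops quantize nodes left without users
    gm.recompile()
    return gm
```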
Test Plan:
python test/test_quantization.py test_fold_quant_dequant
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D27390506
fbshipit-source-id: 56e6fb8496171246eccf4bd45eb8bebd87fcb740
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54654
Fixes a bug where disabling quantization on potential fusion patterns
would lead to errors in the `convert` function. For example:
1. have a model with add-relu
2. disable quantization for the part of the model containing add-relu
3. run prepare and convert; the convert step would fail because
intermediate nodes were missing from `env`.
The fix is to add handling for this edge case. If quantization is
disabled, we manually copy the nodes for multi-node fusion patterns.
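For illustration, a minimal sketch of the setup this fixes, assuming the qconfig_dict format of FX graph mode quantization at this point in time (module names are placeholders):
```python
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class Sub(torch.nn.Module):
    def forward(self, x, y):
        return torch.nn.functional.relu(x + y)  # add-relu fusion pattern

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.sub = Sub()

    def forward(self, x, y):
        return self.sub(x, y)

qconfig_dict = {
    "": get_default_qconfig("fbgemm"),
    "module_name": [("sub", None)],  # disable quantization for the add-relu region
}
m = prepare_fx(M().eval(), qconfig_dict)
m(torch.randn(1, 3), torch.randn(1, 3))
m = convert_fx(m)  # previously this step could fail with nodes missing from `env`
```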
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_fusion_pattern_unquantized
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D27318454
fbshipit-source-id: 27c1fd1cb7c9711a8e8d338200971c428dae8f98
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53614
Ensures that every subclass of `QuantizeHandler` has a clear name. This
prevents ambiguous names like `Cat`, which looks like a module but is
really a quantize handler.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26914784
fbshipit-source-id: 6dca7e27975c09f422f8e36f1d2b709bf3eaaadf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53196
Before this PR, code patterns like this did not work:
```
x = some_quant_layer(x)
x = torch.stack([x, ...])
x = torch.sum(x, ...)
```
The reason this did not work is that `torch.sum` is treated as
"quantized" because of the newly added fp16 support, even though it is
not actually "quantized" for models where fp16 is not used. We may
need to adjust the concept of "quantized vs non-quantized" into a
"dtype" for the longer-term fix.
The current PR is a hacky fix to unblock. We need to clean things
up before this is landable.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_quant_sum
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26783960
fbshipit-source-id: 3be7c3c1eaa2b8fcb99a105e1b0004c9ffd3a1c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53187
Before this diff, if we had code like
```
x = any_quant_layer(...)
x_size0 = x.size(0)
torch._assert(x_size0 == 1, "expected size 1")
```
```
The convert code would try to insert a dequantize after `x_size0`,
because it was a descendant of a quantized node and it was needed
for a non-quantized operation. Since the actual type of the `size`
function output is an integer, this does not make sense.
For now, this is fixed as a one-off to unblock a customer. In the
future, we may need to think more deeply about all the functions which
can return non-quantized types from quantized tensors and make sure
they are all covered.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_assert_on_size_after_quant_layer
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D26780690
fbshipit-source-id: 44cc25c9179d460efb3f110d40b73d854d676af5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53120
Currently there is a pattern which is not handled correctly by
FX graph mode quantization:
```
def forward(self, x):
    ndim = x.ndim
    # or add, mul, div, etc.
    x = torch.sub(x, ndim)
    return x
```
The reason this does not work is as follows:
1. x.ndim becomes a getattr node
2. the real world type of x.ndim is an integer, but this is not known from the graph (yet)
3. binary ops such as `torch.sub` require quantization of inputs
4. the framework inserts an observer to observe the output of `ndim`
5. the observer fails because `ndim` is not a Tensor
For now, we add a bandaid to unblock some teams; none of this is meant to
land as-is. We will have to think of a better fix that is landable (TBD).
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_getattr_with_nontensor_result
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26756180
fbshipit-source-id: c0e498766b22c23df74fbb5aaeaa237c4c944263
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53585
Previously an fp16_static CopyNode would be marked as unquantized because of
an incorrect check of whether a Node is statically quantized.
This PR fixes that.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26912677
fbshipit-source-id: 4ddb538714c5ba2db28430de5e1cf2931baf1993
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53330
Fixes a condition check for fixed-qparam ops; previously we were including CopyNodes as well.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_fixed_qparams_ops_fp16
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26836867
fbshipit-source-id: 8c486155244f852e675a938c3f4237f26505671c
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50002
The last commit adds tests for 3d conv with the `SubModelFusion` and `SubModelWithoutFusion` classes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50003
Reviewed By: mrshenli
Differential Revision: D26325953
Pulled By: jerryzh168
fbshipit-source-id: 7406dd2721c0c4df477044d1b54a6c5e128a9034
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53166
Context: for FX modules that consist of scriptmodules, calling
delattr(module, 'qconfig') throws an attribute error. We will follow up
with a separate issue/repro to fix this problem.
This PR adds a temporary flag to the convert_fx API to preserve the qconfig attributes on the converted model.
We will remove this flag once we reach a conclusion on calling delattr on scriptmodules.
Test Plan:
python test/test_quantization.py test_preserve_qconfig
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26771518
fbshipit-source-id: 9fd72816576856ffb4aa11f8fde08303d1df10a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52651
Merging them for easier extensions to fp16 and more binary ops
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26600118
fbshipit-source-id: a1816e593cf3065afe87d2e6e44cdace13bf6aeb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52534
Currently linear_dynamic_fp16 has a signature that's tied to fbgemm/qnnpack
We'll need to produce a pattern equivalent to linear_dynamic_fp16 to support extensions
to other backends
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_dynamic_fp16
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26557726
fbshipit-source-id: 270c9f781f73c79416a092b7831294cabca84b0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52179
Rename debug to reference. We'll use this to produce a reference quantized model
that can be used as a common interface between PyTorch quantized models and backends.
Test Plan:
python test/test_quantization.py TestQuantizeFx
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26424656
fbshipit-source-id: a0299b023f6ba7d98f5750724c517b0ecb987b35
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51259
Store the FQN of the module that is using the packed weights (the quantized op)
In the case of fusion we update the scope mapping to store the module path of the fused node.
Test Plan:
python test/test_quantization.py test_packed_weight_fused_op
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26117964
fbshipit-source-id: 9d929997baafb1c91063dd9786a451b0040ae461
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51171
Following up on the previous PR, this PR registers the scale and zero_point for quantize_per_tensor
as buffers in the module.
Currently the dtype is still stored as an attribute (not registered as a buffer) since we can only register tensor types.
Test Plan:
python test/test_quantization.py test_qparams_buffers
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26092964
fbshipit-source-id: a54d914db7863402f2b5a3ba2c8ce8b27c18b47b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51166
Currently scale and zero_point values are stored as constants in the graph.
This prevents these values from being updated in the graph and also does not allow saving
these values to state_dict.
After this PR we store scale/zero_point values for quantized ops as buffers in the root module
and create get_attr nodes for them in the graph.
We also use the FQN of the module where the quantized ops are present to name these attributes so
that they can be uniquely identified and mapped to quantized ops.
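Roughly, the idea is the following (illustrative buffer names, not the actual helper code from this PR):
```python
import torch

# Before: scale/zero_point appear as constants in the graph, e.g.
#   torch.quantize_per_tensor(x, 0.0213, 128, torch.quint8)
#
# After: register them as buffers on the root module, named using the FQN of
# the module that owns the quantized op, and read them via get_attr nodes:
root = torch.nn.Module()
root.register_buffer("layer1_input_scale_0", torch.tensor(0.0213))
root.register_buffer("layer1_input_zero_point_0", torch.tensor(128))

# In the FX graph (pseudocode for the rewritten quantize call):
#   scale = graph.create_node("get_attr", "layer1_input_scale_0")
#   zp    = graph.create_node("get_attr", "layer1_input_zero_point_0")
#   q     = graph.create_node("call_function", torch.quantize_per_tensor,
#                             (x, scale, zp, torch.quint8))
```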
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qparams_buffers
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26092965
fbshipit-source-id: b549b2d3dccb45c5d38415ce95a09c26f5bd590b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51086
Previously we only supported getting the scope for call_module nodes and a custom qconfig dict for call_module.
This PR extends the Scope class to record the scope for all node types.
For call_function qconfig, if module_name is specified it takes precedence over the function qconfig.
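A minimal sketch of the precedence described above, assuming the standard qconfig_dict keys for function-level and module-level qconfigs (the module name is a placeholder):
```python
import torch.nn.functional as F
from torch.quantization import get_default_qconfig

qconfig_dict = {
    "": get_default_qconfig("fbgemm"),
    # function-level qconfig for F.linear call_function nodes
    "object_type": [(F.linear, get_default_qconfig("fbgemm"))],
    # module-level qconfig; for call_function nodes whose scope is `sub`,
    # this entry takes precedence over the object_type entry above
    "module_name": [("sub", None)],
}
```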
Test Plan:
python test/test_quantization.py test_qconfig_for_call_func
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26077602
fbshipit-source-id: 99cdcdedde2280e51812db300e17d4e6d8f477d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50173
Previously we did not set the qconfig for call_method nodes correctly, since doing so requires us to know
the scope (the module path of the module whose forward graph contains the node) of the node. This
PR modifies the QuantizationTracer to record the scope information and build a map from call_method
Node to module path, which is used when we construct qconfig_map.
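For illustration, a sketch of the kind of model this affects; the call_method node produced by `x.chunk` inside the submodule now picks up the qconfig configured for that submodule via the recorded scope (module name and qconfig choices are placeholders):
```python
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx

class Sub(torch.nn.Module):
    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)  # traces to a call_method node
        return torch.cat([x1, x2], dim=1)

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.sub = Sub()

    def forward(self, x):
        return self.sub(x)

# the call_method nodes inside `sub` now resolve to the "sub" entry below
qconfig_dict = {"": get_default_qconfig("fbgemm"), "module_name": [("sub", None)]}
prepared = prepare_fx(M().eval(), qconfig_dict)
```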
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qconfig_for_call_method
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25818132
fbshipit-source-id: ee9c5830f324d24d7cf67e5cd2bf1f6e0e46add8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50058
This PR adds support for {input/output}_quantized_idxs for standalone modules.
If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module expects float
input and produces float output, and quantizes the input and dequantizes the output internally.
If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module expects quantized
input and produces quantized output; the input is quantized in the parent module, and the output is dequantized
in the parent module as well. This is similar to existing quantized modules like nn.quantized.Conv2d.
For more details, please see the test case.
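A hedged sketch of how the idxs might be passed for a standalone module, assuming they live in the standalone module's own prepare_custom_config_dict entry (the names are placeholders):
```python
from torch.quantization import get_default_qconfig

qconfig = get_default_qconfig("fbgemm")

# standalone module expects quantized input 0 and produces quantized output 0
standalone_prepare_config = {
    "input_quantized_idxs": [0],
    "output_quantized_idxs": [0],
}

prepare_custom_config_dict = {
    "standalone_module_name": [
        ("standalone", {"": qconfig}, standalone_prepare_config),
    ],
}
```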
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25768910
fbshipit-source-id: 96c21a3456cf192c8f1400afa4e86273ee69197b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49754
This PR adds support for {input/output}_quantized_idxs for standalone modules.
If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module expects float
input and produces float output, and quantizes the input and dequantizes the output internally.
If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module expects quantized
input and produces quantized output; the input is quantized in the parent module, and the output is dequantized
in the parent module as well. This is similar to existing quantized modules like nn.quantized.Conv2d.
For more details, please see the test case.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D25684692
fbshipit-source-id: 900360e01c0e35b26fe85f4a887dc1fd6f7bfb66
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49719
We find there are multiple use cases for standalone modules: one use case requires the standalone module
to produce a module that takes a float Tensor as input and outputs a float Tensor, while the other needs to
produce a module that takes a quantized Tensor as input and outputs a quantized Tensor.
This is similar to `quantized_input_idxs` and `quantized_output_idxs`, so we want to nest
prepare_custom_config_dict in the standalone module configuration. For maximum flexibility we also
include qconfig_dict for the standalone module, in case the user needs a special qconfig_dict for
the standalone module in the future.
Changed from
```python
prepare_custom_config_dict = {
    "standalone_module_name": ["standalone_module"],
    "standalone_module_class": [StandaloneModule]
}
```
to
```python
prepare_custom_config_dict = {
    "standalone_module_name": [("standalone_module", qconfig_dict1, prepare_custom_config_dict1)],
    "standalone_module_class": [(StandaloneModule, qconfig_dict2, prepare_custom_config_dict2)]
}
```
The entries in the config are:
1. name/module_class
2. optional qconfig_dict, when it is None, we'll use {"": qconfig} where qconfig is the one from parent qconfig_dict
3. optional prepare_custom_config_dict, when it is None, we'll use default value of prepare_custom_config_dict for prepare API (None)
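For reference, a hedged end-to-end usage sketch of the new format (the model, names, and qconfig choices below are placeholders):
```python
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx

class Standalone(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x)

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.standalone_module = Standalone()

    def forward(self, x):
        return self.standalone_module(x)

qconfig_dict1 = {"": get_default_qconfig("fbgemm")}
prepare_custom_config_dict = {
    # (name, optional qconfig_dict, optional prepare_custom_config_dict)
    "standalone_module_name": [("standalone_module", qconfig_dict1, None)],
}
prepared = prepare_fx(
    M().eval(),
    {"": get_default_qconfig("fbgemm")},
    prepare_custom_config_dict=prepare_custom_config_dict,
)
```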
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D25675704
fbshipit-source-id: 0889f519a3e55a7a677f0e2db4db9a18d87a93d4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49688
Adds more type annotations to the FX quantize convert code, fixing things as they
are uncovered by mypy.
Test Plan:
```
mypy torch/quantization
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25667231
fbshipit-source-id: 262713c6ccb050a05e3119c0457d0335dde82d25
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49628
Ensures that linear bias is not observed in an `F.linear` call. This should
be a small speedup in PTQ, and will change numerics (in a good way) for
QAT if someone is using `F.linear`.
Note: the implementation is slightly more verbose compared to conv
because bias is a keyword argument in Linear.
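A small sketch of the case being handled; after this PR only the input and weight should get observers, not the bias (the model is a placeholder):
```python
import torch
import torch.nn.functional as F
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.randn(4, 4))
        self.b = torch.nn.Parameter(torch.randn(4))

    def forward(self, x):
        # bias is a keyword argument, hence the more verbose handling vs. conv
        return F.linear(x, self.w, bias=self.b)

prepared = prepare_fx(M().eval(), {"": get_default_qconfig("fbgemm")})
# self.b should not be wrapped in an observer after prepare
```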
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_linear_functional_bias_not_observed
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25653532
fbshipit-source-id: c93501bf6b55cbe4a11cfdad6f79313483133a39
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49623
(not ready for review)
Ensures that conv bias is not observed in an `F.conv{n}d` call.
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25652856
fbshipit-source-id: 884f87be1948d3e049a557d79bec3c90aec34340
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49621
This adds support for configuring a qconfig for a call_method, e.g. x.chunk; this will help work around
a problem in our internal model.
TODO: since call_method is also a string and we flatten the qconfig, we might need to resolve the namespace conflict between
call_method and module_name.
TODO: add scope support to set the qconfig for call_method correctly with the original qconfig.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25651828
fbshipit-source-id: 82d66b121d37c8274fd481b6a2e9f9b54c5ca73d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49420
Before: if an output was marked as quantized, it could end up not actually
quantized if the previous node was not quantized.
After: if an output was marked as quantized, it will be quantized
regardless of the quantization status of the previous node.
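For illustration, the prepare-time config this refers to (assuming the `output_quantized_idxs` key; it plugs into prepare_fx's prepare_custom_config_dict as in the other sketches in this log):
```python
# mark graph output 0 as quantized; after this PR that output is quantized
# regardless of whether the node producing it was quantized
prepare_custom_config_dict = {"output_quantized_idxs": [0]}
```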
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_quant_output_always_observed
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25566834
fbshipit-source-id: 84755a1605fd3847edd03a7887ab9f635498c05c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49382
Fixes an edge case: if the input to the graph is quantized and the
first node does not need activation observation, we make sure that
no observer is inserted.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_int8_input_no_unnecessary_fq
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25551041
fbshipit-source-id: a6cba235c63ca7f6856e4128af7c1dc7fa0085ea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49239
Context: the existing implementation of `quantized_input_idxs` is convert-only.
Therefore, observers are inserted between the input and the first
quantized node. This is a problem during QAT, because the initial
input is a fake_quant, and it starts with scale=1 and zp=0. This does
not match the quantization parameters of the graph input, which can
lead to incorrect numerics.
Fix: do not insert an observer for a quantized input.
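For illustration, a sketch of the config this affects, assuming `input_quantized_idxs` is read from prepare_custom_config_dict (see the PR below in this log that moves it there); the model is a placeholder:
```python
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 3, 3)

    def forward(self, x):
        return self.conv(x)

# graph input 0 arrives already quantized (int8)
prepare_custom_config_dict = {"input_quantized_idxs": [0]}
prepared = prepare_fx(
    M().eval(),
    {"": get_default_qconfig("fbgemm")},
    prepare_custom_config_dict=prepare_custom_config_dict,
)
# after this PR, no observer / fake_quant is inserted for that quantized input
```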
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25499486
fbshipit-source-id: 303b49cc9d95a9fd06fef3b0859c08be34e19d8a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49238
Moves the `input_quantized_idxs` and `output_quantized_idxs` options
from the convert config to the prepare config. This is done because
these operations are related to placing observers, which is numerics
changing during QAT.
The next PR will adjust the behavior of `input_quantized_idxs` in
prepare in QAT to prevent placing a fake_quant at the input if the
input is marked quantized. Placing a fake_quant there can lead to
numerical inaccuracies during calibration, as it would start with
scale=1 and zp=0, which may be different from the quantization
parameters of the incoming quantized input.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25498762
fbshipit-source-id: 17ace8f803542155652b310e5539e1882ebaadc6