Summary: This PR does two things:
1) Previously this would simply error; now it will ignore any
torch.inf values that it receives. Note: the code checks for torch.inf after
aminmax, so that if no torch.inf values are found, perf is
relatively unchanged (see the sketch after this list).
2) As mentioned in https://github.com/pytorch/pytorch/issues/100051,
values close to (but not quite at) the maximum/minimum float value could
overflow to infinity in the course of _adjust_min_max(), when such a large
value is multiplied mid-calculation in an expression that would otherwise
produce a non-inf result. This was fixed by
rearranging the order of operations for the lines in question without
altering the actual equations. Specifically, where the operations on lines
1095, 1098, and 1100 multiply and divide large values,
it is better to divide the two large values before multiplying, rather
than multiplying the two large values together (creating overflow) before dividing, as the code had been doing (also illustrated in the sketch below).
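A minimal sketch of both ideas (illustrative, not the literal observer code; `_min_max_ignoring_inf` is a hypothetical helper name):
```
import torch

# (1) run aminmax first; pay for masking out inf only on the rare path
# where an inf was actually observed
def _min_max_ignoring_inf(x: torch.Tensor):
    min_val, max_val = torch.aminmax(x)
    if min_val.isinf() or max_val.isinf():
        finite = x[~x.isinf()]  # assumes at least one finite value
        min_val, max_val = torch.aminmax(finite)
    return min_val, max_val

# (2) divide before multiplying when the operands are large, so the
# intermediate product cannot overflow to inf
a = torch.tensor(3e38)  # near torch.finfo(torch.float32).max
b, c = torch.tensor(10.0), torch.tensor(100.0)
print(a * b / c)  # tensor(inf): a * b overflows first
print(a / c * b)  # tensor(3.0000e+37): same equation, reordered
```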
Test Plan: python test/test_quantization.py
TestObserver.test_histogram_observer_ignore_infinity
python test/test_quantization.py TestObserver.test_histogram_observer_handle_close_to_infinity
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D51489345](https://our.internmc.facebook.com/intern/diff/D51489345)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103467
Approved by: https://github.com/andrewor14
Summary:
The FX graph mode quantization workflow and the PT2E flow rely on the `is_dynamic` flag in the observer/QuantizationSpec to
convert an observer to the dynamic quantization pattern (choose_qparams -> q -> dq). This PR adds the is_dynamic flag
to all observers so that it is possible to convert these observers to the pattern.
However, this dynamic quantization pattern (choose_qparams -> q -> dq) is actually only valid for MovingAverageObserver(averaging_constant=1),
since that is what makes the computation before and after convert match in the context of QAT. So we add some sanity
checks in the other observers to make sure is_dynamic is False.
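A hypothetical sketch of such a sanity check (the class and message are illustrative, not the exact PyTorch code):
```
class MinMaxObserverSketch:
    def __init__(self, is_dynamic: bool = False, **kwargs):
        if is_dynamic:
            raise NotImplementedError(
                "MinMaxObserver does not support dynamic quantization; use "
                "a MovingAverage observer with averaging_constant=1 instead"
            )
        self.is_dynamic = is_dynamic
```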
Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113288
Approved by: https://github.com/kimishpatel
Summary:
Previously we could only use native PyTorch integer dtypes that have corresponding quantized dtypes (e.g. quint8, qint8). This
PR removes that assumption in observers/fake_quants so that users can use all PyTorch native dtypes (except int64; we can add it later if needed).
The main addition here is int16.
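A usage sketch under the new behavior (assuming the observer accepts a plain integer dtype together with explicit quant_min/quant_max):
```
import torch
from torch.ao.quantization.observer import MinMaxObserver

obs = MinMaxObserver(dtype=torch.int16, quant_min=-(2 ** 15), quant_max=2 ** 15 - 1)
obs(torch.randn(16))
scale, zero_point = obs.calculate_qparams()
```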
Test Plan:
python test/test_quantization.py TestQuantizePT2E
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108453
Approved by: https://github.com/kimishpatel
Summary: D41985889 removed the cast to int for the inputs to torch.histc below, allowing the inputs to still be tensors. These tensors still have requires_grad set to True, causing issues with the call to torch.histc.
Test Plan: buck2 test 'fbcode//mode/opt' fbcode//dper3/dper3/modules/low_level_modules/tests:stat_collector_test -- --exact 'dper3/dper3/modules/low_level_modules/tests:stat_collector_test - test_scripted_module (dper3.dper3.modules.low_level_modules.tests.stat_collector_test.StatCollectorTest_1)'
Differential Revision: D48800879
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108232
Approved by: https://github.com/jerryzh168
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99220
Previously there were two places where we needed to decide whether or not to insert an observer or fake quantizer:
(1) the input arguments of a node and (2) the output of a node, and we had separate code for each.
In this PR the logic is unified in the `_needs_obs_or_fq` helper function, which takes the target_dtype and is_dynamic from the previous output,
and the target_dtype and is_dynamic for the current Tensor we are looking at.
Let's use an example for a conv node:
```
conv = convolution(input, weight, bias, ...)
```
Say we have an `input_node` object for the argument `input`, and a `conv_node` for the `conv` node in the graph.
(1) Input arguments, e.g. `input`:
the target_dtype/is_dynamic from the previous output belongs to the node that produces `input`; we get this from
input_node.meta["target_dtype_info"]["output_act_obs_or_fq"].
The target_dtype/is_dynamic for the current argument `input` comes from conv_node.meta["target_dtype_info"]["input_act_obs_or_fq"];
similarly, for weight it comes from conv_node.meta["target_dtype_info"]["weight_obs_or_fq"], etc.
(2) Output of the conv node:
the target_dtype/is_dynamic from the previous output will be the floating point output of the fp32 convolution operator, so it
is hardcoded to (torch.float, False). Technically we should get this from node.meta["val"], but since the
current code base is shared by FX graph mode quantization and PyTorch 2.0 export quantization, we cannot do that; we can revisit
after we decide to deprecate FX graph mode quantization.
The target_dtype/is_dynamic for the current output comes from conv_node.meta["target_dtype_info"]["output_act_obs_or_fq"].
There is one caveat here about dynamic quantization, which is explained in a code comment, so I won't repeat it here.
Note: this PR also fixes some places in `_get_arg_target_dtype_as_input_to_node` and `_get_arg_target_is_dynamic_as_input_to_node` to make sure "not specified" == specifying a fp32 placeholder observer.
Next: we can merge the two functions that get target_dtype and is_dynamic to reduce code duplication.
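To make the unified logic concrete, here is a minimal sketch of the decision (an assumption-laden simplification, not the actual helper; in particular the dynamic branch glosses over the caveat mentioned above):
```
import torch

def _needs_obs_or_fq(prev_dtype, prev_is_dynamic, cur_dtype, cur_is_dynamic):
    if cur_is_dynamic:
        # dynamic quantization: a choose_qparams -> q -> dq chain is only
        # inserted on a floating point input
        return prev_dtype == torch.float
    # static case: observe/fake-quantize whenever the dtype needs to change
    return prev_dtype != cur_dtype
```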
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestQuantizePT2E
python test/test_quantization.py TestQuantizePT2EModels
Imported from OSS
Differential Revision: D45198323
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99767
Approved by: https://github.com/kimishpatel
Summary:
Previously there were two places where we needed to decide whether or not to insert an observer or fake quantizer:
(1) the input arguments of a node and (2) the output of a node, and we had separate code for each.
In this PR the logic is unified in the `_needs_obs_or_fq` helper function, which takes the target_dtype and is_dynamic from the previous output,
and the target_dtype and is_dynamic for the current Tensor we are looking at.
Let's use an example for a conv node:
```
conv = convolution(input, weight, bias, ...)
```
Say we have an `input_node` object for the argument `input`, and a `conv_node` for the `conv` node in the graph.
(1) Input arguments, e.g. `input`:
the target_dtype/is_dynamic from the previous output belongs to the node that produces `input`; we get this from
input_node.meta["target_dtype_info"]["output_act_obs_or_fq"].
The target_dtype/is_dynamic for the current argument `input` comes from conv_node.meta["target_dtype_info"]["input_act_obs_or_fq"];
similarly, for weight it comes from conv_node.meta["target_dtype_info"]["weight_obs_or_fq"], etc.
(2) Output of the conv node:
the target_dtype/is_dynamic from the previous output will be the floating point output of the fp32 convolution operator, so it
is hardcoded to (torch.float, False). Technically we should get this from node.meta["val"], but since the
current code base is shared by FX graph mode quantization and PyTorch 2.0 export quantization, we cannot do that; we can revisit
after we decide to deprecate FX graph mode quantization.
The target_dtype/is_dynamic for the current output comes from conv_node.meta["target_dtype_info"]["output_act_obs_or_fq"].
There is one caveat here about dynamic quantization, which is explained in a code comment, so I won't repeat it here.
Note: this PR also fixes some places in `_get_arg_target_dtype_as_input_to_node` and `_get_arg_target_is_dynamic_as_input_to_node` to make sure "not specified" == specifying a fp32 placeholder observer.
Next: we can merge the two functions that get target_dtype and is_dynamic to reduce code duplication.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestQuantizePT2E
python test/test_quantization.py TestQuantizePT2EModels
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D45167585](https://our.internmc.facebook.com/intern/diff/D45167585)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99220
Approved by: https://github.com/kimishpatel
Summary:
Previously we assumed asymmetric quantization for dynamic quantization; this diff adds support for symmetric quantization
of the input in dynamic quantization.
Test Plan: buck run executorch/exir/tests:quant_lowering_custom_backend_pass -- "executorch.exir.tests.test_quant_lowering_custom_backend_pass.TestQuantLoweringCustomBackendPass.test_quantized_linear_dynamic"
Reviewed By: digantdesai
Differential Revision: D43134794
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94854
Approved by: https://github.com/digantdesai
This reverts commit 59071ab1e7.
It breaks `quantization.jit.test_ondevice_quantization.TestOnDeviceDynamicPTQFinalize`, which is not run in OSS, but is mandatory for internal CI.
Summary: A cast to int was added in
https://github.com/pytorch/pytorch/pull/45630 to silence a mypy complaint.
However this leads to unexpected behavior where the histogram doesn't
actually capture the full range of activation values.
Note 1: the test_histogram_observer_against_reference test was silently
broken on master. The random parameters it normally generates apparently don't cause a test failure, but running the test repeatedly in a loop would
eventually fail. This was because, in some cases,
sum(<tensor>) != torch.sum(<tensor>).item(). I was not able to reproduce
this with a toy example, but running this test in a loop and editing
either observer to print the calculation for 'total' would break the
test and show the differing behaviors (see the illustrative snippet below). Fixing this test was necessary to
land this PR, since changing the histogram bounds changed things enough
that this test would error.
Note 2: updating HistogramObserver breaks some BC tests unless the
model is regenerated using the HistogramObserver from this PR.
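For illustration (a minimal repro was elusive, as noted above), the discrepancy comes from differing accumulation orders: Python's builtin sum() adds tensor elements one at a time, while torch.sum() runs a fused reduction, so the two can disagree in the low-order float32 bits on some inputs:
```
import torch

t = torch.rand(100000)
a = sum(t).item()        # sequential, element-by-element accumulation
b = torch.sum(t).item()  # fused reduction kernel
print(a == b, a - b)     # may print False, with a tiny difference
```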
Test Plan: python test/test_quantization.py TestHistogramObserver.test_histogram_observer_correct_numel
python test/test_quantization.py -k histogram
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90355
Approved by: https://github.com/vkuzo
Summary:
This PR deprecates the `compute_dtype` field on observers and replaces
it with an `is_dynamic` field. This is better aligned
with the reference model spec.
Test plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85431
Approved by: https://github.com/jerryzh168
Summary:
This is needed for choose_qparams, but previously it was not configurable; in the reference quantization flow
with decomposed Tensors, we are making this explicit.
Test Plan:
tested in future PR
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89267
Approved by: https://github.com/vkuzo
Summary: the same function existed in observer and quantize; this consolidates them into a
single function. Note the definitions were slightly different; I've
changed the definition to be maximally inclusive so that the name of the
function is more accurate.
Test Plan: python test/test_public_bindings.py
python test/test_quantization.py
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D40709276](https://our.internmc.facebook.com/intern/diff/D40709276)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87520
Approved by: https://github.com/jcaip
Summary:
As titled, HistogramObserver may fail in a certain scenario.
Specifically, we originally compute `hist_bin_width` as `(self.max_val - self.min_val) / (self.bins * upsample_rate)`. It's possible that the numerator is close to the smallest positive FP32 value (1.4e-45), in which case the division underflows to zero and breaks downstream computations that divide by `hist_bin_width`.
This PR introduces some redundant computation to avoid that scenario.
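A numeric illustration of the failure mode, with assumed values for bins and upsample_rate:
```
import torch

span = torch.tensor(1.4e-45)          # max_val - min_val, near the FP32 threshold
hist_bin_width = span / (2048 * 128)  # bins * upsample_rate (assumed values)
print(hist_bin_width)                 # tensor(0.) -- the division underflows
print(span / hist_bin_width)          # tensor(inf) -- downstream use blows up
```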
Test Plan: https://pxl.cl/2ggD4 (04490e90ea)
Differential Revision: D40149594
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86522
Approved by: https://github.com/jerryzh168
Summary: `per_channel_weight_observer_range_neg_127_to_127` now correctly uses `PerChannelMinMaxObserver` instead of `MinMaxObserver`
Test Plan:
Adds a new test, `quantization.core.test_top_level_apis`, to instantiate and run `forward()` on all `default` observers.
Differential Revision: D39916482
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85883
Approved by: https://github.com/salilsdesai
Summary:
Before this PR, the `dtype` attribute of observers was not clearly
defined. It originally meant `interface_dtype` in the eager mode
workflow, which is how the codebase used it before this PR.
In the new reference model spec, the `dtype` attribute of an observer
represents the `dtype` value which needs to be passed into a `quantize`
function in the reference model spec. This PR aligns the codebase
to this definition of dtype. In detail:
1. change util functions to interpret `dtype` using the reference model definition
2. change `prepare` to interpret `dtype` using the reference model definition
3. change observers for dynamic quantization to interpret `dtype` using the reference
model definition
A future PR (left out of this one to keep LOC small) will deprecate the
`compute_dtype` field and instead expose `is_dynamic` on observers.
Test plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Differential Revision: [D39675209](https://our.internmc.facebook.com/intern/diff/D39675209)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85345
Approved by: https://github.com/z-a-f, https://github.com/jerryzh168
Summary:
Adds `extra_repr` to `HistogramObserver`. This is useful when debugging
PTQ models because it makes it quick to check whether a `HistogramObserver`
has received data or not.
Test plan:
```
>>> import torch
>>> obs = torch.ao.quantization.HistogramObserver()
>>> obs(torch.randn(1, 3, 224, 224))
...
>>> print(obs)
# before - hard to tell if the observer has seen data
HistogramObserver()
# after
HistogramObserver(min_val=-4.778339862823486, max_val=4.311892986297607)
>>>
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84760
Approved by: https://github.com/andrewor14
Summary:
After inserting quant-dequant nodes in the graph, we need to:
1. Insert packed param creation and the quantized op.
2. Create a packed_params attribute in the top module. For this we need a
graph that is inlined except for calculate_qparams method calls. But those
can be inlined too, so perhaps we just need to make sure no other CallMethods
exist.
3. Insert SetAttr for the packed param.
4. Insert GetAttr for the packed param.
5. Use the GetAttr output for the quantized op where applicable, e.g.
linear_dynamic.
The above is added to the quantize_<method-name> method created in the previous
step. Once the above steps are done, clone the method into
quantized_<method-name>.
Modify quantize_<method-name>:
1. Remove all outputs from the method.
2. Run DCE.
3. Remove all inputs from the method except self.
Modify quantized_<method-name>:
1. Remove all packed_param SetAttr nodes.
2. Run DCE.
This should result in removal of all nodes that generate packed params.
Test Plan: To be written
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D38771416](https://our.internmc.facebook.com/intern/diff/D38771416)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83571
Approved by: https://github.com/jerryzh168
Motivation: each quantization observer supports only a limited set of qschemes, and we need to check this at initialization rather than at run time. For example, for a MinMaxObserver with its qscheme set to **torch.per_channel_affine**, there is currently a runtime error during the calibration step:
```
AttributeError: 'MinMaxObserver' object has no attribute 'ch_axis'
```
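A hypothetical sketch of such an init-time check (the names and the supported list are illustrative):
```
import torch

class MinMaxObserverSketch:
    _SUPPORTED_QSCHEMES = (torch.per_tensor_affine, torch.per_tensor_symmetric)

    def __init__(self, qscheme=torch.per_tensor_affine):
        if qscheme not in self._SUPPORTED_QSCHEMES:
            # fail at construction time with a clear message, instead of an
            # AttributeError later during calibration
            raise NotImplementedError(
                f"MinMaxObserver does not support qscheme {qscheme}"
            )
        self.qscheme = qscheme
```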
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80126
Approved by: https://github.com/jerryzh168
This is a new version of #15648 based on the latest master branch.
Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR.
In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.)
Fixes https://github.com/pytorch/pytorch/issues/71105
@ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797
Approved by: https://github.com/ezyang
Summary: This commit adds qconfigs with special observers for fixed
qparams ops in get_default_qconfig_mapping and
get_default_qat_qconfig_mapping. For correctness, we also require
users to use these special observers if we detect these fixed
qparams ops in prepare.
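An illustrative way to inspect the result (assuming torch.nn.Sigmoid is among the detected fixed-qparams ops and that the mapping exposes per-type qconfigs via object_type_qconfigs):
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping

qconfig_mapping = get_default_qconfig_mapping()
# fixed-qparams ops such as torch.nn.Sigmoid should map to a qconfig whose
# observers carry fixed scale/zero_point
print(qconfig_mapping.object_type_qconfigs.get(torch.nn.Sigmoid))
```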
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Differential Revision: [D37396379](https://our.internmc.facebook.com/intern/diff/D37396379)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80184
Approved by: https://github.com/jerryzh168
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76637
The previous naming conventions `default_affine_fixed_qparams_observer`
and `default_symmetric_fixed_qparams_observer` were uninformative, and users had to read
the definition in order to understand what these observers are. The new
naming convention reveals information about the range of the observers.
Analogous changes were also made for
`default_symmetric_fixed_qparams_fake_quant` and
`default_affine_fixed_qparams_fake_quant`.
Test Plan:
```
python test/test_quantization.py
```
Differential Revision: D36054169
Reviewed By: vkuzo
Pulled By: dzdang
fbshipit-source-id: 215f7786a4b7abda7327f17cc61735697ec5cca9
(cherry picked from commit 21a4e6eda4467c8adca7fd534a506a14e975f9cf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76461
Renaming, as the old name was confusing. The new name better
represents what this class is doing.
Test Plan: CI
Reviewed By: jerryzh168
Differential Revision: D35976350
Pulled By: vkuzo
fbshipit-source-id: 6da6c1767cec729c3959b13ae9dd939d0b2f622c
(cherry picked from commit 065608ef42c599525bfad4603af74c5bdf0881c3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76460
`RecordingObserver` inherits from `_ObserverBase` but does not use any functionality
from it. Making it inherit from `ObserverBase` instead.
This will make it simpler to rename `_ObserverBase` to something more meaningful in the next PR.
Test Plan: CI
Reviewed By: jerryzh168
Differential Revision: D35976351
Pulled By: vkuzo
fbshipit-source-id: 19c106bf0d48607c231702e2e048f42a7f48a5c6
(cherry picked from commit 4fd44123b0e9bcdcae546aecabe80d7642129cf5)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74507
* These are the default symmetric QAT qconfigs for qnnpack.
* Support for symmetric quantization is not available from other backends.
* Observers are similar to symmetric PTQ qconfigs for qnnpack.
Reviewed By: jerryzh168
Differential Revision: D34804808
fbshipit-source-id: 22c11b89242a98f54029ac195f7b984e42809164
(cherry picked from commit ea751ded1174ba2c2f061bafc81573faaf248a9a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74396
# New qconfig `default_symmetric_qnnpack_qconfig`
Returns a qconfig with signed activations and symmetric weights with range restrictions. Also adds a per_channel variant of the same.
## Restrictions on weights
The restrictions on weights include:
1. the weight zero point is forced to zero, and
2. weight 8-bit signed quantized values are limited to [-127, +127], excluding the value -128.
This is driven, in part, by the desire to achieve better performance with XNNPACK ops. (A sketch of a matching observer configuration follows.)
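A sketch of a weight observer matching these restrictions (parameter choices assumed from the description above; the eps value is the one discussed below):
```
import torch
from torch.ao.quantization.observer import PerChannelMinMaxObserver

weight_observer = PerChannelMinMaxObserver.with_args(
    dtype=torch.qint8,
    qscheme=torch.per_channel_symmetric,  # forces the zero point to 0
    quant_min=-127,                       # excludes -128
    quant_max=127,
    eps=2 ** -12,
)
```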
## qengine/backend = `qnnpack` and XNNPACK ops
The qconfig returned by this function allows us to use faster XNNPACK quantized ops on CPU, with said restrictions. Although we are using XNNPACK ops, the qengine is still `qnnpack`, and there are no plans to introduce a new qengine for XNNPACK ops. Support for using XNNPACK ops with the asymmetric qconfig (returned by get_default_qconfig()) is WIP.
## Updated EPS value:
* From PyTorch:
eps:
```
>>> import torch
>>> torch.finfo(torch.float32).eps
1.1920928955078125e-07
>>> torch.finfo(torch.float32).eps.hex()
'0x1.0000000000000p-23'
```
All scale values are float32 and `scale = max(scale, eps)`
* Requirement from XNNPACK
For both the fp32 and rndnu requantization schemes, `0x1p-32 <= requantization_scale < 256.0`,
where requantization_scale = (input_scale * kernel_scale) / output_scale.
* New minimum allowed scale value
With the current float32 eps (=0x1p-23) as the minimum, the xnnpack lower bound is the problem. We haven't observed upper-bound issues so far when assuming a max scale value of 256. So, focusing on the lower bound: to cover all possible requantization values, we must, conservatively, have the minimum possible scale value satisfy:
```
minimum_requantization_value = xnnpack_lower_threshold
input_scale * kernel_scale / output_scale = 0x1p-32
min_scale_value * min_scale_value / max_scale_value = 0x1p-32
min_scale_value * new_eps / 256 = 0x1p-32
min_scale_value**2 = 0x1p-24
min_scale_value = 0x1p-12
```
With `scale_value >= 0x1p-12`, we should be able to stay above the xnnpack kernels' lower threshold on the requantization scale.
Hitting this bound is of course very unlikely in practice, so we could probably get away with a much smaller EPS value than `0x1p-12`, but it is not easy to choose a smaller value empirically.
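A quick numeric check of the bound derived above:
```
# with input_scale = kernel_scale = 0x1p-12 and output_scale = 256, the
# requantization scale lands exactly on XNNPACK's lower threshold 0x1p-32
eps = 2.0 ** -12
requantization_scale = eps * eps / 256.0
assert requantization_scale == 2.0 ** -32
```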
* Impact on accuracy is unclear as of writing this.
Reviewed By: kimishpatel
Differential Revision: D34625300
fbshipit-source-id: 005e6757ed1185b3940b58ac55246cba8b267828
(cherry picked from commit 61ed1a2a308a1792ccbfc316153a6dc39798f02a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73947
The original implementation of memoryless observers used MinMaxObservers and
a `memoryless` argument that changed the observer's behavior so that it wouldn't
keep track of previously observed mins and maxes. It was later pointed
out that this is equivalent to a MovingAverage observer with averaging_constant=1,
which requires less overhead and no one-off args (memoryless), so this PR removes
the `memoryless` arg and uses MovingAverage observers instead. Although the memoryless
adjective is still used, a complete definition was also added to clarify error
messages given these changes.
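A small demonstration of the equivalence: the moving-average update new_min = old_min + c * (observed_min - old_min) reduces to new_min = observed_min when c == 1, so prior observations are discarded.
```
import torch

obs = torch.ao.quantization.MovingAverageMinMaxObserver(averaging_constant=1)
obs(torch.tensor([-10.0, 10.0]))
obs(torch.tensor([-1.0, 1.0]))   # overwrites, rather than merges, the range
print(obs.min_val, obs.max_val)  # tensor(-1.), tensor(1.)
```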
Test Plan:
python test/test_quantization.py TestQuantizeEagerQAT
python test/test_quantization.py TestObserver
Test Plan: Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34732080
Pulled By: HDCharles
fbshipit-source-id: 227a1ab29d18adae55093a684ea35ac34523d07a
(cherry picked from commit 5238e70e8f90f3219c36f9c64b647951dcf64b5a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71027
Fixes issue #61054: removes the warning triggered by reduce_range=True, i.e. the message "UserWarning: Please use quant_min and quant_max to specify the range for observers".
Test Plan:
python test/test_quantization.py TestFakeQuantizeOps
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D33484341
fbshipit-source-id: 97c3d4658926183f88a0c4665451dd7f913d30e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70107
Histogram observer used floor division on tensors, which is a deprecated
behavior. There was a warning printed:
```
/Users/vasiliy/pytorch/torch/ao/quantization/observer.py:905: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
```
This PR fixes the warning.
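For reference, the non-deprecated spelling of tensor floor division:
```
import torch

a = torch.tensor([7.0, -7.0])
b = torch.tensor(2.0)
print(torch.div(a, b, rounding_mode='floor'))  # tensor([ 3., -4.])
```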
Test Plan:
```
python test/test_quantization.py TestHistogramObserver
```
Reviewed By: ejguan
Differential Revision: D33187926
Pulled By: vkuzo
fbshipit-source-id: 9c37de4c6d6193bee9047b6a28ff37ee1b019753
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69249
This PR added default_replay_qconfig and default_replay_observer, which are used
when we want to configure an operator to reuse the observer from its input: if the input
Tensor of the operator is not observed, we will not observe the output of the operator either;
if the input Tensor is observed, we will observe the output of the operator with the same observer.
e.g.
```
x1 = x0.reshape()
```
if reshape is configured with default_replay_qconfig:
1. if x0 is observed with observer_0, we'll observe x1 with the same observer instance
2. if x0 is not observed, we won't observe x1 either
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_replay_qconfig
```
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D32774723
fbshipit-source-id: 26862b2bc181d0433e2243daeb3b8f7ec3dd33b2
Summary:
**Summary**: FixedQParams operators do not need fake quantization
in the prepare step. This commit introduces FixedQParamsObserver
and makes FixedQParamsFakeQuantize a simple wrapper around this
observer. It also removes the fake-quantize logic in forward. (A simplified sketch of the relationship is below.)
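A simplified sketch (assumed structure, not the actual class) of an observer whose qparams are constants rather than statistics collected from data:
```
import torch

class FixedQParamsObserverSketch(torch.nn.Module):
    def __init__(self, scale: float, zero_point: int):
        super().__init__()
        self.scale = scale
        self.zero_point = zero_point

    def forward(self, x):
        return x  # observers pass activations through unchanged

    def calculate_qparams(self):
        # qparams are fixed up front instead of derived from observed data
        return torch.tensor([self.scale]), torch.tensor([self.zero_point])
```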
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68143
Test Plan:
Added two tests:
python3 test/test_quantization.py TestQuantizeFx.test_fixed_qparams_patterns
python3 test/test_quantization.py TestQuantizeFx.test_register_patterns
**Reviewers**: Jerry Zhang
**Subscribers**: Jerry Zhang, Supriya Rao
**Tasks**: T104942885
**Tags**: pytorch
Reviewed By: albanD
Differential Revision: D32484427
Pulled By: andrewor14
fbshipit-source-id: 5a048b90eb4da79074c5ceffa3c8153f8d8cd662