Summary: This PR does 2 things:
1) Previously this would simply error; now it will ignore any
torch.inf values that it receives. Note: the code checks for torch.inf after
aminmax, so if there are no torch.inf values found, the perf is
relatively unchanged.
2) As mentioned in https://github.com/pytorch/pytorch/issues/100051,
values close to (but not quite at) the maximum/minimum float value could
overflow to infinity in the course of _adjust_min_max() (when such a large
value is multiplied partway through a calculation whose final result
would otherwise be non-inf). This was fixed by
rearranging the order of operations for the lines in question without
altering the actual equations. Specifically, where the operations in lines
1095, 1098 and 1100 mix multiplication and division of large values,
it is better to divide the two large values before multiplying, rather
than multiplying the two large values together (creating overflow) before dividing, as the code previously did.
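For illustration, a minimal sketch (hypothetical values, not the actual observer.py code) of why dividing before multiplying avoids the overflow:
```python
import torch

big = torch.tensor(3.0e38)  # near torch.finfo(torch.float32).max
n = torch.tensor(100.0)

bad = big * n / big   # big * n overflows to inf, so the result is inf
good = big / big * n  # intermediates stay in range, so the result is 100.0
print(bad, good)      # tensor(inf) tensor(100.)
```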
Test Plan: python test/test_quantization.py
TestObserver.test_histogram_observer_ignore_infinity
python test/test_quantization.py TestObserver.test_histogram_observer_handle_close_to_infinity
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D51489345](https://our.internmc.facebook.com/intern/diff/D51489345)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103467
Approved by: https://github.com/andrewor14
Summary:
This is to allow easier extension of the quant workflow in the future, as we are seeing more
diverse ways of doing quantization.
Putting this up for feedback first.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_observer_callback
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115001
Approved by: https://github.com/kimishpatel
Summary:
The FX graph mode quant workflow and the pt2e flow rely on the `is_dynamic` flag in the observer/QuantizationSpec to
convert an observer to the dynamic quantization pattern (choose_qparams -> q -> dq). This PR adds an `is_dynamic` flag
to all observers so that it is possible to convert these observers to the pattern.
However, this dynamic quantization pattern (choose_qparams -> q -> dq) is actually only valid for MovingAverageObserver(averaging_constant=1),
since that is the only case where the computation before convert and after convert matches in the context of QAT. So we add sanity
checks in the other observers to make sure is_dynamic is False.
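A minimal sketch (a hypothetical observer class, not the actual PyTorch code) of the kind of sanity check described:
```python
class _ExampleObserver:
    def __init__(self, is_dynamic: bool = False):
        # The dynamic pattern (choose_qparams -> q -> dq) only matches QAT
        # numerics for MovingAverageObserver with averaging_constant=1, so
        # other observers reject is_dynamic=True.
        assert not is_dynamic, (
            "is_dynamic is only supported by "
            "MovingAverageObserver(averaging_constant=1)"
        )
        self.is_dynamic = is_dynamic
```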
Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113288
Approved by: https://github.com/kimishpatel
Summary:
att, this is because the histogram observer does not work for a corner case in mobilebert (observing a scalar tensor holding the float32 max value):
the histc operator errors out when a value is larger than a certain threshold.
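A hedged repro sketch of the corner case (the exact failure mode may vary by PyTorch version):
```python
import torch

# A scalar tensor holding the float32 max value, as in the mobilebert case.
x = torch.tensor([torch.finfo(torch.float32).max])
try:
    torch.histc(x, bins=2048, min=float(x.min()), max=float(x.max()))
except RuntimeError as e:
    print("histc failed:", e)
```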
Test Plan:
python test/test_quantization.py -k test_mul_float32_max
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113405
Approved by: https://github.com/mcr229
Summary:
This is a util for the numeric suite in pt2 export, so that we can build
a more streamlined UX for numerical debugging in the quant + executorch stack.
Test Plan:
python test/test_quantization.py TestGenerateNumericDebugHandle
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114315
Approved by: https://github.com/zhxchen17
Summary:
The current order of implicit sharing breaks common annotation patterns of SharedQuantizationSpec, so we changed the order here.
It is not going to work in all possible annotation cases, though, so quantizer implementers still need to be careful.
In general, if people only refer to nodes/edges that come before the current node/edge in SharedQuantizationSpec, it should work, I think.
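A minimal sketch (a simple add graph built with symbolic_trace, not the internal implicit-sharing code) of referring only to earlier nodes/edges:
```python
import operator
import torch
from torch.ao.quantization.quantizer import SharedQuantizationSpec

class M(torch.nn.Module):
    def forward(self, x, y):
        return x + y

gm = torch.fx.symbolic_trace(M())
add_node = next(n for n in gm.graph.nodes if n.target == operator.add)
x_node, y_node = add_node.args

# OK: the spec for edge (y_node, add_node) refers to edge (x_node, add_node),
# which comes before it in annotation order.
spec_for_second_input = SharedQuantizationSpec((x_node, add_node))
```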
Test Plan: CI, make sure this fixes some internal tests
Differential Revision: D51605918
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114704
Approved by: https://github.com/andrewor14
It appears that `mypy` is now checking a few more previously-unchecked files; these files
are being found via import-following. I'm not sure exactly why they weren't being checked before.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114160
Approved by: https://github.com/eellison
ghstack dependencies: #114162
**Summary**
When annotating a conv-binary pattern, we should skip the pattern if the conv node has more than one user.
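A minimal sketch (a hypothetical helper, not the quantizer's actual code) of the check described:
```python
def _skip_conv_binary_pattern(conv_node) -> bool:
    # If the conv output feeds more than one consumer, it cannot be folded
    # into a single conv-binary pattern, so skip annotation.
    return len(conv_node.users) > 1
```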
**Test Plan**
```
python -m pytest test_x86inductor_quantizer.py -k test_conv2d_binary2
python -m pytest test_x86inductor_quantizer.py -k test_qat_conv2d_binary2
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114540
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
Summary: Previously the PT2 QAT code only supported conv2d-bn.
This commit extends all existing QAT fusion support to conv1d-bn,
including support for all variants like relu, no bias, literal
args, cuda etc. This commit also refactors the code such that
we can support conv3d-bn easily in the future.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT_ConvBn1d
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D51428979](https://our.internmc.facebook.com/intern/diff/D51428979)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113714
Approved by: https://github.com/jerryzh168
Summary:
Previously this was scattered across two different places: before observer insertion and during observer insertion.
This PR moves everything to before we insert observers.
* Next: refactor QuantizationSpec and check more fields for sharing
Test Plan:
CI (regression tests)
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113458
Approved by: https://github.com/kimishpatel
Fixes #112988
For the files:
__init__.py
_correct_bias.py
_equalize.py
_learnable_fake_quantize.py
backend_config
experimental
fake_quantize.py
fuse_modules.py
fuser_method_mappings.py
Corrected the following (an illustrative before/after sketch appears after this list):
__init__.py:1 at module level:
D104: Missing docstring in public package
__init__.py:144 in public function `default_eval_fn`:
D205: 1 blank line required between summary line and description (found 0)
__init__.py:144 in public function `default_eval_fn`:
D400: First line should end with a period (not 'f')
__init__.py:144 in public function `default_eval_fn`:
D401: First line should be in imperative mood; try rephrasing (found 'Default')
__init__.py:152 in private class `_DerivedObserverOrFakeQuantize`:
D204: 1 blank line required after class docstring (found 0)
__init__.py:152 in private class `_DerivedObserverOrFakeQuantize`:
D205: 1 blank line required between summary line and description (found 0)
__init__.py:152 in private class `_DerivedObserverOrFakeQuantize`:
D210: No whitespaces allowed surrounding docstring text
__init__.py:152 in private class `_DerivedObserverOrFakeQuantize`:
D400: First line should end with a period (not 's')
_correct_bias.py:20 in public function `get_module`:
D200: One-line docstring should fit on one line with quotes (found 2)
_correct_bias.py:20 in public function `get_module`:
D210: No whitespaces allowed surrounding docstring text
_correct_bias.py:20 in public function `get_module`:
D300: Use """triple double quotes""" (found '''-quotes)
_correct_bias.py:20 in public function `get_module`:
D400: First line should end with a period (not 'l')
_correct_bias.py:25 in public function `parent_child_names`:
D200: One-line docstring should fit on one line with quotes (found 2)
_correct_bias.py:25 in public function `parent_child_names`:
D300: Use """triple double quotes""" (found '''-quotes)
_correct_bias.py:25 in public function `parent_child_names`:
D400: First line should end with a period (not 'e')
_correct_bias.py:25 in public function `parent_child_names`:
D401: First line should be in imperative mood (perhaps 'Split', not 'Splits')
_correct_bias.py:34 in public function `get_param`:
D205: 1 blank line required between summary line and description (found 0)
_correct_bias.py:34 in public function `get_param`:
D210: No whitespaces allowed surrounding docstring text
_correct_bias.py:34 in public function `get_param`:
D300: Use """triple double quotes""" (found '''-quotes)
_correct_bias.py:34 in public function `get_param`:
D400: First line should end with a period (not 's')
_correct_bias.py:44 in public class `MeanShadowLogger`:
D204: 1 blank line required after class docstring (found 0)
_correct_bias.py:44 in public class `MeanShadowLogger`:
D205: 1 blank line required between summary line and description (found 0)
_correct_bias.py:44 in public class `MeanShadowLogger`:
D400: First line should end with a period (not 'n')
_correct_bias.py:47 in public method `__init__`:
D107: Missing docstring in __init__
_correct_bias.py:56 in public method `forward`:
D205: 1 blank line required between summary line and description (found 0)
_correct_bias.py:56 in public method `forward`:
D210: No whitespaces allowed surrounding docstring text
_correct_bias.py:56 in public method `forward`:
D300: Use """triple double quotes""" (found '''-quotes)
_correct_bias.py:56 in public method `forward`:
D401: First line should be in imperative mood; try rephrasing (found 'The')
_correct_bias.py:77 in public method `clear`:
D102: Missing docstring in public method
_correct_bias.py:85 in public function `bias_correction`:
D205: 1 blank line required between summary line and description (found 0)
_correct_bias.py:85 in public function `bias_correction`:
D210: No whitespaces allowed surrounding docstring text
_correct_bias.py:85 in public function `bias_correction`:
D300: Use """triple double quotes""" (found '''-quotes)
_correct_bias.py:85 in public function `bias_correction`:
D400: First line should end with a period (not 's')
_correct_bias.py:85 in public function `bias_correction`:
D401: First line should be in imperative mood (perhaps 'Use', not 'Using')
_equalize.py:22 in public function `set_module_weight`:
D103: Missing docstring in public function
_equalize.py:28 in public function `set_module_bias`:
D103: Missing docstring in public function
_equalize.py:34 in public function `get_module_weight`:
D103: Missing docstring in public function
_equalize.py:40 in public function `get_module_bias`:
D103: Missing docstring in public function
_equalize.py:47 in public function `max_over_ndim`:
D200: One-line docstring should fit on one line with quotes (found 2)
_equalize.py:47 in public function `max_over_ndim`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:47 in public function `max_over_ndim`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:47 in public function `max_over_ndim`:
D400: First line should end with a period (not 's')
_equalize.py:47 in public function `max_over_ndim`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
_equalize.py:55 in public function `min_over_ndim`:
D200: One-line docstring should fit on one line with quotes (found 2)
_equalize.py:55 in public function `min_over_ndim`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:55 in public function `min_over_ndim`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:55 in public function `min_over_ndim`:
D400: First line should end with a period (not 's')
_equalize.py:55 in public function `min_over_ndim`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
_equalize.py:63 in public function `channel_range`:
D200: One-line docstring should fit on one line with quotes (found 2)
_equalize.py:63 in public function `channel_range`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:63 in public function `channel_range`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:63 in public function `channel_range`:
D400: First line should end with a period (not 'l')
_equalize.py:63 in public function `channel_range`:
D401: First line should be in imperative mood (perhaps 'Find', not 'finds')
_equalize.py:63 in public function `channel_range`:
D403: First word of the first line should be properly capitalized ('Finds', not 'finds')
_equalize.py:76 in public function `cross_layer_equalization`:
D205: 1 blank line required between summary line and description (found 0)
_equalize.py:76 in public function `cross_layer_equalization`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:76 in public function `cross_layer_equalization`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:76 in public function `cross_layer_equalization`:
D400: First line should end with a period (not 't')
_equalize.py:120 in public function `equalize`:
D205: 1 blank line required between summary line and description (found 0)
_equalize.py:120 in public function `equalize`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:120 in public function `equalize`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:120 in public function `equalize`:
D400: First line should end with a period (not 'l')
_equalize.py:159 in public function `converged`:
D205: 1 blank line required between summary line and description (found 0)
_equalize.py:159 in public function `converged`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:159 in public function `converged`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:159 in public function `converged`:
D400: First line should end with a period (not 's')
_equalize.py:159 in public function `converged`:
D401: First line should be in imperative mood (perhaps 'Test', not 'Tests')
_learnable_fake_quantize.py:8 in private class `_LearnableFakeQuantize`:
D204: 1 blank line required after class docstring (found 0)
_learnable_fake_quantize.py:8 in private class `_LearnableFakeQuantize`:
D205: 1 blank line required between summary line and description (found 0)
_learnable_fake_quantize.py:8 in private class `_LearnableFakeQuantize`:
D210: No whitespaces allowed surrounding docstring text
_learnable_fake_quantize.py:8 in private class `_LearnableFakeQuantize`:
D400: First line should end with a period (not 'h')
_learnable_fake_quantize.py:68 in private method `enable_param_learning`:
D205: 1 blank line required between summary line and description (found 0)
_learnable_fake_quantize.py:68 in private method `enable_param_learning`:
D400: First line should end with a period (not 'd')
_learnable_fake_quantize.py:68 in private method `enable_param_learning`:
D401: First line should be in imperative mood (perhaps 'Enable', not 'Enables')
_learnable_fake_quantize.py:78 in private method `enable_static_estimate`:
D205: 1 blank line required between summary line and description (found 0)
_learnable_fake_quantize.py:78 in private method `enable_static_estimate`:
D400: First line should end with a period (not 'f')
_learnable_fake_quantize.py:78 in private method `enable_static_estimate`:
D401: First line should be in imperative mood (perhaps 'Enable', not 'Enables')
_learnable_fake_quantize.py:87 in private method `enable_static_observation`:
D205: 1 blank line required between summary line and description (found 0)
_learnable_fake_quantize.py:87 in private method `enable_static_observation`:
D400: First line should end with a period (not 't')
_learnable_fake_quantize.py:87 in private method `enable_static_observation`:
D401: First line should be in imperative mood (perhaps 'Enable', not 'Enables')
fake_quantize.py:1 at module level:
D205: 1 blank line required between summary line and description (found 0)
fake_quantize.py:1 at module level:
D400: First line should end with a period (not 'n')
fake_quantize.py:61 in public class `FakeQuantizeBase`:
D205: 1 blank line required between summary line and description (found 0)
fake_quantize.py:61 in public class `FakeQuantizeBase`:
D210: No whitespaces allowed surrounding docstring text
fake_quantize.py:61 in public class `FakeQuantizeBase`:
D400: First line should end with a period (not 'e')
fake_quantize.py:74 in public method `__init__`:
D107: Missing docstring in __init__
fake_quantize.py:83 in public method `forward`:
D102: Missing docstring in public method
fake_quantize.py:87 in public method `calculate_qparams`:
D102: Missing docstring in public method
fake_quantize.py:91 in public method `enable_fake_quant`:
D102: Missing docstring in public method
fake_quantize.py:95 in public method `disable_fake_quant`:
D102: Missing docstring in public method
fake_quantize.py:99 in public method `enable_observer`:
D102: Missing docstring in public method
fake_quantize.py:103 in public method `disable_observer`:
D102: Missing docstring in public method
fake_quantize.py:107 in public method `with_args`:
D102: Missing docstring in public method
fake_quantize.py:115 in public class `FakeQuantize`:
D205: 1 blank line required between summary line and description (found 0)
fake_quantize.py:115 in public class `FakeQuantize`:
D210: No whitespaces allowed surrounding docstring text
fake_quantize.py:115 in public class `FakeQuantize`:
D412: No blank lines allowed between a section header and its content ('Attributes')
fake_quantize.py:150 in public method `__init__`:
D107: Missing docstring in __init__
fake_quantize.py:188 in public method `calculate_qparams`:
D102: Missing docstring in public method
fake_quantize.py:191 in public method `forward`:
D102: Missing docstring in public method
fake_quantize.py:214 in public method `extra_repr`:
D102: Missing docstring in public method
fake_quantize.py:262 in public class `FixedQParamsFakeQuantize`:
D205: 1 blank line required between summary line and description (found 0)
fake_quantize.py:262 in public class `FixedQParamsFakeQuantize`:
D210: No whitespaces allowed surrounding docstring text
fake_quantize.py:262 in public class `FixedQParamsFakeQuantize`:
D400: First line should end with a period (not 'n')
fake_quantize.py:268 in public method `__init__`:
D107: Missing docstring in __init__
fake_quantize.py:279 in public method `calculate_qparams`:
D102: Missing docstring in public method
fake_quantize.py:283 in public method `extra_repr`:
D102: Missing docstring in public method
fake_quantize.py:292 in public class `FusedMovingAvgObsFakeQuantize`:
D205: 1 blank line required between summary line and description (found 0)
fake_quantize.py:292 in public class `FusedMovingAvgObsFakeQuantize`:
D400: First line should end with a period (not 'e')
fake_quantize.py:307 in public method `__init__`:
D107: Missing docstring in __init__
fake_quantize.py:322 in public method `calculate_qparams`:
D102: Missing docstring in public method
fake_quantize.py:326 in public method `extra_repr`:
D102: Missing docstring in public method
fake_quantize.py:342 in public method `forward`:
D102: Missing docstring in public method
fake_quantize.py:480 in private function `_is_fake_quant_script_module`:
D200: One-line docstring should fit on one line with quotes (found 2)
fake_quantize.py:480 in private function `_is_fake_quant_script_module`:
D210: No whitespaces allowed surrounding docstring text
fake_quantize.py:480 in private function `_is_fake_quant_script_module`:
D300: Use """triple double quotes""" (found '''-quotes)
fake_quantize.py:480 in private function `_is_fake_quant_script_module`:
D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
fake_quantize.py:491 in public function `disable_fake_quant`:
D400: First line should end with a period (not ':')
fake_quantize.py:502 in public function `enable_fake_quant`:
D400: First line should end with a period (not ':')
fake_quantize.py:513 in public function `disable_observer`:
D400: First line should end with a period (not ':')
fake_quantize.py:524 in public function `enable_observer`:
D400: First line should end with a period (not ':')
fuse_modules.py:1 at module level:
D100: Missing docstring in public module
fuse_modules.py:39 in public function `fuse_known_modules`:
D205: 1 blank line required between summary line and description (found 0)
fuse_modules.py:39 in public function `fuse_known_modules`:
D400: First line should end with a period (not 'd')
fuse_modules.py:39 in public function `fuse_known_modules`:
D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
fuse_modules.py:104 in public function `fuse_modules`:
D400: First line should end with a period (not 'e')
fuse_modules.py:167 in public function `fuse_modules_qat`:
D200: One-line docstring should fit on one line with quotes (found 2)
fuse_modules.py:167 in public function `fuse_modules_qat`:
D210: No whitespaces allowed surrounding docstring text
fuse_modules.py:167 in public function `fuse_modules_qat`:
D400: First line should end with a period (not '`')
fuser_method_mappings.py:1 at module level:
D100: Missing docstring in public module
fuser_method_mappings.py:18 in public function `fuse_conv_bn`:
D400: First line should end with a period (not 'e')
fuser_method_mappings.py:55 in public function `fuse_conv_bn_relu`:
D400: First line should end with a period (not 'e')
fuser_method_mappings.py:102 in public function `fuse_linear_bn`:
D400: First line should end with a period (not 'e')
fuser_method_mappings.py:131 in public function `fuse_convtranspose_bn`:
D400: First line should end with a period (not 'e')
fuser_method_mappings.py:154 in private function `_sequential_wrapper2`:
D205: 1 blank line required between summary line and description (found 0)
fuser_method_mappings.py:154 in private function `_sequential_wrapper2`:
D210: No whitespaces allowed surrounding docstring text
fuser_method_mappings.py:154 in private function `_sequential_wrapper2`:
D400: First line should end with a period (not 's')
fuser_method_mappings.py:182 in public function `get_fuser_method`:
D205: 1 blank line required between summary line and description (found 0)
fuser_method_mappings.py:182 in public function `get_fuser_method`:
D210: No whitespaces allowed surrounding docstring text
fuser_method_mappings.py:182 in public function `get_fuser_method`:
D300: Use """triple double quotes""" (found '''-quotes)
fuser_method_mappings.py:182 in public function `get_fuser_method`:
D400: First line should end with a period (not ',')
fuser_method_mappings.py:205 in private function `_get_valid_patterns`:
D205: 1 blank line required between summary line and description (found 0)
fuser_method_mappings.py:205 in private function `_get_valid_patterns`:
D400: First line should end with a period (not ',')
fuser_method_mappings.py:205 in private function `_get_valid_patterns`:
D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
fuser_method_mappings.py:238 in public function `get_fuser_method_new`:
D205: 1 blank line required between summary line and description (found 0)
fuser_method_mappings.py:238 in public function `get_fuser_method_new`:
D210: No whitespaces allowed surrounding docstring text
fuser_method_mappings.py:238 in public function `get_fuser_method_new`:
D400: First line should end with a period (not 'd')
fuser_method_mappings.py:238 in public function `get_fuser_method_new`:
D401: First line should be in imperative mood; try rephrasing (found 'This')
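For illustration, a hedged before/after sketch (a hypothetical function, not the actual file contents) of the kind of fixes codes like D300/D400/D401 call for:
```python
# Before (D300: '''-quotes; D400: no trailing period; D401: not imperative):
def get_module_before(model, name):
    '''Grabs the submodule from given model'''
    return dict(model.named_modules())[name]

# After:
def get_module_after(model, name):
    """Return the submodule of `model` with the given `name`."""
    return dict(model.named_modules())[name]
```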
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112992
Approved by: https://github.com/kit1980
Summary: This commit significantly simplifies the QAT fusion
code for the `conv-bn` pattern by removing add and relu nodes
from the match and replacement patterns. This does not reduce
functionality; patterns like `conv-bn-relu`, `conv-bn-add`,
and `conv-bn-add-relu` are still supported. We simply do not
match these extra nodes, since there is actually no need to
replace them.
This has the additional benefit of reducing the number of
patterns being matched by 16x, since for each add and relu
variant of the `conv-bn` pattern there is also an in-place
variant. This also enables more flexible `conv-bn` pattern
matching in the future and keeps the number of patterns
more scalable.
One important change needed in this commit was to remove
the match filter that requires the input and output
activations to be quantized. This was necessary because
otherwise we would always expect q-dq nodes immediately
after the getitem node, instead of after the add or relu
nodes for example. This has another side benefit of
keeping QAT fusion flexible enough to support weight
only quantization.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113006
Approved by: https://github.com/jerryzh168
Summary:
new version of this: https://www.internalfb.com/diff/D49110166?dst_version_fbid=252052334533986
Fix an assign-device error when a module has multiple devices.
If fc_fp16_quantization is enabled for a CPU model, and module REMOTE_OTHER has multiple devices: {device(type='meta'), device(type='cpu')},
we fail on this assertion in fbcode/caffe2/torch/ao/quantization/fx/utils.py (line 232):
assert len(devices) <= 1, (
Since CPU models run on CPU devices, we added a condition before the assertion:
if CPU is in the module's list of devices, set the device to CPU.
Please see debug details:
https://docs.google.com/document/d/1pMPCeJyMPA15NhFc2uAyNDkS9azR40uaNyOP0DIgHjU/edit
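A hedged sketch (a hypothetical helper, not the exact utils.py change) of the condition added before the assertion:
```python
import torch

def _pick_device(devices):
    cpu = torch.device("cpu")
    if cpu in devices:
        # CPU models run on CPU; prefer it when e.g. a meta device is also present.
        return cpu
    assert len(devices) <= 1, f"expected at most one device, got {devices}"
    return next(iter(devices), None)
```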
Test Plan:
AIMP_DISAGG_CPU=true buck run mode/opt -c python.package_style=inplace -c fbcode.enable_gpu_sections=true lego/scripts:lego_cli -- run-locally --model_entity_id 959168967 --config_version 28 --publish_context OFFLINE_PUBLISH --lego_pipeline aiplatform.modelstore.model_generation.lego.lego_pipeline_builder.gmpp_lego_pipeline --gmpp_config '{"gmpp_pipeline_descriptor": "aiplatform.modelstore.model_generation.v1.ads_pipelines.aimp_pyper_pipeline.model_generation_pipeline", "worker_process_number":12, "worker_thread_per_process_number": 6, "use_work_assignment": true}' 2>&1 | tee /tmp/gmpp_lc.txt
Snapshot:
https://www.internalfb.com/manifold/explorer/ads_storage_fblearner/tree/user/facebook/fblearner/predictor/959168967/47
Differential Revision: D51226114
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113509
Approved by: https://github.com/jerryzh168
Summary:
For a node node1 and an edge (node1, node2): since they observe the same
Tensor, we may want to implicitly share observers. This flag allows people to
turn off this behavior for the output of the node.
See the test_allow_implicit_sharing test for the use case.
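A minimal sketch (assuming the QuantizationAnnotation field added by this PR; the qspec values are illustrative) of turning off implicit sharing for a node's output:
```python
import torch
from torch.ao.quantization.observer import MinMaxObserver
from torch.ao.quantization.quantizer import (
    QuantizationAnnotation,
    QuantizationSpec,
)

act_qspec = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-128,
    quant_max=127,
    qscheme=torch.per_tensor_affine,
    observer_or_fake_quant_ctr=MinMaxObserver,
)
annotation = QuantizationAnnotation(
    output_qspec=act_qspec,
    allow_implicit_sharing=False,  # keep a separate observer for this output
)
```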
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_allow_implicit_sharing
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112929
Approved by: https://github.com/kimishpatel
Summary: Previously we only copied over q/dq args for the per
tensor case. This was because the qparams for `quantize_per_tensor`
are literals while the qparams for `quantize_per_channel` are
`get_attr` nodes (tensors), which disappear from the original
nodes in the graph after subgraph rewriting.
However, this is problematic because, in the per channel case,
not all q/dq args are tensors. In particular, the args after
the qparams (axis, qmin, qmax, dtype) are all literals. For
these literal args we simply used the hardcoded ones
(0, -127, 127, torch.int8 respectively), even if the user
explicitly specified to use a different weight dtype. This
commit fixes this by copying over these literal args for the
per channel case as well.
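For reference, a hedged sketch of the per-channel argument layout discussed above (the _decomposed import registers the ops; exact signatures may vary by version):
```python
import torch
import torch.ao.quantization.fx._decomposed  # registers quantized_decomposed ops

x = torch.randn(2, 3)
scales = torch.tensor([0.1, 0.2])
zero_points = torch.tensor([0, 0])
# scales/zero_points are tensors (get_attr nodes in the graph); the trailing
# axis/qmin/qmax/dtype args are the literals this commit copies over.
q = torch.ops.quantized_decomposed.quantize_per_channel(
    x, scales, zero_points, 0, -127, 127, torch.int8
)
```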
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_per_channel_weight_custom_dtype
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112612
Approved by: https://github.com/jerryzh168
Summary: Previously QAT fusion assumes bias is not quantized.
This works for the existing XNNPACKQuantizer, but not for custom
quantizers that wish to quantize the bias. This commit supports
this by adding the necessary patterns. This requires refactoring
the code, however, since it previously assumed that there will
only be one pair of q-dq (from conv weight) in the matched
pattern, and this is no longer true.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_conv_bn_bias_derived_qspec
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D50856377](https://our.internmc.facebook.com/intern/diff/D50856377)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112528
Approved by: https://github.com/jerryzh168
Summary: This commit refactors q-dq patterns used in QAT fusion,
reducing code duplication. This is important for future efforts
to support quantizing bias.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112279
Approved by: https://github.com/jerryzh168
ghstack dependencies: #112159
Summary: Today, we have special handling for special qspecs like
`SharedQuantizationSpec` or `DerivedQuantizationSpec`, since these
qspecs refer to other nodes in the graph and these node references
need to be updated after replacement (since they referred to nodes
in the original graph that no longer exist in the new graph).
However, we only do the above for special nodes like conv, bn,
getitem, and relu. This doesn't cover the common use case of
having conv bias derive its qparams from those of conv input
activations and conv weight. This commit adds support for this
use case by also replacing the node references for these nodes.
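A minimal sketch (simplified derivation; the DerivedQuantizationSpec wiring to graph nodes is omitted) of conv bias deriving its qparams from the input activation and weight observers:
```python
import torch

def derive_bias_qparams_fn(obs_or_fqs):
    act_obs, weight_obs = obs_or_fqs
    act_scale, _ = act_obs.calculate_qparams()
    weight_scale, _ = weight_obs.calculate_qparams()
    # Conventional derivation: bias scale is the product of the two scales,
    # with a zero point of 0 for the int32 bias.
    bias_scale = act_scale * weight_scale
    bias_zero_point = torch.zeros_like(bias_scale, dtype=torch.int32)
    return bias_scale, bias_zero_point
```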
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_conv_bn_bias_derived_qspec
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D50697078](https://our.internmc.facebook.com/intern/diff/D50697078)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112159
Approved by: https://github.com/jerryzh168
Summary: att, after the SharedQuantizationSpec bug fix we are doing some checks beforehand; this simplifies the logic when we insert observers
Test Plan:
contbuild & OSS CI, see bf998a2c5d
Test plan from GitHub:
python test/test_quantization.py TestQuantizePT2E
CIs
Differential Revision: D50816224
Pulled By: jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112453
Approved by: https://github.com/andrewor14
Summary:
att, after the SharedQuantizationSpec bug fix we are doing some checks beforehand; this simplifies the logic when we insert observers
Test Plan:
python test/test_quantization.py TestQuantizePT2E
CIs
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111828
Approved by: https://github.com/kimishpatel
ghstack dependencies: #111827
This pull request addresses an inconsistency in the representation of the Hadamard product across PyTorch documentation. Currently, the notation varies among different modules:
- In `torch.nn.LSTM` documentation the Hadamard product is represented with $\odot$
- In `torch.nn.GRU` documentation the Hadamard product is represented with $*$
- In `torch.nn.LSTMCell` documentation the Hadamard product is represented with $*$
- In `torch.nn.GRUCell` documentation the Hadamard product is represented with $*$
- In `torch.ao.nn.quantized.dynamic.GRU` documentation the Hadamard product is represented with $*$
This PR proposes consistently representing the Hadamard product throughout the documentation to enhance clarity and align with established standards.
The notation $\odot$ will be uniformly adopted, following the convention in the [Deep Learning Book](https://www.deeplearningbook.org/contents/linear_algebra.html).
**Changes Made:**
- Modified `torch.nn.GRU` documentation to represent the Hadamard product with $\odot$
- Modified `torch.nn.LSTMCell` documentation to represent the Hadamard product with $\odot$
- Modified `torch.nn.GRUCell` documentation to represent the Hadamard product with $\odot$
- Modified `torch.ao.nn.quantized.dynamic.GRU` documentation to represent the Hadamard product with $\odot$
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111763
Approved by: https://github.com/albanD
Summary:
Previously we did not really support this; this PR adds the support (a sketch of the transitivity appears after the list below).
Next:
* clean up insert observer logic
* add an allow_transitive_sharing boolean flag to allow people to turn this off for certain edges
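A minimal sketch (a hypothetical cat graph, not the PR's test) of the transitivity being supported: x1 shares with x0, and the cat output shares with x1, so all three end up with one shared observer.
```python
import torch
from torch.ao.quantization.quantizer import SharedQuantizationSpec

class M(torch.nn.Module):
    def forward(self, x0, x1):
        return torch.cat([x0, x1])

gm = torch.fx.symbolic_trace(M())
cat_node = next(n for n in gm.graph.nodes if n.target == torch.cat)
x0_node, x1_node = cat_node.args[0]

share_with_x0 = SharedQuantizationSpec((x0_node, cat_node))  # for the x1 edge
share_with_x1 = SharedQuantizationSpec((x1_node, cat_node))  # for cat's output
```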
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_shared_qspec_transitivity
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D50250789](https://our.internmc.facebook.com/intern/diff/D50250789)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111172
Approved by: https://github.com/kimishpatel