Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47415
nn.ReLU works for both float and quantized input, so we don't want to define an nn.quantized.ReLU
that does the same thing as nn.ReLU; similarly for nn.quantized.functional.relu.
This also removes the numerical inconsistency for models that quantize nn.ReLU independently in QAT mode.
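As a quick illustration of the point above, the same nn.ReLU module already handles both float and quantized tensors (a minimal sketch; the scale/zero_point values are arbitrary):
```
import torch

relu = torch.nn.ReLU()

x = torch.randn(4)
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=128, dtype=torch.quint8)

y_float = relu(x)   # float path
y_quant = relu(qx)  # quantized path, same module, no nn.quantized.ReLU needed
```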
Test Plan: Imported from OSS
Reviewed By: z-a-f
Differential Revision: D24747035
fbshipit-source-id: b8fdf13e513a0d5f0c4c6c9835635bdf9fdc2769
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47514
Previously the scale and zero_point were returned on the CPU even if
the input tensor was on the GPU.
This is because `copy_()` doesn't respect the device when copying over the tensor.
Also fixed a bug where we were always setting the device to 'cuda' (irrespective of the device id)
in the calculate_qparams function.
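A minimal sketch of the behavior this fixes (observer names per torch.quantization at the time of this PR; guarded on CUDA availability):
```
import torch
from torch.quantization import MinMaxObserver

if torch.cuda.is_available():
    obs = MinMaxObserver(dtype=torch.quint8).to("cuda:0")
    obs(torch.randn(4, 4, device="cuda:0"))
    scale, zero_point = obs.calculate_qparams()
    # With the fix, the qparams respect the device (and device id) of the
    # observed tensor instead of coming back on the CPU.
    assert scale.device == zero_point.device == torch.device("cuda:0")
```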
Test Plan:
python test/test_quantization.py TestObserver.test_observer_qparams_respects_device_affinity
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24800495
fbshipit-source-id: d7a76c59569842ed69029d0eb4fa9df63f87e28c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47423
Since the dtype of this fake_quant is `quint8`, the output range should be
from 0 to 255. Fixing. This should address the numerical inaccuracies with
sigmoid and hardsigmoid with `FixedQParamsFakeQuantize` attached compared
to their quantized counterparts.
In a future PR, it might be safer to also make the activation functions
that use `FixedQParamsFakeQuantize` explicitly specify their expected
output range and zero_point. Leaving that for later, as this bugfix
should be landed urgently.
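For reference, a minimal sketch of the corrected range using the functional fake-quantize op (the scale/zero_point here illustrate the fixed-qparams scheme for a [0, 1) output such as sigmoid):
```
import torch

x = torch.sigmoid(torch.randn(100))
scale, zero_point = 1.0 / 256.0, 0  # fixed qparams for a [0, 1) output
# For quint8 the quantized range is [0, 255]; the bug used a narrower range.
y = torch.fake_quantize_per_tensor_affine(x, scale, zero_point, quant_min=0, quant_max=255)
```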
Test Plan:
Manual script which gives low SQNR before this PR and high SQNR after
this PR: https://gist.github.com/vkuzo/9906bae29223da72b10d6b6aafadba42
https://github.com/pytorch/pytorch/pull/47376, which can be landed after
this, adds a proper test.
Imported from OSS
Reviewed By: ayush29feb, jerryzh168
Differential Revision: D24751497
fbshipit-source-id: 4c32e22a30116caaceeedb4cd47146d066054a89
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47068
Filter the dtype config before performing the quantization in linear
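A hypothetical sketch of the filtering idea (the function name and dtype table are illustrative, not the actual implementation):
```
import torch

# (activation_dtype, weight_dtype) pairs that quantized linear supports;
# illustrative values only.
SUPPORTED_LINEAR_DTYPES = {
    (torch.quint8, torch.qint8),  # static quantization
    (torch.float, torch.qint8),   # dynamic quantization
}

def should_quantize_linear(activation_dtype, weight_dtype):
    # Leave linear in fp32 when the configured dtype combination is unsupported.
    return (activation_dtype, weight_dtype) in SUPPORTED_LINEAR_DTYPES
```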
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D24627907
fbshipit-source-id: 162fa47b3fcf6648049f8bc0438e41ee97ac19e9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47085
In both train and eval mode.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24632457
fbshipit-source-id: 486aee4e073fb87e9da46a344e8dc77e848a60cf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47032
These are not top-level APIs and are not supposed to be called directly by users.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24610602
fbshipit-source-id: c5510f06b05499387d70f23508470b676aea582c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46955
Initially we were thinking of adding an `invalidate_quantized_float_parameters` option to free the memory
of the floating point parameters of quantized modules, but it turns out we will do a module swap, just like in
eager mode, for the modules that are quantized, so the old floating point module will not be referenced after
quantization. Therefore this feature is only needed for functionals, and since most people use quantization
with modules we may not need it. We'll revisit if we find there is a need for this.
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D24579400
fbshipit-source-id: fbb0e567405dc0604a2089fc001573affdade986
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46786
Previously we only supported static quant; this PR adds support for other types of quantization.
Note that QAT is actually orthogonal to these quant types; this refers to the convert step, where we
convert the observed module to a quantized module.
For QAT, the user will provide a CustomModule -> FakeQuantizedCustomModule mapping in prepare_custom_config_dict
and a FakeQuantizedCustomModule -> static/dynamic/weight_only quantized CustomModule mapping in convert_custom_config_dict.
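A sketch of what the two config dicts could look like under this scheme (all class names are hypothetical stubs; the dict keys follow the FX graph mode custom-module convention, keyed by quant type):
```
import torch

class CustomModule(torch.nn.Module): ...           # hypothetical float module
class ObservedCustomModule(torch.nn.Module): ...   # observed / fake-quantized version
class QuantizedCustomModule(torch.nn.Module): ...  # statically quantized version

prepare_custom_config_dict = {
    "float_to_observed_custom_module_class": {
        "static": {CustomModule: ObservedCustomModule},
    },
}
convert_custom_config_dict = {
    "observed_to_quantized_custom_module_class": {
        "static": {ObservedCustomModule: QuantizedCustomModule},
    },
}
```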
Test Plan: Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D24514701
fbshipit-source-id: 2918be422dd76093d67a6df560aaaf949b7f338c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46751
Currently we assume the first input for add/mul is a Node (Tensor), but that might not be the case.
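A minimal module that hits the scalar-first case:
```
import torch

class M(torch.nn.Module):
    def forward(self, x):
        # The first argument to add is a Python scalar, not a Node/Tensor.
        return 2.0 + x
```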
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_quantized_add
python test/test_quantization.py TestQuantizeFxOps.test_quantized_mul
python test/test_quantization.py TestQuantizeFxOps.test_quantized_add_relu
python test/test_quantization.py TestQuantizeFxOps.test_quantized_mul_relu
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D24494456
fbshipit-source-id: ef5e23ba60eb22a57771791f4934306b25c27c01
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46895
Bug: models after the FX graph mode quant prepare step lost information,
such as the extra attributes defined in `Quantizer.save_state`,
if the user performed `copy.deepcopy` on them. The information was lost
because `GraphModule` does not copy attributes which are not present on
`nn.Module` by default.
Fix: define a custom `__deepcopy__` method on observed models and
whitelist the attributes we care about.
This is needed because users sometimes run `copy.deepcopy` on their
models during non-quantization related preparations, and we should make
sure that quantization related state survives these calls.
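A sketch of the user flow the fix protects (API paths as of this PR; the qconfig choice is illustrative):
```
import copy
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx

model = torch.nn.Sequential(torch.nn.Linear(4, 4)).eval()
prepared = prepare_fx(model, {"": get_default_qconfig("fbgemm")})

# Before the fix, deepcopy silently dropped the extra quantization state
# stored on the observed GraphModule; now it survives the copy.
prepared_copy = copy.deepcopy(prepared)
```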
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_deepcopy
python test/test_quantization.py TestQuantizeFx.test_standalone_module
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D24556035
fbshipit-source-id: f7a6b28b6d2225fa6189016f967f175f6733b124
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46346
Allow users to provide additional fusion/quant patterns for FX graph mode.
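A hedged sketch of how an additional pattern might be passed in (the handler stub is hypothetical; the real handler base class is internal to the FX quantization code):
```
import torch

class MyQuantizeHandler:  # hypothetical stand-in for an FX quantize handler
    pass

prepare_custom_config_dict = {
    # pattern -> handler, merged with the default quantization patterns
    "additional_quant_pattern": {torch.nn.GELU: MyQuantizeHandler},
}
```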
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24317437
fbshipit-source-id: 719927cce50c74dffa4f848bd5c98995c944a26a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46345
Allow users to add more fusion mappings.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24317439
fbshipit-source-id: 3b144bbc305e41efbdf3e9fb25dbbeaad9e86c6a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46657
This is used to simulate fake quantize operation for ops with fixed quantization parameters
e.g. hardsigmoid
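A minimal construction sketch (import path and arguments as of this PR; the scale/zero_point values illustrate a fixed [0, 1) output range):
```
import torch
from torch.quantization import FixedQParamsFakeQuantize

# hardsigmoid maps inputs into [0, 1], so its quantization parameters are
# fixed and need no observer.
fq = FixedQParamsFakeQuantize(
    scale=1.0 / 256.0, zero_point=0, dtype=torch.quint8, quant_min=0, quant_max=255
)
y = fq(torch.randn(8))
```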
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24451406
fbshipit-source-id: 26cc140c00f12bdec9a8f9dc880f4c425f4d4074
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45538
This is used to simulate fake quantize operation for ops with fixed quantization parameters
e.g. hardsigmoid
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24004795
fbshipit-source-id: fc4797f80842daacd3b3584c5b72035774634edd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46337
We plan to pass around the mappings instead of using a global registration API, to keep
the mappings local to the transformations the user is performing.
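A sketch of the direction using the eager mode API, which already accepts an explicit mapping (module choice and qconfig are illustrative):
```
import torch
from torch.quantization import convert, prepare, default_qconfig

model = torch.nn.Sequential(torch.nn.Linear(4, 4)).eval()
model.qconfig = default_qconfig
prepared = prepare(model)
prepared(torch.randn(2, 4))  # calibrate

# Pass the float -> quantized module mapping explicitly instead of relying
# on a global registration API.
my_mapping = {torch.nn.Linear: torch.nn.quantized.Linear}
quantized = convert(prepared, mapping=my_mapping)
```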
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24317436
fbshipit-source-id: 81569b88f05eeeaa9595447e482a12827aeb961f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46298
Allow users to specify a list of qualified names for non-traceable submodules,
or the types of the non-traceable submodules.
See quantize_fx.py for the API.
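A sketch of the config (key names per quantize_fx.py; the submodule here is a hypothetical example of code that breaks symbolic tracing):
```
import torch

class NonTraceableBlock(torch.nn.Module):
    def forward(self, x):
        # data-dependent control flow defeats symbolic tracing
        return x if x.sum() > 0 else -x

prepare_custom_config_dict = {
    # skip tracing by qualified submodule name...
    "non_traceable_module_name": ["backbone.custom_block"],
    # ...or by submodule type
    "non_traceable_module_class": [NonTraceableBlock],
}
```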
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24294210
fbshipit-source-id: eb1e309065e3dfbf31e63507aaed73587f0dae29
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46504
As titled, so we can start seeing who is using this.
Test Plan: CI
Reviewed By: hx89
Differential Revision: D24375254
fbshipit-source-id: ff7b5560d0a6a175cecbf546eefc910759296dbb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46325
Otherwise, mutating them would make the uses/users lists inaccurate.
You can still mutate the node by assigning a new value to `.args` or `.kwargs`.
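A minimal sketch of the supported mutation path (using torch.fx's public tracing API):
```
import torch.fx

def f(x):
    return x + 1

gm = torch.fx.symbolic_trace(f)
for node in gm.graph.nodes:
    if node.op == "call_function":
        # node.args is immutable; assign a new tuple instead of mutating in place
        node.args = (node.args[0], 2)
gm.recompile()
```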
Test Plan: Imported from OSS
Reviewed By: jamesr66a
Differential Revision: D24308672
Pulled By: zdevito
fbshipit-source-id: a5305e1d82668b36e46876c3bc517f6f1d03dd78