Commit Graph

1079 Commits

Author SHA1 Message Date
Andrew Or
c7b4eec233 [Quant][fx][bc-breaking] Replace qconfig_dict with a config object (#78452)
**Summary:** Previously, FX graph mode quantization configurations
were specified through a dictionary of qconfigs. However, this
API was not in line with other core APIs in PyTorch. This commit
replaces the dictionary with a config object that users create
and pass to prepare and convert. This leads to better type safety
and a better user experience in notebook settings due to improved
autocompletion.

The new API is as follows:

```
from torch.ao.quantization import QConfigMapping
from torch.ao.quantization.quantize_fx import prepare_fx

qconfig_mapping = (QConfigMapping()
    .set_global(qconfig)
    .set_object_type(torch.nn.Linear, qconfig)
    .set_module_name_regex("foo.*bar", qconfig)
    .set_module_name("mod", qconfig))

# example_inputs is required as of https://github.com/pytorch/pytorch/pull/77608
prepare_fx(model, qconfig_mapping, example_inputs)
```

For backwards compatibility, `prepare_fx`, `prepare_qat_fx`,
and `convert_fx` will continue to accept qconfig_dicts, which
will be converted to `QConfigMapping`s internally.
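
As a hedged illustration (not from the original commit message), the legacy dict form looks like this; the field names follow the pre-existing qconfig_dict convention, and `example_inputs` is assumed per the current prepare_fx signature:

```
# legacy qconfig_dict, still accepted but deprecated; converted to a
# QConfigMapping internally
qconfig_dict = {
    "": qconfig,                                   # global
    "object_type": [(torch.nn.Linear, qconfig)],
    "module_name_regex": [("foo.*bar", qconfig)],
    "module_name": [("mod", qconfig)],
}
prepare_fx(model, qconfig_dict, example_inputs)
```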

Note that this commit does not modify existing tests to use the
new API; they will continue to pass in qconfig_dict as before,
which still works but triggers a deprecation warning. This will
be handled in a future commit.

**Test Plan:**
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

**Reviewers:** jerryzh168, vkuzo

**Subscribers:** jerryzh168, vkuzo

Differential Revision: D36747998

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78452
Approved by: https://github.com/jerryzh168
2022-05-30 18:30:07 +00:00
Jerry Zhang
8225f42a8a [quant][fx][equalization] Fix example_inputs follow ups in test_equalize_fx
Summary:
As a follow-up to https://github.com/pytorch/pytorch/pull/76496, we defined model-specific example_inputs
for the test models in common_quantization.py and used these in test_equalize_fx.

Test Plan:
python test/test_quantization.py TestEqualizeFx

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78314

Approved by: https://github.com/vkuzo
2022-05-26 01:42:24 +00:00
Jerry Zhang
7ea5fa3dd4 [reland][quant] Add utility function get_fqn_to_example_inputs
Summary:
After https://github.com/pytorch/pytorch/pull/77608, `example_inputs` is a required input for `prepare_fx` and `prepare_qat_fx`.
This makes quantizing submodules harder, so we added this utility function to get a dictionary mapping each submodule's fqn to its example_inputs.

Example Call:

```
example_inputs = (tensor0,)
get_fqn_to_example_inputs(m, example_inputs)
```

Example output:
```
{
   "linear1": (tensor1,),
   "linear2": (tensor2,),
   "sub": (tensor3,),
   "sub.linear1": (tensor4,),
   ...
}
```
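
As a hedged usage sketch (the import path and the `qconfig_dict` here are assumptions, not from the original message), the returned dictionary can be used to prepare a single submodule:

```
from torch.ao.quantization.quantize_fx import prepare_fx
# assumed import location for the new utility:
from torch.ao.quantization.utils import get_fqn_to_example_inputs

fqn_to_example_inputs = get_fqn_to_example_inputs(m, (tensor0,))
# quantize just the "sub" submodule with its derived example inputs
m.sub = prepare_fx(m.sub, qconfig_dict,
                   example_inputs=fqn_to_example_inputs["sub"])
```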

Test Plan:
python test/test_quantization.py TestUtils

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78286

Approved by: https://github.com/dzdang
2022-05-25 23:31:51 +00:00
Jerry Zhang
716f76716a [quant] Skip some broken tests due to hypothesis
Summary:
Some quantization tests failed even though no code related to them was touched. All of them
use hypothesis, so hypothesis is likely the problem. We will skip these tests for now, and
gradually either remove all hypothesis tests from the quantization test code or skip running the hypothesis tests in CI.

Test Plan:
ossci

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78302

Approved by: https://github.com/suo, https://github.com/dzdang
2022-05-25 21:46:11 +00:00
Vasiliy Kuznetsov
53e05ad4b2 ns for fx: remove restriction on nodes with no args and only kwargs
Summary:

Removes the restriction in NS for FX on handling nodes which have
no positional arguments, such as `F.linear(input=x, weight=w, bias=b)`.

In order to achieve this, we delete all places in the code which
were doing things like

```
node.args[0]
```

And replace them with

```
_get_normalized_nth_input(node, gm, 0)
```

The `_get_normalized_nth_input` function is a best-effort way to
get the nth normalized input.

This is needed because some FX tools output nodes normalized to
be kwargs only, and we need to be able to handle this in NS.
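
A hedged sketch of what such a best-effort lookup might do (illustrative; not the actual implementation in torch.ao.ns):

```
import inspect
import torch.fx

def nth_input_best_effort(node: torch.fx.Node, n: int):
    # positional args win outright
    if n < len(node.args):
        return node.args[n]
    # otherwise, map position n to a kwarg name via the target's signature,
    # e.g. F.linear(input=x, weight=w, bias=b) -> position 0 is "input"
    if node.op == "call_function":
        try:
            params = list(inspect.signature(node.target).parameters)
            return node.kwargs.get(params[n])
        except (ValueError, IndexError):
            pass  # signature unavailable for some torch overloads
    return None
```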

Test plan:

```
python test/test_quantization.py -k test_linear_kwargs_shadow
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78181

Approved by: https://github.com/z-a-f, https://github.com/hx89
2022-05-25 17:00:39 +00:00
PyTorch MergeBot
87148f2b59 Revert "[quant] Add utility function get_fqn_to_example_inputs"
This reverts commit 50a44fe461.

Reverted https://github.com/pytorch/pytorch/pull/78146 on behalf of https://github.com/suo because it broke master
2022-05-25 06:37:32 +00:00
Jerry Zhang
50a44fe461 [quant] Add utility function get_fqn_to_example_inputs
Summary:
After https://github.com/pytorch/pytorch/pull/77608, `example_inputs` is a required input for `prepare_fx` and `prepare_qat_fx`.
This makes quantizing submodules harder, so we added this utility function to get a dictionary mapping each submodule's fqn to its example_inputs.

Example Call:

```
example_inputs = (tensor0,)
get_fqn_to_example_inputs(m, example_inputs)
```

Example output:
```
{
   "linear1": (tensor1,),
   "linear2": (tensor2,),
   "sub": (tensor3,),
   "sub.linear1": (tensor4,),
   ...
}
```

Test Plan:
python test/test_quantization.py TestUtils

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78146

Approved by: https://github.com/vkuzo
2022-05-25 03:07:16 +00:00
dzdang
2aad28a539 [quant][core][gpu][feature] Implemented quantized cuda gelu
Summary:
Support for quantized cuda gelu has been provided by using
`dequantize -> fp32 cuda gelu kernel -> quantize`. Mathematically, this
is not equivalent to doing int8 gelu, so we have opted for this approach
for now. It might be possible to write a variant of the int8 gelu that's
equivalent to `dequantize -> fp32 cuda gelu kernel -> quantize`, which
can be a topic for future work.
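
A hedged reference sketch of the pattern described above (illustrative only; reusing the input's qparams for the output is a simplification, and this is not the actual kernel):

```
import torch
import torch.nn.functional as F

def quantized_gelu_reference(qx: torch.Tensor) -> torch.Tensor:
    x = qx.dequantize()   # int8 -> fp32
    y = F.gelu(x)         # existing fp32 cuda gelu kernel
    # fp32 -> int8; reusing the input's quantization parameters is a
    # simplification for illustration
    return torch.quantize_per_tensor(y, qx.q_scale(), qx.q_zero_point(), qx.dtype)
```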

Test function `test_qgelu` was amended to test gelu for quantized cuda
backends.

Test Plan:
```
python test/test_quantization.py -k test_qgelu
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77212

Approved by: https://github.com/jerryzh168
2022-05-24 22:31:45 +00:00
Jerry Zhang
416899d1a9 [quant][fx][bc-breaking] Add required example_args argument to prepare_fx and prepare_qat_fx (#249) (#77608)
Summary:
X-link: https://github.com/facebookresearch/d2go/pull/249

X-link: https://github.com/fairinternal/ClassyVision/pull/104

X-link: https://github.com/pytorch/benchmark/pull/916

X-link: https://github.com/facebookresearch/ClassyVision/pull/791

X-link: https://github.com/facebookresearch/mobile-vision/pull/68

FX Graph Mode Quantization needs to know whether an fx node is a floating point Tensor before it can decide whether to
insert observer/fake_quantize module or not, since we only insert observer/fake_quantize module for floating point Tensors.
Currently we have some hacks to support this by defining some rules like NON_OBSERVABLE_ARG_DICT (https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/fx/utils.py#L496), but this approach is fragile and we do not plan to maintain it long term in the pytorch code base.

As we discussed in the design review, we'd need to ask users to provide sample args and sample keyword args
so that we can infer the types in a more robust way. This PR starts by changing the prepare_fx and prepare_qat_fx APIs to require users to provide
example arguments through example_inputs. Note this API doesn't support kwargs; kwargs could make https://github.com/pytorch/pytorch/pull/76496#discussion_r861230047 (comment) simpler, but
that case will be rare, and even then we can still work around it with positional arguments. Also, torch.jit.trace (https://pytorch.org/docs/stable/generated/torch.jit.trace.html) and ShapeProp (https://github.com/pytorch/pytorch/blob/master/torch/fx/passes/shape_prop.py#L140) take only positional args, so we'll use a single example_inputs argument for now.

If needed, we can extend the API with an optional example_kwargs, e.g. for cases where forward takes many arguments and it makes more sense to
pass them by keyword.

BC-breaking Note:
Before:
```python
m = resnet18(...)
m = prepare_fx(m, qconfig_dict)
# or
m = prepare_qat_fx(m, qconfig_dict)
```
After:
```python
m = resnet18(...)
m = prepare_fx(m, qconfig_dict, example_inputs=(torch.randn(1, 3, 224, 224),))
# or
m = prepare_qat_fx(m, qconfig_dict, example_inputs=(torch.randn(1, 3, 224, 224),))
```

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels

Imported from OSS

Reviewed By: vkuzo, andrewor14

Differential Revision: D35984526

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77608
Approved by: https://github.com/dzdang
2022-05-21 21:03:48 +00:00
Zafar
44c91383d3 [quant][ao_migration] Base package in tests
Adding a base package as an argument to the testing routines.
That will allow us to test other locations that are being migrated.
For example

```
AOMigrationTestCase._test_package_import('my_package', base='quantization')
```

would check if `torch.quantization.my_package` and `torch.ao.quantization.my_package` are the same.
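
A hedged sketch of what such a check might look like (names illustrative, not the actual AOMigrationTestCase code):

```
import importlib

def check_migrated_package(name: str, base: str = "quantization"):
    old = importlib.import_module(f"torch.{base}.{name}")
    new = importlib.import_module(f"torch.ao.{base}.{name}")
    # the legacy location should re-export the same objects as the ao one
    for attr in dir(new):
        if not attr.startswith("_"):
            assert getattr(old, attr) is getattr(new, attr)
```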

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77064

Approved by: https://github.com/jerryzh168
2022-05-20 18:43:37 +00:00
Xiang Gao
f274558018 Bitwise ops improvements (#77621)
- Bitwise shift: remove floating point support
- Bitwise and, or, xor: add (Scalar, Tensor) overloads
- Use `test_ops.py` to test these ops, including error cases
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77621
Approved by: https://github.com/ngimel
2022-05-17 21:16:42 +00:00
PyTorch MergeBot
981719fe5a Revert "[quant][core][gpu][feature] Implemented quantized cuda gelu"
This reverts commit b892b85b88.

Reverted https://github.com/pytorch/pytorch/pull/77212 on behalf of https://github.com/facebook-github-bot
2022-05-14 00:17:51 +00:00
dzdang
b892b85b88 [quant][core][gpu][feature] Implemented quantized cuda gelu
Summary:
Support for quantized cuda gelu has been provided by using
`dequantize -> fp32 cuda gelu kernel -> quantize`. Mathematically, this
is not equivalent to doing int8 gelu, so we have opted for this approach
for now. It might be possible to write a variant of the int8 gelu that's
equivalent to `dequantize -> fp32 cuda gelu kernel -> quantize`, which
can be a topic for future work.

Test function `test_qgelu` was amended to test gelu for quantized cuda
backends.

Test Plan:
```
python test/test_quantization.py -k test_qgelu
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77212

Approved by: https://github.com/jerryzh168
2022-05-13 20:59:24 +00:00
Vasiliy Kuznetsov
d8479098a6 ns for fx: remove quantized ReLU6 from mapping
Summary:

This module is no longer swapped by FX graph mode quantization,
because it can take quantized inputs. Removing it from NS for FX
mappings.

Test plan:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76992

Approved by: https://github.com/jerryzh168
2022-05-13 20:38:31 +00:00
Vasiliy Kuznetsov
6a33b80191 ns for fx: remove GroupNorm from mapping
Summary:

GroupNorm quantization is defined but it looks like FX graph
mode quantization does not have it enabled.

Removing it from NS for FX.

Test plan:

```
python test/test_quantization.py -k FXGraphMatcher
python test/test_quantization.py -k FXNumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76991

Approved by: https://github.com/jerryzh168
2022-05-13 20:33:27 +00:00
Vasiliy Kuznetsov
20b75e3e5f ns for fx: clean up convtranspose mappings
Summary:

Fixes a couple of problems with `ConvTranspose` in NS mappings:
1. deletes the dynamic versions, as they do not work yet
2. deletes `ConvTranspose3d`, as it's not swapped yet in the quantization workflow
3. removes a duplicate set

Test plan:

```
python test/test_quantization.py -k FXGraphMatcher
python test/test_quantization.py -k FXNumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76980

Approved by: https://github.com/jerryzh168
2022-05-13 20:22:42 +00:00
Jiayi Sun
e867831b84 extend replaceConvolutionWithAtenConv to handle conv_transpose3d (#76888)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76888
Approved by: https://github.com/eellison
2022-05-13 16:40:12 +00:00
dzdang
af80329ca9 [quant][core][gpu][feature] Implemented quantized conv1d cudnn op
Summary:
Previously, only the quantized conv2d cudnn op had been implemented. This PR
implements the 1d variant. Because cuDNN does not have direct support
for conv1d, we cast the 1d case to a 2d case by adding a dummy
dimension of size 1 to the input and weight tensors. This is
analogous to how it was done for quantized cpu conv1d (e.g., see
`quantized/cpu/qconv.cpp`)
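
A hedged sketch of the 1d-to-2d trick in plain (unquantized) terms, for illustration:

```
import torch
import torch.nn.functional as F

def conv1d_via_conv2d(x, weight, bias=None, stride=1, padding=0, dilation=1):
    x2 = x.unsqueeze(2)       # (N, C, L)  -> (N, C, 1, L)
    w2 = weight.unsqueeze(2)  # (O, I, K)  -> (O, I, 1, K)
    y2 = F.conv2d(x2, w2, bias, stride=(1, stride),
                  padding=(0, padding), dilation=(1, dilation))
    return y2.squeeze(2)      # (N, O, 1, L') -> (N, O, L')
```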

A corresponding test case was added in `test_quantized_op.py`. This
function should ideally be merged with `test_qconv1d` when cuDNN flags are
enabled and available in pytorch.

Test Plan:
```
python test/test_quantization.py -k test_qconv1d_cudnn
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77175

Approved by: https://github.com/jerryzh168
2022-05-12 03:25:30 +00:00
dzdang
1d7b294574 [quant][better-engineering][bc-breaking] Removed quant_min/quant_max from fake_quant modules
Summary:
FakeQuantize class has quant_min/quant_max and activation_post_process
attributes, the latter of which already includes quant_min/max. As such,
we can remove quant_min/quant_max from FakeQuantize and use
FakeQuantize.activation_post_process.quant_m* directly.
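
A hedged sketch of reading the values through `activation_post_process` after this change (the constructor arguments shown are assumptions based on the existing FakeQuantize API):

```
import torch
from torch.ao.quantization.fake_quantize import FakeQuantize
from torch.ao.quantization.observer import MovingAverageMinMaxObserver

fq = FakeQuantize(observer=MovingAverageMinMaxObserver,
                  quant_min=0, quant_max=255, dtype=torch.quint8)
# quant_min/quant_max now live only on the observer instance
print(fq.activation_post_process.quant_min)  # 0
print(fq.activation_post_process.quant_max)  # 255
```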

Test plan:
```
python test/test_quantization.py
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76674

Approved by: https://github.com/vkuzo
2022-05-11 14:23:05 +00:00
Vasiliy Kuznetsov
3a8752db86 ns for fx: skip shadowing ops if copy subgraph is not implemented (#76663)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76663

Subgraph copy does not handle all edge cases. Handling them all would
take significant engineering time, and currently an unhandled edge case crashes
the script.

This PR adds a function to check if the subgraph copy is supported,
and skips shadowing if it is not supported. This way the model
can still go through the shadowing APIs without an exception.

Test Plan:
```
python test/test_quantization.py -k FXNumericSuite
```

Reviewed By: hx89

Differential Revision: D36069304

Pulled By: vkuzo

fbshipit-source-id: 6b38b8d8e43396a4cf2373b247223a19d451d096
(cherry picked from commit e2322ca0635c51a4701e60fa90f77915a3c46d0f)
2022-05-05 13:19:53 +00:00
Vasiliy Kuznetsov
d3e338935a ns for fx: skip shadowing for torch.cat, and also for nodes with only kwargs (#76561)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76561

User model had syntax like `torch.cat(tensors=[x])`. This PR fixes two errors
to unbreak this in the NS shadow model (a sketch of the resulting skip check follows the list):
1. skip nodes which only have kwargs (instead of throwing an exception)
2. explicitly skip shadowing of `torch.cat` (since it's not supported anyway)
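
A hedged sketch of the combined skip condition (illustrative; not the actual NS code):

```
import torch

UNSHADOWABLE_FUNCTIONS = {torch.cat}  # illustrative set

def should_skip_shadowing(node) -> bool:
    # (1) kwargs-only calls such as torch.cat(tensors=[x]);
    # (2) ops for which shadowing is not implemented
    return len(node.args) == 0 or node.target in UNSHADOWABLE_FUNCTIONS
```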

Test Plan:
```
python test/test_quantization.py -k test_op_with_only_kwargs_skips_shadowing
python test/test_quantization.py -k test_op_mul_add_cat_skips_shadowing
```

Reviewed By: hx89

Differential Revision: D36017356

Pulled By: vkuzo

fbshipit-source-id: 0da4840a62c2dac183f8294c2cec4fce262474b3
(cherry picked from commit 88409c1576e7f690708957b2baa285fc7961e9d6)
2022-05-05 13:19:53 +00:00
dzdang
e2aa28a2d0 [quant][fx][improvement] Renamed default_affine_fixed_qparams_observer and default_symmetric_fixed_qparams_observer (#76637)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76637

The previous names `default_affine_fixed_qparams_observer`
and `default_symmetric_fixed_qparams_observer` were uninformative: users had to read
the definitions in order to understand what these observers are. The new
names reveal information about the range of the observers.

The analogous changes were also made for
`default_symmetric_fixed_qparams_fake_quant` and
`default_affine_fixed_qparams_fake_quant`

Test Plan:
```
python test/test_quantization.py
```

Differential Revision: D36054169

Reviewed By: vkuzo

Pulled By: dzdang

fbshipit-source-id: 215f7786a4b7abda7327f17cc61735697ec5cca9
(cherry picked from commit 21a4e6eda4467c8adca7fd534a506a14e975f9cf)
2022-05-04 02:39:20 +00:00
Vasiliy Kuznetsov
e155e2584a ns for fx: skip operator.add and operator.mul when shadowing (#76504)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76504

Shadowing for add and mul is not implemented, this PR fixes the skipping
logic to also skip the `operator.add` and `operator.mul` flavor of these
operators.

Test Plan:
```
python test/test_quantization.py -k test_mul_add_skips_shadowing
```

Reviewed By: dzdang

Differential Revision: D35985997

Pulled By: vkuzo

fbshipit-source-id: f832e54a5461d3b182df4bb905357d6c66742e98
(cherry picked from commit 93ae9592f68873865ebfdc438bffb1c9486dd1c1)
2022-05-03 05:58:46 +00:00
Vasiliy Kuznetsov
31d5a300ac quant: make RecordingObserver inherit from ObserverBase (#76460)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76460

`RecordingObserver` inherits from `_ObserverBase` but does not use any functionality
from it. Making it inherit from `ObserverBase` instead.

This will make it simpler to rename `_ObserverBase` to something more meaningful in the next PR.

Test Plan: CI

Reviewed By: jerryzh168

Differential Revision: D35976351

Pulled By: vkuzo

fbshipit-source-id: 19c106bf0d48607c231702e2e048f42a7f48a5c6
(cherry picked from commit 4fd44123b0e9bcdcae546aecabe80d7642129cf5)
2022-05-03 05:53:54 +00:00
dzdang
8c47e9dc81 [quant][core][gpu][improvement] Added support for padding quantized cudnn conv2d operator
Summary:
cudnn v8.4.0 expects the input channels for conv2d to be a multiple of 4. If
they are not, we need to explicitly pad them to a multiple of 4 ourselves, as
cudnn does not currently support padding intrinsically.
The padding implemented here is limited to groups=1; however, this
should be a straightforward adaptation to groups > 1 since we're only
padding a single dimension.

When cudnn enables support for padding, we can remove the padding on our
end.
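
A hedged sketch of padding the channel dimension of an NCHW tensor up to a multiple of 4 (illustrative, groups == 1):

```
import torch.nn.functional as F

def pad_channels_to_multiple_of_4(x):
    pad_c = (4 - x.shape[1] % 4) % 4
    # F.pad pads from the last dim backwards:
    # (W_left, W_right, H_top, H_bottom, C_front, C_back)
    return F.pad(x, (0, 0, 0, 0, 0, pad_c)) if pad_c else x
```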

Test plan:
```
python test/test_quantization.py -k test_qconv2d_cudnn
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76184

Approved by: https://github.com/jerryzh168
2022-04-29 00:13:48 +00:00
dzdang
bbc263eb5d [quant][core][gpu][feature] Implemented quantized cuda adaptive average pool2d op (#76081)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76081

The current implementation of quantized cuda adaptive average pooling uses the following:
dequant -> fp32 adaptive average pooling -> quant. This is numerically the same as quantized adaptive average pooling, but it is not the ideal implementation, as we would rather operate on the quantized values directly. However, we are currently blocked on this as we are waiting for cudnn's 8.5.0 release, which is anticipated to support adaptive average pooling. When that support is made available, we will use it directly.

Test Plan:
```
python test/test_quantization.py TestQuantizedOps.test_adaptive_avg_pool
```

Differential Revision: D35768751

Reviewed By: jerryzh168

Pulled By: dzdang

fbshipit-source-id: ad06fd06d6941b92105bcabf0fd54b9e27a029d5
(cherry picked from commit 4e1805dd62a9d5e94c61340ac46bcd7aa4e49dd9)
2022-04-28 12:37:20 +00:00
dzdang
ad88816c86 [quant][core][gpu][feature] Added support for float->quantized cuda tensor copying (#76177)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76177

Previously, support for copying an fp tensor to a quantized tensor was
limited to CPU tensors. This PR extends the support to GPU tensors.
A corresponding test for cuda tensors was added to
test_qtensor_float_assignment.
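
A hedged sketch of the behavior under test (assumes a CUDA device is available):

```
import torch

qx = torch.quantize_per_tensor(torch.zeros(4, device="cuda"),
                               scale=0.1, zero_point=0, dtype=torch.quint8)
# copying a float tensor into a quantized tensor quantizes on the fly;
# previously this path was CPU-only
qx.copy_(torch.randn(4, device="cuda"))
```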

Test Plan:
```
python test/test_quantization.py -k test_qtensor_float_assignment
```
Imported from OSS

Differential Revision: D35817832

Reviewed By: jerryzh168

Pulled By: dzdang

fbshipit-source-id: e5a4a0bb2d8a56f3f1a88806a534b5cb38275cf2
(cherry picked from commit 9173e07b51bb1b853244b205ddf3e36000f01b64)
2022-04-28 02:23:14 +00:00
Vasiliy Kuznetsov
35545d85dc fx quant: add quantized Softmax workflow integration (#75106)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75106

In https://github.com/pytorch/pytorch/pull/75017 a quantized softmax
kernel was added. This PR adds the FX graph mode quantization workflow
integration to swap `nn.Softmax` to `nnq.Softmax`.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_fixed_qparams_ops
```

Reviewed By: kimishpatel, andrewor14

Differential Revision: D35324817

Pulled By: vkuzo

fbshipit-source-id: 710ae3bedf8a6ad1dc411cd9808fdd0ce743e757
(cherry picked from commit d67603c0fbb1d3469d97bd538cec38aa8b03324b)
2022-04-20 21:54:26 +00:00
dzdang
e20793b054 [quant][core][gpu][cudnn] Added support for nhwc tensors in quantized cudnn add_relu op (#75806)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75806

When using the quantized cudnn add operator, if the input tensors are 4D,
cudnn requires NHWC format in v8.4.0 (older versions may have relaxed this constraint).
Previously, all tensors defaulted to NCHW format.

Test Plan:
```
python test/test_quantization.py -k test_qadd_relu_cudnn
```

Reviewed By: vkuzo

Differential Revision: D35651368

Pulled By: dzdang

fbshipit-source-id: b6ce49cf100b88c6fa29513ec50b38d445c3c02f
(cherry picked from commit 5936fe6783a02827bd93feb80d137da508d6facc)
2022-04-20 13:48:40 +00:00
Salil Desai
c358c5d7d8 [PyTorch Edge] Using Qnnpack in Quantized Softmax Op (#75799)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75799

Use Qnnpack's quantized softmax in quantized::softmax op when available

Test Plan:
From fbcode
```buck test caffe2/test:quantization -- test_qsoftmax```

# Benchmarking

(Naive Quantized from D35469257 v1, rest D34996486 v14)

|Shape|Fp32|Naive Quantized|Qnnpack Quantized|Qnnpack Quantized with Permute|
|---|---|---|---|---|
|(1, 5, 49, 49)|[6.6757](https://www.internalfb.com/intern/aibench/details/916894241135767)|[7.5981](https://www.internalfb.com/intern/aibench/details/504937774229694)|[1.5579](https://www.internalfb.com/intern/aibench/details/197716001861453)|[2.8446](https://www.internalfb.com/intern/aibench/details/59311708375203)|
|(1, 9, 16, 128)|[7.8485](https://www.internalfb.com/intern/aibench/details/135980349949180)|[9.0499](https://www.internalfb.com/intern/aibench/details/10150813869685)|[1.8865](https://www.internalfb.com/intern/aibench/details/58396904565184)|[3.5282](https://www.internalfb.com/intern/aibench/details/24583753477273)|
|(1, 5, 49, 64)|[7.0626](https://www.internalfb.com/intern/aibench/details/232201930202347)|[8.1091](https://www.internalfb.com/intern/aibench/details/57639118425406)|[1.801](https://www.internalfb.com/intern/aibench/details/656994017385942)|[3.2989](https://www.internalfb.com/intern/aibench/details/518979104130992)|
|(1, 3, 196, 64)|[16.4717](https://www.internalfb.com/intern/aibench/details/895795134460898)|[18.1987](https://www.internalfb.com/intern/aibench/details/909875420196348)|[3.5657](https://www.internalfb.com/intern/aibench/details/206864227381228)|[8.4519](https://www.internalfb.com/intern/aibench/details/84462467166362)|
|(1, 6, 49, 128)|[15.9872](https://www.internalfb.com/intern/aibench/details/417436371026264)|[17.4556](https://www.internalfb.com/intern/aibench/details/183113464145486)|[3.3912](https://www.internalfb.com/intern/aibench/details/616978041358188)|[8.019](https://www.internalfb.com/intern/aibench/details/849820562672950)|
|(1, 3, 196, 196)|[47.3636](https://www.internalfb.com/intern/aibench/details/633568439089073)|[52.0079](https://www.internalfb.com/intern/aibench/details/742080402804069)|[8.5009](https://www.internalfb.com/intern/aibench/details/685773806433926)|[13.5807](https://www.internalfb.com/intern/aibench/details/871998384861927)|
|(1, 6, 16, 64)|[4.0205](https://www.internalfb.com/intern/aibench/details/380419433454222)|[4.5973](https://www.internalfb.com/intern/aibench/details/923432861470595)|[1.0569](https://www.internalfb.com/intern/aibench/details/176718883676884)|[2.0519](https://www.internalfb.com/intern/aibench/details/303780226597723)|
|(1, 6, 16, 16)|[1.8299](https://www.internalfb.com/intern/aibench/details/599824935422385)|[2.3109](https://www.internalfb.com/intern/aibench/details/669753943440643)|[0.808](https://www.internalfb.com/intern/aibench/details/956331973568963)|[1.6406](https://www.internalfb.com/intern/aibench/details/924887465284668)|
|(1, 9, 16, 49)|[4.5134](https://www.internalfb.com/intern/aibench/details/946070183169117)|[5.2282](https://www.internalfb.com/intern/aibench/details/623403709385332)|[2.8195](https://www.internalfb.com/intern/aibench/details/635876531473203)|[2.2251](https://www.internalfb.com/intern/aibench/details/507256033953952)|
|(1, 6, 49, 196)|[23.9811](https://www.internalfb.com/intern/aibench/details/605021113223196)|[26.2834](https://www.internalfb.com/intern/aibench/details/991778071254930)|[4.5338](https://www.internalfb.com/intern/aibench/details/626603993142478)|[9.3877](https://www.internalfb.com/intern/aibench/details/962263658487065)|

*table made with https://www.internalfb.com/intern/anp/view/?id=1714217&revision_id=686803042569716*

Reviewed By: kimishpatel

Differential Revision: D34953197

fbshipit-source-id: 57418757fce17903583c04dffd51c886f9e1bc0e
(cherry picked from commit 8978222623f0cbacdb0373c405136ec94c035da6)
2022-04-19 22:29:57 +00:00
arindamroy-eng
7478ce187a ROCm: Unskip more tests for ROCm 5.0
Re-enabling more tests which are now working on ROCm 5.0.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75353
Approved by: https://github.com/ezyang
2022-04-19 19:45:55 +00:00
dzdang
982be19638 [quant][core][gpu][improvement] Supported int8 matmul for quantized linear cudnn op
Summary:
This PR requires cudnn v8.4.0, which enables support for int8 matmul.
The previous implementation of the quantized linear cudnn operator used cudnn v8.3.3,
which did not have support for int8 matmul (we had to convert our int8 matmul to fp matmul).

Test plan:
```
python test/test_quantization.py -k test_qlinear_cudnn
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75418

Approved by: https://github.com/jerryzh168
2022-04-19 17:24:34 +00:00
Jerry Zhang
74454bdb46 [quant][fx] Move backend_config folder to torch.ao.quantization
Summary:
Following https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md, we implemented
the backend configuration for the fbgemm/qnnpack backend. Currently it lives under the fx folder, but we'd like to use it for all
workflows, including eager, fx graph, and define-by-run quantization, so this PR moves it to the torch.ao.quantization namespace
where it can be shared by the different workflows.
It also moves some fx-specific utility functions to fx/backend_config_utils.py; some files are kept in the fx folder (quantize_handler.py and fuse_handler.py).

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestAOMigrationQuantization
python test/test_quantization.py TestAOMigrationQuantizationFx

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75823

Approved by: https://github.com/vkuzo
2022-04-19 15:38:57 +00:00
dzdang
6dc71461e1 [quant][core][gpu][bug-fix] Added additional caching support in quantized cudnn add_relu op
Summary:
The previous caching strategy for the quantized cudnn add_relu operator was insufficient,
as it did not properly record all the necessary information. This PR adds several
items to the CacheKey (e.g., input sizes, input dimensions, etc.) to enable
proper caching.

Test plan:
```
python test/test_quantization.py -k test_qadd_relu_cudnn
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75772

Approved by: https://github.com/jerryzh168
2022-04-18 16:53:18 +00:00
dzdang
7d8b366223 [quant][improvement][gpu] Fixed errors in test_qlinear_cudnn
Summary:
Previously, test_qlinear_cudnn had some hard-coded parameters, which are now removed;
bias and relu are now enabled.

Test plan:
```
python test/test_quantization.py TestQuantizedLinear.test_qlinear_cudnn
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75446

Approved by: https://github.com/jerryzh168
2022-04-15 23:14:41 +00:00
Andrew Or
5dcbcc6de8 [Quant][fx] Fix get_default_qconfig_dict for fused modules
Summary: Calling `prepare_fx` with `get_default_qconfig_dict`
failed for models with fused modules, such as `ConvReLU2d`.
This commit fixes this by adding qconfig entries for ReLU
and BatchNorm as well.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qconfig_dict_with_fused_modules

Reviewers: jerryzh168

Subscribers: jerryzh168, vkuzo

Issue: https://github.com/pytorch/pytorch/issues/75825

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75838

Approved by: https://github.com/jerryzh168
2022-04-15 22:37:26 +00:00
dzdang
515d61f2fc [quant][core][bug fix] Corrected at::to(memory_format=...) support for quantized tensors
Summary:
Previously, at::to did not work properly for quantized tensors,
and we had to use at::contiguous instead. This PR allows us to use
at::to(memory_format=...) or torch.Tensor.to(memory_format=...)
on both the back- and front-ends.
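
A hedged sketch of what now works from the Python side:

```
import torch

qx = torch.quantize_per_tensor(torch.randn(1, 3, 4, 4), 0.1, 0, torch.quint8)
# memory-format conversion directly on a quantized tensor, instead of
# going through .contiguous(memory_format=...)
qx_nhwc = qx.to(memory_format=torch.channels_last)
assert qx_nhwc.is_contiguous(memory_format=torch.channels_last)
```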

Test plan:
python test/test_quantization.py -k test_qtensor_to_memory_format

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75540

Approved by: https://github.com/jerryzh168
2022-04-15 03:20:50 +00:00
Thiago Crepaldi
9bbe1d632e Fix ONNX ATen fallback for non-caffe2 engines
This PR introduces 3 BC changes:

First, this PR propagates the `BUILD_CAFFE2` flag to `libtorch` and `libtorch_python`, which is necessary for non-caffe2 ONNX runtimes when using the `ONNX_ATEN_FALLBACK` operator export type.

Second, as a complement of https://github.com/pytorch/pytorch/pull/68490, this PR refactors Caffe2's Aten ops symbolics to consider not only the `operator_export_type` (aka `ONNX_ATEN_FALLBACK`) to emit Caffe2 Aten ops, but also whether `BUILD_CAFFE2` (which is called `torch.onnx._CAFFE2_ATEN_FALLBACK` in python binding) is set.

Lastly, it renames `onnx::ATen` to `aten::ATen` for ONNX spec consistency in a BC fashion.
ONNX doesn't have an `ATen` op in its spec, but the PyTorch ONNX converter emits them. Non-Caffe2 backend engines would be misled by such an operator's name/domain. A non-ideal workaround would be to handle ATen ops based on name and ignore the (non-compliant) domain. Moreover, users could incorrectly file bugs against either ONNX or ONNX Runtime when they inspect the model and notice the presence of an unspecified ONNX operator.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73954
Approved by: https://github.com/BowenBao, https://github.com/malfet, https://github.com/garymm, https://github.com/jiafatom
2022-04-14 23:18:45 +00:00
Jerry Zhang
0c08fcff32 [quant][fx] Cleanup some unused states and args
Summary:
* Removed "patterns" from observed module since it's no longer needed
* Removed an arg from insert_observer
* Removed some unused keys in checking the validity of qconfig_dict

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75521

Approved by: https://github.com/andrewor14
2022-04-14 13:18:00 +00:00
Vasiliy Kuznetsov
63c6209d09 ns for fx: reenable tests disabled by #62608
Summary:

In https://github.com/pytorch/pytorch/pull/62608 various tests in FX NS
were disabled due to lack of dtype inference.

https://github.com/pytorch/pytorch/pull/75471 fixes some of these issues;
the issue fixed there is probably why the tests were disabled.

This PR reenables the tests and adjusts them for the new behavior in
https://github.com/pytorch/pytorch/pull/62608.

Test plan:

```
python test/test_quantization.py -k NumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75511

Approved by: https://github.com/jerryzh168
2022-04-13 19:44:47 +00:00
Vasiliy Kuznetsov
f1f185f6f9 ns for fx: fix bug to enable again on torchvision models
Summary:

The tests were disabled by https://github.com/pytorch/pytorch/pull/61687, but
this specific behavior broke at some point while the tests were disabled.

The issue was that:
1. `torch.add` is present in these models
2. In the common codepath of comparing fp32 to int8, torch.ops.quantized.add was already filtered out because it did not have a dtype specified
3. In the less common codepath of comparing fp32 to fp32, torch.add was eligible for shadowing, but the logic was broken

This PR fixes (3) by disabling shadowing on ops which do not support it, by op type.
The support may be built later, if needed.

Test plan:

```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_resnet18
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_mobilenet_v2
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75472

Approved by: https://github.com/jerryzh168
2022-04-13 19:44:46 +00:00
Vasiliy Kuznetsov
ae3210420e ns for fx: fix issue with shadowing nodes of unknown dtype
Summary:

In https://github.com/pytorch/pytorch/pull/61687, a couple of FX Numeric Suite
tests were disabled.

This PR reenables one of these tests. We update the dtype inference logic
of NS to always return a specific type instead of sometimes returning
"fp32 or int8". When the type cannot be deduced by the current logic,
we do not shadow the node.

As a better version of dtype inference becomes available in FX Graph Mode Quantization,
we could migrate this code to use it.

Future PRs in the stack will unbreak other things to enable NS for FX to
work on torchvision again.

Test plan:

```
python test/test_quantization.py -k NumericSuite
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75471

Approved by: https://github.com/jerryzh168
2022-04-13 19:44:46 +00:00
Jerry Zhang
761bb06292 [quant][fx] Use native backend_config_dict in convert
Summary:
Previously the lists of qat modules, fused modules, etc. were hardcoded in the convert code; in this PR we get this information
from backend_config_dict instead.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestFXNumericSuiteCoreAPIs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75520

Approved by: https://github.com/vkuzo
2022-04-12 17:59:24 +00:00
Jerry Zhang
f83d047338 [quant][fx] Use native backend_config_dict in prepare
Summary:
Previously we were still relying on the registration mechanism to get the default quantize handlers that are registered;
now that we have moved all registrations to backend_config_dict, we can get all quant patterns from backend_config_dict directly.

This PR enables using the native backend_config_dict everywhere in prepare when backend_config_dict is None; we'll make
similar changes in convert as well.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75469

Approved by: https://github.com/vkuzo
2022-04-12 17:05:31 +00:00
Salil Desai
ca0ef52382 [PyTorch Edge] Add Quantized Softmax Op (Naive Implementation) (Re-land)
Summary: Reland of D34943147 (8d7242a18b) + Revert of D35404312, after mitigation of S267077

Test Plan: ```buck test caffe2/test:quantization -- test_qsoftmax```

Differential Revision: D35432475

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75415
Approved by: https://github.com/kimishpatel
2022-04-11 22:39:50 +00:00
Yulv-git
ac2d2e3a3d Fix some typos.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75561
Approved by: https://github.com/albanD
2022-04-11 21:55:59 +00:00
Jerry Zhang
72d3d160fb [quant][fx] Remove additional_object_mapping from the docs (#75389)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75389

This seems to have been removed before, so we won't mark this PR as bc-breaking; this use case
is now enabled with the backend_config_dict API.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D35451960

fbshipit-source-id: 21a8f19c1968af44bf4fa603f16ee8c6f5080e5a
(cherry picked from commit 2862f17b57f846b55736bc6b5d10df4256567adf)
2022-04-11 10:40:11 +00:00
Jerry Zhang
dd667b6e97 [quant][fx] Move all fusion registrations to backend_config_dict (#75318)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75318

This PR moves the registrations for fusion patterns to backend_config_dict.

It also fixes one issue in the numeric suite graph matcher: now that (torch.nn.ReLU, torch.nn.BatchNorm3d)
appears in quant patterns (previously only in fusion patterns), we need to make sure (torch.nn.ReLU, (torch.nn.BatchNorm3d, torch.nn.Conv3d))
matches before (torch.nn.ReLU, torch.nn.BatchNorm3d). Previously, (torch.nn.ReLU, (torch.nn.BatchNorm3d, torch.nn.Conv3d)) was not
really matched, since `end_node_matches_reversed_fusion` expects a flattened pattern like (torch.nn.ReLU, torch.nn.BatchNorm3d, torch.nn.Conv3d).
For now we manually flatten this pattern, but in the future we might want to use the matching function `is_match` under torch.ao.quantization.fx.match_utils
to do this matching.
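
A hedged sketch of the kind of manual flattening described above (illustrative helper, not the actual code):

```
def flatten_pattern(pattern):
    # e.g. (ReLU, (BatchNorm3d, Conv3d)) -> (ReLU, BatchNorm3d, Conv3d)
    flat = []
    for item in pattern:
        if isinstance(item, tuple):
            flat.extend(flatten_pattern(item))
        else:
            flat.append(item)
    return tuple(flat)
```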

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs

Imported from OSS

Reviewed By: vkuzo, andrewor14

Differential Revision: D35423788

fbshipit-source-id: a54093ccebae9c59aeee9399669ddb2c48bfb9aa
(cherry picked from commit 6a55ea8eb2740cedafb9972888fedf68e927586d)
2022-04-09 05:08:37 +00:00
Andrew Or
0bdf9a9833 [Quant][fx] Decouple prepare_*fx from training/eval modes (#75401)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75401

This commit removes asserts that require prepare_fx to
be run in eval mode and prepare_qat_fx to be run in training mode.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_prepare_mode

Imported from OSS

Reviewed By: vkuzo, jerryzh168

Differential Revision: D35457100

fbshipit-source-id: 13a55b13d9e389991f69c06c6a70bc51cdebba36
(cherry picked from commit fb0685e0873dc8e807da3213be403b51e8b4a687)
2022-04-08 15:34:08 +00:00
Jerry Zhang
9905b1f29a [quant][fx] Move rnn ops to backend_config_dict (#75316)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75316

As titled; similar to previous PRs, this one moves the dynamically quantized rnn ops
to backend_config_dict. The dtype check is not yet enabled, so we provide the dtype_configs but they are not really used yet;
we will enable the check a bit later, after everything has been moved to backend_config_dict.

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs

Imported from OSS

Reviewed By: malfet

Differential Revision: D35423792

fbshipit-source-id: ef862ea1be5bfb4c28130775c3b2158df28d3e22
(cherry picked from commit 0247f3a768a2c165f482a66c4225b3357e33e966)
2022-04-08 08:58:50 +00:00