The ActivationSparsifier class aims to sparsify/prune activations in a neural
network. The idea is to attach the sparsifier to a layer (or layers) so that it
zeroes out the activations based on the mask_fn (or sparsification function)
provided by the user.
The mask_fn is applied once all the inputs are aggregated and reduced, i.e.,
mask = mask_fn(reduce_fn(aggregate_fn(activations)))
Note:
The sparsification mask is computed on the input **before it goes through the attached layer**.
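As a rough illustration (not the actual ActivationSparsifier API; the aggregate/reduce/mask functions below are made-up stand-ins), the same idea can be sketched with a plain forward pre-hook:
```
import torch
import torch.nn as nn

# Made-up stand-ins for the user-supplied functions; the real class lets the
# user pass these in, but the exact signatures here are assumptions.
def aggregate_fn(agg, x):
    return x if agg is None else agg + x      # running sum of observed inputs

def reduce_fn(agg):
    return agg.mean(dim=0)                    # reduce over the batch dimension

def mask_fn(reduced):
    return (reduced.abs() > 0.1).float()      # keep only "large" activations

state = {"agg": None}

def pre_hook(module, inputs):
    x = inputs[0]
    state["agg"] = aggregate_fn(state["agg"], x)
    # The mask is computed on the aggregated/reduced input *before* the layer runs.
    mask = mask_fn(reduce_fn(state["agg"]))
    return (x * mask,)

layer = nn.Linear(8, 4)
layer.register_forward_pre_hook(pre_hook)
out = layer(torch.randn(2, 8))
```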
Test Plan:
```python test/test_ao_sparsity.py TestActivationSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80886
Approved by: https://github.com/HDCharles
Summary: This updates the DynamicStatic Detector to also provide insight
into whether Conv layers should use dynamic or static quantization.
Previously, Conv layers were not included because dynamic quantization is
not yet supported for them. This adds a check for Conv layers, and if
dynamic quantization is recommended, the detector also gives a disclaimer
that it is not currently supported but will be in the future.
Test Plan: python test/test_quantization.py TestFxModelReportDetectDynamicStatic
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81972
Approved by: https://github.com/jerryzh168
Summary: The current implementation of the InputWeightEqualization
detector broke when tested on MobileNetV2 because it could not properly
handle groups in Conv layers; fixing this also required some minor
reshaping of the weights.
In addition, the output was tuned so that instead of giving one output per
channel per layer, it gives a single suggestion per module, reports how
many of the channels could benefit from input-weight equalization, and
recommends it if that is more than half of them.
The test class also did not do a good job of covering different sizes for
the batch, height, and width dimensions, so it was updated to be more
comprehensive as well.
Test Plan: python test/test_quantization TestFxDetectInputWeightEqualization
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81971
Approved by: https://github.com/jerryzh168
This callback aims to sparsify the model inside the lightning module after training.
**Note that the model is copied and then sparsified, so the existing model is not modified.**
The sparsified model can be used for comparison and can be accessed via
`<callback_obj>.sparsified`.
Test Plan:
```python torch/ao/sparsity/_experimental/data_sparsifier/lightning/tests/test_callbacks.py TestTrainingAwareCallback```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80371
Approved by: https://github.com/z-a-f
Lightning callback that enables post-training sparsity.
This callback aims to sparsify the model inside the lightning module after training.
**Note that the model is copied and then sparsified, so the existing model is not modified.**
The sparsified model can be used for comparison and can be accessed via `<callback_obj>.sparsified`.
Test Plan:
```python torch/ao/sparsity/_experimental/data_sparsifier/lightning/tests/test_callbacks.py TestPostTrainingCallback```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80370
Approved by: https://github.com/z-a-f
Add prelu op and module for quantized CPU backend.
The PR includes:
- Quantized version of prelu op
- Native prelu kernel for quantized CPU
- Prelu modules in `nn` and `nn.quantized`
- FX support for prelu
- Unit tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73491
Approved by: https://github.com/jerryzh168
Summary: Added the ability to get the feature names and
module_fqns from the ModelReportVisualizer class. The purpose of this
addition is to let users see the exact set of module_fqns or
feature names they can filter on, and use this information to
perform their filtering.
Test Plan: python test/test_quantization.py
TestFxModelReportVisualizer.test_get_modules_and_features
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81647
Approved by: https://github.com/andrewor14
Summary: We created a ModelReportVisualizer class; the primary
envisioned way to access it is:
```
model_report_visualizer = model_reporter.generate_visualizer()
```
This method only works after reports have been generated. It takes the
generated reports and reorders them by module into
the format required by the ModelReportVisualizer, then creates
the visualizer instance and returns it to the user.
Test Plan: python test/test_quantization.py TestFxModelReportClass.test_generate_visualizer
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81589
Approved by: https://github.com/andrewor14
Summary: This introduces the skeleton for the ModelReportVisualizer
class. This class helps visualize the information in the ModelReport
class's `generate_report()` output. It aims to provide
visualizations in a table, plot (line graph), and histogram view.
This also introduces an empty test class for testing visualizations. As
implementations start occurring for this class, tests will also be
appropriately added.
This includes the high level descriptions for each of the methods as
well. Expected use cases will be added to the class description in a
future commit as that gets finalized.
Test Plan: python test/test_quantization.py TestFxModelReportVisualizer
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81523
Approved by: https://github.com/andrewor14
Summary: Currently, the ModelReport API only takes in detectors at
construction, and each of its methods requires the model to be passed in
every time. This doesn't really make sense because:
1. you will always want to be working on the same model
2. passing in a different model could break things, so it is more
fault-tolerant to keep the model internally and make calls on it
Therefore, the model is now passed in at initialization and is used
internally for the rest of the operations.
All the ModelReport tests have been adjusted to account for this, and this
change must pass all of them to ensure a successful API transition.
If you wish to see how the updated API looks, the Expected Usage in the
ModelReport class description has been updated to reflect the changes, and
the README has been updated as well.
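A toy sketch of the design change (not the actual ModelReport code; the class and method below are illustrative only): the model becomes internal state set once at construction, so later calls cannot accidentally operate on a different model.
```
import torch.nn as nn

class Reporter:  # hypothetical stand-in for ModelReport
    def __init__(self, model, detectors):
        self._model = model          # stored once at initialization
        self._detectors = detectors

    def generate_report(self):
        # always operates on the internally stored model
        return {d: f"report for {type(self._model).__name__}" for d in self._detectors}

reporter = Reporter(nn.Linear(4, 2), ["per_channel"])
print(reporter.generate_report())
```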
Test Plan: python test/test_quantization.py TestFxModelReportClass
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81588
Approved by: https://github.com/jerryzh168
Summary: Currently, all the detectors have fairly descriptive names that
give an idea of what they do. However, as more and more detectors are
developed, the keys they return need a consistent naming scheme.
This updates the keys of the returned dictionaries to better
highlight whether something is an activation stat, a weight stat, etc.
Test Plan:
python test/test_quantization.py TestFxModelReportDetector
python test/test_quantization.py TestFxModelReportObserver
python test/test_quantization.py TestFxModelReportDetectDynamicStatic
python test/test_quantization.py TestFxModelReportClass
python test/test_quantization.py TestFxDetectInputWeightEqualization
python test/test_quantization.py TestFxDetectOutliers
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81587
Approved by: https://github.com/jerryzh168
Summary: Currently the InputWeightEqualizationDetector has a
multi-layered output.
Example:
```
{'block1.linear': {'channel_axis_selected': 1,
                   'channel_comparison_metrics': tensor([0.8736, 0.6594, 0.2916], grad_fn=<DivBackward0>),
                   'input_range_info': {'global_max': tensor(9.),
                                        'global_min': tensor(-10.),
                                        'per_channel_max': tensor([9., 9., 9.]),
                                        'per_channel_min': tensor([-10., -10., -10.])},
                   'input_weight_equalization_recommended': [True, False, False],
                   'threshold': 0.8,
                   'weight_range_info': {'global_max': tensor(0.5618, grad_fn=<UnbindBackward0>),
                                         'global_min': tensor(-0.2211, grad_fn=<UnbindBackward0>),
                                         'per_channel_max': tensor([0.3764, 0.5618, 0.2894], grad_fn=<NotImplemented>),
                                         'per_channel_min': tensor([-0.2211, 0.2213, 0.2228], grad_fn=<NotImplemented>)}},
}
```
With all these levels, the information can be hard to parse,
especially for the planned visualization feature where the data
has to be reorganized. Therefore, to standardize across all
detectors, all outputs will be limited to one level.
The new format is:
```
{'block1.linear': {'channel_axis_selected': 1,
                   'channel_comparison_metrics': tensor([0.5705, 0.9457, 0.8891], grad_fn=<DivBackward0>),
                   'activation_global_max': tensor(9.),
                   'activation_global_min': tensor(-10.),
                   'activation_per_channel_max': tensor([9., 9., 9.]),
                   'activation_per_channel_min': tensor([-10., -10., -10.]),
                   'input_weight_equalization_recommended': [False, True, True],
                   'threshold': 0.8,
                   'weight_global_max': tensor(0.4258, grad_fn=<UnbindBackward0>),
                   'weight_global_min': tensor(-0.4958, grad_fn=<UnbindBackward0>),
                   'weight_per_channel_max': tensor([0.1482, 0.3285, 0.4258], grad_fn=<NotImplemented>),
                   'weight_per_channel_min': tensor([-0.1517, -0.4958, -0.3027], grad_fn=<NotImplemented>)},
}
```
The README will also be updated to reflect this change.
Test Plan: python test/test_quantization.py TestFxDetectInputWeightEqualization
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81586
Approved by: https://github.com/jerryzh168
Summary: Currently, the PerChannelDetector has a multi-layered output.
Example:
```
{'backend': 'qnnpack',
 'per_channel_status': {'block1.linear': {'per_channel_supported': True,
                                          'per_channel_used': False},
                        'block2.linear': {'per_channel_supported': True,
                                          'per_channel_used': False}}}
```
The issue is that for future features such as visualizations, where we
need to traverse this dictionary, the variable number of nesting levels
makes it hard to work with.
This changes the output format of the PerChannelDetector to a standard
format.
Example:
```
{'block1.linear': {'backend': 'qnnpack',
                   'per_channel_supported': True,
                   'per_channel_used': False},
 'block2.linear': {'backend': 'qnnpack',
                   'per_channel_supported': True,
                   'per_channel_used': False}}
```
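One benefit of the flattened format is that consumers can iterate it with a single loop; a small sketch (the `report` dict mirrors the example above):
```
report = {
    'block1.linear': {'backend': 'qnnpack',
                      'per_channel_supported': True,
                      'per_channel_used': False},
    'block2.linear': {'backend': 'qnnpack',
                      'per_channel_supported': True,
                      'per_channel_used': False},
}

for module_fqn, stats in report.items():
    if stats['per_channel_supported'] and not stats['per_channel_used']:
        print(f"{module_fqn}: per-channel quantization is supported but unused")
```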
Test Plan: python test/test_quantization.py TestFxModelReportDetector
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81585
Approved by: https://github.com/HDCharles
Summary: Two lines were accidentally added after a return statement in the
OutlierDetector observer insertion, apparently from an odd merge issue.
They were harmless and were not caught by the linter, the tests, or me.
This removes those two lines.
Test Plan: python test/test_quantization.py TestFxDetectOutliers
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81499
Approved by: https://github.com/kit1980
Summary: This adds a README for the ModelReport functionality that
contains an overview of the class, what it does, and how it works; an
example of usage; information on how to implement a new detector (since
this is how core functionality is added); folder structure information;
and finally information on the tests and where they are located.
The ModelReport class is still in development and will, in the future,
get additional features such as visualizations, and the README will be
updated with this information as it is added.
Test Plan: Just a new README, no code is added; the README will be reviewed
for accuracy, ease of use, and readability.
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81369
Approved by: https://github.com/jerryzh168
Summary: Previously, every detector's determine_observer_insert_points()
function used hard-coded strings as the keys of the dictionary returned to
the ModelReport instance, and those same hard-coded keys were used to
extract information from it. Since all detectors used the same string keys,
these were made default variables at the top of the detector.py file, and
all detectors now use those. The same constants are imported and used in
the ModelReport file as well, so there is less chance of an error caused by
a mistyped string.
The test plan primarily exercises the ModelReport class because it uses the
same new variables for the strings and is the primary caller of each
detector instance's determine_observer_insert_points().
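As a small illustration of the pattern (the constant names below are hypothetical, not the actual ones defined in detector.py):
```
# Hypothetical key constants; the real ones live at the top of detector.py.
DETECTOR_TARGET_INFO_KEY = "target_info"
DETECTOR_OBS_TO_INSERT_KEY = "observer_to_insert"

def observer_insert_points(observer_instance):
    # The detector and ModelReport import the same constants, so a typo surfaces
    # as a NameError instead of silently producing a mismatched dictionary key.
    return {"block1.linear": {DETECTOR_TARGET_INFO_KEY: "input",
                              DETECTOR_OBS_TO_INSERT_KEY: observer_instance}}

points = observer_insert_points(observer_instance="<observer>")
```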
Test Plan: python test/test_quantization.py TestFxModelReportClass
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81382
Approved by: https://github.com/jerryzh168
Summary: Before, all the function calls for the ModelReport object were
described as depending on the Fx Graph Mode workflow. In reality, the only
requirement is for the model to be a traceable GraphModule. Keeping the
ModelReport class as detached from the Fx workflow as possible also lets it
be used as a more all-purpose tool in the future.
This updates all the references so they no longer state that an Fx Graph
Mode workflow is needed and are instead more general, since all we really
need is a traceable model.
Test Plan: python test/test_quantization.py TestFxModelReportClass
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81252
Approved by: https://github.com/jerryzh168
Summary: This adds an example usage description to the ModelReport class
so that people can see how to use it right in the class documentation
without consulting external sources. The example uses the
QuantizationTracer, a deliberate choice to illustrate that the tool is not
strictly tied to the Fx Graph Mode workflow.
Test Plan: python test/test_quantization.py TestFxModelReportClass
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81251
Approved by: https://github.com/jerryzh168
Summary: A huge part of the work for the Outlier detector was figuring
out a good nth percentile to compare against the 100th percentile,
as well as a good comparison ratio. This
commit adds a link to a Colab notebook in the function's documentation so
that people can see the calculations used to determine those
values and understand that they were not chosen arbitrarily.
At a high level, this Colab contains work that includes:
- Figuring out whether to use interpolation or lower as the rule for
finding quantile between two indices
- Figuring out what a good value for reference_percentile is
- Figuring out what a good value for ratio_threshold is
Test Plan: python test/test_quantization.py TestFxDetectOutliers
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81250
Approved by: https://github.com/jerryzh168
Summary: The current Outlier detector does a good job of finding whether
data distributions passing through layers have outliers. However,
suppose we have a completely constant channel. The outlier detector
would not flag it as an outlier, but that is still something we want
to highlight, because a constant channel is usually the result of a bad
configuration or something really wrong with the data.
To address this there are two additions to the outlier detector that
this commit makes:
- The first is to add whether there are any constant batches at all and
let the user know in the text report
- The second is to let the user know the number of total constant
batches found for each channel, so they can figure out if there are any
unnecessary channels present.
The existing outlier detector tests were modified to do a quick check
for this feature.
Test Plan: python test/test_quantization.py TestFxDetectOutliers
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81249
Approved by: https://github.com/andrewor14
Summary: The outlier detector has a feature where it can notify
the user if fewer than the full set of batches that passed through were
used in the outlier calculation, which mainly happens as a result of
zero-errors.
Instead of comparing against a hard-coded value like 30 as before, the code
now lets the user pass in an optional fractional value, and if the ratio of
batches used falls below that value, the detector alerts the user.
Test Plan: python test/test_quantization.py TestFxDetectOutliers
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81174
Approved by: https://github.com/andrewor14
Summary: This adds the report-generation implementation for the
Outlier Detector class. This includes generating a
dictionary containing each module that had an observer attached, along with
any relevant stats collected by the observer that can shed light on
outlier-relevant data or computed metrics. It also includes a string
highlighting specific modules that had outliers and giving some insight
into which channels they are contained in.
This contains both the report-generation implementation for the outlier
detector and a test class for the report-generation functionality.
Test Plan: python test/test_quantization.py TestFxDetectOutliers
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80937
Approved by: https://github.com/andrewor14
The previous implementation used loops to compute the sparsity within a block of a mask, as well as across the mask blocks. This implements the vectorized version.
## Vectorization:
At a high level, the vectorization procedure falls into a two-step process:
### Tensor-level masking
Tensor-level masking is a mask-generation routine with a granularity of `sparse_block_shape`: only whole patches of that shape can be considered sparse/dense. To vectorize:
1. Reshape the data such that one of the dimensions represents the patches of sparse_block_shape.
2. Create a mask of the same shape as the reshaped data
3. Find the smallest `k` elements in the data along the dimension of the sparse "patches". `k` is a derived parameter specifying the sparsity level.
4. Apply the 0/1 values to the patches in the mask
5. Reshape the mask back to the original dimensions
Note: because the shape of the mask might not be a multiple of the sparse_block_shape, we nudge the shape of the mask and truncate it afterwards.
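A minimal sketch of the tensor-level routine under these assumptions (L2 norm per patch, `torch.topk` over the smallest-norm patches); it is illustrative only, not the exact implementation:
```
import torch
import torch.nn.functional as F

def tensor_level_mask(data, sparse_block_shape=(1, 4), sparsity_level=0.5):
    bh, bw = sparse_block_shape
    h, w = data.shape
    # 1. Nudge the shape up to a multiple of the block shape, then view as patches.
    ph, pw = -(-h // bh) * bh, -(-w // bw) * bw            # ceil to multiples
    padded = F.pad(data, (0, pw - w, 0, ph - h))
    patches = padded.reshape(ph // bh, bh, pw // bw, bw).permute(0, 2, 1, 3)
    patches = patches.reshape(-1, bh * bw)                 # one row per patch
    # 2/3. Create the mask and find the k smallest-norm patches (k derived from sparsity).
    norms = patches.norm(dim=1)
    k = int(round(sparsity_level * norms.numel()))
    mask_flat = torch.ones_like(norms)
    if k > 0:
        _, idx = torch.topk(norms, k, largest=False)
        mask_flat[idx] = 0
    # 4/5. Broadcast the 0/1 decision back to elements and restore the original shape.
    mask = mask_flat.reshape(ph // bh, pw // bw)
    mask = mask.repeat_interleave(bh, dim=0).repeat_interleave(bw, dim=1)
    return mask[:h, :w]                                    # truncate the nudged shape

mask = tensor_level_mask(torch.randn(6, 10), sparse_block_shape=(2, 2), sparsity_level=0.5)
```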
### Block-level masking
Block-level masking is a mask-generation routine that concerns itself only with sparsity within a patch of shape `sparse_block_shape`. This is useful when block sparsity allows partial block sparsification.
To vectorize:
Overall the block-level masking follows the same routine as the tensor-level algorithm described above. One distinction is that when reshaping the data/mask tensors we aim for creating a dimension that captures the internals of each patch. For example, if a `sparse_block_shape` is `(2, 2)`, we want to reshape the data/mask into `(2, 2, -1)`. That allows us to sort the internal elements on the last axis, and zero-out the ones that obey the sparse logic.
Differential Revision: [D37352494](https://our.internmc.facebook.com/intern/diff/D37352494/)
**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D37352494/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80059
Approved by: https://github.com/jerryzh168
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/78117
Fixes: https://github.com/pytorch/pytorch/issues/73463
This PR adds a normalization pass that normalizes all args to keyword args in positional order, and fixes lowering code that previously
only used node.args to use both args and kwargs.
Also tried to add a test for F.conv2d, but since conv2d matches multiple schemas we do an extra schema match, and because we are using symbolic values
in `transform`, we don't get a schema match, so F.conv2d still fails with runtime errors. We can resolve this issue later if there is a need.
Another option I'm considering is to do the normalization with real inputs instead of symbolic inputs and rely on inspect.signature rather than operator_schemas (which is based on TorchScript).
I tried this briefly but didn't get far; it looks like we cannot get the Python signature for `torch._C._nn.linear`. That might be fixable as well, but will need follow-up discussions.
The goal for this PR is just to introduce normalization in our codebase so that we can adapt some downstream code to this, and also fix the F.linear issue.
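For reference, the kind of rewrite the normalization pass performs on a call site looks like this (illustrative only):
```
import torch
import torch.nn.functional as F

x, w, b = torch.randn(2, 4), torch.randn(3, 4), torch.randn(3)

# Before normalization: positional args, so lowering code reading node.args works.
y1 = F.linear(x, w, b)

# After normalization: all args become kwargs in positional order, so lowering
# code also has to look at node.kwargs (hence the fix in this PR).
y2 = F.linear(input=x, weight=w, bias=b)

assert torch.allclose(y1, y2)
```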
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_normalize_args
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D37163228](https://our.internmc.facebook.com/intern/diff/D37163228)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79095
Approved by: https://github.com/andrewor14
Summary: Previously, we automatically moved the model to CPU in
torch.ao.quantization.fx.convert to work around the issue where
certain functions called by convert expect CPU arguments. This
commit pushes this responsibility to the caller, since it is the
user's decision which device to use.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
BC-breaking Notes:
Before:
```
model = resnet18(...)
model = prepare_fx(model, qconfig_mapping, example_inputs)
... # calibrate
model = convert_fx(model)
```
After:
```
model = resnet18(...)
model.cpu()
model = prepare_fx(model, qconfig_mapping, example_inputs)
... # calibrate
model = convert_fx(model)
```
Reviewers: jerryzh168
Subscribers: jerryzh168
Differential Revision: [D37528830](https://our.internmc.facebook.com/intern/diff/D37528830)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80555
Approved by: https://github.com/jerryzh168
Summary: This adds the implementation of observer insertion-point
selection for the OutlierDetector. For this detector, a ModelReportObserver
is inserted before any leaf-level module to study the distribution of data
passing into the module and detect outliers.
This commit contains the implementation of the observer insertion as
well as the relevant test case. Some code from the
InputWeightEqualization tests was abstracted and made more modular so the
same helper function could be used for multiple outlier class tests.
As part of this work, testing was done to determine good default values
for the ratio threshold and reference percentile, and that work (based on
a normal distribution) was analyzed to find good parameters.
We still want to keep the threshold and reference percentile as user
inputs, because the defaults were based on a normal distribution and the
right values can definitely vary depending on the type of data a user has.
Test Plan: python test/test_quantization.py TestFxDetectOutliers
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80880
Approved by: https://github.com/andrewor14
Issue:
Previously, the L1/L2-norm data sparsifier did not support
1D tensors or parameters.
Fix:
If the tensor is 1D, unsqueeze it so it looks 2D and
perform the rest as usual. Also added some 1D tensors to the
unit tests to cover this issue.
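A sketch of the fix (hedged; the helper name below is illustrative, not the actual code):
```
import torch

def _as_2d(tensor):
    # View 1D tensors as a single-row 2D tensor so the existing block-based
    # L1/L2 norm masking logic can run unchanged.
    return tensor.unsqueeze(0) if tensor.dim() == 1 else tensor

data = torch.randn(10)              # 1D parameter, previously unsupported
was_1d = data.dim() == 1
data_2d = _as_2d(data)              # shape (1, 10)
mask = torch.ones_like(data_2d)     # placeholder for the real norm-based mask
out = data_2d * mask
if was_1d:
    out = out.squeeze(0)            # back to the original 1D shape
```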
Test Plan:
```python test/test_ao_sparsity.py TestNormDataSparsifiers```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80465
Approved by: https://github.com/z-a-f
Issue:
Previously, the data was not "attached" to the data sparsifier: the
data sparsifier created a copy of the actual data inside its container. So,
when the data was modified outside of the sparsifier, the changes were not
reflected in the sparsifier.
Fix:
Use register_buffer() instead of nn.Parameter(..) to store the data inside the container.
Also added a unit test referencing this issue.
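A small sketch of why a buffer behaves as intended here (a toy container, not the sparsifier's actual one): register_buffer stores the tensor itself, so in-place changes made outside stay visible.
```
import torch
import torch.nn as nn

weight = torch.randn(4, 4)

container = nn.Module()
# Store the tensor itself as a non-trainable buffer; the container then sees
# in-place updates made to the original tensor.
container.register_buffer("emb_weight", weight)

weight.mul_(0)                                      # modify the tensor in place
assert torch.equal(container.emb_weight, weight)    # the container sees the change
```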
Test Plan:
```python test/test_ao_sparsity.py TestBaseDataSparsifier```
```python test/test_ao_sparsity.py TestNormDataSparsifiers```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80394
Approved by: https://github.com/z-a-f
Summary:
Some of the util functions in FX graph mode quantization throw warnings
such as:
```
/Users/vasiliy/pytorch/torch/ao/quantization/fx/utils.py:410: UserWarning: To copy construct from
a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().
requires_grad_(True), rather than torch.tensor(sourceTensor).
```
This PR fixes the warnings by moving the code to the recommended syntax if the
value is a tensor.
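Concretely, the change follows the pattern suggested by the warning itself:
```
import torch

source = torch.tensor([1.0, 2.0, 3.0])

scale_old = torch.tensor(source)          # triggers the UserWarning when source is a tensor
scale_new = source.clone().detach()       # recommended form used when the value is a tensor

assert torch.equal(scale_old, scale_new)
```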
Test plan:
```
python test/test_quantization.py -k test_conv_linear_reference
// warning appeared before this PR and disappeared after this PR
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80883
Approved by: https://github.com/jerryzh168
### Summary:
This PR moves the clamping functionality from `quantize` to `float_to_apot` util function to align with the uniform quantize workflow in the codebase.
### Test Plan:
Run unit tests with:
python pytorch/test/quantization/core/experimental/test_quantizer.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80885
Approved by: https://github.com/dzdang
Summary: This adds the class framework for the ModelReport
OutlierDetector. This detector will be in charge of looking at
activation data and figuring out whether there are significant outliers
present in it. It will average this data across batches and make a
recommendation / warning if significant outliers are found.
This commit contains just the class framework and a base test class.
Implementations will follow in subsequent commits.
Test Plan: python test/test_quantization.py TestFxDetectOutliers
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80743
Approved by: https://github.com/HDCharles
Summary:
Currently we expect users to provide custom modules for LSTM and MHA. However, since we almost always ask users to use those modules in the custom context, it is better to make this behavior the default. Here we try to align with the base quantization API: if the user specifies a custom_config_dict then that is used, but if the value is left as None then the default is used. If a user would like to both use the default and modify it, they have to do so manually, but the default is accessible via get_default_custom_config_dict.
Additionally, NS, which uses prepare to insert custom observers for
its purposes, had to be slightly modified to pass in an empty
custom_config_dict in order to avoid modifying the custom modules.
Due to weird CI issues with the previous PR,
the previous discussion can be found at: https://github.com/pytorch/pytorch/pull/71192
Test Plan:
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79960
Approved by: https://github.com/z-a-f
Summary: This commit adds qconfigs with special observers for fixed
qparams ops in get_default_qconfig_mapping and
get_default_qat_qconfig_mapping. For correctness, we also require
users to use these special observers if we detect these fixed
qparams ops in prepare.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Differential Revision: [D37396379](https://our.internmc.facebook.com/intern/diff/D37396379)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80184
Approved by: https://github.com/jerryzh168
Summary: This PR removes the is_reference flag from the existing
convert_fx API and replaces it with a new convert_to_reference
function. This separates (1) converting the prepared model to a
reference model from (2) lowering the reference model to a quantized
model, enabling users to call their custom lowering function for
custom backends. For the native fbgemm backend, for example, the
following are equivalent:
```
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx
prepared = prepare_fx(model, ...)
quantized = convert_fx(prepared, ...)
```
```
from torch.ao.quantization.fx import lower_to_fbgemm
from torch.ao.quantization.quantize_fx import (
prepare_fx,
convert_to_reference
)
prepared = prepare_fx(model, ...)
reference = convert_to_reference(prepared, ...)
quantized = lower_to_fbgemm(reference, ...)
```
Note that currently `lower_to_fbgemm` takes in two other arguments
that are difficult for users to provide. A future commit will remove
these arguments to make the helper function more user friendly.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Differential Revision: [D37359946](https://our.internmc.facebook.com/intern/diff/D37359946)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80091
Approved by: https://github.com/jerryzh168
### Summary:
This PR implements APoT fake quantization for the purpose of quantization-aware training. It implements the `calculate_qparams` and `forward` methods to be used in fake quantization.
### Test Plan:
Run unit tests with: `python pytorch/test/quantization/core/experimental/test_fake_quantize.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79845
Approved by: https://github.com/dzdang
Summary: This adds the implementation of the InputWeightEqualization
detector, including both the implementation and the relevant test
cases. This detector is meant to be used to initialize a ModelReport
instance; it keeps track of the necessary statistics to decide whether,
for certain layers of interest (linear and conv for now), it makes sense
to use input-weight equalization, and gives that suggestion to the user.
This includes the implementation and tests for the detector's
report-generation functionality; the full detector should now
be fleshed out and complete with this addition. This also required
modifying the ModelReportObserver class to capture per-channel min
and max values. In addition, instead of passing in an observer class to
instantiate, the detectors now pass the ModelReport
instance the observer instances that they themselves instantiate.
Test Plan: python test/test_quantization.py TestFxDetectInputWeightEqualization
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80191
Approved by: https://github.com/HDCharles, https://github.com/andrewor14
### Summary:
This PR updates the APoT global API method signatures and parameters for `dequantize_APoT` and `calculate_qparams` to align with their uniform counterparts in the codebase.
### Test Plan:
Run unit tests with:
`python pytorch/test/quantization/core/experimental/test_nonuniform_observer.py`
`python pytorch/test/quantization/core/experimental/test_quantizer.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80364
Approved by: https://github.com/jerryzh168
Summary: This adds the implementation of the InputWeightEqualization
detector, including both the implementation and the relevant test
cases. This detector is meant to be used to initialize a ModelReport
instance; it keeps track of the necessary statistics to decide whether,
for certain layers of interest (linear and conv for now), it makes sense
to use input-weight equalization, and gives that suggestion to the user.
This implements the functionality of adding observer points for the
input-weight equalization detector and contains the relevant tests for
it. The full detector functionality will be fleshed out
in a later commit.
Test Plan: python test/test_quantization.py TestFxDetectInputWeightEqualization
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79962
Approved by: https://github.com/HDCharles, https://github.com/andrewor14
### Summary:
This PR updates the design of APoT Observer, Quantizer, and Tensor to be more consistent with their uniform counterparts in the PyTorch framework. APoT Observer now calculates alpha as the max between the absolute values of the max and min values in the input tensor. APoT Quantizer is modified so its instance methods quantize_APoT and dequantize_APoT are called by their global method counterparts. APoT Tensor is modified to account for the new method definition of the `quantize_APoT` from APoT Quantizer.
### Test Plan:
Run APoT Observer class unit tests with: `python pytorch/test/quantization/core/experimental/test_nonuniform_observer.py`
Run APoT Quantize class unit tests with: `python pytorch/test/quantization/core/experimental/test_quantizer.py`
Run APoT Tensor class unit tests with: `python pytorch/test/quantization/core/experimental/test_quantized_tensor.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80075
Approved by: https://github.com/jerryzh168
The BaseDataScheduler is the abstract scheduler class specifically for the
BaseDataSparsifier class. This class controls a specific hyperparameter of
the sparsifier class and varies it across the training process (or across time).
Args:
    data_sparsifier (instance of BaseDataSparsifier)
        The implemented data sparsifier class (i.e. one in which update_mask is implemented)
    schedule_param (str)
        A specific hyperparameter of the passed sparsifier that needs to be scheduled/varied
    last_epoch (int, default=-1)
        Passed when training needs to be resumed from a particular point
    verbose (bool, default=False)
        Verbosity of the BaseDataScheduler
The *get_schedule_param()* function needs to be implemented by the user.
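A toy sketch of the scheduling pattern (not the actual BaseDataScheduler API; the class names and the data_groups layout below are assumptions):
```
class ToySparsifier:
    def __init__(self):
        self.data_groups = {"tensor_1": {"sparsity_level": 0.8}}

class ToyScheduler:
    def __init__(self, data_sparsifier, schedule_param, last_epoch=-1, verbose=False):
        self.data_sparsifier = data_sparsifier
        self.schedule_param = schedule_param
        self.last_epoch = last_epoch
        self.base = {name: cfg[schedule_param]
                     for name, cfg in data_sparsifier.data_groups.items()}

    def get_schedule_param(self):
        # to be implemented by the user: returns name -> new hyperparameter value
        raise NotImplementedError

    def step(self):
        self.last_epoch += 1
        for name, value in self.get_schedule_param().items():
            self.data_sparsifier.data_groups[name][self.schedule_param] = value

class LinearRampScheduler(ToyScheduler):
    def get_schedule_param(self):
        # ramp the scheduled hyperparameter linearly to its base value over 10 epochs
        frac = min(1.0, (self.last_epoch + 1) / 10)
        return {name: base * frac for name, base in self.base.items()}

sched = LinearRampScheduler(ToySparsifier(), "sparsity_level")
for _ in range(3):
    sched.step()
```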
Test Plan:
```python test/test_ao_sparsity.py TestBaseDataScheduler```
Differential Revision: [D37358608](https://our.internmc.facebook.com/intern/diff/D37358608)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79817
Approved by: https://github.com/jerryzh168, https://github.com/z-a-f
Summary: This adds the framework (method signatures and descriptions) for
the InputWeightEqualization detector. There is no code implementation yet,
so the test suite for this is a simple pass. This detector will be used
to determine whether input-weight equalization should be recommended.
Test Plan: python test/test_quantization.py TestFxDetectInputWeightEqualization
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79916
Approved by: https://github.com/HDCharles
Summary: Per our design discussion about the sparsity API, we're
discontinuing the old API in favor of the new tensor_fqn-based one.
The pruning class has not been updated, mostly because this change
doesn't cause any further knock-on effects.
Test Plan: python test/test_ao_sparsity.py
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79597
Approved by: https://github.com/z-a-f
Summary: The ModelReport class in model_report.py combines the
functionality of the detectors and the ModelReportObserver. It creates
an end-to-end system where a user can pass in a prepared Graph Model to
insert the ModelReportObservers; then, after the user calibrates their
model, the calibrated model can be used by the ModelReport class
to generate reports based on what the user wished to gather information
about.
This contains the implementation and the tests for the generate_report
method, which is used on a calibrated fx model to generate reports based
on data collected by the inserted observers during the calibration
phase, and can also optionally remove those observers if desired.
This also fixes the issue that caused the previous revert.
Test Plan: python test/test_quantization.py TestFxModelReportClass
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80054
Approved by: https://github.com/HDCharles
Summary: The ModelReport class in model_report.py combines the
functionality of the detectors and the ModelReportObserver. It creates
an end-to-end system where a user can pass in a prepared Graph Model to
insert the ModelReportObservers; then, after the user calibrates their
model, the calibrated model can be used by the ModelReport class
to generate reports based on what the user wished to gather information
about.
This contains the implementation and tests for the
prepare_detailed_calibration method, which is used on a prepared fx model
to insert the desired observers for the different detectors.
This also applies a fix for the issue that caused the previous revert.
Test Plan: python test/test_quantization.py TestFxModelReportClass
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80053
Approved by: https://github.com/HDCharles
Summary: The ModelReport class in model_report.py combines the
functionality of the detectors and the ModelReportObserver. It creates
an end-to-end system where a user can pass in a prepared Graph Model to
insert the ModelReportObservers; then, after the user calibrates their
model, the calibrated model can be used by the ModelReport class
to generate reports based on what the user wished to gather information
about.
This contains the init method and the signatures and docs for each
of the proposed helper functions.
This also addresses and fixes the issue that caused the previous revert.
Test Plan: python test/test_quantization.py TestFxModelReportClass
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80052
Approved by: https://github.com/HDCharles
Summary: The ModelReport class in model_report.py combines the
functionality of the detectors and the ModelReportObserver. It creates
an end-to-end system where a user can pass in a prepared Graph Model to
insert the ModelReportObservers; then, after the user calibrates their
model, the calibrated model can be used by the ModelReport class
to generate reports based on what the user wished to gather information
about.
This contains the implementation and the tests for the generate_report
method, which is used on a calibrated fx model to generate reports based
on data collected by the inserted observers during the calibration
phase, and can also optionally remove those observers if desired.
Test Plan: python test/test_quantization.py TestFxModelReportClass
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79792
Approved by: https://github.com/HDCharles
Summary: The ModelReport class in model_report.py combines the
functionality of the detectors and the ModelReportObserver. It creates
an end-to-end system where a user can pass in a prepared Graph Model to
insert the ModelReportObservers; then, after the user calibrates their
model, the calibrated model can be used by the ModelReport class
to generate reports based on what the user wished to gather information
about.
This contains the implementation and tests for the
prepare_detailed_calibration method, which is used on a prepared fx model
to insert the desired observers for the different detectors.
Test Plan: python test/test_quantization.py TestFxModelReportClass
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79752
Approved by: https://github.com/HDCharles
Summary: The ModelReport class in model_report.py combines the
functionality of the detectors and the ModelReportObserver. It creates
an end-to-end system where a user can pass in a prepared Graph Model to
insert the ModelReportObservers; then, after the user calibrates their
model, the calibrated model can be used by the ModelReport class
to generate reports based on what the user wished to gather information
about.
This contains the init method and the signatures and docs for each
of the proposed helper functions.
Test Plan: python test/test_quantization.py TestFxModelReportClass
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79595
Approved by: https://github.com/andrewor14
Summary: The goal is to add a base class for the model-report detectors
so that, compared to the primary ModelReport class, they can hold much more
specific information about the observers and where they are
inserted, etc.
Since this is just a base class, the testing will happen through the
implementations of the classes that derive from it.
The two current detector methods were turned into Detector classes and
the tests were modified to reflect this, but the same functionality is
tested.
As a result, _detector.py was renamed to detector.py.
Test Plan: python test/test_quantization.py TestFxModelReportDetector
python test/test_quantization.py TestFxModelReportDetectDynamicStatic
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79671
Approved by: https://github.com/andrewor14
Summary: Updated the sparsity API to accept tensor_fqn as the primary
specification method, i.e. [{'tensor_fqn':
'linear.weight'}].
The pruning API was also updated due to knock-on changes.
Kept the old API for accepting module_fqns, but changed 'fqn' to 'module_fqn'
for clarity (this will break BC).
Updated variables in the code to use module rather than layer.
Updated the state dict to use tensor_fqn rather than 'fqn' or 'module_fqn'.
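A short sketch of the updated config format (the surrounding WeightNormSparsifier calls are assumed to match the existing sparsifier API rather than quoted from this PR):
```
import torch.nn as nn
from torch.ao.sparsity import WeightNormSparsifier  # import path assumed for this release

model = nn.Sequential(nn.Linear(8, 8))

# Old style: [{'module_fqn': '0'}] (previously keyed as 'fqn')
# New primary style: point at the exact tensor to sparsify.
sparse_config = [{"tensor_fqn": "0.weight"}]

sparsifier = WeightNormSparsifier(sparsity_level=0.5)
sparsifier.prepare(model, sparse_config)
sparsifier.step()
```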
Test Plan: python test/test_ao_sparsity.py
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79113
Approved by: https://github.com/z-a-f
Summary:
Refactors `find_matches` function to only find subgraph
matches and not assign qconfigs to them. Moves the qconfig assignment
outside of the function. No logic change.
This will be useful for prototyping future tools for quantizing
parts of the model. These tools will need to know the matches
and will reuse the `find_matches` function,
but they will assign their own qconfigs to them using a different
strategy.
Test plan:
```
python test/test_quantization.py -k Fx
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79713
Approved by: https://github.com/jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79066
Following https://github.com/pytorch/pytorch/pull/78452,
this commit replaces the following config dicts with python objects:
- prepare_custom_config_dict -> PrepareCustomConfig
- convert_custom_config_dict -> ConvertCustomConfig
- fuse_custom_config_dict -> FuseCustomConfig
This leads to better type safety and better user experience in
notebook settings due to improved auto completion. The new APIs
are as follows:
```
from torch.ao.quantization.fx.custom_config import ConvertCustomConfig, PrepareCustomConfig

prepare_custom_config = PrepareCustomConfig() \
    .set_float_to_observed_mapping(float_class, observed_class) \
    .set_non_traceable_module_names(["mod1", "mod2"]) \
    .set_non_traceable_module_classes([class1, class2]) \
    .set_input_quantized_indexes([0, 1]) \
    .set_output_quantized_indexes([0]) \
    .set_preserved_attributes(["attr1", "attr2"])

convert_custom_config = ConvertCustomConfig() \
    .set_observed_to_quantized_mapping(observed_class, quantized_class) \
    .set_preserved_attributes(["attr1", "attr2"])

model = prepare_fx(
    model,
    qconfig_mapping,
    example_inputs,
    prepare_custom_config=prepare_custom_config)
model(data)
model = convert_fx(model, convert_custom_config=convert_custom_config)
```
For backwards compatibility, prepare_fx, prepare_qat_fx, and
convert_fx will continue to accept Dicts, which will be converted
to the relevant *CustomConfig object internally.
Note that this commit does not modify existing tests to use the
new API; they will continue to pass in Dicts as before, which still
works but triggers a deprecation warning. This will be handled in
a future commit.
Differential Revision: [D37088095](https://our.internmc.facebook.com/intern/diff/D37088095/)
Approved by: https://github.com/jerryzh168
L2-Norm Sparsifier
This sparsifier computes the *L2-norm* of every sparse block and "zeroes-out" the
ones with the lowest norm. The level of sparsity defines how many of the
blocks are removed.
This sparsifier is controlled by three variables:
1. `sparsity_level` defines the number of *sparse blocks* that are zeroed-out
2. `sparse_block_shape` defines the shape of the sparse blocks. Note that
the sparse blocks originate at the zero-index of the tensor.
3. `zeros_per_block` is the number of zeros that we are expecting in each
sparse block. By default we assume that all elements within a block are
zeroed-out. However, setting this variable sets the target number of
zeros per block. The zeros within each block are chosen as the *smallest
absolute values*.
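A tiny sketch of the `zeros_per_block` behaviour on a single block (illustrative only):
```
import torch

block = torch.tensor([0.9, -0.1, 0.4, -0.05])   # one sparse block of shape (1, 4)
zeros_per_block = 2

# zero out the `zeros_per_block` entries with the smallest absolute values
_, idx = torch.topk(block.abs(), zeros_per_block, largest=False)
mask = torch.ones_like(block)
mask[idx] = 0
sparse_block = block * mask      # the two smallest-magnitude entries become zero
```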
Test Plan:
```python test/test_ao_sparsity.py TestNormDataSparsifiers```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79535
Approved by: https://github.com/z-a-f
L1-Norm Sparsifier
This sparsifier computes the *L1-norm* of every sparse block and "zeroes-out" the
ones with the lowest norm. The level of sparsity defines how many of the
blocks are removed.
This sparsifier is controlled by three variables:
1. `sparsity_level` defines the number of *sparse blocks* that are zeroed-out
2. `sparse_block_shape` defines the shape of the sparse blocks. Note that
the sparse blocks originate at the zero-index of the tensor.
3. `zeros_per_block` is the number of zeros that we are expecting in each
sparse block. By default we assume that all elements within a block are
zeroed-out. However, setting this variable sets the target number of
zeros per block. The zeros within each block are chosen as the *smallest
absolute values*.
Test Plan:
```python test/test_ao_sparsity.py TestNormDataSparsifiers```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79534
Approved by: https://github.com/z-a-f
Added embeddings and embedding bags to the supported data types. Currently, the base data sparsifier extracts the weight
and stores it as a parameter in the internal module container with requires_grad=False. The embeddings inside the data sparsifier
are therefore non-trainable.
Test Plan:
```python test/test_ao_sparsity.py TestBaseDataSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79254
Approved by: https://github.com/z-a-f
Users can now just pass an nn.Parameter (or layer.weight) to the Data Sparsifier.
Note: The data sparsifier stores the passed nn.Parameter as a new parameter in the internal container module with requires_grad=False.
So, essentially, when the parameter is trained, its new values are not reflected inside the data sparsifier class.
Test Plan:
```python test/test_ao_sparsity.py TestBaseDataSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79253
Approved by: https://github.com/z-a-f
The state of the data sparsifier object contains the name -> mask mapping, the name -> config mapping, and the state_dict() of the container.
load_state_dict() and __set_state__() automatically create a container module and load the named data internally without requiring the user to intervene.
Test Plan:
```python test/test_ao_sparsity.py TestBaseDataSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79427
Approved by: https://github.com/z-a-f
Users can now pass in raw torch tensors, and the base class handles all the parametrizations and masking.
Example:
>>> data_list = [('tensor_1', torch.randn(3,3)), ('tensor_2', torch.randn(4,4))]
>>> defaults = {'sparsity_level': 0.7}
>>> sparsifier = DerivedDataSparsifier(data_list = data_list, **defaults) # Some sparsifier that inherits BaseDataSparsifier
>>> new_tensor_to_add = {'name': 'tensor_3', 'data': torch.randn(5,5), 'sparsity_level': 0.3}
>>> sparsifier.add_data(**new_tensor_to_add)
>>> # tensor_1 and tensor_2 will have sparsity_level of 0.7 but tensor_3 will have sparsity_level=0.3
Test Plan:
```python test/test_ao_sparsity.py TestBaseDataSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79252
Approved by: https://github.com/HDCharles, https://github.com/z-a-f
Base Data Sparsifier class for all data sparsifiers.
The abstract class accepts raw torch tensors / embeddings / embedding bags (refer to SUPPORTED_TYPES above)
to prepare for sparsification.
In this case, the mask (and parametrizations) is owned by the class and not by the user.
Specifically, the container object inside the class maintains the mask and parametrizations of the input data.
Test Plan:
```python test/test_ao_sparsity.py TestBaseDataSparsifier```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79251
Approved by: https://github.com/z-a-f, https://github.com/HDCharles
Summary: This follows https://github.com/pytorch/pytorch/pull/78452,
which replaced the qconfig_dict with QConfigMapping. This PR
additionally replaces get_default_*qconfig_dict with
get_default_*qconfig_mapping. For backward compatibility, we
deprecate the old functions instead of removing them.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79618
Approved by: https://github.com/jerryzh168
Summary: The _detect_dynamic_vs_static function was added to take in a
prepared fx graph model that already has ModelReportObservers built into
it, use the collected information to determine whether the input and
output are stationary or non-stationary, and provide feedback on whether
to make linear modules static or dynamic based on this information.
This PR will be followed soon by another PR that more
rigorously tests the end-to-end performance of this system, which is
primarily how the function in this PR will be tested for functionality;
that is why this one has only one test.
Test Plan: python test/quantization/fx/test_model_report_fx.py TestModelReportDetectDynamicStatic
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79326
Approved by: https://github.com/HDCharles
Summary: The purpose of this is to add to the model report functionality
by creating an observer that takes a prepared fx module and suggests
whether static or dynamic quantization is more appropriate. The tests
for this have been written and are included in the location indicated by
the Test Plan.
Test Plan: python test/quantization/fx/test_model_report_fx.py TestModelReportObserver
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79243
Approved by: https://github.com/jerryzh168, https://github.com/andrewor14
Summary: This code is meant to be a tool that helps people get the most out
of their backend by hinting that they should use per_channel quantization
if it is supported, which can increase accuracy significantly. The code is
complete and ready to be reviewed.
Test Plan: test/quantization/fx/test_model_report_fx.py
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79104
Approved by: https://github.com/HDCharles
Summary:
The fbgemm and qnnpack backends mostly support ops with quint8 activations.
Historically, the default backend config has included ops with fp16 activations
for other backends. This PR keeps the old config under a different name to keep
the functionality tested, and makes the default config match fbgemm/qnnpack ops.
Test plan:
```
python test/test_quantization.py -k TestQuantizeFx
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78528
Approved by: https://github.com/andrewor14
This sparsifier creates a nearly diagonal mask to be applied to the weight matrix.
A nearly diagonal matrix is a matrix whose non-zero elements are near the diagonal and the rest are zero.
Examples of nearly diagonal matrices with degree (or nearliness) 3 and 5, respectively:
1 1 0 0    1 1 1 0
1 1 1 0    1 1 1 1
0 1 1 1    1 1 1 1
0 0 1 1    0 1 1 1
Note that a nearly diagonal matrix with degree 1 is just a matrix with only the main diagonal populated.
This sparsifier is controlled by one variable:
1. `nearliness` defines the number of non-zero diagonals closest to the main diagonal.
Currently, only odd numbers are supported.
Note:
This can be accelerated (vectorized) once the Spdiagonal feature (PR: #78439) is landed or the banded matrix
feature is landed: https://stackoverflow.com/questions/52463972/generating-banded-matrices-using-numpy
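A minimal (loop-based) sketch of generating such a mask, matching the examples above; the vectorized version would follow once the features mentioned land:
```
import torch

def nearly_diagonal_mask(rows, cols, nearliness=3):
    # keep entries within (nearliness - 1) // 2 of the main diagonal
    dist = (nearliness - 1) // 2
    mask = torch.zeros(rows, cols)
    for i in range(rows):
        for j in range(cols):
            if abs(i - j) <= dist:
                mask[i, j] = 1
    return mask

print(nearly_diagonal_mask(4, 4, nearliness=3))   # matches the degree-3 example above
```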
Test Plan:
```
python test/test_ao_sparsity.py TestNearlyDiagonalSparsifier
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78448
Approved by: https://github.com/z-a-f, https://github.com/HDCharles
Summary:
Some of the helper functions that generate operator configs based on dtype_configs are reused in the native backend and TensorRT, so we
factor this part out into a util file: common_operator_configs.py
Test Plan: buck test mode/opt deeplearning/trt/fx2trt_oss/test/quant:test_quant_trt
Differential Revision: D36728359
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78407
Approved by: https://github.com/vkuzo, https://github.com/andrewor14
Summary: https://github.com/pytorch/pytorch/pull/78452 replaced
qconfig_dict with QConfigMapping as the default API for prepare_fx,
prepare_qat_fx, and convert_fx. We should update the docs to reflect
this change as well.
Test Plan:
```
cd docs
make html
cd build/html
python -m server.http
```
Reviewers: jerryzh168, vkuzo
Subscribers: jerryzh168, vkuzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78533
Approved by: https://github.com/vkuzo
**Summary:** Previously, FX graph mode quantization configurations
were specified through a dictionary of qconfigs. However, this
API was not in line with other core APIs in PyTorch. This commit
replaces this dictionary with a config object that users will
create and pass to prepare and convert. This leads to better
type safety and better user experience in notebook settings
due to improved auto completion.
The new API is as follows:
```
from torch.ao.quantization import QConfigMapping
from torch.ao.quantization.quantize_fx import prepare_fx
qconfig_mapping = QConfigMapping() \
    .set_global(qconfig) \
    .set_object_type(torch.nn.Linear, qconfig) \
    .set_module_name_regex("foo.*bar", qconfig) \
    .set_module_name("mod", qconfig)

prepare_fx(model, qconfig_mapping)
```
For backwards compatibility, `prepare_fx`, `prepare_qat_fx`,
and `convert_fx` will continue to accept qconfig_dicts, which
will be converted to QuantizationConfigs internally.
Note that this commit does not modify existing tests to use the
new API; they will continue to pass in qconfig_dict as before,
which still works but triggers a deprecation warning. This will
be handled in a future commit.
**Test Plan:**
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
**Reviewers:** jerryzh168, vkuzo
**Subscribers:** jerryzh168, vkuzo
Differential Revision: D36747998
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78452
Approved by: https://github.com/jerryzh168
Summary:
att, currently it errors out with the following error:
```
---> 72 dummy_weight = trt.Weights(weight_shape)
73 layer = network.add_convolution_nd(
74 input=input_val,
TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
1. tensorrt.tensorrt.Weights(type: tensorrt.tensorrt.DataType = <DataType.FLOAT: 0>)
2. tensorrt.tensorrt.Weights(a: numpy.ndarray)
```
Full error: https://www.internalfb.com/phabricator/paste/view/P503598381
We need to pass around a numpy ndarray instead of a shape here,
and support conv1d in backend_config_dict for TensorRT.
Test Plan:
```
buck test mode/opt deeplearning/trt/fx2trt_oss/test/converters:test_convolution
```
```
buck test mode/opt deeplearning/trt/fx2trt_oss/test/quant:test_quant_trt
```
Differential Revision: D36721313
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78402
Approved by: https://github.com/842974287
Summary:
After https://github.com/pytorch/pytorch/pull/77608 `example_inputs` is required input for `prepare_fx` and `prepare_qat_fx`.
This makes quantizing submodules harder, so we added this utility function to get a dictionary from fqn to submodule example_inputs
Example Call:
```
example_inputs = (tensor0,)
get_fqn_to_example_inputs(m, example_inputs)
```
Example output:
```
{
"linear1": (tensor1,),
"linear2": (tensor2,),
"sub": (tensor3,),
"sub.linear1": (tensor4,),
...
}
```
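A hedged usage sketch of the returned dictionary (import path assumed; `m`, `tensor0`, and a `qconfig_mapping` are taken from context), showing how it can feed per-submodule quantization:
```python
from torch.ao.quantization.quantize_fx import prepare_fx
from torch.ao.quantization.utils import get_fqn_to_example_inputs  # path assumed

example_inputs = (tensor0,)
fqn_to_example_inputs = get_fqn_to_example_inputs(m, example_inputs)

# Quantize only the "sub" submodule, using the inputs it actually receives.
prepared_sub = prepare_fx(
    m.sub, qconfig_mapping, example_inputs=fqn_to_example_inputs["sub"]
)
```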
Test Plan:
python test/test_quantization.py TestUtils
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78286
Approved by: https://github.com/dzdang
Summary:
Removes the restriction in NS for FX on handling nodes which have
no positional arguments, such as `F.linear(input=x, weight=w, bias=b)`.
In order to achieve this, we delete all places in the code which
were doing things like
```
node.args[0]
```
And replace them with
```
_get_normalized_nth_input(node, gm, 0)
```
The `_get_normalized_nth_input` function is a best effort way to
get the n'th normalized input.
This is needed because some FX tools output nodes normalized to
be kwargs only, and we need to be able to handle this in NS.
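A minimal sketch (not the actual implementation) of what such a best-effort lookup can do when a node was normalized to be kwargs-only:
```python
import torch.fx

def get_nth_input_best_effort(node: torch.fx.Node, n: int):
    # Prefer positional args when available.
    if len(node.args) > n:
        return node.args[n]
    # Fall back to kwargs for kwargs-only nodes, e.g.
    # F.linear(input=x, weight=w, bias=b); insertion order is used here as a
    # rough stand-in for positional order.
    kwarg_values = list(node.kwargs.values())
    if len(kwarg_values) > n:
        return kwarg_values[n]
    raise AssertionError(f"could not get input {n} of node {node.format_node()}")
```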
Test plan:
```
python test/test_quantization.py -k test_linear_kwargs_shadow
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78181
Approved by: https://github.com/z-a-f, https://github.com/hx89
Summary:
After https://github.com/pytorch/pytorch/pull/77608 `example_inputs` is required input for `prepare_fx` and `prepare_qat_fx`.
This makes quantizing submodules harder, so we added this utility function to get a dictionary from fqn to submodule example_inputs
Example Call:
```
example_inputs = (tensor0,)
get_fqn_to_example_inputs(m, example_inputs)
```
Example output:
```
{
"linear1": (tensor1,),
"linear2": (tensor2,),
"sub": (tensor3,),
"sub.linear1": (tensor4,),
...
}
```
Test Plan:
python test/test_quantization.py TestUtils
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78146
Approved by: https://github.com/vkuzo
Summary:
This improves the documentation page for backend_config_dict to render
the configurations in a human readable format, such as
```
{
'pattern': torch.nn.modules.pooling.AdaptiveAvgPool1d,
'dtype_configs': [
{
'input_dtype': torch.quint8,
'output_dtype': torch.quint8,
},
{
'input_dtype': torch.float16,
'weight_dtype': torch.float16,
'bias_dtype': torch.float16,
'output_dtype': torch.float16,
},
],
'observation_type': ObservationType.OUTPUT_SHARE_OBSERVER_WITH_INPUT,
},
```
The results are also now sorted alphabetically by the normalized name of
the root op in the pattern.
A couple of utility functions are created to help with this. If in the future
we convert backend_config_dict to use typed objects, we can move this logic
to the objects at that time.
Test plan:
```
cd docs
make html
cd build
python -m http.server
// renders correctly, example: https://gist.github.com/vkuzo/76adfc7c89e119c59813a733fa2cd56f
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77535
Approved by: https://github.com/andrewor14
Summary:
These mappings are already defined for `BatchNorm{n}d` as the root
node, we don't need to specify them again. Removing to clean
up the code.
Test plan:
```
python test/test_quantization.py -k FXNumericSuite
python test/test_quantization.py -k FXGraphMatcher
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76993
Approved by: https://github.com/jerryzh168
Summary:
GroupNorm quantization is defined but it looks like FX graph
mode quantization does not have it enabled.
Removing it from NS for FX.
Test plan:
```
python test/test_quantization.py -k FXGraphMatcher
python test/test_quantization.py -k FXNumericSuite
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76991
Approved by: https://github.com/jerryzh168
Summary:
More cleanups in mappings:
1. makes the `nnqatd.Linear` entry be looked up dynamically
2. moves the `NonDynamicallyQuantizableLinear` down and marks it as edge case
Test plan:
```
python test/test_quantization.py -k FXGraphMatcher
python test/test_quantization.py -k FXNumericSuite
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76990
Approved by: https://github.com/jerryzh168
Summary:
Instead of hardcoding the relationship between `F.dropout` and `toq.dropout`,
read it from the mapping.
The mapping itself might need to be in the lowering file, but that's a separate
issue.
Test plan:
```
python test/test_quantization.py -k FXGraphMatcher
python test/test_quantization.py -k FXNumericSuite
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76989
Approved by: https://github.com/jerryzh168
Summary:
FX graph mode quantization no longer uses `torch.ops.quantized.cat`,
instead `torch.cat` can use quantized inputs.
This PR removes the outdated mapping from NS for FX.
Test plan:
```
python test/test_quantization.py -k FXNumericSuite
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76988
Approved by: https://github.com/jerryzh168
Summary:
Fixes a couple of problems with `ConvTranspose` in NS mappings:
1. deletes the dynamic versions, as they do not work yet
2. deletes `ConvTranspose3d`, as it's not swapped yet in the quantization workflow
3. removes a duplicate set
Test plan:
```
python test/test_quantization.py -k FXGraphMatcher
python test/test_quantization.py -k FXNumericSuite
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76980
Approved by: https://github.com/jerryzh168
Summary:
NS for FX mappings were originally hardcoded, because quantization op
mappings were not easily reusable. Now that we have `backend_config_dict`,
we can start moving NS for FX to use them and delete the hardcoded mappings.
This PR deletes the hardcoded mappings from NS about the lowering step,
and instead reads them from the lowering configs.
Note: for now, there is no way to configure the tool to use lowering
configs from a different lowering pass. That may be added at some
future point, but it's not important now.
Test plan:
```
python test/test_quantization.py -k FXNumericSuite
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76978
Approved by: https://github.com/jerryzh168
Summary:
NS for FX mappings were originally hardcoded, because quantization op
mappings were not easily reusable. Now that we have `backend_config_dict`,
we can start moving NS for FX to use them and delete the hardcoded mappings.
This first PR deletes the hardcoded mappings for `nni` and `nniqat` modules,
and instead reads these mappings from `backend_config_dict`.
Future PRs will incrementally move more of the mappings over.
Test plan:
```
python test/test_quantization.py -k FXNumericSuite
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76958
Approved by: https://github.com/jerryzh168
Summary:
FakeQuantize class has quant_min/quant_max and activation_post_process
attributes, the latter of which already includes quant_min/max. As such,
we can remove quant_min/quant_max from FakeQuantize and use
FakeQuantize.activation_post_process.quant_m* directly.
Test plan:
```
python test/test_quantization.py
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76674
Approved by: https://github.com/vkuzo
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76663
Subgraph copy does not handle all edge cases. It's high eng time
to handle them all, and currently an unhandled edge case crashes
the script.
This PR adds a function to check if the subgraph copy is supported,
and skips shadowing if it is not supported. This way the model
can still go through the shadowing APIs without an exception.
Test Plan:
```
python test/test_quantization.py -k FXNumericSuite
```
Reviewed By: hx89
Differential Revision: D36069304
Pulled By: vkuzo
fbshipit-source-id: 6b38b8d8e43396a4cf2373b247223a19d451d096
(cherry picked from commit e2322ca0635c51a4701e60fa90f77915a3c46d0f)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76561
User model had syntax like `torch.cat(tensors=[x])`. This PR fixes two errors
to unbreak this in NS shadow model:
1. skip nodes which only have kwargs (instead of throwing an exception)
2. explicitly skip shadowing of `torch.cat` (since it's not supported anyways)
Test Plan:
```
python test/test_quantization.py -k test_op_with_only_kwargs_skips_shadowing
python test/test_quantization.py -k test_op_mul_add_cat_skips_shadowing
```
Reviewed By: hx89
Differential Revision: D36017356
Pulled By: vkuzo
fbshipit-source-id: 0da4840a62c2dac183f8294c2cec4fce262474b3
(cherry picked from commit 88409c1576e7f690708957b2baa285fc7961e9d6)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76637
The previous names `default_affine_fixed_qparams_observer`
and `default_symmetric_fixed_qparams_observer` were uninformative, and users had to read
the definition in order to understand what these observers are. The new
naming convention reveals information about the range of the observers.
The analogous changes were also made for
`default_symmetric_fixed_qparams_fake_quant` and
`default_affine_fixed_qparams_fake_quant`
Test Plan:
```
python test/test_quantization.py
```
Differential Revision: D36054169
Reviewed By: vkuzo
Pulled By: dzdang
fbshipit-source-id: 215f7786a4b7abda7327f17cc61735697ec5cca9
(cherry picked from commit 21a4e6eda4467c8adca7fd534a506a14e975f9cf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76504
Shadowing for add and mul is not implemented, this PR fixes the skipping
logic to also skip the `operator.add` and `operator.mul` flavor of these
operators.
Test Plan:
```
python test/test_quantization.py -k test_mul_add_skips_shadowing
```
Reviewed By: dzdang
Differential Revision: D35985997
Pulled By: vkuzo
fbshipit-source-id: f832e54a5461d3b182df4bb905357d6c66742e98
(cherry picked from commit 93ae9592f68873865ebfdc438bffb1c9486dd1c1)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76468
This makes the error message when copying an unsupported node more verbose.
This is useful to debug where specifically in a user model this is failing.
Test Plan:
1. hardcode this condition to hit
2. run NS tests
3. verify the exception now prints details about the offending node
Reviewed By: jerryzh168
Differential Revision: D35978652
Pulled By: vkuzo
fbshipit-source-id: 9cc93dfa46469bf6ef60aa38d4011041b6709df9
(cherry picked from commit c6e382c2a69aba6ba66740f238bc14446521a433)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76461
Renaming as the old name was confusing. The name represents
better what this class is doing.
Test Plan: CI
Reviewed By: jerryzh168
Differential Revision: D35976350
Pulled By: vkuzo
fbshipit-source-id: 6da6c1767cec729c3959b13ae9dd939d0b2f622c
(cherry picked from commit 065608ef42c599525bfad4603af74c5bdf0881c3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76460
`RecordingObserver` inherits from `_ObserverBase` but does not use any functionality
from it. Making it inherit from `ObserverBase` instead.
This will make it simpler to rename `_ObserverBase` to something more meaningful in the next PR.
Test Plan: CI
Reviewed By: jerryzh168
Differential Revision: D35976351
Pulled By: vkuzo
fbshipit-source-id: 19c106bf0d48607c231702e2e048f42a7f48a5c6
(cherry picked from commit 4fd44123b0e9bcdcae546aecabe80d7642129cf5)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76414
Previously we refactored the FX Graph Mode Quantization code base to use a native backend config dict for fbgemm/qnnpack.
Because of this, we need to define the backend config dict for tensorrt properly as well (previously it was relying on the
fbgemm/qnnpack configs). This PR adds some configs to enable uru10x10 again.
Test Plan: buck run mode/dev-nosan -c fbcode.split-dwarf=true -c fbcode.platform=platform009 accelerators/workloads/models/uru10x10:uru_10x10_to_trt_eval -- --int8
Reviewed By: vkuzo
Differential Revision: D35939944
fbshipit-source-id: c64ade5074f5a8ee74a833bb990cd7a91c2cb152
(cherry picked from commit 02855a5ef8c196fb5b0defdfff58d6f2b94c693e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76148
X-link: https://github.com/pytorch/fx2trt/pull/60
Recently, we landed some PRs to enable backend_config_dict by default in the quantization codebase, and we also changed
the config to include "fused_module" for a pattern, but we didn't update the tensorrt backend config dict;
this PR adds that configuration. It also adds the config for binary ops in TensorRT, since it was previously relying on the fbgemm backend
config dict.
Test Plan: Facebook internal tests
Reviewed By: andrewor14, frankgt40
Differential Revision: D35789709
fbshipit-source-id: 9dc93b9f454eff6baefb38c4c1567f88da2a1506
(cherry picked from commit 7d30e5ecbfd096c32cdb1b68abde394bcba45f94)
Summary:
Following https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md we implemented
the backend configuration for the fbgemm/qnnpack backend. Currently it lives under the fx folder, but we'd like to use it for all the different
workflows, including eager, fx graph, and define-by-run quantization, so this PR moves it to the torch.ao.quantization namespace so that
it can be shared by different workflows.
It also moves some fx-specific utility functions to fx/backend_config_utils.py, while some files are kept in the fx folder (quantize_handler.py and fuse_handler.py).
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestAOMigrationQuantization
python test/test_quantization.py TestAOMigrationQuantizationFx
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75823
Approved by: https://github.com/vkuzo
Summary: Calling `prepare_fx` with `get_default_qconfig_dict`
failed for models with fused modules, such as `ConvReLU2d`.
This commit fixes this by adding qconfig entries for ReLU
and BatchNorm as well.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qconfig_dict_with_fused_modules
Reviewers: jerryzh168
Subscribers: jerryzh168, vkuzo
Issue: https://github.com/pytorch/pytorch/issues/75825
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75838
Approved by: https://github.com/jerryzh168
Summary:
The tests were disabled by https://github.com/pytorch/pytorch/pull/61687, but
this specific behavior broke some time after while these tests were disabled.
The issue was that:
1. `torch.add` is present in these models
2. In the common codepath of comparing fp32 to int8, torch.ops.quantized.add was already filtered out because it did not have a dtype specified
3. In the less common codepath of comparing fp32 to fp32, torch.add was eligible for shadowing, but the logic was broken
This PR fixes (3) by disabling shadowing on ops which do not support it, by op type.
The support may be built later, if needed.
Test plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_resnet18
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels.test_mobilenet_v2
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75472
Approved by: https://github.com/jerryzh168
Summary:
In https://github.com/pytorch/pytorch/pull/61687, a couple of FX Numeric Suite
tests were disabled.
This PR reenables one of these tests. We update the dtype inference logic
of NS to always return a specific type instead of sometimes returning
"fp32 or int8". When the type cannot be deduced by the current logic,
we do not shadow the node.
As a better version of dtype inference becomes available in FX Graph Mode Quantization,
we could migrate this code to use it.
Future PRs in the stack will unbreak other things to enable NS for FX to
work on torchvision again.
Test plan:
```
python test/test_quantization.py -k NumericSuite
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75471
Approved by: https://github.com/jerryzh168
Summary:
Currently in `maybe_make_input_output_share_observers` we trace back from a node to find the activation_post_process
of the input node. We have an internal use case which would error out during this trace-back, so this PR adds a guard
that returns False early when the node doesn't have any inputs.
Test Plan:
not sure when this would happen, verify within the internal test case
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75650
Approved by: https://github.com/vkuzo
Summary:
Previously the lists of qat modules, fused modules, etc. were hardcoded in the convert code; in this PR we get this information
from backend_config_dict instead.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75520
Approved by: https://github.com/vkuzo
Summary:
Previously we were still relying on the registration mechanism to get the default quantize handlers that are registered;
now that we have moved all registration to backend_config_dict, we can get all quant patterns from backend_config_dict directly.
This PR enables using the native backend_config_dict everywhere in prepare when backend_config_dict is None; we'll
make similar changes in convert as well.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75469
Approved by: https://github.com/vkuzo
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75389
This seems to have been removed before, so we won't mark this PR as bc-breaking; this use case
is now enabled with the backend_config_dict api.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35451960
fbshipit-source-id: 21a8f19c1968af44bf4fa603f16ee8c6f5080e5a
(cherry picked from commit 2862f17b57f846b55736bc6b5d10df4256567adf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75388
This is now replaced with backend_config_dict, we don't want to expose the implementation detail to
users. We'll have docs for backend_config_dict later
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35451958
fbshipit-source-id: 86e482d0782470ea02408836755cfc8531b8f66e
(cherry picked from commit 072541824b454e30df2b48758f465ebd814b436e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75387
This is now replaced with backend_config_dict, we don't want to expose the implementation detail to
users. We'll have docs for backend_config_dict later
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35451955
fbshipit-source-id: 77ede61f1d8f169dc1e1e6d847244ba99a97ab76
(cherry picked from commit 953576259fdc8827437acb6f5d04e584e37a7d64)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75386
This is now replaced with backend_config_dict, we don't want to expose the implementation detail to
users. We'll have docs for backend_config_dict later
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: ezyang
Differential Revision: D35451957
fbshipit-source-id: 52ebb5fb20cd96c1f21410b07c3d0c448c58cdba
(cherry picked from commit ccb38026f14644f9eb43335b7a7de5568c556394)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75378
Previously we were still relying on the registration mechanism to get the default fusion patterns that are registered;
now that we have moved all registration to backend_config_dict, we can get all fusion patterns from backend_config_dict directly.
This PR enables using the native backend_config_dict everywhere in fusion when backend_config_dict is None;
we'll make similar changes for prepare and convert in the future, to fully enable backend_config_dict in the
quantization code base.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35451962
fbshipit-source-id: 31d51850c669e061b67d6d9e0efec994f7ea79ed
(cherry picked from commit 60cc2dcadce705a923f9279465e3fb0e8fddad48)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75377
This is in `prepare_custom_config_dict` but we never documented it before, and we didn't find use cases internally,
so it should be OK to remove.
We can now serve the same use case with the `backend_config_dict` api.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35451961
fbshipit-source-id: 8a44c4518eecd50fab7ea2ff06697527b1cdb049
(cherry picked from commit 964183ed26bd8f367a4cf7fcc991eb519dc31a58)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75318
This PR moves the registrations for fusion patterns to backend_config_dict.
It also fixes one issue in the numeric suite graph matcher: now (torch.nn.ReLU, torch.nn.BatchNorm3d)
appears in the quant patterns (previously only in the fusion patterns), and we need to make sure (torch.nn.ReLU, (torch.nn.BatchNorm3d, torch.nn.Conv3d))
is matched before (torch.nn.ReLU, torch.nn.BatchNorm3d). Previously, (torch.nn.ReLU, (torch.nn.BatchNorm3d, torch.nn.Conv3d)) was not
really matched, since `end_node_matches_reversed_fusion` expects a flattened pattern like (torch.nn.ReLU, torch.nn.BatchNorm3d, torch.nn.Conv3d).
For now we manually flatten this pattern (see the sketch below), but in the future we might want to use the matching function `is_match` under torch.ao.quantization.fx.match_utils
to do this matching.
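For illustration, a small hypothetical helper that performs the manual flattening mentioned above:
```python
import torch.nn as nn

def flatten_pattern(pattern):
    # (nn.ReLU, (nn.BatchNorm3d, nn.Conv3d)) -> (nn.ReLU, nn.BatchNorm3d, nn.Conv3d)
    flat = []
    for item in pattern:
        if isinstance(item, tuple):
            flat.extend(flatten_pattern(item))
        else:
            flat.append(item)
    return tuple(flat)

assert flatten_pattern((nn.ReLU, (nn.BatchNorm3d, nn.Conv3d))) == \
    (nn.ReLU, nn.BatchNorm3d, nn.Conv3d)
```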
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Imported from OSS
Reviewed By: vkuzo, andrewor14
Differential Revision: D35423788
fbshipit-source-id: a54093ccebae9c59aeee9399669ddb2c48bfb9aa
(cherry picked from commit 6a55ea8eb2740cedafb9972888fedf68e927586d)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75401
This commit removes asserts that require prepare_fx to
be run in eval mode and prepare_qat_fx to be run in training mode.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_prepare_mode
Imported from OSS
Reviewed By: vkuzo, jerryzh168
Differential Revision: D35457100
fbshipit-source-id: 13a55b13d9e389991f69c06c6a70bc51cdebba36
(cherry picked from commit fb0685e0873dc8e807da3213be403b51e8b4a687)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75317
att, similar to previous PRs, this one moves the dynamically quantized rnn ops
to backend_config_dict.
We have some temporary configs in backend_config_dict, but they will be removed soon; we want to migrate
everything to backend_config_dict so that we can enable this path for all the code in the code base, starting
from prepare and then convert. We can start this process after this PR.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35423789
fbshipit-source-id: 9391bde6f4cbceb45de4ce9aaee136c9bfde8ab7
(cherry picked from commit 909edb9f131e9ba047b49d51a6c300da77988cb3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75316
att, similar to previous PRs, this one moves the dynamically quantized rnn ops
to backend_config_dict.
Currently the dtype check is not yet enabled, so we provide the dtype_configs, but they are not really used yet;
we will enable the check a bit later, after we have moved everything to backend_config_dict.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Imported from OSS
Reviewed By: malfet
Differential Revision: D35423792
fbshipit-source-id: ef862ea1be5bfb4c28130775c3b2158df28d3e22
(cherry picked from commit 0247f3a768a2c165f482a66c4225b3357e33e966)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75460
Add a check on the number of args when checking whether an observer is in the same graph.
Test Plan:
python3 test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: malfet
Differential Revision: D35479504
fbshipit-source-id: d7dc38a27fdf8e0b236b6976d484b0701c61184c
(cherry picked from commit 45542f796f5e6f6259f3ec647dbd2a9fa69ababc)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74847
Similar to the other PRs in this stack, the main problem was
that fusion needed to detect the original module type of a parametrized
module when sparse prepare was called before fusion. In addition, there
was a potential issue with fusion before sparse_prepare but after the
sparse_config is created. However, in practice fusion moves the
references to the original modules into the fused module without issue.
Thus the original sparse_config that pointed to the original modules
gets automatically updated. If the fusion method changes this may cause
an issue since no explicit handling or updating of these pointers was
needed.
Test Plan:
python test/test_ao_sparsity.py TestComposability
Imported from OSS
Reviewed By: vkuzo, andrewor14, jerryzh168
Differential Revision: D35240273
fbshipit-source-id: 62ed66689b285c3fa68f1e149266ab877f1cdd8e
(cherry picked from commit 2adb002c43f702fa1f18637157264fcbc545002a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75314
This is a refactor to use backend_config_dict for operators with fixed quantization parameters.
The api is not final yet; we'll update it after we have moved everything to backend_config_dict.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35423790
fbshipit-source-id: a69ce19340e2e3c996f1435b887ba122de85f22f
(cherry picked from commit 5d35983a3bac4281f8636f69ffb68adb358e9a5f)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75258
att, the remaining registrations are for fp16 ops which are no longer used
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D35403588
fbshipit-source-id: fc328d42f4cb80901ed545a11fdde49ee7ff8b2e
(cherry picked from commit fbe2db090cf8d1221dd37d19636058d8dd44c728)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75241
A previous PR enabled operator.add in backend_config_dict; this
PR moves the rest of the binary ops to backend_config_dict.
There are some ops left which are not needed (previously fp16 ops); we
will move them in a following PR.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D35403589
fbshipit-source-id: 663703b310944a6b7c5ade6d07a4d938a6ca082b
(cherry picked from commit 5a76ce031872c4fed5fcab5bb3c84a9394b01118)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75202
Instead of checking the type, we use a method in the QuantizeHandler to check whether a module
is a standalone or custom module. Not user facing.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D35379641
fbshipit-source-id: c2f970c7e27f74793fa67f8fd5a16a43525e35aa
(cherry picked from commit 251500f06359c9046dd9067543cc80be24ddee33)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75135
Some operators have fixed quantization parameters, this PR adds the support to override the
qconfig in the backend_config_dict
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35334279
fbshipit-source-id: 390510bd8fc2d61004c36c54390989583e6519ce
(cherry picked from commit ccf9bcd7eb4564ec97c5e0548b8ee926f640360b)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74882
This PR adds support for ops like add/mul in backend_config_dict. These ops have a different
observation_type based on the number of tensor inputs: when the number of tensor inputs is 1,
we share the output observer with the input; otherwise we use a new observer.
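A hedged sketch of that rule (the import path and enum member names are assumptions based on the backend_config_dict snippet shown earlier in this document):
```python
from torch.ao.quantization.backend_config import ObservationType  # path assumed

def observation_type_for_binary_op(num_tensor_args: int):
    if num_tensor_args == 1:
        # e.g. torch.add(x, 2.0): share the output observer with the input
        return ObservationType.OUTPUT_SHARE_OBSERVER_WITH_INPUT
    # e.g. torch.add(x, y): insert a new observer for the output
    return ObservationType.OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT
```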
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo, andrewor14
Differential Revision: D35236032
fbshipit-source-id: 7077f3ccee8a5d8d19b40107cf8ff16cceafc535
(cherry picked from commit a6f7a37f99fc727269d022d35cc5c0157b70c656)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74846
This PR primarily allows the PTQ convert function to work with
parametrized modules. Given that the parametrized weight is what is used
by default in convert, as long as sparsifier.step() has already been
called, the converted model will use the sparsified weights. There is
currently no way to handle things if sparsifier.step() has not been
called. Lastly, we added the is_leaf_or_only_parametrized function because
parametrized modules no longer look like leaves due to the
parametrizations module attached to them.
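A rough sketch (not the actual implementation) of the idea behind such a check:
```python
import torch.nn as nn

def is_leaf_or_only_parametrized(module: nn.Module) -> bool:
    # A parametrized module gains a child named "parametrizations", so it no
    # longer looks like a leaf; treat it as one anyway if that is its only child.
    child_names = {name for name, _ in module.named_children()}
    return len(child_names) == 0 or child_names == {"parametrizations"}
```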
Test Plan:
python test/test_ao_sparsity.py TestComposability
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35240275
fbshipit-source-id: 48529f2a83edfe6d8a2d2dff8ca3d08a3fb0d553
(cherry picked from commit 9d6361482e2885db964e02b0222cd23c9f4d469e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74878
Previously we recorded the matched nodes as a list of nodes: `List[Node]`. This does not generalize
to a graph, which is needed for future use cases, so in this PR we change the recorded node to a
NodePattern instead, currently defined as
```
NodePattern = Union[Tuple[Node, Node], Tuple[Node, Tuple[Node, Node]], Any]
```
but can be more general.
This will allow us to support more general patterns with backend_config_dict api, and is also needed
for BinaryOpQuantizeHandler refactor
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35203616
fbshipit-source-id: f4bf5b056cfc0955455eea9c2bf1ac9f6dde3974
(cherry picked from commit b290c047e1861bbb62fb1bb576761e801b210220)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74845
This PR adds support for the quantization flow to detect
parametrized modules and match them using their original module types.
This mainly involved using the new type_before_parametrizations function rather than
type to check for module matching (see the sketch below).
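A minimal sketch of the difference, assuming type_before_parametrizations is exposed from torch.nn.utils.parametrize (the no-op parametrization is purely illustrative):
```python
import torch.nn as nn
from torch.nn.utils import parametrize

class NoopParametrization(nn.Module):
    def forward(self, w):
        return w

linear = nn.Linear(8, 8)
parametrize.register_parametrization(linear, "weight", NoopParametrization())

# type(linear) now reports a dynamically generated "Parametrized..." class,
# so module matching should rely on the original type instead:
assert parametrize.type_before_parametrizations(linear) is nn.Linear
```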
Test Plan:
python test/test_ao_sparsity.py TestComposability
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D35240274
fbshipit-source-id: 7294d89c9c2e069e51d8b9bafa45c15f92bed124
(cherry picked from commit ed5cdb7b636c42e040d1b4a67b6b94604d06e1ff)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75146
Previously we assumed `to` must be called with positional args, but this may not be the case;
e.g. we can do `to(dtype=?)` or `to(memory_format=?)`.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: ejguan
Differential Revision: D35342088
fbshipit-source-id: 22bfe78ae84e74141ae6560285c5c38bc068c999
(cherry picked from commit a3593c0bb658a4615559c951ee68c9a6f55074d5)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74636
This commit changes how quantization patterns for linear
and conv are set up in prepare. Previously, these were set up
through ConvReluQuantizeHandler and LinearReLUQuantizeHandler.
After this commit, however, these were set up through the
corresponding entries in the native backend_config_dict,
rendering the above quantize handlers no longer necessary.
In future commits, we will do the same for the remaining ops.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: jerryzh168, ngimel
Differential Revision: D35225680
fbshipit-source-id: 4a79f63a11fce46701eb17aaf3619c1e827d72a4
(cherry picked from commit 475f599821cd32d3ba71ba086885ecdc4cbee755)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74843
is_output_quantized is used to check whether we should quantize the op based on the dtype configuration in qconfig and what
is supported by the backend; we skip inserting an observer if the dtype configuration is not supported by the backend.
This is now supported by backend_config_dict, so we can remove this function.
Also, we previously supported fp16 static quantization for some ops for one of our internal use cases, and now it is not required, so
we can remove those entries as well.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D35190541
fbshipit-source-id: 623d961810737ec01e1f8b269ec48a6a99bb284a
(cherry picked from commit a405998c60c0146dbd5feef60e2d5cb3b0aa289c)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75047
As title. For instance, we match the two patterns
```
(add, (bn, conv), matchallnode)
(add, matchallnode, (bn, conv))
```
Against the model
```
conv1 -> bn1 |
conv2 -> bn2 + add
```
For the add node, both patterns pass `is_match` and `apply_match` is executed twice. As a result, both `conv1 -> bn1` and `conv2 -> bn2` will be matched as `(bn, conv)` instead of one as `(bn, conv)` and one as `matchallnode`.
To fix this, we stop trying the other patterns once a pattern is matched.
Test Plan: verified in D35252100
Reviewed By: jerryzh168
Differential Revision: D35300191
fbshipit-source-id: 383b2eb971d436072e1c28597c5b6a01d0f49c5a
(cherry picked from commit 89d08ea2d2840e01ec3dd40da3f58405577c78fc)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74717
Currently the weights map to 0 and max_float to 65535 due to an incorrect qmin/qmax in the qint16 customized qrange;
the expectation from the set observers is that the integer representation is a signed int16, i.e. -32768 to 32767.
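For reference, the ranges involved (a small check grounded in the summary above):
```python
# Signed int16 (expected):   qmin, qmax = -2**15, 2**15 - 1  -> (-32768, 32767)
# Unsigned 16-bit (the bug): qmin, qmax = 0, 2**16 - 1       -> (0, 65535)
signed = (-(2 ** 15), 2 ** 15 - 1)
assert signed == (-32768, 32767)
```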
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D35129924
fbshipit-source-id: 924902dd7e64c1218971422ba2451c2a484fd2f4
(cherry picked from commit 95659cdeeec7b3a01a64355244847e211c6dd2a6)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74149
mobilenet v2/v3 failed when using the NS tool to analyze the model
due to an empty tensor; fixed by filtering out the empty tensors.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34851886
fbshipit-source-id: db94fd5cef7d4a7a128d46bfe3f5ff4e532845fe
(cherry picked from commit 4616a75105abf187a178d95165249cd33345515d)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74969
We can remove the check for fp16 ops now since we confirmed that fp16 ops are not
used
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D35258695
fbshipit-source-id: 2297696493feb62a4c959e7fbdd6123f59615ef1
(cherry picked from commit a1b4658e661ce610e264e083dfa738c31859ec1a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74277
see issue: https://github.com/pytorch/pytorch/issues/74240
this fixes that issue by skipping the children of untraceable modules during
propagate_qconfig. This required extending said function to take the
prepare_custom_config_dict as an optional argument.
Test Plan:
python test/test_quantization.py
python test/test_quantization.py TestQuantizeFx.test_qat_skip_untraced
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34916074
fbshipit-source-id: 11caba2cbf78566fb51adf698b01bbba0275de28
(cherry picked from commit 5324c48e4c3277bb12a716a4408151c86006ee47)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74776
When both inputs are scalars, fx tracing will directly calculate the result instead of generating an op in the fx graph,
so num_tensor_args will always be greater than 0 for the binary ops that remain in the graph, and input_output_observed will always return True
for BinaryOpQuantizeHandler.
We will remove the input_output_observed method after dynamic quantization in qconfig is properly supported.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: albanD
Differential Revision: D35153531
fbshipit-source-id: fa777429eeb64a6a78a98f8d8dcd9e0903c8b209
(cherry picked from commit 676becb650daf29977dbfeb8307de1b19a8d9243)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74775
We have simplified the way we insert observers, for add_scalar it now behaves the same way
as general_tensor_value ops, which means we only need to keep is_general_tensor_value_op now,
the other methods can be removed
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D35153532
fbshipit-source-id: 2d17189e167a9932bdbf5ae46b3ced25b7128c2f
(cherry picked from commit 7cf7c8a522171f58954b227917e5c75cdfdddb1c)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74619
This commit is part 2 of the effort to refactor the
lowering code in _lower_to_native_backend.py. The main change
included in this commit is generalizing the pattern matching
code across different lowering functions. There should be no
change in behavior with this PR.
A future commit will further merge the static and dynamic
lowering code paths.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D35082210
fbshipit-source-id: 7f0347c9449cc9ca68fee5a807c792222f0d1749
(cherry picked from commit 16d34c13c7eb0553680713878b52ece9c8884a1f)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74601
Currently the behavior for general tensor shape ops and general tensor value ops is the same, so we can remove
this flag and merge it with the is_general_tensor_value_op flag.
The is_general_tensor_value_op flag is used in two places in prepare:
(1) dtype propagation: we only do dtype propagation when this flag is true (this will be refactored in the future to be more systematic)
(2) observer sharing: we'll use the input observer instance as the output observer for an op if this flag is True
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: george-qi
Differential Revision: D35071438
fbshipit-source-id: 5e8f5fd84e37db0433a63fe0a0e212ce3c5908d6
(cherry picked from commit b4bbc9fa0e65f3768eb97ca8e84b7cbd7e840b67)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74600
Following https://github.com/pytorch/pytorch/pull/74210, this PR adds the support for some ops
using the DefaultNodeQuantizeHandler in the backend_config_dict defintion for pytorch native backend
TODO: There is still a few ops we didn't handle with backend_config_dict path: gelu and softmax, need to discuss if we still need them, if so we can change the test
to use backend_config_dict and remove the DefaultNodeQuantizeHandler after that
Test Plan:
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D35071437
fbshipit-source-id: 70351d2810ca1ac7dc09d4a9c239f6757ccb51ca
(cherry picked from commit 5e68f755a32ba7d90d6c73db9c2017f9c58d7fa5)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74581
As title, currently the quant_min/quant_max of the FakeQuantize are not populated to the observer. We plan to populate them when they are both not None (see the sketch after this list).
To do this we need to:
1. Remove the current default quant_min/quant_max values (0/255), as they are not universal across dtypes.
2. Move the upper bound/lower bound check before creating the observer.
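A hedged sketch of the intended behavior (the class names come from torch.ao.quantization; the exact propagation mechanics are as described above):
```python
import torch
from torch.ao.quantization import FakeQuantize, MovingAverageMinMaxObserver

# When quant_min/quant_max are both provided, they should end up on the
# attached observer instead of being silently replaced by a 0/255 default.
fq = FakeQuantize(observer=MovingAverageMinMaxObserver,
                  quant_min=0, quant_max=127, dtype=torch.quint8)
assert fq.activation_post_process.quant_min == 0
assert fq.activation_post_process.quant_max == 127
```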
Test Plan:
```
[jiaxuzhu@devvm3400.frc0 /data/users/jiaxuzhu/fbsource/fbcode] buck test mode/dev //caffe2/test:quantization -- --exact 'caffe2/test:quantization - test_quant_min_max_override (quantization.core.test_workflow_module.TestFakeQuantize)'
Parsing buck files: finished in 0.8 sec
Downloaded 0/2 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 9.5 sec (100%) 18535/84579 jobs, 2/84579 updated
Total time: 10.3 sec
More details at https://www.internalfb.com/intern/buck/build/1cab97ef-0788-4d06-92ed-a828995e3bde
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 24be645e-eebc-45d6-8111-052ef1225fa0
Trace available for this run at /tmp/tpx-20220323-094106.724238-24be645e-eebc-45d6-8111-052ef1225fa0/trace.log
RemoteExecution session id: reSessionID-24be645e-eebc-45d6-8111-052ef1225fa0-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/5066549674998735
✓ ListingSuccess: caffe2/test:quantization : 483 tests discovered (20.179)
✓ Pass: caffe2/test:quantization - test_quant_min_max_override (quantization.core.test_workflow_module.TestFakeQuantize) (18.896)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/5066549674998735
```
Reviewed By: jerryzh168
Differential Revision: D34971236
fbshipit-source-id: 4407fd03116a296053256b333f7ce6d28dcc9c42
(cherry picked from commit f6980bccea802f220cc5b6dfe1bf3a3a3eef0a34)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74507
* This is the default symmetric qat qconfigs for qnnpack.
* Support for symmetric quantization is not available from other backends.
* Observers are similar to symmetric PTQ qconfigs for qnnpack.
Reviewed By: jerryzh168
Differential Revision: D34804808
fbshipit-source-id: 22c11b89242a98f54029ac195f7b984e42809164
(cherry picked from commit ea751ded1174ba2c2f061bafc81573faaf248a9a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74510
Previously we required the dequantize before a custom module to have exactly one user. This is because we were removing the dequantize node
before the custom module while transforming an observed custom module into a quantized custom module. Actually we don't need to remove it;
we can just change the input of the custom module to the quantize node instead. If the dequantize node only has one user, it will be removed
by the dead code elimination pass that was added recently.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_custom_module_class_input_has_multiple_users
Imported from OSS
Reviewed By: dzdang
Differential Revision: D35034626
fbshipit-source-id: eea9fbf9fb34c61f114c6431377be347632ce36d
(cherry picked from commit 2878085a56bc529afef5e533bc5f49079d4adc52)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74408
Removed
* should_mark_output_quantized_from_input_quantized_status
* _maybe_get_last_node_only_observer
since they were only used by the previous convert code and are no longer needed
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34984301
fbshipit-source-id: 0c46126576bd4ef633f4de530d01364e68f7ed39
(cherry picked from commit d14d094c4de308f08181920cd0611ea1bc664605)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74407
The convert method of QuantizeHandler is no longer used after the convert refactor, so this PR removes it.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34983830
fbshipit-source-id: cf9a6a19bd0ae035ba33497eecf74e98658dd5c7
(cherry picked from commit d85eb0f77513ef5f5f10543df6dec8b65b4985a3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74364
if an input is used multiple times by modules that are dynamically quantized:
```
x -- linear1
\-- linear2
```
we'll insert quantize_per_tensor_dynamic and dequantize for the input, and we have a pass
that duplicates dequantize ops for pattern matching:
```
x - quantize_per_tensor_dynamic - dequantize1 - linear1
\----- dequantize2 - linear2
```
But we also have a check in the lowering code that skips the pattern if quantize_per_tensor_dynamic is used by multiple nodes,
so the pattern is not recognized; we need to duplicate quantize_per_tensor_dynamic as well in this case
to recover both patterns:
```
x - quantize_per_tensor_dynamic1 -- dequantize1 -- linear1
\- quantize_per-tensor_dynamic2 -- dequantize2 -- linear2
```
so that they can be fused into dynamic linear:
```
x - linear_dynamic1
\-- linear_dynamic2
```
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_dynamic_linear_input_multiple_use
Imported from OSS
Reviewed By: yixin94
Differential Revision: D34952755
fbshipit-source-id: a950159fd6a661e84faf0baf1692f6783904cfb3
(cherry picked from commit 8a6896801fdd96a55476faca4ccb7ba0b0bdb058)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74231
Add a check to make sure the weighted modules we swap are actually float fused modules,
since a reference fused module, like the reference version of linear - relu, would have the same
fused type as the floating-point linear - relu (while the linear submodule has a different type).
Test Plan: phabricator diff for now, can add a test case after we know exactly what the problem is
Reviewed By: andrewor14
Differential Revision: D34888290
fbshipit-source-id: a7f53368a7c17f7d1a82afaa50d14d569b4923df
(cherry picked from commit 458dac9fdf8b4f0d786bf9c815c2f2fe8df13bb4)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74396
# New qconfig `default_symmetric_qnnpack_qconfig`
Returns a qconfig with signed activation and symmetric weights with range restrictions. Also adds per_channel variant for the same.
## Restrictions on weights
Restrictions on weights include,
1. weight zero point is force zero. and
2. weight 8-bit signed quantized value are limited to [-127, +127] excluding the value +128.
This is driven, in part, by the desire to achieve better performance by XNNPACK ops.
## qengine/backend = `qnnpack` and XNNPACK ops
Qconfig returned by this function allows us to use faster XNNPACK quantized ops for CPUs w/ said restrictions. Although we are using XNNPACK ops the qengine is still `qnnpack`, and there are no plans to introduce a new qengine for XNNPACK ops. Support to use XNNPACK ops with asymmetric (returned by get_default_qconfig()) qconfig is WIP.
## Updated EPS value:
* From PyTorch:
eps:
```
>>> import torch
>>> torch.finfo(torch.float32).eps
1.1920928955078125e-07
>>> torch.finfo(torch.float32).eps.hex()
'0x1.0000000000000p-23'
```
All scale values are float32 and `scale = max(scale, eps)`
* Requirement from XNNPACK
For both fp32 as well as rndnu requantization schema, `0x1p-32 <= requantization_scale < 256.0`
Where, requantization_scale = (input_scale * kernel_scale) / (output_scale)
* New minimum allowed scale value
With current float32 eps (=0x1p-23) as minimum, xnnpack lower bound is the problem. We haven’t observed upper bound issues so far with assuming the max scale value of 256. So focusing on the lower bound, to cover all possible cases of requantization value, conservatively, we must have the minimum possible requantization scale value such that,
```
minimum_requantization_value = xnnpack_lower_threshold
input_scale * kernel_scale / output_scale = 0x1p-32
min_scale_value * min_scale_value / max_scale_value = 0x1p-32
min_scale_value * new_eps / 256 = 0x1p-32
min_scale_value**2 = 0x1p-24
min_scale_value = 0x1p-12
```
With `scale_value >= 0x1p-12`, we should be able to avoid the lower threshold on requantization scale by xnnpack kernels.
Obviously this is very unlikely to happen. So practically, we should be able to get away with a much smaller value than `0x1p-12` as EPS, but it is not easy to choose a smaller value empirically.
* Impact on accuracy is unclear as of writing this.
Reviewed By: kimishpatel
Differential Revision: D34625300
fbshipit-source-id: 005e6757ed1185b3940b58ac55246cba8b267828
(cherry picked from commit 61ed1a2a308a1792ccbfc316153a6dc39798f02a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74198
As title: currently in the (add, X, MatchAllNode) pattern, the node matched with MatchAllNode is regarded as part of the pattern instead of as an input. As a result, possible patterns ending with that node will not be matched.
For instance, we have two patterns
1. (nn.ReLU, (torch.add, MatchAllNode, (nn.BatchNorm2d, nn.Conv2d)))
2. (nn.ReLU, (nn.BatchNorm2d, nn.Conv2d))
And we want to fuse the following model
Conv2d -> BatchNorm2d -> ReLU +
Conv2d -> BatchNorm2d ------ Add -> ReLU
The pattern in the first row cannot be matched because the end node ReLU is recorded as a MatchAllNode already.
Test Plan:
new unit test
```
[jiaxuzhu@devvm3400.frc0 /data/users/jiaxuzhu/fbsource/fbcode] buck test mode/dev //caffe2/test:quantization_fx -- --exact 'caffe2/test:quantization_fx - test_fusion_pattern_with_matchallnode (quantization.fx.test_quantize_fx.TestFuseFx)'
Parsing buck files: finished in 0.9 sec
Downloaded 0/2 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 12.6 sec (100%) 18546/84011 jobs, 2/84011 updated
Total time: 13.5 sec
More details at https://www.internalfb.com/intern/buck/build/9d2decdb-d01e-4332-84f5-1728a65d4f7b
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: d92e10b8-9209-4e9e-95a6-2fcac02db251
Trace available for this run at /tmp/tpx-20220314-161230.347672-d92e10b8-9209-4e9e-95a6-2fcac02db251/trace.log
RemoteExecution session id: reSessionID-d92e10b8-9209-4e9e-95a6-2fcac02db251-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/3377699814955263
✓ ListingSuccess: caffe2/test:quantization_fx : 365 tests discovered (19.275)
✓ Pass: caffe2/test:quantization_fx - test_fusion_pattern_with_matchallnode (quantization.fx.test_quantize_fx.TestFuseFx) (17.760)
Summary
Pass: 1
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/3377699814955263
```
Reviewed By: jerryzh168
Differential Revision: D34873730
fbshipit-source-id: dc78455c7233ba33e9ab215f50754b1656b7dbc7
(cherry picked from commit 1cc74cadd7dc725be97064f57c910ef9d1bbe1a8)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74276
Removing convert.py since we have rerouted the traffic to _convert_do_not_use, we'll do a rename in the follow up PR
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34914261
fbshipit-source-id: 09ad520d95fa91c525222a69474930efb3571088
(cherry picked from commit 8aeb33206f3572132356fe78395aa3ce6aff11cd)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74128
**Summary:** This commit is the first step towards refactoring the
lowering code in _lower_to_native_backend.py. The main changes
included in this commit are:
(1) Remove the use of the subgraph rewriter in lowering
(2) Replace the use of `is_match` with manual pattern matching
The motivation behind (2) is it simplifies the lowering code
significantly; previously we had many different but similar
patterns for slightly different models. There should be no
change in behavior with this PR.
Note that this is only part 1 of the refactoring. Part 2
will merge the static and dynamic lowering code paths
and refactor the currently duplicate pattern matching /
cleanup code into common helper functions.
**Test Plan:**
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
**Reviewers:** jerryzh168
**Subscribers:** jerryzh168
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D34910597
Pulled By: andrewor14
fbshipit-source-id: c6fea0c538ce5efc5afaf53e072922528988dda7
(cherry picked from commit fa05cb9fc0909fe6e199a6b50ea2001c9e9ac0ee)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73274
As noticed in https://discuss.pytorch.org/t/calibration-of-model-in-post-training-static-quantization-using-fx-api/143661/6
and related to https://github.com/pytorch/pytorch/issues/72698, when using fx quantization, if an op like view was used in a
model and the index parameters were passed to the op via a
variable rather than
hard coded, fx would mistakenly insert observers for them, leading to an
error when the observer tried to do tensor-only operations on a
non-tensor. To fix this, an API was added to specify non-tensor
arguments for various ops to enable better dtype propagation.
NON_TENSOR_ARG_DICT is a nested dict whose first key is a named tuple
which contains matching parameters for ops with nontensor args, the
inner dict's keys are dtypes and the values are a list of those arg indices that
take use such dtypes. Alternatively, instead of a list, the inner dict
value can also be a function that takes the node as an argument and
returns the list of arg indices.
Theoretically this api can support arbitrary functions, but the current
implementation is limited to simpler functions, given that the particular
issue this fixes seems to be rare.
Note: although torch.unsqueeze and torch.transpose are listed in
quantization_patterns.py, those ops appear to be untraceable by fx. I've
included tests for their cases but fixing this issue is beyond the scope
of this PR
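To make the shape of that API concrete, here is an illustrative (not verbatim) sketch of what entries in such a dict can look like; the actual keys and named-tuple fields live in quantization_patterns.py and may differ:
```python
from collections import namedtuple

import torch

# Named tuple used as the outer key to match ops that take non-tensor args.
NonTensorArgPattern = namedtuple("NonTensorArgPattern", ["op", "target"])

NON_TENSOR_ARG_DICT = {
    # x.view(a, b, ...): every positional arg after the tensor is an int size,
    # so those args must never receive observers.
    NonTensorArgPattern(op="call_method", target="view"): {
        torch.int: lambda node: list(range(1, len(node.args))),
    },
    # x.transpose(dim0, dim1): args 1 and 2 are int dims.
    NonTensorArgPattern(op="call_method", target="transpose"): {
        torch.int: [1, 2],
    },
}
```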
Test Plan:
python test/test_quantization.py test_non_reference_size
...
python test/test_quantization.py test_non_reference_<op>
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D34410122
fbshipit-source-id: fc09949ca8a2d6473876a4b6c214eb91e9a9dae2
(cherry picked from commit 3a1375d677b7c98d62b1f5c839645698c39b32b9)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74210
This PR added a codepath for getting patterns (quantize handlers) for the backend_config_dict for native backend when
backend_config_dict is None. This would allow us to incrementally define the backend_config_dict for
pytorch native backend and gradually remove the entries in quantization_patterns.py
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: dzdang
Differential Revision: D34899783
fbshipit-source-id: 7f31292948d7fc4566e51e175b41511f52d0a880
(cherry picked from commit a9f6ebd6478f362d5bb9c5ae04e02369e00f550c)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74229
Previously we did not successfully remove the dequantize node for `dict`; this PR fixes that. It is tested with
meta-only tests right now, but we should follow up with oss tests (with dict output).
Since we call the dead code elimination pass, some of the inplace operators are removed in TestQuantizeFx.test_fixed_qparams_ops,
so in this PR we also removed the calls to the inplace ops and changed the expected results in the test case.
In a future PR we can remove the support for inplace operators, since it is not really supported in fx, and it's OK
for us to skip them as well.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34888140
fbshipit-source-id: 48cea842b49e52baa8eee3ce0f4bfb4a3625ab2a
(cherry picked from commit ef790315ebcf954930deb6b9d1c384992c1f1ec8)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74140
Previously we did not successfully remove the dequantize node for `dict`; this PR fixes that. It is tested with
meta-only tests right now, but we should follow up with OSS tests (with dict output).
Reviewed By: andrewor14
Differential Revision: D34846005
fbshipit-source-id: 4313ed6adff425d73ad19aabedde1200a98f1915
(cherry picked from commit 682abe9ecbd42c4ac1b41891bbc3b79ad522b78a)
Summary:
This PR adds a new quantization backend, ONEDNN, with quantized conv and linear kernels in the same code path as the FBGEMM backend.
The ONEDNN backend is an alternative to the FBGEMM and QNNPACK backends. It takes advantage of features of the latest Intel® CPU products: it supports VNNI on Cascade Lake as well as the AMX instruction set, to be available on Sapphire Rapids, which offers 8X the int8 peak TOPS of VNNI.
ONEDNN demonstrates better performance than FBGEMM on the conv kernels of popular CNN models. It also supports more fused ops, such as convolution-add-ReLU, than FBGEMM and QNNPACK.
To use this backend, users only need to set the quantization engine to 'onednn' before any calculation; no changes to models are required.
```python
torch.backends.quantized.engine = 'onednn'
```
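For context, a minimal eager-mode post-training static quantization flow with the ONEDNN engine might look like the sketch below (not part of this PR; `model_fp32` and `calib_loader` are assumed to exist, the model is assumed to contain QuantStub/DeQuantStub where needed, and an 'onednn' default qconfig may only be available in newer releases):
```python
import torch
from torch.ao.quantization import get_default_qconfig, prepare, convert

torch.backends.quantized.engine = 'onednn'           # select the ONEDNN kernels
model_fp32.eval()
model_fp32.qconfig = get_default_qconfig('onednn')   # may require a recent release
prepared = prepare(model_fp32)                       # insert observers
for data, _ in calib_loader:                         # calibrate on sample data
    prepared(data)
model_int8 = convert(prepared)                       # swap to quantized modules
```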
## Design docs
- https://github.com/pytorch/pytorch/issues/21120#issuecomment-562371983
- https://github.com/pytorch/pytorch/pull/67177#issuecomment-963787096
## File changes
**Add ONEDNN to qengine list**
- aten/src/ATen/Context.cpp
- c10/core/QEngine.h
- torch/ao/quantization/qconfig.py
- torch/backends/quantized/\_\_init\_\_.py
**Implement qconv & qlinear for ONEDNN backend**
- aten/src/ATen/native/quantized/cpu/conv_serialization.h
- aten/src/ATen/native/quantized/cpu/fbgemm_utils.cpp
- aten/src/ATen/native/quantized/cpu/onednn_utils.h
- aten/src/ATen/native/quantized/cpu/qconv.cpp
- aten/src/ATen/native/quantized/cpu/qconv_dynamic.cpp
- aten/src/ATen/native/quantized/cpu/qconv_prepack.cpp
- aten/src/ATen/native/quantized/cpu/qconv_unpack.cpp
- aten/src/ATen/native/quantized/cpu/qlinear.cpp
- aten/src/ATen/native/quantized/cpu/qlinear_dynamic.cpp
- aten/src/ATen/native/quantized/cpu/qlinear_prepack.cpp
- aten/src/ATen/native/quantized/cpu/qlinear_unpack.cpp
**Skip tests that are not supported by ONEDNN**
- test/ao/sparsity/test_kernels.py
- test/quantization/core/test_quantized_module.py
- test/quantization/core/test_quantized_op.py
## Validation results
This PR has passed `test_quantization.py` and `test_mkldnn.py`.
Below are performance data of int8 2d convolution and linear on the Cascade Lake Xeon® platform:
(Note: Tested with single instance on single core. Using the latest oneDNN library.)
**Table 1. Performance comparison of int8 2d convolution operator**
|No.| Shape| FBGEMM| ONEDNN| Gain|
|-|-|-|-|-|
|1| IC=128, OC=128, kernel=3, stride=1, N=4, H=32, W=32, G=1, pad=0| 668.310us| 535.630us| 24.8%|
|2| IC=128, OC=128, kernel=3, stride=2, N=4, H=32, W=32, G=1, pad=0| 290.630us| 281.810us| 3.1%|
|3| IC=128, OC=256, kernel=3, stride=1, N=4, H=32, W=32, G=1, pad=0| 1.045ms| 893.010us| 17.0%|
|4| IC=128, OC=256, kernel=3, stride=2, N=4, H=32, W=32, G=1, pad=0| 385.320us| 373.720us| 3.1%|
|5| IC=256, OC=256, kernel=3, stride=1, N=4, H=32, W=32, G=1, pad=0| 1.876ms| 1.641ms| 14.3%|
|6| IC=256, OC=256, kernel=3, stride=2, N=4, H=32, W=32, G=1, pad=0| 660.460us| 638.470us| 3.4%|
**Table 2. Performance comparison of int8 linear operator**
|No.| Shape (m, n, k)| FBGEMM| ONEDNN| Gap|
|-|-|-|-|-|
|1| 64, 800, 320| 80.550us| 96.770us| 20.10%|
|2| 64, 768, 512| 101.230us| 130.720us| 29.10%|
|3| 16, 256, 512| 30.230us| 51.450us| 70.20%|
|4| 128, 128, 128| 33.810us| 50.480us| 49.30%|
|5| 256, 512, 256| 154.490us| 195.050us| 26.30%|
|6| 1024, 1024, 1024| 3.134ms| 3.514ms| 12.10%|
ONEDNN showed advantages over FBGEMM for convolution. However, it has a performance gap relative to FBGEMM for linear ops. The gap is a known issue, and further optimization is in progress in the oneDNN library. On the latest platforms, ONEDNN achieves better performance for both conv and linear.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69820
Reviewed By: HDCharles
Differential Revision: D33716039
Pulled By: jerryzh168
fbshipit-source-id: 6f7bb807e85798142dfcffccfca8b8bd652fb3dd
(cherry picked from commit 91526b373560f42ba0ad307f9cccfc0eb5218b1f)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73863
This PR fully aligns the convert function with the design: https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
and simplifies the implementation of the convert function by always producing a reference quantized model (with reference patterns) first,
and then lowering the model to a quantized model that is runnable with the PyTorch native backend (fbgemm/qnnpack).
This PR makes convert.py much easier to understand than the previous implementation, and we are able to remove the majority of the code
in quantization_patterns.py as well (in follow-up PRs).
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
python test/test_quantization.py TestFXNumericSuiteCoreAPIsModels
```
and other internal/oss regression tests
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34778506
fbshipit-source-id: 0678b66addf736039a8749b352f6f569caca962b
(cherry picked from commit 33ec9caf23f3ab373d827117efbd9db0668b2437)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73947
The original implementation of memoryless observers used MinMaxObservers and
a `memoryless` argument to alter the observer's behavior so that it wouldn't
keep track of previously observed mins and maxes. It was later pointed
out that this is equivalent to a MovingAverage observer with averaging_constant=1,
which requires less overhead and no one-off args (memoryless). This PR therefore refactors
away the memoryless arg and uses MovingAverage observers instead. Although the memoryless
adjective is still used, a complete definition was also added to clarify error
messages given these changes.
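For reference, a small hedged sketch (not from this PR) of the equivalence described above: with averaging_constant=1 the moving-average observer keeps only the statistics of the latest batch.
```python
import torch
from torch.ao.quantization.observer import MovingAverageMinMaxObserver

obs = MovingAverageMinMaxObserver(averaging_constant=1.0)
obs(torch.randn(100) * 10)   # a wide-range batch
obs(torch.randn(100))        # with averaging_constant=1, stats now reflect only this batch
print(obs.min_val, obs.max_val)
```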
Test Plan:
python test/test_quantization.py TestQuantizeEagerQAT
python test/test_quantization.py TestObserver
Test Plan: Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34732080
Pulled By: HDCharles
fbshipit-source-id: 227a1ab29d18adae55093a684ea35ac34523d07a
(cherry picked from commit 5238e70e8f90f3219c36f9c64b647951dcf64b5a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73509
This adds functionality to lower reference models
involving the Linear-Bn1d pattern in FX QAT mode. This follows
https://github.com/pytorch/pytorch/pull/72431 and https://github.com/pytorch/pytorch/pull/72796, which add Linear-Bn1d fusion functionality
to eager QAT mode.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_module
Imported from OSS
Reviewed By: dagitses
Differential Revision: D34591251
fbshipit-source-id: 39144485f9954ee1830c8b414e724560fd7e47bf
(cherry picked from commit b97a39b4d9df00e045fab4c01eca88e562ca2c02)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73785
The conversion from fp32 to fp16 is easily defined; we just did not
have it in the NS code yet. This PR adds it. This is needed for some customer models.
Test Plan:
```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs.test_fp16_shadows_fp32
```
Reviewed By: jerryzh168
Differential Revision: D34642873
Pulled By: vkuzo
fbshipit-source-id: 9df505b1ea3f3d3cdb3a5f2409ef3a66f40b7eff
(cherry picked from commit 679cd8a5e24b1cfd7f871dcba3ce8a90de980556)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73777
As titled, this is to prepare for the migration of the current convert to this function.
Test Plan:
Regression tests to make sure the refactor doesn't break anything.
Internal only, since the TensorRT tests were moved to a separate repo.
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34636000
fbshipit-source-id: 9850904e3b834345abbeedc8bccaf107397db59d
(cherry picked from commit a8c87d4592237c247989e7419bb165c96b8e90db)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73493
This PR enables basic support for reference modules in DBR quant.
For now, the support is limited to:
1. modules that have reference versions defined only (no functions)
2. torch.qint32 dtype only
Currently, the reference module logic is enabled whenever dtype is
torch.qint32. This is done because torch.qint32 support is needed earliest for
the first use case. A future PR will support more dtypes and also
add the `is_reference` flag to the API.
Test Plan:
```
python test/test_quantization.py TestQuantizeDBR.test_conv_int32_reference_model
```
Reviewed By: jerryzh168
Differential Revision: D34520759
Pulled By: vkuzo
fbshipit-source-id: 363db715315c5c7c20962a1818330ce288948778
(cherry picked from commit 6ccdfe2889c252211f191edc49f4147f66e803a4)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73492
Before this PR, DBR quant reused the Eager mode quantization machinery
to insert activation observers. This was done for speed of developing the
prototype. A drawback of this is that the activation observers are not
present in DBR's data structures and live on the modules instead.
This PR refactors DBR quant to stop using Eager mode quantization
observer insertion for activations, and instead create and track the
activation observers in DBR's data structures. This has a couple of benefits:
1. activation observers are now created the same way in DBR for modules and functions
2. we can remove some technical debt due to fixing (1)
3. this will make it easier to support reference modules in a future PR
The reason (3) is true is that the current design of reference modules
assumes the activation observer lives in the framework (as in FX
graph mode quantization). This PR starts to adhere to that assumption.
Test Plan:
```
python test/test_quantization.py -k DBR
```
Reviewed By: jerryzh168
Differential Revision: D34520758
Pulled By: vkuzo
fbshipit-source-id: 2f6448dce021024cb2fa112d8691c94128c43123
(cherry picked from commit cfc1a0eaf6579cea2c710c1c2b4c86d28ee799eb)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73745
Mark output logger as impure, which will help prevent it and the shadow ops from being removed in acc tracer.
Test Plan: Tested in N1611591
Reviewed By: jerryzh168
Differential Revision: D34616990
fbshipit-source-id: ccc93e30f9cbf3eb69f49fc2d0f02fd4d083c507
(cherry picked from commit e40fcbd1bc543eb64fa692776c34f26e2a0a05ff)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71230
DBR quantization uses `torch.Tensor.as_subclass` frequently. When
the quantized model is traced with `torch.jit.trace`, these calls appear
in the resulting graph as `aten::alias`. This PR adds a pass to remove
these calls from the graph, for two reasons:
1. ease of debugging (these calls do nothing)
2. less work for downstream passes (for example, converting to ONNX currently breaks if these alias calls are present)
For now, we have to inline the graph in order for `aliasDb` to determine
safety properly. In the future, we may choose to relax this if there is
a need for it.
Test Plan:
Test plan is pretty basic for now, it can be improved in future PRs.
```
python test/test_quantization.py TestQuantizeDBR.test_jit_tracing_removes_aliases
```
Reviewed By: eellison
Differential Revision: D33552387
Pulled By: vkuzo
fbshipit-source-id: 681a33ddfff394a91e971263ac593afd93c5ea78
(cherry picked from commit 0f8412725d0c6fd9ef1072a50d4203465aa5d1f9)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73671
QuantWrapper did not correctly apply qconfig to the dequant.
Therefore, if the user first applied qconfig to their module and
then wrapped it with `QuantWrapper`, the dequant would not get
swapped during the convert step.
The fix is to properly apply the qconfig to the dequant.
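A hedged illustration of the workflow described above (not the PR's test code):
```python
import torch.nn as nn
from torch.ao.quantization import QuantWrapper, get_default_qconfig

m = nn.Linear(4, 4)
m.qconfig = get_default_qconfig('fbgemm')  # user applies the qconfig first...
wrapped = QuantWrapper(m)                  # ...then wraps; with the fix, wrapped.dequant
                                           # also carries the qconfig, so convert() swaps it
```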
Test Plan:
```
python test/test_quantization.py TestQuantizeEagerPTQStatic.test_quantwrapper_attaches_qconfig_to_dequant
```
Reviewed By: MaigoAkisame
Differential Revision: D34585260
Pulled By: vkuzo
fbshipit-source-id: 82055a9fa7fc13a714fe460deb461c2e87e76b39
(cherry picked from commit c9f392333dd1c005d893bdc2fbafe8a82b317c88)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73572
Previously we couldn't specify how to get extra inputs for fused ops in backend_config_dict.
For example, for patterns like:
(torch.add, (nn.BatchNorm2d, nn.Conv2d), MatchAllNode)
where nn.Conv2d is the root node, the extra MatchAllNode (the input of the original torch.add) would be lost.
This PR added an "extra_inputs_getter" key to the backend_config_dict, which allows the user to provide a function
that returns a list of extra input nodes for the fused op given the matched node pattern. In this case,
we need a function that returns the node that matched `MatchAllNode`; it would be something like the following:
```
def extra_inputs_getter(pattern):
    add, conv_bn, extra_input = pattern
    return [extra_input]
```
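Below is a hedged sketch of how this getter might be attached to a pattern entry in backend_config_dict (the surrounding keys and the MatchAllNode import path are illustrative and may differ across releases):
```python
import torch
import torch.nn as nn
from torch.ao.quantization.utils import MatchAllNode  # import path may vary by release

fused_add_config = {
    "pattern": (torch.add, (nn.BatchNorm2d, nn.Conv2d), MatchAllNode),
    "extra_inputs_getter": extra_inputs_getter,  # the function defined above
    # ... fuser_method / dtype configuration as in other entries ...
}
```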
Test Plan:
python test/test_quantization.py TestFuseFx.test_fusion_pattern_with_multiple_inputs
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34553210
fbshipit-source-id: 748f8ce20974438458a39dbe9eae75281156c227
(cherry picked from commit be748526480e811874dbca64b1cf3bf4950f0393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72717
This will be renamed to WeightedQuantizedModule to
minimize confusion with reference modules.
Test Plan:
python test/test_quantization.py TestQuantizeFx
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D34172554
fbshipit-source-id: 4cd77d6048fde4875218386f7e55f864a73d5bd3
(cherry picked from commit b7af4cedb4275b6f9c06c0773f2997bc4e61578a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73470
As titled, this does not affect user APIs since we are only exposing fuse_fx as a public API.
Test Plan:
python test/test_quantization.py TestFuseFx
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D34495260
fbshipit-source-id: 3aa253bc7190e50acc7229186f210901ebc5481b
(cherry picked from commit a88517ff6feff7abbece2234d82fd53e33702237)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73233
This PR makes CopyNodeQuantizeHandler always produce reference patterns, and we have
custom lowering passes that rewrite the reference quantized patterns to quantized ops.
Lowering passes have been implemented previously; we just need to enable the reference path here
and clean up the previous code to allowlist some of the ops (`check_node`).
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: mrshenli
Differential Revision: D34469446
fbshipit-source-id: b9d9c5f793fbb735839199056c197ae98969cc4b
(cherry picked from commit af0cf4e79e11e7343d57e6ff7766c80e72ec60f3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73345
For complex patterns we need to identify which node is the root, so that we can eliminate all other nodes and only preserve the root.
E.g. for (torch.add, MatchAllNode, (torch.nn.ReLU, torch.nn.Conv2d)), we can preserve the torch.nn.Conv2d as the root node and remove the other nodes.
Previously we assumed the root_node of a pattern is the "last node" of the pattern, computed by:
```
def default_root_node_getter(node_pattern):
    while not isinstance(node_pattern[-1], Node):
        node_pattern = node_pattern[-1]
    return node_pattern[-1]
```
This PR lets users configure their own root_node_getter, which means we can define the root node for patterns like:
(torch.add, (torch.nn.ReLU, torch.nn.Conv2d), MatchAllNode)
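For instance, a user-defined getter for that pattern could return the Conv2d node (a hedged sketch, not the PR's test code):
```python
def my_root_node_getter(node_pattern):
    # pattern: (torch.add, (torch.nn.ReLU, torch.nn.Conv2d), MatchAllNode)
    add, relu_conv, _extra_input = node_pattern
    relu, conv = relu_conv
    return conv  # treat the Conv2d node as the root of the fused pattern
```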
Test Plan:
python test/test_quantize_fx.py TestFuseFx.test_root_node_getter
Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision: D34442193
fbshipit-source-id: 2f6da69a5b6527b49710ae32820e8e2915d9af37
(cherry picked from commit 8b49bf0d7d53cdcf2c9f40f8e25bc843e8814026)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72735
We use `get_matched_types` to get the (type) pattern from the matched modules,
and we need to use MatchAllNode instead of type(MatchAllNode) to query the fuser_method for the pattern.
Test Plan:
TODO
Imported from OSS
Reviewed By: raghuramank10000
Differential Revision: D34180705
fbshipit-source-id: db9b6e791a9f26b70079fddc95fce033052199ab
(cherry picked from commit 01d38afabcb1bfc207dee7d49ee13df500d32fdf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73344
Not user facing as of now, since we haven't advertised the backend_config_dict API.
We need this in fuser_method_mapping.py to avoid a circular dependency.
Test Plan:
python test/test_quantization.py TestQuantizeFx
Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision: D34441778
fbshipit-source-id: 7a01c359e4b21e9e98345dc7781f735628209a20
(cherry picked from commit 758537094c5a98a17a8825b3f240c8d5acdd72b0)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72953
This PR makes BinaryOpQuantizeHandler always produce reference patterns, and we have
custom lowering passes that rewrite the reference quantized patterns to quantized ops.
It includes rewrites for
torch.ops.quantized.add, torch.ops.quantized.mul, and torch.ops.quantized.matmul.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: gchanan
Differential Revision: D34292408
fbshipit-source-id: 9872a5098249bc77db15e9fb614416958e62b9b2
(cherry picked from commit dbdc61ee8b5dde2e54a34a370a3af887e5117398)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72934
Before this PR, DBR quantization had a limitation on handling user
code which iterates over all module children. For example, imagine
a forward function such as
```
def forward(self, x):
for module in self:
x = module(x)
return x
```
Before this PR, this code would break with DBR quantization, because
we attach `AutoQuantizationState` objects to each child, and those
objects live in the child's module hierarchy and will appear in
these kinds of iterations, changing the meaning of the user program.
This PR reduces the scope of this problem to just the top level module.
Instead of attaching `AutoQuantizationState` objects to each child,
we register them in a map on the parent. Here is a before and after:
```
// toy model
model
|--> child1
// toy model with AutoQuantizationState objects, before this PR
model
|--> child1
| |--> _auto_quant_state
|--> _auto_quant_state
// toy model with AutoQuantizationState objects, after this PR
model
|--> child1
|--> _fqn_to_auto_quant_state_map
|--> ( ) --> _auto_quant_state // of `model`
|--> (child1) --> _auto_quant_state // of `model.child1`
```
Note: `child1._auto_quant_state` works as before for convenience,
but the `child1` object now stores a soft link to its `_auto_quant_state`
instead of properly registering it in its module hierarchy. This is
somewhat hacky. If we need to improve this in the future, we could
remove this soft link and refactor the code to call the FQN map
instead.
Note: if the top level module iterates over its children, things will
still be broken. This is less likely, and we will recommend that the
user work around this by wrapping their model, or checking for the
`AutoQuantizationStateModuleDict` type in their iteration loop.
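A hedged sketch of that second workaround (the type name is taken from the description above; exact details may differ):
```python
def forward(self, x):
    for module in self:
        # skip DBR quantization bookkeeping when iterating over children
        if type(module).__name__ == 'AutoQuantizationStateModuleDict':
            continue
        x = module(x)
    return x
```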
The impact of this change should be an improvement of coverage
of user models. In fact, we expect this to drive our coverage of
torchbenchmark models from 89% to 100%.
Test Plan:
```
// previously disabled test cases with user code iterating
// over module children are now enabled, with wrappers
python test/test_quantization.py -k test_module_calls_items
python test/test_quantization.py -k test_vovnet_sequential
```
Reviewed By: dzdang
Differential Revision: D34281074
Pulled By: vkuzo
fbshipit-source-id: 0e25fc1ec529c47f72478a1875fe43219feac6b1
(cherry picked from commit 4008f89967)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72431
Adds support for a fused QAT observed module for `Linear` followed by
`BatchNorm1d`. In this PR, only support for the prepared module, with
fake_quants in the right places, is added.
A future PR will add support for `convert`, and tests for eager and FX
graph mode workflows.
Similar to conv-bn, we rescale the weight before applying the fake
quant, and undo the rescaling after the linear operation.
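A simplified, hedged sketch of that rescaling trick (mirroring the conv-bn QAT approach; not the PR's actual implementation):
```python
import torch
import torch.nn.functional as F

def qat_linear_bn1d_forward(x, linear, bn, weight_fake_quant):
    running_std = torch.sqrt(bn.running_var + bn.eps)
    scale = bn.weight / running_std
    # fold the BN scale into the weight before fake-quantizing it
    scaled_weight = weight_fake_quant(linear.weight * scale.reshape(-1, 1))
    out = F.linear(x, scaled_weight)
    out = out / scale              # undo the rescaling after the linear op
    if linear.bias is not None:
        out = out + linear.bias
    return bn(out)                 # BatchNorm1d then runs as usual
```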
Test Plan:
```
python test/test_quantization.py TestQuantizeEagerQATNumerics.test_linear_bn
```
Imported from OSS
Reviewed By: jerryzh168, raghuramank10000
Differential Revision: D34044427
fbshipit-source-id: 47a519173939ca4824d2c6e6ea7a599764a8ed10
(cherry picked from commit bfc75fe078)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72490
This is an effort to move the current implementation towards the reference quantized model design:
https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md
so that we use the reference model in the default fbgemm/qnnpack path.
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps.test_qbatch_norm
Imported from OSS
Reviewed By: vkuzo, andrewor14
Differential Revision: D34062365
fbshipit-source-id: ed015c61f5b969554a6477f92cf6be2358cb558c
(cherry picked from commit 9498421ddd)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72444
In https://github.com/pytorch/pytorch/pull/71783 support was added for
quantized matmul.
In this PR, the FX graph mode quantization workflow support for this
operator is added, for int8 dtypes.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_qmatmul
```
Imported from OSS
Reviewed By: andrewor14
Differential Revision: D34047310
fbshipit-source-id: 781219047419ce621a4deb46ea04881818bf4209
(cherry picked from commit 7e039fa3a1)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71795
This commit expands the API coverage of functional conv
ops in DBR quantization from F.conv2d to all conv variants.
Test Plan:
python test/test_quantization.py TestQuantizeDBRIndividualOps.test_conv_functional
Imported from OSS
Reviewed By: albanD
Differential Revision: D33907099
fbshipit-source-id: f459c219482822f64c7c9d22cd316c6e9ef44405
(cherry picked from commit acf4548e8d)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71781
The previous PR added information about fusions found in the subgraphs.
This PR uses that information for:
1. inserting observers at the end of fusions and not in the middle
2. during inference, replacing the original op with the fused op. The
way this is implemented is that the base op is replaced with the fused op,
and all other ops are replaced with identity functions.
Test Plan:
```
python test/test_quantization.py TestQuantizeDBR.test_fusion_functions
```
Reviewed By: jerryzh168
Differential Revision: D33775097
Pulled By: vkuzo
fbshipit-source-id: 12249b85b2f7ba7545a54872aeb5f1ff2fc928cf
(cherry picked from commit 0db4324ea9)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71780
Adds support for matching operator.add -> torch.relu in FX graph
mode quantization.
It would be nice to support torch.relu better in general, but
saving that for a future PR to keep PRs small.
This is useful for DBR quant because we have some test cases in DBR
quant which use add-relu, and we'd like to match them to FX.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_add_relu
python test/test_quantization.py TestQuantizeFxOps.test_mul_relu
```
Reviewed By: jerryzh168
Differential Revision: D33775096
Pulled By: vkuzo
fbshipit-source-id: 889d9b41d3758ecbbb6d7eab67f64ce3d4892d24
(cherry picked from commit c1f9f38ca1)