pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
vasiliy	dc70e8175f	Add various uninterpreted bit tensor data types (try 2) (#95860 ) Summary: This is a retry of https://github.com/pytorch/pytorch/pull/94992 which was reverted due to CI issues. This PR adds a set of uninterpreted data types to PyTorch which can be used to implement experimental functionality out of core (think fp8, int4, int16 quant, etc). @bypass-github-export-checks Test Plan: ``` python test/test_quantization.py -k TestBits ``` Reviewers: Subscribers: Tasks: Tags: Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/95860 Approved by: https://github.com/atalman	2023-03-04 03:35:59 +00:00
PyTorch MergeBot	3bafecf719	Revert "Add various uninterpreted bit tensor data types (#94992 )" This reverts commit `9dbfca7840`. Reverted https://github.com/pytorch/pytorch/pull/94992 on behalf of https://github.com/atalman due to breaks libtorch windows nightly builds see: https://github.com/pytorch/pytorch/pull/95406	2023-02-23 23:54:23 +00:00
vasiliy	9dbfca7840	Add various uninterpreted bit tensor data types (#94992 ) Summary: This PR adds a set of uninterpreted data types to PyTorch which can be used to implement experimental functionality out of core (think fp8, int4, int16 quant, etc). Note: this is a copy-paste of https://github.com/pytorch/pytorch/pull/89990 with a bug fix for clang9; it was easier just to put up another PR since I'm not sure how commandeering works with Meta-only changes. @bypass-github-export-checks Test Plan: ``` python test/test_quantization.py -k TestBits ``` Reviewers: Subscribers: Tasks: Tags: Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/94992 Approved by: https://github.com/angelayi	2023-02-18 00:04:30 +00:00
Jerry Zhang	8fa66a6337	[quant][pt2e] Add a test to confirm we can set qconfig according to module_name (#91977 ) Summary: att Test Plan: python test/test_quantization.py TestQuantizePT2E.test_qconfig_none Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/91977 Approved by: https://github.com/jcaip	2023-01-12 21:59:02 +00:00
Jerry Zhang	f7b384cc46	[reland][quant][pt2e] Add early prototype top level quantize_pt2e APIs (#91035 ) Summary: This PR introduces the top level APIs for quantization support in PyTorch 2.0 Export stack * torch.ao.quantization.quantize_pt2e.prepare_pt2e Takes a model that is captured by the PyTorch 2.0 export (torchdynamo full graph mode) and prepares the model for calibration for post training quantization * torch.ao.quantization.quantize_pt2e.convert_pt2e Takes a calibrated model and converts that to a reference quantized model that can be lowered later to quantized operator libraries or delegation modules Also added a backend config for the qnnpack_pt2e backend: * torch.ao.quantization.backend_config.get_qnnpack_pt2e_backend_config Note: everything related to quantize_pt2e is experimental (prototype), and we don't have any bc guarantees Test Plan: python test/test_quantization.py TestQuantizePT2EModels Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/91035 Approved by: https://github.com/HDCharles	2022-12-17 02:15:53 +00:00
PyTorch MergeBot	ad1b04c4a9	Revert "[reland][quant][pt2e] Add early prototype top level quantize_pt2e APIs (#90971 )" This reverts commit `7dd5e55497`. Reverted https://github.com/pytorch/pytorch/pull/90971 on behalf of https://github.com/ezyang due to still broke tons of master jobs sorry	2022-12-16 09:29:39 +00:00
Jerry Zhang	7dd5e55497	[reland][quant][pt2e] Add early prototype top level quantize_pt2e APIs (#90971 ) Summary: This PR introduces the top level APIs for quantization support in PyTorch 2.0 Export stack * torch.ao.quantization.quantize_pt2e.prepare_pt2e Takes a model that is captured by the PyTorch 2.0 export (torchdynamo full graph mode) and prepares the model for calibration for post training quantization * torch.ao.quantization.quantize_pt2e.convert_pt2e Takes a calibrated model and converts that to a reference quantized model that can be lowered later to quantized operator libraries or delegation modules Also added a backend config for the qnnpack_pt2e backend: * torch.ao.quantization.backend_config.get_qnnpack_pt2e_backend_config Note: everything related to quantize_pt2e is experimental (prototype), and we don't have any bc guarantees Test Plan: python test/test_quantization.py TestQuantizePT2EModels Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/90971 Approved by: https://github.com/HDCharles	2022-12-16 06:24:28 +00:00
PyTorch MergeBot	9c912c7dd0	Revert "[quant][pt2e] Add early prototype top level quantize_pt2e APIs (#90802 )" This reverts commit `a66af1feba`. Reverted https://github.com/pytorch/pytorch/pull/90802 on behalf of https://github.com/malfet due to somehow broke test_resnet18 (quantization.fx.test_quantize_pt2e.TestQuantizePT2EModels), see `a66af1feba`	2022-12-15 23:28:21 +00:00
Jerry Zhang	a66af1feba	[quant][pt2e] Add early prototype top level quantize_pt2e APIs (#90802 ) Summary: This PR introduces the top level APIs for quantization support in PyTorch 2.0 Export stack * torch.ao.quantization.quantize_pt2e.prepare_pt2e Takes a model that is captured by the PyTorch 2.0 export (torchdynamo full graph mode) and prepares the model for calibration for post training quantization * torch.ao.quantization.quantize_pt2e.convert_pt2e Takes a calibrated model and converts that to a reference quantized model that can be lowered later to quantized operator libraries or delegation modules Also added a backend config for the qnnpack_pt2e backend: * torch.ao.quantization.backend_config.get_qnnpack_pt2e_backend_config Note: everything related to quantize_pt2e is experimental (prototype), and we don't have any bc guarantees Test Plan: python test/test_quantization.py TestQuantizePT2EModels Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/90802 Approved by: https://github.com/qihqi	2022-12-15 21:50:29 +00:00
AllenTiTaiWang	bdb14238ec	[Reland][ONNX] Move all torch.onnx.export related tests to test/onnx (#87292 ) Moving torch.onnx.export related tests to test/onnx integrates ONNX tests to the same CI machine, so the testing environment can be better managed. Fixes https://github.com/pytorch/pytorch/issues/87320 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87292 Approved by: https://github.com/thiagocrepaldi, https://github.com/BowenBao, https://github.com/kit1980, https://github.com/malfet	2022-11-01 14:22:46 +00:00
Vasiliy Kuznetsov	237316aa1d	PNP: early FX numeric suite tool to quantize each layer N times (#80521 ) Summary: This PR is an early prototype of a tool to quantize each layer of a model N times, with N qconfigs each. We follow the design agreed upon in https://fburl.com/gdoc/e1gaq3ih . Current API: ``` m = M().eval() example_input = (torch.randn(2, 2),) qconfig_mappings = [ QConfigMapping().set_global(torch.quantization.default_qconfig), QConfigMapping().set_global(torch.quantization.default_dynamic_qconfig), ] backend_config = get_native_backend_config() msp = prepare_n_shadows_model( m, example_input, qconfig_mappings, backend_config) for _ in range(2): msp(example_input) msq = convert_n_shadows_model(msp) msq(example_input) results = extract_results_n_shadows_model(msq) print_comparisons_n_shadows_model(results) // example output subgraph_idx ref_node_name best_idx 1 2 -------------- --------------- ---------- ------- ------- subgraph_0 fc1 2 42.0834 42.6279 subgraph_1 fc2 2 43.7259 50.0593 ``` Test plan: ``` python test/test_quantization.py -k test_n_shadows ``` Differential Revision: [D37650332](https://our.internmc.facebook.com/intern/diff/D37650332) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80521 Approved by: https://github.com/jerryzh168, https://github.com/andrewor14	2022-10-06 02:30:45 +00:00
zaf	d542aab5c1	[quant][ao_migration] nn.intrinsic migration to ao (#84842 ) All quantization-related modules are being migrated to `torch.ao`. This migrates the `nn.intrinsic.modules`. Please, see the [tracker](https://github.com/pytorch/pytorch/issues/81667) for the timeline. Differential Revision: [D39419733](https://our.internmc.facebook.com/intern/diff/D39419733/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39419733/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/84842 Approved by: https://github.com/jerryzh168	2022-09-28 23:54:29 +00:00
Vasiliy Kuznetsov	58170fb8aa	Remove DBR quantization from the codebase (#83642 ) Summary: DBR quantization is a no-go for now because it does not align well with PyTorch 2.0 plans and we do not want to build yet another tracing system. Deleting it from the codebase for now since there are no plans to develop this in the near future. We can bring it back at a later time if necessary. Test plan: CI Differential Revision: [D38839556](https://our.internmc.facebook.com/intern/diff/D38839556) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83642 Approved by: https://github.com/andrewor14, https://github.com/jerryzh168	2022-08-23 15:18:40 +00:00
zaf	78c8a0d752	[quant][ao_migration] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` (#78712 ) Context: In order to avoid the cluttering of the `torch.nn` namespace the quantized modules namespace is moved to `torch.ao.nn`. The list of the `nn.quantized` files that are being migrated: - [ ] `torch.nn.quantized` → `torch.ao.nn.quantized` - [X] [Current PR] `torch.nn.quantized.functional` → `torch.ao.nn.quantized.functional` - [ ] `torch.nn.quantized.modules` → `torch.ao.nn.quantized.modules` - [ ] `torch.nn.quantized.dynamic` → `torch.ao.nn.quantized.dynamic` - [ ] `torch.nn.quantized._reference` → `torch.ao.nn.quantized._reference` - [ ] `torch.nn.quantizable` → `torch.ao.nn.quantizable` - [ ] `torch.nn.qat` → `torch.ao.nn.qat` - [ ] `torch.nn.qat.modules` → `torch.ao.nn.qat.modules` - [ ] `torch.nn.qat.dynamic` → `torch.ao.nn.qat.dynamic` - [ ] `torch.nn.intrinsic` → `torch.ao.nn.intrinsic` - [ ] `torch.nn.intrinsic.modules` → `torch.ao.nn.intrinsic.modules` - [ ] `torch.nn.intrinsic.qat` → `torch.ao.nn.intrinsic.qat` - [ ] `torch.nn.intrinsic.quantized` → `torch.ao.nn.intrinsic.quantized` - [ ] `torch.nn.intrinsic.quantized.modules` → `torch.ao.nn.intrinsic.quantized.modules` - [ ] `torch.nn.intrinsic.quantized.dynamic` → `torch.ao.nn.intrinsic.quantized.dynamic` Majority of the files are just moved to the new location. However, specific files need to be double checked: - [Documentation](docs/source/quantization-support.rst) @vkuzo - [Public API test list](test/allowlist_for_publicAPI.json) @peterbell10 Differential Revision: [D36792967](https://our.internmc.facebook.com/intern/diff/D36792967/) NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36792967/)! Pull Request resolved: https://github.com/pytorch/pytorch/pull/78712 Approved by: https://github.com/jerryzh168	2022-08-18 17:51:54 +00:00
mikey dagitses	3f612b58be	fix quantization/core/test_docs for Buck2 (#83341 ) Summary: We extract the test to its own target, fixing the relative path to the quantization docs. This allows us to find the docs with a more simple implementation. Test Plan: Tested locally with buck1 and buck2. Differential Revision: D38662169 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83341 Approved by: https://github.com/huydhn, https://github.com/seemethere, https://github.com/ZainRizvi	2022-08-18 13:03:00 +00:00
Andrew Or	194255bb56	[Quant][fx] Implement BackendConfig (part 1) (#81469 ) Summary: Following https://github.com/pytorch/pytorch/pull/78452 and https://github.com/pytorch/pytorch/pull/79066, this commit is part 1 of the broader effort to replace `backend_config_dict` with a python config object, a more formal and robust API that leads to better user experience. Note that there is no change in behavior in this commit by itself. A future commit (part 2) will replace all existing usages of `backend_config_dict` with the `BackendConfig` object added in this commit. Test Plan: python test/test_quantization.py TestBackendConfig Reviewers: jerryzh168 Subscribers: jerryzh168 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81469 Approved by: https://github.com/jerryzh168	2022-07-24 00:34:48 +00:00
vspenubarthi	d0ce1fbbe2	[ao] Created Skeleton for ModelReportVisualizer class (#81523 ) Summary: This introduces the skeleton for the ModelReportVisualizer class. This class helps visualize the information generated by the ModelReport class `generate_report()` output. This class aims to provide visualizations in a table, plot (line graph) and histogram view. This also introduces an empty test class for testing visualizations. As implementations start occurring for this class, tests will also be appropriately added. This includes the high level descriptions for each of the methods as well. Expected use cases will be added to the class description in a future commit as that gets finalized. Test Plan: python test/test_quantization.py TestFxModelReportVisualizer Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/81523 Approved by: https://github.com/andrewor14	2022-07-20 02:39:14 +00:00
vspenubarthi	e5162dcfa7	[ao] Added framework for ModelReport Outlier Detector (#80743 ) Summary: This adds the class framework for the ModelReport OutlierDetector. This detector will be in charge of looking at activation data and figuring out whether there are significant outliers present in them. It will average this data across batches to make a recommendation / warning if significant outliers are found. This commit contains just the class framework and a base test class. Implementations will follow in subsequent commits. Test Plan: python test/test_quantization.py TestFxDetectOutliers Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/80743 Approved by: https://github.com/HDCharles	2022-07-01 01:03:31 +00:00
vspenubarthi	845021db2c	[ao] Adds framework for InputWeightEqualization Detector (#79916 ) Summary: This adds the framework (method signatures and descriptors) for the InputWeightEqualization Detector. There is no code implementation yet so the test suite for this is a simple pass. This Detector will be used to determine whether input weight equalization should be recommended. Test Plan: python test/test_quantization.py TestFxDetectInputWeightEqualization Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/79916 Approved by: https://github.com/HDCharles	2022-06-24 14:51:15 +00:00
HDCharles	ffdc5eebc7	[ao][docs] tests for quantization docs (#79923 ) Summary: per https://github.com/pytorch/pytorch/issues/79135 the code snippets in the docs don't run. This is a recurring problem since previously there was no unit test to check that these code snippets actually ran. This PR adds support for such a test, importing the snippet as a string and evaluating it to make sure that it actually runs. If the code snippet has user defined code, you can pass in dummy versions using global_inputs. Sometimes the imports of the code snippets behave oddly but you can pass them in as in test_quantization_doc_custom where nnq is passed in. Test Plan: python test/test_quantization.py TestQuantizationDocs also see https://github.com/pytorch/pytorch/pull/79994 to see what shows up in CI when the docs get broken Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/79923 Approved by: https://github.com/z-a-f, https://github.com/vspenubarthi	2022-06-23 20:50:31 +00:00
vspenubarthi	01720ae3b6	[ao] Added ModelReport class outline for Fx Graph Modules Summary: The ModelReport class in model_report.py combines the functionality of the detectors and the ModelReportObserver. It creates an end-to-end system where a user can pass in a prepared Graph Model to insert the ModelReportObservers, then after the user calibrates their model, the calibrated model can then be used by the ModelReport class to generate reports based on what the user wished to gather information on. This contains the init method and the signatures and docs for each of the proposed helper functions. This also addresses and fixes a revert issue. Test Plan: python test/test_quantization.py TestFxModelReportClass Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/80052 Approved by: https://github.com/HDCharles	2022-06-22 21:12:58 +00:00
PyTorch MergeBot	ea6fa8dc95	Revert "[ao] Added ModelReport class outline for Fx Graph Modules" This reverts commit `0f95e1846c`. Reverted https://github.com/pytorch/pytorch/pull/79595 on behalf of https://github.com/malfet due to Broke tests on MacOS, see `0f95e1846c`	2022-06-22 12:43:07 +00:00
vspenubarthi	0f95e1846c	[ao] Added ModelReport class outline for Fx Graph Modules Summary: The ModelReport class in model_report.py combines the functionality of the detectors and the ModelReportObserver. It creates an end-to-end system where a user can pass in a prepared Graph Model to insert the ModelReportObservers, then after the user calibrates their model, the calibrated model can then be used by the ModelReport class to generate reports based on what the user wished to gather information on. This contains the init method and the signatures and docs for each of the proposed helper functions. Test Plan: python test/test_quantization.py TestFxModelReportClass Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/79595 Approved by: https://github.com/andrewor14	2022-06-22 02:47:24 +00:00
vspenubarthi	38952d9350	[ao] Added function to inform dynamic vs static appropriate Summary: The _detect_dynamic_vs_static function was added to take in a prepared fx graph model that already had ModelReportObservers built into it and uses the collected information to determine whether input and output are stationary or non-stationary and provides feedback on whether to make linear modules static or dynamic based on this information. This PR will be followed up soon with another PR that will more rigorously test the whole end to end performance of this system, which is primarily how the function in this PR will be tested for functionality, which is why this one only has 1 test. Test Plan: python test/quantization/fx/test_model_report_fx.py TestModelReportDetectDynamicStatic Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/79326 Approved by: https://github.com/HDCharles	2022-06-15 02:51:27 +00:00
vspenubarthi	8e05513152	[ao] Added ModelReportObserver to inform on dynamic vs static Summary: The purpose of this is to add to the module report functionality by creating an observer that will take a prepared fx module and suggest whether static or dynamic quantization is more appropriate. The tests for this have been written and included in the location indicated by the Test Plan. Test Plan: python test/quantization/fx/test_model_report_fx.py TestModelReportObserver Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/79243 Approved by: https://github.com/jerryzh168, https://github.com/andrewor14	2022-06-14 19:08:40 +00:00
vspenubarthi	28c541776c	[ao] Added fx model report per_channel detector Summary: This code is meant to be a tool to help people get the most out of their backend by hinting them to use per_channel quantization if it's supported, which will help increase accuracy significantly. The code is completed and ready to be reviewed. Test Plan: test/quantization/fx/test_model_report_fx.py Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/79104 Approved by: https://github.com/HDCharles	2022-06-10 08:09:59 +00:00
Jerry Zhang	7ea5fa3dd4	[reland][quant] Add utility function get_fqn_to_example_inputs Summary: After https://github.com/pytorch/pytorch/pull/77608 `example_inputs` is required input for `prepare_fx` and `prepare_qat_fx`. This makes quantizing submodules harder, so we added this utility function to get a dictionary from fqn to submodule example_inputs Example Call: ``` example_inputs = (tensor0,) get_fqn_to_example_inputs(m, example_inputs) ``` Example output: ``` { "linear1": (tensor1,), "linear2": (tensor2,), "sub": (tensor3,), "sub.linear1": (tensor4,), ... } ``` Test Plan: python test/test_quantization.py TestUtils Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/78286 Approved by: https://github.com/dzdang	2022-05-25 23:31:51 +00:00
PyTorch MergeBot	87148f2b59	Revert "[quant] Add utility function get_fqn_to_example_inputs" This reverts commit `50a44fe461`. Reverted https://github.com/pytorch/pytorch/pull/78146 on behalf of https://github.com/suo due to as it broke master	2022-05-25 06:37:32 +00:00
Jerry Zhang	50a44fe461	[quant] Add utility function get_fqn_to_example_inputs Summary: After https://github.com/pytorch/pytorch/pull/77608 `example_inputs` is required input for `prepare_fx` and `prepare_qat_fx`. This makes quantizing submodules harder, so we added this utility function to get a dictionary from fqn to submodule example_inputs Example Call: ``` example_inputs = (tensor0,) get_fqn_to_example_inputs(m, example_inputs) ``` Example output: ``` { "linear1": (tensor1,), "linear2": (tensor2,), "sub": (tensor3,), "sub.linear1": (tensor4,), ... } ``` Test Plan: python test/test_quantization.py TestUtils Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/78146 Approved by: https://github.com/vkuzo	2022-05-25 03:07:16 +00:00
Jerry Zhang	81437e66c1	[quant][fx] Add RNN reference module (#73386 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73386 This PR adds support for RNN reference module, following https://github.com/pytorch/rfcs/blob/master/RFC-0019-Extending-PyTorch-Quantization-to-Custom-Backends.md This includes: RNNCell, LSTMCell, GRUCell, LSTM Test Plan: will be tested in the lowering flow in a separate PR Imported from OSS Reviewed By: vkuzo Differential Revision: D34469445 fbshipit-source-id: 71a13d7d056f7aaccdd98fb477c8a3a38aecc249 (cherry picked from commit 0b10f0d127515556b677eae3150f026ac8cd9acd)	2022-03-02 10:30:37 +00:00
Vasiliy Kuznetsov	4e90fa6a8c	dbr quant: break up test class into multiple classes (#70246 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70246 Breaks up the large `TestQuantizeDBR` test case into 1. `TestQuantizeDBRIndividualOps` for testing functionality of ops 2. `TestQuantizeDBRMultipleOps` for testing non-fusion interactions between ops 3. `TestQuantizeDBR` for everything else We may need to refactor this more in the future, but this should unblock things for the near future. Test Plan: ``` python test/test_quantization.py TestQuantizeDBR python test/test_quantization.py TestQuantizeDBRIndividualOps python test/test_quantization.py TestQuantizeDBRMultipleOps ``` Reviewed By: jerryzh168 Differential Revision: D33255925 Pulled By: vkuzo fbshipit-source-id: 82db1a644867e9303453cfedffed2d81d083c9cd	2022-01-05 06:36:41 -08:00
Jerry Zhang	ef6f776e82	[quant][be] Cleanup test cases for eager mode workflow (#69880 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69880 Making the test cases more standardized, in general we would like to have ``` TestQuantizeEager, TestQuantizeEagerOps, TestQuantizeEagerModels, ``` but currently since we have separate ptq static, ptq dynamic and qat static apis, we only partially cleaned up the test cases, we can merge all of them later when we merge all the apis Test Plan: python test/test_quantization.py Imported from OSS Reviewed By: supriyar Differential Revision: D33081418 fbshipit-source-id: fcb96559b76bbc51eb1b0625e0d4b193dbb37532	2021-12-16 17:47:30 -08:00
Jerry Zhang	1940cc028e	[quant][graphmode][fx] Fork subgraph_rewriter from torch.fx to quantization (#68228 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68228 Forking this for now so that we can make changes as we need, the changes can be merged back to torch.fx later Test Plan: ``` python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps ``` Imported from OSS Reviewed By: vkuzo Differential Revision: D32537713 fbshipit-source-id: 326598d13645fcc28ef2c66baaac6a077b80fd0c	2021-11-24 10:49:05 -08:00
Jerry Zhang	a6d862c50a	[quant][graphmode][fx] Add support for weight and bias dtype in backend_config_dict (#68602 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68602 This PR adds support for configuring weight/bias dtype in backend_config_dict and refactor the current code that checks when to insert observers Test Plan: ``` python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxOps ``` Imported from OSS Reviewed By: vkuzo Differential Revision: D32537712 fbshipit-source-id: 28eb7c61a8dcad8c1f3f6622d490a34cff0c59e2	2021-11-19 13:01:50 -08:00
Charles David Hernandez	7ee84ad321	Refactoring quantized op tests to combine test classes (#68282 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68282 Combined 3 Dynamic quantized op test classes into 1 Test Plan: python test/test_quantization.py TestDynamicQuantizedOps Imported from OSS Reviewed By: jerryzh168 Differential Revision: D32402163 fbshipit-source-id: 696b7ef5d823632941dc7afc95161501445d0e18	2021-11-15 20:47:02 -08:00
Charles David Hernandez	09615cd0b0	Adding Dynamic Conv and ConvT ops/modules (#68176 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68176 it should be noted that for the modules, reduce_range is set to true by default in a similar fashion to linear_dynamic. Test Plan: python test/test_quantization.py TestDynamicQuantizedModule python test/test_quantization.py TestDynamicQuantizedConv python test/test_quantization.py TestQuantizedConv Imported from OSS Reviewed By: kimishpatel Differential Revision: D32374003 fbshipit-source-id: 011562bd0f4d817387d53bb113df2600aa60a7a3	2021-11-15 16:42:25 -08:00
Vasiliy Kuznetsov	4466ba8f30	Working POC of define-by-run quantization (#64676 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64676 We implement a working eager mode quantization flow which uses tracing and `__torch_function__` and `torch.nn.Module.__call__` overrides to automate the model modifications needed for quantization. Partial program capture (instead of full program capture) is used, allowing this scheme to target a wide variety of user programs. Control flow over quantizeable ops is not supported, but general control flow is supported. In particular: * `auto_trace.py` contains the machinery to override `__torch_function__` and `torch.nn.Module.__call__` and call hooks before and after each quantizeable module or function * `quantization_state.py` contains the state needed to use the hooks to implement quantization logic such as adding quants/dequants, observers, etc. * please see `README.md` for more details Test Plan: ``` python test/test_quantization.py TestAutoTracing python test/test_quantization.py TestAutoTracingModels ``` Differential Revision: D31992281 Reviewed By: HDCharles Pulled By: vkuzo fbshipit-source-id: 6d40e855f3c96b9a4b637a0e677388a7b92f7967	2021-11-11 06:25:24 -08:00
Ben Koopman	3aadff651c	[quant][embedding qat][bugfix] Fix and test QAT EmbeddingBag from_float error message (#66989 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66989 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D31961773 Pulled By: b-koopman fbshipit-source-id: 0d28728c87751ffc696ac221c3e8e75ac923cc57	2021-10-28 06:29:20 -07:00
Jerry Zhang	a7bbf8814c	[quant][graphmode][fx] Move quant-fx2trt unittests to test_quantize_fx.py (#67064 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67064 Test Plan: Imported from OSS Reviewed By: vkuzo Differential Revision: D31849075 fbshipit-source-id: 9c5e8aad7c88070830d853faf3106491726e77ff	2021-10-22 14:36:36 -07:00
Jane Xu	6a224b3370	Set test owners for quantization tests (#66832 ) Summary: Action following https://github.com/pytorch/pytorch/issues/66232 cc jerryzh168 jianyuh raghuramank100 jamesr66a vkuzo Pull Request resolved: https://github.com/pytorch/pytorch/pull/66832 Reviewed By: saketh-are Differential Revision: D31842880 Pulled By: janeyx99 fbshipit-source-id: 8aee760e4203045c12e7548a21ed5b71c557e3ee	2021-10-21 16:04:41 -07:00
Jerry Zhang	a89851a0d9	[quant][fx][graphmode] Adding a new convert function that produces reference pattern by default (#66925 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66925 Current convert_fx implementation is using "The Interpreter Pattern" in https://pytorch.org/docs/stable/fx.html There are two things that have changed which make the approach in this PR possible and needed: 1). the original convert implementation was developed at the initial prototype stage, when fx did not allow mutations; fx now supports mutations 2). the original convert needs to work for a lot of fbgemm/qnnpack specific logic, which is not needed for reference patterns Therefore it makes sense for us to write a new convert function just for reference patterns, the implementation is significantly easier to understand than the original convert implementation Current support: * we should be able to support all non-weighted ops like relu, add etc. Missing: * linear and conv * some advanced features like standalone modules, input_quantized_idxs etc. will add linear and conv support and start defining the backend_config_dict based on this version of convert Test Plan: python test/test_quantization.py TestQuantizeFxOpsNew Imported from OSS Reviewed By: vkuzo Differential Revision: D31786241 fbshipit-source-id: 2a32156eb6d3c5271cb44906cd863055785fb5d4	2021-10-20 18:54:30 -07:00
Vasiliy Kuznetsov	1d9a6862cd	fx quant: add a BC test for loading old torch.package models (#65538 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65538 Adds a test which verifies that `prepare_fx` and `convert_fx` work on models created by `torch.package` in the past. In detail: 1. (one time) create a model and save it with torch.package. Also save input, expected output, and names of quantization related get_attrs added by our passes. 2. (every time) load the model from (1), and verify that expected output matches current output, and that get_attr targets did not change. Test Plan: ``` python test/test_quantization.py TestSerialization.test_linear_relu_package_quantization_transforms ``` Imported from OSS Reviewed By: supriyar Differential Revision: D31512939 fbshipit-source-id: 718ad5fb66e09b6b31796ebe0dc698186e9a659f	2021-10-11 08:23:38 -07:00
Jerry Zhang	508845f2b5	[quant] AO migration of the `torch/quantization/quantize_fx.py` and `torch/quantization/fx/` (#65033 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65033 1. Move the file: ``` hg mv caffe2/torch/quantization/fx caffe2/torch/ao/quantization/fx hg mv caffe2/torch/quantization/quantize_fx.py caffe2/torch/ao/quantization/quantize_fx.py ``` 2. Create new files ``` touch caffe2/torch/quantization/quantize_fx.py touch caffe2/torch/quantization/fx/__init__.py ``` 3. import things in the new files 4. add tests to test/quantization/ao_migration/test_quantization_fx.py this is because we have some fx import in quantize_fx and fx/.py Test Plan: buck test mode/dev //caffe2/test:quantization Reviewed By: vkuzo, z-a-f Differential Revision: D30949749 fbshipit-source-id: 9e5d4d039c8a0a0820bc9040e224f0d2c26886d3	2021-09-22 09:29:15 -07:00
Zafar Takhirov	425f173f9d	[quant][refactor] Change the structure of the ao migration tests (#64912 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64912 The test naming was confusing and ambiguous. The file was changed to reflect the framework that is being migrated ("quantization" instead of "quantize"). Also, the common testing class was extracted out ghstack-source-id: 138157450 Test Plan: `buck test mode/dev //caffe2/test:quantization -- TestAOMigrationQuantization` Reviewed By: vkuzo Differential Revision: D30898214 fbshipit-source-id: 017f95995271d35bcdf6ff6a1b3974b837543e84	2021-09-15 13:15:43 -07:00
Zafar Takhirov	9cc44aad21	[quant] AO migration of the `quantize.py` (resubmission) (#64445 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64445 AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly. This migrates the quantize.py from torch.quantization to torch.ao.quantization. At this point both locations will be supported. Eventually the torch.quantization will be deprecated. Test Plan: `buck test mode/dev //caffe2/test:quantization` Reviewed By: HDCharles Differential Revision: D30734870 fbshipit-source-id: dc204f3cc46bff2cc81c95159eab9d333b43bb4b	2021-09-08 04:58:47 -07:00
Zafar Takhirov	046ed57a4d	Revert D30055886: [quant] AO migration of the `quantize.py` Test Plan: revert-hammer Differential Revision: D30055886 (`44e3ed88c9`) Original commit changeset: 8ef7470f9fa6 fbshipit-source-id: c5bd3ead43a2d44b9e56872ec5bd7a195bdac725	2021-09-02 16:59:59 -07:00
Zafar Takhirov	44e3ed88c9	[quant] AO migration of the `quantize.py` (#64086 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64086 AO Team is migrating the existing torch.quantization into torch.ao.quantization. We are doing it one file at a time to make sure that the internal callsites are updated properly. This migrates the `quantize.py` from torch.quantization to `torch.ao.quantization`. At this point both locations will be supported. Eventually the torch.quantization will be deprecated. Test Plan: `buck test mode/opt //caffe2/test:quantization` Reviewed By: jerryzh168, raghuramank100 Differential Revision: D30055886 fbshipit-source-id: 8ef7470f9fa640c0042bef5bb843e7a05ecd0b9f	2021-08-29 20:30:01 -07:00
Shen Li	1022443168	Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: revert-hammer Differential Revision: D30279364 (`b004307252`) Original commit changeset: c1ed77dfe43a fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e	2021-08-12 11:45:01 -07:00
Zsolt Dollenstein	b004307252	[codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: manual inspection & sandcastle Reviewed By: zertosh Differential Revision: D30279364 fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a	2021-08-12 10:58:35 -07:00
Supriya Rao	b8386f5d72	[quant] Create FusedMovingAvgObsFakeQuantize for QAT (#61691)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61691
Create a new module for QAT that does a Fused MovingAvgMinMaxObserver and FakeQuantize operation. The module currently only supports per-tensor quantization (affine/symmetric). Follow-up PR will add support for per-channel.

Results on running QAT with MobileNetV2 (Obs enabled/fake_quant enabled)

Original FQ module
PyTorchObserver {"type": "_", "metric": "qnnpack_fp_latency_ms", "unit": "ms", "value": "242.80261993408203"}
PyTorchObserver {"type": "_", "metric": "qnnpack_qat0_latency_ms", "unit": "ms", "value": "505.7964324951172"}
PyTorchObserver {"type": "_", "metric": "fbgemm_fp_latency_ms", "unit": "ms", "value": "235.80145835876465"}
PyTorchObserver {"type": "_", "metric": "fbgemm_qat0_latency_ms", "unit": "ms", "value": "543.8144207000732"}

Fused FakeQuant module (~50% improvement in latency)
PyTorchObserver {"type": "_", "metric": "qnnpack_fp_latency_ms", "unit": "ms", "value": "232.1624755859375"}
PyTorchObserver {"type": "_", "metric": "qnnpack_qat0_latency_ms", "unit": "ms", "value": "263.8866901397705"}
PyTorchObserver {"type": "_", "metric": "fbgemm_fp_latency_ms", "unit": "ms", "value": "236.9832992553711"}
PyTorchObserver {"type": "_", "metric": "fbgemm_qat0_latency_ms", "unit": "ms", "value": "292.1590805053711"}

Individual module benchmark result (>5x improvement in latency)

===> Baseline FakeQuantize module
```
-------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
Name                                                 Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg     Self CUDA   Self CUDA %    CUDA total CUDA time avg    # of Calls
-------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
aten::fake_quantize_per_tensor_affine                     0.77%       1.210ms         4.92%       7.730ms     154.596us     718.528us         0.45%       9.543ms     190.862us            50
aten::fake_quantize_per_tensor_affine_cachemask           2.41%       3.792ms         4.15%       6.520ms     130.402us       8.825ms         5.58%       8.825ms     176.492us            50
aten::_aminmax                                            3.25%       5.105ms         4.43%       6.955ms     139.102us       8.193ms         5.18%       8.193ms     163.868us            50
aten::zeros_like                                          1.87%       2.939ms         6.95%      10.922ms     109.218us       5.992ms         3.79%      10.844ms     108.442us           100
aten::zeros                                               0.97%       1.527ms         3.11%       4.885ms      97.702us       2.383ms         1.51%       4.800ms      96.010us            50
aten::rsub                                                1.34%       2.106ms         2.94%       4.614ms      92.277us       2.063ms         1.30%       4.559ms      91.173us            50
aten::clamp                                               2.79%       4.381ms         5.42%       8.519ms      85.190us       5.385ms         3.41%       8.438ms      84.381us           100
aten::eq                                                 11.70%      18.384ms        21.31%      33.479ms      83.280us      22.465ms        14.21%      33.310ms      82.861us           402
aten::ones                                                1.05%       1.656ms         2.57%       4.038ms      80.751us       2.494ms         1.58%       3.951ms      79.028us            50
aten::le                                                  2.52%       3.955ms         4.84%       7.607ms      76.071us       4.998ms         3.16%       7.702ms      77.016us           100
aten::min                                                 0.69%       1.087ms         2.32%       3.641ms      72.827us       1.017ms         0.64%       3.603ms      72.055us            50
aten::max                                                 1.40%       2.195ms         4.62%       7.260ms      72.597us       2.008ms         1.27%       7.140ms      71.404us           100
aten::is_nonzero                                          2.68%       4.207ms        11.35%      17.829ms      71.033us       4.062ms         2.57%      17.225ms      68.625us           251
aten::detach                                              1.17%       1.831ms         3.65%       5.736ms      57.360us       1.680ms         1.06%       5.634ms      56.340us           100
aten::mul                                                 3.36%       5.278ms         3.36%       5.278ms      53.862us       5.215ms         3.30%       5.215ms      53.216us            98
aten::div                                                 3.42%       5.376ms         3.42%       5.376ms      53.759us       5.320ms         3.36%       5.320ms      53.196us           100
aten::sub                                                 6.79%      10.672ms         6.79%      10.672ms      53.901us      10.504ms         6.64%      10.504ms      53.050us           198
aten::item                                                4.06%       6.380ms        12.02%      18.883ms      53.798us       6.127ms         3.87%      18.322ms      52.198us           351
aten::add                                                 3.28%       5.147ms         3.28%       5.147ms      52.518us       5.113ms         3.23%       5.113ms      52.171us            98
aten::minimum                                             1.63%       2.555ms         1.63%       2.555ms      51.092us       2.585ms         1.64%       2.585ms      51.708us            50
aten::maximum                                             3.22%       5.065ms         3.22%       5.065ms      50.646us       5.133ms         3.25%       5.133ms      51.329us           100
aten::round                                               1.61%       2.529ms         1.61%       2.529ms      50.578us       2.528ms         1.60%       2.528ms      50.552us            50
aten::zero_                                               1.99%       3.125ms         4.72%       7.422ms      49.481us       2.835ms         1.79%       7.269ms      48.462us           150
aten::copy_                                               6.62%      10.394ms         6.62%      10.394ms      41.576us      10.252ms         6.48%      10.252ms      41.010us           250
detach                                                    2.49%       3.905ms         2.49%       3.905ms      39.049us       3.954ms         2.50%       3.954ms      39.539us           100
aten::select                                              2.01%       3.154ms         2.47%       3.876ms      38.759us       3.866ms         2.44%       3.866ms      38.658us           100
aten::_local_scalar_dense                                 7.96%      12.503ms         7.96%      12.503ms      35.621us      12.195ms         7.71%      12.195ms      34.743us           351
aten::to                                                  2.31%       3.625ms         4.16%       6.530ms      32.650us       4.320ms         2.73%       6.270ms      31.348us           200
aten::fill_                                               3.70%       5.808ms         3.70%       5.808ms      29.039us       5.892ms         3.73%       5.892ms      29.459us           200
aten::as_strided                                          0.79%       1.244ms         0.79%       1.244ms       6.221us       0.000us         0.00%       0.000us       0.000us           200
aten::empty                                               3.55%       5.579ms         3.55%       5.579ms      11.137us       0.000us         0.00%       0.000us       0.000us           501
aten::resize_                                             2.36%       3.712ms         2.36%       3.712ms      12.332us       0.000us         0.00%       0.000us       0.000us           301
aten::empty_like                                          1.45%       2.284ms         3.68%       5.776ms      28.878us       0.000us         0.00%       0.000us       0.000us           200
aten::empty_strided                                       2.80%       4.398ms         2.80%       4.398ms      17.592us       0.000us         0.00%       0.000us       0.000us           250
-------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 157.108ms
Self CUDA time total: 158.122ms
```

===> FusedFakeQuant
```
---------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
Name                                                         Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg     Self CUDA   Self CUDA %    CUDA total CUDA time avg    # of Calls
---------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
fb::fused_fake_quant                                             23.42%       6.408ms       100.00%      27.361ms     547.215us       7.887ms        27.20%      28.996ms     579.925us            50
aten::fake_quantize_per_tensor_affine                             4.25%       1.162ms        27.65%       7.565ms     151.298us     686.176us         2.37%      10.217ms     204.336us            50
aten::_fake_quantize_per_tensor_affine_cachemask_ten...          14.11%       3.860ms        23.40%       6.403ms     128.068us       9.531ms        32.87%       9.531ms     190.612us            50
aten::_aminmax                                                   20.57%       5.628ms        27.47%       7.515ms     150.305us       8.218ms        28.34%       8.218ms     164.367us            50
aten::item                                                        3.65%     999.522us        10.27%       2.810ms      56.202us     931.904us         3.21%       2.674ms      53.481us            50
aten::_local_scalar_dense                                         6.62%       1.811ms         6.62%       1.811ms      36.212us       1.742ms         6.01%       1.742ms      34.843us            50
aten::empty                                                      10.85%       2.969ms        10.85%       2.969ms      14.843us       0.000us         0.00%       0.000us       0.000us           200
aten::as_strided                                                  1.92%     524.365us         1.92%     524.365us       5.244us       0.000us         0.00%       0.000us       0.000us           100
aten::empty_like                                                  6.48%       1.774ms        14.62%       4.000ms      26.670us       0.000us         0.00%       0.000us       0.000us           150
aten::empty_strided                                               8.14%       2.226ms         8.14%       2.226ms      14.842us       0.000us         0.00%       0.000us       0.000us           150
---------------------------------------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------
Self CPU time total: 27.361ms
Self CUDA time total: 28.996ms
```

Test Plan: python test/test_quantization.py TestFusedObsFakeQuantModule
Imported from OSS Reviewed By: vkuzo Differential Revision: D29706889 fbshipit-source-id: ae3f9fb1fc559920459bf6e8663e8299bf7d21e1	2021-07-21 10:13:04 -07:00
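
The b8386f5d72 commit above combines a moving-average min/max observer with a per-tensor affine fake-quantize step in one module. As a rough illustration of what such a combined step computes, here is a minimal pure-Python sketch. It is not the PyTorch implementation: the class name `FusedObsFakeQuantSketch`, its default `averaging_constant`, and the exact qparams formula are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FusedObsFakeQuantSketch:
    """Hypothetical toy model: observer update and fake quantize in one call."""

    averaging_constant: float = 0.01  # assumed default, chosen for this sketch
    quant_min: int = 0
    quant_max: int = 255
    running_min: Optional[float] = None
    running_max: Optional[float] = None

    def _observe(self, values: List[float]) -> None:
        # Moving-average update of the observed range.
        cur_min, cur_max = min(values), max(values)
        if self.running_min is None or self.running_max is None:
            self.running_min, self.running_max = cur_min, cur_max
        else:
            c = self.averaging_constant
            self.running_min += c * (cur_min - self.running_min)
            self.running_max += c * (cur_max - self.running_max)

    def _qparams(self):
        # Per-tensor affine parameters; the range is widened to include zero.
        lo = min(self.running_min, 0.0)
        hi = max(self.running_max, 0.0)
        scale = max((hi - lo) / (self.quant_max - self.quant_min), 1e-8)
        zero_point = self.quant_min - round(lo / scale)
        zero_point = max(self.quant_min, min(self.quant_max, zero_point))
        return scale, zero_point

    def __call__(self, values: List[float]) -> List[float]:
        # Observe, derive qparams, then quantize and dequantize each value.
        self._observe(values)
        scale, zero_point = self._qparams()
        out = []
        for v in values:
            q = round(v / scale) + zero_point
            q = max(self.quant_min, min(self.quant_max, q))
            out.append((q - zero_point) * scale)
        return out
```

Calling `FusedObsFakeQuantSketch()([-1.0, 0.0, 1.0, 2.0])` returns the inputs snapped to the 8-bit grid implied by the observed range. Doing the range update and the quantize/dequantize in a single call mirrors the idea of the fused module, though the latency gains reported in the commit come from fusing at the operator level, which this sketch does not attempt to reproduce.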


207 Commits