Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44956
Makes the buffers of HistogramObserver have the
same shapes in the uninitialized and initialized states.
This is useful because the detectron2 checkpointer assumes
these shapes stay the same, so it removes the
need for manual hacks around the shapes changing.
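For illustration, a minimal check of the described behavior (a sketch, assuming a recent torch.quantization build; buffer names follow the observer's state_dict keys):
```
import torch
from torch.quantization import HistogramObserver

obs = HistogramObserver()
before = {k: v.shape for k, v in obs.state_dict().items()}
obs(torch.randn(16, 8))  # initialize the buffers with real data
after = {k: v.shape for k, v in obs.state_dict().items()}
# with this change, the buffer shapes match between the two states
assert all(before[k] == after[k] for k in before)
```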
Test Plan:
```
python test/test_quantization.py TestObserver.test_histogram_observer_consistent_buffer_shape
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D23785382
fbshipit-source-id: 1a83fd4f39b244b00747c368d5d305a07d877c92
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44773
The model is created and prepared using the FX APIs and then scripted for training.
In order to test QAT on the scripted model we need to be able to disable/enable the fake_quant
and observer modules on it.
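A hedged sketch of that flow (API names as in the torch.quantization namespace of this era; exact signatures have shifted in later releases, e.g. prepare_qat_fx now requires example_inputs):
```
import torch
from torch.quantization import (
    get_default_qat_qconfig,
    disable_fake_quant,
    enable_fake_quant,
    disable_observer,
)
from torch.quantization.quantize_fx import prepare_qat_fx

model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()).train()
prepared = prepare_qat_fx(model, {"": get_default_qat_qconfig("fbgemm")})
scripted = torch.jit.script(prepared)

# the toggles must also work on the scripted model
scripted.apply(disable_fake_quant)
scripted.apply(disable_observer)
scripted.apply(enable_fake_quant)
```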
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qat_and_script
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23741354
fbshipit-source-id: 3fee7aa9b049d9901313b977710f4dc1c4501532
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44749
Ensure fx module is scriptable after calling prepare_qat on it
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qat_and_script
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23718380
fbshipit-source-id: abf63ffb21e707f7def8f6c88246877f5aded58c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44537
Originally, the `min_val`, `max_val`, `min_vals`, `max_vals`
attributes of observers were Tensors but not buffers. They had custom
state_dict save/load code to ensure their state was saved.
At some point, these attributes became buffers, and the custom
save/load code remained. This introduced a subtle bug:
* create model A, move it to a device (cpu/cuda) and save its state_dict
* create model B, load its state dict.
* `min_val|min_vals|max_val|max_vals` would always be loaded to model A's device, even if the rest of model B was on a different device
* the above is inconsistent with how save/load on different devices is expected to work (see https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-across-devices)
In practice, the case people would sometimes hit is:
* model A is on CPU, state dict is saved
* model B is created and moved to GPU, state_dict from model A is loaded
* assertions throw when operations are attempted across different devices
This PR fixes the behavior by removing the custom save/load where
possible and letting the default `nn.Module` save/load code handle
device assignment. We special case `PerChannelMinMaxObserver` and its
children to allow loading buffers of different sizes, which is expected.
There are some followups to also enable this for HistogramObserver
and FakeQuantize, which can be done in separate PRs due to higher
complexity.
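A minimal sketch of the expected behavior after the fix (assumes a CUDA device is available; MinMaxObserver is used here for brevity):
```
import torch
from torch.quantization import MinMaxObserver

obs_a = MinMaxObserver()              # "model A", stays on CPU
obs_a(torch.randn(4, 4))
torch.save(obs_a.state_dict(), "obs.pt")

obs_b = MinMaxObserver().cuda()       # "model B", moved to GPU before loading
obs_b.load_state_dict(torch.load("obs.pt"))
# min_val/max_val now follow model B's device instead of model A's
assert obs_b.min_val.device.type == "cuda"
```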
Test Plan:
```
python test/test_quantization.py TestObserver.test_state_dict_respects_device_affinity
```
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D23644493
fbshipit-source-id: 0dbb6aa309ad569a91a663b9ee7e44644080032e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44217
Move the corresponding tests to the static quantization test suite as well
Test Plan:
python test/test_quantization.py TestStaticQuantizedModule.test_embedding_bag_api
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D23547386
fbshipit-source-id: 41f81c31e1613098ecf6a7eff601c7dcd4b09c76
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44208
Add a quantized Embedding module in the static quantization namespace. Embedding
quantization only requires the weights to be quantized, so it is static.
Internally it calls the embedding_bag_byte op with offsets set so that each bag
contains a single index.
A future PR will move EmbeddingBag quantization from dynamic to static as well.
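A small illustration of the offsets trick described above: an embedding lookup of N indices is equivalent to an embedding_bag over N single-element bags.
```
import torch
import torch.nn.functional as F

weight = torch.randn(10, 4)
indices = torch.tensor([1, 3, 7])
offsets = torch.arange(indices.numel())   # one bag per index

emb = F.embedding(indices, weight)
bagged = F.embedding_bag(indices, weight, offsets, mode="sum")
assert torch.allclose(emb, bagged)
```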
Test Plan:
python test/test_quantization.py test_embedding_api
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23547384
fbshipit-source-id: eddc6fb144b4a771060e7bab5853656ccb4443f0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44042
Missed one case last time
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23479345
fbshipit-source-id: 30e6713120c494e9fab5584de4df9b25bec83d32
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44125
In `Quantizer._prepare`, `observed` was used for two different variables
with different types. Making the names a bit cleaner and removing the
name conflict.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: dskhudia
Differential Revision: D23504109
fbshipit-source-id: 0f73eac3d6dd5f72ad5574a4d47d33808a70174a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44092
Instead of being looked up through a separately stored `root` module, submodules
and weights are installed directly on the graph_module by transferring the
original modules. This makes it more likely that scripting will succeed (since
we no longer have submodules that are not used in the trace). It also prevents
layered transforms from having to special-case handling of the `root` module.
GraphModules can now be re-traced as part of the input to other transforms.
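A small sketch of the re-tracing this enables, using the public torch.fx API:
```
import torch
from torch.fx import symbolic_trace

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x).relu()

gm = symbolic_trace(M())
retraced = symbolic_trace(gm)    # a GraphModule can itself be traced again
scripted = torch.jit.script(gm)  # and scripted, since params live on gm itself
```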
Test Plan: Imported from OSS
Reviewed By: jamesr66a
Differential Revision: D23504210
Pulled By: zdevito
fbshipit-source-id: f79e5c4cbfc52eb0ffb5d6ed89b37ce35a7dc467
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44115
Fixes device affinity in the FX prepare pass for QAT. Before this PR, observers
were always created on CPU. After this PR, observers are created on the
same device as the rest of the model. This will enable QAT prepare to
work regardless of whether users move the model to cuda before or after
calling this pass.
Test Plan:
```
python test/test_quantization.py TestQuantizeFx.test_qat_prepare_device_affinity
```
Imported from OSS
Reviewed By: supriyar
Differential Revision: D23502291
fbshipit-source-id: ec4ed20c21748a56a25e3395b35ab8640d71b5a8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43927
Adds uninitialized placeholders for various state
used throughout the Quantizer object, with documentation
on what they are. No logic change.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFx
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23439473
fbshipit-source-id: d4ae83331cf20d81a7f974f88664ccddca063ffc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43914
Renames the `matches` function to `is_match`, since there is also
a list named `matches` that we pass around in `Quantizer`,
and it would be good to reduce name conflicts.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23435601
fbshipit-source-id: 394af11e0120cfb07dedc79d5219247330d4dfd6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43910
Adds a debug function to get a representation of all nodes in the
graph, such as
```
name op target args kwargs
x plchdr x () {}
linear_weight gt_prm linear.weight () {}
add_1 cl_fun <bi_fun add> (x, linear_weight) {}
linear_1 cl_mod linear (add_1,) {}
relu_1 cl_meth relu (linear_1,) {}
sum_1 cl_fun <bi_meth sum> (relu_1,) {'dim': -1}
topk_1 cl_fun <bi_meth topk> (sum_1, 3) {}
```
using only the Python standard library. This is useful for printing the internal
state of graphs when working on FX code.
Has some on-by-default logic to shorten things so that node reprs for
toy models and unit tests fit into 80 chars.
I'm flexible on the function name and location; I care more that this is
accessible both from inside PyTorch and from debug scripts which
are not checked in.
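A rough sketch of this kind of debug print using only the public torch.fx surface (the helper added here may format and abbreviate differently):
```
import torch
from torch.fx import GraphModule, symbolic_trace

def print_nodes(gm: GraphModule) -> None:
    # one row per node: name, op, target, args, kwargs
    for n in gm.graph.nodes:
        print(f"{n.name:16} {n.op:16} {str(n.target):24} {n.args} {n.kwargs}")

gm = symbolic_trace(torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()))
print_nodes(gm)
```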
Test Plan:
see
https://gist.github.com/vkuzo/ed0a50e5d6dc7442668b03bb417bd603 for
example usage
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23435029
fbshipit-source-id: 1a2df797156a19cedd705e9e700ba7098b5a1376
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43892
Run the weight observer in the convert function, so the user does not need to run calibration
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23429758
fbshipit-source-id: 5bc222e3b731789ff7a86463c449690a58dffb7b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43902
Trace back from the weight node until we hit a getattr, reconstruct a graph module from the traced nodes,
run it to pack the weight, and then replace the original chain of ops with the packed weight.
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23432431
fbshipit-source-id: 657f21a8287494f7f87687a9d618ca46376d3aa3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43901
Add APIs similar to the eager mode and TorchScript graph mode ones (a brief usage sketch follows the list):
- fuse_fx
- quantize_fx (for both post training static and qat)
- quantize_dynamic_fx (for post training dynamic)
- prepare_fx (for both post training static and qat)
- prepare_dynamic_fx (for post training dynamic)
- convert_fx (for all modes)
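A hedged usage sketch of the post-training static flow with these APIs (signatures per this era; later releases require an example_inputs argument and moved the entry points to torch.ao.quantization):
```
import torch
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

float_model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()).eval()
qconfig_dict = {"": get_default_qconfig("fbgemm")}

prepared = prepare_fx(float_model, qconfig_dict)   # insert observers
prepared(torch.randn(8, 4))                        # calibrate
quantized = convert_fx(prepared)                   # produce the quantized module
```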
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23432430
fbshipit-source-id: fc99eb75cbecd6ee7a3aa6c8ec71cd499ff7e3c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43789
Since it's a single element; in some cases we may not be able to resize the
buffers.
Test Plan: unit tests
Reviewed By: supriyar
Differential Revision: D23393108
fbshipit-source-id: 46cd7f73ed42a05093662213978a01ee726433eb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43728
Trace back from the weight node until we hit a getattr, reconstruct a graph module from the traced nodes,
run it to pack the weight, and then replace the original chain of ops with the packed weight.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23385090
fbshipit-source-id: 11341f0af525a02ecec36f163a9cd35dee3744a1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43581
Add APIs similar to the eager mode and TorchScript graph mode ones:
- fuse_fx
- quantize_fx (for both post training static and qat)
- quantize_dynamic_fx (for post training dynamic)
- prepare_fx (for both post training static and qat)
- prepare_dynamic_fx (for post training dynamic)
- convert_fx (for all modes)
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23385091
fbshipit-source-id: b789e54e1a0f3af6b026fd568281984e253e0433
Summary:
It's useful if we add additional attributes to nodes in the graph - it's easier to set the attribute on all nodes, even if the value would happen to be None.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43432
Reviewed By: jamesr66a
Differential Revision: D23276433
Pulled By: dzhulgakov
fbshipit-source-id: c69e7cb723bbbb4dba3b508a3d6c0e456fe610df
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43587
Add tests for graph mode quantization on torchvision and make sure it matches
current eager mode quantization
Test Plan:
Imported from OSS
Reviewed By: z-a-f
Differential Revision: D23331253
fbshipit-source-id: 0445a44145d99837a2c975684cd0a0b7d965c8f9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43526
Add tests for graph mode quantization on torchvision and make sure it matches
current eager mode quantization
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23306683
fbshipit-source-id: 30d27e225d4557bfc1d9aa462086e416aa9a9c0e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43286
We need to use this in graph mode quantization on fx
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23221734
fbshipit-source-id: 7c3c3840ce5bdc185b962e081aff1618f4c58e85
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43175
This PR adds graph mode quantization on FX (originally https://github.com/pytorch/pytorch/pull/42741).
Currently it matches eager mode quantization for torchvision with static/dynamic/qat.
The DDP/SyncBN test is still WIP.
Test Plan:
python test/test_quantization.py TestQuantizeFx
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23178602
fbshipit-source-id: 8e7e0322846fbda2cfa79ad188abd7235326f879
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43151
Using `torch.all` instead of `torch.sum` plus a length check.
It's unclear whether the increase in perf (~5% for small inputs) is
real, but it should be a net benefit, especially for larger channel inputs.
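A minimal illustration of the pattern being simplified (the exact expressions in the observer code differ):
```
import torch

a, b = torch.randn(1000), torch.randn(1000)
old_style = (a == b).sum() == len(a)   # reduce, then compare against the length
new_style = torch.all(a == b)          # single reduction
assert bool(old_style) == bool(new_style)
```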
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23170426
fbshipit-source-id: ee5c25eb93cee1430661128ac9458a9c525df8e5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43150
The current logic was expensive because it created tensors on CUDA.
Switching to clamp since it can work without needing to create tensors.
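A hedged sketch of the kind of change (the real code operates on the observer's min_val/max_val buffers):
```
import torch

min_val = torch.randn(())

# before: materializes an extra zero tensor on min_val's device
clamped_old = torch.min(min_val, torch.zeros_like(min_val))
# after: clamp needs no extra tensor on the device
clamped_new = torch.clamp(min_val, max=0.0)
assert torch.equal(clamped_old, clamped_new)
```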
Test Plan:
benchmarks
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23170427
fbshipit-source-id: 6fe3a728e737aca9f6c2c4d518c6376738577e21
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43149
This value doesn't change, so make it a buffer in order to pay
the cost of creating the tensor only once.
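A hedged sketch of the pattern (the observers register a similar `eps` buffer):
```
import torch

class TinyObserver(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # created once; follows .to()/.cuda() and is saved in the state_dict
        self.register_buffer("eps", torch.tensor(torch.finfo(torch.float32).eps))

    def forward(self, x):
        # use the pre-built buffer instead of constructing a tensor per call
        return torch.max(x.abs().max(), self.eps)
```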
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D23170428
fbshipit-source-id: 6b963951a573efcc5b5a57649c814590b448dd72
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42602
In this diff, clearer semantics and naming are introduced by splitting the original `init_dynamic_qrange` into two separate `Optional[int]` parameters, `qmin` and `qmax`, to avoid confusing these parameters with dynamic quantization.
The `qmin` and `qmax` parameters allow users to specify a custom quantization range and enable specific use cases for lower-bit quantization.
Test Plan:
To assert the correctness and compatibility of the changes with existing observers, on a devvm, execute the following command to run the unit tests:
`buck test //caffe2/test:quantization -- observer`
Reviewed By: vkuzo, raghuramank100
Differential Revision: D22948334
fbshipit-source-id: 275bc8c9b5db4ba76fc2e79ed938376ea4f5a37c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43015
Currently activation_post_process modules are inserted by default in QAT modules, which is not
friendly to automatic quantization tools; this PR removes them.
Test Plan:
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D23105059
fbshipit-source-id: 3439ac39e718ffb0390468163bcbffd384802b57
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42343
Currently activation_post_process modules are inserted by default in QAT modules, which is not
friendly to automatic quantization tools; this PR removes them.
Test Plan: Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D22856816
fbshipit-source-id: 988a43bce46a992b38fd0d469929f89e5b046131
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42576
Previously we had a qconfig propagation list and only attached a qconfig to modules
in the list. This works when everything is quantized in the form of modules,
but now that we are expanding quantization to functional/torch ops, we need to attach a qconfig
to all modules.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D22939453
fbshipit-source-id: 7d6a1f73ff9bfe461b3afc75aa266fcc8f7db517
Summary:
This diff adds FakeQuantizeWithBackward. This works the same way as the regular FakeQuantize module, allowing QAT to occur in the forward pass, except it has an additional quantize_backward parameter. When quantize_backward is enabled, the gradients are fake quantized as well (dynamically, using hard-coded values). This allows the user to see whether there would be a significant loss of accuracy if the gradients were quantized in their model.
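For illustration, a rough sketch of the general idea (not the module added in this diff): a custom autograd Function that fake-quantizes in the forward pass and additionally fake-quantizes the incoming gradient in the backward pass, with dynamically chosen gradient qparams.
```
import torch

class FakeQuantWithQuantizedGrad(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, scale, zero_point):
        return torch.fake_quantize_per_tensor_affine(x, scale, zero_point, 0, 255)

    @staticmethod
    def backward(ctx, grad_out):
        # dynamically pick a scale for the gradient, then fake-quantize it
        g_scale = float(grad_out.abs().max().clamp(min=1e-12)) / 127.0
        g = torch.fake_quantize_per_tensor_affine(grad_out, g_scale, 0, -128, 127)
        return g, None, None
```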
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40532
Test Plan: The relevant test for this can be run using `python test/test_quantization.py TestQATBackward.test_forward_and_backward`
Reviewed By: supriyar
Differential Revision: D22217029
Pulled By: durumu
fbshipit-source-id: 7055a2cdafcf022f1ea11c3442721ae146d2b3f2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42348
Use the dtype info in PlaceholderObserver to decide what ops to insert in the graph.
In the next PR we can delete NoopObserver.
Test Plan:
python test/test_quantization.py
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D22859457
fbshipit-source-id: a5c618f22315534ebd9a2df77b14a0aece196989
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42222
This change adds the necessary passes to perform FP16 dynamic quantization.
We skip inserting observers for activations based on the dtype (torch.float16) and only insert the Fp16Observer for weights
Test Plan:
python test/test_quantization.py TestQuantizeJitOps
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D22849220
fbshipit-source-id: 2c53594ecd2485e9e3dd0b380eceaf7c5ab5fc50
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42221
Adds a new observer that emits a warning if the range of the tensor is beyond the fp16 range. This will be further used in graph mode quantization to insert cast-to-fp16 ops in the graph
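A minimal sketch of the kind of check described above (the warning text here is illustrative, not the one emitted by the observer):
```
import warnings
import torch

def warn_if_outside_fp16_range(x: torch.Tensor) -> None:
    fp16_max = torch.finfo(torch.float16).max  # 65504.0
    if x.abs().max() > fp16_max:
        warnings.warn("tensor values exceed the fp16 range and would overflow on cast")

warn_if_outside_fp16_range(torch.tensor([1e5]))  # triggers the warning
```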
Test Plan:
python test/test_quantization.py TestObserver.test_fp16_observer
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D22849222
fbshipit-source-id: a301281ce38ba4d4e7a009308400d34a08c113d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41969
In this diff, the `_LearnableFakeQuantize` module is extended to provide support for gradient scaling where the gradients for both scale and zero point are multiplied by a constant `g` (in some cases, can help with quicker convergence). In addition, it is also augmented to provide a factory method via `_with_args` such that a partial constructor of the module can be built.
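A hedged sketch of the factory-method pattern that `_with_args` provides, shown here via the public observer equivalent (`with_args`); the learnable fake quantize module exposes the same idiom:
```
import torch
from torch.quantization import MinMaxObserver, QConfig

# with_args returns a callable that remembers its keyword arguments, so a
# qconfig can carry a pre-configured constructor instead of an instance.
act_factory = MinMaxObserver.with_args(dtype=torch.quint8)
wt_factory = MinMaxObserver.with_args(dtype=torch.qint8, qscheme=torch.per_tensor_symmetric)
qconfig = QConfig(activation=act_factory, weight=wt_factory)

act_observer = qconfig.activation()   # constructed later, one per module
```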
Test Plan:
For correctness of the fake quantizer operators, on a devvm, enter the following command:
```
buck test //caffe2/torch:quantization -- learnable_py_module
```
Reviewed By: z-a-f
Differential Revision: D22715629
fbshipit-source-id: ff8e5764f81ca7264bf9333789f57e0b0cec7a72
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42034
In this diff, scale and zero point gradient calculations are updated to correctly reflect the actual backpropagation equation (instead of `dScale * dX`, the near-final output should be `dScale * dY`; the same applies to zero point).
Test Plan:
To execute the unit tests for all affected learnable fake quantize modules and kernels, on a devvm, execute the following command:
`buck test //caffe2/test:quantization -- learnable`
To enable the `cuda` tests, execute the following command:
`buck test mode/dev-nosan //caffe2/test:quantization -- learnable`
Reviewed By: jerryzh168
Differential Revision: D22735668
fbshipit-source-id: 45c1e0fd38cbb2d8d5e60be4711e1e989e9743b4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42033
In this diff, the Python `_LearnableFakeQuantize` module is updated where the gradient with respect to the input `x` is actually computed instead of passed through. Argument naming is also updated for better clarity; and unit tests on the `PerTensor` and `PerChannel` operators are added for asserting correctness.
Test Plan:
On a devvm, execute the command:
`buck test //caffe2/test:quantization -- learnable_py_module`
To include `cuda` tests as well, run:
`buck test mode/dev-nosan //caffe2/test:quantization -- learnable_py_module`
Reviewed By: jerryzh168
Differential Revision: D22735580
fbshipit-source-id: 66bea7e9f8cb6422936e653500f917aa597c86de
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41930
As title
ghstack-source-id: 108517079
Test Plan: CI
Reviewed By: jerryzh168
Differential Revision: D22698386
fbshipit-source-id: 4f748c9bae4a0b615aa69c7cc8d8e451e5d26863
Summary:
Added logic so that if a prehook is passed into the prepare method during quantization, the hook will be added as a forward pre-hook to all leaf modules (and modules specified in the non_leaf_module_list).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41863
Test Plan:
Small demo: made a simple module, then called prepare with the prehook parameter set to the Numeric Suite logger, and printed the results to verify it's what we wanted.
Reviewed By: jerryzh168
Differential Revision: D22671288
Pulled By: edmundw314
fbshipit-source-id: ce65a00830ff03360a82c0a075b3b6d8cbc4362e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41612
This change adds preliminary support to quantize the EmbeddingBag operators. We currently support 4-bit and 8-bit quantization+packing of the weights.
To quantize these operators, specify the operator name in the `custom_op_name` field of the NoopObserver. Based on the op name (4bit or 8bit) we call the corresponding quantization functions.
Refer to the test plan for how to invoke the qconfig for the embedding_bag ops.
Future versions of this will support 4-bit and 2-bit qtensors with native support to observe and quantize it.
NB - This version assumes that the weights in the EmbeddingBag Module reside on the same device.
Test Plan:
python test/test_quantization.py TestQuantizeDynamicJitOps.test_embedding_bag
Imported from OSS
Reviewed By: vkuzo, jerryzh168
Differential Revision: D22609342
fbshipit-source-id: 23e33f44a451c26719e6e283e87fbf09b584c0e6
Summary:
The goal is to implement cross layer equalization as described in section 4.1 in this paper: https://arxiv.org/pdf/1906.04721.pdf
Given two adjacent submodules A, B in a trained model, quantization might hurt one of the submodules more than the other. The paper poses the idea that a loss in accuracy from quantizing can be due to a difference in the channel ranges between the two submodules (the output channel range of A can be small, while the input channel range of B can be large). To minimize this source of error, we want to scale the tensors of A and B such that their channel ranges are equal (equalizing the ranges minimizes this source of error).
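A hedged sketch of the equalization scale from section 4.1: with W1 rescaled by 1/s per output channel and W2 by s per input channel, the ranges match when s_i = sqrt(r1_i / r2_i), where r1_i is the range of output channel i of A and r2_i the range of input channel i of B.
```
import torch

W1 = torch.randn(8, 4)   # A: (out_channels, in_channels)
W2 = torch.randn(6, 8)   # B consumes A's 8 output channels

r1 = W1.abs().max(dim=1).values   # per-output-channel range of A
r2 = W2.abs().max(dim=0).values   # per-input-channel range of B
s = torch.sqrt(r1 / r2)

W1_eq = W1 / s.unsqueeze(1)
W2_eq = W2 * s.unsqueeze(0)
# both sets of ranges are now sqrt(r1 * r2)
assert torch.allclose(W1_eq.abs().max(dim=1).values, W2_eq.abs().max(dim=0).values)
```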
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41685
Test Plan: Imported from OSS
Reviewed By: z-a-f
Differential Revision: D22630219
Pulled By: edmundw314
fbshipit-source-id: ccc91ba12c10b652d7275222da8b85455b8a7cd5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41815
**All are minor changes to enable better simulations.**
The constructors of MinMaxObserver, MovingAverageMinMaxObserver, PerChannelMinMaxObserver, and MovingAveragePerChannelMinMaxObserver are augmented so they can utilize the dynamic quantization range support in the _ObserverBase class.
In addition, minor adjustments are made to the enable_static_observation function, which allows the observer to update its parameters but does not fake quantize the output (for constructing a baseline).
Test Plan:
To ensure this modification is still backward compatible with past usages, numerics are verified by running the quantization unit test suite, which contains various observer tests. The following command executes the test suite, which also verifies the observer numerics:
```
buck test //caffe2/test:quantization -- observer
```
Reviewed By: z-a-f
Differential Revision: D22649128
fbshipit-source-id: 32393b706f9b69579dc2f644fb4859924d1f3773
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41535
A generalized fake quantization module is built to support lower-bit fake quantization with back propagation on the scale and zero point. The module supports both per tensor and per channel fake quantization.
Test Plan:
Please see diff D22337313 for a related experiment performed on the fake quantizer module.
The `_LearnableFakeQuantize` module supports the following use cases:
- Per Tensor Fake Quantization or Per Channel Fake Quantization
- Static Estimation from Observers or Quantization Parameter Learning through Back Propagation
By default, the module assumes per tensor affine fake quantization. To switch to per channel, during initialization, declare `channel_size` with the appropriate length. To toggle between utilizing static estimation and parameter learning with back propagation, you can invoke the call `enable_param_learning` or `enable_static_estimate`. For more information on the flags that support these operations, please see the doc string of the `_LearnableFakeQuantize` module.
The `_LearnableFakeQuantize` module relies on 2 operators for its forward and backward paths: `_LearnableFakeQuantizePerTensorOp` and `_LearnableFakeQuantizePerChannelOp`. The backpropagation routine is developed based on the following literature (a sketch of the scale gradient follows the list):
- Learned Step Size Quantization: https://openreview.net/pdf?id=rkgO66VKDS
- Trained Quantization Thresholds: https://arxiv.org/pdf/1903.08066.pdf
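For reference, a rough sketch of the per-element d(x_hat)/d(scale) term from the LSQ paper (straight-through rounding; (qmin, qmax) is the integer range) - the operators above additionally apply the incoming gradient and a gradient scale factor:
```
import torch

def lsq_scale_grad(x: torch.Tensor, s: float, qmin: int, qmax: int) -> torch.Tensor:
    # inside the range: round(x/s) - x/s; clipped below/above: qmin / qmax
    q = x / s
    grad = torch.round(q) - q
    grad = torch.where(q <= qmin, torch.full_like(q, float(qmin)), grad)
    grad = torch.where(q >= qmax, torch.full_like(q, float(qmax)), grad)
    return grad
```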
Reviewed By: z-a-f
Differential Revision: D22573645
fbshipit-source-id: cfd9ece8a959ae31c00d9beb1acf9dfed71a7ea1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41113
In this diff, the `ObserverBase` class is augmented with 2 additional optional arguments qmin and qmax. Correspondingly the calculation of qmin and qmax and the related quantization parameters are modified to accommodate this additional flexibility should the number of bits for quantization be lower than 8 (the default value).
Additional logic in the base class `_calculate_qparams` function has also been modified to provide support for dynamic quantization range.
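A hedged sketch of the standard affine qparam computation that `_calculate_qparams` generalizes to an arbitrary (qmin, qmax) pair (the real implementation also handles symmetric and per-channel schemes):
```
import torch

def calc_affine_qparams(min_val: float, max_val: float, qmin: int, qmax: int):
    # include zero so that it is exactly representable
    min_val, max_val = min(min_val, 0.0), max(max_val, 0.0)
    scale = (max_val - min_val) / float(qmax - qmin)
    scale = max(scale, torch.finfo(torch.float32).eps)
    zero_point = int(round(qmin - min_val / scale))
    zero_point = min(qmax, max(qmin, zero_point))
    return scale, zero_point

scale, zp = calc_affine_qparams(-1.0, 2.0, 0, 15)   # e.g. an unsigned 4-bit range
```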
Test Plan:
To ensure this modification is still backward compatible with past usages, numerics are verified by running the quantization unit test suite, which contains various observer tests. The following command executes the test suite, which also verifies the observer numerics:
`buck test //caffe2/test:quantization -- observer`
This modified observer script can be tested within the experiments for lower bit fake quantization. Please see the following diffs for reference.
- Single Fake Quantizer: D22337447
- Single Conv Layer: D22338532
Reviewed By: z-a-f
Differential Revision: D22427134
fbshipit-source-id: f405e633289322078b0f4a417f54b684adff2549
Summary:
1. During convert(), preserve the module's **pre and post forward** hooks.
2. During fusion, preserve only the module's **pre forward** hooks (because after fusion the output is no longer the same).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37233
Differential Revision: D22425141
Pulled By: jerryzh168
fbshipit-source-id: e69b81821d507dcd110d2ff3594ba94b9593c8da
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40475
As title
ghstack-source-id: 106474870
Test Plan: CI
Differential Revision: D22200640
fbshipit-source-id: 1f4c7bbf54be8c4187c9338fefdf14b501597d98
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40396
Removes activation and normalization modules from eager mode QAT.
These were incorrectly added, but we don't actually need them.
Test Plan:
```
python test/test_quantization.py TestQuantizationAwareTraining
```
Imported from OSS
Differential Revision: D22169768
fbshipit-source-id: b5bd753dafe92e90e226fb773eb18c6aae179703