Commit Graph

73 Commits

Author SHA1 Message Date
albanD
058fb1790f Fix compilation and "import torch" issues for cpython 3.14 (#158184)
Beginning of process for 3.14 bringup.

State of things from this PR:
- Nothing too scary looking from the Dynamo CPython side; nothing we heavily rely on seems to be missing @williamwen42
- The existing check that makes torch.compile() nicely fail is working as expected. So all these empty functions shouldn't cause any weirdness.
- The `__module__` update changes look suspicious; we should investigate the reason for and impact of that change, in particular for our public API checking @jbschlosser
- Leaving the weakref.py thread safety change as a follow up to keep this a bit simpler. I vendored the whole struct in the meantime FYI @ezyang

EDIT: The `__module__` change is even more cursed than I thought due to changes to the Union and Optional types, whose `__module__` field cannot be changed anymore. See https://github.com/python/cpython/issues/132139 for details.
For now, I'm just skipping the `__module__` setting for 3.14 which will trip the public API checks. Will revisit once I have a final answer on the cpython issue.
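A minimal sketch (a hypothetical helper, not the actual PR diff) of what version-gating the `__module__` rewrite might look like:

```
import sys

def _maybe_set_module(obj, module_name):
    # Hypothetical helper: on Python 3.14+ some typing objects (e.g. Union/
    # Optional aliases) reject assignment to __module__, so skip the rewrite
    # there and accept that the public API check will flag these entries.
    if sys.version_info >= (3, 14):
        return
    try:
        obj.__module__ = module_name
    except (AttributeError, TypeError):
        pass
```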

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158184
Approved by: https://github.com/msaroufim
2025-07-15 05:06:55 +00:00
Xuehai Pan
f8293116f5 [BE][13/16] fix typos in torch/ (torch/ao/) (#156603)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156603
Approved by: https://github.com/msaroufim
2025-06-29 04:34:04 +00:00
Xuehai Pan
279cae52e7 [BE][PYFMT] migrate PYFMT for torch/ao/ to ruff format (#148185)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148185
Approved by: https://github.com/ezyang
2025-06-14 16:47:04 +00:00
Jerry Zhang
d23aa7e182 Add deprecation warning for torch.ao.quantization (#153892)
Summary:
As titled.

Test Plan:
(ao) $ PYTHONWARNINGS='default' python
Python 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from torch.ao.quantization.quantizer.xnnpack_quantizer import XNNPACKQuantizer
printing warning
*/anaconda3/envs/ao/lib/python3.10/site-packages/torch/ao/quantization/__init__.py:36: DeprecationWarning: torch.ao.quantization is deprecated. Plan is to
1. Remove eager mode quantization (torch.ao.quantization.quantize, torch.ao.quantization.quantize_dynamic), please migrate to use torchao eager mode quantize_ API instead
2. Remove fx graph mode quantization (torch.ao.quantization.quantize_fx.prepare_fx, torch.ao.quantization.quantize_fx.convert_fx, please migrate to use torchao pt2e quantization API instead (prepare_pt2e, convert_pt2e)
3. pt2e quantization has been migrated to torchao (https://github.com/pytorch/ao/tree/main/torchao/quantization/pt2e)
see https://dev-discuss.pytorch.org/t/torch-ao-quantization-migration-plan/2810 for more details
  warnings.warn(
>>> a = XNNPACKQuantizer()
*/anaconda3/envs/ao/lib/python3.10/site-packages/torch/ao/quantization/quantizer/xnnpack_quantizer.py:281: DeprecationWarning: XNNPACKQuantizer is deprecated! Please use xnnpack quantizer in ExecuTorch (https://github.com/pytorch/executorch/tree/main/backends/xnnpack/quantizer) instead
  warnings.warn(f"{self.__class__.__name__} is deprecated! Please use xnnpack quantizer in ExecuTorch (https://github.com/pytorch/executorch/tree/main/backends/xnnpack/quantizer) instead", DeprecationWarning)
>>>
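For reference, a minimal sketch of the kind of module-level warning exercised above (the real message and location are in `torch/ao/quantization/__init__.py`):

```
import warnings

# Sketch only; the actual warning text is the multi-line message shown above.
warnings.warn(
    "torch.ao.quantization is deprecated, please migrate to the torchao "
    "quantization APIs (see the migration plan for details)",
    DeprecationWarning,
    stacklevel=2,
)
```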

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153892
Approved by: https://github.com/Skylion007
2025-05-28 16:25:30 +00:00
Aaron Orenstein
9e0437a04a PEP585 update - torch/ao/quantization (#145140)
See #145101 for details.
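As a quick illustration of the change (the function below is made up), PEP 585 lets the builtin containers be used as generics directly:

```
# Before (typing generics):
#   from typing import Dict, List, Tuple
#   def bucketize(xs: Tuple[int, ...]) -> Dict[str, List[int]]: ...

# After (PEP 585 builtin generics):
def bucketize(xs: tuple[int, ...]) -> dict[str, list[int]]:
    return {"even": [x for x in xs if x % 2 == 0],
            "odd": [x for x in xs if x % 2 == 1]}
```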

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145140
Approved by: https://github.com/bobrenjc93
2025-01-19 10:20:00 +00:00
bobrenjc93
a55977f763 Migrate from Tuple -> tuple in torch/ao (#144265)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144265
Approved by: https://github.com/aorenste
2025-01-10 00:12:06 +00:00
Tom Ritchford
dc23f1944a Remove unused Python variables in torch/[_-a]* (#133492)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492
Approved by: https://github.com/albanD
2024-12-12 17:39:14 +00:00
PyTorch MergeBot
5c97ac9721 Revert "Remove unused Python variables in torch/[_-a]* (#133492)"
This reverts commit fda975a7b3.

Reverted https://github.com/pytorch/pytorch/pull/133492 on behalf of https://github.com/clee2000 due to Sorry, I need to revert this in order to revert something else.  The only thing you need to do is rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/133492#issuecomment-2536635516))
2024-12-11 17:29:12 +00:00
Tom Ritchford
fda975a7b3 Remove unused Python variables in torch/[_-a]* (#133492)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492
Approved by: https://github.com/albanD
2024-12-10 21:48:44 +00:00
yintong-lu
3361908fc5 torch/ao/quantization/utils.py: Moving eps to targeted device to avoid device mismatch issue (#135204)
MOTIVATION

We recently verified some quantization tests on devices other than CPU (e.g. CUDA and Intel Gaudi devices identified as 'hpu'). We noticed a device mismatch error, as eps is a tensor created on CPU while the other tensors (min_val_neg, max_val_pos, scale, zero_point) are moved to the targeted _device_.

CHANGES

Move eps to the _device_ of the other tensors.
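A minimal sketch of the fix described above (assumed shape of the code, not the exact diff):

```
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
min_val_neg = torch.tensor([-0.5], device=device)
max_val_pos = torch.tensor([0.5], device=device)

# Create eps on the same device as the tensors it is combined with,
# instead of leaving it on CPU.
eps = torch.tensor([torch.finfo(torch.float32).eps], device=min_val_neg.device)
scale = torch.max(max_val_pos - min_val_neg, eps) / 255.0
```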
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135204
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2024-10-15 14:58:55 +00:00
Shen Xu
19a4d68224 Add missing mappings to support torch.uint16 in quantization and export (#136547)
Test Plan: CI.

Differential Revision: D63142844

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136547
Approved by: https://github.com/angelayi
2024-10-01 00:01:01 +00:00
Jerry Zhang
f2b0fc89f2 Add uint16 support for observer (#136238)
Summary:
As titled.

Test Plan:
python test/test_quantization.py -k TestObserver

Differential Revision: [D62909821](https://our.internmc.facebook.com/intern/diff/D62909821)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136238
Approved by: https://github.com/tarun292
2024-09-18 23:52:18 +00:00
Xuehai Pan
758a0a88a2 [BE][Easy] enable ruff rule PIE790: unnecessary pass statement (#133200)
This PR removes unnecessary `pass` statements. This is semantically safe because the bytecode for the Python code does not change.

Note that if a function has a docstring, an otherwise empty function does not need a `pass` statement as a placeholder.
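For illustration, the pattern the rule removes:

```
# Flagged by PIE790: the docstring already serves as the function body.
def not_implemented_yet():
    """Placeholder for a future implementation."""
    pass  # unnecessary; removing it leaves the bytecode unchanged

# After the fix:
def not_implemented_yet_fixed():
    """Placeholder for a future implementation."""
```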

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133200
Approved by: https://github.com/malfet, https://github.com/eqy, https://github.com/kit1980
2024-08-15 15:50:19 +00:00
Oguz Ulgen
72d2dba992 Add None return type to init (#132335)
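An illustrative example of the annotation being added (the class below is made up):

```
class Counter:
    def __init__(self) -> None:  # explicit None return type on __init__
        self.count = 0
```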
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132335
Approved by: https://github.com/albanD
2024-08-01 15:26:45 +00:00
Xuehai Pan
2ce734cee9 [BE] enable UFMT for torch/ao/quantization/ (#128863)
Part of #123062

- #123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128863
Approved by: https://github.com/ezyang
ghstack dependencies: #128861, #128862
2024-07-25 04:17:54 +00:00
Aaron Orenstein
62bcdc0ac9 Flip default value for mypy disallow_untyped_defs [4/11] (#127841)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127841
Approved by: https://github.com/oulgen
2024-06-08 18:36:48 +00:00
Andrea Frittoli
04272a0e12 Add docstring for the torch.ao.quantization.utils.get_combined_dict function (#128127)
Fixes: #127906
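A rough sketch of what the documented helper does (illustrative only; see the PR for the actual docstring and implementation):

```
def get_combined_dict(default_dict, additional_dict):
    """Return a new dict with ``additional_dict`` entries layered on top of
    ``default_dict`` (illustrative sketch of the documented behavior)."""
    combined = dict(default_dict)
    combined.update(additional_dict)
    return combined
```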

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128127
Approved by: https://github.com/jerryzh168
2024-06-06 21:22:09 +00:00
andrewor14
85b28ffc3a [quant][pt2e] Move batch norm op between eval/train for cuda (#123957)
Summary: Before in `move_exported_model_to_train/eval`, we only
switched the CPU versions of the batch norm op. This commit adds
support for the cuda versions of the op too. Note that this fix
is temporary; we won't have to differentiate between these two
cases once we have batch norm consolidation.
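A hedged sketch of the general pattern (the helper and op arguments are placeholders, not the exact ops handled by this PR): walk the exported FX graph and swap the batch norm call target for the requested mode.

```
import torch

def _swap_bn_target(gm: torch.fx.GraphModule, from_op, to_op) -> None:
    # Placeholder helper: retarget call_function nodes from the train-mode
    # batch norm op to its eval-mode counterpart (or vice versa).
    for node in gm.graph.nodes:
        if node.op == "call_function" and node.target is from_op:
            node.target = to_op
    gm.graph.lint()
    gm.recompile()
```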

Test Plan:
python test/test_quantization.py -k test_move_exported_model_bn

Reviewers: jerryzh168

Subscribers: jerryzh168, leslie-fang-intel, supriyar

Differential Revision: [D56070054](https://our.internmc.facebook.com/intern/diff/D56070054)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123957
Approved by: https://github.com/jerryzh168
2024-04-24 22:01:50 +00:00
PyTorch MergeBot
e739a2d59e Revert "[quant][pt2e] Move batch norm op between eval/train for cuda (#123957)"
This reverts commit 4efb28c900.

Reverted https://github.com/pytorch/pytorch/pull/123957 on behalf of https://github.com/jeanschmidt due to reverting to check if it will fix rocm jobs on main ([comment](https://github.com/pytorch/pytorch/pull/123957#issuecomment-2075158146))
2024-04-24 15:02:11 +00:00
andrewor14
4efb28c900 [quant][pt2e] Move batch norm op between eval/train for cuda (#123957)
Summary: Before in `move_exported_model_to_train/eval`, we only
switched the CPU versions of the batch norm op. This commit adds
support for the cuda versions of the op too. Note that this fix
is temporary; we won't have to differentiate between these two
cases once we have batch norm consolidation.

Test Plan:
python test/test_quantization.py -k test_move_exported_model_bn

Reviewers: jerryzh168

Subscribers: jerryzh168, leslie-fang-intel, supriyar

Differential Revision: [D56070054](https://our.internmc.facebook.com/intern/diff/D56070054)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123957
Approved by: https://github.com/jerryzh168
2024-04-24 01:02:59 +00:00
Amadeusz Skrzypczak
107f944f22 Support fp8 quantization (#123161)
This commit enables float8_e5m2 and float8_e4m3fn dtypes in fx quantization and PT2E.

Motivation for using fp8 quantization instead of int8:
- it works better to run inference with the same datatype the model was trained with,
- fp8 can handle outliers better, which is one of the problems in LLMs activations.

The numerical recipe we want to use it for is fp8 inference:
- bgemms/gemms running in float8_e4m3fn,
- Per-Tensor-Quantization/Scaling,
- amax observer for measurement with input_backoff and weight_backoff.
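A rough illustration of per-tensor fp8 scaling in the spirit of this recipe (an assumption for illustration, not the PR's code; requires a PyTorch build with float8 dtypes):

```
import torch

x = torch.randn(4, 8)
# amax-based per-tensor scale for float8_e4m3fn.
amax = x.abs().max()
scale = amax / torch.finfo(torch.float8_e4m3fn).max
x_fp8 = (x / scale).to(torch.float8_e4m3fn)   # "quantize"
x_restored = x_fp8.to(torch.float32) * scale  # "dequantize"
```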
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123161
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2024-04-23 13:35:27 +00:00
Aaron Gokaslan
c5fafe9f48 [BE]: TRY002 - Ban raising vanilla exceptions (#124570)
Adds a ruff lint rule to ban raising raw exceptions. Most of these should at the very least be runtime errors, value errors, type errors, or some other specific error type. There are hundreds of instances of these bad exception types already in the codebase, so I have noqa'd most of them. Hopefully this error code will get committers to rethink what exception type they should raise when they submit a PR.

I also encourage people to gradually go and fix all the existing noqas that have been added so they can be removed over time and our exception typing can be improved.
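For illustration, the kind of change the rule nudges toward:

```
def parse_positive_int(raw: str) -> int:
    value = int(raw)
    if value <= 0:
        # Prefer a specific exception type over `raise Exception(...)`.
        raise ValueError(f"expected a positive integer, got {value}")
    return value
```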

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124570
Approved by: https://github.com/ezyang
2024-04-21 22:26:40 +00:00
Xia, Weiwen
94db6578cc [Quant] Add dynamic quantization config for x86 inductor backend (#115337)
**Description**
Add dynamic quantization config for x86 inductor backend.
To support the QKV structure in self-attention, we removed an assertion in the port-metadata pass that required a single dequantize node after a quantize node.

**Test plan**
```
python test/test_quantization.py -k TestQuantizePT2EX86Inductor.test_dynamic_quant_linear
python test/test_quantization.py -k TestQuantizePT2EX86Inductor.test_qat_dynamic_quant_linear
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115337
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
2024-01-10 11:33:37 +00:00
Aaron Gokaslan
a0632389b7 [BE]: Update lintrunner mypy to 1.6.0 (#111375)
Follow up to #111305 that updates lintrunner's version too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111375
Approved by: https://github.com/malfet
2023-10-17 01:22:06 +00:00
Jerry Zhang
32a16d4999 [quant][pt2e] Support int16 quantization (#108453)
Summary:
Previously we could only use native PyTorch int dtypes that have corresponding quantized dtypes (e.g. quint8, qint8). This
PR removes this assumption in observers/fake_quants so that users can use all PyTorch native dtypes (except for int64, which we can add later if needed);
the main addition here is int16.
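A hedged example of the capability described above (assumed usage; exact observer arguments may vary by version):

```
import torch
from torch.ao.quantization.observer import MinMaxObserver

# int16 is a plain torch dtype with no quantized counterpart; after this change
# it can still drive an observer by specifying the quant range explicitly.
obs = MinMaxObserver(dtype=torch.int16, quant_min=-(2**15), quant_max=2**15 - 1)
obs(torch.randn(8, 8))
scale, zero_point = obs.calculate_qparams()
```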

Test Plan:
python test/test_quantization.py TestQuantizePT2E

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108453
Approved by: https://github.com/kimishpatel
2023-09-06 19:31:20 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it to keep things that way. :)
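For illustration, the anti-pattern RUF017 catches:

```
lists = [[1, 2], [3, 4], [5, 6]]
# flat = sum(lists, [])                          # quadratic; flagged by RUF017
flat = [item for sub in lists for item in sub]   # linear alternative
```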

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was prob accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it to keep things that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
Justin Chu
c0d8a4af0a [BE] Enable ruff's UP rules and autoformat ao/ (#105430)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105430
Approved by: https://github.com/albanD, https://github.com/malfet
2023-07-19 13:44:37 +00:00
Jerry Zhang
ce8d31551b [quant][be] Change return type for zero_point to be int32 Tensor (#102234)
Summary: This is probably a typo

Test Plan: CI

Reviewed By: salilsdesai

Differential Revision: D46172706

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102234
Approved by: https://github.com/salilsdesai
2023-06-01 18:30:44 +00:00
Richard Barnes
6120c5842c [codemod] Replace hasattr with getattr in caffe2/torch/ao/quantization/utils.py (#100361)
Summary:
The pattern
```
X.Y if hasattr(X, "Y") else Z
```
can be replaced with
```
getattr(X, "Y", Z)
```

The [getattr](https://www.w3schools.com/python/ref_func_getattr.asp) function gives more succinct code than the [hasattr](https://www.w3schools.com/python/ref_func_hasattr.asp) function. Please use it when appropriate.

**This diff is very low risk. Green tests indicate that you can safely Accept & Ship.**

Test Plan: Sandcastle

Reviewed By: jerryzh168

Differential Revision: D44886493

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100361
Approved by: https://github.com/Skylion007
2023-05-04 14:46:38 +00:00
Wyatt Borsos
6361c3debc Return zero_point from determine_qparams as a int64 (#98746)
Summary:
In some cases, zero_point is returned as an int tensor. We want it to be a long.

This fixes a failed assertion in Executorch op_choose_qparams:
https://www.internalfb.com/code/fbsource/[4609e7dbbf2e]/fbcode/executorch/kernels/quantized/cpu/op_choose_qparams.cpp?lines=49-52

Test Plan: CI

Reviewed By: jerryzh168

Differential Revision: D44764070

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98746
Approved by: https://github.com/jerryzh168
2023-04-11 19:01:05 +00:00
Kazuaki Ishizaki
a13a63ae9a Fix typos under torch/ao directory (#97679)
This PR fixes typos in comments and messages of `.py` files under `torch/ao` directory

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97679
Approved by: https://github.com/janeyx99, https://github.com/kit1980
2023-04-10 22:25:15 +00:00
Vasiliy Kuznetsov
cdab1d676c pt2e short term quant: respect qmin/qmax for linear weight (#96232)
Summary:

Makes the `nnqr.Linear` module respect the qmin/qmax attributes of the weight observer.  This is to unblock some customer teams who depend on non-default values of these attributes.
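An illustrative weight observer carrying non-default qmin/qmax (an assumption for illustration, not taken from the PR):

```
import torch
from torch.ao.quantization.observer import MinMaxObserver

# Restrict the weight range to [-127, 127] instead of the default [-128, 127];
# a reference module that respects these attributes should quantize accordingly.
weight_observer_ctr = MinMaxObserver.with_args(
    dtype=torch.qint8, qscheme=torch.per_tensor_symmetric,
    quant_min=-127, quant_max=127,
)
weight_observer = weight_observer_ctr()
```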

Test plan:

```
python test/test_quantization.py -k TestReferenceQuantizedModule.test_linear_decomposed
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96232
Approved by: https://github.com/andrewor14
2023-03-10 04:46:20 +00:00
andrewor14
faa4cb29b2 [Quant][fx] Create new FX-based LSTM reference module (#96343)
Summary: The previous LSTM reference module implementation did
not handle dtypes other than quint8 correctly. This is because
the internal LSTM custom module quantization used eager mode,
which did not insert the q-dq ops properly. E.g., we want the
following reference quantized model:

```
[dq -> linear1_fp32 -> q_to_qint32] -> dq -> q_to_quint8 ->
  [dq - linear2_fp32 -> q_to_quint8] -> dq -> ...
```

This requires two sets of `q - dq` pairs between two adjacent
ops that have different dtypes (linear1 and linear2). However,
these `q - dq` pairs were not inserted in the old flow, because
eager mode required users to insert Quant/DeQuantStubs manually.

This commit changes the internal LSTM custom module quantization
to use FX graph mode quantization, which automatically inserts
the `q - dq` ops that convert the dtypes between adjacent ops
correctly. However, using FX graph mode quantization here comes
with its own set of challenges that required some hacks to get
the end-to-end flow to work. These hacks are detailed in the
comments in the util functions.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_static_lstm_with_custom_fixed_qparams

This commit also updates the corresponding test to verify the
dtypes as well as the qparams in the reference quantized graph.
This test case should serve as an example for users to set up
their own LSTM reference module flows.

Reviewers: vkuzo, supriyar, jcaip

Subscribers: vkuzo, supriyar, jcaip
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96343
Approved by: https://github.com/vkuzo
2023-03-09 23:23:48 +00:00
Jacob Szwejbka
fc324d3485 [quant][pt2e] Add support for dynamic quantization with symmetric quant for input (#94854)
Summary:
Previously we assumed asymmetric quantization for dynamic quantization; this diff adds support for symmetric quantization
of the input in dynamic quantization.

Test Plan: buck run executorch/exir/tests:quant_lowering_custom_backend_pass -- "executorch.exir.tests.test_quant_lowering_custom_backend_pass.TestQuantLoweringCustomBackendPass.test_quantized_linear_dynamic"

Reviewed By: digantdesai

Differential Revision: D43134794

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94854
Approved by: https://github.com/digantdesai
2023-02-28 19:39:31 +00:00
andrewor14
a3b505c55e [Quant] Fix setting fixed qparams for inner LSTM ops (#95537)
Summary: The existing util function did not quantize all inner
ops in the quantizable LSTM module, resulting in the error
"Could not run X with arguments from the 'QuantizedCPU' backend."
This commit fixes this by ensuring that all the other ops whose
qparams were not specifically configured are still quantized as
before, as in `torch.ao.nn.quantizable.LSTM.from_float`.

Test Plan: This commit also adds an additional check in the test
to ensure that the final converted model is in fact quantized,
in addition to just checking the qparams in the observers have
the right values.

python test/test_quantization.py TestQuantizeFx.test_static_lstm_with_custom_fixed_qparams

Reviewers: vkuzo

Subscribers: vkuzo, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95537
Approved by: https://github.com/vkuzo
2023-02-27 19:08:51 +00:00
Jesse Cai
cba8b12fa7 [quant][bug fix] Fix qrange_len in torch.ao.quantization.utils.py (#95297)
Summary:

It looks like there is a typo: qrange_len should be 2^32 instead of 2^31, which is what it is currently set to.
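The arithmetic behind the fix, for reference:

```
# For a 32-bit range, quant_min = -2**31 and quant_max = 2**31 - 1, so
# qrange_len = quant_max - quant_min + 1 = 2**32 (not 2**31).
quant_min, quant_max = -(2**31), 2**31 - 1
assert quant_max - quant_min + 1 == 2**32
```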

Test Plan:
```
python test/test_quantization.py TestObserver.test_per_tensor_observers

```

Tasks: https://github.com/pytorch/pytorch/issues/95295
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95297
Approved by: https://github.com/vkuzo
2023-02-23 20:23:45 +00:00
Jacob Szwejbka
2628901033 [Executorch][Quant] Add Choose_qparams_symmetric (#94685)
Summary: needed for symmetric dynamic quant flow

Test Plan: todo

Reviewed By: jerryzh168

Differential Revision: D43134117

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94685
Approved by: https://github.com/larryliu0820
2023-02-13 07:27:48 +00:00
Jacob Szwejbka
bb48d90b00 [Executorch][Quant][BE] Refactor Choose_Qparams (#94338)
Summary: Refactor so that it can be decomposed

Test Plan: ci

Differential Revision: D42681268

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94338
Approved by: https://github.com/jerryzh168
2023-02-09 01:20:17 +00:00
Nikita Shulga
c0dd9b3b67 Revert "[Executorch][Quantization][BE] Refactor Choose Qparams (#92592)"
This reverts commit 59071ab1e7.

It breaks `quantization.jit.test_ondevice_quantization.TestOnDeviceDynamicPTQFinalize`, which is not run in OSS, but is mandatory for internal CI.
2023-01-23 09:13:02 -08:00
Jacob Szwejbka
59071ab1e7 [Executorch][Quantization][BE] Refactor Choose Qparams (#92592)
Summary: Should hopefully be a little faster. Definitely cleaner to not create an observer inside the op

Test Plan: ci

Differential Revision: D42154677

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92592
Approved by: https://github.com/jerryzh168
2023-01-20 01:36:47 +00:00
HDCharles
1ca9d43d4e [ao] quantize.py fixing public v private (#87521)
Summary: made _register_activation_post_process_hook, _add_observer,
_get_unique_devices_, _get_observer_dict private

Test Plan: python test/test_public_bindings.py

Differential Revision: [D40709277](https://our.internmc.facebook.com/intern/diff/D40709277)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87521
Approved by: https://github.com/jerryzh168
2022-12-14 22:50:39 +00:00
Vasiliy Kuznetsov
22a1b5e243 quantization: deprecate observer compute_dtype and replace with is_dynamic (#85431)
Summary:

This PR deprecates the `compute_dtype` field on observers, and replaces
it with the `is_dynamic` field on observers.  This is better aligned
with the reference model spec.
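A hypothetical illustration of configuring a dynamic activation observer via `is_dynamic` rather than `compute_dtype` (parameter availability depends on the PyTorch version):

```
import torch
from torch.ao.quantization.observer import PlaceholderObserver

# Assumed usage: mark the activation observer as dynamic instead of relying
# on the deprecated compute_dtype field.
dynamic_act_observer_ctr = PlaceholderObserver.with_args(
    dtype=torch.quint8, quant_min=0, quant_max=255, is_dynamic=True
)
dynamic_act_observer = dynamic_act_observer_ctr()
```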

Test plan:

```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85431
Approved by: https://github.com/jerryzh168
2022-11-24 07:07:34 +00:00
andrewor14
19e66fcec2 [Quant] Allow setting fixed qparams for inner LSTM ops (#88456)
Summary: In both eager and FX graph mode quantization,
`torch.ao.nn.quantizable.LSTM` is used as an observed custom module,
which is responsible for inserting its own observers. By default,
the user specifies a single QConfig for the custom module (either
through QConfigMapping or by setting the "qconfig" attribute),
and all inner ops will [inherit this
QConfig](dc00bb51b8/torch/ao/nn/quantizable/modules/rnn.py (L366-L378))
and use the same observer/fake_quantize constructors.

Today, users who wish to override this behavior must extend
`torch.ao.nn.quantizable.LSTM` and write a lot of custom code
to manually assign the QConfigs to the inner ops. This commit
alleviates this burden on the user by providing a helper function
to assign QConfigs with custom observers. An example use case of
this is providing a reference implementation for a backend kernel
that hardcodes qparams for efficiency.

Example usage:
```
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.observer import FixedQParamsObserver
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx
from torch.ao.quantization.fx.custom_config import (
    PrepareCustomConfig,
    ConvertCustomConfig,
)

class MyModel(torch.nn.Module):
    ...

class UserLSTM(torch.ao.nn.quantizable.LSTM):
    @classmethod
    def from_float(cls, other):
        assert isinstance(other, cls._FLOAT_MODULE)
        linear_output_obs_ctr = FixedQParamsObserver.with_args(
            scale=2 ** -11, zero_point=2 ** 15, dtype=torch.qint32)
        sigmoid_obs_ctr = FixedQParamsObserver.with_args(
            scale=2 ** -16, zero_point=0, dtype=torch.qint32)
        tanh_obs_ctr = FixedQParamsObserver.with_args(
            scale=2 ** -15, zero_point=2 ** 15, dtype=torch.qint32)
        cell_state_obs_ctr = FixedQParamsObserver.with_args(
            scale=2 ** -11, zero_point=0, dtype=torch.qint32)
        hidden_state_obs_ctr = FixedQParamsObserver.with_args(
            scale=2 ** -7, zero_point=2 ** 7, dtype=torch.quint8)
        return torch.ao.quantization.utils._get_lstm_with_individually_observed_parts(
            float_lstm=other,
            linear_output_obs_ctr=linear_output_obs_ctr,
            sigmoid_obs_ctr=sigmoid_obs_ctr,
            tanh_obs_ctr=tanh_obs_ctr,
            cell_state_obs_ctr=cell_state_obs_ctr,
            hidden_state_obs_ctr=hidden_state_obs_ctr,
        )

qconfig_mapping = get_default_qconfig_mapping()
example_inputs = (torch.rand(5, 3, 50), torch.rand(1, 3, 50), torch.randn(1, 3, 50))
prepare_custom_config = PrepareCustomConfig() \
    .set_float_to_observed_mapping(torch.nn.LSTM, UserLSTM)
convert_custom_config = ConvertCustomConfig() \
    .set_observed_to_quantized_mapping(UserLSTM, torch.ao.nn.quantized.LSTM)
model = MyModel()
model = prepare_fx(model, qconfig_mapping, example_inputs, prepare_custom_config=prepare_custom_config)
model(*example_inputs)  # calibrate
model = convert_fx(model, convert_custom_config=convert_custom_config)
model(*example_inputs)
```

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_static_lstm_with_custom_fixed_qparams

Reviewers: jerryzh168, vkuzo

Subscribers: jerryzh168, vkuzo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88456
Approved by: https://github.com/jerryzh168, https://github.com/vkuzo
2022-11-18 16:27:12 +00:00
Jacob Szwejbka
7f55db4fb0 add quantize_decomposed_dynamic to op lib (#88855)
Summary: Needed for dynamic quant reference pattern graphs.

Test Plan: added unittest

Differential Revision: D41205030

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88855
Approved by: https://github.com/jerryzh168
2022-11-16 16:59:36 +00:00
Jerry Zhang
0e3b5ea026 [quant][fx] Add _convert_to_reference_decomposed (#87094)
Summary:
_convert_to_reference_decomposed is a private convert function in the FX graph mode quantization flow that converts
a calibrated/trained model to a reference quantized model with decomposed quantized tensor representations.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test__convert_to_reference_decomposed_fx

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87094
Approved by: https://github.com/andrewor14
2022-10-27 01:22:08 +00:00
HDCharles
25476f2e4b [ao] fixing public v private for quantization_types (#86031)
Summary: the main problem here was that the various objects defined simply as 'Any' should theoretically be public, but making them public either A) results in an error about the module being 'typing' rather than whatever module it should be, or B) requires setting the module manually, thereby changing the module for the original 'Any' class.

Note: QuantizeHandler has a similar issue where it's simply defined as 'Any'.

Pattern was defined in multiple places, which was causing issues, so I just moved it to a single place, given the note at the top of quantization_types.py indicating these definitions should be moved to utils at some point anyway.

Finally, I changed any references to these objects to point at the correct locations. Note: I didn't see any fb internal references to NodePattern or QuantizerCls that would cause issues.

Test Plan: python test/test_public_bindings.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86031
Approved by: https://github.com/jerryzh168
2022-10-12 20:06:30 +00:00
Jiaxu Zhu
bc919ac796 [torch.ao.quantization] include torch.qint32 for static quant (#86345)
Summary: include `torch.qint32` in `activation_is_statically_quantized` and `get_quant_type` so that fake_quantize with `dtype=torch.qint32` won't be skipped

Test Plan: updated `test_custom_module_class`

Differential Revision: D40128178

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86345
Approved by: https://github.com/jerryzh168
2022-10-06 20:05:56 +00:00
Vasiliy Kuznetsov
09965957cd quantization: align observer dtype with reference model spec (#85345)
Summary:

Before this PR, the `dtype` attribute of observers was not clearly
defined.  It originally meant `interface_dtype` in the eager mode
workflow, which is how the codebase was using it before this PR.

In the new reference model spec, `dtype` attribute of an observer
represents the `dtype` value which needs to be passed into a `quantize`
function in the reference model spec. This PR aligns the codebase
to this definition of dtype.  In detail:
1. change util functions to interpret `dtype` using the reference model definition
2. change `prepare` to interpret `dtype` using the reference model definition
3. change observers for dynamic quantization to interpret `dtype` using the reference
   model definition.

A future PR (left out of this one to keep LOC small) will deprecate the
`compute_dtype` field and instead expose `is_dynamic` on observers.

Test plan:

```
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
```

Differential Revision: [D39675209](https://our.internmc.facebook.com/intern/diff/D39675209)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85345
Approved by: https://github.com/z-a-f, https://github.com/jerryzh168
2022-09-21 06:34:26 +00:00