Summary:
Even with the changes in D55347133, it is still possible to OOM in the histogram observer, because the size of the allocated tensor also depends on *downsample_rate*.
For example, I still see an OOM from an attempt to allocate a 10GB+ histogram tensor in a multi-task model.
To address the OOM issue more robustly, we wrap the allocation in a *try-catch* clause.
Empirically, we cap the size of a single histogram tensor at 1 GB.
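A minimal sketch of the guard described above, assuming a hypothetical `safe_histc` helper and the empirical 1 GB cap (the actual change lives in the histogram observer):
```
import torch

MAX_HISTOGRAM_BYTES = 2**30  # empirical cap: 1 GB for a single histogram tensor

def safe_histc(values: torch.Tensor, bins: int):
    # Skip observation if the histogram tensor would exceed the cap.
    if bins * values.element_size() > MAX_HISTOGRAM_BYTES:
        return None
    try:
        return torch.histc(values, bins=bins)
    except RuntimeError as e:
        # Allocation failures surface as RuntimeError; skip instead of crashing.
        if "out of memory" in str(e).lower():
            return None
        raise
```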
Test Plan: Test the change for Multi-Task model (depth + segmentation)
Differential Revision: D55567292
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123309
Approved by: https://github.com/jerryzh168
Summary: We probably don't need
`torch._C._AutoDispatchBelowAutograd()`, whose purpose is to prevent
infinite recursion if the implementation calls itself. Let's
remove it and see if anything breaks. The other major change
is registering the op to the more general Autograd dispatch
key so it can be used on CUDA as well.
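For illustration, a hedged, self-contained sketch of the registration pattern using a scratch namespace (`demo_quant` and `fake_quant` are hypothetical; the real op lives in `quantized_decomposed`):
```
import torch

lib = torch.library.Library("demo_quant", "DEF")
lib.define("fake_quant(Tensor x, float scale) -> Tensor")

def fake_quant_impl(x, scale):
    # The impl calls other ops, never itself, so no recursion guard
    # (_AutoDispatchBelowAutograd) is needed.
    return torch.round(x / scale) * scale

# "Autograd" is an alias key covering all backends, unlike e.g. "AutogradCPU",
# so the same registration serves CPU and CUDA tensors.
lib.impl("fake_quant", fake_quant_impl, "Autograd")

y = torch.ops.demo_quant.fake_quant(torch.randn(4), 0.1)
```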
Test Plan:
python test/inductor/test_cpu_repro.py -k test_decomposed_fake_quant_per_channel
Reviewers: zou3519, bdhirsh
Subscribers: zou3519, bdhirsh, jerryzh168, leslie-fang-intel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123186
Approved by: https://github.com/zou3519, https://github.com/leslie-fang-intel
Summary: When we migrate to torch.export, we won't put L['self'] as the prefix for all the FQNs in nn_module_stack. This diff adds a branch to handle the new case.
Test Plan: buck test mode/opt caffe2/test/quantization:test_quantization -- -r set_module_name
Differential Revision: D55436617
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122819
Approved by: https://github.com/tugsbayasgalan
Summary:
Right now we don't insert additional observers (i.e., we share observers) if qspec.dtype and qspec.is_dynamic match exactly.
Since fixed qparams quantization spec and derived quantization spec do not have an is_dynamic field currently, observer sharing does not happen between them and other quantization specs. In this PR we fix the issue by
adding is_dynamic to all quantization specs.
Note: SharedQuantizationSpec should probably be its own type in the future
TODO later:
(1) move all these fields (dtype, is_dynamic, quant_min, quant_max etc.) to QuantizationSpecBase,
(2) make SharedQuantizationSpec a separate type
(3) add quant_min/quant_max to the observer sharing checks in pt2e/prepare.py
Test Plan:
python test/test_quantization.py -k test_fixed_qparams_qspec_observer_dedup
Differential Revision: [D55396546](https://our.internmc.facebook.com/intern/diff/D55396546)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122734
Approved by: https://github.com/andrewor14
Summary:
Also added some utils in xnnpack_quantizer_utils.py:
* annotate_conv_transpose_bn_relu and annotate_conv_transpose_bn -> these are for QAT
* annotate_conv_transpose_relu
conv_transpose + bn weights fusion is performed automatically and cannot currently be disabled;
we can add support for disabling this fusion later if needed.
Test Plan:
python test/test_quantization.py -k test_conv_transpose_bn_fusion
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122046
Approved by: https://github.com/andrewor14
Summary:
X-link: https://github.com/pytorch/executorch/pull/2308
Note: The initial purpose of this PR is to draw suggestion and feedback regarding better alternative, if any.
At present, the dequantize op for the decomposed quantized Tensor representation, e.g. dequantize_per_tensor(), assumes the output dtype is torch.float and hence does not have the output dtype in its operator argument list. However, this op signature becomes unusable when the assumption breaks: if the output dtype is different from torch.float, there is no way to specify it during dequantization.
This change is aimed at generalizing the signature of dequantize ops like dequantize_per_tensor() for wider use-cases where the output dtype can be different from torch.float and needs to be passed during dequantization. The proposal is to use an additional argument named 'output_dtype' to solve the problem. However, we would also welcome suggestions and feedback regarding any better alternative that could be used instead.
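As a minimal reference sketch of the proposed semantics (pure Python, not the actual kernel; the `output_dtype` name follows the proposal above):
```
import torch

def dequantize_per_tensor_ref(qx, scale, zero_point, output_dtype=torch.float):
    # Dequantize ((q - zero_point) * scale) and cast to the requested dtype,
    # instead of always returning torch.float.
    return ((qx.to(torch.float) - zero_point) * scale).to(output_dtype)

qx = torch.randint(-128, 127, (4,), dtype=torch.int8)
x_bf16 = dequantize_per_tensor_ref(qx, scale=0.1, zero_point=0,
                                   output_dtype=torch.bfloat16)
```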
cc jerryzh168 jianyuh raghuramank100 jamesr66a vkuzo jgong5 Xia-Weiwen leslie-fang-intel
Reviewed By: digantdesai
Differential Revision: D53590486
Pulled By: manuelcandales
Co-authored-by: kausik <kmaiti@habana.ai>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121450
Approved by: https://github.com/jerryzh168
**Summary**
Add the `quantized_decomposed.fake_quant_per_channel` operator and test this op's forward and backward by comparing against ATen.
**Test Plan**
```
python -u -m pytest -s -v test_cpu_repro.py -k test_decomposed_fake_quant_per_channel
```
**Next Step**
Optimize the performance: judging from the generated code for the forward and backward graphs, the code is not vectorized.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121297
Approved by: https://github.com/jerryzh168, https://github.com/jgong5
Summary: Today we don't allow free functions to be the tracing callable in torch.export. As part of migrating capture_pre_autograd_graph usages to torch.export, we need to ban free functions in capture_pre_autograd_graph as well.
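For illustration, the distinction being enforced (a hedged sketch; the capture/export call itself is elided):
```
import torch

def free_fn(x):  # a free function: no longer accepted as the tracing callable
    return x + 1

class M(torch.nn.Module):  # an nn.Module: the supported entry point
    def forward(self, x):
        return x + 1
```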
Test Plan: CI
Differential Revision: D54319597
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120817
Approved by: https://github.com/zhxchen17, https://github.com/andrewor14
Summary: This commit adds the `model_is_exported` util function
for users to be able to easily tell what APIs to call to move
their models between train and eval modes. This has the
additional advantage of hiding the implementation of how we
detect a model is exported, in case the metadata format changes
in the future.
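A hedged usage sketch (import paths assumed; the test below is authoritative):
```
from torch.ao.quantization import move_exported_model_to_eval  # path assumed
from torch.ao.quantization.pt2e.export_utils import model_is_exported  # path assumed

def set_to_eval(m):
    # Exported models can't use m.eval() directly; route to the PT2E helper.
    if model_is_exported(m):
        return move_exported_model_to_eval(m)
    m.eval()
    return m
```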
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_model_is_exported
Differential Revision: [D53812972](https://our.internmc.facebook.com/intern/diff/D53812972)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119726
Approved by: https://github.com/tugsbayasgalan, https://github.com/albanD
Summary:
There was a bug in the module name filter for modules that already had an underscore
in their name, because the underscore was replaced with "dot" notation.
This is because underscores were assumed to always mark a module separator,
but this isn't the case for modules whose names contain an underscore.
Test Plan:
Added a unit test. Before this change, that test failed (due to applying the wrong
qscheme). Now it passes.
Differential Revision: D53502771
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119344
Approved by: https://github.com/jerryzh168
Summary:
This is a follow up to https://github.com/pytorch/pytorch/pull/118605 to remove `fold_quantize` flag from
`convert_pt2e`
Test Plan: CI
Differential Revision: D53247301
BC Breaking Note:
the `fold_quantize` flag of `convert_pt2e` is now set to True by default, and we'll fold the quantize op into the weight by default, so users will see a model size reduction by default after PT2E quantization.
2.2
```
folded_model = convert_pt2e(model, fold_quantize=True)
non_folded_model = convert_pt2e(model)
```
2.3
```
folded_model = convert_pt2e(model)
non_folded_model = convert_pt2e(model, fold_quantize=False)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118701
Approved by: https://github.com/andrewor14, https://github.com/leslie-fang-intel
Summary: This commit adds a util for PT2E quantization users
to call `model.train()` and `model.eval()` without error.
Instead, these will automatically call the equivalent
`move_exported_model_to_train/eval` for the user, which only
switches behavior for special ops like dropout and batchnorm.
This enables users to onboard to the PT2E flow more easily.
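A hedged sketch of the intended usage (import path assumed):
```
import torch
from torch.ao.quantization import allow_exported_model_train_eval  # path assumed

def enable_train_eval(exported_model: torch.fx.GraphModule):
    # After this call, .train()/.eval() stop erroring on the exported model
    # and instead dispatch to move_exported_model_to_train/eval.
    allow_exported_model_train_eval(exported_model)
    exported_model.eval()   # e.g. dropout disabled, batchnorm in inference mode
    exported_model.train()  # e.g. dropout re-enabled
    return exported_model
```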
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_allow_exported_model_train_eval
Reviewers: jerryzh168, tugsbayasgalan, zhxchen17
Subscribers: jerryzh168, tugsbayasgalan, zhxchen17, supriyar
Differential Revision: [D53426636](https://our.internmc.facebook.com/intern/diff/D53426636)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119091
Approved by: https://github.com/jerryzh168, https://github.com/tugsbayasgalan, https://github.com/zhxchen17
Summary: This is the equivalent API to `model.train()` for
exported models, analogous to `move_exported_model_to_eval`.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_move_exported_model_dropout
python test/test_quantization.py TestQuantizePT2E.test_move_exported_model_dropout_inplace
python test/test_quantization.py TestQuantizePT2E.test_move_exported_model_dropout_bn
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113492
Approved by: https://github.com/jerryzh168, https://github.com/tugsbayasgalan
Summary:
When `version` is missing in the metadata, use `min_val/max_val` as keys instead of `max_vals/min_vals`
## Reasons
1. It's been almost 2 years since the change in D30003700, which means most checkpoints are now using the `max_val/min_val` keys
2. Most checkpoint dumps produced via `model.state_dict()` don't have version info, which leads to a spurious `missing keys` error when loading the state_dict
Test Plan: CI
Differential Revision: D53233012
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118659
Approved by: https://github.com/jerryzh168
Summary:
Previously, by default, we didn't generate quantized weights; that is, we'd have fp32 weights, with
`fp32 weight -> q -> dq -> linear -> ...` in the quantized model.
After this PR, we'll produce a graph with int8 weight by default after convert_pt2e:
`int8 weight -> dq -> linear -> ...`
We'll remove the fold_quantize flag in the next PR
Test Plan: CI
Differential Revision: D51730862
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118605
Approved by: https://github.com/andrewor14
Fixes https://github.com/pytorch/pytorch/issues/118129
Suppressions automatically added with
```
import re

with open("error_file.txt", "r") as f:
    errors = f.readlines()

error_lines = {}
for error in errors:
    match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error)
    if match:
        file_path, line_number, error_type = match.groups()
        if file_path not in error_lines:
            error_lines[file_path] = {}
        error_lines[file_path][int(line_number)] = error_type

for file_path, lines in error_lines.items():
    with open(file_path, "r") as f:
        code = f.readlines()
    for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True):
        code[line_number - 1] = code[line_number - 1].rstrip() + f" # type: ignore[{error_type}]\n"
    with open(file_path, "w") as f:
        f.writelines(code)
```
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Co-authored-by: Catherine Lee <csl@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118533
Approved by: https://github.com/Skylion007, https://github.com/zou3519
This is a lot of files changed! Don't panic! Here's how it works:
* Previously, we set `follow_imports = silent` for our mypy.ini configuration. Per https://mypy.readthedocs.io/en/stable/running_mypy.html#follow-imports, what this does is whenever we have an import to a module which is not listed as a file to be typechecked in mypy, we typecheck it as normal but suppress all errors that occurred in that file.
* When mypy is run inside lintrunner, the list of files is precisely the files covered by the glob in lintrunner.toml, but with files in excludes excluded.
* The top-level directive `# mypy: ignore-errors` instructs mypy to typecheck the file as normal, but ignore all errors.
* Therefore, it should be equivalent to set `follow_imports = normal`, if we put `# mypy: ignore-errors` on all files that were previously excluded from the file list.
* Having done this, we can remove the exclude list from .lintrunner.toml, since excluding a file from typechecking is baked into the files themselves.
* torch/_dynamo and torch/_inductor were previously in the exclude list, because they were covered by MYPYINDUCTOR. It is not OK to mark these as `# mypy: ignore-errors` as this will impede typechecking on the alternate configuration. So they are temporarily being checked twice, but I am suppressing the errors in these files as the configurations are not quite the same. I plan to unify the configurations so this is only a temporary state.
* There were some straggler type errors after these changes somehow, so I fixed them as needed. There weren't that many.
In the future, to start type checking a file, just remove the ignore-errors directive from the top of the file.
The codemod was done with this script authored by GPT-4:
```
import glob

exclude_patterns = [
    ...
]

for pattern in exclude_patterns:
    for filepath in glob.glob(pattern, recursive=True):
        if filepath.endswith('.py'):
            with open(filepath, 'r+') as f:
                content = f.read()
                f.seek(0, 0)
                f.write('# mypy: ignore-errors\n\n' + content)
```
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118414
Approved by: https://github.com/thiagocrepaldi, https://github.com/albanD
**Description**
Add dynamic quantization config for x86 inductor backend.
To support the QKV structure in self-attention, we removed an assertion in port-metadata-pass that requires single dequantize node after quantize node.
**Test plan**
```
python test/test_quantization.py -k TestQuantizePT2EX86Inductor.test_dynamic_quant_linear
python test/test_quantization.py -k TestQuantizePT2EX86Inductor.test_qat_dynamic_quant_linear
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115337
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
Updates flake8 to v6.1.0 and fixes a few lints using sed and some ruff tooling.
- Replace `assert(0)` with `raise AssertionError()`
- Remove extraneous parentheses, i.e.
- `assert(a == b)` -> `assert a == b`
- `if(x > y or y < z):`->`if x > y or y < z:`
- And `return('...')` -> `return '...'`
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116591
Approved by: https://github.com/albanD, https://github.com/malfet
Summary:
We introduced `node.meta["numeric_debug_handle"]` in https://github.com/pytorch/pytorch/pull/114315 to
indicate the numeric debug handle for values in the graph. In this PR we support preserving this field
in prepare and convert so that we can use it for numerical debugging.
Next: we also want to preserve this field in deepcopy of GraphModule.
Test Plan:
python test/test_quantization.py -k test_quantize_pt2e_preserve_handle
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116477
Approved by: https://github.com/tugsbayasgalan
Summary:
`_fold_conv_bn_qat` is currently taking a long time, so we skip it when it's not necessary;
we can have follow-up fixes to actually reduce the patterns or cache them if possible.
Test Plan:
uncomment the print in `test_speed`, run
python test/test_quantization.py -k test_speed
and make sure the convert time is low, e.g. 0.1s instead of 8-9 seconds
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116440
Approved by: https://github.com/andrewor14
Summary: This PR does 2 things:
1) Previously this would simply error; now it will ignore any
torch.inf values that it receives. Note: the code checks for torch.inf after
aminmax, so that if no torch.inf values are found, the perf is
relatively unchanged.
2) As mentioned in https://github.com/pytorch/pytorch/issues/100051,
values close to (but not quite at) the maximum/minimum float value could
overflow to infinity in the course of _adjust_min_max() (when such a large
value is multiplied by something in the middle of a calculation
that would otherwise result in a non-inf value). This was fixed by
rearranging the order of operations for the lines in question without
altering the actual equations. Specifically, where the operations on lines
1095, 1098 and 1100 multiply and divide large values,
it is better to divide the two large values before multiplying, rather
than multiplying the two large values together (creating overflow) before dividing, as the code previously did.
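A small illustration of why the reordering matters (values are illustrative, not the actual qparam math):
```
import torch

big = torch.tensor(torch.finfo(torch.float32).max / 2)
factor = torch.tensor(3.0)

print(big * factor / big)  # multiply first: overflows to inf
print(big / big * factor)  # divide first: stays finite, prints 3.0
```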
Test Plan:
python test/test_quantization.py TestObserver.test_histogram_observer_ignore_infinity
python test/test_quantization.py TestObserver.test_histogram_observer_handle_close_to_infinity
Differential Revision: [D51489345](https://our.internmc.facebook.com/intern/diff/D51489345)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103467
Approved by: https://github.com/andrewor14
Summary:
This is to allow easier extension of the quant workflow in the future, as we are seeing more
diverse ways of doing quantization.
Putting this up for feedback first.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_observer_callback
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115001
Approved by: https://github.com/kimishpatel
Summary:
The FX graph mode quant workflow and the pt2e flow rely on the `is_dynamic` flag in the observer/QuantizationSpec to
convert an observer to the dynamic quantization pattern (choose_qparams -> q -> dq). This PR adds an is_dynamic flag
to all observers so that it's possible to convert these observers to the pattern.
However, this dynamic quantization pattern (choose_qparams -> q -> dq) is actually only valid for MovingAverageObserver(averaging_constant=1),
so that the computation before and after convert matches in the context of QAT. So we'll have sanity
checks in the other observers to make sure is_dynamic is False.
Test Plan:
python test/test_quantization.py TestXNNPACKQuantizer.test_qat_dynamic_linear
Differential Revision: [D51124725](https://our.internmc.facebook.com/intern/diff/D51124725)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113288
Approved by: https://github.com/kimishpatel
Summary:
As titled; this is because the histogram observer does not work for a corner case in mobilebert (observing a scalar tensor of the float32 max value),
because the histc operator errors out when the value is larger than a certain number.
Test Plan:
python test/test_quantization.py -k test_mul_float32_max
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113405
Approved by: https://github.com/mcr229
Summary:
This is a util for numeric suite in pt2 export so that we can build
a more streamlined UX for numerical debugging in quant + executorch stack
Test Plan:
python test/test_quantization.py TestGenerateNumericDebugHandle
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114315
Approved by: https://github.com/zhxchen17
Summary:
The current order of implicit sharing breaks common annotation patterns of SharedQuantizationSpec, so we changed the order here.
But it's not going to work in all possible annotation cases, so quantizer implementors still need to be careful.
In general, if people only refer to nodes/edges that come before the current node/edge in SharedQuantizationSpec, it should work.
Test Plan: CI; make sure this fixes some internal tests
Differential Revision: D51605918
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114704
Approved by: https://github.com/andrewor14
It appears that `mypy` is now checking a few more previously-unchecked files; these files
are being found via import-following. Not sure exactly why they weren't being checked before.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114160
Approved by: https://github.com/eellison
ghstack dependencies: #114162
**Summary**
To annotate a conv-binary pattern, we should skip the pattern if the conv node has more than one user.
**Test Plan**
```
python -m pytest test_x86inductor_quantizer.py -k test_conv2d_binary2
python -m pytest test_x86inductor_quantizer.py -k test_qat_conv2d_binary2
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114540
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
Summary: Previously the PT2 QAT code only supported conv2d-bn.
This commit extends all existing QAT fusion support to conv1d-bn,
including support for all variants like relu, no bias, literal
args, cuda etc. This commit also refactors the code such that
we can support conv3d-bn easily in the future.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT_ConvBn1d
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D51428979](https://our.internmc.facebook.com/intern/diff/D51428979)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113714
Approved by: https://github.com/jerryzh168
Summary:
Previously this logic was scattered across two places: before inserting observers and during observer insertion;
this PR moves everything to before we insert observers.
* Next: refactor QuantizationSpec and check more fields for sharing
Test Plan:
CI (regression tests)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113458
Approved by: https://github.com/kimishpatel
Fixes #112988
For files
__init__.py
_correct_bias.py
_equalize.py
_learnable_fake_quantize.py
backend_config
experimental
fake_quantize.py
fuse_modules.py
fuser_method_mappings.py
Correct the following
__init__.py:1 at module level:
D104: Missing docstring in public package
__init__.py:144 in public function `default_eval_fn`:
D205: 1 blank line required between summary line and description (found 0)
__init__.py:144 in public function `default_eval_fn`:
D400: First line should end with a period (not 'f')
__init__.py:144 in public function `default_eval_fn`:
D401: First line should be in imperative mood; try rephrasing (found 'Default')
__init__.py:152 in private class `_DerivedObserverOrFakeQuantize`:
D204: 1 blank line required after class docstring (found 0)
__init__.py:152 in private class `_DerivedObserverOrFakeQuantize`:
D205: 1 blank line required between summary line and description (found 0)
__init__.py:152 in private class `_DerivedObserverOrFakeQuantize`:
D210: No whitespaces allowed surrounding docstring text
__init__.py:152 in private class `_DerivedObserverOrFakeQuantize`:
D400: First line should end with a period (not 's')
_correct_bias.py:20 in public function `get_module`:
D200: One-line docstring should fit on one line with quotes (found 2)
_correct_bias.py:20 in public function `get_module`:
D210: No whitespaces allowed surrounding docstring text
_correct_bias.py:20 in public function `get_module`:
D300: Use """triple double quotes""" (found '''-quotes)
_correct_bias.py:20 in public function `get_module`:
D400: First line should end with a period (not 'l')
_correct_bias.py:25 in public function `parent_child_names`:
D200: One-line docstring should fit on one line with quotes (found 2)
_correct_bias.py:25 in public function `parent_child_names`:
D300: Use """triple double quotes""" (found '''-quotes)
_correct_bias.py:25 in public function `parent_child_names`:
D400: First line should end with a period (not 'e')
_correct_bias.py:25 in public function `parent_child_names`:
D401: First line should be in imperative mood (perhaps 'Split', not 'Splits')
_correct_bias.py:34 in public function `get_param`:
D205: 1 blank line required between summary line and description (found 0)
_correct_bias.py:34 in public function `get_param`:
D210: No whitespaces allowed surrounding docstring text
_correct_bias.py:34 in public function `get_param`:
D300: Use """triple double quotes""" (found '''-quotes)
_correct_bias.py:34 in public function `get_param`:
D400: First line should end with a period (not 's')
_correct_bias.py:44 in public class `MeanShadowLogger`:
D204: 1 blank line required after class docstring (found 0)
_correct_bias.py:44 in public class `MeanShadowLogger`:
D205: 1 blank line required between summary line and description (found 0)
_correct_bias.py:44 in public class `MeanShadowLogger`:
D400: First line should end with a period (not 'n')
_correct_bias.py:47 in public method `__init__`:
D107: Missing docstring in __init__
_correct_bias.py:56 in public method `forward`:
D205: 1 blank line required between summary line and description (found 0)
_correct_bias.py:56 in public method `forward`:
D210: No whitespaces allowed surrounding docstring text
_correct_bias.py:56 in public method `forward`:
D300: Use """triple double quotes""" (found '''-quotes)
_correct_bias.py:56 in public method `forward`:
D401: First line should be in imperative mood; try rephrasing (found 'The')
_correct_bias.py:77 in public method `clear`:
D102: Missing docstring in public method
_correct_bias.py:85 in public function `bias_correction`:
D205: 1 blank line required between summary line and description (found 0)
_correct_bias.py:85 in public function `bias_correction`:
D210: No whitespaces allowed surrounding docstring text
_correct_bias.py:85 in public function `bias_correction`:
D300: Use """triple double quotes""" (found '''-quotes)
_correct_bias.py:85 in public function `bias_correction`:
D400: First line should end with a period (not 's')
_correct_bias.py:85 in public function `bias_correction`:
D401: First line should be in imperative mood (perhaps 'Use', not 'Using')
_equalize.py:22 in public function `set_module_weight`:
D103: Missing docstring in public function
_equalize.py:28 in public function `set_module_bias`:
D103: Missing docstring in public function
_equalize.py:34 in public function `get_module_weight`:
D103: Missing docstring in public function
_equalize.py:40 in public function `get_module_bias`:
D103: Missing docstring in public function
_equalize.py:47 in public function `max_over_ndim`:
D200: One-line docstring should fit on one line with quotes (found 2)
_equalize.py:47 in public function `max_over_ndim`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:47 in public function `max_over_ndim`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:47 in public function `max_over_ndim`:
D400: First line should end with a period (not 's')
_equalize.py:47 in public function `max_over_ndim`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
_equalize.py:55 in public function `min_over_ndim`:
D200: One-line docstring should fit on one line with quotes (found 2)
_equalize.py:55 in public function `min_over_ndim`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:55 in public function `min_over_ndim`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:55 in public function `min_over_ndim`:
D400: First line should end with a period (not 's')
_equalize.py:55 in public function `min_over_ndim`:
D401: First line should be in imperative mood (perhaps 'Apply', not 'Applies')
_equalize.py:63 in public function `channel_range`:
D200: One-line docstring should fit on one line with quotes (found 2)
_equalize.py:63 in public function `channel_range`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:63 in public function `channel_range`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:63 in public function `channel_range`:
D400: First line should end with a period (not 'l')
_equalize.py:63 in public function `channel_range`:
D401: First line should be in imperative mood (perhaps 'Find', not 'finds')
_equalize.py:63 in public function `channel_range`:
D403: First word of the first line should be properly capitalized ('Finds', not 'finds')
_equalize.py:76 in public function `cross_layer_equalization`:
D205: 1 blank line required between summary line and description (found 0)
_equalize.py:76 in public function `cross_layer_equalization`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:76 in public function `cross_layer_equalization`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:76 in public function `cross_layer_equalization`:
D400: First line should end with a period (not 't')
_equalize.py:120 in public function `equalize`:
D205: 1 blank line required between summary line and description (found 0)
_equalize.py:120 in public function `equalize`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:120 in public function `equalize`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:120 in public function `equalize`:
D400: First line should end with a period (not 'l')
_equalize.py:159 in public function `converged`:
D205: 1 blank line required between summary line and description (found 0)
_equalize.py:159 in public function `converged`:
D210: No whitespaces allowed surrounding docstring text
_equalize.py:159 in public function `converged`:
D300: Use """triple double quotes""" (found '''-quotes)
_equalize.py:159 in public function `converged`:
D400: First line should end with a period (not 's')
_equalize.py:159 in public function `converged`:
D401: First line should be in imperative mood (perhaps 'Test', not 'Tests')
_learnable_fake_quantize.py:8 in private class `_LearnableFakeQuantize`:
D204: 1 blank line required after class docstring (found 0)
_learnable_fake_quantize.py:8 in private class `_LearnableFakeQuantize`:
D205: 1 blank line required between summary line and description (found 0)
_learnable_fake_quantize.py:8 in private class `_LearnableFakeQuantize`:
D210: No whitespaces allowed surrounding docstring text
_learnable_fake_quantize.py:8 in private class `_LearnableFakeQuantize`:
D400: First line should end with a period (not 'h')
_learnable_fake_quantize.py:68 in private method `enable_param_learning`:
D205: 1 blank line required between summary line and description (found 0)
_learnable_fake_quantize.py:68 in private method `enable_param_learning`:
D400: First line should end with a period (not 'd')
_learnable_fake_quantize.py:68 in private method `enable_param_learning`:
D401: First line should be in imperative mood (perhaps 'Enable', not 'Enables')
_learnable_fake_quantize.py:78 in private method `enable_static_estimate`:
D205: 1 blank line required between summary line and description (found 0)
_learnable_fake_quantize.py:78 in private method `enable_static_estimate`:
D400: First line should end with a period (not 'f')
_learnable_fake_quantize.py:78 in private method `enable_static_estimate`:
D401: First line should be in imperative mood (perhaps 'Enable', not 'Enables')
_learnable_fake_quantize.py:87 in private method `enable_static_observation`:
D205: 1 blank line required between summary line and description (found 0)
_learnable_fake_quantize.py:87 in private method `enable_static_observation`:
D400: First line should end with a period (not 't')
_learnable_fake_quantize.py:87 in private method `enable_static_observation`:
D401: First line should be in imperative mood (perhaps 'Enable', not 'Enables')
fake_quantize.py:1 at module level:
D205: 1 blank line required between summary line and description (found 0)
fake_quantize.py:1 at module level:
D400: First line should end with a period (not 'n')
fake_quantize.py:61 in public class `FakeQuantizeBase`:
D205: 1 blank line required between summary line and description (found 0)
fake_quantize.py:61 in public class `FakeQuantizeBase`:
D210: No whitespaces allowed surrounding docstring text
fake_quantize.py:61 in public class `FakeQuantizeBase`:
D400: First line should end with a period (not 'e')
fake_quantize.py:74 in public method `__init__`:
D107: Missing docstring in __init__
fake_quantize.py:83 in public method `forward`:
D102: Missing docstring in public method
fake_quantize.py:87 in public method `calculate_qparams`:
D102: Missing docstring in public method
fake_quantize.py:91 in public method `enable_fake_quant`:
D102: Missing docstring in public method
fake_quantize.py:95 in public method `disable_fake_quant`:
D102: Missing docstring in public method
fake_quantize.py:99 in public method `enable_observer`:
D102: Missing docstring in public method
fake_quantize.py:103 in public method `disable_observer`:
D102: Missing docstring in public method
fake_quantize.py:107 in public method `with_args`:
D102: Missing docstring in public method
fake_quantize.py:115 in public class `FakeQuantize`:
D205: 1 blank line required between summary line and description (found 0)
fake_quantize.py:115 in public class `FakeQuantize`:
D210: No whitespaces allowed surrounding docstring text
fake_quantize.py:115 in public class `FakeQuantize`:
D412: No blank lines allowed between a section header and its content ('Attributes')
fake_quantize.py:150 in public method `__init__`:
D107: Missing docstring in __init__
fake_quantize.py:188 in public method `calculate_qparams`:
D102: Missing docstring in public method
fake_quantize.py:191 in public method `forward`:
D102: Missing docstring in public method
fake_quantize.py:214 in public method `extra_repr`:
D102: Missing docstring in public method
fake_quantize.py:262 in public class `FixedQParamsFakeQuantize`:
D205: 1 blank line required between summary line and description (found 0)
fake_quantize.py:262 in public class `FixedQParamsFakeQuantize`:
D210: No whitespaces allowed surrounding docstring text
fake_quantize.py:262 in public class `FixedQParamsFakeQuantize`:
D400: First line should end with a period (not 'n')
fake_quantize.py:268 in public method `__init__`:
D107: Missing docstring in __init__
fake_quantize.py:279 in public method `calculate_qparams`:
D102: Missing docstring in public method
fake_quantize.py:283 in public method `extra_repr`:
D102: Missing docstring in public method
fake_quantize.py:292 in public class `FusedMovingAvgObsFakeQuantize`:
D205: 1 blank line required between summary line and description (found 0)
fake_quantize.py:292 in public class `FusedMovingAvgObsFakeQuantize`:
D400: First line should end with a period (not 'e')
fake_quantize.py:307 in public method `__init__`:
D107: Missing docstring in __init__
fake_quantize.py:322 in public method `calculate_qparams`:
D102: Missing docstring in public method
fake_quantize.py:326 in public method `extra_repr`:
D102: Missing docstring in public method
fake_quantize.py:342 in public method `forward`:
D102: Missing docstring in public method
fake_quantize.py:480 in private function `_is_fake_quant_script_module`:
D200: One-line docstring should fit on one line with quotes (found 2)
fake_quantize.py:480 in private function `_is_fake_quant_script_module`:
D210: No whitespaces allowed surrounding docstring text
fake_quantize.py:480 in private function `_is_fake_quant_script_module`:
D300: Use """triple double quotes""" (found '''-quotes)
fake_quantize.py:480 in private function `_is_fake_quant_script_module`:
D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
fake_quantize.py:491 in public function `disable_fake_quant`:
D400: First line should end with a period (not ':')
fake_quantize.py:502 in public function `enable_fake_quant`:
D400: First line should end with a period (not ':')
fake_quantize.py:513 in public function `disable_observer`:
D400: First line should end with a period (not ':')
fake_quantize.py:524 in public function `enable_observer`:
D400: First line should end with a period (not ':')
fuse_modules.py:1 at module level:
D100: Missing docstring in public module
fuse_modules.py:39 in public function `fuse_known_modules`:
D205: 1 blank line required between summary line and description (found 0)
fuse_modules.py:39 in public function `fuse_known_modules`:
D400: First line should end with a period (not 'd')
fuse_modules.py:39 in public function `fuse_known_modules`:
D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
fuse_modules.py:104 in public function `fuse_modules`:
D400: First line should end with a period (not 'e')
fuse_modules.py:167 in public function `fuse_modules_qat`:
D200: One-line docstring should fit on one line with quotes (found 2)
fuse_modules.py:167 in public function `fuse_modules_qat`:
D210: No whitespaces allowed surrounding docstring text
fuse_modules.py:167 in public function `fuse_modules_qat`:
D400: First line should end with a period (not '`')
fuser_method_mappings.py:1 at module level:
D100: Missing docstring in public module
fuser_method_mappings.py:18 in public function `fuse_conv_bn`:
D400: First line should end with a period (not 'e')
fuser_method_mappings.py:55 in public function `fuse_conv_bn_relu`:
D400: First line should end with a period (not 'e')
fuser_method_mappings.py:102 in public function `fuse_linear_bn`:
D400: First line should end with a period (not 'e')
fuser_method_mappings.py:131 in public function `fuse_convtranspose_bn`:
D400: First line should end with a period (not 'e')
fuser_method_mappings.py:154 in private function `_sequential_wrapper2`:
D205: 1 blank line required between summary line and description (found 0)
fuser_method_mappings.py:154 in private function `_sequential_wrapper2`:
D210: No whitespaces allowed surrounding docstring text
fuser_method_mappings.py:154 in private function `_sequential_wrapper2`:
D400: First line should end with a period (not 's')
fuser_method_mappings.py:182 in public function `get_fuser_method`:
D205: 1 blank line required between summary line and description (found 0)
fuser_method_mappings.py:182 in public function `get_fuser_method`:
D210: No whitespaces allowed surrounding docstring text
fuser_method_mappings.py:182 in public function `get_fuser_method`:
D300: Use """triple double quotes""" (found '''-quotes)
fuser_method_mappings.py:182 in public function `get_fuser_method`:
D400: First line should end with a period (not ',')
fuser_method_mappings.py:205 in private function `_get_valid_patterns`:
D205: 1 blank line required between summary line and description (found 0)
fuser_method_mappings.py:205 in private function `_get_valid_patterns`:
D400: First line should end with a period (not ',')
fuser_method_mappings.py:205 in private function `_get_valid_patterns`:
D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
fuser_method_mappings.py:238 in public function `get_fuser_method_new`:
D205: 1 blank line required between summary line and description (found 0)
fuser_method_mappings.py:238 in public function `get_fuser_method_new`:
D210: No whitespaces allowed surrounding docstring text
fuser_method_mappings.py:238 in public function `get_fuser_method_new`:
D400: First line should end with a period (not 'd')
fuser_method_mappings.py:238 in public function `get_fuser_method_new`:
D401: First line should be in imperative mood; try rephrasing (found 'This')
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112992
Approved by: https://github.com/kit1980
Summary: This commit significantly simplifies the QAT fusion
code for the `conv-bn` pattern by removing add and relu nodes
from the match and replacement patterns. This does not reduce
functionality; patterns like `conv-bn-relu`, `conv-bn-add`,
and `conv-bn-add-relu` are still supported. We simply do not
match these extra nodes, since there is actually no need to
replace them.
This has the additional benefit of reducing the number of
patterns being matched by 16x, since for each add and relu
variant of the `conv-bn` pattern there is also an in-place
variant. This also enables more flexible `conv-bn` pattern
matching in the future and keeps the number of patterns
more scalable.
One important change needed in this commit was to remove
the match filter that requires the input and output
activations to be quantized. This was necessary because
otherwise we would always expect q-dq nodes immediately
after the getitem node, instead of after the add or relu
nodes for example. This has another side benefit of
keeping QAT fusion flexible enough to support weight
only quantization.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113006
Approved by: https://github.com/jerryzh168
Summary:
new version of this: https://www.internalfb.com/diff/D49110166?dst_version_fbid=252052334533986
Fix an assign-device error when a module has multiple devices.
If fc_fp16_quantization is enabled for a CPU model,
and module REMOTE_OTHER has multiple devices: {device(type='meta'), device(type='cpu')},
we fail on this assertion at fbcode/caffe2/torch/ao/quantization/fx/utils.py:232:
assert len(devices) <= 1, (
Since CPU models work on CPU devices, added a condition before the assertion:
in case we have CPU in the module's list of devices, set the device to CPU.
Please see debug details:
https://docs.google.com/document/d/1pMPCeJyMPA15NhFc2uAyNDkS9azR40uaNyOP0DIgHjU/edit
Test Plan:
AIMP_DISAGG_CPU=true buck run mode/opt -c python.package_style=inplace -c fbcode.enable_gpu_sections=true lego/scripts:lego_cli -- run-locally --model_entity_id 959168967 --config_version 28 --publish_context OFFLINE_PUBLISH --lego_pipeline aiplatform.modelstore.model_generation.lego.lego_pipeline_builder.gmpp_lego_pipeline --gmpp_config '{"gmpp_pipeline_descriptor": "aiplatform.modelstore.model_generation.v1.ads_pipelines.aimp_pyper_pipeline.model_generation_pipeline", "worker_process_number":12, "worker_thread_per_process_number": 6, "use_work_assignment": true}' 2>&1 | tee /tmp/gmpp_lc.txt
Snapshot:
https://www.internalfb.com/manifold/explorer/ads_storage_fblearner/tree/user/facebook/fblearner/predictor/959168967/47
Differential Revision: D51226114
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113509
Approved by: https://github.com/jerryzh168
Summary:
For a node node1 and an edge (node1, node2): since they are observing the same
Tensor, we may want to implicitly share observers. This flag allows people to
turn off this behavior for the output of the node.
See the test_allow_implicit_sharing test for use case
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_allow_implicit_sharing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112929
Approved by: https://github.com/kimishpatel
Summary: Previously we only copied over q/dq args for the per
tensor case. This was because the qparams for `quantize_per_tensor`
are literals while the qparams for `quantize_per_channel` are
`get_attr` nodes (tensors), which disappear from the original
nodes in the graph after subgraph rewriting.
However, this is problematic because, in the per channel case,
not all q/dq args are tensors. In particular, the args after
the qparams (axis, qmin, qmax, dtype) are all literals. For
these literal args we simply used the hardcoded ones
(0, -127, 127, torch.int8 respectively), even if the user
explicitly specified to use a different weight dtype. This
commit fixes this by copying over these literal args for the
per channel case as well.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_per_channel_weight_custom_dtype
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112612
Approved by: https://github.com/jerryzh168
Summary: Previously QAT fusion assumes bias is not quantized.
This works for the existing XNNPACKQuantizer, but not for custom
quantizers that wish to quantize the bias. This commit supports
this by adding the necessary patterns. This requires refactoring
the code, however, since it previously assumed that there will
only be one pair of q-dq (from conv weight) in the matched
pattern, and this is no longer true.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_conv_bn_bias_derived_qspec
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D50856377](https://our.internmc.facebook.com/intern/diff/D50856377)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112528
Approved by: https://github.com/jerryzh168
Summary: This commit refactors q-dq patterns used in QAT fusion,
reducing code duplication. This is important for future efforts
to support quantizing bias.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112279
Approved by: https://github.com/jerryzh168
ghstack dependencies: #112159
Summary: Today, we have special handling for special qspecs like
`SharedQuantizationSpec` or `DerivedQuantizationSpec`, since these
qspecs refer to other nodes in the graph and these node references
need to be updated after replacement (since they referred to nodes
in the original graph that no longer exist in the new graph).
However, we only do the above for special nodes like conv, bn,
getitem, and relu. This doesn't cover the common use case of
having conv bias derive its qparams from those of conv input
activations and conv weight. This commit adds support for this
use case by also replacing the node references for these nodes.
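For reference, a hedged sketch of that use case: a bias qspec derived from the conv input activation and weight (argument choices are illustrative; `input_act_node` etc. are fx nodes from the annotated graph):
```
import torch
from torch.ao.quantization.quantizer import DerivedQuantizationSpec

def bias_qspec_for_conv(input_act_node, weight_node, conv_node):
    def derive_qparams(obs_or_fqs):
        # bias_scale = act_scale * weight_scale; int32 bias uses zero_point 0.
        act_scale, _ = obs_or_fqs[0].calculate_qparams()
        weight_scale, _ = obs_or_fqs[1].calculate_qparams()
        return act_scale * weight_scale, torch.zeros_like(
            weight_scale, dtype=torch.int32)

    return DerivedQuantizationSpec(
        derived_from=[(input_act_node, conv_node), (weight_node, conv_node)],
        derive_qparams_fn=derive_qparams,
        dtype=torch.int32,
        quant_min=-(2**31),
        quant_max=2**31 - 1,
        qscheme=torch.per_tensor_symmetric,
    )
```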
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_conv_bn_bias_derived_qspec
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D50697078](https://our.internmc.facebook.com/intern/diff/D50697078)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112159
Approved by: https://github.com/jerryzh168
Summary: As titled; after the SharedQuantizationSpec bug fix we do some checks beforehand, which simplifies the logic when we insert observers
Test Plan:
contbuild & OSS CI, see bf998a2c5d
Test plan from GitHub:
python test/test_quantization.py TestQuantizePT2E
CIs
Differential Revision: D50816224
Pulled By: jerryzh168
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112453
Approved by: https://github.com/andrewor14
Summary:
As titled; after the SharedQuantizationSpec bug fix we do some checks beforehand, which simplifies the logic when we insert observers
Test Plan:
python test/test_quantization.py TestQuantizePT2E
CIs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111828
Approved by: https://github.com/kimishpatel
ghstack dependencies: #111827
Summary:
Previously we did not actually support this; this PR adds the support.
Next:
* clean up the insert-observer logic
* add an allow_transitive_sharing boolean flag to allow people to turn this off for certain edges
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_shared_qspec_transitivity
Differential Revision: [D50250789](https://our.internmc.facebook.com/intern/diff/D50250789)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111172
Approved by: https://github.com/kimishpatel
Summary:
This is a hacky flag that we had before in fx flow, and we don't want this in the new flow
Test Plan:
python test/test_quantization.py TestQuantizePT2E
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111000
Approved by: https://github.com/andrewor14
Fixes recent broken unit tests caused by PR #109908 because cudnn and miopen have separate batch norm functions.
```
2023-10-05T09:35:01.6606614Z _______________ TestQuantizePT2EQAT.test_qat_conv_bn_fusion_cuda _______________
2023-10-05T09:35:01.6606948Z Traceback (most recent call last):
2023-10-05T09:35:01.6607362Z File "/var/lib/jenkins/pytorch/test/quantization/pt2e/test_quantize_pt2e_qat.py", line 323, in test_qat_conv_bn_fusion_cuda
2023-10-05T09:35:01.6607767Z self._verify_symmetric_xnnpack_qat_graph(
2023-10-05T09:35:01.6608217Z File "/var/lib/jenkins/pytorch/test/quantization/pt2e/test_quantize_pt2e_qat.py", line 130, in _verify_symmetric_xnnpack_qat_graph
2023-10-05T09:35:01.6608658Z self._verify_symmetric_xnnpack_qat_graph_helper(
2023-10-05T09:35:01.6609105Z File "/var/lib/jenkins/pytorch/test/quantization/pt2e/test_quantize_pt2e_qat.py", line 173, in _verify_symmetric_xnnpack_qat_graph_helper
2023-10-05T09:35:01.6609623Z m = prepare_qat_pt2e(m, quantizer)
2023-10-05T09:35:01.6610171Z File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/ao/quantization/quantize_pt2e.py", line 178, in prepare_qat_pt2e
2023-10-05T09:35:01.6610561Z _fuse_conv_bn_qat(model)
2023-10-05T09:35:01.6611072Z File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/ao/quantization/pt2e/qat_utils.py", line 501, in _fuse_conv_bn_qat
2023-10-05T09:35:01.6611497Z m = _fuse_conv_bn_qat_helper(m, is_cuda=True)
2023-10-05T09:35:01.6612065Z File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/ao/quantization/pt2e/qat_utils.py", line 575, in _fuse_conv_bn_qat_helper
2023-10-05T09:35:01.6612492Z _get_conv_bn_getitem_nodes(r.replacements)
2023-10-05T09:35:01.6613058Z File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/ao/quantization/pt2e/qat_utils.py", line 383, in _get_conv_bn_getitem_nodes
2023-10-05T09:35:01.6613465Z assert bn_node is not None
2023-10-05T09:35:01.6613716Z AssertionError
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110653
Approved by: https://github.com/jerryzh168, https://github.com/pruthvistony
Summary:
Since we changed the IR that we are working with to pre-autograd ATen IR, it's easier
to use plain pattern matching instead of relying on source_matcher_utils now. This
PR refactors the annotation for conv to use ATen ops directly.
Also fixed the reentrant test after this change.
Test Plan:
python test/test_quantization.py TestQuantizePT2E
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110308
Approved by: https://github.com/kimishpatel
Summary:
D49187352 caused our model conversion and loading of the QAT checkpoint to be stuck with thrift timeouts.
We are actively checking in the final code and model for the static quant HTP prod model, and encountered this breakage at head on Thursday.
A thrift timeout is not a failure, and because of that, it's hard to bisect and find this culprit. It is also hard to set up a unit test, because the job simply times out. A better test is needed to guard downstream model conversion against upstream changes.
Our suspicion of why this diff broke us is that we create a lot of modules with QAT (in a recursive manner) but our model is not a QAT-traceable module (it is a graph with many QAT modules and floating point modules). With functools.partial as in the original diff, we would be caching modules in memory, causing the memory of the machine to be taken up completely.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110392
Approved by: https://github.com/junesg, https://github.com/jerryzh168
Summary: Today, we get different batch norm ops depending on
the device the model is placed on at export time. Exporting
`model.cpu()` gives `_native_batch_norm_legit`, while exporting
`model.cuda()` gives `cudnn_batch_norm`. QAT fusion currently
only supports the former and silently ignores the latter. This
commit fixes this by additionally matching on the latter op
during QAT fusion.
Test Plan:
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_conv_bn_fusion
python test/test_quantization.py TestQuantizePT2EQAT.test_qat_conv_bn_relu_fusion
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D49615145](https://our.internmc.facebook.com/intern/diff/D49615145)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109908
Approved by: https://github.com/jerryzh168
Summary:
Also added annotation support for conv1d_relu and conv1d in XNNPACKQuantizer; the quantized results still
match the fx quant path (which didn't quantize conv1d), so tests are not disabled.
Test Plan: with-proxy buck2 run executorch/examples/quantization:example -- -m=w2l --verify
Differential Revision: D49479546
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109830
Approved by: https://github.com/kimishpatel
Summary:
Resolving error:
AttributeError: Can't pickle local object '_add_module_to_qconfig_obs_ctr.<locals>.get_factory_kwargs_based_on_module_device'
by moving the nested function out to module scope.
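A minimal reproduction of this class of error (generic Python, not the actual quantization code):
```
import pickle

def _make_ctr():
    def get_factory_kwargs():  # nested: unpicklable by the stock pickler
        return {}
    return get_factory_kwargs

try:
    pickle.dumps(_make_ctr())
except AttributeError as e:
    print(e)  # Can't pickle local object '_make_ctr.<locals>.get_factory_kwargs'

def get_factory_kwargs_module_level():  # module scope: picklable
    return {}

pickle.dumps(get_factory_kwargs_module_level)
```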
Test Plan: Added test to CI
Reviewed By: andrewor14
Differential Revision: D49187352
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109288
Approved by: https://github.com/andrewor14
Summary:
Now that quantization works on pre-dispatch ATen IR, moving to the full set
of ATen ops is OK. Plus, when tracing models like ViT, the linear
projections of k, q, v use functional.linear and not nn.Linear,
which results in not being able to extract nodes corresponding to linear.
Test Plan:
quant tests
Differential Revision: [D49252194](https://our.internmc.facebook.com/intern/diff/D49252194)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109254
Approved by: https://github.com/jerryzh168
Summary:
Integer adaptive_avg_pool2d is not well defined due to the different possible ways of rounding an fp32 value to an integer value, and
this op isn't too critical for numerics (since it doesn't appear too often), so we'll skip it for now.
We might need to revert the changes that add an integer impl for the adaptive_avg_pool op as well.
Test Plan:
python test/test_quantization.py TestQuantizePT2ERepresentation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108924
Approved by: https://github.com/kimishpatel
Summary:
This commit fixes two silent correctness problems with
the current implementation of `move_model_to_eval`:
(1) Previously the user had to manually call `eliminate_dead_code`
before calling `move_model_to_eval`, otherwise the dropout pattern
won't actually get eliminated. This is because subgraph rewriter
complains the match is not self-contained, and so silently does
not do the replacement.
(2) We wish to error when the user calls `model.train()` or
`model.eval()` on an exported model. This error is raised
correctly immediately after export today, but no longer raised
after the user calls prepare or convert.
We fix (1) by moving the `eliminate_dead_code` call into
`move_model_to_eval`, and fix (2) by ensuring the respective
errors are thrown after prepare and convert as well.
Additionally, this commit renames `move_model_to_eval` to
`move_exported_model_to_eval` to be more explicit.
bypass-github-export-checks
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_disallow_eval_train
python test/test_quantization.py TestQuantizePT2E.test_move_exported_model_to_eval
Imported from OSS
Differential Revision: D49097293
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108891
Approved by: https://github.com/jerryzh168
Summary:
Previously we could only use native PyTorch int dtypes that have corresponding quantized dtypes (e.g. quint8, qint8). This
PR removes this assumption in observers/fake_quants so that users can use all PyTorch native dtypes (except for int64; we can add it later if needed).
The main addition here is int16.
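A hedged example of what this enables (observer arguments assumed):
```
import torch
from torch.ao.quantization.observer import MinMaxObserver

# int16 has no dedicated "quantized" dtype (unlike quint8/qint8), but can now
# be used directly as an observer dtype with explicit quant_min/quant_max.
obs = MinMaxObserver(dtype=torch.int16, quant_min=-(2**15), quant_max=2**15 - 1)
obs(torch.randn(8, 8))
scale, zero_point = obs.calculate_qparams()
```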
Test Plan:
python test/test_quantization.py TestQuantizePT2E
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108453
Approved by: https://github.com/kimishpatel
Summary:
During the convert step, observers are first replaced by a Q-DQ pair. In some
scenarios like the following, the output DQ has a fan-out:
                ---> OP2 -> Q -> DQ
               /
OP -> Q -> DQ -
               \
                ---> OP3 -> Q -> DQ
If either op OP2 or OP3 are configured to be quantized, then the input
is expected to quantized. In this case quantized equivalent of some
pattern, that quantizer asked to be quantized, should look like:
[DQ -> {pattern} -> Q]. However, in scenario like above where DQ node
is shared between multiple "quantized" patterns, boundary of "quantized"
pattern is not clear because DQ now belongs to multiple quantized
patterns.
This poses challenge for:
- Porting metadata: which "quantized" partition this DQ node belongs
- Quantized representation, equivalently, needs to identify
self-contained quantized pattern that is replaced by its equivalent pattern
that captures compute in the quantized precision.
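A minimal sketch of the direction taken (not the actual pass added here): give each consumer of a shared DQ node its own copy, so every quantized pattern becomes self-contained. The op-registration import below is a private path that may move across releases.
```python
import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401 (registers quantized_decomposed ops)
from torch.fx import GraphModule

def duplicate_shared_dq(gm: GraphModule) -> None:
    dq_op = torch.ops.quantized_decomposed.dequantize_per_tensor.default
    for node in list(gm.graph.nodes):
        if node.target is not dq_op or len(node.users) <= 1:
            continue
        # keep the original DQ for the first user; clone it for the rest
        for user in list(node.users)[1:]:
            with gm.graph.inserting_after(node):
                dq_copy = gm.graph.node_copy(node)
            user.replace_input_with(node, dq_copy)
    gm.graph.lint()
    gm.recompile()
```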
Test Plan:
test_duplicate_dq_pass
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D48663147](https://our.internmc.facebook.com/intern/diff/D48663147)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107900
Approved by: https://github.com/jerryzh168, https://github.com/andrewor14, https://github.com/leslie-fang-intel
ghstack dependencies: #107105, #107106, #107899
Summary:
In preparation for the metadata porting diff, weight
quant annotation must happen via edge quantization, i.e. input_qspec_map.
Reason: metadata is ported by associating a DQ node's metadata with its
consumer, and a Q node's metadata with its producer.
Furthermore, such porting must be qualified via user intent, to see whether
the consumer of the DQ, or the producer of the Q, actually specified the intent of
quantization.
By making the quantization annotation on the linear node's weight via
input_qspec_map, we can associate the DQ of [weight -> Q -> DQ]
with the linear module.
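Concretely, the weight annotation moves onto the linear node itself, keyed by the weight input edge (a sketch; `linear_node`, the argument nodes, and the qspecs are assumed to come from the quantizer):
```python
from torch.ao.quantization.quantizer import QuantizationAnnotation

linear_node.meta["quantization_annotation"] = QuantizationAnnotation(
    # edge-based annotation: the weight qspec hangs off the (weight, linear)
    # edge, so the DQ in [weight -> Q -> DQ] is associated with the linear node
    input_qspec_map={
        input_act_node: input_act_qspec,
        weight_node: weight_qspec,
    },
    output_qspec=output_act_qspec,
    _annotated=True,
)
```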
Test Plan:
CI
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D48488414](https://our.internmc.facebook.com/intern/diff/D48488414)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107105
Approved by: https://github.com/jerryzh168
Summary: D41985889 removed the cast to int for the inputs to torch.histc below, allowing the inputs to still be tensors. These tensors still have requires_grad set to True, causing issues with the call to torch.histc.
Test Plan: buck2 test 'fbcode//mode/opt' fbcode//dper3/dper3/modules/low_level_modules/tests:stat_collector_test -- --exact 'dper3/dper3/modules/low_level_modules/tests:stat_collector_test - test_scripted_module (dper3.dper3.modules.low_level_modules.tests.stat_collector_test.StatCollectorTest_1)'
Differential Revision: D48800879
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108232
Approved by: https://github.com/jerryzh168
Summary:
Previously we ran propagate_annotation by default in the quantization flow to propagate annotations for ops like reshape, view, etc.
Not all quantizers need this, so we moved it to xnnpack_quantizer_utils for now.
Next Step:
* make propagate_annotation function configurable with a custom list of ops
* remove unneeded ops in `_is_share_obs_or_fq_op`
Test Plan:
python test/test_quantization.py TestQuantizePT2E
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: [D48856985](https://our.internmc.facebook.com/intern/diff/D48856985)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108320
Approved by: https://github.com/kimishpatel
Summary: This commit adds a public facing
`torch.ao.quantization.move_model_to_eval` util function
for QAT users. Instead of calling model.eval() on an exported
model (which doesn't work, see
https://github.com/pytorch/pytorch/issues/103681), the user
would call this new util function instead. This ensures special
ops such as dropout and batchnorm (not supported yet) will have
the right behavior when the graph is later used for inference.
Note: Support for an equivalent `move_model_to_train` will be
added in the future. This is difficult to do for dropout
currently because the eval pattern of dropout is simply a clone
op, which we cannot just match and replace with a dropout op.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_move_model_to_eval
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D48814735](https://our.internmc.facebook.com/intern/diff/D48814735)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108184
Approved by: https://github.com/jerryzh168
**Summary**
Add linear and linear-unary post-op quantization recipes to the x86 inductor quantizer, for PT2E with Inductor. With this, the quantization path will add the `quant-dequant` pattern for linear and linear-unary post ops.
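A sketch of setting up the quantizer for this recipe (import paths as they exist under torch.ao.quantization.quantizer; the rest of the prepare/convert flow is unchanged):
```python
from torch.ao.quantization.quantizer.x86_inductor_quantizer import (
    X86InductorQuantizer,
    get_default_x86_inductor_quantization_config,
)

quantizer = X86InductorQuantizer().set_global(
    get_default_x86_inductor_quantization_config()
)
# then: prepare_pt2e(exported_model, quantizer) -> calibrate -> convert_pt2e(...)
```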
**Test plan**
python test/test_quantization.py -k test_linear_with_quantizer_api
python test/test_quantization.py -k test_linear_unary_with_quantizer_api
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106781
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #105818
**Summary**
The latest check-in a0cfaf0688 for the conv-bn folding assumes the graph is captured by the new graph capture API `torch._export.capture_pre_autograd_graph`. Since we still need to use the original graph capture API `torch._dynamo.export` in the 2.1 release, that check-in heavily hurt workloads' performance. Made this PR to fix the issue by making the conv-bn folding function work with both the new and the original graph capture APIs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107951
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #106836, #106838, #106958
Summary: This fixes the no bias case for conv annotations.
Previously this would result in an index out of bounds, since
the new aten.conv2d op may not have the bias arg (unlike the
old aten.convolution op). This was not caught because of a lack
of test cases, which are added in this commit.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_qat_conv_no_bias
python test/test_quantization.py TestQuantizePT2E.test_qat_conv_bn_relu_fusion_no_conv_bias
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel
Differential Revision: [D48696874](https://our.internmc.facebook.com/intern/diff/D48696874)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107971
Approved by: https://github.com/jerryzh168
When exporting dropout with a cpu tensor, we get the following graph module:
```
class GraphModule(torch.nn.Module):
def forward(self, arg0_1: f32[512, 10]):
empty_memory_format: f32[512, 10] = torch.ops.aten.empty.memory_format([512, 10], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False, memory_format = torch.contiguous_format)
bernoulli_p: f32[512, 10] = torch.ops.aten.bernoulli.p(empty_memory_format, 0.9); empty_memory_format = None
div_scalar: f32[512, 10] = torch.ops.aten.div.Scalar(bernoulli_p, 0.9); bernoulli_p = None
mul_tensor: f32[512, 10] = torch.ops.aten.mul.Tensor(arg0_1, div_scalar); arg0_1 = div_scalar = None
return (mul_tensor,)
```
In addition, if we export in eval() mode, we get an empty graph.
However, when exporting with a cuda tensor, we get:
```
class GraphModule(torch.nn.Module):
def forward(self, arg0_1: f32[512, 10]):
native_dropout_default = torch.ops.aten.native_dropout.default(arg0_1, 0.1, True); arg0_1 = None
getitem: f32[512, 10] = native_dropout_default[0]; native_dropout_default = None
return (getitem,)
```
and exporting under eval() mode will still leave a dropout node in the graph.
This PR makes exporting with a CPU tensor also produce aten.native_dropout.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106274
Approved by: https://github.com/ezyang
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.
I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, there seem to be no instances of it in our codebase, so I'm enabling the rule so that it stays that way. :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
Summary:
Currently in quantizer/quantize_pt2e we import things from specific quantizers (XNNPACKQuantizer, QuantizationConfig), etc.
This PR removes them so it's clearer that they are not part of the core quantization code base.
This PR also removes get_supported_operators from the main Quantizer, since we haven't seen a clear need for this API.
Test Plan:
CIs
Imported from OSS
Differential Revision: D48340367
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107259
Approved by: https://github.com/kimishpatel
Summary:
Previously, if we have:
```
conv1 --> cat
conv2 --/
```
and we configure the outputs of conv1/conv2 to be int8 quantized, and cat to also be int8 quantized with shared inputs,
it will not produce the expected results (the inputs of cat will not be shared).
The problem is that some checks were missing when inserting observers for the inputs of cat.
This PR fixes the problem.
Fixes: https://github.com/pytorch/pytorch/issues/106760
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_shared_qspec
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106922
Approved by: https://github.com/kimishpatel
Summary: Internal model and Resnet use the "re-export" flow now. Also did some refactoring to make the code a little cleaner.
Some changes for OSS:
1. Correctly use the "cached" fake tensors so that static symbols still resolve to static.
2. Change logic in PassBase to allocate static shapes for parameters.
3. Add an "is_torch_exported" tag to every node so that it survives various graph transformations.
4. Added an experimental wrapper API for the quantization team to get a pre_dispatch=True graph. Note that it doesn't actually do that right now, but we plan to switch soon.
Test Plan: CI
Differential Revision: D47890878
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106676
Approved by: https://github.com/jerryzh168
Summary:
This is to allow sharing these annotate functions with other quantizers so that writing a new quantizer is easier.
Note that these annotation functions will be maintained by XNNPACKQuantizer developers rather than the AO team.
Test Plan:
python test/test_quantization.py TestQuantizePT2E
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106642
Approved by: https://github.com/andrewor14
Summary:
As title.
There's a corner case where both cpu and gpu are available: although the model is moved to cpu, the newly created PTQ weight observer is still on gpu. Therefore, during convert, this line will fail https://fburl.com/4rhipfvb
Test Plan: CI
Differential Revision: D48141494
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106755
Approved by: https://github.com/jerryzh168
Summary:
Added support for users to set configurations based on module type in XNNPACKQuantizer; this can also serve as an example
for implementing new quantizers.
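A sketch of the API (`set_module_name`, added in the stacked PR below, is analogous):
```python
import torch
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())
# override the config for every instance of a given module type
quantizer.set_module_type(
    torch.nn.Linear, get_symmetric_quantization_config(is_per_channel=True)
)
```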
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_xnnpack_quantizer_set_module_type
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106094
Approved by: https://github.com/andrewor14
ghstack dependencies: #106087
Summary:
Added support for users to set configurations based on module name in XNNPACKQuantizer; this can also serve as an example
for implementing new quantizers.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_xnnpack_quantizer_set_module_name
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106087
Approved by: https://github.com/andrewor14
Summary: moving quantizer to torch.ao.quantization to make it a public api, since pt2e is a folder for implementations
Test Plan:
CIs
sanity check: "buck test //executorch/backends/xnnpack/test:test_xnnpack_quantized_models -- test_resnet18"
Differential Revision: D47727838
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105885
Approved by: https://github.com/andrewor14
Summary: We want to do this little by little. For now, I tried it only on DissectedPartsModel, which needs to use the aot_export version.
Test Plan: CI
Reviewed By: zhxchen17
Differential Revision: D46785735
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104897
Approved by: https://github.com/JacobSzwejbka
Calling `isinstance(x, Tuple[Node, Node])` would either fail or raise a
TypeError on more modern Pythons, as none of the tuples are actually
instances of `Tuple`:
```python
>>> from typing import Tuple
>>> from torch.fx import Node
>>> edge_or_node=(Node(None, "foo", "output", "foo", None, None), Node(None, "bar", "output", "bar", None, None))
>>> isinstance(edge_or_node, tuple) and len(edge_or_node) == 2 and all(isinstance(x, Node) for x in edge_or_node)
True
>>> isinstance(edge_or_node, Tuple[Node, Node])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/malfet/miniconda3/lib/python3.10/typing.py", line 994, in __instancecheck__
return self.__subclasscheck__(type(obj))
File "/Users/malfet/miniconda3/lib/python3.10/typing.py", line 997, in __subclasscheck__
raise TypeError("Subscripted generics cannot be used with"
TypeError: Subscripted generics cannot be used with class and instance checks
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105476
Approved by: https://github.com/jerryzh168
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)
These were reverted due to a conflict with the internal source repo.
Mostly fixes for PEP-484 violations (i.e. when a default arg is set to None but the type is not annotated as Optional)
Plus a few real fixes:
- Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
- Add missing return statement to `torch._export.deserialize_graph`
- Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
- Add an assert in `torch/optim/optimizer.py` that an Optional list is not None
TODO (in a followup PR):
- Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add a hack to `.ci/docker/install_conda.sh` to squash the older libstdc++ from the conda environment in favor of the one from the OS
- Update bazel cuda builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where that is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
Summary:
QAT convert for mobilenetv2 was previously not working
because we incorrectly applied dropout during eval as well as
training. This is because, for exported models, model.eval() does
not change the behavior of dropout, unlike models with torch ops.
This commit simulates the effects of model.eval() for exported
models as well by replacing the aten dropout pattern before eval.
As of this commit, end-to-end QAT numerics now match for
mobilenetv2 between FX and PT2.
Test Plan: python test/test_quantization.py TestQuantizePT2EModels.test_qat_mobilenet_v2
Differential Revision: D46750343
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104110
Approved by: https://github.com/jerryzh168
When tracing with symbolic shapes, arbitrary sym_size nodes can appear in the
graph. Earlier changes did not account for this, and the quantizer failed to annotate
the right nodes. This diff fixes that by not annotating sym_size nodes, which
should really not be relevant for quantization.
As next steps, we should a) validate in the quant workflow that sym_int nodes are not
being quantized, and b) add similar support, as in this diff, for generic
annotations.
Differential Revision: [D47132050](https://our.internmc.facebook.com/intern/diff/D47132050/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104473
Approved by: https://github.com/jerryzh168
Summary: Similar to quantized add, in this PR we added the reference representation for the quantize/dequantize operators
Test Plan:
buck2 test caffe2/test:quantization_pt2e -- --exact 'caffe2/test:quantization_pt2e - test_representation_quantize (quantization.pt2e.test_quantize_pt2e.TestQuantizePT2E)'
buck2 test caffe2/test:quantization_pt2e -- --exact 'caffe2/test:quantization_pt2e - test_representation_dequantize (quantization.pt2e.test_quantize_pt2e.TestQuantizePT2E)'
Reviewed By: kimishpatel
Differential Revision: D46959928
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104395
Approved by: https://github.com/andrewor14
Summary:
The planned e2e for quantization in pytorch 2.0 export is the following:
float_model -> prepare_pt2e -> calibration -> convert_pt2e -> ...
Inside convert_pt2e, we will first produce a q/dq representation of the quantized model, similar to the previous output of
convert_to_reference_fx in fx graph mode quantization:
```
torch.ops.quantized_decomposed.dequantize_per_tensor -> torch.ops.aten.add -> torch.ops.quantized_decomposed.quantize_per_tensor
torch.ops.quantized_decomposed.dequantize_per_tensor /
```
Then we'll rewrite the above into a representation that expresses the intent more precisely: since
here we actually want to do int8 addition, instead of simulating the int8 addition with fp32 operations, the representation for
quantized add is:
```
def quantized_add(x_i8, x_scale, x_zero_point, y_i8, y_scale, y_zero_point, out_scale, out_zero_point):
    # rescale each int8 input into the output scale
    x = (x_scale / out_scale) * x_i8
    y = (y_scale / out_scale) * y_i8
    out = x + y
    # subtract the contribution of both zero points, rescaled to the output scale
    out -= (x_zero_point * x_scale + y_zero_point * y_scale) / out_scale
    out += out_zero_point
    return out
```
Test Plan:
```
buck2 test caffe2/test:quantization_pt2e -- --exact 'caffe2/test:quantization_pt2e - test_representation_add (quantization.pt2e.test_quantize_pt2e.TestQuantizePT2E)'
```
Reviewed By: kimishpatel
Differential Revision: D45628032
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104130
Approved by: https://github.com/kimishpatel
Summary:
Also adds support for backend_config with relu fusion since XNNPACK allows it.
We should revisit the relu fusion once we gain more clarity on quantSrcPartition or some other way to do these fusion and not having to add all combinations.
We should really rename the backend config to et_xnnpack.py or something (TODO).
Test Plan: `buck test fbcode//mode/dev-nosan fbcode//executorch/backends/xnnpack/test:`
Differential Revision: D46985169
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104134
Approved by: https://github.com/mcr229, https://github.com/salilsdesai
Summary:
Also adds support for backend_config with relu fusion since XNNPACK allows it.
We should revisit the relu fusion once we gain more clarity on quantSrcPartition or some other way to do these fusion and not having to add all combinations.
We should really rename the backend config to et_xnnpack.py or something (TODO).
Test Plan: `buck test fbcode//mode/dev-nosan fbcode//executorch/backends/xnnpack/test:`
Differential Revision: D46924209
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104090
Approved by: https://github.com/mcr229
Summary: https://github.com/pytorch/pytorch/issues/100654 noticed that prelu
was not running its observers when the quantization flow was run.
This was a bug, which is now fixed, and the relevant prelu tests now
check for this. Also added a corrected observer for PReLU to
qconfig_mapping.
Test Plan: python test/test_quantization.py TestStaticQuantizedModule.test_prelu
Reviewers:
Subscribers:
Tasks:
Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103455
Approved by: https://github.com/jerryzh168
Summary:
Prepare QAT for mobilenetv2 now has matching numerics with
FX. Two changes were needed to achieve this, however.
First, this commit adds observer sharing for ReLU6, which is
used extensively throughout this model. Second, in the tests we
have to use the same manual seed every time we call the models
in order to get the same results between FX and PT2. This is
because there is a dropout at the end of the model.
Test Plan: python test/test_quantization.py TestQuantizePT2EModels.test_qat_mobilenet_v2
Reviewed By: kimishpatel
Differential Revision: D46707786
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104068
Approved by: https://github.com/jerryzh168
Summary:
Special qspecs like `SharedQuantizationSpec` and
`DerivedQuantizationSpec` refer to other nodes in the graph.
However, after subgraph rewriting in QAT, the nodes referred
to in these special qspecs may be replaced by new nodes.
This could lead to the following error when inserting
observers according to these qspecs:
```
AssertionError: please make sure only refer to edge or node
that has observer/fake_quant inserted: 'getitem' not in
dict_keys([(arg0, convolution_default_1), (mul_tensor, convolution_default_1), getitem_3])
```
This commit fixes this by keeping track of the nodes that
are replaced during subgraph rewriting in QAT, and using
this mapping to update the dangling references used in these
special qspecs.
Test Plan: python test/test_quantization.py TestQuantizePT2E.test_qat_update_shared_qspec
Reviewed By: jerryzh168
Differential Revision: D46606614
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103970
Approved by: https://github.com/jerryzh168
Summary:
Before this commit, only prepare QAT numerics matched
between PT2 and FX for resnet18. Convert numerics diverged,
however, for two reasons:
(1) Existing patterns did not handle inplace ReLUs. This commit
fixes this by adding extra patterns that use these ReLUs instead
of the normal ones.
(2) Subgraph rewriter could not handle skip connections in
quantized models, because the dequantize node is used in both
the conv node within the match pattern, and an inplace add node
outside of the match pattern. This led the subgraph matcher to
filter out the match, complaining that it was not self contained.
This commit fixes this problem by duplicating the dequantize
nodes, one for each user, such that subsequent matches will
be self contained.
Test Plan: python test/test_quantization.py TestQuantizePT2EModels.test_qat_resnet18
Reviewed By: jerryzh168
Differential Revision: D46564114
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103759
Approved by: https://github.com/jerryzh168
**Summary**
- Update the quantization documentation to recommend that the default qconfig with the oneDNN backend be used on CPUs with Vector Neural Network Instruction support.
- Add a warning message when the user uses the default qconfig with the oneDNN backend on a CPU without Vector Neural Network Instruction support.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103653
Approved by: https://github.com/jgong5, https://github.com/malfet
Summary:
Similar to the prepare case, we need to manually copy
over literal conv args such as padding and stride to the new,
replaced conv nodes, since these args are not captured by the
subgraph rewriter.
Test Plan: python test/test_quantization.py TestQuantizePT2E.test_qat_conv_bn_fusion_literal_args
Reviewed By: jerryzh168
Differential Revision: D46383130
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103731
Approved by: https://github.com/jerryzh168
Summary:
Previously, the QAT pattern for conv + bn with no conv
bias was not actually replaced in convert. This commit adds an
extra pattern in the convert path for this case and the numerics
now match FX's.
Test Plan: python test/test_quantization.py TestQuantizePT2E.test_prepare_qat_conv_bn_fusion_no_conv_bias
Reviewed By: jerryzh168
Differential Revision: D46382819
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103298
Approved by: https://github.com/jerryzh168
Summary:
Dynamo trace, via dynamo.export with aten_graph, generates a graph whose nodes'
targets are instances of torch._ops.OpOverload. The quantization workflow
inserting quantize/dequantize ops that are sometimes instances of
torch._ops.OpOverload (quantize_per_tensor.tensor) and other times instances
of torch._ops.OpOverloadPacket (quantize_per_tensor) is a bit inconsistent.
It is also not clear that an exported model is valid if it has nodes with targets
of type torch._ops.OpOverloadPacket.
Without the op overload name attached to the target, it fails during executorch
tracing, because executorch tracing expects node targets to be
instances of torch._ops.OpOverload and not torch._ops.OpOverloadPacket.
So for consistency and tracing reasons, this fixes the convert pass to insert ops that
are torch._ops.OpOverload
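The distinction, concretely (a sketch; the op-registration import is a private path that may move across releases):
```python
import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401 (registers quantized_decomposed ops)

packet = torch.ops.quantized_decomposed.quantize_per_tensor            # OpOverloadPacket
overload = torch.ops.quantized_decomposed.quantize_per_tensor.default  # OpOverload

assert isinstance(packet, torch._ops.OpOverloadPacket)
assert isinstance(overload, torch._ops.OpOverload)
# after this fix, the convert pass only inserts overload-style targets
```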
Test Plan: CI
Reviewed By: jerryzh168
Differential Revision: D46342822
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103251
Approved by: https://github.com/andrewor14
This commit changes ModelReportObserver variables to buffers, similar to other observers. This allows gathering data on devices other than CPU.
Moreover, it updates InputWeightEqualizationDetector to compute weight stats that are on GPU.
Tested by running the tests in `test/quantization/fx/test_model_report_fx.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97971
Approved by: https://github.com/vkuzo
Summary:
Dynamo burns scalars in instead of keeping them on the module. This results in
quantize_per_tensor and dequantize_per_tensor nodes having burnt-in scale and
zero point values if we trace with them as scalars.
The graph rewrite ignores literals, and when the match pattern is replaced with
the replacement pattern, we lose the scale/zp and other values from the nodes in
the original graph and instead get the ones from the replacement graph.
This diff fixes that for q/dq per tensor nodes by manually copying these values
over.
Note that this is not robust, because it works only when there is only a single
q/dq node.
Test Plan: quantization_pt2e
Reviewed By: andrewor14
Differential Revision: D46614000
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103556
Approved by: https://github.com/andrewor14
Summary:
att, we use module partition API to identify the GRU submodule and annotate all necessary patterns
Test Plan: buck2 test mode/opt caffe2/test:quantization_pt2e -- 'caffe2/test:quantization_pt2e'
Differential Revision: D46689428
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103526
Approved by: https://github.com/andrewor14
Summary: att, we use module partition API to identify the GRU submodule and annotate all necessary patterns
Test Plan: buck2 test mode/opt caffe2/test:quantization_pt2e -- 'caffe2/test:quantization_pt2e'
Reviewed By: kimishpatel
Differential Revision: D46384329
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103358
Approved by: https://github.com/HDCharles
Summary:
Previously, the QAT pattern for conv + bn + relu was
not actually replaced in convert. This is because the quantized
QAT pattern used in convert doesn't actually have a relu node.
This commit adds this extra pattern in the convert path and
the numerics now match FX's.
Test Plan: python test/test_quantization.py TestQuantizePT2E.test_qat_conv_bn_relu_numerics
Reviewed By: jerryzh168
Differential Revision: D46372411
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102993
Approved by: https://github.com/jerryzh168
Summary:
In this diff we test a module that a) does embedding lookup, b) runs a 1D
(converted to 2D) conv, and c) runs linear on the output of the 1d conv.
a is quantized using the embedding quantizer.
c is quantized using dynamic quantization.
b is quantized using static quantization.
We compose a quantizer from [a, c, b]. Tested it against a similar fx config.
Test Plan: test_embedding_conv_linear_quantization
Reviewed By: jerryzh168
Differential Revision: D46267688
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103116
Approved by: https://github.com/jerryzh168
Summary:
Using the composable quantizer, we can now compose two or more quantizers. In
the test here we compose a quantizer configured for dynamic linear quantization
with a quantizer configured for static quantization.
Note that the composable quantizer has a strict order in which annotations are
applied.
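A sketch of the composition (assuming XNNPACKQuantizer-based configs; annotations are applied in list order):
```python
from torch.ao.quantization.quantizer.composable_quantizer import ComposableQuantizer
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

dynamic_quantizer = XNNPACKQuantizer().set_global(
    get_symmetric_quantization_config(is_dynamic=True)
)
static_quantizer = XNNPACKQuantizer().set_global(
    get_symmetric_quantization_config()
)
composed = ComposableQuantizer([dynamic_quantizer, static_quantizer])
```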
Test Plan: test_composable_quantizer*
Reviewed By: jerryzh168
Differential Revision: D46267690
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102846
Approved by: https://github.com/andrewor14
Summary:
Previously, the test for the convert flow in Conv + BN
QAT fusion was mistakenly not enabled. However, reenabling this
test uncovered several bugs:
(1) The replaced nodes returned by subgraph rewriter were not
handled correctly. This is because a recent change in the subgraph
rewriter (#100556) fixed only the prepare case but not the convert
case. This commit brings this fix to the convert case as well and
deduplicates some code between the two cases.
(2) When folding BN into conv, we used the wrong arg index to get
the BN eps value. This resulted in an incorrect conv weight.
(3) In FX, we currently do a hack for weighted modules where we
observe the weights once in convert in order to ensure we get the
right shapes for these weight observers. This caused the numerics
to diverge between PT2 and FX. This commit fixes this by skipping
this unnecessary hack for `_convert_to_reference_decomposed_fx`.
(4) Per channel support was simply missing. This commit adds
support for this by matching the quantize_per_channel and
dequantize_per_channel ops in addition to the existing ones.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_qat_conv_bn_numerics
Reviewed By: jerryzh168
Differential Revision: D46097783
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102224
Approved by: https://github.com/jerryzh168
Summary:
att, after we support SharedQuantizationSpec we don't need these things anymore; this PR refactors the
uses of _input_output_share_observers to SharedQuantizationSpec
Test Plan:
```
buck2 test mode/opt caffe2/test:quantization_pt2e -- 'caffe2/test:quantization_pt2e'
```
Reviewed By: andrewor14
Differential Revision: D46301342
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102854
Approved by: https://github.com/andrewor14
Summary:
Make all quantization specs inherit from the same base class in order to simplify the typing
for QuantizationAnnotation
Test Plan:
```
buck2 test mode/opt caffe2/test:quantization_pt2e -- 'caffe2/test:quantization_pt2e'
```
Reviewed By: kimishpatel
Differential Revision: D46173954
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102582
Approved by: https://github.com/andrewor14
This diff introduces the utility `find_sequential_partitions`.
This utility allows one to specify a sequential pattern of
nn.Module/nn.functional ops and returns a list. Each item in the list contains a
List[SourcePartition] that represents sequentially connected partitions
of the requested pattern.
For example, `find_sequential_partitions(model, [nn.Conv2d, nn.ReLU])` will find
all nn.Conv2d and nn.ReLU partitions that are sequentially connected; see the sketch below.
Furthermore, move to using `find_sequential_partitions` for conv_bn/conv_bn_relu
for QAT.
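A sketch of the call (the import path for this utility has moved across releases; `gm` is assumed to be an exported GraphModule):
```python
import torch.nn as nn
from torch.ao.quantization.pt2e.graph_utils import find_sequential_partitions

fused_partitions = find_sequential_partitions(gm, [nn.Conv2d, nn.ReLU])
for conv_partition, relu_partition in fused_partitions:
    # each item is a SourcePartition; conv_partition's output feeds relu_partition
    ...
```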
Differential Revision: [D45948057](https://our.internmc.facebook.com/intern/diff/D45948057/)
**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D45948057/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102394
Approved by: https://github.com/jerryzh168
Summary:
Recently we changed the annotation from "target_dtype_info" to "quantization_annotation" and introduced the QuantizationAnnotation API
and the SharedQuantizationSpec API for users to convey sharing between inputs/outputs. This PR updates the _propagate_annotation
pass to accommodate the recent changes.
Test Plan:
```
buck2 test mode/opt caffe2/test:quantization_pt2e -- 'caffe2/test:quantization_pt2e'
```
Reviewed By: kimishpatel
Differential Revision: D46153084
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102422
Approved by: https://github.com/kimishpatel
Summary:
```
"""
4. DerivedQuantizationSpec
this is the quantization spec for the Tensors whose quantization parameters are derived from other Tensors
"""
class DerivedQuantizationSpec(QuantizationSpecBase):
    # specifies which Tensors the quantization parameters are derived from;
    # each entry can either be an edge from argument to node, or a node
    derived_from: List[EdgeOrNode]
    derive_qparams_fn: Callable[[List[ObserverOrFakeQuantize]], Tuple[Tensor, Tensor]]
    ...
```
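A typical use is deriving a bias scale from the activation and weight observers (a sketch; `input_act_node`, `weight_node`, and `conv_node` are assumed to come from the quantizer's matching logic):
```python
from typing import List, Tuple
import torch
from torch.ao.quantization import ObserverOrFakeQuantize
from torch.ao.quantization.quantizer import DerivedQuantizationSpec

def derive_bias_qparams(
    obs_or_fqs: List[ObserverOrFakeQuantize],
) -> Tuple[torch.Tensor, torch.Tensor]:
    act_obs, weight_obs = obs_or_fqs
    act_scale, _ = act_obs.calculate_qparams()
    weight_scale, _ = weight_obs.calculate_qparams()
    # bias scale = act_scale * weight_scale, zero point fixed at 0
    return act_scale * weight_scale, torch.zeros_like(weight_scale, dtype=torch.int32)

bias_qspec = DerivedQuantizationSpec(
    derived_from=[(input_act_node, conv_node), (weight_node, conv_node)],
    derive_qparams_fn=derive_bias_qparams,
    dtype=torch.int32,
    quant_min=-(2 ** 31),
    quant_max=2 ** 31 - 1,
    qscheme=torch.per_tensor_symmetric,
)
```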
Test Plan:
```
buck2 test mode/opt caffe2/test:quantization_pt2e -- 'caffe2/test:quantization_pt2e'
buck2 test mode/opt caffe2/test:quantization_pt2e -- --exact 'caffe2/test:quantization_pt2e - test_resnet18_with_quantizer_api (quantization.pt2e.test_quantize_pt2e.TestQuantizePT2EModels)'
```
Reviewed By: kimishpatel
Differential Revision: D46097855
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102282
Approved by: https://github.com/andrewor14
Summary:
This PR adds support for SharedQuantizationSpec. It's used to express sharing between
two Tensors in the prepared graph; the Tensor will either be the input of some node (expressed as a tuple of fx nodes) or
the output of some node (expressed as an fx Node).
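Both forms, as a sketch (`conv_node` and `add_node` stand in for nodes the quantizer has already matched):
```python
from torch.ao.quantization.quantizer import SharedQuantizationSpec

# share observers/qparams with the output of conv_node
qspec_shared_with_output = SharedQuantizationSpec(conv_node)
# share observers/qparams with the input edge (conv_node -> add_node)
qspec_shared_with_input = SharedQuantizationSpec((conv_node, add_node))
```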
Test Plan:
```
buck2 test mode/opt caffe2/test:quantization_pt2e -- 'caffe2/test:quantization_pt2e'
buck2 test mode/opt caffe2/test:quantization_pt2e -- --exact 'caffe2/test:quantization_pt2e - test_resnet18_with_quantizer_api (quantization.pt2e.test_quantize_pt2e.TestQuantizePT2EModels)'
```
Differential Revision: D46043026
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102184
Approved by: https://github.com/kimishpatel, https://github.com/leslie-fang-intel
Similar to https://github.com/pytorch/pytorch/pull/96160 but for the modules
nn.PixelShuffle and nn.PixelUnshuffle.
torch.nn.PixelShuffle and torch.nn.PixelUnshuffle accept both float and quantized inputs.
However, previously we would unnecessarily dequantize quantized inputs into floats
before passing them to the function. This commit fixes this by lowering the patterns
[dequant - PixelShuffle - quant] and [dequant - PixelUnshuffle - quant].
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_pixel_shuffle_module
python test/test_quantization.py TestQuantizeFxOps.test_pixel_unshuffle_module
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101926
Approved by: https://github.com/jerryzh168