Summary:
As title
This is a BC-breaking change because graphs produced by `capture_pre_autograd_graph` can no longer be passed to quantization. This is acceptable, since the API has been deprecated for a while and is slated for deletion; we have already removed all call sites of it.
We remove the deprecated API references in code, docs, and tests.
We also removed two tests that were specific to the capture_pre_autograd_graph API.
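For reference, a minimal migration sketch, assuming the standard PT2E flow and using `torch.export.export_for_training` as the replacement capture API (`MyQuantizer` and `example_inputs` are placeholders, not part of this PR):
```
import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e

# Before (deprecated): m = capture_pre_autograd_graph(m, example_inputs)
# After:
m = torch.export.export_for_training(m, example_inputs).module()

# The rest of the PT2E flow is unchanged.
quantizer = MyQuantizer()  # placeholder for a concrete Quantizer subclass
m = prepare_pt2e(m, quantizer)
# ... run calibration ...
m = convert_pt2e(m)
```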
Test Plan: CI
Differential Revision: D65351887
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139505
Approved by: https://github.com/tugsbayasgalan, https://github.com/andrewor14, https://github.com/jerryzh168
Summary:
As title.
We also update it in the `prepare_pt2e()` docstring.
Test Plan:
```
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:quantization_pt2e_qat
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization
```
Differential Revision: D63345059
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137233
Approved by: https://github.com/tugsbayasgalan
Summary:
A Quantizer subclass can return a new model from `transform_for_annotation`,
and this is common if it uses any ExportPass subclass, which does not mutate in place.
Use the returned model instead of assuming it's the same object.
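For illustration, a minimal sketch of the contract, where `my_pass` stands in for any ExportPass-style transform that returns a fresh GraphModule (these names are placeholders, not code from this PR):
```
import torch
from torch.ao.quantization.quantizer import Quantizer

class MyQuantizer(Quantizer):
    def transform_for_annotation(
        self, model: torch.fx.GraphModule
    ) -> torch.fx.GraphModule:
        # my_pass is a placeholder transform that returns a new
        # GraphModule instead of mutating `model` in place.
        return my_pass(model)

    def annotate(self, model: torch.fx.GraphModule) -> torch.fx.GraphModule:
        return model  # annotation logic elided

    def validate(self, model: torch.fx.GraphModule) -> None:
        pass
```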
Differential Revision: D60869676
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132893
Approved by: https://github.com/jerryzh168
Summary:
This is a follow up to https://github.com/pytorch/pytorch/pull/118605 to remove `fold_quantize` flag from
`convert_pt2e`
Test Plan: CI
Differential Revision: D53247301
BC Breaking Note:
The `fold_quantize` flag of `convert_pt2e` now defaults to True, so we fold the quantize op into the weight by default, and users will see a model size reduction by default after pt2e quantization.
PyTorch 2.2:
```
folded_model = convert_pt2e(model, fold_quantize=True)
non_folded_model = convert_pt2e(model)
```
PyTorch 2.3:
```
folded_model = convert_pt2e(model)
non_folded_model = convert_pt2e(model, fold_quantize=False)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118701
Approved by: https://github.com/andrewor14, https://github.com/leslie-fang-intel
Summary:
Previously, by default we did not generate quantized weights; that is, the quantized model kept fp32 weights and looked like
`fp32 weight -> q -> dq -> linear -> ...`
After this PR, `convert_pt2e` produces a graph with int8 weights by default:
`int8 weight -> dq -> linear -> ...`
We'll remove the `fold_quantize` flag in the next PR.
Test Plan: CI
Differential Revision: D51730862
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118605
Approved by: https://github.com/andrewor14
Summary:
We introduced `node.meta["numeric_debug_handle"]` in https://github.com/pytorch/pytorch/pull/114315 to
identify values in the graph for numeric debugging. In this PR we preserve this field
through prepare and convert so that it can be used for numerical debugging.
Next: we also want to preserve these handles through deepcopy of GraphModule as well.
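For illustration, a small sketch of reading the preserved handles (assuming `m` is a prepared or converted GraphModule; this is not the test code):
```
# List every node that carries a preserved numeric debug handle.
for node in m.graph.nodes:
    handle = node.meta.get("numeric_debug_handle")
    if handle is not None:
        print(f"{node.name}: {handle}")
```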
Test Plan:
python test/test_quantization.py -k test_quantize_pt2e_preserve_handle
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116477
Approved by: https://github.com/tugsbayasgalan
Summary:
`_fold_conv_bn_qat` currently takes a long time, so we skip it when it is not necessary.
Follow-up fixes can actually reduce the number of patterns, or cache the patterns where possible.
Test Plan:
uncomment the print in `test_speed`, run
python test/test_quantization.py -k test_speed
and make sure the convert time is low, e.g. ~0.1s instead of 8-9 seconds.
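A rough sketch of the kind of timing check involved (illustrative only, not the actual test code; `m` is assumed to be a prepared QAT model):
```
import time
from torch.ao.quantization.quantize_pt2e import convert_pt2e

start = time.time()
m = convert_pt2e(m)  # fold_conv_bn_qat is now skipped when unnecessary
print(f"convert time: {time.time() - start:.2f}s")  # expect ~0.1s
```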
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116440
Approved by: https://github.com/andrewor14
Summary:
Previously we did not really support transitive sharing of quantization specs; this PR adds that support (see the sketch below).
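For illustration, a hedged sketch of what transitive sharing means at the annotation level (`x_node`, `y_node`, `cat_node`, and `act_qspec` are assumed to be defined by the quantizer; this is not code from this PR):
```
from torch.ao.quantization.quantizer import (
    QuantizationAnnotation,
    SharedQuantizationSpec,
)

# y's input edge shares the spec of x's input edge, and the output shares
# it too; transitivity means any edge that shares with (y_node, cat_node)
# ends up in the same sharing group as (x_node, cat_node).
cat_node.meta["quantization_annotation"] = QuantizationAnnotation(
    input_qspec_map={
        x_node: act_qspec,
        y_node: SharedQuantizationSpec((x_node, cat_node)),
    },
    output_qspec=SharedQuantizationSpec((x_node, cat_node)),
    _annotated=True,
)
```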
Next:
* clean up the insert observer logic
* add an `allow_transitive_sharing` boolean flag to allow people to turn this off for certain edges
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_shared_qspec_transitivity
Differential Revision: [D50250789](https://our.internmc.facebook.com/intern/diff/D50250789)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111172
Approved by: https://github.com/kimishpatel
Summary:
This commit fixes two silent correctness problems with
the current implementation of `move_model_to_eval`:
(1) Previously the user had to manually call `eliminate_dead_code`
before calling `move_model_to_eval`, otherwise the dropout pattern
wouldn't actually get eliminated. This is because the subgraph rewriter
complains that the match is not self-contained, and so silently skips
the replacement.
(2) We wish to raise an error when the user calls `model.train()` or
`model.eval()` on an exported model. This error is raised
correctly immediately after export today, but is no longer raised
after the user calls prepare or convert.
We fix (1) by moving the `eliminate_dead_code` call into
`move_model_to_eval`, and fix (2) by ensuring the respective
errors are thrown after prepare and convert as well.
Additionally, this commit renames `move_model_to_eval` to
`move_exported_model_to_eval` to be more explicit.
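For reference, a minimal usage sketch under the new name (assuming `model` has gone through export and prepare/convert):
```
import torch

# model.eval() raises on an exported model; use the util instead.
model = torch.ao.quantization.move_exported_model_to_eval(model)
```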
bypass-github-export-checks
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_disallow_eval_train
python test/test_quantization.py TestQuantizePT2E.test_move_exported_model_to_eval
Imported from OSS
Differential Revision: D49097293
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108891
Approved by: https://github.com/jerryzh168
Summary:
During the convert step, observers are first replaced by a Q-DQ pair. In some
scenarios, like the following, the output DQ has a fan-out:
```
                 ---> OP2 -> Q -> DQ
                /
OP -> Q -> DQ -
                \
                 ---> OP3 -> Q -> DQ
```
If either OP2 or OP3 is configured to be quantized, then its input is
expected to be quantized. In this case, the quantized equivalent of a
pattern that the quantizer asked to be quantized should look like:
[DQ -> {pattern} -> Q]. However, in a scenario like the above, where the DQ
node is shared between multiple "quantized" patterns, the boundary of each
"quantized" pattern is not clear because the DQ now belongs to multiple
quantized patterns.
This poses challenges for:
- Porting metadata: which "quantized" partition does this DQ node belong to?
- Quantized representation: equivalently, we need to identify a
self-contained quantized pattern that can be replaced by its equivalent
pattern capturing the compute in the quantized precision.
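Judging from the test below, the fix duplicates the shared DQ node so that each consumer pattern owns its own copy, roughly (our illustration, not the PR's own diagram):
```
                 ---> DQ -> OP2 -> Q -> DQ
                /
OP -> Q -------
                \
                 ---> DQ -> OP3 -> Q -> DQ
```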
Test Plan:
test_duplicate_dq_pass
Differential Revision: [D48663147](https://our.internmc.facebook.com/intern/diff/D48663147)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107900
Approved by: https://github.com/jerryzh168, https://github.com/andrewor14, https://github.com/leslie-fang-intel
ghstack dependencies: #107105, #107106, #107899
Summary:
Previously we ran propagate_annotation by default in the quantization flow to propagate annotations for ops like reshape, view, etc.
Not all quantizers need this, so we moved it to xnnpack_quantizer_utils for now.
Next Steps:
* make the propagate_annotation function configurable with a custom list of ops
* remove unneeded ops in `_is_share_obs_or_fq_op`
Test Plan:
python test/test_quantization.py TestQuantizePT2E
Differential Revision: [D48856985](https://our.internmc.facebook.com/intern/diff/D48856985)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108320
Approved by: https://github.com/kimishpatel
Summary: This commit adds a public-facing
`torch.ao.quantization.move_model_to_eval` util function
for QAT users. Instead of calling model.eval() on an exported
model (which doesn't work, see
https://github.com/pytorch/pytorch/issues/103681), the user
would call this new util function. This ensures special
ops such as dropout and batchnorm (not supported yet) will have
the right behavior when the graph is later used for inference.
Note: Support for an equivalent `move_model_to_train` will be
added in the future. This is difficult to do for dropout
currently because the eval pattern of dropout is simply a clone
op, which we cannot just match and replace with a dropout op.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_move_model_to_eval
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Differential Revision: [D48814735](https://our.internmc.facebook.com/intern/diff/D48814735)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108184
Approved by: https://github.com/jerryzh168
Summary:
Currently in quantizer/quantize_pt2e we import things from specific quantizers (XNNPACKQuantizer, QuantizationConfig), etc.
This PR removes those imports so it's clearer that they are not part of the core quantization code base.
This PR also removes get_supported_operators from the base Quantizer class, since we haven't seen a clear need for this API.
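After this change, quantizer-specific pieces are imported from the quantizer's own module instead; for example, a sketch assuming the XNNPACKQuantizer module layout at the time:
```
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

quantizer = XNNPACKQuantizer()
quantizer.set_global(get_symmetric_quantization_config())
```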
Test Plan:
CIs
Imported from OSS
Differential Revision: D48340367
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107259
Approved by: https://github.com/kimishpatel
Summary: Moving quantizer to torch.ao.quantization to make it a public API, since pt2e is a folder for implementations.
Test Plan:
CIs
sanity check: "buck test //executorch/backends/xnnpack/test:test_xnnpack_quantized_models -- test_resnet18"
Differential Revision: D47727838
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105885
Approved by: https://github.com/andrewor14