pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Jerry Zhang	1b51d29b66	[quant][pt2e] Enable constant folding for quantize ops (#109343 ) Summary: This PR added constant folding for quantize ops so that instead of storing fp32 weight in the quantized model, we'll get int8/int16 etc. weight Test Plan: python test/test_quantization.py TestQuantizePT2E.test_fold_quantize also will verify in executorch later Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D49399210](https://our.internmc.facebook.com/intern/diff/D49399210) Pull Request resolved: https://github.com/pytorch/pytorch/pull/109343 Approved by: https://github.com/kimishpatel, https://github.com/jgong5	2023-09-27 06:04:45 +00:00
Jiaxu Zhu	595af261b2	[ao] Support Subclasses of `FloatFunctional` in eager mode prepare (#109646 ) Summary: As title, if a module is subclassing `nnq.FloatFunctional`, also adding observers to it like `nnq.FloatFunctional` Test Plan: CI Differential Revision: D49431968 Pull Request resolved: https://github.com/pytorch/pytorch/pull/109646 Approved by: https://github.com/jerryzh168	2023-09-20 08:09:55 +00:00
Kimish Patel	73ac814148	[Pytorch][quant] Move xnnpack quantizer to use aten.linear (#109254 ) Summary: Now that quantization works on pre-dispatch aten IR, moving to full set of aten ops is ok. Plus when tracing models like ViT, the linear projections of of k, q, v uses functional.linear and not nn.Linear, which results not being able to extract nodes corresponding to linear. Test Plan: quant tests Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D49252194](https://our.internmc.facebook.com/intern/diff/D49252194) Pull Request resolved: https://github.com/pytorch/pytorch/pull/109254 Approved by: https://github.com/jerryzh168	2023-09-18 20:26:44 +00:00
Jerry Zhang	3943afc94e	[quant][be] Remove unused APIs (#109342 ) Summary: att Test Plan: python test/test_quantization.py TestQuantizePT2E Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/109342 Approved by: https://github.com/kimishpatel, https://github.com/andrewor14	2023-09-15 16:07:01 +00:00
Jerry Zhang	41e2189843	[quant] Remove reference representation rewrite for adaptive_avg_pool2d (#108924 ) Summary: integer adaptive_avg_pool2d is not well defined due to different possible ways of rounding fp32 value to integer value, and this op isn't too critical for numerics (since it appears not too often), so we'll skip this for now. we might need to revert the changes that adds integer impl for adaptive_avg_pool op as well Test Plan: python test/test_quantization.py TestQuantizePT2ERepresentation Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/108924 Approved by: https://github.com/kimishpatel	2023-09-14 10:18:36 +00:00
Jerry Zhang	cf26e5575d	[quant][be] Reduce warnings in tests (#108922 ) Summary: att Test Plan: python test/test_quantization.py TestQuantizePT2E Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/108922 Approved by: https://github.com/andrewor14 ghstack dependencies: #108920, #108921	2023-09-12 21:54:33 +00:00
Jerry Zhang	b01b934aca	[quant][be] Cleanup xnnpack_quantizer implementation (#108921 ) Summary: att Test Plan: python test/test_quantization.py TestQuantizePT2E Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/108921 Approved by: https://github.com/andrewor14	2023-09-12 19:28:41 +00:00
Jerry Zhang	241e84bf98	[quant][be] Rewrite xnnpack_quantizer_utils.py to use decorators (#108920 ) Summary: att Test Plan: python test/test_quantization.py TestQuantizePT2E Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/108920 Approved by: https://github.com/kimishpatel	2023-09-12 00:09:13 +00:00
Andrew Or	e8a402c56e	[quant][pt2] Fix and rename `move_model_to_eval` (#108891 ) Summary: This commit fixes two silent correctness problems with the current implementation of `move_model_to_eval`: (1) Previously the user had to manually call `eliminate_dead_code` before calling `move_model_to_eval`, otherwise the dropout pattern won't actually get eliminated. This is because subgraph rewriter complains the match is not self-contained, and so silently does not do the replacement. (2) We wish to error when the user calls `model.train()` or `model.eval()` on an exported model. This error is raised correctly immediately after export today, but no longer raised after the user calls prepare or convert. We fix (1) by moving the `eliminate_dead_code` call into `move_model_to_eval`, and fix (2) by ensuring the respective errors are thrown after prepare and convert as well. Additionally, this commit renames `move_model_to_eval` to `move_exported_model_to_eval` to be more explicit. bypass-github-export-checks Test Plan: python test/test_quantization.py TestQuantizePT2E.test_disallow_eval_train python test/test_quantization.py TestQuantizePT2E.test_move_exported_model_to_eval Imported from OSS Differential Revision: D49097293 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108891 Approved by: https://github.com/jerryzh168	2023-09-11 15:37:01 +00:00
Jerry Zhang	b0de6a8002	[quant][executorch] Support inception_v4 in examples (#108382 ) Summary: Verified that pt2e quant flow matches the fx flow with executorch backend config Test Plan: with-proxy buck2 run executorch/examples/quantization:example -- -m=ic4 --verify ``` [INFO 2023-08-31 16:08:06,923 example.py:77] prepare sqnr: inf [INFO 2023-08-31 16:08:06,932 example.py:81] quant diff max: 0.0 [INFO 2023-08-31 16:08:06,936 example.py:85] quant sqnr: inf ``` full output: https://www.internalfb.com/intern/paste/P818520579/ Differential Revision: D48889075 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108382 Approved by: https://github.com/kimishpatel	2023-09-08 17:39:31 +00:00
Paul Zhang	51c2b587c9	Back out "[PyPer][BE] Fix test_scripted_module in StatCollector" (#108588 ) Differential Revision: D48908507 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108588 Approved by: https://github.com/jerryzh168	2023-09-08 14:33:58 +00:00
Kimish Patel	c1877e99c5	[Quant] Move to BFS instead of DFS to check for connectedness (#108572 ) Summary: Using dfs to check if two nodes are connecgted is making it very slow. Use of BFS makes it much faster. Test Plan: https://gist.github.com/leslie-fang-intel/9cd828623f567a3afbf41564d3546398 Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D48971710](https://our.internmc.facebook.com/intern/diff/D48971710) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108572 Approved by: https://github.com/jerryzh168, https://github.com/osalpekar	2023-09-07 00:26:28 +00:00
Jerry Zhang	32a16d4999	[quant][pt2e] Support int16 quantization (#108453 ) Summary: Previously we can only use native pytorch int dtypes that has corresponding quantized dtypes (e.g. quint8, qint8), this PR removes this assumption in observers/fake_quants so that users can use all pytorch native dtypes (except for int64, we can add it later if need) the main addition here is int16. Test Plan: python test/test_quantization.py TestQuantizePT2E Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/108453 Approved by: https://github.com/kimishpatel	2023-09-06 19:31:20 +00:00
Kimish Patel	ffc0c46092	[Quantization] Add metadata porting for nodes added by quantization (#107107 ) Summary: This diff adds adding metadata to q-dq nodes by inferring the quatization intent from node annotations. Annotations on the node are way for user to specify how a node or subgraph is supposed to be quantized. We continue to use that information to copy metadata on Q/DQ node from appropriate nodes. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D48488416](https://our.internmc.facebook.com/intern/diff/D48488416) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107107 Approved by: https://github.com/jerryzh168 ghstack dependencies: #107105, #107106, #107899, #107900	2023-09-02 06:38:14 +00:00
Kimish Patel	eb67c452c8	[Quant] Add DQ duplication pass (#107900 ) Summary: During convert step observers are first replaced by Q-DQ pair. In some scenarios like following output DQ has a fan out. ---> OP2 -> Q -> DQ / OP -> Q -> DQ - \ ---> OP3 -> Q -> DQ If either op OP2 or OP3 are configured to be quantized, then the input is expected to quantized. In this case quantized equivalent of some pattern, that quantizer asked to be quantized, should look like: [DQ -> {pattern} -> Q]. However, in scenario like above where DQ node is shared between multiple "quantized" patterns, boundary of "quantized" pattern is not clear because DQ now belongs to multiple quantized patterns. This poses challenge for: - Porting metadata: which "quantized" partition this DQ node belongs - Quantized representation, equivalently, needs to identify self-contained quantized pattern that is replaced by its equivalent pattern that captures compute in the quantized precision. Test Plan: test_duplicate_dq_pass Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D48663147](https://our.internmc.facebook.com/intern/diff/D48663147) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107900 Approved by: https://github.com/jerryzh168, https://github.com/andrewor14, https://github.com/leslie-fang-intel ghstack dependencies: #107105, #107106, #107899	2023-09-02 06:20:03 +00:00
Kimish Patel	f8d1ca9835	[Quant] Bug fix (#107899 ) Summary: When two layers are quantized differently, observer map update updates map for key (observed_node, node), whereas it should really be (original_input, node) Test Plan: Test in the next diff adds a test where it otherwise fails Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D48663145](https://our.internmc.facebook.com/intern/diff/D48663145) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107899 Approved by: https://github.com/jerryzh168 ghstack dependencies: #107105, #107106	2023-09-02 06:20:03 +00:00
Kimish Patel	37b0d76e35	[Quantization] Make annotation util functions return annotated nodes (#107106 ) Summary: Having annotation functions return nodes that are annotated is useful specifically for adding "quantization_tag" to those nodes Test Plan: CI Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D48488415](https://our.internmc.facebook.com/intern/diff/D48488415) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107106 Approved by: https://github.com/jerryzh168 ghstack dependencies: #107105	2023-09-02 06:19:55 +00:00
Kimish Patel	99168c1fa9	[Quant] Use input_qspec_map for weight quantization of linear (#107105 ) Summary: In prepararation for metadata porting diff, it is required that weight quant annotation happens via edge quantization, i.e. input_qspec_map. Reason: Metadata is ported via associating DQ node's metadata with its consumer while associating Q node's metadata with its producer. Furthermore, such porting must be qualified via user intent to see if the consumder of DQ, or producer of Q, actually specified intent of quantization By making quantization annotation on linear node's weight via input_qspec_map, we can enable associating DQ of [weight -> Q -> DQ], with the linear module. Test Plan: CI Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D48488414](https://our.internmc.facebook.com/intern/diff/D48488414) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107105 Approved by: https://github.com/jerryzh168	2023-09-02 06:19:50 +00:00
Paul Zhang	4a9c6f1b73	[PyPer][BE] Fix test_scripted_module in StatCollector (#108232 ) Summary: D41985889 removed the cast to int for the inputs to torch.histc below, allowing the inputs to still be tensors. These tensors still have require_grad_ set to True, causing issues with the call to torch.histc. Test Plan: buck2 test 'fbcode//mode/opt' fbcode//dper3/dper3/modules/low_level_modules/tests:stat_collector_test -- --exact 'dper3/dper3/modules/low_level_modules/tests:stat_collector_test - test_scripted_module (dper3.dper3.modules.low_level_modules.tests.stat_collector_test.StatCollectorTest_1)' Differential Revision: D48800879 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108232 Approved by: https://github.com/jerryzh168	2023-09-01 04:23:57 +00:00
Jerry Zhang	a9fe0b5b74	[quant][pt2e] Move propagate_annotation from quant flow to quantizer (#108320 ) Summary: Previously we run propagate_annotation by default in quantization flow to propagate annotations for ops like reshape, view etc. Not all quantizers would need this so we moved this to xnnpack_quantizer_utils for now. Next Step: * make propagate_annotation function configurable with a custom list of ops * remove unneeded ops in `_is_share_obs_or_fq_op` Test Plan: python test/test_quantization.py TestQuantizePT2E Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D48856985](https://our.internmc.facebook.com/intern/diff/D48856985) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108320 Approved by: https://github.com/kimishpatel	2023-09-01 01:49:19 +00:00
leslie-fang-intel	6c342ec368	Revert PR-107951 to only support new graph capture API in Quantization (#108317 ) Summary Revert the changes in https://github.com/pytorch/pytorch/pull/107951 to make the utils function only support graph captured by `capture_pre_autograd_graph`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108317 Approved by: https://github.com/jgong5, https://github.com/jerryzh168 ghstack dependencies: #108214	2023-09-01 00:47:10 +00:00
leslie-fang-intel	fb808c30c7	x86_inductor_quantizer switches to new graph capture API (#108214 ) Summary Update `X86InductorQuantizer` and related testcase to the new graph capture API `capture_pre_autograd_graph`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/108214 Approved by: https://github.com/jgong5, https://github.com/jerryzh168	2023-09-01 00:43:45 +00:00
andrewor14	057b807178	[quant] Move dropout replacement to `move_model_to_eval` (#108184 ) Summary: This commit adds a public facing `torch.ao.quantization.move_model_to_eval` util function for QAT users. Instead of calling model.eval() on an exported model (which doesn't work, see https://github.com/pytorch/pytorch/issues/103681), the user would call this new util function instead. This ensures special ops such as dropout and batchnorm (not supported yet) will have the right behavior when the graph is later used for inference. Note: Support for an equivalent `move_model_to_train` will be added in the future. This is difficult to do for dropout currently because the eval pattern of dropout is simply a clone op, which we cannot just match and replace with a dropout op. Test Plan: python test/test_quantization.py TestQuantizePT2E.test_move_model_to_eval Reviewers: jerryzh168, kimishpatel Subscribers: jerryzh168, kimishpatel, supriyar Differential Revision: [D48814735](https://our.internmc.facebook.com/intern/diff/D48814735) Pull Request resolved: https://github.com/pytorch/pytorch/pull/108184 Approved by: https://github.com/jerryzh168	2023-08-30 16:33:17 +00:00
Jerry Zhang	147b3495e2	[quant][pt2e] Add reference representation for dynamic quantized linear (#108073 ) Summary: att Test Plan: python test/test_quantization.py TestQuantizePT2E.test_representation_dynamic_linear buck2 test 'fbcode//mode/opt' fbcode//caffe2/test:quantization_pt2e -- 'test_representation_dynamic_linear' Reviewed By: kimishpatel Differential Revision: D48703076 Pull Request resolved: https://github.com/pytorch/pytorch/pull/108073 Approved by: https://github.com/andrewor14	2023-08-29 07:12:55 +00:00
Jerry Zhang	9ae3d7ca90	[reland][quant][pt2e][xnnpack_quantizer] Add support for mul and mul_relu (#107930 ) (#107992 ) Summary: att Test Plan: buck2 run executorch/examples/quantization:example -- -m=mv3 --verify Differential Revision: D48588121 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107992 Approved by: https://github.com/digantdesai, https://github.com/mcr229	2023-08-27 14:50:03 +00:00
Xia, Weiwen	e9b0f62a19	[Quant][PT2E] Enable linear and linear-unary post-op quant recipe for x86 inductor quantizer (#106781 ) Summary Add linear and linear-unary post-op quantization recipe to x86 inductor quantizer. For PT2E with Inductor. With this, the quantization path will add `quant-dequant` pattern for linear and linear-unary post op. Test plan python test/test_quantization.py -k test_linear_with_quantizer_api python test/test_quantization.py -k test_linear_unary_with_quantizer_api Pull Request resolved: https://github.com/pytorch/pytorch/pull/106781 Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5, https://github.com/jerryzh168 ghstack dependencies: #105818	2023-08-27 10:50:17 +00:00
leslie-fang-intel	c85c5954f2	[Quant][PT2E]Make _fuse_conv_bn_ support graph capture by torch._dynamo.export (#107951 ) Summary The latest check-in `a0cfaf0688` for the conv-bn folding assumes the graph is captured by the new graph capture API `torch._export.capture_pre_autograd_graph`. Since we still need to use the original graph capture API `torch._dynamo_export` in 2.1 release. So, this check-in made negative impact to workloads' performance heavily. Made this PR to fix this issue by trying to make the conv-bn folding function workable with both new and original graph capture API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/107951 Approved by: https://github.com/jgong5, https://github.com/jerryzh168 ghstack dependencies: #106836, #106838, #106958	2023-08-26 17:19:41 +00:00
leslie-fang-intel	1147a28b0b	[Quant][PT2E] Add cat and avg_pool2d recipe into x86InductorQuantizer (#106836 ) Summary Add `cat` and `avg_pool2d` quantization recipe as input output share observer into `x86InductorQuantizer`. Test Plan ``` clear && python -m pytest test_x86inductor_quantizer.py -k test_cat_recipe clear && python -m pytest test_x86inductor_quantizer.py -k test_cat_recipe_same_inputs clear && python -m pytest test_x86inductor_quantizer.py -k test_cat_recipe_single_input clear && python -m pytest test_x86inductor_quantizer.py -k test_avg_pool2d_recipe ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/106836 Approved by: https://github.com/jgong5, https://github.com/jerryzh168	2023-08-26 16:51:13 +00:00
Jerry Zhang	15d4dedbbf	[quant][pt2e] Add reference representation rewrite for statically quantized linear (#107994 ) Summary: att Test Plan: ``` python test/test_quantization.py TestQuantizePT2E.test_representation_linear buck2 test 'fbcodemode/opt' fbcodecaffe2/test:quantization_pt2e -- 'test_representation_linear' ``` Reviewed By: kimishpatel Differential Revision: D48674862 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107994 Approved by: https://github.com/mcr229, https://github.com/guangy10	2023-08-26 15:39:52 +00:00
leslie-fang-intel	70ca18f8a0	[Quant][PT2E] Enable X86InductorQuantizer single quantizable op(maxpool2d) (#105639 ) Summary In this PR, we mainly enable 2 things. - Enable the skeleton of quantization recipe for single quantizable operators in `X86InductorQuantizer`. - Add quantization recipe of `maxpool2d` and annotate it as input./output share observer. Test Plan ``` python -m pytest test_x86inductor_quantizer.py -k test_maxpool2d_recipe ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/105639 Approved by: https://github.com/jgong5, https://github.com/jerryzh168 ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456	2023-08-26 08:34:15 +00:00
andrewor14	240bdbea61	[quant][pt2e] Fix annotation for conv no bias case (#107971 ) Summary: This fixes the no bias case for conv annotations. Previously this would result in an index out of bounds, since the new aten.conv2d op may not have the bias arg (unlike the old aten.convolution op). This was not caught because of a lack of test cases, which are added in this commit. Test Plan: python test/test_quantization.py TestQuantizePT2E.test_qat_conv_no_bias python test/test_quantization.py TestQuantizePT2E.test_qat_conv_bn_relu_fusion_no_conv_bias Reviewers: jerryzh168, kimishpatel Subscribers: jerryzh168, kimishpatel Differential Revision: [D48696874](https://our.internmc.facebook.com/intern/diff/D48696874) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107971 Approved by: https://github.com/jerryzh168	2023-08-26 01:01:54 +00:00
Jerry Zhang	f92f69dbfb	[quant][pt2e] Enable testing for reference quant model representations (#107474 ) Summary: Previously these tests were disabled due to time out in dynamo export in fbcode, this might have been resolved, so trying to enable again Test Plan: python test/test_quantization.py TestQuantizePT2E Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D48619072](https://our.internmc.facebook.com/intern/diff/D48619072) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107474 Approved by: https://github.com/andrewor14	2023-08-26 00:37:45 +00:00
PyTorch MergeBot	8d44b0f5a5	Revert "[quant][pt2e][xnnpack_quantizer] Add support for mul and mul_relu (#107930 )" This reverts commit `1d1739dc6d`. Reverted https://github.com/pytorch/pytorch/pull/107930 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/107930#issuecomment-1694069330))	2023-08-26 00:37:02 +00:00
Jerry Zhang	1d1739dc6d	[quant][pt2e][xnnpack_quantizer] Add support for mul and mul_relu (#107930 ) Summary: att Test Plan: buck2 run executorch/examples/quantization:example -- -m=mv3 --verify Differential Revision: D48588121 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107930 Approved by: https://github.com/kimishpatel	2023-08-25 23:36:19 +00:00
leslie-fang-intel	1374974d60	[Quant][Inductor] Enable quantization conv_binary(add/add_relu) pattern fusion inside inductor (#105456 ) Summary Enable the `dequant-conv2d-binary_postop(add)-unary_postop(relu)-quant` pattern fusion and lowering inside inductor. Test Plan ``` clear && python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d_binary ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/105456 Approved by: https://github.com/jgong5, https://github.com/eellison ghstack dependencies: #104580, #104581, #104588, #104590, #105455	2023-08-25 21:16:02 +00:00
Jerry Zhang	a0cfaf0688	[quant][pt2e] Make sure XNNPACKQuantizer works with the pre_dispatch=True (#107872 ) Summary: att Test Plan: ``` buck test //executorch/backends/xnnpack/test:test_xnnpack_quantized_models -- test_resnet18 buck2 test 'fbcode//mode/opt' fbcode//caffe2/test:quantization_pt2e ``` Reviewed By: andrewor14, tugsbayasgalan Differential Revision: D48415977 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107872 Approved by: https://github.com/andrewor14	2023-08-25 05:04:01 +00:00
Kimish Patel	2fbe6ef2f8	[pytorch][Quant] Fix bias quant bug (#107810 ) Summary: Bias should be quantized by act_scale * weight_scale in conv and linear Test Plan: Rewrite tests Differential Revision: D48606828 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107810 Approved by: https://github.com/jerryzh168	2023-08-24 23:44:19 +00:00
Jerry Zhang	16fcb07846	[quant][pt2e] Add support for channel in DerivedQuantizationSpec (#107833 ) Summary: att Test Plan: python test/test_quantization.py TestQuantizePT2E.test_derived_qspec_per_channel Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D48630535](https://our.internmc.facebook.com/intern/diff/D48630535) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107833 Approved by: https://github.com/andrewor14	2023-08-24 07:45:13 +00:00
Sherlock Huang	ee4b99cc3a	Decomp for aten.dropout (#106274 ) When exporting dropout with cpu tensor, we get following graph module ``` class GraphModule(torch.nn.Module): def forward(self, arg0_1: f32[512, 10]): empty_memory_format: f32[512, 10] = torch.ops.aten.empty.memory_format([512, 10], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False, memory_format = torch.contiguous_format) bernoulli_p: f32[512, 10] = torch.ops.aten.bernoulli.p(empty_memory_format, 0.9); empty_memory_format = None div_scalar: f32[512, 10] = torch.ops.aten.div.Scalar(bernoulli_p, 0.9); bernoulli_p = None mul_tensor: f32[512, 10] = torch.ops.aten.mul.Tensor(arg0_1, div_scalar); arg0_1 = div_scalar = None return (mul_tensor,) ``` In addition, if we export with eval() mode, we will have an empty graph. However, when exporting with cuda tensor, we got ``` class GraphModule(torch.nn.Module): def forward(self, arg0_1: f32[512, 10]): native_dropout_default = torch.ops.aten.native_dropout.default(arg0_1, 0.1, True); arg0_1 = None getitem: f32[512, 10] = native_dropout_default[0]; native_dropout_default = None return (getitem,) ``` and exporting under eval() mode will still have a dropout node in graph. This PR make exporting with CPU tensor also produce aten.native_dropout. Pull Request resolved: https://github.com/pytorch/pytorch/pull/106274 Approved by: https://github.com/ezyang	2023-08-23 21:12:37 +00:00
Aaron Gokaslan	660e8060ad	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings. I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-22 23:16:38 +00:00
PyTorch MergeBot	d59a6864fb	Revert "[BE]: Update ruff to 0.285 (#107519 )" This reverts commit `88ab3e4322`. Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please hep them get unblocked? It seems like one of the strings was prob accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))	2023-08-22 19:53:32 +00:00
Aaron Gokaslan	88ab3e4322	[BE]: Update ruff to 0.285 (#107519 ) This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings. I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :) Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519 Approved by: https://github.com/ezyang	2023-08-20 01:36:18 +00:00
Jerry Zhang	28be2c674a	[quant][pt2e] Move specific quantizer related things outside of main quant code base (#106806 ) (#107259 ) Summary: Currently in quantizer/quantize_pt2e we import things from specific quantizers (XNNPACKQuantizer, QuantizationConfig) etc. this PR removes them so it's clearer that they are not part of the core quantization code base This PR also removed get_supported_operators from main Quantizer since we haven't seen a clear need for this API Test Plan: CIs Imported from OSS Differential Revision: D48340367 Pull Request resolved: https://github.com/pytorch/pytorch/pull/107259 Approved by: https://github.com/kimishpatel	2023-08-18 21:29:09 +00:00
Jerry Zhang	d3c4ec767b	[quant][pt2e] Fix handling for SharedQuantizationSpec (#106922 ) Summary: Previously if we have: ``` conv1 -> cat conv2 / ``` and configure output of conv1/conv2 to be int8 quantized, and cat also int8 quantized and with shared inputs, it will not produce expected results (input of cat will not be shared) The problem is that there is some missing checks when inserting observers for input for cat This PR fixes the problem. Fixes: https://github.com/pytorch/pytorch/issues/106760 Test Plan: python tes/test_quantization.py TestQuantzePT2E.test_shared_qspec Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/106922 Approved by: https://github.com/kimishpatel	2023-08-16 21:16:45 +00:00
Jiaxu Zhu	152203d3c3	[pytorch][ao] Add `torch.matmul` in FloatFunctional/QFunctional (#106831 ) Summary: As title Test Plan: new unit tests Differential Revision: D48172841 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106831 Approved by: https://github.com/jerryzh168	2023-08-10 22:43:36 +00:00
Jerry Zhang	4afab40b56	[quant][pt2e] Removed mean/hardtanh annotations and refactored adaptive_avg_pool annotation (#106805 ) Summary: Removed annotations for some ops, since they are handled in torch/ao/quantization/pt2e/_propagate_annotation.py Test Plan: CIs Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/106805 Approved by: https://github.com/kimishpatel	2023-08-10 04:51:06 +00:00
Jerry Zhang	97ce979e5d	[quant][pt2e] Add reference representation for quantized conv2d (#105784 ) Summary: Implementing reference representation for quantized ops we decided in https://docs.google.com/document/d/17h-OEtD4o_hoVuPqUFsdm5uo7psiNMY8ThN03F9ZZwg/edit#heading=h.ov8z39149wy8 Test Plan: python test/test_quantization.py TestQuantizePT2E.test_representation_quantize_dequantize_per_channel Although right now it is not really testing things since there is some problem with dynamo export Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/105784 Approved by: https://github.com/kimishpatel ghstack dependencies: #105783	2023-08-09 22:41:35 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	a44c072c89	Make InternalModel and Resnet work with rexportable flow (#106676 ) Summary: Internal model and Resnet uses "re-export" flow now. Also did some refactoring to make the code little cleaner Some changes for OSS: 1. Correctly use the "cached" fake tensors so that static symbols are still resolved to static 2. Change logic in PassBase to allocate static shapes for parameters 3. Add "is_torch_exported" tag to every node to make it survive during various graph transformations. 4. Added experimental wrapper API for quantization team to get pre_dispatch=True graph. Note that it doesn't actually do that right now. But we plan to switch soon. Test Plan: CI Differential Revision: D47890878 Pull Request resolved: https://github.com/pytorch/pytorch/pull/106676 Approved by: https://github.com/jerryzh168	2023-08-09 20:10:48 +00:00
Jerry Zhang	e1a1780626	[quant][pt2e] Move annotate functions in XNNPACKQuantizer to utils (#106642 ) Summary: This is to allow sharing these annotate functions by other quantizers so that writing a new quantizer is easier note that these annotation functions will be maintained by XNNPACKQuantizer developers instead of AO team Test Plan: python test/test_quantization.py TestQuantizePT2E Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/106642 Approved by: https://github.com/andrewor14	2023-08-09 18:52:39 +00:00
Jerry Zhang	69ecad6f2b	[quant][pt2e] Add reference representation for quantize_per_channel and dequantize_per_channel (#105783 ) Summary: Implementing reference representation for quantized ops we decided in https://docs.google.com/document/d/17h-OEtD4o_hoVuPqUFsdm5uo7psiNMY8ThN03F9ZZwg/edit#heading=h.ov8z39149wy8 Test Plan: python test/test_quantization.py TestQuantizePT2E.test_representation_quantize_dequantize_per_channel Although right now it is not really testing things since there is some problem with dynamo export Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/105783 Approved by: https://github.com/kimishpatel	2023-08-09 01:39:52 +00:00

1 2 3 4 5 ...

1006 Commits