This does not introduce a new test; it is covered by checking that all the classes we already have still behave as before now that they no longer explicitly disable torch_function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120632
Approved by: https://github.com/ezyang
Summary: Today we don't allow free functions to be the tracing callable in torch.export. As part of migrating capture_preautograd_graph usages to torch.export, we need to ban free functions from capture_preautograd_graph as well.
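As a rough illustration of the workaround this implies (a sketch, not code from this PR; the `Wrapper` module is hypothetical), a free function can be wrapped in an `nn.Module` before being handed to `torch.export.export`:
```
import torch

def free_fn(x):
    return x + 1

# Free functions are no longer accepted as the top-level tracing callable;
# wrap them in an nn.Module instead.
class Wrapper(torch.nn.Module):
    def forward(self, x):
        return free_fn(x)

ep = torch.export.export(Wrapper(), (torch.randn(2),))
print(ep)
```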
Test Plan: CI
Differential Revision: D54319597
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120817
Approved by: https://github.com/zhxchen17, https://github.com/andrewor14
Summary: This commit adds the `model_is_exported` util function
so users can easily tell which APIs to call to move their models
between train and eval modes. It also has the advantage of hiding
how we detect that a model is exported, in case the metadata format
changes in the future.
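A minimal sketch of the intended usage, assuming `model_is_exported` and `move_exported_model_to_eval` are importable as shown (the exact import paths are assumptions and may differ):
```
import torch
from torch.ao.quantization.pt2e.export_utils import model_is_exported
from torch.ao.quantization import move_exported_model_to_eval

def set_eval(model: torch.nn.Module) -> torch.nn.Module:
    # Exported graph modules need the dedicated helper; eager models can
    # keep using .eval().
    if model_is_exported(model):
        return move_exported_model_to_eval(model)
    return model.eval()
```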
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_model_is_exported
Differential Revision: [D53812972](https://our.internmc.facebook.com/intern/diff/D53812972)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119726
Approved by: https://github.com/tugsbayasgalan, https://github.com/albanD
Summary:
There was a bug in the module name filter for modules whose names already contained an underscore:
the underscore was replaced with "dot" notation.
This happened because underscores were assumed to always mark a module separator,
which isn't the case for modules whose names themselves contain an underscore.
Test Plan:
Added a unit test. Before this change, that test failed (due to applying the wrong
qscheme). Now it passes.
Differential Revision: D53502771
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119344
Approved by: https://github.com/jerryzh168
Summary:
This is a follow up to https://github.com/pytorch/pytorch/pull/118605 to remove `fold_quantize` flag from
`convert_pt2e`
Test Plan: CI
Differential Revision: D53247301
BC Breaking Note:
flag `fold_quantize` is now set to True by default in `convert_pt2e`, and the quantize op on weights is folded by default, so users will see a model size reduction by default after PT2E quantization.
In 2.2:
```
folded_model = convert_pt2e(model, fold_quantize=True)
non_folded_model = convert_pt2e(model)
```
In 2.3:
```
folded_model = convert_pt2e(model)
non_folded_model = convert_pt2e(model, fold_quantize=False)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118701
Approved by: https://github.com/andrewor14, https://github.com/leslie-fang-intel
Summary: This commit adds a util for PT2E quantization users
to call `model.train()` and `model.eval()` without error.
Instead of erroring, these calls now automatically invoke the
equivalent `move_exported_model_to_train/eval` for the user, which
only switches behavior for special ops like dropout and batchnorm.
This makes it easier for users to onboard to the PT2E flow.
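A minimal sketch of how this might look, assuming the util is exposed as `torch.ao.quantization.allow_exported_model_train_eval` and the model is captured with `capture_pre_autograd_graph` (both import paths are assumptions):
```
import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization import allow_exported_model_train_eval

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.dropout = torch.nn.Dropout(0.5)

    def forward(self, x):
        return self.dropout(x)

exported = capture_pre_autograd_graph(M(), (torch.randn(2, 2),))

# Without this call, .train()/.eval() on an exported model would error.
allow_exported_model_train_eval(exported)
exported.train()  # switches special ops (e.g. dropout) to training behavior
exported.eval()   # switches them back to inference behavior
```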
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_allow_exported_model_train_eval
Reviewers: jerryzh168, tugsbayasgalan, zhxchen17
Subscribers: jerryzh168, tugsbayasgalan, zhxchen17, supriyar
Differential Revision: [D53426636](https://our.internmc.facebook.com/intern/diff/D53426636)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119091
Approved by: https://github.com/jerryzh168, https://github.com/tugsbayasgalan, https://github.com/zhxchen17
Summary: This is the equivalent API to `model.train()` for
exported models, analogous to `move_exported_model_to_eval`.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_move_exported_model_dropout
python test/test_quantization.py TestQuantizePT2E.test_move_exported_model_dropout_inplace
python test/test_quantization.py TestQuantizePT2E.test_move_exported_model_dropout_bn
Reviewers: jerryzh168, kimishpatel
Subscribers: jerryzh168, kimishpatel, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113492
Approved by: https://github.com/jerryzh168, https://github.com/tugsbayasgalan
Summary:
When `version` is missing in the metadata, use `min_val/max_val` as keys instead of `max_vals/min_vals`.
## Reasons
1. It's been almost 2 years since the change in D30003700, which means most checkpoints now use the `min_val/max_val` keys.
2. Most checkpoints dumped via `model.state_dict()` don't carry version info, which leads to a spurious `missing keys` error when loading the state_dict.
Test Plan: CI
Differential Revision: D53233012
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118659
Approved by: https://github.com/jerryzh168
Simplifies and optimizes dict construction using the `fromkeys` classmethod ctor. This also makes it really obvious when all the keys will have the same static value, which could be a bug if unintentional. It is also significantly faster than using a dict comprehension. The rule is in preview, but I am adding a forward fix for when it becomes stable.
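For illustration (plain Python, not code from this PR):
```
keys = ["a", "b", "c"]

# Equivalent results; fromkeys is clearer and faster when every key gets
# the same static value.
d1 = {k: 0 for k in keys}
d2 = dict.fromkeys(keys, 0)
assert d1 == d2

# Caveat: with a mutable default, all keys share the *same* object.
shared = dict.fromkeys(keys, [])
shared["a"].append(1)
assert shared["b"] == [1]
```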
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118637
Approved by: https://github.com/albanD
The `torch.jit.quantized` interface has been deprecated since #40102 (June 2020).
BC-breaking message:
All functions and classes under `torch.jit.quantized` will now raise an error if
called/instantiated. This API has long been deprecated in favor of
`torch.ao.nn.quantized`.
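A hedged sketch of the replacement path, using the maintained dynamic-quantization API rather than the removed `torch.jit.quantized` entry points:
```
import torch

model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU())

# Dynamic quantization via torch.ao, instead of torch.jit.quantized.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
print(quantized)
```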
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118406
Approved by: https://github.com/jerryzh168
Summary:
These dtypes are added since we are seeing more demand for these sub-byte dtypes, especially with
the popularity of LLMs (https://pytorch.org/blog/accelerating-generative-ai-2/#step-4-reducing-the-size-of-the-weights-even-more-with-int4-quantization-and-gptq-2021-toks)
Note these are just placeholders; the operator support for these dtypes will be implemented with tensor subclasses.
e.g. torch.empty(..., dtype=torch.uint1) will return a tensor subclass of uint1 that supports different operations like bitwise ops, add, mul etc. (will be added later)
Also note that these are not quantized data types; we'll implement quantization logic with tensor subclasses backed by these dtypes as well.
e.g. `Int4GroupedQuantization(torch.Tensor)` will be implemented with torch.uint4 tensors (see https://github.com/pytorch-labs/ao/pull/13 as an example)
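A minimal check that the placeholder dtypes exist (operator support is not implemented at this point, as noted above):
```
import torch

# Placeholder sub-byte dtypes; operations on tensors of these dtypes are
# not supported yet.
for name in ("uint1", "uint2", "uint3", "uint4", "uint5", "uint6", "uint7"):
    print(name, getattr(torch, name))
```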
Test Plan:
CIs
python test/test_quantization.py -k test_uint1_7_dtype
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117208
Approved by: https://github.com/ezyang
**Description**
Add dynamic quantization config for x86 inductor backend.
To support the QKV structure in self-attention, we removed an assertion in the port-metadata pass that required a single dequantize node after each quantize node.
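A rough sketch of how the dynamic config might be requested in the PT2E flow (the `is_dynamic` keyword and exact import paths are assumptions based on this description, not confirmed API):
```
import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.x86_inductor_quantizer import (
    X86InductorQuantizer,
    get_default_x86_inductor_quantization_config,
)

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.linear(x)

example_inputs = (torch.randn(2, 8),)
m = capture_pre_autograd_graph(M(), example_inputs)

quantizer = X86InductorQuantizer()
# is_dynamic=True is an assumed flag for requesting the new dynamic config.
quantizer.set_global(get_default_x86_inductor_quantization_config(is_dynamic=True))

m = prepare_pt2e(m, quantizer)
m(*example_inputs)  # run example inputs through the prepared model
m = convert_pt2e(m)
```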
**Test plan**
```
python test/test_quantization.py -k TestQuantizePT2EX86Inductor.test_dynamic_quant_linear
python test/test_quantization.py -k TestQuantizePT2EX86Inductor.test_qat_dynamic_quant_linear
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115337
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
Summary: fixed an import problem for test_xnnpack_quantizer so that it can run in CI
Test Plan:
internal CI
sanity check: buck2 test 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- --exact 'caffe2/test/quantization:test_quantization - test_conv2d (caffe2.test.quantization.pt2e.test_xnnpack_quantizer.TestXNNPACKQuantizer)'
Differential Revision: D52576449
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116911
Approved by: https://github.com/mcr229
Updates flake8 to v6.1.0 and fixes a few lints using sed and some ruff tooling.
- Replace `assert(0)` with `raise AssertionError()`
- Remove extraneous parentheses, e.g.
- `assert(a == b)` -> `assert a == b`
- `if(x > y or y < z):`->`if x > y or y < z:`
- And `return('...')` -> `return '...'`
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116591
Approved by: https://github.com/albanD, https://github.com/malfet
Summary:
We introduced `node.meta["numeric_debug_handle"]` in https://github.com/pytorch/pytorch/pull/114315 to
indicate the numeric debug handle for values in the graph. In this PR we preserve this field
through prepare and convert so that it can be used for numerical debugging.
Next: we also want to preserve it through deepcopy of GraphModule.
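A minimal sketch of how the preserved metadata could be collected from a prepared/converted graph (illustrative only; `collect_debug_handles` is a hypothetical helper):
```
from torch.fx import GraphModule

def collect_debug_handles(gm: GraphModule) -> dict:
    # Map node name -> numeric_debug_handle for nodes that carry one,
    # e.g. on a GraphModule that went through prepare_pt2e/convert_pt2e.
    return {
        node.name: node.meta["numeric_debug_handle"]
        for node in gm.graph.nodes
        if "numeric_debug_handle" in node.meta
    }
```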
Test Plan:
python test/test_quantization.py -k test_quantize_pt2e_preserve_handle
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116477
Approved by: https://github.com/tugsbayasgalan
Summary:
`_fold_conv_bn_qat` currently takes a long time, so we skip it when it's not necessary.
Follow-up fixes can actually reduce the patterns, or cache them where possible.
Test Plan:
uncomment the print in `test_speed`, run
python test/test_quantization.py -k test_speed
and make sure the convert time is low, e.g. 0.1s instead of 8-9 seconds
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116440
Approved by: https://github.com/andrewor14
This pull request primarily addresses two issues to resolve the `QConvPointWiseBinaryPT2E` layout problem:
- Following the changes made in 611a7457ca, for `QConvPointWiseBinaryPT2E` with post-op `sum`, we should also use `NoneLayout` and return `accum` instead of `QConvPointWiseBinaryPT2E`.
- Additionally, this pull request fixes an issue in the `_quantized_convolution_onednn` implementation. Since we expect `accum` to be changed in place, we should avoid copying `accum` by changing its memory format or data type inside the kernel implementation. Instead, the necessary memory format or data type changes have been moved to the lowering of `QConvPointWiseBinaryPT2E`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115613
Approved by: https://github.com/jgong5, https://github.com/oulgen
ghstack dependencies: #116172
**Summary**
Re-land https://github.com/pytorch/pytorch/pull/115329. Opening a new PR since the original branch has been deleted.
Change the QConv2d Binary fusion post op name from `add` to `sum`, since we are actually using OneDNN `post op sum` instead of `Binary_Add` for now.
**TestPlan**
```
python -m pytest test_quantized_op.py -k test_qconv2d_sum_pt2e
python -m pytest test_quantized_op.py -k test_qconv2d_sum_relu_pt2e
python -m pytest test_quantized_op.py -k test_qconv2d_sum_relu_float_output_pt2e
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116172
Approved by: https://github.com/kit1980
**Summary**
Change the QConv2d Binary fusion post op name from `add` to `sum`, since we are actually using OneDNN `post op sum` instead of `Binary_Add` for now.
**TestPlan**
```
python -m pytest test_quantized_op.py -k test_qconv2d_sum_pt2e
python -m pytest test_quantized_op.py -k test_qconv2d_sum_relu_pt2e
python -m pytest test_quantized_op.py -k test_qconv2d_sum_relu_float_output_pt2e
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115329
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
AOTInductor currently relies on export_to_torch_ir to generate a graph, and passes it to Inductor to generate the .so. They would like the FQNs to be consistent so that they can easily find/update the weights in the .so.
Note that since export flattens all modules into a single computational graph, we change the FQNs in the original module by replacing all periods with underscores. For example, `foo.child1param`, which points to a submodule named `foo`'s parameter named `child1param`, will be renamed to `foo_child1param` since we no longer have the submodule `foo`. This is done simply with `name.replace(".", "_")`.
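A small sketch of the renaming rule described above (plain string manipulation; `flatten_fqn` is a hypothetical helper name):
```
def flatten_fqn(fqn: str) -> str:
    # "foo.child1param" (parameter child1param on submodule foo) becomes
    # "foo_child1param" in the flattened exported graph.
    return fqn.replace(".", "_")

assert flatten_fqn("foo.child1param") == "foo_child1param"
```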
Generated AOTInductor C++ code: https://www.internalfb.com/phabricator/paste/view/P900120950?lines=377-355%2C354
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115462
Approved by: https://github.com/tugsbayasgalan
Constant-time access of the first value in a collection. This is a constant-time operation, unlike converting the collection to a list just to get the first item, which is linear. The rule is enabled, so it is autofixed and enforced automatically.
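For illustration (plain Python):
```
d = {"first": 1, "second": 2, "third": 3}

# Linear: materializes the whole collection just to read one element.
first_slow = list(d)[0]

# Constant time: take the first item straight from the iterator.
first_fast = next(iter(d))

assert first_slow == first_fast == "first"
```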
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115507
Approved by: https://github.com/malfet
Summary: This PR does 2 things:
1) Previously this would simply error; now it ignores any
torch.inf values that it receives. Note: the code checks for torch.inf after
aminmax, so if no torch.inf values are found, perf is
relatively unchanged.
2) As mentioned in https://github.com/pytorch/pytorch/issues/100051,
values close to (but not quite at) the maximum/minimum float value could
overflow to infinity in the course of _adjust_min_max() (when such a large
value is multiplied by something in the middle of a calculation
that would otherwise produce a non-inf value). This was fixed by
rearranging the order of operations for the lines in question without
altering the actual equations. Specifically, where the operations in lines
1095, 1098 and 1100 mix multiplication and division of large values,
it is better to divide the two large values before multiplying, rather
than multiplying the two large values together (creating overflow) before dividing, as was done previously (see the toy float32 sketch below).
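A toy float32 illustration of the reordering idea (not the actual observer code):
```
import torch

big = torch.tensor(3.0e38, dtype=torch.float32)  # near the float32 max (~3.4e38)
num, den = 2.0, 8.0

# Multiplying first pushes the intermediate past the float32 max -> inf.
print(big * num / den)  # inf

# Dividing first keeps every intermediate finite.
print(big / den * num)  # 7.5e37
```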
Test Plan:
python test/test_quantization.py TestObserver.test_histogram_observer_ignore_infinity
python test/test_quantization.py TestObserver.test_histogram_observer_handle_close_to_infinity
Differential Revision: [D51489345](https://our.internmc.facebook.com/intern/diff/D51489345)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103467
Approved by: https://github.com/andrewor14
Summary:
This is to allow easier extension of the quant workflow in the future, as we are seeing more
diverse ways of doing quantization.
Putting this up for feedback first.
Test Plan:
python test/test_quantization.py TestQuantizePT2E.test_observer_callback
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115001
Approved by: https://github.com/kimishpatel