This is a follow-up of #165037. It is generally recommended to use `is`/`is not` to compare types. This series of changes applies that suggestion across the code base, with the aim of eventually enabling the related linter checks.
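For illustration, a minimal sketch of the comparison style being adopted (the classes here are hypothetical examples, not code from the PR):
```python
class Base: ...
class Child(Base): ...

x = Child()

# Exact-type checks should use identity comparison rather than `==`;
# linters such as flake8's E721 flag the `==` form.
print(type(x) == Child)      # True, but flagged by the linter
print(type(x) is Child)      # True - preferred exact-type check
print(type(x) is Base)       # False - subclasses are not matched
print(isinstance(x, Base))   # True - use isinstance() when subclasses are fine
```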
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165142
Approved by: https://github.com/albanD
To support https://github.com/pytorch/ao/issues/2228
> What we want to do now is to enable FP8 quantization in PyTorch. And similar as INT8 quantization, we need to insert quantize and dequantize ops into the graph.
>
> However we met problems with these q/dq ops both in the PyTorch core and Torchao.
>
> PyTorch core:
>
> The quantize_per_tensor op does not support FP8. We want to fix it via https://github.com/pytorch/pytorch/pull/153601. And as you commented, the op is deprecated.
> Torchao:
>
> In the fusion pass in Inductor, we want to match the pattern fp8_weight -> torchao.dequantize_affine_float8 -> fp32_op and fuse it as fp8_weight -> weight_pack -> fp8_op. We have done so for INT8 PT2E quantization. However, the pattern matching pass is applied after a constant folding pass in Inductor:
> 100ec0b34a/torch/_inductor/fx_passes/freezing_patterns.py (L69C1-L74C1)
> After constant_fold(gm), the pattern will be folded as fp32_weight -> fp32_op. Then the original pattern cannot be found any more and the FP8 semantics is lost since the pattern is entirely in fp32 now.
> For INT8, the int8_weight -> quantized_decomposed.dequantize_per_channel -> fp32_op pattern won't be folded because we mark quantized_decomposed.dequantize_per_channel impure so that it won't be folded: 100ec0b34a/torch/_inductor/constant_folding.py (L139C1-L149C1). But for the torchao.dequantize_affine_float8, we cannot do this because:
> - It is an op from Torchao, which is unknown to the constant folder
> - It is decomposed to smaller ops, so we cannot put it in the list as a single op.
> So, we think an easy and short-term solution is to modify the ops in PyTorch core via https://github.com/pytorch/pytorch/pull/153601.
> However, if we want to resolve the issue with Torchao, we need to:
> - Add a method in the constant folder in Inductor to allow registration of impure ops
Based on [Jansel's reply](https://github.com/pytorch/ao/issues/2228#issuecomment-2914560340), this patch adds a "don't constant fold" flag.
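A minimal sketch of the idea, assuming a registry-style hook; the names below are hypothetical illustrations, not the actual flag added by the PR:
```python
import torch

# Hypothetical registry; the real PR exposes this differently.
_DONT_CONSTANT_FOLD: set = set()

def register_dont_constant_fold(op) -> None:
    """Mark `op` as impure so the constant folder never folds it away."""
    _DONT_CONSTANT_FOLD.add(op)

def is_impure(node: torch.fx.Node) -> bool:
    # Called from the folder: a registered op blocks folding of itself and,
    # transitively, of the dequantize pattern built on top of it.
    return node.op == "call_function" and node.target in _DONT_CONSTANT_FOLD
```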
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154945
Approved by: https://github.com/jansel
Co-authored-by: Jason Ansel <jansel@jansel.net>
To support https://github.com/pytorch/ao/issues/2228
> What we want to do now is to enable FP8 quantization in PyTorch. And similar as INT8 quantization, we need to insert quantize and dequantize ops into the graph.
>
> However we met problems with these q/dq ops both in the PyTorch core and Torchao.
>
> PyTorch core:
>
> The quantize_per_tensor op does not support FP8. We want to fix it via https://github.com/pytorch/pytorch/pull/153601. And as you commented, the op is deprecated.
> Torchao:
>
> In the fusion pass in Inductor, we want to match the pattern fp8_weight -> torchao.dequantize_affine_float8 -> fp32_op and fuse it as fp8_weight -> weight_pack -> fp8_op. We have done so for INT8 PT2E quantization. However, the pattern matching pass is applied after a constant folding pass in Inductor:
> 100ec0b34a/torch/_inductor/fx_passes/freezing_patterns.py (L69C1-L74C1)
> After constant_fold(gm), the pattern will be folded as fp32_weight -> fp32_op. Then the original pattern cannot be found any more and the FP8 semantics is lost since the pattern is entirely in fp32 now.
> For INT8, the int8_weight -> quantized_decomposed.dequantize_per_channel -> fp32_op pattern won't be folded because we mark quantized_decomposed.dequantize_per_channel impure so that it won't be folded: 100ec0b34a/torch/_inductor/constant_folding.py (L139C1-L149C1). But for the torchao.dequantize_affine_float8, we cannot do this because:
> - It is an op from Torchao, which is unknown to the constant folder
> - It is decomposed to smaller ops, so we cannot put it in the list as a single op.
> So, we think an easy and short-term solution is to modify the ops in PyTorch core via https://github.com/pytorch/pytorch/pull/153601.
> However, if we want to resolve the issue with Torchao, we need to:
> - Add a method in the constant folder in Inductor to allow registration of impure ops
Based on [Jansel's reply](https://github.com/pytorch/ao/issues/2228#issuecomment-2914560340), this patch adds a "don't constant fold" flag.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154945
Approved by: https://github.com/leslie-fang-intel, https://github.com/jansel
Co-authored-by: Jason Ansel <jansel@jansel.net>
Summary:
Do not fold torchbind objects in constant folding.
Any operation on a torchbind object can have arbitrary side effects, so we can't safely constant fold anything that touches a torchbind object anyway.
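A hedged sketch of the kind of guard this implies (the helper name is illustrative, not the exact code):
```python
import torch

def _is_foldable_constant(value) -> bool:
    # Torchbind objects (torch.ScriptObject) are opaque and may have side
    # effects, so anything touching them is excluded from constant folding.
    if isinstance(value, torch.ScriptObject):
        return False
    return isinstance(value, torch.Tensor)
```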
Test Plan:
```
buck run fbcode//mode/dev-nosan //caffe2/test/inductor:torchbind -- -r aot_compile_constant_folding
```
Reviewed By: angelayi
Differential Revision: D69946541
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148993
Approved by: https://github.com/angelayi
Summary:
There are 2 issues:
- `skip_folding_node_fn` isn't considered when propagating constant values. Given a skipped node with constant inputs, the node is treated as producing a constant, so its users can also produce constants and end up in the constant graph, even though the skipped node itself is excluded when the constant graph is extracted. This is fixed by checking for skipped nodes during constant propagation and making a skipped node produce an unknown (non-constant) value, so that its users cannot produce constants (a sketch of this check appears after this list).
- The `fba_linear` op can be included in the constant graph, but it is not implemented for CPU, so the constant graph cannot be executed. This is fixed by converting `fba_linear` to `aten.addmm`.
- A refactor to allow more fba_ops to be included in the constant graph (via mapping fba_ops to aten ops).
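A hedged sketch of the first fix, assuming a ConstantFolder-style interpreter with an unknown-value sentinel (simplified from the description above):
```python
import torch

class SkipAwareFolder(torch.fx.Interpreter):
    def __init__(self, gm, skip_folding_node_fn=None):
        super().__init__(gm)
        self.skip_folding_node_fn = skip_folding_node_fn
        self.unknown_value = object()  # sentinel: "not a constant"

    def run_node(self, node):
        # A skipped node must not be treated as constant, even if all of its
        # inputs are constants; otherwise its users get pulled into the
        # constant graph while the skipped node itself is left out.
        if self.skip_folding_node_fn and self.skip_folding_node_fn(node):
            return self.unknown_value
        return super().run_node(node)
```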
Reviewed By: StellarrZ
Differential Revision: D68716393
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146948
Approved by: https://github.com/zhxchen17
Summary: The current implementation introduces a compile-time regression due to the overhead of hashing large constants. To support freezing+caching, we consider only the tensor metadata of frozen params, but we neglect to do the same for any constants created as a result of folding frozen params. This PR explicitly marks the constants created during freezing (and during constant folding while freezing) and uses that info in the Inductor cache to decide when to hash a tensor's value+metadata vs. metadata only.
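Roughly, the policy described above looks like the following sketch (the flag and helper names are assumptions, not Inductor's actual cache internals):
```python
import hashlib
import torch

def constant_cache_key(t: torch.Tensor, created_by_freezing: bool) -> str:
    meta = (tuple(t.shape), tuple(t.stride()), str(t.dtype), str(t.device))
    if created_by_freezing:
        # Frozen params and constants folded from them: metadata only, so
        # large weights are not hashed at compile time.
        payload = repr(meta)
    else:
        # Ordinary small constants: include the values so distinct constants
        # do not collide on metadata alone.
        payload = repr(meta) + repr(t.detach().cpu().flatten().tolist())
    return hashlib.sha256(payload.encode()).hexdigest()
```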
Test Plan: `python benchmarks/dynamo/torchbench.py --backend inductor --device cuda --only alexnet --bfloat16 --cold-start-latency --print-compilation-time --inference --performance --freezing`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145868
Approved by: https://github.com/eellison
**Description**
For `linear_dynamic_fp16`, we insert `quantize` and `dequantize` between x/w and linear to have the following pattern:
```
x
|
linear <- to_fp32 <- to_fp16 <- w
```
In Inductor, the pattern we finally see will be
```
fp32 activation
|
(reshape)
|
mm/addmm <- t <- to_fp32 <- to_fp16 <- weight
|
(reshape)
```
Or
```
fp32 activation
|
expand
|
bmm <- expand <- t <- to_fp32 <- to_fp16 <- weight
|
(add)
```
The second pattern is for the case where x.ndim > 2 and x is not contiguous; the first pattern is for the other cases.
Fusing the pattern with weight prepacking, we get
```
fp32 activation
|
onednn.linear_dynamic_fp16 <- onednn.linear_prepack_fp16 <- weight
```
After freezing, the prepack op is gone.
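For reference, a hedged eager-mode sketch of the computation the matched pattern decomposes to (not the quantizer API itself):
```python
import torch

def linear_dynamic_fp16_reference(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # The weight is stored in fp16 and converted back to fp32 at run time,
    # while the activation stays fp32: this is the to_fp16 -> to_fp32 ->
    # linear chain that gets fused into onednn.linear_dynamic_fp16 with a
    # prepacked weight.
    w_fp32 = w.to(torch.float16).to(torch.float32)
    return torch.nn.functional.linear(x, w_fp32)

print(linear_dynamic_fp16_reference(torch.randn(4, 8), torch.randn(16, 8)).shape)
```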
**Test plan**
```
python test/inductor/test_mkldnn_pattern_matcher.py -k test_linear_dynamic_fp16
```
Differential Revision: [D66802159](https://our.internmc.facebook.com/intern/diff/D66802159)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141549
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
Summary: Add a helper function to put a const graph back into the toplevel graph. This can be useful when we're taking const graphs from delegates.
Test Plan: CI
Reviewed By: trieuat
Differential Revision: D63031982
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140854
Approved by: https://github.com/SherlockNoMad
Summary:
The current implementation for lifted graphs takes a dict of {constant name: constant value}. The constant values are used by run_node to execute the constant graph and get the folded values, and new getattr nodes are then created for the folded values.
We don't have constant values for the lifted graph during model compilation on MTIA. It is more general to allow the constant folding pass to take only the constant names to produce the constant graph, and to represent the folded nodes as placeholders so that it stays consistent with the lifted graph. Additionally, this mimics the real situation in Sigmoid, where Sigmoid executes the constant graph, gets the folded values, and sets them on the main graph. This diff updates the pass to work with a list of constant names.
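A hedged toy illustration of the flow described above: the constant graph is produced from constant names alone, and its folded outputs reach the main graph as placeholders that the runtime fills in. The graphs and names here are stand-ins, not the actual pass output.
```python
import torch

def const_graph(w1: torch.Tensor, w2: torch.Tensor) -> torch.Tensor:
    return w1 @ w2                      # folded once by the runtime

def main_graph(x: torch.Tensor, folded: torch.Tensor) -> torch.Tensor:
    return x @ folded                   # consumes the folded value via a placeholder

constants = {"w1": torch.randn(4, 4), "w2": torch.randn(4, 4)}
lifted_constant_names = ["w1", "w2"]    # all the pass needs to build const_graph

folded = const_graph(*(constants[n] for n in lifted_constant_names))  # runtime step
print(main_graph(torch.randn(2, 4), folded).shape)                    # torch.Size([2, 4])
```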
Test Plan:
```
buck run mode/opt caffe2/test:test_export -- -r split_const_gm
```
Differential Revision: D62144791
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135060
Approved by: https://github.com/SherlockNoMad
Co-authored-by: Tuan Trieu <tuant@meta.com>
Part of #134054.
This corresponds to the pytorch mypy changes from D61493706. Updating takes so long and touches so many files that it's impossible to land as a whole without conflicting with some other intermediate change.
So we are landing these 'type: ignore' comments for pytorch in advance of them actually being needed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134202
Approved by: https://github.com/Skylion007
Summary:
In the export workflow, we always have a lifted graph, which doesn't fetch constants through get_attr nodes. This causes a compatibility issue when we try to use Inductor's split_const_gm function with a lifted graph.
This diff makes an additive change to split_const_gm's interface: when the pass sees a placeholder node that is present in the lifted_constants table, it will also use that as a source of constness.
This change won't break the existing code, and the lifted_constants table can be used orthogonally to the existing const folding mechanisms.
Also, as requested by the MTIA team, we introduce a small callback function used to skip certain nodes during const folding (see the sketch below).
For the internal followup counterpart, see D59685145
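A hedged sketch of the two additions; the exact split_const_gm signature may differ, and the callback body is a hypothetical example:
```python
import torch

def node_is_constant_source(node: torch.fx.Node, lifted_constants: dict) -> bool:
    # get_attr nodes are the classic source of constness; with this change, a
    # placeholder also counts when its target appears in the lifted_constants
    # table.
    if node.op == "get_attr":
        return True
    return node.op == "placeholder" and node.target in lifted_constants

def skip_folding_node_fn(node: torch.fx.Node) -> bool:
    # Hypothetical example of the requested callback: skip folding for nodes
    # in a particular custom namespace.
    return node.op == "call_function" and "mtia" in str(node.target)
```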
Test Plan: buck run mode/opt caffe2/test:test_export -- -r split_const_gm
Differential Revision: D59692790
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130743
Approved by: https://github.com/desertfire, https://github.com/SherlockNoMad
Extend constant folding to dynamic-shape nodes; only pointwise ops and some other restricted ops are supported.
We support dynamic shapes by limiting constant folding to ops that are guaranteed to produce uniform values (full, pointwise ops, and views) and by running these operators with tensors of shape 1. This also eliminates the memory overhead of constant folding.
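A hedged sketch of the trick: fold a chain of uniform-value ops on a size-1 stand-in, then rebuild the result at the real (possibly dynamic) size.
```python
import torch

def folded_uniform_value() -> torch.Tensor:
    # Ops whose outputs are uniform when their inputs are uniform can be
    # folded on a shape-(1,) stand-in, avoiding memory overhead for large
    # constants (here standing in for torch.full((s0,), 2.0) * 3 + 1).
    t = torch.full((1,), 2.0)
    t = t * 3 + 1               # pointwise ops preserve uniformity
    return t

def materialize(dynamic_size: int) -> torch.Tensor:
    # At run time, expand the single folded value to the actual size.
    return folded_uniform_value().expand(dynamic_size)

print(materialize(5))   # tensor([7., 7., 7., 7., 7.])
```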
Taken over from https://github.com/pytorch/pytorch/pull/128937
joint work with @imzhuhl
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129686
Approved by: https://github.com/Chillee
ghstack dependencies: #130367
Make it easier to serialize patterns by adding `pattern_matcher.gen_register_replacement()`, which is like `pattern_matcher.register_replacement()` but also requires the replacement to be precompiled.
To precompile patterns (and save to disk) run:
```
torchgen/fuse_attention_patterns/gen_attention_patterns.py
```
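A hedged sketch of the `register_replacement` flow that `gen_register_replacement` builds on; the argument order follows the description above and may not match the current signature exactly.
```python
import torch
from torch._inductor.pattern_matcher import PatternMatcherPass, fwd_only, register_replacement

def search_fn(a, b, c, d):
    return a @ b + c @ d

def replace_fn(a, b, c, d):
    return torch.addmm(c @ d, a, b)

patterns = PatternMatcherPass()
example_inputs = [torch.empty(2, 2) for _ in range(4)]
# Traces search_fn on the example inputs and registers the pattern;
# gen_register_replacement() additionally takes a unique name and expects the
# traced pattern to have been precompiled and serialized ahead of time.
register_replacement(search_fn, replace_fn, example_inputs, fwd_only, patterns)
```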
- Updated the sfdp patterns to use `gen_register_replacement`.
- Add serialized patterns for mm_pattern and bmm_pattern (the 'misc' patterns don't serialize cleanly so can't be added).
- Updated the testing so it checks that the round-tripped patterns match, not just that they serialize the same way.
- Checking that the patterns round-trip properly revealed that the `users` field wasn't being serialized correctly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121313
Approved by: https://github.com/eellison
There are some issues in the previous PR (https://github.com/pytorch/pytorch/pull/120985) that added support for the int8 WOQ mm pattern matcher. This PR further optimizes it.
1. New patterns are added to match int8 WOQ mm in the gpt-fast model, due to different input layouts.
2. In constant folding, `int8_weight -> dq -> bf16_weight` should be kept for pattern matching (a reference sketch of the matched computation follows this list).
3. Currently, GPT-Fast enables `coordinate_descent_tuning` for CPU. This flag is only useful for CUDA, but it can change the graph from the non-decomposed fallback path to the decomposed one. We will disable the flag in the GPT-Fast script for CPU in order to have clean patterns. @yanbing-j
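A hedged reference for the computation behind the int8 WOQ mm pattern, in one common formulation (the exact placement of the dequantize in the matched graph may differ):
```python
import torch

def int8_woq_mm_reference(x_bf16: torch.Tensor, w_int8: torch.Tensor,
                          scales_bf16: torch.Tensor) -> torch.Tensor:
    # The int8 weight is dequantized to bf16 (per output channel) and then
    # used in a regular matmul with the bf16 activation; this dq chain is
    # what must survive constant folding for the pattern to match.
    w_bf16 = w_int8.to(torch.bfloat16) * scales_bf16[:, None]
    return x_bf16 @ w_bf16.t()

x = torch.randn(2, 8, dtype=torch.bfloat16)
w = torch.randint(-128, 127, (4, 8), dtype=torch.int8)
scales = torch.rand(4, dtype=torch.bfloat16)
print(int8_woq_mm_reference(x, w, scales).shape)  # torch.Size([2, 4])
```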
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122955
Approved by: https://github.com/jgong5, https://github.com/jansel
Summary:
Add Runtime Constant-folding for AOTInductor.
This also includes the invocation of constant folding at load time.
The constant folding lowering is a 2-step process.
First, we split the graph into two modules. One of them is the constant module, which doesn't depend on any input, so the whole module can be inferred (constant-folded) once and reused. The constant module is lowered and codegen-ed as usual and cached (let's call this the constant code). The constant code reuses the whole lowering/profiling/etc. process; the only difference is that we do not generate any headers or initialization for the constant code.
Second, after handling the constant module, we take care of the main module (the part that depends on the user input). Compared with a normal lowering, the main module takes in one additional component: the constant code. The additional step here is that we inject the constant code into the codegen-ed main module and create the caller for the main module to consume the result of the constant module.
Test Plan: Unit tests included in commit.
Differential Revision: D53274382
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118765
Approved by: https://github.com/chenyang78
This PR fixes two bugs:
1) Constant folding a Triton kernel results in the kernel's inputs being returned without any modification. Disable constant folding for Triton kernels; this needs more investigation.
2) NoneLayout buffers should not be deleted as they do not exist
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115908
Approved by: https://github.com/aakhundov, https://github.com/jansel
Summary:
Two nodes can point to the same attribute via node.target.
This makes sure:
- we don't try to delete an already-deleted attribute, i.e., we delete each attribute only once
- we do delete all the nodes pointing to the attribute (see the sketch after this list)
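A hedged, simplified sketch of that bookkeeping (assumes flat attribute names and that the nodes have no remaining users):
```python
import torch

def erase_folded_nodes(gm: torch.fx.GraphModule, nodes_to_erase) -> None:
    deleted = set()
    for node in nodes_to_erase:
        # Several get_attr nodes may share the same target: delete the
        # attribute only once, but erase every node that points at it.
        if node.target not in deleted:
            delattr(gm, node.target)
            deleted.add(node.target)
        gm.graph.erase_node(node)
    gm.recompile()
```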
Test Plan:
```
buck run fbcode//mode/dev-nosan fbcode//executorch/backends/xnnpack/test:test_xnnpack_passes -- executorch.backends.xnnpack.test.passes.test_batch_norm_fusion.TestBatchNormFusion.test_q8_batch_norm_fusion
```
Differential Revision: D51419442
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113957
Approved by: https://github.com/Skylion007
A couple of changes to make it more efficient:
- Because we are replacing nodes that only have a single (uniform) value, store just that single value instead of the whole tensor for node replacement (see the sketch after this list).
- torch.fx.Interpreter keeps a Tensor in the env as long as it has remaining uses. That applies even to output uses, but we are not going to constant fold that use. Instead of using the last use for garbage collection, use the last non-output use.
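A hedged sketch of the single-value storage (the helper names are illustrative, not the actual code):
```python
import torch

def compress_folded(t: torch.Tensor):
    # Folding here only replaces tensors with a single uniform value, so it is
    # enough to remember that scalar plus the metadata needed to rebuild it.
    first = t.flatten()[0]
    if torch.all(t == first):
        return ("uniform", first.item(), tuple(t.shape), t.dtype, t.device)
    return ("dense", t)

def materialize(entry) -> torch.Tensor:
    if entry[0] == "uniform":
        _, value, shape, dtype, device = entry
        return torch.full(shape, value, dtype=dtype, device=device)
    return entry[1]

print(materialize(compress_folded(torch.ones(3, 3) * 4)))
```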
If reviewers would prefer I ghstack this because of the code movement, let me know.
Fix for https://github.com/pytorch/pytorch/issues/108388
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108421
Approved by: https://github.com/jansel