Summary: Fixed a bunch of fbcode imports that happened to work but confused autodeps. After this autodeps still suggests "improvements" to TARGETS (which breaks our builds) but at least it can find all the imports.
Test Plan:
```
fbpython fbcode/tools/build/buck/linters/lint_autoformat.py --linter=autodeps --default-exec-timeout=1800 -- fbcode/caffe2/TARGETS fbcode/caffe2/test/TARGETS
```
Before:
```
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "test_export" (from caffe2/test/export/testing.py:229) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See https://fbur$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "testing" (from caffe2/test/export/test_export.py:87) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See https://fburl$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "test_export" (from caffe2/test/export/test_serdes.py:9) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See https://fb$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "testing" (from caffe2/test/export/test_serdes.py:10) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See https://fburl$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "testing" (from caffe2/test/export/test_retraceability.py:7) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See https:$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "test_export" (from caffe2/test/export/test_retraceability.py:6) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See ht$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "testing" (from caffe2/test/export/test_export_nonstrict.py:7) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See http$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "test_export" (from caffe2/test/export/test_export_nonstrict.py:6) when processing rule "test_export". Please make sure it's listed in the srcs parameter of another rule. See $
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "test_export" (from caffe2/test/export/test_export_training_ir_to_run_decomp.py:8) when processing rule "test_export". Please make sure it's listed in the srcs parameter of an$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "testing" (from caffe2/test/export/test_export_training_ir_to_run_decomp.py:10) when processing rule "test_export". Please make sure it's listed in the srcs parameter of anoth$
ERROR while processing caffe2/test/TARGETS: Found "//python/typeshed_internal:typeshed_internal_library" owner for "cv2" but it is protected by visibility rules: [] (from caffe2/test/test_bundled_images.py:7) when processing rule "test_bundled_$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "caffe2.test.profiler_test_cpp_thread_lib" (from caffe2/test/profiler/test_cpp_thread.py:29) when processing rule "profiler_test_cpp_thread". Please make sure it's listed in t$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "torch._utils_internal.get_file_path_2" (from caffe2/test/test_custom_ops.py:23) when processing rule "custom_ops". Please make sure it's listed in the srcs parameter of anoth$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "torch._utils_internal.get_file_path_2" (from caffe2/test/test_public_bindings.py:13) when processing rule "public_bindings". Please make sure it's listed in the srcs paramete$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "torch._C._profiler.symbolize_tracebacks" (from caffe2/test/test_cuda.py:3348) when processing rule "test_cuda". Please make sure it's listed in the srcs parameter of another $
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for "torch._C._profiler.gather_traceback" (from caffe2/test/test_cuda.py:3348) when processing rule "test_cuda". Please make sure it's listed in the srcs parameter of another rule$
ERROR while processing caffe2/test/TARGETS: Cannot find an owner for include <torch/csrc/autograd/profiler_kineto.h> (from caffe2/test/profiler/test_cpp_thread.cpp:2) when processing profiler_test_cpp_thread_lib. Some things to try:
```
Differential Revision: D62049222
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135614
Approved by: https://github.com/oulgen, https://github.com/laithsakka
Enable Windows inductor UTs for `test/inductor/test_binary_folding.py`
The failing UT depends on https://github.com/pytorch/pytorch/pull/134427, so this PR needs a rebase once https://github.com/pytorch/pytorch/pull/134427 is merged.
```cmd
2024-08-25T23:32:23.0905727Z Traceback (most recent call last):
2024-08-25T23:32:23.0906516Z File "C:\actions-runner\_work\pytorch\pytorch\test\inductor\test_binary_folding.py", line 18, in <module>
2024-08-25T23:32:23.0908200Z from inductor.test_inductor_freezing import TestCase
2024-08-25T23:32:23.0909883Z File "C:\actions-runner\_work\pytorch\pytorch\test\inductor\test_inductor_freezing.py", line 39, in <module>
2024-08-25T23:32:23.0911128Z raise unittest.SkipTest("requires sympy/functorch/filelock")
2024-08-25T23:32:23.0911801Z unittest.case.SkipTest: requires sympy/functorch/filelock
2024-08-25T23:32:23.0912370Z Got exit code 1
2024-08-25T23:32:23.0913155Z No stepcurrent file found. Either pytest didn't get to run (e.g. import error) or file got deleted (contact dev infra)
```
Local test passes:
<img width="1898" alt="image" src="https://github.com/user-attachments/assets/4a6e3f66-4bbc-4aab-8f0d-2e2318046e53">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134425
Approved by: https://github.com/ezyang, https://github.com/jansel
This PR properly registers the tensor used in the module compute as a parameter. The bug was previously hidden because dynamo considered every tensor on an nn module to be constant; with inlined NN modules, this is no longer the case.
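A minimal sketch of the pattern in question (the module and attribute names here are illustrative, not the actual code the PR touches):
```python
import torch
import torch.nn as nn

class Mod(nn.Module):
    def __init__(self):
        super().__init__()
        # Before: a plain tensor attribute. Dynamo used to treat every tensor
        # on an nn module as a constant, so this happened to work.
        # self.scale = torch.ones(3)
        # After: registered as a parameter, so it is traced as a graph input
        # rather than baked in as a constant once NN modules are inlined.
        self.scale = nn.Parameter(torch.ones(3))

    def forward(self, x):
        return x * self.scale
```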
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128356
Approved by: https://github.com/anijain2305
ghstack dependencies: #128355
`test_dist` uses bfloat16, which isn't well supported by Triton on pre-sm80 hardware, so this splits the test in two and adds a skip. It also adds a `skipCUDAIf` decorator that skips only on CUDA devices, so the test still runs on CPU.
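For illustration, a minimal sketch of what such a device-conditional skip could look like, assuming the test class exposes the device under test as `self.device` (the decorator actually added by the PR may differ):
```python
import functools
import unittest

def skipCUDAIf(condition, reason):
    # Skip the decorated test only when running on a CUDA device;
    # CPU runs are unaffected.
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(self, *args, **kwargs):
            if condition and str(self.device).startswith("cuda"):
                raise unittest.SkipTest(reason)
            return fn(self, *args, **kwargs)
        return wrapper
    return decorator

# Hypothetical usage on a bfloat16 test:
# @skipCUDAIf(not SM80OrLater, "bfloat16 requires sm80 or newer")
# def test_dist_bf16(self): ...
```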
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113384
Approved by: https://github.com/lezcano
This PR adds an `efficient_conv_bn_eval_graph_transform` pass to inductor. It tries to identify consecutive conv + bn **computation** with bn in eval mode, and rewrites it to a more efficient implementation. It does not modify parameters, which makes it **support training** without any pain. If no such pattern is identified, it does nothing; therefore, it is backward compatible.
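For intuition, here is a rough sketch of the conv + eval-mode bn folding arithmetic such a transform exploits (the function and argument names are illustrative; the pass computes the fused weights inside the graph rather than overwriting the original parameters, which is why training keeps working):
```python
import torch

def fold_conv_bn_eval(conv_w, conv_b, bn_rm, bn_rv, bn_w, bn_b, eps=1e-5):
    # In eval mode, bn is a fixed per-channel affine map:
    #   y = (z - running_mean) / sqrt(running_var + eps) * gamma + beta
    # so it folds into the preceding conv's weight and bias.
    scale = bn_w / torch.sqrt(bn_rv + eps)           # per-channel scale
    fused_w = conv_w * scale.reshape(-1, 1, 1, 1)    # scale each output filter
    fused_b = (conv_b - bn_rm) * scale + bn_b
    return fused_w, fused_b
```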
It has a great benefit in terms of memory footprint. For ResNet-50 with batch size 64, image size 224, and forward + backward training:
| Technique | Memory Footprint (GB) | Remarks |
|-------------------------------|----------------------------|-------------------------------------------|
| Eager Mode | 5.18 | |
| torch.compile | 5.46 | Strangely, not saving memory |
| torch.compile with this PR | 2.88 | **Saves about 50% memory!** |
The script to measure the memory footprint:
```python
from torchvision.models.resnet import resnet50
import torch

net = resnet50().eval().cuda()
input = torch.randn(64, 3, 224, 224).cuda()
opt_net = torch.compile(net)  # Use torch.compile
# opt_net = net  # Eager mode
current_memory = torch.cuda.memory_allocated()
torch.cuda.reset_peak_memory_stats()
for i in range(10):
    opt_net.zero_grad()
    output = opt_net(input)
    output.sum().backward()
    del output
peak_memory = torch.cuda.max_memory_allocated()
additional_peak_memory = peak_memory - current_memory
print(f"Additional peak memory used: {additional_peak_memory / (1024 ** 3)} GB")
```
More results can be found in the corresponding paper (this method is called Tune Mode in the tables).
<img width="709" alt="image" src="https://github.com/pytorch/pytorch/assets/23236638/db4815b0-d93e-4726-b1d5-e6651f256484">
<img width="653" alt="image" src="https://github.com/pytorch/pytorch/assets/23236638/22e5e1ab-6129-4c3d-a875-3c7343293b2e">
Note: the difference between this PR and https://github.com/pytorch/pytorch/pull/106372 is that https://github.com/pytorch/pytorch/pull/106372 tries to fix and change the implementation of `torch.fx.experimental.optimization.fuse`, which causes compatibility issues, while this PR only introduces a new graph transform pass and does not break the previous code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108757
Approved by: https://github.com/jansel