Commit Graph

17 Commits

Edward Z. Yang
cad79bd0bb Remove follow_imports = skip from sympy (#118469)
dmypy silently ignores follow_imports = skip, so to get parity between
dmypy and mypy we have to suck it up and type: ignore all of the sympy
typing problems.

The suppressions were added automatically with the following script generated by GPT-4:

```python
import re

# Read the error file
with open("error_file.txt", "r") as f:
    errors = f.readlines()

# Parse the lines with errors and error types
error_lines = {}
for error in errors:
    match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", error)
    if match:
        file_path, line_number, error_type = match.groups()
        if file_path not in error_lines:
            error_lines[file_path] = {}
        error_lines[file_path][int(line_number)] = error_type

# Insert ignore comments in the source files
for file_path, lines in error_lines.items():
    with open(file_path, "r") as f:
        code = f.readlines()
    for line_number, error_type in sorted(lines.items(), key=lambda x: x[0], reverse=True):
        code[line_number - 1] = code[line_number - 1].rstrip() + f"  # type: ignore[{error_type}]\n"
    with open(file_path, "w") as f:
        f.writelines(code)
```
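
As a quick sanity check, the regex above matches a typical mypy error line; the path and error code below are illustrative, not taken from the actual run:

```python
import re

# Illustrative mypy error line (not from the real error_file.txt):
sample = "sympy/core/add.py:42:13: error: Incompatible return value type  [return-value]"
match = re.match(r"(.*):(\d+):\d+: error:.*\[(.*)\]", sample)
print(match.groups())  # ('sympy/core/add.py', '42', 'return-value')
```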

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118469
Approved by: https://github.com/Skylion007
ghstack dependencies: #118414, #118418, #118432, #118467, #118468
2024-01-28 13:38:38 +00:00
vfdev-5
85aa372374 [inductor] Fixed conv issue with dynamic shapes (#114351)
EDIT: fixes https://github.com/pytorch/pytorch/issues/114354

Description:
The following code is failing:
```python
import torch

def func(x, w):
    return torch.nn.functional.conv2d(x, w, groups=int(w.shape[0]))

x = torch.rand(1, 3, 64, 64)
w = torch.rand(3, 1, 3, 3)
y1 = func(x, w)
cfunc = torch.compile(func, fullgraph=True, dynamic=True)
y2 = cfunc(x, w)

torch.testing.assert_close(y1, y2)
```
with the error:
```
  File "/pytorch/torch/_inductor/kernel/conv.py", line 315, in convolution
    assert isinstance(groups, int)
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
LoweringException: AssertionError:
  target: aten.convolution.default
  args[0]: TensorBox(StorageBox(
    InputBuffer(name='arg3_1', layout=FixedLayout('cpu', torch.float32, size=[1, s0, s1, s1], stride=[s0*s1**2, s1**2, s1, 1]))
  ))
  args[1]: TensorBox(StorageBox(
    InputBuffer(name='arg1_1', layout=FixedLayout('cpu', torch.float32, size=[s0, 1, s0, s0], stride=[s0**2, s0**2, s0, 1]))
  ))
  args[2]: None
  args[3]: [1, 1]
  args[4]: [0, 0]
  args[5]: [1, 1]
  args[6]: False
  args[7]: [0, 0]
  args[8]: s0
```
where the `groups` argument is a symbol but is expected to be an `int`.

This PR specializes `groups` to its concrete int value, which fixes the problem.
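
A minimal sketch of that specialization, assuming the `V.graph.sizevars` evaluate helper used elsewhere in inductor (helper names assumed, not the exact diff):

```python
# Hedged sketch of the specialization in torch/_inductor/kernel/conv.py
# (assumed helper names, not the exact change).
from torch._inductor.virtualized import V

def _specialize_groups(groups):
    # Under dynamic shapes `groups` can arrive as a SymPy expression; force it
    # to a concrete Python int before the lowering asserts isinstance(groups, int).
    if not isinstance(groups, int):
        groups = V.graph.sizevars.evaluate_static_shape(groups)
    assert isinstance(groups, int)
    return groups
```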

Context: Failing tests in torchvision with gaussian blur and adjust_sharpness ops
- https://github.com/pytorch/vision/actions/runs/6955843968/job/18926393710?pr=8127

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114351
Approved by: https://github.com/ezyang
2023-11-23 13:13:06 +00:00
Jez Ng
c77dd684c9 Enable typechecking in _inductor/ir.py (#110112)
I used a number of `type: ignore` comments, mostly due to
https://github.com/pytorch/pytorch/issues/109963.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110112
Approved by: https://github.com/peterbell10
2023-10-07 04:19:38 +00:00
Ying Zhang
097fd43f8c [Inductor CUTLASS backend] Step 4: CUDA (template) kernels (#107931)
This is step 4 of adding CUTLASS as an alternative Inductor backend.
Full tests can be found in the last PR in the stack.

Feature request: https://github.com/pytorch/pytorch/issues/106991.
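
For context, a hedged usage sketch of opting into the CUTLASS backend once the full stack landed; the config knobs here are assumptions based on later Inductor versions, not part of this PR:

```python
import torch
import torch._inductor.config as inductor_config

# Assumed knob: include CUTLASS among the GEMM autotuning backends.
inductor_config.max_autotune_gemm_backends = "CUTLASS,Triton,ATen"

@torch.compile(mode="max-autotune")
def mm(a, b):
    return a @ b

a = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
b = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
out = mm(a, b)
```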

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107931
Approved by: https://github.com/aakhundov, https://github.com/jansel, https://github.com/kadeng
ghstack dependencies: #107802, #107847, #107901
2023-09-12 17:44:38 +00:00
Jack Taylor
a18ee0c6ec [ROCm] ROCm compatible configs for triton kernels (#107584)
This PR brings in a few inductor changes required for ROCm.

~~**1 - Introduction of a toggle for enforced channels-last convolution fallbacks**~~
This addition was split off into its own PR after some cleanup by @pragupta: https://github.com/pytorch/pytorch/pull/107812

**2 - Addition of ROCm specific block sizes**
We are now able to support MAX_AUTOTUNE mode on ROCm, so we are proposing conditions that let us fine-tune our own block sizes. Currently, Triton on ROCm does not benefit from pipelining, so we set all configs to `num_stages=1` and have removed some upstream tunings on ROCm to avoid running out of shared memory.

In the future we will provide more optimised tunings for ROCm, but for now this should mitigate these issues; a rough sketch of the gating is shown below.
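
The gating might look roughly like this (block values and structure are illustrative, not the exact tunings added in this PR):

```python
import torch
import triton

def conv_platform_configs():
    if torch.version.hip is not None:  # running on ROCm
        # Pipelining currently gives no benefit on ROCm, so keep num_stages=1
        # and use smaller blocks to avoid exhausting shared memory.
        return [
            triton.Config({"BLOCK_M": 64, "BLOCK_N": 64, "BLOCK_K": 32}, num_stages=1, num_warps=4),
            triton.Config({"BLOCK_M": 128, "BLOCK_N": 64, "BLOCK_K": 32}, num_stages=1, num_warps=8),
        ]
    return [
        triton.Config({"BLOCK_M": 128, "BLOCK_N": 128, "BLOCK_K": 32}, num_stages=2, num_warps=8),
    ]
```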

~~**3 - Addition of device_type to Triton's compile_meta**~~
~~Proposing this addition to `triton_heuristics.py`: Triton on ROCm requires `device_type` to be set to `hip` (https://github.com/ROCmSoftwarePlatform/triton/pull/284), so we suggest bringing the change in here to pass the correct device type down to Triton.~~
This change is split off and will arrive in the wheel update PR https://github.com/pytorch/pytorch/pull/107600, leaving this PR to focus on the ROCm-specific block sizes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107584
Approved by: https://github.com/jithunnair-amd, https://github.com/jansel, https://github.com/eellison
2023-08-26 18:24:55 +00:00
Jez Ng
9c9982a0aa Turn on typechecking for _inductor/kernel/conv.py (#106258)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106258
Approved by: https://github.com/Skylion007
ghstack dependencies: #106252
2023-08-18 08:49:18 +00:00
eellison
8298720299 Enable Lowering Channels last Conv1x1 when max autotune is set (#107004)
This can lead to a large speedup when max autotune is set, e.g. ResNet goes from 2.1x to 2.5x, particularly in combination with freezing.
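
A hedged usage sketch of the path this exercises; the exact flags (`freezing`, `mode="max-autotune"`) are assumptions about how one would enable it, not code from this PR:

```python
import torch
import torch._inductor.config as inductor_config

inductor_config.freezing = True  # assumed knob: freeze weights for inference

conv1x1 = torch.nn.Conv2d(64, 128, kernel_size=1).eval().cuda()
conv1x1 = conv1x1.to(memory_format=torch.channels_last)
x = torch.randn(8, 64, 56, 56, device="cuda").to(memory_format=torch.channels_last)

compiled = torch.compile(conv1x1, mode="max-autotune")
with torch.no_grad():
    out = compiled(x)
```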

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107004
Approved by: https://github.com/jansel, https://github.com/shunting314, https://github.com/int3
ghstack dependencies: #106911, #106912
2023-08-17 16:05:32 +00:00
Edward Z. Yang
a01a732954 Rename some sizevars methods for clarity (#105585)
The guard functions require you to ALREADY KNOW that a particular
condition holds.  If you don't know (you want to guard on an expression
being a particular value, and then get access to that value), use
the evaluate functions.

I renamed the functions that don't abide by this:

```
guard_min -> evaluate_min
guard_max (deleted, no uses)
guard_static_shape -> evaluate_static_shape
guard_static_shapes -> evaluate_static_shapes
```

Also added some comments.
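
A hedged sketch of the intended split, with helper names assumed from `torch/_inductor/sizevars.py`:

```python
# Illustrative only: how a lowering might use the two families of helpers.
from torch._inductor.virtualized import V

def example(size_expr, other_expr):
    sizevars = V.graph.sizevars
    # evaluate_*: you do NOT know the value yet; this records a guard and
    # returns the concrete value so it can be used as a plain Python int.
    static_size = sizevars.evaluate_static_shape(size_expr)
    # guard_*: you already KNOW the relationship holds and only want the
    # corresponding guard installed (assumed helper: guard_equals).
    sizevars.guard_equals(size_expr, other_expr)
    return static_size
```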

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105585
Approved by: https://github.com/voznesenskym
2023-07-21 04:46:23 +00:00
chunyuan
d61cd03b97 Inductor cpp wrapper: support ConvTranspose and fix Convolution ir (#103308)
The changes in this PR include:
- Support ConvTranspose in cpp wrapper
- Fix cpp wrapper support for aten convolution when bias is not `None`: bias is passed in `args` instead of `kwargs` when it is not `None`. The change is covered by the ConvTranspose dynamic-shape UTs, since we fall back to aten convolution in dynamic-shape cases.
- Fix cpp wrapper support for `inf`. This covers a UT added in https://github.com/pytorch/pytorch/issues/101865; the cpp wrapper UT is `test_conv2d_unary` in `test_cpp_wrapper.py`. It's in the `slowTest` category and does not seem to have been run in the CI of that PR.

I will submit another PR to remove the hard-coded schema in these `ExternKernel`s.
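
A hedged repro-style sketch of the first item above, assuming the C++ wrapper is toggled via `torch._inductor.config.cpp_wrapper`:

```python
import torch
import torch._inductor.config as inductor_config

inductor_config.cpp_wrapper = True  # assumed toggle for the cpp wrapper codegen

m = torch.nn.ConvTranspose2d(3, 8, kernel_size=3, bias=True).eval()
x = torch.randn(1, 3, 16, 16)

compiled = torch.compile(m)
torch.testing.assert_close(compiled(x), m(x))
```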

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103308
Approved by: https://github.com/jgong5, https://github.com/desertfire
2023-06-10 03:53:05 +00:00
Shunting Zhang
86c7652503 [inductor] layout optimization for conv (#99773)
A convolution kernel with channels-last inputs runs much faster than one with contiguous inputs. This PR leverages that to optimize tensor layouts so that we provide channels-last inputs to convolution. Some care needs to be taken to not convert tensor layouts between contiguous and channels-last back and forth, since those extra copies hurt performance quite a bit.
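
For illustration, a hedged sketch (not code from the PR) of the layout the optimization targets:

```python
import torch

# Channels-last weights and inputs let the convolution run in its fast layout
# without inserting contiguous <-> channels-last copies around it.
conv = torch.nn.Conv2d(64, 64, kernel_size=3, padding=1).cuda()
conv = conv.to(memory_format=torch.channels_last)
x = torch.randn(32, 64, 56, 56, device="cuda").to(memory_format=torch.channels_last)

y = torch.compile(conv)(x)
print(y.is_contiguous(memory_format=torch.channels_last))  # expected: True
```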

Latest perf number [here](https://hud.pytorch.org/benchmark/compilers?startTime=Wed%2C%2024%20May%202023%2023%3A40%3A37%20GMT&stopTime=Wed%2C%2031%20May%202023%2023%3A40%3A37%20GMT&granularity=hour&suite=torchbench&mode=training&dtype=amp&lBranch=shunting-layout-opt-19&lCommit=baa797fc100688dfb044fbcbdebcfd2591710f78&rBranch=main&rCommit=999bae0f54108ffc5b7cf2524a02a83901554b16)
- TB: 1.64x -> 1.69x
- HF: 1.79x -> 1.78x (random noise)
- TIMM: 1.51x -> 1.65x

Right now we disable layout optimization for dynamic shapes since there is a perf loss in that combination. Here is a GH issue to follow up: https://github.com/pytorch/pytorch/issues/102670

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99773
Approved by: https://github.com/jansel
2023-06-02 21:08:18 +00:00
Edward Z. Yang
b94f143ace SymIntify convNd and conv_transposeNd, fix inductor symint handling (#101488)
Fixes https://github.com/pytorch/pytorch/issues/101014
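
A hedged smoke-test sketch of the dynamic-shape path this touches (not a test from the PR):

```python
import torch

def f(x, w):
    # Transposed conv exercised under dynamic shapes so sizes flow through as SymInts.
    return torch.nn.functional.conv_transpose2d(x, w, stride=2, padding=1)

x = torch.randn(1, 4, 8, 8)
w = torch.randn(4, 8, 3, 3)  # (in_channels, out_channels, kH, kW) for conv_transpose2d
torch.testing.assert_close(torch.compile(f, dynamic=True)(x, w), f(x, w))
```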

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101488
Approved by: https://github.com/ngimel
2023-05-16 17:46:52 +00:00
Michael Voznesensky
a0934f8bad Replace maybe_guard with statically_known (#99383)
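
For context, a hedged sketch of the naming shift, with helper names assumed from `torch/_inductor/sizevars.py`: the `statically_known_*` helpers answer a question without installing a new guard, returning False when the answer cannot be proven from what is already known.

```python
# Illustrative only; not code from this PR.
from torch._inductor.virtualized import V

def can_use_contiguous_fast_path(stride_expr):
    sizevars = V.graph.sizevars
    # True only if provable without adding a guard, unlike the old
    # maybe_guard_* helpers this replaces.
    return sizevars.statically_known_equals(stride_expr, 1)
```
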
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99383
Approved by: https://github.com/ngimel
2023-04-26 05:53:48 +00:00
Bin Bao
0c0e5c574e [inductor] Consolidate constant_args and cpp_constant_args (#98742)
Summary: Refactor code to simplify the logic. Support convolution as an
extern call in CudaWrapperCodeGen.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98742
Approved by: https://github.com/jgong5, https://github.com/jansel
2023-04-12 11:59:08 +00:00
Peter Bell
b7ff717232 [inductor] Use 64-bit indexing for large tensors in triton code (#97447)
This changes `TritonKernel` to have an `index_dtype` property which is
used as the dtype in indexing calculations. By default it is
`tl.int32` but if any input or output buffer is larger than `INT_MAX`
then we use `tl.int64` instead.
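
A hedged sketch of the selection rule described above (the helper is illustrative, not the actual inductor code):

```python
INT_MAX = 2**31 - 1

def select_index_dtype(buffer_sizes_in_elements):
    # Fall back to 64-bit indices only when some buffer cannot be addressed
    # with a 32-bit signed offset.
    if any(size > INT_MAX for size in buffer_sizes_in_elements):
        return "tl.int64"
    return "tl.int32"

print(select_index_dtype([2**20, 2**33]))  # tl.int64
print(select_index_dtype([2**20]))         # tl.int32
```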

Should fix #96978 and #93606 (need to double-check).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97447
Approved by: https://github.com/ngimel
2023-04-08 00:55:51 +00:00
Nicolas Macchioni
29608fd28d [pt2][inductor] hardcode autotuning names (#98351)
Summary: switch to hardcoded autotuning names; we want consistency in case the default choice changes.

Test Plan: CI

Differential Revision: D44643318

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98351
Approved by: https://github.com/jansel
2023-04-07 03:40:33 +00:00
Yanbo Liang
ccc27bc361 [Inductor] Fix convolution lowering if stride or padding or dilation is 1 element list (#98448)
Fixes an error from the 14k GitHub models suite.
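
A hedged sketch of the kind of normalization the title implies (the helper name is made up for illustration):

```python
def _normalize_conv_param(param, spatial_dims=2):
    # Expand a 1-element stride/padding/dilation list to the spatial rank,
    # e.g. [2] -> [2, 2] for conv2d.
    param = list(param)
    return param * spatial_dims if len(param) == 1 else param

print(_normalize_conv_param([2]))     # [2, 2]
print(_normalize_conv_param([1, 2]))  # [1, 2]
```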

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98448
Approved by: https://github.com/ngimel
2023-04-06 10:40:06 +00:00
Jason Ansel
9370f253e3 [inductor] Rewrite convolution triton templates (#95556)
Fixes #95775

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95556
Approved by: https://github.com/Chillee, https://github.com/ngimel
2023-03-22 18:12:23 +00:00