pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Sam Larsen	358da54be5	[inductor] Better messaging when triton version is too old (#130403 ) Summary: If triton is available, but we can't import triton.compiler.compiler.triton_key, then we see some annoying behavior: 1) If we don't actually need to compile triton, the subprocess pool will still spew error messages about the import failure; it's unclear to users if this is an actual problem. 2) If we do need to compile triton, we a) see the error messages from above and b) get a vanilla import exception without the helpful "RuntimeError: Cannot find a working triton installation ..." Test Plan: Ran with and without torch.compile for a) recent version of triton, b) triton 2.2, and c) no triton. In all cases, verified expected output (success or meaningful error message) Pull Request resolved: https://github.com/pytorch/pytorch/pull/130403 Approved by: https://github.com/eellison	2024-07-10 23:45:50 +00:00
Sam Larsen	87d14ad419	[inductor] Fix TORCHINDUCTOR_FORCE_DISABLE_CACHES (#129257 ) Summary: See https://github.com/pytorch/pytorch/issues/129159; this option wasn't doing its job for a few reasons. In this PR: * Fix the with_fresh_cache_if_config() decorator * Reset the "TORCHINDUCTOR_CACHE_DIR" & "TRITON_CACHE_DIR" env vars in sub-process to support them changing in the parent process Pull Request resolved: https://github.com/pytorch/pytorch/pull/129257 Approved by: https://github.com/oulgen	2024-06-26 18:34:48 +00:00
Max Podkorytov	79959d707c	[Inductor][ROCm] Composable Kernel backend for Inductor (#125453 ) This PR adds an alternative backend for Inductor, adding Composable Kernel Universal GEMM instances to the autotune instance selection. The implementation is heavily influenced by the series of PRs which adds CUTLASS backend (https://github.com/pytorch/pytorch/issues/106991). The main differences are (1) customizing compiler for the ROCm platform (2) customizing template code generation for Composable Kernel Universal GEMM instances. We provide config tuning knobs for balancing between instance sources compilation time and finding the best instance. ### Testing Install the ck library ``` pip install git+https://github.com/rocm/composable_kernel@develop ``` Run the test ``` TORCH_LOGS=+torch._inductor \ pytest --capture=tee-sys test/inductor/test_ck_backend.py ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/125453 Approved by: https://github.com/eellison, https://github.com/jansel	2024-06-25 20:54:14 +00:00
PyTorch MergeBot	ad76da6c16	Revert "[inductor] Fix TORCHINDUCTOR_FORCE_DISABLE_CACHES (#129257 )" This reverts commit `7b57ddd38c`. Reverted https://github.com/pytorch/pytorch/pull/129257 on behalf of https://github.com/clee2000 due to one of the PRs in the stack seems to have broken test/distributed/_composable/test_replicate_with_compiler.py::ReplicateTest::test_bucketing_concat_op on distributed https://github.com/pytorch/pytorch/actions/runs/9653941844/job/26627760340 `4c1e4c5f30`, not tested on this PR due to bad TD ([comment](https://github.com/pytorch/pytorch/pull/129257#issuecomment-2189444171))	2024-06-25 16:48:32 +00:00
Sam Larsen	7b57ddd38c	[inductor] Fix TORCHINDUCTOR_FORCE_DISABLE_CACHES (#129257 ) Summary: See https://github.com/pytorch/pytorch/issues/129159; this option wasn't doing its job for a few reasons. In this PR: * Fix the with_fresh_cache_if_config() decorator * Reset the "TORCHINDUCTOR_CACHE_DIR" & "TRITON_CACHE_DIR" env vars in sub-process to support them changing in the parent process Pull Request resolved: https://github.com/pytorch/pytorch/pull/129257 Approved by: https://github.com/oulgen	2024-06-24 23:39:43 +00:00
Aaron Orenstein	ea614fb2b1	Flip default value for mypy disallow_untyped_defs [2/11] (#127839 ) See #127836 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127839 Approved by: https://github.com/oulgen	2024-06-08 18:23:08 +00:00
Sam Larsen	e8e0bdf541	[inductor] parallel-compile: call triton_key() before forking (#127639 ) Summary: A user reported severe slowdown on a workload when using parallel compile. The issue is that in some environments, the process affinity changes after forking such that all forked subprocesses use a single logical processor. Described here: https://github.com/pytorch/pytorch/issues/99625. That requires a separate fix, but during debugging we noticed that we can at least optimize the expensive call to triton_key() before forking. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127639 Approved by: https://github.com/eellison, https://github.com/anijain2305	2024-06-07 04:12:57 +00:00
Aaron Gokaslan	2d47385f0f	[BE]: Enable ruff TCH rules and autofixes for better imports (#127688 ) Automated fixes to put imports that are only used in type hints into TYPE_CHECKING imports. This also enables the RUFF TCH rules which will automatically apply autofixes to move imports in and out of TYPE_CHECKING blocks as needed in the future, this will make the initial PyTorch import faster and will reduce cyclic dependencies. Co-authored-by: Xuehai Pan <XuehaiPan@pku.edu.cn> Pull Request resolved: https://github.com/pytorch/pytorch/pull/127688 Approved by: https://github.com/XuehaiPan, https://github.com/ezyang, https://github.com/malfet	2024-06-06 16:55:58 +00:00
James Wu	63d7ffe121	Retry of D58015187 Move AsyncCompile to a different file (#127691 ) Summary: This is a retry of https://github.com/pytorch/pytorch/pull/127545/files and D58015187, fixing the internal test that also imported codecache Test Plan: Same tests as CI in github, plus sandcastle for internal unit tests should pass now Differential Revision: D58054611 Pull Request resolved: https://github.com/pytorch/pytorch/pull/127691 Approved by: https://github.com/oulgen	2024-06-03 15:29:41 +00:00
PyTorch MergeBot	22f392ba40	Revert "[easy?] Move AsyncCompile to a different file (#127235 )" This reverts commit `f58fc16e8f`. Reverted https://github.com/pytorch/pytorch/pull/127235 on behalf of https://github.com/izaitsevfb due to breaking internal tests, see [D58015187](https://www.internalfb.com/diff/D58015187) ([comment](https://github.com/pytorch/pytorch/pull/127235#issuecomment-2143518610))	2024-06-01 17:16:16 +00:00
James Wu	f58fc16e8f	[easy?] Move AsyncCompile to a different file (#127235 ) By moving AsyncCompile to its own file, we can import codecache without running the side effects of AsyncCompile. This will be important for AOTAutogradCaching, where we want to share some implementation details with codecache.py without spawning new processes. To conservatively maintain the same behavior elsewhere, every time we import codecache, I've added an import to torch._inductor.async_compile (except in autograd_cache.py, where the explicit goal is to not do this) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127235 Approved by: https://github.com/aorenste, https://github.com/oulgen, https://github.com/masnesral	2024-05-30 02:43:02 +00:00
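
Commit e8e0bdf541 above describes calling an expensive function once before forking so that subprocesses do not each repeat the work. The following is a minimal sketch of that general idea, not the Inductor implementation: `expensive_key` is a hypothetical stand-in for a costly, cacheable call, the 0.2 s sleep is an arbitrary simulated cost, and `os.fork` makes the sketch POSIX-only.

```python
import functools
import os
import time


@functools.lru_cache(maxsize=None)
def expensive_key() -> str:
    """Hypothetical stand-in for a costly, cacheable call."""
    time.sleep(0.2)  # simulate the expensive work
    return "key-v1"


def child_lookup_seconds() -> float:
    """Fork a child, time expensive_key() inside it, and return the elapsed seconds."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process
        os.close(read_fd)
        start = time.perf_counter()
        expensive_key()
        elapsed = time.perf_counter() - start
        os.write(write_fd, repr(elapsed).encode())
        os.close(write_fd)
        os._exit(0)  # leave immediately without running parent cleanup
    # parent process
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        data = reader.read()  # returns once the child closes its end
    os.waitpid(pid, 0)
    return float(data)


cold = child_lookup_seconds()  # cache empty at fork time: the child pays the full cost
expensive_key()                # warm the cache once in the parent
warm = child_lookup_seconds()  # the child inherits the already-populated cache
print(f"cold child: {cold:.3f}s, warm child: {warm:.3f}s")
```

Because a forked child starts with a copy of the parent's memory, a cache filled before the fork is visible in every child, whereas a cache filled after the fork is private to the process that filled it.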

11 Commits
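
Commit 2d47385f0f above moves imports that are used only in type hints into TYPE_CHECKING blocks. As a small illustration of that pattern in general (the function `halve` and the choice of `fractions.Fraction` are hypothetical and not taken from the PyTorch change), a module can look like this:

```python
from __future__ import annotations  # keep annotations as unevaluated strings

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by static type checkers only; this branch never runs at runtime.
    from fractions import Fraction


def halve(value: Fraction) -> Fraction:
    """Return half of value; Fraction appears only in the annotations."""
    return value / 2


# The annotation is stored as a string, so the name is never looked up at runtime.
print(halve.__annotations__["value"])
```

The guarded import costs nothing when the module is loaded, which is the mechanism behind the faster import and reduced cyclic dependencies mentioned in the commit message.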