pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
sekyonda	c467e59cb0	dynamo configs to torch.compiler (#163517 ) Moving some dynamo configs to torch.compiler Pull Request resolved: https://github.com/pytorch/pytorch/pull/163517 Approved by: https://github.com/williamwen42, https://github.com/anijain2305 Co-authored-by: Svetlana Karslioglu <svekars@meta.com>	2025-10-14 22:44:53 +00:00
Pian Pawakapan	075a2e6967	[PGO] add extra read/write keys (#160715 ) Differential Revision: D80321215 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160715 Approved by: https://github.com/bobrenjc93	2025-08-18 01:41:08 +00:00
Oguz Ulgen	a29ed5e1ac	Add torch compile force disable caches alias (#158072 ) Bunch of people keep thinking current alias only disables inductor cache because it has the name inductor in it. lets globalize the name Pull Request resolved: https://github.com/pytorch/pytorch/pull/158072 Approved by: https://github.com/ezyang	2025-08-02 23:23:17 +00:00
Aaron Orenstein	b794e77b7b	Disable cudagraph GCs by default (#158649 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/158649 Approved by: https://github.com/eellison ghstack dependencies: #158193	2025-07-29 19:56:11 +00:00
Aaron Orenstein	e20736bf1d	Dont't GC as often when collecting cudagraphs (#158193 ) TL;DR: Cuts vLLM cudagraph collection from 80s -> 24s Stop garbage collecting by default on every cudagraph recording. The old behavior can be re-enabled by setting `TORCH_CUDAGRAPH_GC=1` or the config `force_cudagraph_gc`. We were previously garbage collecting at the beginning of each cudagraph capture. vLLM collects 5427 graphs and most of those garbage collections weren't actually collecting any memory (CPU or GPU). This changes it to not collect more than every 10s so if we're capturing in a loop we don't burn all our cycles looking for garbage. (These number have a lot of variance from run to run but give the correct general scale) ``` \| calls \| total \| synchronize \| gcs \| collect \| empty cache \| sys freed \| cuda freed \| -------+-------+-------+-------------+------+---------+-------------+-----------+------------+ before \| 5427 \| 78s \| 1.48s \| 5427 \| 53.22s \| 1.21s \| 145855 \| 1539309568 \| -------+-------+-------+-------------+------+---------+-------------+-----------+------------+ after \| 5427 \| 24s \| 0s \| 3 \| 1.53s \| 0.84s \| 592 \| 1539309568 \| -------+-------+-------+-------------+------+---------+-------------+-----------+------------+ ``` total - this is the total time reported by vLLM's "Graph capturing finished" log. The rest of these are measured in torch.cuda.graphs.graph.__enter__(): calls - number of times torch.cuda.graphs.graph.__enter__ was called synchronize - this is the duration taken by the cuda.synchronize call gcs - number of times gc.collect was called collect - this is the duration taken by the gc.collect call empty cache - this is the duration taken by the torch.cuda.empty_cache call sys freed - the number of bytes reported freed by gc.collect cuda freed - the number of bytes reported freed by torch.cuda.memory_reserved So it seems like the heavy lifting is done by torch.cuda.empty_cache() which is fairly quick. Cudagraph results from the TorchInductor Performance DashBoard (this is from the original version using the GC clock so the real results will be slightly better than this): <img width="1494" height="382" alt="image" src="https://github.com/user-attachments/assets/69b705ef-47ce-4b6e-9733-1ec941cad93d" /> Pull Request resolved: https://github.com/pytorch/pytorch/pull/158193 Approved by: https://github.com/ngimel	2025-07-24 21:37:11 +00:00
PyTorch MergeBot	9a7c2f1f64	Revert "Add torch compile force disable caches alias (#158072 )" This reverts commit `2ecf083b72`. Reverted https://github.com/pytorch/pytorch/pull/158072 on behalf of https://github.com/jeffdaily due to fails on rocm, signal ignored while rocm was unstable ([comment](https://github.com/pytorch/pytorch/pull/158072#issuecomment-3086740829))	2025-07-18 04:58:24 +00:00
Oguz Ulgen	2ecf083b72	Add torch compile force disable caches alias (#158072 ) Bunch of people keep thinking current alias only disables inductor cache because it has the name inductor in it. lets globalize the name Pull Request resolved: https://github.com/pytorch/pytorch/pull/158072 Approved by: https://github.com/ezyang	2025-07-17 15:40:36 +00:00
bobrenjc93	ea5b9eca74	Combine sticky pgo key with job id (#154863 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154863 Approved by: https://github.com/Mingming-Ding	2025-06-03 07:58:38 +00:00
bobrenjc93	d865b784e4	Support unbacked whitelist (#154295 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154295 Approved by: https://github.com/angelayi	2025-05-28 23:01:22 +00:00
bobrenjc93	2560c1f3f0	add sticky cache pgo (#154418 ) It's a reland of https://github.com/pytorch/pytorch/pull/154394 that hit some mergebot bug Pull Request resolved: https://github.com/pytorch/pytorch/pull/154418 Approved by: https://github.com/malfet	2025-05-27 16:40:18 +00:00
bobrenjc93	4708cfdbd9	Support whitelist of dynamic sources (#147979 ) This PR introduces the ability to whitelist sources as dynamic. This is particularly useful for large models with graph breaks, as you can keep the dynamism across graph breaks since source names stay consistent. Additionally you can use this to mark ints as dynamic. NB: I intentionally didn't complicate the interface by supporting specification of per dimension dynamism. There is virtue in keeping true to the standard way of representing sources (eg. L['x']). If we find in practice that we need more more fine grained control, we can explore further affordances at that time. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147979 Approved by: https://github.com/Mingming-Ding	2025-02-28 15:43:14 +00:00
Oguz Ulgen	28d8297712	Migrate compiler config to Config (#143152 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143152 Approved by: https://github.com/ezyang ghstack dependencies: #143229	2024-12-14 07:38:25 +00:00
PyTorch MergeBot	e87f07d3b8	Revert "Migrate compiler config to Config (#143152 )" This reverts commit `1ebdfd5605`. Reverted https://github.com/pytorch/pytorch/pull/143152 on behalf of https://github.com/oulgen due to lint failure ([comment](https://github.com/pytorch/pytorch/pull/143152#issuecomment-2542342073))	2024-12-13 20:55:14 +00:00
Oguz Ulgen	1ebdfd5605	Migrate compiler config to Config (#143152 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143152 Approved by: https://github.com/ezyang ghstack dependencies: #143150, #143151	2024-12-13 19:29:07 +00:00
Oguz Ulgen	0f6bfc58a2	Introduce remote cache key prefix to break cache (#142148 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/142148 Approved by: https://github.com/jamesjwu, https://github.com/ezyang	2024-12-10 00:35:50 +00:00
Edward Z. Yang	585dbfa583	Profile guided optimization for automatic_dynamic (#139001 ) Previously: https://github.com/pytorch/pytorch/pull/138052 but the implementation is done from scratch, so I open a new PR. This implements the ability to save and load profiles of automatic dynamic decisions, so on subsequent runs we can directly make something automatically dynamic. Unlike the previous implementation, this cache is never enabled by default; instead, you have to specify a "job id" that says it's OK to share results. We will be able to automatically populate this id for internal MAST jobs but for generic OSS users you will have to explicitly opt into it. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/139001 Approved by: https://github.com/oulgen	2024-11-03 06:29:57 +00:00
PyTorch MergeBot	92d7f29e59	Revert "Profile guided optimization for automatic_dynamic (#139001 )" This reverts commit `f6be44c74e`. Reverted https://github.com/pytorch/pytorch/pull/139001 on behalf of https://github.com/ezyang due to more fbcode errors ([comment](https://github.com/pytorch/pytorch/pull/139001#issuecomment-2452985581))	2024-11-02 13:11:04 +00:00
Edward Z. Yang	f6be44c74e	Profile guided optimization for automatic_dynamic (#139001 ) Previously: https://github.com/pytorch/pytorch/pull/138052 but the implementation is done from scratch, so I open a new PR. This implements the ability to save and load profiles of automatic dynamic decisions, so on subsequent runs we can directly make something automatically dynamic. Unlike the previous implementation, this cache is never enabled by default; instead, you have to specify a "job id" that says it's OK to share results. We will be able to automatically populate this id for internal MAST jobs but for generic OSS users you will have to explicitly opt into it. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Differential Revision: [D65065497](https://our.internmc.facebook.com/intern/diff/D65065497) Pull Request resolved: https://github.com/pytorch/pytorch/pull/139001 Approved by: https://github.com/oulgen	2024-11-02 11:50:11 +00:00
PyTorch MergeBot	8d1eaa3da6	Revert "Profile guided optimization for automatic_dynamic (#139001 )" This reverts commit `a6630bcf87`. Reverted https://github.com/pytorch/pytorch/pull/139001 on behalf of https://github.com/ezyang due to internal code triggers import cycle ([comment](https://github.com/pytorch/pytorch/pull/139001#issuecomment-2452833882))	2024-11-02 03:38:15 +00:00
Edward Z. Yang	a6630bcf87	Profile guided optimization for automatic_dynamic (#139001 ) Previously: https://github.com/pytorch/pytorch/pull/138052 but the implementation is done from scratch, so I open a new PR. This implements the ability to save and load profiles of automatic dynamic decisions, so on subsequent runs we can directly make something automatically dynamic. Unlike the previous implementation, this cache is never enabled by default; instead, you have to specify a "job id" that says it's OK to share results. We will be able to automatically populate this id for internal MAST jobs but for generic OSS users you will have to explicitly opt into it. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Differential Revision: [D65065497](https://our.internmc.facebook.com/intern/diff/D65065497) Pull Request resolved: https://github.com/pytorch/pytorch/pull/139001 Approved by: https://github.com/oulgen	2024-11-01 21:43:25 +00:00

20 Commits