pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

David Berard 7d205b22b5 [profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124)	2025-04-15 16:11:49 +00:00

Retry of https://github.com/pytorch/pytorch/pull/150957, which was reverted due to internal Meta failures.

Credit to @mgmtea, who wrote the initial version of this PR: https://github.com/pytorch/pytorch/pull/146604

Context: CUPTI is the NVIDIA library that Kineto uses for collecting GPU-side info during profiling. The intended usage is to register a callback while you want profiling to occur, and then unregister the callback when you want profiling to stop. But a bug would cause crashes if CUPTI callbacks were de-registered when used with cudagraphs. The workaround was to disable "CUPTI_LAZY_REINIT" and "CUPTI_TEARDOWN" in Kineto, which prevents crashes but can result in slower execution after profiling has occurred and completed.

This bug is believed to be fixed in CUDA >= 12.6, so this PR qualifies that DISABLE_CUPTI_LAZY_REINIT=1 and CUPTI_TEARDOWN=0 should only be applied if CUDA < 12.6. Additionally, `profiler_allow_cudagraph_cupti_lazy_reinit_cuda12()` is added as an escape hatch so that we can add a killswitch in case we see more crashes related to this.

Differential Revision: [D72842114](https://our.internmc.facebook.com/intern/diff/D72842114/)

NOTE FOR REVIEWERS: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D72842114/)!

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151124
Approved by: https://github.com/sraikund16
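
The version gate described in this commit message can be sketched roughly as follows. This is a minimal illustration and not the code in `profiler.py`: the helper names `parse_cuda_version`, `needs_cupti_workaround`, and `apply_cupti_workaround` are hypothetical, the `allow_lazy_reinit` flag merely stands in for the killswitch mentioned above, and the environment variable names are copied as spelled in the commit message. Treating an unknown CUDA version as "apply the workaround" is also an assumption.

```python
from typing import Optional


def parse_cuda_version(version: Optional[str]) -> Optional[tuple[int, int]]:
    """Turn a string such as "12.6" into (12, 6); return None if unparseable."""
    if not version:
        return None
    parts = version.split(".")
    try:
        major = int(parts[0])
        minor = int(parts[1]) if len(parts) > 1 else 0
    except ValueError:
        return None
    return (major, minor)


def needs_cupti_workaround(
    cuda_version: Optional[str], allow_lazy_reinit: bool = True
) -> bool:
    """Decide whether the CUPTI workaround should still be applied.

    allow_lazy_reinit is a hypothetical stand-in for the killswitch:
    setting it to False forces the workaround even on CUDA >= 12.6.
    """
    parsed = parse_cuda_version(cuda_version)
    if parsed is None:
        # Assumption: be conservative when the version is unknown.
        return True
    if parsed < (12, 6):
        # The bug is only believed to be fixed from CUDA 12.6 onward.
        return True
    return not allow_lazy_reinit


def apply_cupti_workaround(
    env: dict, cuda_version: Optional[str], allow_lazy_reinit: bool = True
) -> dict:
    """Set the workaround variables in env (a plain dict here) when needed."""
    if needs_cupti_workaround(cuda_version, allow_lazy_reinit):
        env["DISABLE_CUPTI_LAZY_REINIT"] = "1"
        env["CUPTI_TEARDOWN"] = "0"
    return env
```

In real use the `env` argument would presumably be `os.environ`; a plain dict keeps the sketch free of side effects.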
..
__init__.py	[Profiler] Enable Iterative Step without profiler in fbcode (#142077)	2024-12-12 19:00:13 +00:00
_memory_profiler.py	[BE][Ez]: Use itertools.chain.from_iterable when possible (#148190)	2025-03-06 20:37:06 +00:00
_pattern_matcher.py	PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175)	2025-01-21 16:57:27 +00:00
_utils.py	PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175)	2025-01-21 16:57:27 +00:00
itt.py	[BE][Easy][19/19] enforce style for empty lines in import segments in `torch/[o-z]*/` (#129771)	2024-08-01 17:07:14 +00:00
profiler.py	[profiler][retry] don't disable CUPTI_LAZY_REINIT for cuda >= 12.6 (#151124)	2025-04-15 16:11:49 +00:00
python_tracer.py	PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175)	2025-01-21 16:57:27 +00:00