pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

fduwjj ae7df51232 [c10d] Fix CudaEventCache for dangling references (#144496) As reported in https://github.com/pytorch/pytorch/issues/143470, `CudaEventCache` has dangling references. This change fixes them: 1. We add a unit test that reproduces the problem described in the issue. 2. Instead of converting the variables to shared pointers as suggested in the issue, we make the cache itself a shared pointer. If the thread that creates the cache dies before all events are recycled, the cache stays alive until the last CudaEvent is deleted. (Thanks to @kwen2501 for the suggestion.) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144496 Approved by: https://github.com/kwen2501		2025-01-15 05:11:48 +00:00
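
The lifetime idea in the commit message above can be illustrated with a minimal, self-contained sketch. It is not the c10d implementation: `FakeEvent`, `EventCache`, and `make_event_on_worker` are hypothetical names, and a plain struct stands in for a CUDA event. The sketch assumes the cache is held through a `std::shared_ptr` and that each event's deleter keeps its own `shared_ptr` to the cache, so the cache outlives the thread that created it for as long as any event is still outstanding.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a CUDA event (no GPU needed for the sketch).
struct FakeEvent {
  int id = 0;
};

// Illustrative cache; names and layout are assumptions, not the c10d code.
class EventCache : public std::enable_shared_from_this<EventCache> {
 public:
  // Hand out an event whose deleter returns it to this cache.
  std::shared_ptr<FakeEvent> create() {
    FakeEvent* raw = nullptr;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      if (!free_.empty()) {
        raw = free_.back().release();  // reuse a recycled event
        free_.pop_back();
      }
    }
    if (raw == nullptr) {
      raw = new FakeEvent();
    }
    // The deleter owns a shared_ptr to the cache, so the cache stays alive
    // until the last outstanding event has been recycled.
    auto self = shared_from_this();
    return std::shared_ptr<FakeEvent>(raw, [self](FakeEvent* e) {
      std::lock_guard<std::mutex> lock(self->mutex_);
      self->free_.emplace_back(e);
    });
  }

  // Number of recycled events currently waiting for reuse.
  std::size_t free_count() {
    std::lock_guard<std::mutex> lock(mutex_);
    return free_.size();
  }

 private:
  std::mutex mutex_;
  std::vector<std::unique_ptr<FakeEvent>> free_;
};

// Create an event on a short-lived thread. The thread's thread_local handle
// to the cache is released when the thread exits; `observer` lets the caller
// check whether the cache itself is still alive afterwards.
std::shared_ptr<FakeEvent> make_event_on_worker(
    std::weak_ptr<EventCache>& observer) {
  std::shared_ptr<FakeEvent> event;
  std::thread worker([&] {
    thread_local std::shared_ptr<EventCache> cache =
        std::make_shared<EventCache>();
    observer = cache;
    event = cache->create();
  });
  worker.join();
  return event;
}
```

In this sketch, an event returned by `make_event_on_worker` keeps the cache alive after the worker thread has exited, and the cache is destroyed only once that event is released. With a cache held by value in thread-local storage, the same deleter would touch a destroyed object, which is the kind of dangling reference the commit describes.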
aoti_abi_check	[AOTI] Fix complex64 not defined (#132810)	2024-08-08 18:08:23 +00:00
aoti_inference	[AOTInductor] Add standalone test for compilation from ExportedProgram (#142327)	2024-12-10 06:50:09 +00:00
api	[ROCm][CI] upgrade CI to ROCm 6.3 (#142152)	2025-01-09 17:14:16 +00:00
c10d	[c10d] Fix CudaEventCache for dangling references (#144496)	2025-01-15 05:11:48 +00:00
common	[AOTI] Add ABI-compatiblity tests (#123848)	2024-04-19 00:51:24 +00:00
dist_autograd	Set RUNPATH so installed tests can find the required shared libraries (#136627)	2024-10-25 09:38:08 +00:00
jit	Revert "Fix poision child process issue when call getAccelerator() (#144368)"	2025-01-10 23:36:43 +00:00
lazy	[BE]: Replace clone detach with detach clone to be more efficient (#144469)	2025-01-09 18:28:39 +00:00
lite_interpreter_runtime	Add None return type to init -- tests (#132352)	2024-08-01 15:44:51 +00:00
monitor
profiler	[codemod] Fix a few unused-variable issues in pytorch (#143517)	2024-12-19 00:18:08 +00:00
rpc	[rpc] Fix unit test after c10::nullopt removal (#143690)	2024-12-20 23:36:07 +00:00
tensorexpr	Fix floating point literals in IRPrinter (#142119)	2024-12-18 21:59:48 +00:00
__init__.py