Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70030

range_push and range_pop are not multi-thread safe: a range only works when it is pushed and popped on the same thread. For process-level ranges, range_start and range_end should be used instead. This matters because PyTorch runs the forward pass on one thread while autograd runs on a different thread. See the NVIDIA implementation documentation: cab2dec760/NSight/nvToolsExt.h (L397-L407)

Test Plan:
```
buck test caffe2/test:cuda
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/8162774391483460
✓ ListingSuccess: caffe2/test:cuda - main (19.640)
Summary
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/8162774391483460
```

Reviewed By: malfet

Differential Revision: D33155244

fbshipit-source-id: c7d5143f6da9b6ef0e0811e2fcae03a3e76f24de

(cherry picked from commit 22134e91b7)
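To illustrate the distinction described above, here is a minimal usage sketch. It assumes a CUDA build of PyTorch and that torch.cuda.nvtx exposes range_start/range_end wrappers for the bindings added by this change (range_push, range_pop, and mark are pre-existing); the model and tensor shapes are illustrative only, not part of this commit.

```
# Sketch only: assumes torch.cuda.nvtx.range_start/range_end wrappers and a CUDA device.
import torch
import torch.cuda.nvtx as nvtx

model = torch.nn.Linear(128, 64).cuda()
x = torch.randn(32, 128, device="cuda", requires_grad=True)

# Process-level range: the returned handle can be consumed on a different
# thread, so one range can cover both the forward pass (main thread) and
# the backward pass (autograd thread).
handle = nvtx.range_start("forward_and_backward")
loss = model(x).sum()
loss.backward()
nvtx.range_end(handle)

# Thread-local range: push and pop must be balanced on the same thread,
# so it can only bracket work that stays on that thread.
nvtx.range_push("forward_only")
_ = model(x)
nvtx.range_pop()
```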
7 lines
223 B
Python
# Defined in torch/csrc/cuda/shared/nvtx.cpp
# Thread-local ranges: must be pushed and popped on the same thread.
def rangePushA(message: str) -> int: ...
def rangePop() -> int: ...
# Process-level ranges: the returned range id may be ended from any thread.
def rangeStartA(message: str) -> int: ...
def rangeEnd(range_id: int) -> None: ...
# Instantaneous marker; no matching end call.
def markA(message: str) -> None: ...
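A short sketch of driving these bindings directly follows. The import path torch._C._nvtx is an assumption based on the "Defined in torch/csrc/cuda/shared/nvtx.cpp" comment above; in normal use the bindings are reached through the torch.cuda.nvtx wrappers instead.

```
# Sketch only: assumes the bindings are importable as torch._C._nvtx.
from torch._C import _nvtx

rid = _nvtx.rangeStartA("whole_step")  # returns a process-wide range id
_nvtx.markA("checkpoint reached")      # instantaneous marker, no end call
_nvtx.rangeEnd(rid)                    # may be called from any thread
```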