[TorchRec][PT2] disable contextlib in PT2 train pipeline (#147254)

Summary:

# context
* more details in the [post](https://fb.workplace.com/groups/1075192433118967/permalink/1587079018596970/)
* disable contextlib with PT2

Test Plan:
* run command
```
TORCH_SHOW_CPP_STACKTRACES=1 TORCHDYNAMO_EXTENDED_DEBUG_CPP=1 TORCH_LOGS="+dynamo,+graph_code,output_code,dynamic,aot,guards,verbose_guards,recompiles,graph_breaks" TORCH_TRACE=/var/tmp/tt buck2 run fbcode//mode/opt fbcode//aps_models/ads/icvr:icvr_launcher_live -- mode=fmc/local_ig_fm_ultra_mini training.pipeline_type=pt2 data_loader.dataset.table_ds=[2024-12-02] 2>&1 | tee -a output.log
```
* old tlparse: https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmpYYAS3o/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=100
* new tlparse: https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmpUJhCGZ/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=100

Reviewed By: Microve

Differential Revision: D68480678
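As background for the diff below: `torch._disable_dynamo` can be applied with `recursive=False` (skip compiling only the decorated frame) or `recursive=True` (skip that frame and everything it calls, including contextlib's generator machinery). A minimal standalone sketch of the pattern this commit moves to — `_active` and `_inside_forward` are hypothetical stand-ins for the DDP attributes in the hunk:

```python
from contextlib import contextmanager

import torch

# Hypothetical module-level status flag, mirroring
# DistributedDataParallel._active_ddp_module in the diff below.
_active = None


# The decorator stack adopted by this commit: _disable_dynamo on the
# outside with recursive=True disables Dynamo for the generator frame
# and everything it calls, so the contextlib plumbing is never traced;
# recursive=False only skipped the single decorated frame.
@torch._disable_dynamo(recursive=True)
@contextmanager
def _inside_forward(obj):
    global _active
    _active = obj
    try:
        yield
    finally:
        _active = None
```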
This commit is contained in:
parent fa8e3a28a7
commit 85ea679834
@@ -1444,11 +1444,11 @@ class DistributedDataParallel(Module, Joinable):
         """`TorchDynamo` requires DDP's status and module for cooperative optimization."""
         return cls._active_ddp_module
 
+    @torch._disable_dynamo(recursive=True)
     # note, this ctxmgr function is marked 'skip' in torchdynamo, so dynamo only kicks in
     # for the 'module_to_run' underneath
     # see torch._dynamo/eval_frame.py TorchPatcher.patch for more details
     @contextmanager
-    @torch._disable_dynamo(recursive=False)
     def _inside_ddp_forward(self):
         DistributedDataParallel._active_ddp_module = self
         try:
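For orientation, a hedged sketch of how `_inside_ddp_forward` is typically consumed: the wrapped module's forward runs while `_active_ddp_module` is set, which is how TorchDynamo queries DDP's status for cooperative optimization (per the docstring in the hunk). The `_run_ddp_forward` shape below is an assumption simplified from DDP, not part of this diff:

```python
# Assumed, simplified consumer of _inside_ddp_forward (not in this diff).
# Dynamo only kicks in for the inner self.module(...) call; the context
# manager frames themselves are skipped via _disable_dynamo above.
def _run_ddp_forward(self, *inputs, **kwargs):
    with self._inside_ddp_forward():
        return self.module(*inputs, **kwargs)
```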