pytorch/torch/distributed
Chien-Chin Huang 6aa92806db [CP] Use TorchFunctionMode to dispatch SDPA for CP (#147902)
While we prefer not to use monkey patching to dispatch SDPA, TorchFunctionMode is currently not compatible with selective activation checkpointing (https://github.com/pytorch/pytorch/issues/147995). This PR adds `TorchFunctionMode` to the CP code and makes it configurable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147902
Approved by: https://github.com/XilunWu
2025-04-25 23:33:48 +00:00
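
A minimal sketch of the dispatch idea behind this commit, not the actual torch.distributed.tensor CP implementation: a `TorchFunctionMode` whose `__torch_function__` intercepts `scaled_dot_product_attention` so the call can be rerouted. The names `SDPADispatchMode` and `cp_sdpa` are hypothetical stand-ins for illustration only.

```python
import torch
import torch.nn.functional as F
from torch.overrides import TorchFunctionMode


def cp_sdpa(query, key, value, *args, **kwargs):
    # Hypothetical stand-in for a context-parallel SDPA; real CP code would
    # shard the sequence dimension across ranks. The mode is not re-entered
    # inside its own handler, so this call does not recurse.
    return F.scaled_dot_product_attention(query, key, value, *args, **kwargs)


class SDPADispatchMode(TorchFunctionMode):
    def __torch_function__(self, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        if func is F.scaled_dot_product_attention:
            return cp_sdpa(*args, **kwargs)
        # Fall through for every other torch API call.
        return func(*args, **kwargs)


if __name__ == "__main__":
    q = k = v = torch.randn(2, 4, 8, 16)  # (batch, heads, seq, head_dim)
    with SDPADispatchMode():
        out = F.scaled_dot_product_attention(q, k, v)
    print(out.shape)  # torch.Size([2, 4, 8, 16])
```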
_composable [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
_shard [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
_sharded_tensor [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
_sharding_spec
_symmetric_memory [async TP] Fix handling of case where scatter dim = 0 for 2D output tensor (#150935) 2025-04-10 18:25:48 +00:00
_tensor [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
_tools Add support for non functional collectives under FakeTensorMode and fake_pg for memory tracking (#147566) 2025-03-08 18:00:49 +00:00
algorithms [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
autograd
benchmarks
checkpoint Fix lint 2025-04-17 12:48:52 -07:00
elastic logging start of torch elastic workers. (#150849) 2025-04-22 22:35:06 +00:00
examples
fsdp [FSDP1] print fqns when debug FlatParamHandle (#151336) 2025-04-24 04:49:24 +00:00
launcher
nn [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
optim [BE][Ez]: Use itertools.chain.from_iterable when possible (#148190) 2025-03-06 20:37:06 +00:00
pipelining [PP] Add schedule visualizer (#150347) 2025-04-15 00:38:18 +00:00
rpc Document poison fork note for accelerator APIs (#147507) 2025-04-10 02:37:37 +00:00
tensor [CP] Use TorchFunctionMode to dispatch SDPA for CP (#147902) 2025-04-25 23:33:48 +00:00
__init__.py c10d/Store: add nonblocking mode to queue_pop (#151485) 2025-04-18 02:14:50 +00:00
_checkpointable.py
_composable_state.py
_functional_collectives_impl.py
_functional_collectives.py [Async TP] More robust support for rowwise scales when fusing matmul reduce-scatter (#149247) 2025-03-27 03:15:30 +00:00
_serialization.py PEP585: More UP006 fixes (#146392) 2025-02-20 06:18:13 +00:00
_state_dict_utils.py Create and send full_tensor on ProcessGroup-supported device in _broadcast_tensors (#148865) 2025-03-12 20:56:31 +00:00
argparse_util.py
c10d_logger.py
collective_utils.py
constants.py
CONTRIBUTING.md
device_mesh.py [DeviceMesh] Add some documentation for from_group API and add a 2D test (#146364) 2025-03-01 00:57:37 +00:00
distributed_c10d.py [C10D] avoid computing global_rank when group_rank is used (#151373) 2025-04-17 23:53:50 +00:00
launch.py [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
logging_handlers.py
remote_device.py
rendezvous.py Fix dist.init_process_group on windows (#148266) 2025-03-05 00:07:56 +00:00
run.py [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
utils.py Refactor to use torch.accelerator.device_index instead of torch.cuda.device for generic device context manager (#148880) 2025-04-25 09:45:25 +00:00