pytorch/torch/distributed
Fix: Ensure writeback handles NO_SHARD correctly by flattening tensors before copying (#154369)
Fixes #151223

Because FSDP stores original parameters as views into a flattened tensor, changing the flattened parameter’s tensor directly can desynchronize the views. With the NO_SHARD strategy this caused a shape mismatch error when writing back modified parameters.
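A minimal standalone illustration of the mismatch (hypothetical shapes and names, not FSDP's actual code path): because the flat parameter is 1-D, copying a modified N-D source into its slice without flattening fails.

```python
import torch

# Hypothetical shapes for illustration: a 1-D "flat parameter" backing a
# (3, 4) original parameter exposed as a view into it.
flat_param = torch.zeros(12)
orig_param = flat_param[:12].view(3, 4)

# Writing a modified parameter back without flattening: (3, 4) vs (12,).
new_param = torch.randn(3, 4)
try:
    flat_param[:12].copy_(new_param)
except RuntimeError as err:
    print("writeback shape mismatch:", err)
```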

This change ensures writeback handles NO_SHARD correctly by flattening tensors before copying. The writeback logic now flattens the source parameter or gradient when the sharding strategy is unsharded, so the source matches the 1-D shape the flat parameter expects during writeback.
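A minimal sketch of the fixed writeback step (function and argument names are illustrative, not FSDP's internal API): when the parameter is unsharded, flatten the source before copying it into the flat parameter's slice.

```python
import torch

def writeback_slice(flat_param: torch.Tensor, src: torch.Tensor,
                    offset: int, sharded: bool) -> None:
    # Under NO_SHARD the source keeps its original N-D shape, so flatten it
    # first so it matches the 1-D flat parameter before copying.
    if not sharded:
        src = torch.flatten(src)
    flat_param[offset:offset + src.numel()].copy_(src)

# Example: write a modified (3, 4) parameter back into a length-12 flat param.
flat_param = torch.zeros(12)
writeback_slice(flat_param, torch.randn(3, 4), offset=0, sharded=False)
```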

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154369
Approved by: https://github.com/weifengpy
2025-07-06 09:20:31 +00:00
_composable [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
_shard [BE][5/16] fix typos in torch/ (torch/distributed/) (#156315) 2025-06-23 02:57:28 +00:00
_sharded_tensor [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
_sharding_spec
_symmetric_memory [SymmMem] Allow selection of allocation backend (#156661) 2025-06-26 21:37:44 +00:00
_tensor [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
_tools [BE][5/16] fix typos in torch/ (torch/distributed/) (#156315) 2025-06-23 02:57:28 +00:00
algorithms Fix non-bitwise type annotations for Tensor operators (see #145838) (#146845) 2025-06-24 15:41:34 +00:00
autograd [remove untyped defs] batch 1 (#157011) 2025-06-30 23:54:40 +00:00
benchmarks
checkpoint Add async checkpointing impl to experimental checkpointer and add a builder API (#156927) 2025-07-03 22:49:20 +00:00
elastic [BE][5/16] fix typos in torch/ (torch/distributed/) (#156315) 2025-06-23 02:57:28 +00:00
examples Support XPU in memory tracker (#150703) 2025-06-12 21:33:52 +00:00
fsdp Fix: Ensure writeback handles NO_SHARD correctly by flattening tensors before copying (#154369) 2025-07-06 09:20:31 +00:00
launcher [2/n]passing event log handler to record function calls (#155457) 2025-06-12 19:35:08 +00:00
nn [BE]: Update ruff to 0.11.8 (#153249) 2025-05-12 18:30:52 +00:00
optim [BE][5/16] fix typos in torch/ (torch/distributed/) (#156315) 2025-06-23 02:57:28 +00:00
pipelining [BE][5/16] fix typos in torch/ (torch/distributed/) (#156315) 2025-06-23 02:57:28 +00:00
rpc Make torch importable if compiled without TensorPipe (#154382) 2025-05-27 18:13:38 +00:00
tensor [dtensor] Rework partial propagation in pointwise op and support mul (#157340) 2025-07-03 17:04:08 +00:00
__init__.py c10d/Store: add nonblocking mode to queue_pop (#151485) 2025-04-18 02:14:50 +00:00
_checkpointable.py [BE]: Backport runtime_checkable perf improvements/behavior from 3.12 (#155130) 2025-06-06 13:28:05 +00:00
_composable_state.py
_functional_collectives_impl.py
_functional_collectives.py mypy 1.16.0 (#155821) 2025-06-14 18:18:43 +00:00
_serialization.py [BE][5/16] fix typos in torch/ (torch/distributed/) (#156315) 2025-06-23 02:57:28 +00:00
_state_dict_utils.py [dcp] add new checkpoint staging to preserve storage sharing and support mutable state_dicts (#155192) 2025-06-19 02:04:21 +00:00
argparse_util.py
c10d_logger.py
collective_utils.py [BE][5/16] fix typos in torch/ (torch/distributed/) (#156315) 2025-06-23 02:57:28 +00:00
constants.py [BE][5/16] fix typos in torch/ (torch/distributed/) (#156315) 2025-06-23 02:57:28 +00:00
CONTRIBUTING.md
device_mesh.py [inductor] Add typing to _inductor/ir.py (#149958) 2025-06-30 15:56:35 +00:00
distributed_c10d.py Support complex numbers in DTensor redistribute (#157329) 2025-07-02 21:37:16 +00:00
launch.py [BE][PYFMT] migrate PYFMT for torch.{distributed,distributions} to ruff format (#144547) 2025-02-28 07:35:56 +00:00
logging_handlers.py
remote_device.py
rendezvous.py [BE][5/16] fix typos in torch/ (torch/distributed/) (#156315) 2025-06-23 02:57:28 +00:00
run.py [BE][5/16] fix typos in torch/ (torch/distributed/) (#156315) 2025-06-23 02:57:28 +00:00
utils.py Refactor to use torch.accelerator.device_index instead of torch.cuda.device for generic device context manager (#148880) 2025-04-25 09:45:25 +00:00