pytorch/torch/distributed/tensor/parallel
Wanchao Liang 4f87f47ea1 [dtensor] reuse DTensorSpec as much as possible (#128112)
As titled: given that our DTensorSpec is immutable, we can always reuse
the spec when the input/output have the same tensor metadata. This helps in two ways:
1. We don't need to re-calculate the hash every time we produce a
   DTensorSpec, which reduces per-operator runtime overhead.
2. It reduces the DTensor construction overhead.

Some local benchmarking on an 800-parameter clip_grad_norm shows that for
foreach_norm the CPU overhead drops from 11 ms to 7.8 ms (around a 30% improvement).
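A rough sketch of the idea, using hypothetical `SpecSketch` / `TensorMeta` / `propagate` names rather than the real DTensorSpec internals: because the spec is treated as immutable, its hash can be computed once and cached, and an op whose output keeps the input's tensor metadata can return the same spec object instead of constructing a new one.

```python
# Minimal, illustrative sketch of spec reuse (not PyTorch's actual
# DTensorSpec implementation); all names below are assumptions.
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TensorMeta:
    shape: Tuple[int, ...]
    stride: Tuple[int, ...]
    dtype: str


class SpecSketch:
    """Spec treated as immutable, so its hash is computed once and cached."""

    def __init__(self, placements: Tuple[str, ...], tensor_meta: TensorMeta):
        self.placements = placements
        self.tensor_meta = tensor_meta
        self._cached_hash: Optional[int] = None

    def __hash__(self) -> int:
        # Immutability makes it safe to cache the hash on first use.
        if self._cached_hash is None:
            self._cached_hash = hash((self.placements, self.tensor_meta))
        return self._cached_hash


def propagate(input_spec: SpecSketch, output_meta: TensorMeta) -> SpecSketch:
    # If the output carries the same tensor metadata as the input, reuse the
    # input spec instead of constructing (and later re-hashing) a new one.
    if output_meta == input_spec.tensor_meta:
        return input_spec
    return SpecSketch(input_spec.placements, output_meta)
```

Under this scheme, ops whose outputs keep the input's metadata (as on the foreach_norm path benchmarked above) hand back the same spec object, so both the hash re-computation and a fresh spec allocation are skipped.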

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128112
Approved by: https://github.com/awgu
2024-06-06 16:55:50 +00:00
__init__.py [TP] Introduce Sequence Parallel Style for Laynorm/RMSNorm/Dropout (#121295) 2024-03-07 02:04:59 +00:00
_data_parallel_utils.py [reland] pass shape/stride during tensor unflatten (#117340) 2024-01-13 19:33:47 +00:00
_utils.py [BE] enable ruff rule Q from flake8-quotes (#127713) 2024-06-02 23:25:26 +00:00
api.py [TP] Add wildcard support (#122968) 2024-04-02 21:23:39 +00:00
ddp.py
fsdp.py [FSDP1][2D] Fix FSDP1 2D state_dict to use run_check=False (#123802) 2024-04-24 01:25:11 +00:00
input_reshard.py
loss.py [dtensor] reuse DTensorSpec as much as possible (#128112) 2024-06-06 16:55:50 +00:00
style.py [tp] add kwargs support to prepare_module_input (#124114) 2024-04-22 21:46:31 +00:00