This is a follow-up to flash attention. On CUDA, flash attention is supported only for fp16/bf16, whereas memory-efficient attention is supported for fp32 (but not fp64). With this PR, one can run SDPA, and more generally a Transformer, entirely in DTensor.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122996
Approved by: https://github.com/XilunWu, https://github.com/wanchaol
ghstack dependencies: #122995
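For context, a minimal sketch of what this enables: calling `scaled_dot_product_attention` on fp32 DTensor inputs, which dispatches to the memory-efficient attention path on CUDA. This is not taken from the PR's test suite; the mesh setup, tensor shapes, and head-dimension sharding are illustrative assumptions, and the script assumes it is launched with `torchrun` with one GPU per rank.

```python
# Hypothetical sketch: launch with `torchrun --nproc-per-node=<ngpus> sdpa_dtensor.py`
import os

import torch
import torch.nn.functional as F
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed._tensor import distribute_tensor, Shard

torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
mesh = init_device_mesh("cuda", (int(os.environ["WORLD_SIZE"]),))

# fp32 inputs exercise the memory-efficient backend; fp16/bf16 would also allow flash attention.
torch.manual_seed(0)  # same global tensors on every rank before sharding
shape = (8, 16, 128, 64)  # (batch, heads, seq_len, head_dim) -- arbitrary example sizes
q, k, v = (torch.randn(shape, device="cuda", dtype=torch.float32) for _ in range(3))

# Shard the head dimension (dim 1) across the mesh; SDPA then runs on each local shard.
dq, dk, dv = (distribute_tensor(t, mesh, [Shard(1)]) for t in (q, k, v))

out = F.scaled_dot_product_attention(dq, dk, dv)  # output is a DTensor as well
print(out.to_local().shape)  # local shard holds this rank's subset of heads
```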
| File |
|---|
| __init__.py |
| basic_strategy.py |
| common_rules.py |
| conv_ops.py |
| embedding_ops.py |
| experimental_ops.py |
| math_ops.py |
| matrix_ops.py |
| pointwise_ops.py |
| random_ops.py |
| tensor_ops.py |
| utils.py |
| view_ops.py |