pytorch/torch/distributed/algorithms
Rohan Varma 55ca6901a7 [CheckpointWrapper] Decouple CPU offload (#84907)
This fixes activation offload for the checkpoint wrapper, which was previously broken because offload was tightly coupled with activation checkpointing, i.e. we did:

```
# save_on_cpu from torch.autograd.graph; checkpoint from torch.utils.checkpoint
with save_on_cpu():
    out = checkpoint(module, *args)
```

which would not offload any activation tensors to CPU: the checkpoint implementation takes priority, so autograd never saves those activations in the first place, leaving `save_on_cpu` nothing to offload.

Now, if `offload_to_cpu` is specified, we apply only `save_on_cpu` and no checkpointing, so all intermediate tensors are offloaded to CPU instead of being checkpointed.
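For illustration, here is a minimal sketch of the offload-only path, using `torch.autograd.graph.save_on_cpu` directly (the toy module and shapes are hypothetical, not from this PR):

```
import torch
from torch.autograd.graph import save_on_cpu

# Hypothetical toy module; any forward would do.
lin = torch.nn.Linear(8, 8)
x = torch.randn(2, 8, requires_grad=True)

# With save_on_cpu alone (no checkpointing), every tensor autograd saves
# for backward is moved to CPU and copied back on demand during backward.
with save_on_cpu(pin_memory=True):
    out = lin(x)

out.sum().backward()  # saved activations are fetched back from CPU here
```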

These wrappers can be composed, i.e. if we have

`(Linear, Linear) -> (Linear, Linear) -> (Linear, Linear)`

we can do

`Offload(checkpoint(Linear, Linear) -> checkpoint(Linear, Linear) -> checkpoint(Linear, Linear))`

and the inner activations would be checkpointed while the outer ones would be offloaded, as in the sketch below.
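A sketch of that composition, assuming the `checkpoint_wrapper` API from `torch.distributed.algorithms._checkpoint.checkpoint_wrapper` with the `offload_to_cpu` flag this PR describes (the layer sizes and nesting are illustrative):

```
import torch
import torch.nn as nn
from torch.distributed.algorithms._checkpoint.checkpoint_wrapper import (
    checkpoint_wrapper,
)

def block():
    # Illustrative (Linear, Linear) block from the example above.
    return nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8))

# Inner wrappers checkpoint each block (activations recomputed in backward);
# the outer wrapper only offloads the tensors saved at block boundaries to CPU.
model = checkpoint_wrapper(
    nn.Sequential(
        checkpoint_wrapper(block()),
        checkpoint_wrapper(block()),
        checkpoint_wrapper(block()),
    ),
    offload_to_cpu=True,
)

out = model(torch.randn(2, 8))
out.sum().backward()
```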

Differential Revision: [D39448882](https://our.internmc.facebook.com/intern/diff/D39448882/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84907
Approved by: https://github.com/awgu
2022-09-15 00:30:23 +00:00
| Name | Last commit | Date |
| --- | --- | --- |
| `_checkpoint` | [CheckpointWrapper] Decouple CPU offload (#84907) | 2022-09-15 00:30:23 +00:00 |
| `_comm_hooks` | Enforce explicit ProcessGroup passed into DefaultState (#84105) | 2022-08-29 14:52:58 +00:00 |
| `_optimizer_overlap` | make fsdp folder to be public (#72084) | 2022-02-02 15:50:14 +00:00 |
| `_quantization` | Change docstring type callable to Callable for consistency (#82487) | 2022-08-01 17:26:09 +00:00 |
| `ddp_comm_hooks` | Add `__all__` for a few distributed modules plus a little typing (reland) (#84872) | 2022-09-13 21:57:49 +00:00 |
| `model_averaging` | Integrate xdoctest - Rebased (#82797) | 2022-08-12 02:08:01 +00:00 |
| `__init__.py` | | |
| `join.py` | Integrate xdoctest - Rebased (#82797) | 2022-08-12 02:08:01 +00:00 |