Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62030
Remove dtype tracking from the Python Storage interface, remove all the different `<type>Storage` classes except for `ByteStorage`, and update serialization accordingly, while maintaining as much FC/BC (forward/backward compatibility) as possible.
Fixes https://github.com/pytorch/pytorch/issues/47442
* **THE SERIALIZATION FORMAT IS FULLY FC/BC.** We worked very hard to make sure this is the case. We will probably want to break FC at some point to make the serialization structure of tensors make more sense, but not today.
* There is now only a single `torch.ByteStorage` class. Methods like `Tensor.set_` no longer check that the dtype of the storage is appropriate.
* As we no longer know what the dtype of a storage is, we've **removed** the `size` method from Storage, replacing it with `nbytes`. This helps catch otherwise silent errors where the number of elements is confused with the number of bytes.
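The element-count vs. byte-count distinction can be illustrated with a short sketch (written against a recent PyTorch API, where the dtype-less storage is exposed via `untyped_storage()`; exact names have varied across versions):

```python
import torch

t = torch.zeros(4, dtype=torch.float32)  # 4 elements, 4 bytes each
storage = t.untyped_storage()            # dtype-less, byte-level storage
print(storage.nbytes())                  # prints 16: bytes, not elements
```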
* `Storage._new_shared` takes an `nbytes` kwarg and will reject previous positional-only calls. `Storage._new_with_file` and `_set_from_file` require explicit element size arguments.
* It's no longer possible to convert storages to different types using the float/double/etc methods. Instead, do the conversion using a tensor.
* It's no longer possible to allocate a typed storage directly using FloatStorage/DoubleStorage/etc constructors. Instead, construct a tensor and extract its storage. The classes still exist but they are used purely for unpickling.
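A sketch of the tensor-mediated conversion described above (an illustration against a recent PyTorch API, not the exact code from this PR):

```python
import torch

src = torch.arange(4, dtype=torch.float32)
dst = src.double()                   # convert through a tensor, not storage.double()
new_storage = dst.untyped_storage()  # extract the dtype-less storage if needed
assert dst.dtype == torch.float64
```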
* The preexisting serialization format stores the dtype with the storage, and in fact this dtype is used to determine the dtype of the tensor overall.
To accommodate this case, we introduce a new TypedStorage concept that exists only at unpickling time and is used to temporarily store the dtype so we can construct a tensor. **If you overrode the handling of pickling/unpickling, you MUST add handling for TypedStorage** or your serialization code will degrade to standard file-based serialization.
Original pull request: https://github.com/pytorch/pytorch/pull/59671
Reviewed By: soulitzer, ngimel
Differential Revision: D29466819
Pulled By: ezyang
fbshipit-source-id: 4a14e5d3c2b08e06e558683d97f7378a3180b00e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65190
As described in https://github.com/pytorch/pytorch/issues/65093, there
could be modules which don't have any parameters/buffers. In this case, Pipe
determines that the module should be executed on CPU. However, this might result
in unnecessary GPU-to-CPU transfers when the user expected the module to be
executed on the GPU itself, keeping its inputs and outputs on the GPU.
For this use case, we introduce a `WithDevice` wrapper which can be used to
override which device a particular module should be executed on as part of the
pipeline.
#Closes: https://github.com/pytorch/pytorch/issues/65093
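The idea can be sketched with a minimal pure-Python reimplementation (hypothetical and simplified, not the actual `WithDevice` class; the module is shown as a placeholder string):

```python
class WithDevice:
    """Sketch: wrap a module together with the device it should run on."""
    def __init__(self, module, device):
        self._module = module
        self._device = device

    @property
    def module(self):
        return self._module

    @property
    def device(self):
        return self._device

def resolve_device(layer):
    # A pipeline can use the override instead of inspecting parameters.
    if isinstance(layer, WithDevice):
        return layer.device
    return "cpu"  # parameter-less modules otherwise default to CPU

wrapped = WithDevice("nn.ReLU()", "cuda:0")
assert resolve_device(wrapped) == "cuda:0"
assert resolve_device("nn.ReLU()") == "cpu"
```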
ghstack-source-id: 138376272
Test Plan:
1) waitforbuildbot
2) unit tests
Reviewed By: SciPioneer
Differential Revision: D31010027
fbshipit-source-id: 4c1c61d3c6feeef341e002e5f7e83dd33ff3a516
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57325
As per the design outlined in
https://github.com/pytorch/pytorch/issues/53952, adding a `NoChunk` wrapper for
pipeline parallelism inputs.
If a Tensor is wrapped with this wrapper, the pipeline implementation does not
split it across micro-batches and instead replicates it as-is, similar to
non-tensors.
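The replication behavior can be sketched in pure Python (lists stand in for Tensors; a hypothetical illustration, not the actual implementation):

```python
class NoChunk:
    """Sketch: marks a tensor that should be replicated, not split."""
    def __init__(self, value):
        self.value = value

def split(arg, num_chunks):
    if isinstance(arg, NoChunk):
        return [arg.value] * num_chunks  # replicate as-is
    size = len(arg) // num_chunks        # otherwise split across micro-batches
    return [arg[i * size:(i + 1) * size] for i in range(num_chunks)]

assert split([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]
assert split(NoChunk([1, 2, 3, 4]), 2) == [[1, 2, 3, 4], [1, 2, 3, 4]]
```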
ghstack-source-id: 132009305
Test Plan:
1) unit tests.
2) waitforbuildbot.
Reviewed By: SciPioneer
Differential Revision: D28109277
fbshipit-source-id: ee78c814c715d207d2796aba40b756a8e1834898
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57226
As per the design outlined in
https://github.com/pytorch/pytorch/issues/53952, this PR adds support for
non-Tensor args in the pipeline.
The `NoChunk` wrapper hasn't been implemented yet and will be implemented in a
follow up PR.
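The scatter behavior for mixed args can be sketched in pure Python (lists stand in for Tensors; a hypothetical illustration, not the actual implementation):

```python
def scatter_args(args, num_chunks):
    """Split tensor-like args into micro-batches; replicate everything else."""
    micro_batches = [[] for _ in range(num_chunks)]
    for arg in args:
        if isinstance(arg, list):  # Tensor stand-in: split across micro-batches
            size = len(arg) // num_chunks
            pieces = [arg[i * size:(i + 1) * size] for i in range(num_chunks)]
        else:                      # non-Tensor: replicate as-is
            pieces = [arg] * num_chunks
        for mb, piece in zip(micro_batches, pieces):
            mb.append(piece)
    return micro_batches

assert scatter_args([[1, 2, 3, 4], "mask_id"], 2) == [
    [[1, 2], "mask_id"],
    [[3, 4], "mask_id"],
]
```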
ghstack-source-id: 132008356
Test Plan:
1) unit tests
2) waitforbuildbot
Reviewed By: SciPioneer
Differential Revision: D28083564
fbshipit-source-id: 5f09da238eec0167feff76fe98916dedb0a9ae4e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55441
This is the first step towards supporting the proposal outlined in
https://github.com/pytorch/pytorch/issues/53952.
In this PR I've ensured `Pipe.forward()` accepts an `*inputs` argument instead of
just a single input as before. This lays the groundwork for supporting
non-Tensors and generic arguments in the Pipe API. In this PR we still only
support Tensors; non-Tensor support will come in future PRs.
For backward compatibility, I've ensured a single Tuple[Tensor] input still
works as it did previously.
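The backward-compatibility behavior can be sketched as follows (a simplified, hypothetical stand-in, not the real `Pipe`):

```python
class PipeSketch:
    def forward(self, *inputs):
        # BC shim: a lone tuple argument behaves as under the old API
        if len(inputs) == 1 and isinstance(inputs[0], tuple):
            inputs = inputs[0]
        return inputs  # the real Pipe would scatter these into micro-batches

p = PipeSketch()
assert p.forward(1, 2) == (1, 2)    # new-style varargs
assert p.forward((1, 2)) == (1, 2)  # old-style single tuple still works
```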
ghstack-source-id: 130767499
Test Plan: waitforbuildbot
Reviewed By: SciPioneer
Differential Revision: D27613887
fbshipit-source-id: 05e19e537e6d7fe4999745fc4ba9941ac54906de
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55187
As described in https://github.com/pytorch/pytorch/issues/54927, the Pipe
docs didn't explicitly mention initializing RPC. This PR improves the docs and
also ensures Pipe throws a useful error message, rather than an internal
assertion error, when RPC is not initialized.
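The improved check can be sketched like this (a hypothetical helper and error message, not the exact code from the PR):

```python
def ensure_rpc_initialized(rpc_initialized):
    # Raise an actionable error instead of an internal assertion failure.
    if not rpc_initialized:
        raise RuntimeError(
            "Please initialize RPC (torch.distributed.rpc.init_rpc) "
            "before creating a Pipe instance."
        )

try:
    ensure_rpc_initialized(False)
except RuntimeError as e:
    assert "init_rpc" in str(e)
```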
ghstack-source-id: 125563552
Test Plan:
1) unit test added.
2) waitforbuildbot
Reviewed By: rohan-varma
Differential Revision: D27521783
fbshipit-source-id: d1a5c6ca789b9a66c07a794468178c25cfd4b743
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53433
As described in https://github.com/pytorch/pytorch/issues/53413, the
pipeline destructor sometimes ends up hanging. The reason is that Pipe
uses daemon threads, and as a result these threads could be destroyed before the
Pipe destructor is done. The Pipe destructor then calls `join_workers`, which
waits on signals from the worker threads; if those threads are already dead, the
main thread blocks forever.
To resolve this issue, in this PR we remove `join_workers` completely since it
is not necessary to wait for daemon threads.
#Closes: https://github.com/pytorch/pytorch/issues/53413
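The general pattern can be demonstrated with plain `threading` (a sketch of the hazard, not Pipe's code): daemon threads are torn down at interpreter exit, so a destructor that blocks waiting on them can hang.

```python
import threading

stop = threading.Event()

def worker():
    stop.wait()  # worker idles until signaled

t = threading.Thread(target=worker, daemon=True)
t.start()

# Safe shutdown: signal, then join with a timeout. An unconditional wait
# on a signal from a possibly-dead daemon thread could block forever,
# which is why the destructor no longer waits on its workers.
stop.set()
t.join(timeout=5)
assert not t.is_alive()
```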
ghstack-source-id: 123641509
Test Plan:
1) Tested with repro in
https://github.com/pytorch/pytorch/issues/53413.
2) Hard to add a unit test for this since the bug really depends on order of
objects being destroyed.
Reviewed By: rohan-varma
Differential Revision: D26863321
fbshipit-source-id: 18fff072cabacfb10390e971eac789859d3dcc81
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50860
Since fairscale.nn.Pipe still uses 'balance' and 'devices' parameters,
other frameworks like fairseq continue to rely on them. As a result, the
`convert_to_balance` method is a useful utility for migrating to PyTorch
Pipe without changing a lot of code in those frameworks.
In addition, I've renamed the method to better describe what it
does and added an optional devices parameter.
ghstack-source-id: 120430775
Test Plan:
1) waitforbuildbot
2) Tested with fairseq
Reviewed By: SciPioneer
Differential Revision: D25987273
fbshipit-source-id: dccd42cf1a74b08c876090d3a10a94911cc46dd8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50791
Add a dedicated pipeline parallelism doc page explaining the APIs and
the overall value of the module.
ghstack-source-id: 120257168
Test Plan:
1) View locally
2) waitforbuildbot
Reviewed By: rohan-varma
Differential Revision: D25967981
fbshipit-source-id: b607b788703173a5fa4e3526471140506171632b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49908
As described in https://github.com/pytorch/pytorch/issues/49891, DDP +
Pipe doesn't work with find_unused_parameters.
This PR adds a simple fix to enable this functionality. It currently only
works for Pipe within a single host and will need to be reworked once we support
cross-host Pipe.
ghstack-source-id: 119573413
Test Plan:
1) unit tests added.
2) waitforbuildbot
Reviewed By: rohan-varma
Differential Revision: D25719922
fbshipit-source-id: 948bcc758d96f6b3c591182f1ec631830db1b15c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48638
Polishing up some of the docs for the main `Pipe` class and its
`forward` method.
ghstack-source-id: 118820804
Test Plan: waitforbuildbot
Reviewed By: rohan-varma
Differential Revision: D25237705
fbshipit-source-id: ba3d8737b90a80024c827c0887fc56f14bf678b7