Summary:
This PR serves two purposes:
1. Design an abstraction over a serialization scheme for C++ modules, optimizers and tensors in general,
2. Add serialization to the ONNX/PyTorch proto format.
This is currently a rough prototype I coded up today, to get quick feedback.
For this I propose the following serialization interface within the C++ API:
```cpp
namespace torch { namespace serialize {

// Abstract source of tensors, looked up by key.
class Reader {
 public:
  virtual ~Reader() = default;
  virtual void read(const std::string& key, Tensor& tensor, bool is_buffer = false) = 0;
  virtual void finish() { }
};

// Abstract sink for tensors, stored under a key.
class Writer {
 public:
  virtual ~Writer() = default;
  virtual void write(const std::string& key, const Tensor& tensor, bool is_buffer = false) = 0;
  virtual void finish() { }
};

}} // namespace torch::serialize
```
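To make the interface concrete, here is a minimal sketch of one possible backend: a pair of subclasses that keep tensors in an in-memory map. It is an illustration only, not the Cereal or Protobuf implementation. `Tensor` is a stand-in struct so the code compiles without libtorch, the names `serialize_sketch`, `Archive`, `MapWriter`, `MapReader` and `round_trip` are hypothetical, and the writer method is assumed to be spelled `write`.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

namespace serialize_sketch {

// Stand-in for torch::Tensor (assumption: just a flat list of floats).
struct Tensor {
  std::vector<float> data;
};

class Reader {
 public:
  virtual ~Reader() = default;
  virtual void read(const std::string& key, Tensor& tensor, bool is_buffer = false) = 0;
  virtual void finish() {}
};

class Writer {
 public:
  virtual ~Writer() = default;
  virtual void write(const std::string& key, const Tensor& tensor, bool is_buffer = false) = 0;
  virtual void finish() {}
};

// Hypothetical "archive": a map from key to tensor.
using Archive = std::map<std::string, Tensor>;

class MapWriter : public Writer {
 public:
  explicit MapWriter(Archive& archive) : archive_(archive) {}
  void write(const std::string& key, const Tensor& tensor, bool) override {
    archive_[key] = tensor;  // later writes to the same key overwrite
  }

 private:
  Archive& archive_;
};

class MapReader : public Reader {
 public:
  explicit MapReader(const Archive& archive) : archive_(archive) {}
  void read(const std::string& key, Tensor& tensor, bool) override {
    auto it = archive_.find(key);
    if (it == archive_.end()) {
      throw std::out_of_range("no tensor stored under key: " + key);
    }
    tensor = it->second;
  }

 private:
  const Archive& archive_;
};

// Writes one tensor and reads it back, going through the abstract
// interfaces only, as calling code would.
inline Tensor round_trip(const Tensor& input) {
  Archive archive;
  MapWriter map_writer(archive);
  Writer& writer = map_writer;
  writer.write("weight", input);
  writer.finish();

  MapReader map_reader(archive);
  Reader& reader = map_reader;
  Tensor output;
  reader.read("weight", output);
  reader.finish();
  assert(output.data == input.data);
  return output;
}

}  // namespace serialize_sketch
```

Because callers only see `Reader&` and `Writer&`, swapping this map for a file-backed format would not change any calling code, which is the point of the abstraction.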
Each of these two classes then has subclasses for (1) Cereal and (2) Protobuf (the latter called `DefaultWriter` and `DefaultReader` to hide the implementation details). See `torch/serialize/cereal.h` and `torch/serialize/default.h`. Having the abstraction and both sets of subclasses allows us to:
1. Provide a cereal-less serialization path that we can ship and iterate on going forward,
2. Provide no-friction backwards compatibility with existing C++ API uses, mainly StarCraft.
The user-facing API is (conceptually):
```cpp
void torch::save(const Module& module, Writer& writer);
void torch::save(const Optimizer& optimizer, Writer& writer);
void torch::read(Module& module, Reader& reader);
void torch::read(Optimizer& optimizer, Reader& reader);
```
with implementations for both optimizers and modules that write into the `Writer` and read from the `Reader`.
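As a rough illustration of how such free functions could sit on top of the interface, the sketch below saves and restores a toy module. Everything here is an assumption made for the example: `Tensor` and `Module` are stand-ins rather than the real torch types, the `Store`, `StoreWriter` and `StoreReader` backend is hypothetical, and the writer method is assumed to be spelled `write`. The real implementations may walk modules and optimizers quite differently.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

namespace save_sketch {

// Stand-in for torch::Tensor so the sketch compiles without libtorch.
struct Tensor {
  std::vector<float> data;
};

class Writer {
 public:
  virtual ~Writer() = default;
  virtual void write(const std::string& key, const Tensor& tensor, bool is_buffer = false) = 0;
  virtual void finish() {}
};

class Reader {
 public:
  virtual ~Reader() = default;
  virtual void read(const std::string& key, Tensor& tensor, bool is_buffer = false) = 0;
  virtual void finish() {}
};

// Hypothetical module: a flat set of named parameters and named buffers.
struct Module {
  std::map<std::string, Tensor> parameters;
  std::map<std::string, Tensor> buffers;
};

// Writes every parameter, then every buffer, then finalizes the writer.
inline void save(const Module& module, Writer& writer) {
  for (const auto& entry : module.parameters) {
    writer.write(entry.first, entry.second, /*is_buffer=*/false);
  }
  for (const auto& entry : module.buffers) {
    writer.write(entry.first, entry.second, /*is_buffer=*/true);
  }
  writer.finish();
}

// Fills the tensors the module already declares, looked up by key.
inline void read(Module& module, Reader& reader) {
  for (auto& entry : module.parameters) {
    reader.read(entry.first, entry.second, /*is_buffer=*/false);
  }
  for (auto& entry : module.buffers) {
    reader.read(entry.first, entry.second, /*is_buffer=*/true);
  }
  reader.finish();
}

// Hypothetical in-memory backend used only to exercise save() and read().
struct Entry {
  Tensor tensor;
  bool is_buffer;
};
using Store = std::map<std::string, Entry>;

class StoreWriter : public Writer {
 public:
  explicit StoreWriter(Store& store) : store_(store) {}
  void write(const std::string& key, const Tensor& tensor, bool is_buffer) override {
    store_[key] = Entry{tensor, is_buffer};
  }

 private:
  Store& store_;
};

class StoreReader : public Reader {
 public:
  explicit StoreReader(const Store& store) : store_(store) {}
  void read(const std::string& key, Tensor& tensor, bool) override {
    tensor = store_.at(key).tensor;  // throws std::out_of_range if missing
  }

 private:
  const Store& store_;
};

}  // namespace save_sketch
```

Note that `save` and `read` only depend on the abstract `Writer` and `Reader`, so the same two functions would work unchanged with any backend that implements the interface.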
ebetica ezyang zdevito dzhulgakov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11619
Differential Revision: D9984664
Pulled By: goldsborough
fbshipit-source-id: e03afaa646221546e7f93bb8dfe3558e384a5847
| Name |
|---|
| any.cpp |
| catch_utils.hpp |
| cursor.cpp |
| integration.cpp |
| jit.cpp |
| main.cpp |
| misc.cpp |
| module.cpp |
| modules.cpp |
| optim_baseline.h |
| optim_baseline.py |
| optim.cpp |
| parallel.cpp |
| README.md |
| rnn.cpp |
| sequential.cpp |
| serialize.cpp |
| static.cpp |
| tensor_cuda.cpp |
| tensor_options_cuda.cpp |
| tensor_options.cpp |
| tensor.cpp |
| util.h |
C++ API Tests
In this folder live the tests for PyTorch's C++ API (formerly known as autogradpp). They use the Catch2 test framework.
CUDA Tests
We handle CUDA tests by putting them in their own `TEST_CASE` (e.g.
`optim.cpp` has both an `optim` and an `optim_cuda` test case) and giving
them the `[cuda]` tag. Then, inside `main.cpp`, we detect at runtime whether
CUDA is available. If it is not, we disable these CUDA tests by appending
`~[cuda]` to the test specifications. The `~` prefix excludes test cases
that carry the tag.
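The filtering step described above can be sketched as a small helper that builds the argument list for the test runner. This is an illustration under assumptions, not the contents of `main.cpp`: the name `build_test_args` is hypothetical, and `cuda_available` is a plain parameter here because the source does not show how the runtime check is performed or how the arguments reach Catch.

```cpp
#include <string>
#include <vector>

namespace cuda_filter_sketch {

// Returns the arguments to hand to the test runner. When CUDA is not
// available, "~[cuda]" is appended so test cases tagged [cuda] are excluded.
inline std::vector<std::string> build_test_args(
    std::vector<std::string> args,
    bool cuda_available) {
  if (!cuda_available) {
    args.push_back("~[cuda]");
  }
  return args;
}

}  // namespace cuda_filter_sketch
```

On a machine with CUDA the arguments pass through untouched, so the `[cuda]` test cases run alongside everything else.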
One annoying aspect is that Catch only allows filtering on test cases and not
sections. Ideally, one could have a section like `LSTM` inside the `RNN` test
case, and give this section a `[cuda]` tag to only run it when CUDA is
available. Instead, we have to create a whole separate `RNN_cuda` test case and
put all these CUDA sections in there.
Integration Tests
Integration tests use the MNIST dataset. You must download it by running the following command from the PyTorch root folder:
$ python tools/download_mnist.py -d test/cpp/api/mnist