Commit Graph

23 Commits

Author SHA1 Message Date
Shashank Chaudhry
06d1be2447 [NOOP][clangformat][codemod] Enable CLANGFORMAT for caffe2/caffe2/* (#67624)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67624

Test Plan: Visual inspection. Sandcastle.

Reviewed By: malfet

Differential Revision: D31986628

fbshipit-source-id: c872bded7325997a2945dbf5d4d052628dcb3659
2021-11-02 22:14:04 -07:00
Michael Suo
77c08aa46c serialize modules as classes
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23098

Test Plan: Imported from OSS

Differential Revision: D16383328

Pulled By: suo

fbshipit-source-id: 36389b8e45c3febb7f224cd9c630fe643fa90bef
2019-08-11 15:50:29 -07:00
Supriya Rao
9223fa1c46 Add support to serialize qtensor in JIT. (#23356)
Summary:
Adds qtensor-specific fields to the proto file so that they get serialized into the model.json
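
As a rough illustration (not part of the PR; the values are arbitrary), the per-tensor quantization parameters that such proto fields need to capture are visible on any quantized tensor:

```python
import torch

# a per-tensor-affine quantized tensor; its scale, zero_point, and dtype are the
# kind of qtensor metadata the serialized model has to carry
x = torch.rand(4)
q = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)
print(q.qscheme(), q.q_scale(), q.q_zero_point())
```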

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23356
ghstack-source-id: 87263428

Differential Revision: D16473237

fbshipit-source-id: bf5b51d0863d036d30a1644a3c3b74516468224b
2019-07-26 15:52:15 -07:00
James Reed
2c2a913a4f Preserve SourceRanges across serialization (#22179)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22179
ghimport-source-id: 9879551127da09d78ca348b9e436db5a09a92a38

Test Plan: Imported from OSS

Differential Revision: D15981423

Pulled By: jamesr66a

fbshipit-source-id: a2506f5a2f05916b6e8226841b0229110e758671
2019-07-01 21:14:35 -07:00
davidriazati
cd28ff5395 Add support for __getstate__/__setstate__ on module (#20242)
Summary:
Adds support for `__getstate__` and `__setstate__` on modules; they are called as part of export (`torch.save()`) and import (`torch.jit.load`).
* `__getstate__` and `__setstate__` must be TorchScript functions with the signatures `() -> T` and `(T) -> None` respectively (see the sketch after this list)
* The results of `__getstate__` are stored using the pickler in `states.pkl` with one for each module in definition order (`__getstate__` returns `None` by default if an implementation is not provided)
    * This prevents sharing between `__getstate__` and attributes, but this should be fine since their use is mostly unrelated (attributes are for storing values to be used in script methods, `__getstate__` for running arbitrary computations during import)
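
A minimal sketch of the intended usage, assuming the present-day `torch.jit.script`/`@torch.jit.export` API (the module and attribute names are illustrative; the commit itself targeted the older `ScriptModule`/`@torch.jit.script_method` style):

```python
from typing import Tuple

import torch

class Counter(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.count = 0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.count

    @torch.jit.export
    def __getstate__(self) -> Tuple[int, bool]:
        # () -> T: the state the pickler writes out at save time
        return (self.count, self.training)

    @torch.jit.export
    def __setstate__(self, state: Tuple[int, bool]):
        # (T) -> None: rebuild the module's state at load time
        self.count = state[0]
        self.training = state[1]

scripted = torch.jit.script(Counter())
scripted.save("counter.pt")
restored = torch.jit.load("counter.pt")
```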

Follow up
* Somehow replacing `__getstate__`/`__setstate__` with a `ScriptMethodStub` makes `MyScriptModule().__getstate__()` call `ScriptModule.__getstate__()` when used in Python. This should be fixed so semantics in Python are preserved, but it doesn't affect the typical usage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20242

Pulled By: driazati

Differential Revision: D15287161

fbshipit-source-id: b3f5f33ab74a21a89e6d15460af63aff75cab2d8
2019-05-17 14:43:14 -07:00
davidriazati
e8fb5f35f0 Bump torch proto version (#20444)
Summary:
Tagging along with the changes in #20191, which added more support for types in the pickler
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20444

Pulled By: driazati

Differential Revision: D15321463

fbshipit-source-id: 985061bf5070a7d7bad58ea8db11d531f3d13e74
2019-05-13 18:32:16 -07:00
Michael Suo
a25b79531c use fully qualified name for ScriptClasses (#19239)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19239
ghimport-source-id: 830aad6dc11d2a7247760a9c7c9fc8556f70a706

Differential Revision: D14928293

Reviewed By: eellison

Pulled By: suo

fbshipit-source-id: d2efa5d7f7397526083278d6650b9cee8d967b1a
2019-04-26 19:17:21 -07:00
David Riazati
3d44305e9d Attribute serialization (#17423)
Summary:
Allows serialization/loading of attributes (`IValue`s of any type).
* metadata (attribute name, type) is stored in the `model.json`
* The binary format is a subset of the `pickle` format, supporting the operations necessary for `IValue`s
    * Attributes are serialized in the order they are defined on a module to a list in a single `attributes` file, with submodule attributes coming first. This order directly matches the order attributes are listed in `model.json`
    * This can be inspected in Python with `pickle.load()` or with `pickletools` (PyTorch need not be installed for this to work)
        * A class is used to store a tensor's index into the tensor table of the model, so to unpickle the file you have to use a custom Unpickler:
        ```python
        import pickle

        class TensorID(object):
            def __setstate__(self, id):
                self.id = id

        class JitUnpickler(pickle.Unpickler):
            def find_class(self, module, name):
                if module == '__main__' and name == 'TensorID':
                    return TensorID
                # fall back to the normal lookup for any other global
                return super().find_class(module, name)

        JitUnpickler(open("my_model/attributes.pkl", "rb")).load()
        ```
    * pickle format: https://svn.python.org/projects/python/trunk/Lib/pickletools.py
* It currently does not support/guarantee that anything saved out with `pickle` directly (i.e. if you edit `attributes` with `pickle` yourself instead of using our tools) will be imported correctly

Also will fix #17683 and fix #16367

Followup Work:
* document format / choice of pickle: #17951
* create an example
* list specializations
* int size specializations, large binputs
* do a first pass over attributes to output only necessary `BINPUT` ops
* attribute reassignment (e.g. `self.my_attribute = new_value`)
* `tensor.save("some_checkpoint.pkl")` support with tensors embedded in Pickle file
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17423

Differential Revision: D14470965

Pulled By: driazati

fbshipit-source-id: 6a21a9939efdbe59b4bc57fd31d6d630bab5297e
2019-03-18 18:18:22 -07:00
Michael Suo
18f721fb9a support serialization of classes (#17856)
Summary:
Stack:
      **#17856 [jit] support serialization of classes**  [💛](https://our.intern.facebook.com/intern/diff/D14402599/)

Add support for saving/loading TorchScript modules that depend on user-defined classes.

We track class dependencies the same way we track tensor constants, then write them
all out so that we can just compile them in order before compiling the module
hierarchy.
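
A minimal sketch of the kind of dependency this enables (assuming the present-day `torch.jit.script` API; the class and module names are illustrative):

```python
import torch

@torch.jit.script
class Pair(object):
    # a user-defined TorchScript class that the module below depends on
    def __init__(self, a: int, b: int):
        self.a = a
        self.b = b

    def total(self) -> int:
        return self.a + self.b

class UsesPair(torch.nn.Module):
    def forward(self, x: int) -> int:
        return Pair(x, 2).total()

m = torch.jit.script(UsesPair())
m.save("uses_pair.pt")                     # the Pair definition is written out with the module
print(torch.jit.load("uses_pair.pt")(3))   # 5
```
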
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17856

Reviewed By: shannonzhu

Differential Revision: D14461599

Pulled By: suo

fbshipit-source-id: 7115f87e069fd00dc8381d7de9997864fef7ea9f
2019-03-15 12:06:23 -07:00
Pritam Damania
c3f5ba9460 PyTorch model metadata. (#16275)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16275

Adding a generic string `metadata` field as part of the model to capture additional metadata with the model.

Reviewed By: dzhulgakov

Differential Revision: D13579029

fbshipit-source-id: 7456ef2edbe73bb70bbb31889cecd94e0db329a2
2019-02-13 19:48:11 -08:00
Pritam Damania
90aa21e795 Metadata for input/output formats in model file proto. (#15252)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15252

We would like to extend the model file format to include strongly typed, semantic information
about the model inputs and outputs.

The goal is for a user to be able to consider a model file like a function with
a well defined API describing what the inputs and outputs would be.

Reviewed By: dzhulgakov

Differential Revision: D13009915

fbshipit-source-id: 5df124a876ad03c05fbdaacae0eab659637734c1
2018-12-21 17:42:38 -08:00
Lu Fang
e0f68671bd Restore device when import jit script module (#14454)
Summary:
We align the restore logic with `torch.load`: we try to restore tensors to the device they were saved on, and if that device is not available, an exception is raised. We allow the user to remap devices through a `map_location` parameter, which can be 1) a string like `'cuda:0'` or `'cpu'`, 2) a device, e.g. `torch.device('cpu')`, 3) a dict, e.g. `{'cuda:1': 'cuda:0'}`, or 4) a function whose signature looks like `string map_location(tensor, saved_device_string)`.
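
A rough usage sketch, assuming the `torch.jit.load` signature described above (the file name is illustrative):

```python
import torch

# load a TorchScript archive saved on a GPU machine onto a CPU-only host
m = torch.jit.load("model.pt", map_location="cpu")

# equivalent forms: a torch.device, or a {saved_device: target_device} dict
m = torch.jit.load("model.pt", map_location=torch.device("cpu"))
m = torch.jit.load("model.pt", map_location={"cuda:1": "cuda:0"})
```
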
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14454

Reviewed By: zrphercule

Differential Revision: D13271956

Pulled By: houseroad

fbshipit-source-id: dfd6b6049b0dc07549ddeddf2dea03ac53ba6d49
2018-12-03 14:10:30 -08:00
Zachary DeVito
170ff7764f Use a zip archive as our container format (#14521)
Summary:
After consulting with Owen, who pointed out the existence of the miniz library, I decided to take one last shot at using zip as our container format.
miniz makes this surprisingly feasible and I think the benefits of using zip are large enough that we should do it.

This replaces our custom container format with a zip archive, preserving all of the
desirable features of our custom format, such as append-oriented writing, and
mmap'able tensor data while adding a bunch of debugging advantages:

1. You can unzip and explore the container to debug what is going on with a model.
2. You can edit the model using a text editor (e.g. change the definition of a method,
   or edit the json-serialized meta-data), re-zip the file using OSX's native 'Compress'
   option, and re-load the result into pytorch. Note: this enables you to, e.g., print-debug
   serialized models.
3. We can easily enable features like compression in the future.
4. Stock python, without pytorch installed, and other programming languages
   can reasonably consume this format using the json and zipfile packages, which enables
   people to build tools like visualizers without those visualizers depending on pytorch.
   This will be especially useful if you want to, for instance, write a visualizer in javascript
   (see the sketch after this list).
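
A minimal sketch of that kind of inspection with stock Python (the archive name is illustrative; entry names vary with the model):

```python
import zipfile

# list every entry in a saved TorchScript archive; PyTorch is not required
with zipfile.ZipFile("my_model.pt") as archive:
    for info in archive.infolist():
        print(info.filename, info.file_size)
```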

Notes:

*  This adds miniz (https://github.com/richgel999/miniz) as a dependency. miniz is a self-contained
   library for reading/writing zipfiles that, unlike other zip libraries, also includes zlib-compatible
   compress/decompress support. It is a single header and a single C file without
   any other dependencies. Note that the instructions for miniz explicitly state:

   > Please use the files from the releases page in your projects. Do not use the git checkout directly!

   So we have checked in the 'release' source. Miniz supports zip64, and its API is amenable
   to doing zip-align style things to align data.

*  Removes 'size' from RecordRef. This allows you to edit files in the zip archive without
   editing the meta-data file. Very important if you want to print-debug serialized models.

*  PyTorchStreamReader/PyTorchStreamWriter keep mostly the same API (though keys become strings)
   However, their implementation is completely swapped out to use miniz.

*  Code exists to check for the old magic number to give a decent warning to our preview users
   after we change the format.

*  Container version information is now put in a stand-alone 'version' file in the archive
   and serves a similar purpose to the other container version info.

*  All files in the zip archive start at 64-byte boundaries, using an approach similar to
   zip-align. Tests check that this property remains true. While the writer does this,
   the reader doesn't depend on it, allowing user-created archives that can use compression,
   and do not have to align data.

*  Added test to check for > 4GB files and archives. Disabled by default because it takes
   almost 2 minutes to run.

*  torchscript files are now optional: if a submodule does not have methods, it will
   not be written.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14521

Reviewed By: jamesr66a

Differential Revision: D13252945

Pulled By: zdevito

fbshipit-source-id: 01209294c0f6543d0fd716f85a38532249c52f8c
2018-11-30 19:19:29 -08:00
Zachary DeVito
fd31eae9ad Switch import/export to python printing (#14400)
Summary:
Stacked on https://github.com/pytorch/pytorch/pull/14378, only look at the last commit.

This changes the way methods are defined in TorchScript archives to use
PythonPrint rather than ONNX protobufs.

It also updates torch.proto to directly document the tensor data
structure actually being serialized.
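
As a rough illustration of what python printing produces (not taken from the PR; the module is illustrative), the printed source of a scripted module is exposed today as its `.code` attribute:

```python
import torch

class M(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2 + 1

m = torch.jit.script(M())
# under this scheme, this python-printed source (rather than an ONNX-style
# protobuf graph) is what gets written into the archive
print(m.code)
```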

Notes:
* because PythonPrint prints all the methods at once per module, this
  removes MethodDef in favor of a single torchscript_area and a separate
  caffe2_graphs entry. Note that NetDefs already have method names,
  so there is no need for a separate method name entry.
* This switches cpp/pickle area to RecordRef (references to a file in
  the container format) since it is possible the data in these arenas
  may be large and not suited to json output.
* Removes 'annotations' -- annotations should be re-added on the first
  commit that actually has a practical use for them. In the current state
  it is unlikely they are representing the right information.
* Some expect files have changed because PythonPrint is preserving more
  debug name information for parameter names.
* MethodEncoder (the ONNX output format) has been deleted. There is still
  some cleanup possible combining EncoderBase and GraphEncode now that there
  is only a single pathway using EncoderBase.
* This incorporates the changes from #14397
  to define TensorDef
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14400

Reviewed By: suo

Differential Revision: D13231800

Pulled By: zdevito

fbshipit-source-id: af5c1152d0bd6bca8b06c4703f59b161bb19f571
2018-11-29 17:53:49 -08:00
Lu Fang
7a654617eb Add tensor table in ModelDef and use it for jit script serialization and deserialization (#13861)
Summary:
As we discussed, the tensors in the torch script will be associated with the tensor data in the serialized file. So let's add a table of tensors (actually a repeated TensorProto field) to the ModelDef. TensorProto.name will be the id.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/13861

Reviewed By: dzhulgakov

Differential Revision: D13036940

Pulled By: zrphercule

fbshipit-source-id: ecb91b062ac4bc26af2a8d6d12c91d5614efd559
2018-11-20 23:37:50 -08:00
Lu Fang
f34c848f52 Store the optimize flag in module (#14166)
Summary:
When saving/loading a script module, we store the optimize flag in the module instead of encoding it in each method.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/14166

Reviewed By: ezyang

Differential Revision: D13117577

Pulled By: dzhulgakov

fbshipit-source-id: dc322948bda0ac5809d8ef9a345497ebb8f33a61
2018-11-19 14:34:05 -08:00
Lu Fang
e2a7d43dfd Use the torch.proto to store script module (#13736)
Summary:
Directly operate on the protobuf in the serializer/deserializer.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13736

Reviewed By: dzhulgakov

Differential Revision: D13028487

Pulled By: houseroad

fbshipit-source-id: e578474008874f00f2a22f0a2ffd85f52643881a
2018-11-14 00:22:09 -08:00
Lu Fang
30aaa07594 New serialization format (#12384)
Summary:
Addressed Dima's feedback.

The proposal is here: https://fb.quip.com/TbQmAuqIznCf
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12384

Reviewed By: dzhulgakov

Differential Revision: D10246743

Pulled By: houseroad

fbshipit-source-id: c80db0c35d60ca32965275da705f2b1dfb2a7265
2018-10-16 16:36:58 -07:00
Dmytro Dzhulgakov
1d3f650ce4 Revert D10098106: [pytorch][PR] [WIP] New version of PT1 model format
Differential Revision:
D10098106

Original commit changeset: 94ec7fc57c84

fbshipit-source-id: 38f729b0970618f38359797b806cbbcd865f4715
2018-10-02 00:43:40 -07:00
Lu Fang
35becd1879 New version of PT1 model format (#12149)
Summary:
Considered four different existing formats: 1) static graph, 2) torch script, 3) pickle files, 4) PyTorch C++ serialize APIs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12149

Reviewed By: BIT-silence

Differential Revision: D10098106

Pulled By: houseroad

fbshipit-source-id: 94ec7fc57c842e50fae5286ddeda657a4967a07a
2018-10-01 15:57:02 -07:00
Roy Li
30521a37ad codemod: caffe::float16 -> at::Half (#11785)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11785

Replace each instance of float16 with Half.

Reviewed By: Yangqing

Differential Revision: D9892158

fbshipit-source-id: b9225ca7bd5c84fd1c04a9d24b026c8b6cbff120
2018-09-20 18:55:19 -07:00
Lu Fang
32494c226e OperatorDef <==> NodeProto Conversion (#11621)
Summary:
Operator-level proto conversion between the (new) torch proto and the (old) caffe2 proto.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11621

Reviewed By: BIT-silence

Differential Revision: D9892422

Pulled By: houseroad

fbshipit-source-id: 01a55ec0a09479876a27082d90fc970723f4d431
2018-09-19 08:41:33 -07:00
Lu Fang
727a4453aa New Serialization Proto
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11166

Reviewed By: mingzhe09088

Differential Revision: D9623522

Pulled By: houseroad

fbshipit-source-id: f21153034a398de7959404321d8534234cd58a40
2018-09-11 10:55:43 -07:00