pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Michael Suo	77c08aa46c	serialize modules as classes Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23098 Test Plan: Imported from OSS Differential Revision: D16383328 Pulled By: suo fbshipit-source-id: 36389b8e45c3febb7f224cd9c630fe643fa90bef	2019-08-11 15:50:29 -07:00
Supriya Rao	9223fa1c46	Add support to serialize qtensor in JIT. (#23356 ) Summary: Adds qtensor specific fields to the proto file so that they get serialized into the model.json Pull Request resolved: https://github.com/pytorch/pytorch/pull/23356 ghstack-source-id: 87263428 Differential Revision: D16473237 fbshipit-source-id: bf5b51d0863d036d30a1644a3c3b74516468224b	2019-07-26 15:52:15 -07:00
James Reed	2c2a913a4f	Preserve SourceRanges across serialization (#22179 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22179 ghimport-source-id: 9879551127da09d78ca348b9e436db5a09a92a38 Test Plan: Imported from OSS Differential Revision: D15981423 Pulled By: jamesr66a fbshipit-source-id: a2506f5a2f05916b6e8226841b0229110e758671	2019-07-01 21:14:35 -07:00
davidriazati	cd28ff5395	Add support for __getstate__/__setstate__ on module (#20242 ) Summary: Adds support for `__getstate__` and `__setstate__` on modules that are called as part of export (`torch.save()`) and import (`torch.jit.load`). * `__getstate__` and `__setstate__` must be TorchScript functions with the signatures `() -> T` and `(T) -> None` respectively * The results of `__getstate__` are stored using the pickler in `states.pkl` with one for each module in definition order (`__getstate__` returns `None` by default if an implementation is not provided) * This prevents sharing between `__getstate__` and attributes, but this should be fine since their use is mostly unrelated (attributes are for storing values to be used in script methods, `__getstate__` for running arbitrary computations during import) Follow up * Somehow replacing `__getstate__`/`__setstate__` with a `ScriptMethodStub` makes `MyScriptModule().__getstate__()` call `ScriptModule.__getstate__()` when used in Python. This should be fixed so semantics in Python are preserved, but it doesn't affect the typical usage. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20242 Pulled By: driazati Differential Revision: D15287161 fbshipit-source-id: b3f5f33ab74a21a89e6d15460af63aff75cab2d8	2019-05-17 14:43:14 -07:00
davidriazati	e8fb5f35f0	Bump torch proto version (#20444 ) Summary: Tagging along to changes in #20191 which added more support for types in the pickler Pull Request resolved: https://github.com/pytorch/pytorch/pull/20444 Pulled By: driazati Differential Revision: D15321463 fbshipit-source-id: 985061bf5070a7d7bad58ea8db11d531f3d13e74	2019-05-13 18:32:16 -07:00
Michael Suo	a25b79531c	use fully qualified name for ScriptClasses (#19239 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19239 ghimport-source-id: 830aad6dc11d2a7247760a9c7c9fc8556f70a706 Differential Revision: D14928293 Reviewed By: eellison Pulled By: suo fbshipit-source-id: d2efa5d7f7397526083278d6650b9cee8d967b1a	2019-04-26 19:17:21 -07:00
David Riazati	3d44305e9d	Attribute serialization (#17423 ) Summary: Allows serialization/loading of attributes (`IValue`s of any type). * metadata (attribute name, type) is stored in the `model.json` * The binary format is a subset of the `pickle` module that supports the operations necessary for `IValue`s * Attributes are serialized in the order they are defined on a module to a list in a single `attributes` file, with submodule attributes coming first. This order directly matches the order attributes are listed in `model.json` * This can be inspected in Python with `pickle.load()` or with `pickletools` (PyTorch need not be installed for this to work) * A class is used to store a tensor's index into the tensor table of the model, so to unpickle the file you have to use a custom Unpickler: ```python class TensorID(object): def __setstate__(self, id): self.id = id class JitUnpickler(pickle.Unpickler): def find_class(self, module, name): if module == '__main__' and name == 'TensorID': return TensorID JitUnpickler(open("my_model/attributes.pkl", "rb")).load() ``` * pickle format: https://svn.python.org/projects/python/trunk/Lib/pickletools.py * It currently does not support/guarantee that anything saved out with `pickle` (i.e. if you edit `attributes` with `pickle` directly) instead of our tools will be imported correctly Also will fix #17683 and fix #16367 Followup Work: * document format / choice of pickle: #17951 * create an example * list specializations * int size specializations, large binputs * do a first pass over attributes to output only necessary `BINPUT` ops * attribute reassignment (e.g. `self.my_attribute = new_value`) * `tensor.save("some_checkpoint.pkl")` support with tensors embedded in Pickle file Pull Request resolved: https://github.com/pytorch/pytorch/pull/17423 Differential Revision: D14470965 Pulled By: driazati fbshipit-source-id: 6a21a9939efdbe59b4bc57fd31d6d630bab5297e	2019-03-18 18:18:22 -07:00
Michael Suo	18f721fb9a	support serialization of classes (#17856 ) Summary: Add support for saving/loading TorchScript modules that depend on user-defined classes. We track class dependencies the same way we track tensor constants, then write them all out such that we can just compile them in order before compiling the module hierarchy. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17856 Reviewed By: shannonzhu Differential Revision: D14461599 Pulled By: suo fbshipit-source-id: 7115f87e069fd00dc8381d7de9997864fef7ea9f	2019-03-15 12:06:23 -07:00
Pritam Damania	c3f5ba9460	PyTorch model metadata. (#16275 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16275 Adding a generic string `metadata` field as part of the model to capture additional metadata with the model. Reviewed By: dzhulgakov Differential Revision: D13579029 fbshipit-source-id: 7456ef2edbe73bb70bbb31889cecd94e0db329a2	2019-02-13 19:48:11 -08:00
Pritam Damania	90aa21e795	Metadata for input/output formats in model file proto. (#15252 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15252 We would like to extend the model file format to include strongly typed, semantic information about the model inputs and outputs. The goal is for a user to be able to consider a model file like a function with a well-defined API describing what the inputs and outputs would be. Reviewed By: dzhulgakov Differential Revision: D13009915 fbshipit-source-id: 5df124a876ad03c05fbdaacae0eab659637734c1	2018-12-21 17:42:38 -08:00
Lu Fang	e0f68671bd	Restore device when import jit script module (#14454 ) Summary: We align the restore logic with `torch.load`: we try to restore to the right device, and if the device is not available, an exception is raised. We allow the user to remap the device through a parameter `map_location`, which can be 1) a string like `'cuda:0'` or `'cpu'`, 2) a device, e.g. `torch.device('cpu')`, 3) a dict, e.g. `{'cuda:1': 'cuda:0'}`, or 4) a function whose signature looks like `string map_location(tensor, saved_device_string)`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14454 Reviewed By: zrphercule Differential Revision: D13271956 Pulled By: houseroad fbshipit-source-id: dfd6b6049b0dc07549ddeddf2dea03ac53ba6d49	2018-12-03 14:10:30 -08:00
Zachary DeVito	170ff7764f	Use a zip archive as our container format (#14521 ) Summary: After consulting with Owen, who pointed out the existence of the miniz library, I decided to take one last shot at using zip as our container format. miniz makes this surprisingly feasible and I think the benefits of using zip are large enough that we should do it. This replaces our custom container format with a zip archive, preserving all of the desirable features of our custom format, such as append-oriented writing, and mmap'able tensor data while adding a bunch of debugging advantages: 1. You can unzip and explore the container to debug what is going on with a model. 2. You can edit the model using a text editor (e.g. change the definition of a method, or edit the JSON-serialized meta-data), re-zip the file using OSX's native 'Compress' option, and re-load the result into pytorch. Note: this enables you to, e.g., print-debug serialized models. 3. We can easily enable features like compression in the future. 4. Stock Python, without pytorch installed, and other programming languages can reasonably consume this format, using json and zipfile packages, which enables people to build tools like visualizers without those visualizers depending on pytorch. This will be especially useful if you want to, for instance, write a visualizer in javascript. Notes: * This adds miniz (https://github.com/richgel999/miniz) as a dependency. miniz is a self-contained library for reading/writing zipfiles that unlike other zip libraries also includes libz compatible compress/decompress support. It is a single header and a single C file without any other dependencies. Note that the instructions for miniz explicitly state: > Please use the files from the releases page in your projects. Do not use the git checkout directly! So we have checked in the 'release' source. Miniz supports zip64, and its API is amenable to doing zip-align style things to align data. * Removes 'size' from RecordRef.
This allows you to edit files in the zip archive without editing the meta-data file. Very important if you want to print-debug serialized models. * PyTorchStreamReader/PyTorchStreamWriter keep mostly the same API (though keys become strings). However, their implementation is completely swapped out to use miniz. * Code exists to check for the old magic number to give a decent warning to our preview users after we change the format. * Container version information is now put in a stand-alone 'version' file in the archive and serves a similar purpose to the other container version info. * All files in the zip archive start at 64-byte boundaries, using an approach similar to zip-align. Tests check that this property remains true. While the writer does this, the reader doesn't depend on it, allowing user-created archives that can use compression, and do not have to align data. * Added test to check for > 4GB files and archives. Disabled by default because it takes almost 2 minutes to run. * torchscript files are now optional: if a submodule does not have methods, it will not be written. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14521 Reviewed By: jamesr66a Differential Revision: D13252945 Pulled By: zdevito fbshipit-source-id: 01209294c0f6543d0fd716f85a38532249c52f8c	2018-11-30 19:19:29 -08:00
Zachary DeVito	fd31eae9ad	Switch import/export to python printing (#14400 ) Summary: Stacked on https://github.com/pytorch/pytorch/pull/14378, only look at the last commit. This changes the way methods are defined in TorchScript archives to use PythonPrint rather than ONNX protobufs. It also updates torch.proto to directly document the tensor data structure actually being serialized. Notes: * because PythonPrint prints all the methods at once per module, this removes MethodDef in favor of a single torchscript_arena and a separate caffe2_graphs entry. Note that NetDefs already have method names, so there is no need for a separate method name entry. * This switches the cpp/pickle arenas to RecordRef (references to a file in the container format) since it is possible the data in these arenas may be large and not suited to JSON output. * Removes 'annotations' -- annotations should be re-added on the first commit that actually has a practical use for them. In the current state it is unlikely they are representing the right information. * Some expect files have changed because PythonPrint is preserving more debug name information for parameter names. * MethodEncoder (the ONNX output format) has been deleted. There is still some cleanup possible combining EncoderBase and GraphEncode now that there is only a single pathway using EncoderBase. * This incorporates the changes from #14397 to define TensorDef Pull Request resolved: https://github.com/pytorch/pytorch/pull/14400 Reviewed By: suo Differential Revision: D13231800 Pulled By: zdevito fbshipit-source-id: af5c1152d0bd6bca8b06c4703f59b161bb19f571	2018-11-29 17:53:49 -08:00
Lu Fang	7a654617eb	Add tensor table in ModelDef and use it for jit script serialization and deserialization (#13861 ) Summary: As we discussed, the tensors in the torch script will be associated with the tensor data in the serialized file. So let's add a table of tensors (actually it's a repeated TensorProto field) in the ModelDef. TensorProto.name will be the id. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13861 Reviewed By: dzhulgakov Differential Revision: D13036940 Pulled By: zrphercule fbshipit-source-id: ecb91b062ac4bc26af2a8d6d12c91d5614efd559	2018-11-20 23:37:50 -08:00
Lu Fang	f34c848f52	Store the optimize flag in module (#14166 ) Summary: When saving/loading a script module, we store the optimize flag in the module instead of encoding it in the method. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14166 Reviewed By: ezyang Differential Revision: D13117577 Pulled By: dzhulgakov fbshipit-source-id: dc322948bda0ac5809d8ef9a345497ebb8f33a61	2018-11-19 14:34:05 -08:00
Lu Fang	e2a7d43dfd	Use the torch.proto to store script module (#13736 ) Summary: Operate directly on the protobuf in the serializer/deserializer. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13736 Reviewed By: dzhulgakov Differential Revision: D13028487 Pulled By: houseroad fbshipit-source-id: e578474008874f00f2a22f0a2ffd85f52643881a	2018-11-14 00:22:09 -08:00
Lu Fang	30aaa07594	New serialization format (#12384 ) Summary: Addressed Dima's feedback. The proposal is here: https://fb.quip.com/TbQmAuqIznCf Pull Request resolved: https://github.com/pytorch/pytorch/pull/12384 Reviewed By: dzhulgakov Differential Revision: D10246743 Pulled By: houseroad fbshipit-source-id: c80db0c35d60ca32965275da705f2b1dfb2a7265	2018-10-16 16:36:58 -07:00
Dmytro Dzhulgakov	1d3f650ce4	Revert D10098106: [pytorch][PR] [WIP] New version of PT1 model format Differential Revision: D10098106 Original commit changeset: 94ec7fc57c84 fbshipit-source-id: 38f729b0970618f38359797b806cbbcd865f4715	2018-10-02 00:43:40 -07:00
Lu Fang	35becd1879	New version of PT1 model format (#12149 ) Summary: Considered four different existing formats: 1) static graph, 2) torch script, 3) pickle files, 4) PyTorch C++ serialize APIs Pull Request resolved: https://github.com/pytorch/pytorch/pull/12149 Reviewed By: BIT-silence Differential Revision: D10098106 Pulled By: houseroad fbshipit-source-id: 94ec7fc57c842e50fae5286ddeda657a4967a07a	2018-10-01 15:57:02 -07:00
Roy Li	30521a37ad	codemod: caffe::float16 -> at::Half (#11785 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11785 Replace each instance of float16 with Half. Reviewed By: Yangqing Differential Revision: D9892158 fbshipit-source-id: b9225ca7bd5c84fd1c04a9d24b026c8b6cbff120	2018-09-20 18:55:19 -07:00
Lu Fang	32494c226e	OperatorDef <==> NodeProto Conversion (#11621 ) Summary: Operator level proto conversion between (new) torch proto and (old) caffe2 proto. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11621 Reviewed By: BIT-silence Differential Revision: D9892422 Pulled By: houseroad fbshipit-source-id: 01a55ec0a09479876a27082d90fc970723f4d431	2018-09-19 08:41:33 -07:00
Lu Fang	727a4453aa	New Serialization Proto Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11166 Reviewed By: mingzhe09088 Differential Revision: D9623522 Pulled By: houseroad fbshipit-source-id: f21153034a398de7959404321d8534234cd58a40	2018-09-11 10:55:43 -07:00
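
Commit 170ff7764f notes that stock Python, without pytorch installed, can consume the zip container using the json and zipfile packages. The following is a minimal sketch of that idea, not the actual archive layout: the `demo/` prefix, the JSON key `note`, and the helper names `build_demo_archive` and `summarize` are assumptions made for illustration, and the archive is a tiny stand-in built in memory rather than a real saved model.

```python
import io
import json
import zipfile


def build_demo_archive():
    """Write a tiny stand-in archive to memory (entry names are assumed)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("demo/version", "1\n")
        zf.writestr("demo/model.json", json.dumps({"note": "placeholder"}))
    buf.seek(0)
    return buf


def summarize(archive):
    """Return the entry names, the version string, and the parsed JSON metadata."""
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
        version_name = next(n for n in names if n.endswith("/version"))
        json_name = next(n for n in names if n.endswith("model.json"))
        version = zf.read(version_name).decode().strip()
        meta = json.loads(zf.read(json_name))
    return names, version, meta


names, version, meta = summarize(build_demo_archive())
print(names, version, meta)
```

Passing a file path instead of the in-memory buffer to `summarize` should work the same way, provided the archive contains entries with those names.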

22 Commits
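
The custom unpickler quoted inline in commit 3d44305e9d can be laid out as a runnable sketch. Two things here are assumptions rather than part of the commit: the fallback to `super().find_class` (the quoted snippet returns `None` for any other class), and the hand-assembled `SAMPLE` bytes, which stand in for a real `attributes.pkl` and encode a single `TensorID` whose state is 5. The helper name `load_attributes` is also hypothetical.

```python
import io
import pickle


class TensorID:
    """Placeholder holding a tensor's index into the model's tensor table."""

    def __setstate__(self, id):
        self.id = id


class JitUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Map the pickled reference to our local stand-in class.
        if module == "__main__" and name == "TensorID":
            return TensorID
        # Assumption: defer to the default lookup for everything else.
        return super().find_class(module, name)


def load_attributes(stream):
    """Unpickle one object from a binary stream using JitUnpickler."""
    return JitUnpickler(stream).load()


# Hand-assembled protocol-2 pickle (illustrative only):
# PROTO 2, GLOBAL __main__.TensorID, EMPTY_TUPLE, NEWOBJ, BININT1 5, BUILD, STOP
SAMPLE = b"\x80\x02c__main__\nTensorID\n)\x81K\x05b."

tensor_ref = load_attributes(io.BytesIO(SAMPLE))
print(tensor_ref.id)
```

For a real file, an opened binary stream (for example from `open(path, "rb")` inside a `with` block) would replace the `io.BytesIO` wrapper.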