pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Mikayla Gawarecki	6fa2d41dc7	Add mmap option to `torch.load` (#102549 ) Using [`nanoGPT/model.py`](https://github.com/karpathy/nanoGPT/blob/master/model.py) run <details><summary><b>Click for script to save gpt2-xlarge (1.5B params)</b></summary> ``` # test_load_save_gpt.py from model import GPT import torch import time torch.manual_seed(5) # gpt2-xlarge 1558M parameters class GPTConfig: block_size: int = 1024 vocab_size: int = 50304 # GPT-2 vocab_size of 50257, padded up to nearest multiple of 64 for efficiency n_layer: int = 48 n_head: int = 25 n_embd: int = 1600 dropout: float = 0.0 bias: bool = True # True: bias in Linears and LayerNorms, like GPT-2. False: a bit better and faster def f(): model = GPT(GPTConfig()) state_dict = model.state_dict() start_saving = time.time() torch.save(state_dict, "gpt2-xlarge.pth") end_saving = time.time() if __name__ == "__main__": f() ``` </details> <details><summary><b>Click for script to load</b></summary> ``` # test_load_gpt.py import torch from model import GPT from test_load_save_gpt import GPTConfig import time import argparse def f(mmap, meta): device = 'meta' if meta else 'cpu' assign = True if meta else False with torch.device(device): model = GPT(GPTConfig()) start_loading = time.time() loaded_state_dict = torch.load("gpt2-xlarge.pth", _mmap=mmap) end_loading = time.time() print(f"loading time using torch.load with mmap={mmap}: ", end_loading - start_loading) model.load_state_dict(loaded_state_dict, assign=assign) end_load_state_dict = time.time() print("load_state_dict time: ", end_load_state_dict - end_loading) model.cuda() end_cuda = time.time() print("cuda time using torch.load with mmap: ", end_cuda - end_load_state_dict) if __name__ == "__main__": parser = argparse.ArgumentParser(prog='load_gpt_xlarge') parser.add_argument('-m', '--mmap', action='store_true') parser.add_argument('-d', '--devicemeta', action='store_true') args = parser.parse_args() mmap = args.mmap meta = args.devicemeta f(mmap, meta) ``` 
</details> `python test_load_gpt.py` <img width="614" alt="Screenshot 2023-06-06 at 1 35 43 PM" src="https://github.com/pytorch/pytorch/assets/35276741/ee06e5b3-b610-463b-a867-df995d21af29"> `python test_load_gpt.py --mmap` <img width="622" alt="Screenshot 2023-06-06 at 1 35 30 PM" src="https://github.com/pytorch/pytorch/assets/35276741/00d2fdd0-b1f5-4313-83dc-e540b654b2af"> If we further use the `with torch.device('meta')` context manager and pull the changes from https://github.com/pytorch/pytorch/pull/102212 that allow the model to reuse tensors from the state_dict, we have `python test_load_gpt.py --mmap --devicemeta` <img width="727" alt="Screenshot 2023-06-06 at 1 35 51 PM" src="https://github.com/pytorch/pytorch/assets/35276741/b50257d9-092a-49c3-acae-876ee44d009f"> \ \ Running the above in a docker container containing a build of PyTorch with RAM limited to 512mb by 1) running `make -f docker.Makefile` from `pytorch/` directory 2) `docker run -m 512m -it <image> bash` 3) docker cp `gpt2-xlarge.pth` and `test_load_gpt.py` into the image `python test_load_gpt.py` Docker will Kill the process due to OOM whereas `python test_load_gpt.py --mmap --devicemeta` <img width="635" alt="Screenshot 2023-06-06 at 1 55 48 PM" src="https://github.com/pytorch/pytorch/assets/35276741/f3820d9e-f24c-43e7-885b-3bfdf24ef8ad"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/102549 Approved by: https://github.com/albanD	2023-06-09 15:49:58 +00:00
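The memory saving reported above comes from mapping the checkpoint file into the address space instead of reading it into RAM up front. The following is a minimal standard-library sketch of that idea only; it is not how `torch.load` implements its mmap option, and the helper `read_slice_mmap` and the temporary file are hypothetical stand-ins.

```python
import mmap
import os
import tempfile


def read_slice_mmap(path, start, length):
    """Return `length` bytes at offset `start` without reading the whole file."""
    with open(path, "rb") as fh:
        # Length 0 maps the entire file; the OS pages data in lazily on access.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return bytes(mapped[start:start + length])


# Hypothetical stand-in for a large checkpoint file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"header" + b"\x00" * 1024 + b"payload")
    checkpoint_path = tmp.name

chunk = read_slice_mmap(checkpoint_path, 6 + 1024, 7)
os.remove(checkpoint_path)
```

Only the pages that back the requested slice need to be resident, which is why a memory-limited process can work with a file larger than its RAM budget.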
atannous	b469ed72d0	Integrating new API usage metadata logger (#101762 ) Summary: The new logger allows passing metadata into the API usage logger. The immediate use case is to pass the serialization_id to the save and load events to enable tracking of serialized models in API events. It could be extended to add more metadata in the future. Test Plan: ``` buck2 test @//mode/dev //caffe2/caffe2/serialize:inline_container_test ``` Reviewed By: davidberard98 Differential Revision: D45683697 Pull Request resolved: https://github.com/pytorch/pytorch/pull/101762 Approved by: https://github.com/davidberard98	2023-05-26 00:24:26 +00:00
XDaoHong	a723f1f2b9	fix _privateuse1_tag problem (#100632 ) Fix _privateuse1_tag bug in torch/serialization.py Add device_index after device_type. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100632 Approved by: https://github.com/ezyang	2023-05-10 09:53:19 +00:00
Rob Guo	111358de19	Support non-ASCII characters in model file paths (#99453 ) Fixes #98918 Pull Request resolved: https://github.com/pytorch/pytorch/pull/99453 Approved by: https://github.com/albanD, https://github.com/malfet	2023-04-26 01:15:49 +00:00
Aleksei Nikiforov	87a2af6d4a	Fix loading data on different encoding (#94503 ) Add endianness marker when saving, and if it doesn't match host endianness when loading data, do a byteswap. Older data will load correctly only on systems with same endianness it was saved on. New data should load correctly on systems with any endianness. Fixes #65300 Pull Request resolved: https://github.com/pytorch/pytorch/pull/94503 Approved by: https://github.com/kurtamohler, https://github.com/ezyang	2023-04-25 21:05:20 +00:00
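The marker-plus-byteswap scheme described in this commit can be sketched with the standard library. This is an illustrative sketch under assumptions, not the format PyTorch writes: the one-byte marker, the 32-bit integer payload, and the helper names `save_values` and `load_values` are all hypothetical.

```python
import struct
import sys


def save_values(values):
    """Serialize ints in host byte order, prefixed by a one-byte endianness marker."""
    marker = b"<" if sys.byteorder == "little" else b">"
    # "=" means native byte order with standard sizes (4 bytes per "i").
    payload = struct.pack(f"={len(values)}i", *values)
    return marker + payload


def load_values(blob):
    """Decode using the byte order recorded in the marker.

    Decoding with the recorded order is equivalent to byteswapping
    whenever the marker differs from the host's byte order.
    """
    marker, payload = blob[:1].decode("ascii"), blob[1:]
    count = len(payload) // 4
    return list(struct.unpack(f"{marker}{count}i", payload))
```

Data written without a marker carries no such hint, which matches the commit's note that older files load correctly only on systems with the endianness they were saved on.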
XDaoHong	27f8eb8c2b	add storage serialization methods for privateuse1 (#98920 ) add entry for privateuse1 storage serialization register_package in _register_device_module. 1. Users only need to implement `privateuse1_tag` and `privateuse1_deserialize` in the device module of the open device. When the device module is registered, the methods are registered with _package_registry in storage serialization. 2. Provides a fixed sequence number 30 for privateuse1 in the storage serialization _package_registry list. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98920 Approved by: https://github.com/ezyang	2023-04-21 01:51:08 +00:00
Xuehai Pan	e6888697c4	Revisit `torch._six.string_classes` removal (#94709 ) (#97863 ) Revisit `torch._six.string_classes` (which is `(str, bytes)`) removal: `isinstance(obj, string_classes) -> isinstance(obj, str)`. Both `str` and `bytes` are `Sequence` classes. ```python In [1]: from typing import Sequence In [2]: issubclass(bytes, Sequence) Out[2]: True In [3]: issubclass(str, Sequence) Out[3]: True ``` Re-add `bytes` to type guards like: ```python def is_seq(obj): return isinstance(obj, Sequence) and not isinstance(obj, (str, bytes)) ``` Ref: - https://github.com/pytorch/pytorch/pull/94709#issuecomment-1487282912 - #97737 - #97789 Pull Request resolved: https://github.com/pytorch/pytorch/pull/97863 Approved by: https://github.com/Skylion007, https://github.com/albanD	2023-03-30 17:02:45 +00:00
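The reason `bytes` has to be re-added to such guards can be checked directly. The sketch below assumes `collections.abc.Sequence` in place of `typing.Sequence` (both work with `isinstance`); `is_seq_str_only` is a hypothetical name for the guard as it looked after `string_classes` was narrowed to `str`.

```python
from collections.abc import Sequence


def is_seq_str_only(obj):
    """Guard that excludes only `str`; `bytes` slips through as a sequence."""
    return isinstance(obj, Sequence) and not isinstance(obj, str)


def is_seq(obj):
    """Guard from the commit message: excludes both `str` and `bytes`."""
    return isinstance(obj, Sequence) and not isinstance(obj, (str, bytes))


samples = [[1, 2], (1, 2), "ab", b"ab"]
str_only_results = [is_seq_str_only(s) for s in samples]
fixed_results = [is_seq(s) for s in samples]
```

The two guards differ only on the `bytes` sample, which is the case the follow-up commit restores.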
Horace He	5bbec680d7	Fix usages of contextmanager without finally (#96170 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96170 Approved by: https://github.com/ngimel, https://github.com/malfet	2023-03-08 20:59:27 +00:00
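The class of bug this commit title refers to is a generator-based context manager whose cleanup is skipped when the `with` body raises. A minimal sketch of the corrected pattern follows; the `state` dictionary and `flag_enabled` are hypothetical names used only for illustration.

```python
from contextlib import contextmanager

state = {"flag": False}


@contextmanager
def flag_enabled():
    """Temporarily set the flag, restoring the old value on exit."""
    previous = state["flag"]
    state["flag"] = True
    try:
        yield
    finally:
        # Runs even if the with-body raises, so the old value is always restored.
        state["flag"] = previous


try:
    with flag_enabled():
        raise RuntimeError("body failed")
except RuntimeError:
    pass
```

Without the `try`/`finally`, the exception would propagate out of the `yield` and the line restoring `state["flag"]` would never run.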
Xuehai Pan	b005ec62b9	[BE] Remove dependency on `six` and `future` (#94709 ) Remove the Python 2 and 3 compatibility library [six](https://pypi.org/project/six) and [future](https://pypi.org/project/future) and `torch._six`. We only support Python 3.8+ now. It's time to retire them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94709 Approved by: https://github.com/malfet, https://github.com/Skylion007	2023-02-14 09:14:14 +00:00
Xuehai Pan	69e0bda999	[BE] Import `Literal`, `Protocol`, and `Final` from standard library `typing` as of Python 3.8+ (#94490 ) Changes: 1. `typing_extensions -> typing-extensions` in dependency. Use dash rather than underscore to fit the [PEP 503: Normalized Names](https://peps.python.org/pep-0503/#normalized-names) convention. ```python import re def normalize(name): return re.sub(r"[-_.]+", "-", name).lower() ``` 2. Import `Literal`, `Protocol`, and `Final` from the standard library as of Python 3.8+. 3. Replace `Union[Literal[XXX], Literal[YYY]]` with `Literal[XXX, YYY]`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94490 Approved by: https://github.com/ezyang, https://github.com/albanD	2023-02-09 19:17:49 +00:00
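The PEP 503 normalization quoted in this commit, together with the merged `Literal` form from change 3, can be exercised as follows. The alias names `Mode` and `ModeVerbose` are hypothetical and exist only for this sketch.

```python
import re
from typing import Literal, Union, get_args


def normalize(name):
    """PEP 503 name normalization, as quoted in the commit message."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Verbose spelling that the commit replaces.
ModeVerbose = Union[Literal["r"], Literal["w"]]
# Merged spelling that the commit prefers.
Mode = Literal["r", "w"]

normalized = normalize("typing_extensions")
mode_values = get_args(Mode)
```

Both aliases describe the same set of accepted values to a type checker; the merged form is simply shorter.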
Aaron Gokaslan	1e2d82b8e4	[BE] Merge isinstance calls together (#94419 ) Simplifies and speeds up isinstance calls by checking for multiple types at the same time. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94419 Approved by: https://github.com/ezyang	2023-02-09 00:47:26 +00:00
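A small sketch of the rewrite this commit applies: chained `isinstance` calls are replaced by one call with a tuple of types. The function names are hypothetical.

```python
def is_number_chained(value):
    """Before: one isinstance call per type."""
    return isinstance(value, int) or isinstance(value, float)


def is_number_merged(value):
    """After: a single isinstance call with a tuple of types."""
    return isinstance(value, (int, float))


samples = [1, 2.5, "3", None, True]
chained = [is_number_chained(s) for s in samples]
merged = [is_number_merged(s) for s in samples]
```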
Aaron Gokaslan	8fce9a09cd	[BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308 ) Apply parts of pyupgrade to torch (starting with the safest changes). This PR only does two things: removes the need to inherit from object and removes unused future imports. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308 Approved by: https://github.com/ezyang, https://github.com/albanD	2023-02-07 21:10:56 +00:00
joncrall	ad782ff7df	Enable xdoctest runner in CI for real this time (#83816 ) Builds on #83317 and enables running the doctests. Just need to figure out what is causing the failures. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83816 Approved by: https://github.com/ezyang, https://github.com/malfet	2022-12-29 05:32:42 +00:00
Kurt Mohler	81b3df4fb0	Fix dtype mismatch for unallocated storage deserialization (#91285 ) Fixes #90497 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91285 Approved by: https://github.com/ezyang	2022-12-27 19:31:09 +00:00
Laurent Mazare	17941b12e0	Fix a typo in some torch.load error message. (#90662 ) Very cosmetic change: only fixes a small typo in an error message that torch.load could raise. Pull Request resolved: https://github.com/pytorch/pytorch/pull/90662 Approved by: https://github.com/kit1980	2022-12-12 22:34:57 +00:00
Kazuaki Ishizaki	1cd6ebe095	Fix typos in messages under torch (#89049 ) This PR fixes typos of messages in `.py` files under torch directory. Only in `torch/onnx/symbolic_opset16.py`, fix a typo in comment to make the operator name correct. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89049 Approved by: https://github.com/lezcano	2022-11-17 04:18:14 +00:00
Colin Taylor	3e2ba60ac0	[torch] [analytics] add pytorch event logger callsites to torch.save and torch.load (#89003 ) Summary: as title. Differential Revision: D41239419 Pull Request resolved: https://github.com/pytorch/pytorch/pull/89003 Approved by: https://github.com/ezyang, https://github.com/dzhulgakov	2022-11-15 20:36:16 +00:00
Kurt Mohler	89a326ff7e	Explicitly check filelike arg of `torch.save` (#88867 ) Fixes #88793 Pull Request resolved: https://github.com/pytorch/pytorch/pull/88867 Approved by: https://github.com/ezyang	2022-11-11 16:57:08 +00:00
Kurt Mohler	ee28b865ee	Deprecate TypedStorage, its derived classes, and all of their public methods (#85303 ) Part of #85302 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85303 Approved by: https://github.com/ezyang	2022-11-08 18:11:01 +00:00
BoringCrypto	9f11ce7f67	Setting pickle_module isn't working (#88570 ) When setting the pickle_module it currently always gets overwritten by the pickle module. This should only happen when the pickle_module isn't specified. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88570 Approved by: https://github.com/kit1980	2022-11-08 03:26:46 +00:00
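The fix described here amounts to falling back to the default module only when the caller passed nothing. A minimal sketch, assuming a hypothetical helper `resolve_pickle_module` rather than the actual code in `torch/serialization.py`:

```python
import json
import pickle


def resolve_pickle_module(pickle_module=None):
    """Return the caller's module, defaulting to `pickle` only when unset."""
    if pickle_module is None:
        pickle_module = pickle
    return pickle_module


default_module = resolve_pickle_module()
# `json` stands in for any user-supplied module; it is only an identity check here.
custom_module = resolve_pickle_module(json)
```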
Nikita Shulga	caaf37a111	Fix `PyTorchStreamWriter` exception handling (#88128 ) Avoid a double exception in the destructor when attempting to serialize to a Python object that does not have a `write` method. Use the `Finalizer` class in `PyTorchStreamWriter::writeEndOfFile()` to always set the `finalized_` property even if an exception occurs (as there isn't much one can do at this point). Add an explicit check for the attribute to `_open_zipfile_writer_buffer` and add unit tests. Modernize the code a bit by using the Python 3 `super()` method. Fixes https://github.com/pytorch/pytorch/issues/87997 Pull Request resolved: https://github.com/pytorch/pytorch/pull/88128 Approved by: https://github.com/albanD	2022-10-31 23:38:03 +00:00
Nikita Shulga	961ebca225	Add `weights_only` option to `torch.load` (#86812 ) This addresses the security issue in Python's default `unpickler` that allows arbitrary code execution while unpickling. Restrict the classes allowed to be unpickled to `None`, `int`, `bool`, `str`, `float`, `list`, `tuple`, `dict`/`OrderedDict`, as well as `torch.Size`, `torch.nn.Param`, and `torch.Tensor` and `torch.Storage` variants. `weights_only` defaults to `False`, but a global override to safe-only loading is available via the `TORCH_FORCE_WEIGHTS_ONLY_LOAD` environment variable. To some extent, addresses https://github.com/pytorch/pytorch/issues/52596 Pull Request resolved: https://github.com/pytorch/pytorch/pull/86812 Approved by: https://github.com/ezyang	2022-10-21 01:09:50 +00:00
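The general technique behind an allow-list load can be sketched with the standard library's documented `Unpickler.find_class` hook. This is an illustration of the idea only, not the unpickler this PR adds; `ALLOWED_GLOBALS`, `RestrictedUnpickler`, and `restricted_loads` are hypothetical names, and the allow-list here covers just `OrderedDict`.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allow-list: (module, name) pairs that may be resolved.
ALLOWED_GLOBALS = {("collections", "OrderedDict"): OrderedDict}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream references.
        try:
            return ALLOWED_GLOBALS[(module, name)]
        except KeyError:
            raise pickle.UnpicklingError(
                f"global '{module}.{name}' is not allowed"
            ) from None


def restricted_loads(data):
    """Unpickle `data`, rejecting any global outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


safe_payload = pickle.dumps(OrderedDict(a=[1, 2.0, "x", None, True]))
restored = restricted_loads(safe_payload)
```

Primitive types such as `int`, `str`, `list`, and `dict` are encoded by dedicated pickle opcodes and never reach `find_class`, so only named globals need to be allow-listed.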
Adam J. Stewart	c6348a7109	Add type hints to torch.save, torch.load (#83937 ) I'll probably need help with this one. I'm not sure what the full type signature for `map_location` should be. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83937 Approved by: https://github.com/malfet, https://github.com/albanD	2022-08-26 18:58:25 +00:00
joncrall	4618371da5	Integrate xdoctest - Rebased (#82797 ) This is a new version of #15648 based on the latest master branch. Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR. In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.) Fixes https://github.com/pytorch/pytorch/issues/71105 @ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797 Approved by: https://github.com/ezyang	2022-08-12 02:08:01 +00:00
Edward Z. Yang	b9c8db435b	Allow map location to meta device (#82603 ) Fixes https://github.com/pytorch/pytorch/issues/82412 Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82603 Approved by: https://github.com/eellison	2022-08-08 19:56:59 +00:00
Kurt Mohler	14d0296e5c	Rename `_Typed/_UntypedStorage` to `Typed/UntypedStorage` and update docs (#82438 ) ### Description Since the major changes for `_TypedStorage` and `_UntypedStorage` are now complete, they can be renamed to be public. `TypedStorage._untyped()` is renamed to `TypedStorage.untyped()`. Documentation for storages is improved as well. ### Issue Fixes #82436 ### Testing N/A Pull Request resolved: https://github.com/pytorch/pytorch/pull/82438 Approved by: https://github.com/ezyang	2022-07-30 19:37:08 +00:00
Rodrigo Kumpera	b4e491798c	Avoid temporary buffers for tensors with torch.save. (#80404 ) Fix the torch.save _open_zipfile_writer optimization that uses a C++ stream when `f` is an os.PathLike. This fast path requires that we don't `open()` in Python if possible, so don't do it unconditionally. Fix the PyTorchStreamWriter construction binding that takes a buffer object: use py::memoryview instead of py::bytes, as the former doesn't copy the data. Validated with a trivial benchmark that calls torch.save in a loop 20x with a 10M-element float32 tensor on either cpu or cuda, saved to /dev/null. Tried two variants, 'str' and 'open'. In 'str' we pass the string "/dev/null" to torch.save. In 'open' we pass `open("/dev/null", "wb")` to torch.save. Timing in seconds. Before this patch: str-cpu :: 0.757 open-cpu :: 0.757 str-cuda :: 1.367 open-cuda :: 1.366 After this patch: str-cpu :: 0.256 open-cpu :: 0.251 str-cuda :: 0.896 open-cuda :: 0.834 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80404 Approved by: https://github.com/jamesr66a	2022-06-30 00:19:42 +00:00
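The copy-versus-view distinction behind the `py::memoryview` change has a direct Python-level analogue. The sketch below is only that analogue, not the pybind11 binding itself: a `memoryview` shares memory with its source, while `bytes(...)` takes an independent copy.

```python
buffer = bytearray(b"tensor-bytes")

view = memoryview(buffer)   # no copy: shares memory with `buffer`
copied = bytes(buffer)      # copy: independent snapshot of the data

# Mutate the source in place (same length, so the exported view stays valid).
buffer[0:6] = b"TENSOR"

seen_through_view = bytes(view[:6])
seen_through_copy = copied[:6]
```

The view observes the mutation and the copy does not, which shows that only the `bytes` path duplicated the data.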
Alex Hedges	cb2b7b1e57	Fix code that triggers BytesWarning (#79868 ) Fixes #74812. I have fixed the multiple instances in the repository that trigger `BytesWarning`, and I have enabled the `-bb` option when tests are run to prevent regressions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79868 Approved by: https://github.com/janeyx99	2022-06-21 01:12:21 +00:00
PyTorch MergeBot	e10cbe3880	Revert "Fix BytesWarning in torch.load() (#74813 )" This reverts commit `6c2e8119dd`. Reverted https://github.com/pytorch/pytorch/pull/74813 on behalf of https://github.com/janeyx99 due to Broke slow tests in cuda 10.2 https://github.com/pytorch/pytorch/runs/6944238177?check_suite_focus=true	2022-06-18 03:53:54 +00:00
Alex Hedges	6c2e8119dd	Fix BytesWarning in torch.load() (#74813 ) Fixes #74812. I have enabled the `-bb` option when tests are run to prevent regressions. I don't think it will make CI run more slowly, but I'm not entirely sure. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74813 Approved by: https://github.com/kit1980	2022-06-17 22:56:43 +00:00
Alban Desmaison	0a651a231d	Add full support for serialization of MPS Tensors (#79465 ) Fix https://github.com/pytorch/pytorch/issues/79384 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79465 Approved by: https://github.com/kulinseth, https://github.com/malfet	2022-06-14 17:54:30 +00:00
PyTorch MergeBot	ce6ce74703	Revert "Add full support for serialization of MPS Tensors (#79465 )" This reverts commit `64c2a275c4`. Reverted https://github.com/pytorch/pytorch/pull/79465 on behalf of https://github.com/zengk95 due to this broke X linux-xenial-py3.7-clang7-onnx / test (default, 1, 2, linux.2xlarge). Not sure why since it passed on pull.	2022-06-14 16:42:36 +00:00
Andrij David	48505356f5	Propagate map_location arg to torch.jit.load in torch.load (#78733 ) Fixes #78331 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78733 Approved by: https://github.com/davidberard98	2022-06-14 16:04:45 +00:00
Alban Desmaison	64c2a275c4	Add full support for serialization of MPS Tensors (#79465 ) Fix https://github.com/pytorch/pytorch/issues/79384 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79465 Approved by: https://github.com/kulinseth, https://github.com/malfet	2022-06-14 14:20:09 +00:00
Kurt Mohler	aea6e2c396	Merge torch.cuda._UntypedStorage into torch._UntypedStorage (#75459 ) Fixes #74933 Pull Request resolved: https://github.com/pytorch/pytorch/pull/75459 Approved by: https://github.com/ezyang	2022-05-19 13:54:39 +00:00
Kurt Mohler	8e7fe87630	Rename `Typed/UntypedStorage` to `_Typed/_UntypedStorage` (#72540 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72540 Reviewed By: jbschlosser Differential Revision: D34216823 Pulled By: bdhirsh fbshipit-source-id: 1bc9930ab582771ebf02308e035576cd1a0dbe47 (cherry picked from commit `329238f612`)	2022-02-15 23:53:01 +00:00
Kurt Mohler	b69155f754	Avoid dtype mismatch error in `torch.save` if storages are unallocated (#68787 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/58970 cc mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/68787 Reviewed By: mruberry Differential Revision: D32617425 Pulled By: anjali411 fbshipit-source-id: fe7f2374e4ef4428346a0a202cae8e0d382e03ab	2021-11-24 09:51:29 -08:00
Kurt Mohler	bc3d380ed1	Throw error when saving storages that view same data with different type (#66949 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/58970 cc mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/66949 Reviewed By: albanD Differential Revision: D31926323 Pulled By: anjali411 fbshipit-source-id: f6e7acc0c1968b70a94f9b0b69a32780e8e21a62	2021-11-16 08:44:44 -08:00
Kurt Mohler	5883523c1d	Remove dtype from torch.Storage and use only torch.ByteStorage (#62030 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62030 Remove dtype tracking from Python Storage interface, remove all the different `<type>Storage` classes except for `ByteStorage`, and update serialization accordingly, while maintaining as much FC/BC as possible Fixes https://github.com/pytorch/pytorch/issues/47442 * THE SERIALIZATION FORMAT IS FULLY FC/BC. We worked very hard to make sure this is the case. We will probably want to break FC at some point to make the serialization structure of tensors make more sense, but not today. * There is now only a single torch.ByteStorage class. Methods like `Tensor.set_` no longer check that the dtype of storage is appropriate. * As we no longer know what dtype of a storage is, we've removed the size method from Storage, replacing it with nbytes. This is to help catch otherwise silent errors where you confuse number of elements with number of bytes. * `Storage._new_shared` takes a `nbytes` kwarg and will reject previous positional only calls. `Storage._new_with_file` and `_set_from_file` require explicit element size arguments. * It's no longer possible to convert storages to different types using the float/double/etc methods. Instead, do the conversion using a tensor. * It's no longer possible to allocate a typed storage directly using FloatStorage/DoubleStorage/etc constructors. Instead, construct a tensor and extract its storage. The classes still exist but they are used purely for unpickling. * The preexisting serialization format stores dtype with storage, and in fact this dtype is used to determine the dtype of the tensor overall. To accommodate this case, we introduce a new TypedStorage concept that exists only during unpickling time which is used to temporarily store the dtype so we can construct a tensor. 
If you overrode the handling of pickling/unpickling, you MUST add handling for TypedStorage or your serialization code will degrade to standard file-based serialization. Original pull request: https://github.com/pytorch/pytorch/pull/59671 Reviewed By: soulitzer, ngimel Differential Revision: D29466819 Pulled By: ezyang fbshipit-source-id: 4a14e5d3c2b08e06e558683d97f7378a3180b00e	2021-10-05 13:50:34 -07:00
Shen Li	1022443168	Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: revert-hammer Differential Revision: D30279364 (`b004307252`) Original commit changeset: c1ed77dfe43a fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e	2021-08-12 11:45:01 -07:00
Zsolt Dollenstein	b004307252	[codemod][lint][fbcode/c*] Enable BLACK by default Test Plan: manual inspection & sandcastle Reviewed By: zertosh Differential Revision: D30279364 fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a	2021-08-12 10:58:35 -07:00
Zhengxu Chen	e62189ad69	[jit] Better checking for overload function declarations. (#59956 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59956 Issue #50175. Basically two things need to be checked and are lacking currently: 1. Overload declarations should always have a single `pass` statement as the body. 2. There should be always an implementation provided for decls which doesn't have the torch.jit._overload decorator. So in this case we need to check whether we are actually compiling a function body with decorator ahead. Test Plan: python test/test_jit.py TestScript.test_function_overloads Imported from OSS Reviewed By: gmagogsfm Differential Revision: D29106555 fbshipit-source-id: 2d9d7df2fb51ab6db0e1b726f9644e4cfbf733d6	2021-08-05 14:21:48 -07:00
Francesco Casalegno	fea3824214	Ensure torch.save() deterministic output (#57536 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/42163. ## Pitch Currently, the binary outputs produced by `torch.save()` are non-deterministic (as pointed out in https://github.com/pytorch/pytorch/issues/42163). This means that running a simple snippet that creates a tensor (or a model) twice will produce output files with a different `md5` sum. Why does this occur? The cause of this behavior lies in the fact that the `obj._cdata` is used to identify a tensor and is written to a file, but the `_cdata` attribute is of course non-deterministic: `a80b215a9a/torch/serialization.py (L416)` Why does this matter? Reproducibility is essential for many Machine Learning projects. For instance, when using [`dvc`](https://dvc.org/) you would expect that if none of the dependencies of a stage of an ML pipeline has changed, then running the same stage another time will produce the same binary output. For the reasons explained above, with `torch` this was not the case, so this PR tries to fix this issue. ## Content of this PR ### What changes? - The `persistent_id()` function now returns a deterministic value, rather than `obj._cdata` (which depends on runtime). - As a consequence, `torch.save(obj, "output.pt")` produces a deterministic output, i.e. the `md5` hash of `output.pt` is deterministic. See Test 1 and Test 2 below. ### What does not change? - If an `obj` contains several tensors that share the same underlying data (e.g. they are views of the same tensor), the `obj_key` returned by `persistent_id()` is still going to be the same for all of them. - As a consequence, serialization optimizes disk storage by storing only necessary tensors, rather than writing one tensor per view. See Test 3 below. 
## How to test ### Test 1: snippet from https://github.com/pytorch/pytorch/issues/42163 Consider the following `snippet_1.py` (from https://github.com/pytorch/pytorch/issues/42163). ```python import hashlib import torch def get_sha256_hash(file: str, chunk_size: int = 4096) -> str: hasher = hashlib.sha256() with open(file, "rb") as fh: for chunk in iter(lambda: fh.read(chunk_size), b""): hasher.update(chunk) return hasher.hexdigest() file = "tensor.pt" hashes = [] for _ in range(5): obj = torch.ones(1) torch.save(obj, file) hashes.append(get_sha256_hash(file)[:8]) del obj hash = hashes[0] assert all(other == hash for other in hashes[1:]) print(hash) ``` On `master` you obtain an error ```bash $ python snippet_1.py Traceback (most recent call last): File "save_tensor.py", line 84, in <module> assert all(other == hash for other in hashes[1:]) AssertionError ``` while on this PR branch you should get the following consistent behaviour: ```bash $ for run in {1..2}; do python snippet_1.py; done 600a83cb 600a83cb ``` ### Test 2: Deterministic save of `Tensor` and `nn.Module` instances Consider the following `snippet_2.py` ```python import torch torch.manual_seed(0) x = torch.tensor([8., 8., 5., 0.]) torch.save(x, "out_tensor.pt") model = torch.nn.Sequential( torch.nn.Linear(3, 1), torch.nn.Flatten(0, 1) ) torch.save(model, "out_model.pt") ``` On the `master` branch, the `md5` hashes of `out_tensor.pt` and `out_model.pt` are non-deterministic; for instance you may get ```bash $ for run in {1..2}; do python snippet_2.py; md5 out_pt; done MD5 (`bc9e8af218`) (out_model.pt) = 92dca4a310b691e893f3cb41d64d5af1 MD5 (`bc9e8af218`) (out_tensor.pt) = a4ef290583f50a9c203a42d0cfc078af MD5 (`bc9e8af218`) (out_model.pt) = de3cb9791a66af8aed77ed7224bd1d5c MD5 (`bc9e8af218`) (out_tensor.pt) = 3b8a6009d3a0be5b9dd94152dcc0c7cb ``` while on this PR branch you should get the following consistent behaviour: ```bash $ for run in {1..2}; do python snippet_2.py; md5 out_pt; done MD5 (`bc9e8af218`) 
(out_model.pt) = dba75fd50a190e4e7fa89b7a2477bab7 MD5 (`bc9e8af218`) (out_tensor.pt) = 029f52f0706d6c813cc796d3cdcd3eb0 MD5 (`bc9e8af218`) (out_model.pt) = dba75fd50a190e4e7fa89b7a2477bab7 MD5 (`bc9e8af218`) (out_tensor.pt) = 029f52f0706d6c813cc796d3cdcd3eb0 ``` ### Test 3: Views of the same tensor are not re-written to file Consider the following `snippet_3.py`. ```python import torch torch.manual_seed(0) x = torch.rand(1_000, 1_000) y = x.T z = x.view(1_000_000, 1) torch.save({"x": x}, "out_tensor_x.pt") torch.save({"x": x, "y": y, "z": z}, "out_tensor_xyz.pt") ``` Both on `master` branch and on this PR branch you should get two output files with same size: ```bash $ python snippet_3.py && du -sh out_tensorpt && md5 out_pt 3.8M out_tensor_x.pt 3.8M out_tensor_xyz.pt MD5 (`bc9e8af218`) (out_tensor_x.pt) = eda516d9156177b27bdc2a75c9064d9b MD5 (`bc9e8af218`) (out_tensor_xyz.pt) = 333b869f5b93ced7b8649ab1571eb8e3 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/57536 Reviewed By: bdhirsh Differential Revision: D28304728 Pulled By: ailzhang fbshipit-source-id: 49788e566a3cd2c6c36dc801e6bdd8f42c9459cb	2021-05-10 11:51:55 -07:00
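The core idea of this change, replacing a runtime-dependent identifier with one derived from visit order, can be sketched with the standard library's `persistent_id` hook. This is a hedged illustration rather than the code in `torch/serialization.py`; `Blob`, `DeterministicPickler`, and `dump_bytes` are hypothetical names.

```python
import io
import pickle


class Blob:
    """Hypothetical stand-in for a storage object that is saved out of band."""

    def __init__(self, data):
        self.data = data


class DeterministicPickler(pickle.Pickler):
    def __init__(self, file):
        super().__init__(file, protocol=2)
        self._keys = {}  # id(obj) -> sequential key, in order of first visit

    def persistent_id(self, obj):
        if isinstance(obj, Blob):
            # The key depends on visit order, not on a runtime address,
            # and objects seen twice map to the same key.
            key = self._keys.setdefault(id(obj), str(len(self._keys)))
            return ("blob", key)
        return None  # everything else is pickled normally


def dump_bytes(obj):
    buffer = io.BytesIO()
    DeterministicPickler(buffer).dump(obj)
    return buffer.getvalue()


first_blob = Blob(b"x")
first = dump_bytes({"x": first_blob, "y": first_blob})
second_blob = Blob(b"x")
second = dump_bytes({"x": second_blob, "y": second_blob})
```

Because both references to the same `Blob` resolve to one key, shared data is still identified once, in line with the "What does not change?" section of the commit message.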
Yukio Siraichi	9d54475032	Hide module paths leaking in the documentation. (#54585 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/54354 Pull Request resolved: https://github.com/pytorch/pytorch/pull/54585 Reviewed By: H-Huang Differential Revision: D28027037 Pulled By: mruberry fbshipit-source-id: 219874e143221f5e8349d007f88464e0be1a6243	2021-04-27 10:58:01 -07:00
Jeff Yang	475251631b	docs: reference links to serialization.html (#54659 ) Summary: fixes https://github.com/pytorch/pytorch/issues/54311 https://11811979-65600975-gh.circle-artifacts.com/0/docs/generated/torch.save.html Pull Request resolved: https://github.com/pytorch/pytorch/pull/54659 Reviewed By: ailzhang Differential Revision: D27328281 Pulled By: zou3519 fbshipit-source-id: b88d02e5407238a338d537d013a297ae9cdf922b	2021-03-29 10:15:07 -07:00
Philip Meier	b0afe945a7	Fix pylint error torch.tensor is not callable (#53424 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53424 Fixes https://github.com/pytorch/pytorch/issues/24807 and supersedes the stale https://github.com/pytorch/pytorch/issues/25093 (Cc Microsheep). If you now run the reproduction ```python import torch if __name__ == "__main__": t = torch.tensor([1, 2, 3], dtype=torch.float64) ``` with `pylint==2.6.0`, you get the following output ``` test_pylint.py:1:0: C0114: Missing module docstring (missing-module-docstring) test_pylint.py:4:8: E1101: Module 'torch' has no 'tensor' member; maybe 'Tensor'? (no-member) test_pylint.py:4:38: E1101: Module 'torch' has no 'float64' member (no-member) ``` Now `pylint` doesn't recognize `torch.tensor` at all, but it is promoted in the stub. Given that it also doesn't recognize `torch.float64`, I think fixing this is out of scope of this PR. --- ## TL;DR This is BC-breaking only for users that rely on unintended behavior. Since `torch/__init__.py` loaded `torch/tensor.py`, it was populated in `sys.modules`. `torch/__init__.py` then overwrote `torch.tensor` with the actual function. With this, `import torch.tensor as tensor` does not fail, but returns the function rather than the module. Users that rely on this import need to change it to `from torch import tensor`. Reviewed By: zou3519 Differential Revision: D26223815 Pulled By: bdhirsh fbshipit-source-id: 125b9ff3d276e84a645cd7521e8d6160b1ca1c21	2021-03-09 11:32:53 -08:00
Brian Hirsh	18277137ff	make torch.load() aware of import path changes: torch.tensor -> torch._tensor (#53139 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53139 ghstack-source-id: 123090847 Test Plan: Sandcastle Also explicitly tests that this test passes after incorporating the changes from D26656767, and adding a `torch.tensor` -> `torch._tensor` mapping to the `load_module_mapping` dict: `buck test mode/dev //pandora/utils/tests:manifold_utils_tests -- --exact 'pandora/utils/tests:manifold_utils_tests - test_load_dataset_valid_dir (pandora.utils.tests.manifold_utils_tests.TestManifoldUtils)'` With just D26656767, that test fails. With D26656767 + the changes in this diff, that test passes. Reviewed By: ezyang Differential Revision: D26760600 fbshipit-source-id: cb16493b858a358acf468d755740aa272ae9d363	2021-03-04 17:11:20 -08:00
Sam Estep	c147aa306c	Use doctest directly to get docstring examples (#50596 ) Summary: This PR addresses [a two-year-old TODO in `test/test_type_hints.py`](`12942ea52b/test/test_type_hints.py (L21-L22)`) by replacing most of the body of our custom `get_examples_from_docstring` function with [a function from Python's built-in `doctest.DocTestParser` class](https://docs.python.org/3/library/doctest.html#doctest.DocTestParser.get_examples). This mostly made the parser more strict, catching a few errors in existing doctests: - missing `...` in multiline statements - missing space after `>>>` - unmatched closing parenthesis Also, as shown by [the resulting diff of the untracked `test/generated_type_hints_smoketest.py` file](https://pastebin.com/vC5Wz6M0) (also linked from the test plan below), this introduces a few incidental changes as well: - standalone comments are no longer preserved - indentation is now visually correct - [`example_torch_promote_types`](`4da9ceb743/torch/_torch_docs.py (L6753-L6772)`) is now present - an example called `example_torch_tensor___array_priority__` is added, although I can't tell where it comes from - the last nine lines of code from [`example_torch_tensor_align_as`](`5d45140d68/torch/_tensor_docs.py (L386-L431)`) are now present - the previously-misformatted third line from [`example_torch_tensor_stride`](`5d45140d68/torch/_tensor_docs.py (L3508-L3532)`) is now present Pull Request resolved: https://github.com/pytorch/pytorch/pull/50596 Test Plan: Checkout the base commit, typecheck the doctests, and save the generated file: ``` $ python test/test_type_hints.py TestTypeHints.test_doc_examples $ cp test/generated_type_hints_smoketest.py /tmp ``` Then checkout this PR, do the same thing, and compare: ``` $ python test/test_type_hints.py TestTypeHints.test_doc_examples $ git diff --no-index {/tmp,test}/generated_type_hints_smoketest.py ``` The test should succeed, and the diff should match [this paste](https://pastebin.com/vC5Wz6M0). 
Reviewed By: walterddr Differential Revision: D25926245 Pulled By: samestep fbshipit-source-id: 23bc379ff438420e556263c19582dba06d8e42ec	2021-01-20 15:55:36 -08:00
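The standard-library call this commit switches to, `doctest.DocTestParser.get_examples`, can be tried on a small docstring. The docstring below is a made-up example, not one taken from the PyTorch sources.

```python
import doctest

# Hypothetical docstring containing two doctest examples.
docstring = '''
Example::

    >>> x = 1 + 1
    >>> x
    2
'''

# Each returned doctest.Example has `.source` (the code) and `.want` (expected output).
examples = doctest.DocTestParser().get_examples(docstring)
sources = [example.source for example in examples]
wants = [example.want for example in examples]
```

The parser is also what enforces the stricter rules listed in the commit message, such as requiring a space after `>>>`.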
Hugo van Kemenade	473e78c0fa	Remove redundant code for unsupported Python versions (#49486 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49486 Remove code for Python 3.5 and lower. There's more that can be removed/modernised, but sticking mainly to redundant version checks here, to keep the diff/PR smaller. Pull Request resolved: https://github.com/pytorch/pytorch/pull/46579 Reviewed By: zou3519 Differential Revision: D24453571 Pulled By: ezyang fbshipit-source-id: c2cfcf05d6c5f65df64d89c331692c9aec09248e	2021-01-06 12:45:46 -08:00
Zain Patel	bbeee481c3	Fix typo in torch.load docstring for the `f` parameter (#49350 ) Summary: No issue opened for this (that I can see) and it was a fairly small change, so just opening this PR directly! The docstring for `torch.load` had some of parameter descriptions including typos like ``:meth`readline` `` instead of``:meth:`readline` ``. This PR corrects that :) <img width="811" alt="image" src="https://user-images.githubusercontent.com/30357972/102128240-7fa33500-3e45-11eb-8f54-ce5ca7bba96c.png"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/49350 Reviewed By: glaringlee Differential Revision: D25543041 Pulled By: mrshenli fbshipit-source-id: 10db04d58dd5b07777bdd51d3fcb3c45dea4c84b	2020-12-14 19:16:01 -08:00


141 Commits