Using [`nanoGPT/model.py`](https://github.com/karpathy/nanoGPT/blob/master/model.py), run the scripts below.
<details><summary><b>Click for script to save gpt2-xlarge (1.5B params)</b></summary>
```
# test_load_save_gpt.py
from model import GPT
import torch
import time

torch.manual_seed(5)

# gpt2-xlarge 1558M parameters
class GPTConfig:
    block_size: int = 1024
    vocab_size: int = 50304 # GPT-2 vocab_size of 50257, padded up to nearest multiple of 64 for efficiency
    n_layer: int = 48
    n_head: int = 25
    n_embd: int = 1600
    dropout: float = 0.0
    bias: bool = True # True: bias in Linears and LayerNorms, like GPT-2. False: a bit better and faster

def f():
    model = GPT(GPTConfig())
    state_dict = model.state_dict()
    start_saving = time.time()
    torch.save(state_dict, "gpt2-xlarge.pth")
    end_saving = time.time()

if __name__ == "__main__":
    f()
```
</details>
<details><summary><b>Click for script to load</b></summary>
```
# test_load_gpt.py
import torch
from model import GPT
from test_load_save_gpt import GPTConfig
import time
import argparse

def f(mmap, meta):
    device = 'meta' if meta else 'cpu'
    assign = True if meta else False
    # Build the model on the chosen device ('meta' allocates no real storage)
    with torch.device(device):
        model = GPT(GPTConfig())
    start_loading = time.time()
    loaded_state_dict = torch.load("gpt2-xlarge.pth", _mmap=mmap)
    end_loading = time.time()
    print(f"loading time using torch.load with mmap={mmap}: ", end_loading - start_loading)
    model.load_state_dict(loaded_state_dict, assign=assign)
    end_load_state_dict = time.time()
    print("load_state_dict time: ", end_load_state_dict - end_loading)
    model.cuda()
    end_cuda = time.time()
    print("cuda time using torch.load with mmap: ", end_cuda - end_load_state_dict)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog='load_gpt_xlarge')
    parser.add_argument('-m', '--mmap', action='store_true')
    parser.add_argument('-d', '--devicemeta', action='store_true')
    args = parser.parse_args()
    mmap = args.mmap
    meta = args.devicemeta
    f(mmap, meta)
```
</details>
`python test_load_gpt.py`
<img width="614" alt="Screenshot 2023-06-06 at 1 35 43 PM" src="https://github.com/pytorch/pytorch/assets/35276741/ee06e5b3-b610-463b-a867-df995d21af29">
`python test_load_gpt.py --mmap`
<img width="622" alt="Screenshot 2023-06-06 at 1 35 30 PM" src="https://github.com/pytorch/pytorch/assets/35276741/00d2fdd0-b1f5-4313-83dc-e540b654b2af">
If we further use the `with torch.device('meta')` context manager and pull the changes from https://github.com/pytorch/pytorch/pull/102212 that allow the model to reuse tensors from the state_dict, we have
`python test_load_gpt.py --mmap --devicemeta`
<img width="727" alt="Screenshot 2023-06-06 at 1 35 51 PM" src="https://github.com/pytorch/pytorch/assets/35276741/b50257d9-092a-49c3-acae-876ee44d009f">
\
\
Next, run the above in a Docker container that contains a build of PyTorch and has RAM limited to 512 MB:
1) run `make -f docker.Makefile` from the `pytorch/` directory
2) `docker run -m 512m -it <image> bash`
3) `docker cp` `gpt2-xlarge.pth` and `test_load_gpt.py` into the container

With `python test_load_gpt.py`, Docker kills the process due to OOM, whereas `python test_load_gpt.py --mmap --devicemeta` runs to completion:
<img width="635" alt="Screenshot 2023-06-06 at 1 55 48 PM" src="https://github.com/pytorch/pytorch/assets/35276741/f3820d9e-f24c-43e7-885b-3bfdf24ef8ad">
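The memory savings come from mapping the checkpoint file instead of reading it into RAM. As a rough illustration of that idea only, here is a minimal sketch using Python's standard `mmap` module; it is not how `torch.load` is implemented, and the helper name `read_slice_mmap` and the file `blob.bin` are made up for this example.

```python
import mmap
import os
import tempfile


def read_slice_mmap(path, offset, length):
    """Return `length` bytes at `offset` without reading the whole file."""
    with open(path, "rb") as fh:
        # Length 0 maps the entire file; pages are only faulted in when touched.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return bytes(mapped[offset:offset + length])


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "blob.bin")  # hypothetical stand-in for a checkpoint
    with open(path, "wb") as fh:
        fh.write(bytes(range(256)) * 4096)  # 1 MiB of repeating bytes
    chunk = read_slice_mmap(path, 256 * 10 + 5, 4)

print(list(chunk))
```

Only the pages that back the requested slice need to be resident, which is why a mapped load can fit under a memory limit that a full read would exceed.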
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102549
Approved by: https://github.com/albanD
Summary: The new logger allows passing metadata into the API usage logger. The immediate use case is to pass the serialization_id to the save and load events to enable tracking serialized models in API events. It could be extended to add more metadata in the future.
Test Plan:
```
buck2 test @//mode/dev //caffe2/caffe2/serialize:inline_container_test
```
Reviewed By: davidberard98
Differential Revision: D45683697
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101762
Approved by: https://github.com/davidberard98
Add an entry for privateuse1 storage serialization (`register_package`) in `_register_device_module`.
1. Users only need to implement `privateuse1_tag` and `privateuse1_deserialize` in the device module of the open device. When the device module is registered, these methods are registered with `_package_registry` in storage serialization.
2. Provide a fixed sequence number of 30 for privateuse1 in the storage serialization `_package_registry` list.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98920
Approved by: https://github.com/ezyang
Changes:
1. `typing_extensions -> typing-extensions` in dependency. Use a dash rather than an underscore to follow the [PEP 503: Normalized Names](https://peps.python.org/pep-0503/#normalized-names) convention.
```python
import re
def normalize(name):
    return re.sub(r"[-_.]+", "-", name).lower()
```
2. Import `Literal`, `Protocol`, and `Final` from the standard library as of Python 3.8+.
3. Replace `Union[Literal[XXX], Literal[YYY]]` with `Literal[XXX, YYY]`.
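A small sketch of changes 2 and 3, assuming Python 3.8 or newer. The names `Mode`, `SupportsWrite`, and `DEFAULT_MODE` are invented for this example and do not come from the PR.

```python
from typing import Final, Literal, Protocol, get_args

# One Literal with several values replaces Union[Literal["r"], Literal["w"]].
Mode = Literal["r", "w"]


class SupportsWrite(Protocol):
    """Structural type: anything with a compatible write() method matches."""

    def write(self, data: bytes) -> int: ...


DEFAULT_MODE: Final = "r"  # type checkers reject reassignment of a Final name

modes = get_args(Mode)
print(modes)
```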
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94490
Approved by: https://github.com/ezyang, https://github.com/albanD
Avoid a double exception in the destructor when attempting to serialize to a
Python object that does not have a `write` method.
Use the `Finalizer` class in `PyTorchStreamWriter::writeEndOfFile()` to
always set the `finalized_` property even if an exception occurs (as there
isn't much one can do at this point).
Add an explicit check for the attribute to `_open_zipfile_writer_buffer` and
add unit tests.
Modernize the code a bit by using the Python 3 `super()` method.
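The two ideas above (validate the sink up front, and always mark the writer finalized even when the last write fails) can be sketched in plain Python. `ZipWriterSketch` and `FailingSink` are hypothetical classes written for this illustration; the real fix lives in C++ and in `_open_zipfile_writer_buffer`.

```python
import io


class ZipWriterSketch:
    """Hypothetical writer that checks its sink and always ends up finalized."""

    def __init__(self, buffer):
        # Fail early with a clear error instead of failing later during cleanup.
        if not callable(getattr(buffer, "write", None)):
            raise AttributeError(
                f"expected 'buffer' to have a callable 'write' method, "
                f"got {type(buffer).__name__}"
            )
        self.buffer = buffer
        self.finalized = False

    def write_end_of_file(self):
        try:
            self.buffer.write(b"EOF")
        finally:
            # Mirrors the Finalizer idea: the flag is set even if write() raises.
            self.finalized = True


class FailingSink:
    def write(self, data):
        raise OSError("disk full")


try:
    ZipWriterSketch(object())
    rejected = False
except AttributeError:
    rejected = True

good = ZipWriterSketch(io.BytesIO())
good.write_end_of_file()

writer = ZipWriterSketch(FailingSink())
try:
    writer.write_end_of_file()
except OSError:
    pass

print(rejected, good.finalized, writer.finalized)
```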
Fixes https://github.com/pytorch/pytorch/issues/87997
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88128
Approved by: https://github.com/albanD
This addresses the security issue in Python's default `unpickler` that allows arbitrary code execution while unpickling.
Restrict the classes allowed to be unpickled to `None`, `int`, `bool`, `str`, `float`, `list`, `tuple`, `dict`/`OrderedDict`, as well as `torch.Size`, `torch.nn.Parameter`, and the `torch.Tensor` and `torch.Storage` variants.
`weights_only` defaults to `False`, but safe-only loading can be forced globally via the `TORCH_FORCE_WEIGHTS_ONLY_LOAD` environment variable.
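The general technique is an allowlist enforced in `pickle.Unpickler.find_class`. Below is a minimal stdlib-only sketch of that pattern; it is not the PyTorch weights-only unpickler, and the names `RestrictedUnpickler`, `restricted_loads`, and `_ALLOWED_GLOBALS` are chosen for this example.

```python
import collections
import io
import pickle

# Only globals listed here may be resolved during unpickling.
_ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"): collections.OrderedDict,
}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        try:
            return _ALLOWED_GLOBALS[(module, name)]
        except KeyError:
            raise pickle.UnpicklingError(
                f"global '{module}.{name}' is not allowed"
            ) from None


def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers and allowlisted classes load normally.
loaded = restricted_loads(pickle.dumps(collections.OrderedDict(a=1, b=[2, 3])))

# A pickle that references any other global (here builtins.print) is rejected.
try:
    restricted_loads(pickle.dumps(print))
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(loaded, blocked)
```

Primitive types such as `int`, `str`, `list`, and `dict` never go through `find_class`, so they need no allowlist entry.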
To some extent, addresses https://github.com/pytorch/pytorch/issues/52596
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86812
Approved by: https://github.com/ezyang
This is a new version of #15648 based on the latest master branch.
Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR.
In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.)
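For reference, a docstring example marked with the skip directive might look like the sketch below. The function `scale` is a made-up example rather than a PyTorch docstring; in practice the directive would sit on an example that currently fails or crashes.

```python
def scale(values, factor):
    """Multiply every element of ``values`` by ``factor``.

    Example:
        >>> # xdoctest: +SKIP
        >>> scale([1, 2], 3)
        [3, 6]
    """
    return [value * factor for value in values]


print(scale([1, 2], 3))
```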
Fixes https://github.com/pytorch/pytorch/issues/71105
@ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797
Approved by: https://github.com/ezyang
### Description
Since the major changes for `_TypedStorage` and `_UntypedStorage` are now complete, they can be renamed to be public.
`TypedStorage._untyped()` is renamed to `TypedStorage.untyped()`.
Documentation for storages is improved as well.
### Issue
Fixes #82436
### Testing
N/A
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82438
Approved by: https://github.com/ezyang
Fix the torch.save `_open_zipfile_writer` optimization that uses a C++ stream when `f` is an `os.PathLike`.
This fastpath requires that we don't `open()` in python if possible, so don't do it unconditionally.
Fix PyTorchStreamWriter construction binding that takes a buffer object.
Use py::memoryview instead of py::bytes as the former doesn't copy the data.
Validated with a trivial benchmark that calls torch.save in a loop 20x with a 10M-element float32 tensor
either on cpu or cuda. Saved to /dev/null.
Tried two variants, 'str' and 'open'.
In 'str' we pass the string "/dev/null" to torch.save.
In 'open' we pass `open("/dev/null", "wb")` to torch.save.
Timing in seconds.
Before this patch:
str-cpu :: 0.757
open-cpu :: 0.757
str-cuda :: 1.367
open-cuda :: 1.366
After this patch:
str-cpu :: 0.256
open-cpu :: 0.251
str-cuda :: 0.896
open-cuda :: 0.834
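The copy-versus-view distinction behind the `py::memoryview` change has a direct analogue in pure Python. This sketch only illustrates that distinction; it says nothing about the pybind11 binding itself.

```python
buffer = bytearray(b"abcdef")

view = memoryview(buffer)  # shares memory with `buffer`, no copy
copy = bytes(buffer)       # independent copy of the data

buffer[0] = ord("z")       # mutate the underlying buffer

view_first = view[0]       # reflects the mutation
copy_first = copy[0]       # still the original byte

print(chr(view_first), chr(copy_first))
```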
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80404
Approved by: https://github.com/jamesr66a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62030
Remove dtype tracking from Python Storage interface, remove all the different `<type>Storage` classes except for `ByteStorage`, and update serialization accordingly, while maintaining as much FC/BC as possible
Fixes https://github.com/pytorch/pytorch/issues/47442
* **THE SERIALIZATION FORMAT IS FULLY FC/BC.** We worked very hard to make sure this is the case. We will probably want to break FC at some point to make the serialization structure of tensors make more sense, but not today.
* There is now only a single torch.ByteStorage class. Methods like `Tensor.set_` no longer check that the dtype of storage is appropriate.
* As we no longer know what the dtype of a storage is, we've **removed** the size method from Storage, replacing it with nbytes. This is to help catch otherwise silent errors where you confuse number of elements with number of bytes.
* `Storage._new_shared` takes an `nbytes` kwarg and will reject previous positional-only calls. `Storage._new_with_file` and `_set_from_file` require explicit element size arguments.
* It's no longer possible to convert storages to different types using the float/double/etc methods. Instead, do the conversion using a tensor.
* It's no longer possible to allocate a typed storage directly using FloatStorage/DoubleStorage/etc constructors. Instead, construct a tensor and extract its storage. The classes still exist but they are used purely for unpickling.
* The preexisting serialization format stores dtype with storage, and in fact this dtype is used to determine the dtype of the tensor overall.
To accommodate this case, we introduce a new TypedStorage concept that exists only during unpickling time which is used to temporarily store the dtype so we can construct a tensor. **If you overrode the handling of pickling/unpickling, you MUST add handling for TypedStorage** or your serialization code will degrade to standard file-based serialization.
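The elements-versus-bytes confusion that motivates replacing `size` with `nbytes` can be shown without torch. This is only an analogy using the standard `array` module and `memoryview`; it does not use the Storage API.

```python
from array import array

values = array("f", [0.0] * 10)  # 10 C floats
typed_view = memoryview(values)

num_elements = len(values)       # element count depends on the dtype
num_bytes = typed_view.nbytes    # byte count does not

# Reinterpreting the same memory as raw bytes changes the length, not the size.
byte_view = typed_view.cast("B")

print(num_elements, num_bytes, len(byte_view))
```

Once the dtype is no longer tracked, only the byte count is well defined, which is the reasoning given in the bullet about `nbytes`.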
Original pull request: https://github.com/pytorch/pytorch/pull/59671
Reviewed By: soulitzer, ngimel
Differential Revision: D29466819
Pulled By: ezyang
fbshipit-source-id: 4a14e5d3c2b08e06e558683d97f7378a3180b00e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59956
Issue #50175. Basically two things need to be checked and are lacking currently:
1. Overload declarations should always have a single `pass` statement as the body.
2. There should be always an implementation provided for decls which doesn't
have the torch.jit._overload decorator. So in this case we need to check
whether we are actually compiling a function body with decorator ahead.
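The first check (an overload declaration whose body is exactly one `pass`) can be sketched with the standard `ast` module. This is an illustration of the rule only, not the TorchScript frontend's implementation, and `is_single_pass_body` is a hypothetical helper.

```python
import ast


def is_single_pass_body(source):
    """Return True if the first function in `source` has a body of one `pass`."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a function definition")
    return len(func.body) == 1 and isinstance(func.body[0], ast.Pass)


declaration = "def add(x: int, y: int) -> int:\n    pass\n"
implementation = "def add(x, y):\n    return x + y\n"

declaration_ok = is_single_pass_body(declaration)
implementation_ok = is_single_pass_body(implementation)
print(declaration_ok, implementation_ok)
```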
Test Plan:
python test/test_jit.py TestScript.test_function_overloads
Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D29106555
fbshipit-source-id: 2d9d7df2fb51ab6db0e1b726f9644e4cfbf733d6
Summary:
Fixes https://github.com/pytorch/pytorch/issues/42163.
## 🔥 Pitch
Currently, the binary outputs produced by `torch.save()` are non-deterministic (as pointed out in https://github.com/pytorch/pytorch/issues/42163). This means that running a simple snippet that creates a tensor (or a model) twice will produce output files with a different `md5` sum.
**Why does this occur?**
The cause of this behavior lies in the fact that the `obj._cdata` is used to identify a tensor and is written to a file, but the `_cdata` attribute is of course non-deterministic:
a80b215a9a/torch/serialization.py (L416)
**Why does this matter?**
Reproducibility is essential for many Machine Learning projects.
For instance, when using [`dvc`](https://dvc.org/) you would expect that if none of the dependencies of a stage of a ML pipeline has changed, then running the same stage another time will produce the same binary output. For the reasons explained above, with `torch` this was not the case, so this PR tries to fix this issue.
## 📌 Content of this PR
### What changes?
- The `persistent_id()` function now returns a deterministic value, rather than `obj._cdata` (which depends on runtime).
- As a consequence, `torch.save(obj, "output.pt")` produces a deterministic output, i.e. the `md5` hash of `output.pt` is deterministic. See **Test 1** and **Test 2** below.
### What does not change?
- If an `obj` contains several tensors that share the same underlying data (e.g. they are views of the same tensor), the `obj_key` returned by `persistent_id()` is still going to be the same for all of them.
- As a consequence, serialization optimizes disk storage by storing only necessary tensors, rather than writing one tensor per view. See **Test 3** below.
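The idea of replacing a runtime address with a stable key can be sketched with the standard `pickle` module. This is a simplified illustration, not the code in `torch/serialization.py`: `Blob` and `DeterministicPickler` are hypothetical, and keys are assigned in first-seen order.

```python
import hashlib
import io
import pickle


class Blob:
    """Hypothetical stand-in for an object whose payload is stored out of band."""

    def __init__(self, data):
        self.data = data


class DeterministicPickler(pickle.Pickler):
    def __init__(self, file):
        super().__init__(file, protocol=2)
        self._keys = {}  # id(obj) -> stable key, valid for this dump only
        self.blobs = []  # payloads, written once per distinct object

    def persistent_id(self, obj):
        if not isinstance(obj, Blob):
            return None  # pickle everything else normally
        if id(obj) not in self._keys:
            # Key depends on traversal order, not on the memory address.
            self._keys[id(obj)] = str(len(self._keys))
            self.blobs.append(obj.data)
        return ("blob", self._keys[id(obj)])


def save(obj):
    buffer = io.BytesIO()
    pickler = DeterministicPickler(buffer)
    pickler.dump(obj)
    return buffer.getvalue(), pickler.blobs


def make_object():
    shared = Blob(b"\x00" * 8)
    return {"x": shared, "y": shared, "z": Blob(b"\x01")}


first_bytes, first_blobs = save(make_object())
second_bytes, second_blobs = save(make_object())

first_hash = hashlib.sha256(first_bytes).hexdigest()
second_hash = hashlib.sha256(second_bytes).hexdigest()
print(first_hash == second_hash, len(first_blobs))
```

Objects that share identity still map to one key, so the shared payload is stored once, matching the "what does not change" points above.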
## How to test
### Test 1: snippet from https://github.com/pytorch/pytorch/issues/42163
Consider the following `snippet_1.py` (from https://github.com/pytorch/pytorch/issues/42163).
```python
import hashlib
import torch

def get_sha256_hash(file: str, chunk_size: int = 4096) -> str:
    hasher = hashlib.sha256()
    with open(file, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

file = "tensor.pt"
hashes = []
# Save the same tensor several times and record a short hash of each file
for _ in range(5):
    obj = torch.ones(1)
    torch.save(obj, file)
    hashes.append(get_sha256_hash(file)[:8])
    del obj

hash = hashes[0]
assert all(other == hash for other in hashes[1:])
print(hash)
```
On `master` you obtain an error
```bash
$ python snippet_1.py
Traceback (most recent call last):
File "save_tensor.py", line 84, in <module>
assert all(other == hash for other in hashes[1:])
AssertionError
```
while on this PR branch you should get the following consistent behaviour:
```bash
$ for run in {1..2}; do python snippet_1.py; done
600a83cb
600a83cb
```
### Test 2: Deterministic save of `Tensor` and `nn.Module` instances
Consider the following `snippet_2.py`
```python
import torch
torch.manual_seed(0)
x = torch.tensor([8., 8., 5., 0.])
torch.save(x, "out_tensor.pt")
model = torch.nn.Sequential(
    torch.nn.Linear(3, 1),
    torch.nn.Flatten(0, 1)
)
torch.save(model, "out_model.pt")
```
On the `master` branch, the `md5` hashes of `out_tensor.pt` and `out_model.pt` are non-deterministic; for instance, you may get
```bash
$ for run in {1..2}; do python snippet_2.py; md5 out_*pt; done
MD5 (bc9e8af218) (out_model.pt) = 92dca4a310b691e893f3cb41d64d5af1
MD5 (bc9e8af218) (out_tensor.pt) = a4ef290583f50a9c203a42d0cfc078af
MD5 (bc9e8af218) (out_model.pt) = de3cb9791a66af8aed77ed7224bd1d5c
MD5 (bc9e8af218) (out_tensor.pt) = 3b8a6009d3a0be5b9dd94152dcc0c7cb
```
while on this PR branch you should get the following consistent behaviour:
```bash
$ for run in {1..2}; do python snippet_2.py; md5 out_*pt; done
MD5 (bc9e8af218) (out_model.pt) = dba75fd50a190e4e7fa89b7a2477bab7
MD5 (bc9e8af218) (out_tensor.pt) = 029f52f0706d6c813cc796d3cdcd3eb0
MD5 (bc9e8af218) (out_model.pt) = dba75fd50a190e4e7fa89b7a2477bab7
MD5 (bc9e8af218) (out_tensor.pt) = 029f52f0706d6c813cc796d3cdcd3eb0
```
### Test 3: Views of the same tensor are not re-written to file
Consider the following `snippet_3.py`.
```python
import torch
torch.manual_seed(0)
x = torch.rand(1_000, 1_000)
y = x.T
z = x.view(1_000_000, 1)
torch.save({"x": x}, "out_tensor_x.pt")
torch.save({"x": x, "y": y, "z": z}, "out_tensor_xyz.pt")
```
Both on `master` branch and on this PR branch you should get two output files with same size:
```bash
$ python snippet_3.py && du -sh out_tensor*pt && md5 out_*pt
3.8M out_tensor_x.pt
3.8M out_tensor_xyz.pt
MD5 (bc9e8af218) (out_tensor_x.pt) = eda516d9156177b27bdc2a75c9064d9b
MD5 (bc9e8af218) (out_tensor_xyz.pt) = 333b869f5b93ced7b8649ab1571eb8e3
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57536
Reviewed By: bdhirsh
Differential Revision: D28304728
Pulled By: ailzhang
fbshipit-source-id: 49788e566a3cd2c6c36dc801e6bdd8f42c9459cb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53424
Fixes https://github.com/pytorch/pytorch/issues/24807 and supersedes the stale https://github.com/pytorch/pytorch/issues/25093 (Cc Microsheep). If you now run the reproduction
```python
import torch
if __name__ == "__main__":
    t = torch.tensor([1, 2, 3], dtype=torch.float64)
```
with `pylint==2.6.0`, you get the following output
```
test_pylint.py:1:0: C0114: Missing module docstring (missing-module-docstring)
test_pylint.py:4:8: E1101: Module 'torch' has no 'tensor' member; maybe 'Tensor'? (no-member)
test_pylint.py:4:38: E1101: Module 'torch' has no 'float64' member (no-member)
```
Now `pylint` doesn't recognize `torch.tensor` at all, but it is promoted in the stub. Given that it also doesn't recognize `torch.float64`, I think fixing this is out of scope of this PR.
---
## TL;DR
This is BC-breaking only for users that rely on unintended behavior. Since `torch/__init__.py` loaded `torch/tensor.py`, it was populated in `sys.modules`. `torch/__init__.py` then overwrote `torch.tensor` with the actual function. With this, `import torch.tensor as tensor` does not fail, but returns the function rather than the module. Users that rely on this import need to change it to `from torch import tensor`.
Reviewed By: zou3519
Differential Revision: D26223815
Pulled By: bdhirsh
fbshipit-source-id: 125b9ff3d276e84a645cd7521e8d6160b1ca1c21
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53139
ghstack-source-id: 123090847
Test Plan:
Sandcastle
Also explicitly tests that this test passes after incorporating the changes from D26656767, and adding a `torch.tensor` -> `torch._tensor` mapping to the `load_module_mapping` dict: `buck test mode/dev //pandora/utils/tests:manifold_utils_tests -- --exact 'pandora/utils/tests:manifold_utils_tests - test_load_dataset_valid_dir (pandora.utils.tests.manifold_utils_tests.TestManifoldUtils)'`
With just D26656767, that test fails. With D26656767 + the changes in this diff, that test passes.
Reviewed By: ezyang
Differential Revision: D26760600
fbshipit-source-id: cb16493b858a358acf468d755740aa272ae9d363
Summary:
This PR addresses [a two-year-old TODO in `test/test_type_hints.py`](12942ea52b/test/test_type_hints.py (L21-L22)) by replacing most of the body of our custom `get_examples_from_docstring` function with [a function from Python's built-in `doctest.DocTestParser` class](https://docs.python.org/3/library/doctest.html#doctest.DocTestParser.get_examples). This mostly made the parser more strict, catching a few errors in existing doctests:
- missing `...` in multiline statements
- missing space after `>>>`
- unmatched closing parenthesis
Also, as shown by [the resulting diff of the untracked `test/generated_type_hints_smoketest.py` file](https://pastebin.com/vC5Wz6M0) (also linked from the test plan below), this introduces a few incidental changes as well:
- standalone comments are no longer preserved
- indentation is now visually correct
- [`example_torch_promote_types`](4da9ceb743/torch/_torch_docs.py (L6753-L6772)) is now present
- an example called `example_torch_tensor___array_priority__` is added, although I can't tell where it comes from
- the last nine lines of code from [`example_torch_tensor_align_as`](5d45140d68/torch/_tensor_docs.py (L386-L431)) are now present
- the previously-misformatted third line from [`example_torch_tensor_stride`](5d45140d68/torch/_tensor_docs.py (L3508-L3532)) is now present
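For readers unfamiliar with the stdlib parser, here is a short sketch of `doctest.DocTestParser.get_examples` on a made-up docstring, including the stricter handling of a missing space after `>>>`. The docstring content is invented for this example and is unrelated to the PyTorch docs.

```python
import doctest

DOCSTRING = """
Example::

    >>> total = sum(
    ...     [1, 2, 3]
    ... )
    >>> print(total)
    6
"""

parser = doctest.DocTestParser()
examples = parser.get_examples(DOCSTRING)
sources = [example.source for example in examples]
wants = [example.want for example in examples]

# The parser is strict: a prompt without a following space is an error.
try:
    parser.get_examples(">>>x = 1\n")
    rejected = False
except ValueError:
    rejected = True

print(len(examples), wants, rejected)
```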
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50596
Test Plan:
Check out the base commit, typecheck the doctests, and save the generated file:
```
$ python test/test_type_hints.py TestTypeHints.test_doc_examples
$ cp test/generated_type_hints_smoketest.py /tmp
```
Then check out this PR, do the same thing, and compare:
```
$ python test/test_type_hints.py TestTypeHints.test_doc_examples
$ git diff --no-index {/tmp,test}/generated_type_hints_smoketest.py
```
The test should succeed, and the diff should match [this paste](https://pastebin.com/vC5Wz6M0).
Reviewed By: walterddr
Differential Revision: D25926245
Pulled By: samestep
fbshipit-source-id: 23bc379ff438420e556263c19582dba06d8e42ec
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49486
Remove code for Python 3.5 and lower.
There's more that can be removed/modernised, but sticking mainly to redundant version checks here, to keep the diff/PR smaller.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46579
Reviewed By: zou3519
Differential Revision: D24453571
Pulled By: ezyang
fbshipit-source-id: c2cfcf05d6c5f65df64d89c331692c9aec09248e
Summary:
No issue opened for this (that I can see) and it was a fairly small change, so just opening this PR directly!
The docstring for `torch.load` had some parameter descriptions with typos such as ``:meth`readline` `` instead of ``:meth:`readline` ``. This PR corrects that :)
<img width="811" alt="image" src="https://user-images.githubusercontent.com/30357972/102128240-7fa33500-3e45-11eb-8f54-ce5ca7bba96c.png">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49350
Reviewed By: glaringlee
Differential Revision: D25543041
Pulled By: mrshenli
fbshipit-source-id: 10db04d58dd5b07777bdd51d3fcb3c45dea4c84b