Commit Graph

27 Commits

Author SHA1 Message Date
Kurt Mohler
b69155f754 Avoid dtype mismatch error in torch.save if storages are unallocated (#68787)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/58970

cc mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68787

Reviewed By: mruberry

Differential Revision: D32617425

Pulled By: anjali411

fbshipit-source-id: fe7f2374e4ef4428346a0a202cae8e0d382e03ab
2021-11-24 09:51:29 -08:00
Kurt Mohler
bc3d380ed1 Throw error when saving storages that view same data with different type (#66949)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/58970

cc mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66949

Reviewed By: albanD

Differential Revision: D31926323

Pulled By: anjali411

fbshipit-source-id: f6e7acc0c1968b70a94f9b0b69a32780e8e21a62
2021-11-16 08:44:44 -08:00
Jane Xu
b07371f19c [skip ci] Set test owners for serialization tests (#66862)
Summary:
Action following https://github.com/pytorch/pytorch/issues/66232

cc mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66862

Reviewed By: saketh-are

Differential Revision: D31828615

Pulled By: janeyx99

fbshipit-source-id: 8d28970eead9d6f26e9ea64b823295d9c9e1469d
2021-10-21 13:22:18 -07:00
Kurt Mohler
5883523c1d Remove dtype from torch.Storage and use only torch.ByteStorage (#62030)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62030

Remove dtype tracking from Python Storage interface, remove all the different `<type>Storage` classes except for `ByteStorage`, and update serialization accordingly, while maintaining as much FC/BC as possible

Fixes https://github.com/pytorch/pytorch/issues/47442

* **THE SERIALIZATION FORMAT IS FULLY FC/BC.** We worked very hard to make sure this is the case. We will probably want to break FC at some point to make the serialization structure of tensors make more sense, but not today.
* There is now only a single torch.ByteStorage class. Methods like `Tensor.set_` no longer check that the dtype of storage is appropriate.
* As we no longer know what dtype of a storage is, we've **removed** the size method from Storage, replacing it with nbytes. This is to help catch otherwise silent errors where you confuse number of elements with number of bytes.
* `Storage._new_shared` takes a `nbytes` kwarg and will reject previous positional only calls.  `Storage._new_with_file` and `_set_from_file` require explicit element size arguments.
* It's no longer possible to convert storages to different types using the float/double/etc methods. Instead, do the conversion using a tensor.
* It's no longer possible to allocate a typed storage directly using FloatStorage/DoubleStorage/etc constructors. Instead, construct a tensor and extract its storage. The classes still exist but they are used purely for unpickling.
* The preexisting serialization format stores dtype with storage, and in fact this dtype is used to determine the dtype of the tensor overall.
 To accommodate this case, we introduce a new TypedStorage concept that exists only during unpickling time which is used to temporarily store the dtype so we can construct a tensor. **If you overrode the handling of pickling/unpickling, you MUST add handling for TypedStorage** or your serialization code will degrade to standard file-based serialization.

Original pull request: https://github.com/pytorch/pytorch/pull/59671

Reviewed By: soulitzer, ngimel

Differential Revision: D29466819

Pulled By: ezyang

fbshipit-source-id: 4a14e5d3c2b08e06e558683d97f7378a3180b00e
2021-10-05 13:50:34 -07:00
Alban Desmaison
7c62b6e973 add deepcopy support to subclasses (#65584)
Summary:
Happy to get any feedback on how to make this code cleaner!

This:
- Fix Tensor attribute deepcopy BC-breaking?
- Add a test for Tensor attribute deepcopy
- Fix subclass deepcopy
- Moves the subclass serialization tests into their own class not to interfere with other serialization test logic
- Add a test for subclass deepcopy

cc ezyang gchanan

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65584

Reviewed By: gchanan

Differential Revision: D31206590

Pulled By: albanD

fbshipit-source-id: 74a8f0767f4933b9c941fbea880a8fd1b893ea2f
2021-09-27 14:36:22 -07:00
Shen Li
1022443168 Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: revert-hammer

Differential Revision:
D30279364 (b004307252)

Original commit changeset: c1ed77dfe43a

fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e
2021-08-12 11:45:01 -07:00
Zsolt Dollenstein
b004307252 [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: manual inspection & sandcastle

Reviewed By: zertosh

Differential Revision: D30279364

fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a
2021-08-12 10:58:35 -07:00
Alban Desmaison
e6a227465b Add serialization support for slots and subclass getstate/setstate (#62745)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/62745

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D30113112

Pulled By: albanD

fbshipit-source-id: 6c562d0c060fb0280e5e3d432bb42fb833e6d500
2021-08-05 06:49:44 -07:00
Edward Yang
cf1f59452b Hacky support for meta tensor serialization. (#62192)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62192

This support is hacky because it doesn't preserve meta tensor storage
sharing (e.g., if you serialize a model with shared storage, e.g., a
tensor and a view on a tensor, when I deserialize the viewing
relationship will be broken and these are just different tensors.) The
hack is also durable, in the sense that we will be on the hook for
supporting `_rebuild_meta_tensor_no_storage` in perpetuity in the
future, even if we change our mind about the serialization format.

This unblocks an FB production use case. I didn't add C++ support to minimize
blast area of this patch.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D29910535

Pulled By: ezyang

fbshipit-source-id: d98dcdd0108dfc3ae730a071d3c583b6d0281d21
2021-07-26 14:33:45 -07:00
peter
8d7338e820 Enable tests using named temp files on Windows (#49640)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/49640

Reviewed By: ngimel

Differential Revision: D25681548

Pulled By: malfet

fbshipit-source-id: 0e2b25817c98d749920cb2b4079033a2ee8c1456
2020-12-29 09:57:35 -08:00
Rong Rong
b98e35948f fix test_serialization not working with Windows. (#46120)
Summary:
fixes https://github.com/pytorch/pytorch/issues/45917.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46120

Reviewed By: janeyx99

Differential Revision: D24253317

Pulled By: walterddr

fbshipit-source-id: 6caa0970b3e3eb972d314639be773a104a4e89a5
2020-10-12 15:18:46 -07:00
Gregory Chanan
2070834b9e Improve error checking of Storage._writeFile. (#46036)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46036

Previously, this function didn't do error-bounds checking on the GetItem (GET_ITEM) calls, which led to issues like https://github.com/pytorch/pytorch/issues/46020.

A better solution would be to use pybind, but given writing the file is going to dominate bounds checking, this is strictly better.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D24228370

Pulled By: gchanan

fbshipit-source-id: f5d0a3d21ff12b4380beefe1e9954fa81ea2f567
2020-10-12 11:10:04 -07:00
Rong Rong
275bb5e801 Fix flakiness in caffe2/test:serialization - test_serialization_new_format_old_format_compat (#45915)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45915

Use temp file instead

Test Plan: buck test mode/opt-asan //caffe2/test:serialization -- 'test_serialization_new_format_old_format_compat \(test_serialization\.TestBothSerialization\)' --run-disabled --jobs 18 --stress-runs 10 --record-results

Reviewed By: malfet

Differential Revision: D24142278

fbshipit-source-id: 9c88330fc5664d464daa9124e67644f497353f3b
2020-10-06 18:11:58 -07:00
James Reed
9c82b570bf Fix delegating to jit.load from torch.load (#40937)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40937

Test Plan: Imported from OSS

Differential Revision: D22363816

Pulled By: jamesr66a

fbshipit-source-id: 50fc318869407fe8b215368026eaceb129b68a46
2020-07-06 09:00:13 -07:00
peter
c71ec1c717 Fix zip serialization for file > 2GiB for Windows (#40783)
Summary:
`long long == int64_t != long` in MSVC
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40783

Differential Revision: D22328757

Pulled By: ezyang

fbshipit-source-id: bc7301d6b0e7e00ee6d7ca8637e3fce7810b15e2
2020-07-01 08:15:27 -07:00
Wojciech Baranowski
fcadca1bda serialization: validate sparse tensors after loading (#34059)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/33439

This introduces torch._sparse_coo_tensor_unsafe(...) and
torch._validate_sparse_coo_tensor_args(...)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34059

Differential Revision: D22161254

Pulled By: ezyang

fbshipit-source-id: 994efc9b0e30abbc23ddd7b2ec987e6ba08a8ef0
2020-06-30 22:31:21 -07:00
James Reed
3ecae99dd9 Support Pathlike for zipfile serialization (#40723)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40723

Test Plan: Imported from OSS

Differential Revision: D22294575

Pulled By: jamesr66a

fbshipit-source-id: b157fa0ab02c4eb22cb99ac870942aeab352b0c5
2020-06-30 10:07:23 -07:00
James Reed
320164f878 Fix zip serialization for file > 2GiB (#40722)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40722

Test Plan: Imported from OSS

Differential Revision: D22294016

Pulled By: jamesr66a

fbshipit-source-id: 0288882873d4b59bdef37d018c030519c4be7f03
2020-06-29 19:17:06 -07:00
Michael Voznesensky
fce01a9bab [JIT] Make new zip serialization for torch save/load significantly (~70%) faster (#38379)
Summary:
Before:
```
2020-05-11 18:31:41 INFO     Benchmarking 'basic', best of 10 runs (with 1 warmup runs)
{
  "Big Tensors Save": {
    "mean": 17.8048762,
    "median": 17.458917
  },
  "Big Tensors Load": {
    "mean": 3.2556887,
    "median": 2.9668495000000004
  },
  "Small Tensors Save": {
    "mean": 4.0381357,
    "median": 3.9440125
  },
  "Small Tensors Load": {
    "mean": 5.8792499,
    "median": 5.603067
  },
  "benchmark_run_at": "2020-05-12T01:31:41"
}
```
After
```
Use zipfile serialization: True
2020-05-12 20:15:32 INFO     Benchmarking 'basic', best of 10 runs (with 1 warmup runs)
{
  "Big Tensors Save": {
    "mean": 4.7534657,
    "median": 4.646732
  },
  "Big Tensors Load": {
    "mean": 3.6001919,
    "median": 3.493285
  },
  "Small Tensors Save": {
    "mean": 4.1066924,
    "median": 4.1219255
  },
  "Small Tensors Load": {
    "mean": 6.3902358,
    "median": 6.36977
  },
  "benchmark_run_at": "2020-05-13T03:15:32"
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38379

Differential Revision: D21779494

Pulled By: voznesenskym

fbshipit-source-id: 694d65029a5b817424d454bd331e285df828c67a
2020-05-29 01:56:18 -07:00
Mike Ruberry
13120bf677 Updates assertEqual to require atol and rtol, removes positional atol (#38872)
Summary:
This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replace with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument.

In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872

Differential Revision: D21740237

Pulled By: mruberry

fbshipit-source-id: acbc027aa1d7877a49664d94db9a5fff91a07042
2020-05-27 06:31:07 -07:00
Rohan Varma
63e545e0fe Revert D21717199: [pytorch][PR] Updates assertEqual to require atol and rtol, removes positional atol
Test Plan: revert-hammer

Differential Revision:
D21717199

Original commit changeset: 9feb856f94ee

fbshipit-source-id: bfde9c39a5ce99f0ca6183a7dde703c65b7c8259
2020-05-26 18:23:59 -07:00
Mike Ruberry
6ddca30b2d Updates assertEqual to require atol and rtol, removes positional atol (#38872)
Summary:
This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replace with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument.

In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872

Differential Revision: D21717199

Pulled By: mruberry

fbshipit-source-id: 9feb856f94eee911b44f6c7140a1d07c1b026d3a
2020-05-26 08:30:23 -07:00
Nikita Shulga
47c4dca1ab Remove python-2 or python<3.5 checks from unit tests (#37252)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37252

Test Plan: CI

Differential Revision: D21241083

Pulled By: malfet

fbshipit-source-id: 44164b822f7905288abb2beda0175d2162d86143
2020-04-24 17:42:04 -07:00
David Reiss
e75fb4356b Remove (most) Python 2 support from Python code (#35615)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615

Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up a lot of cruft that we put in place to support it.
These changes were all done manually, and I skipped anything that seemed
like it would take more than a few seconds, so I think it makes sense to
review it manually as well (though using side-by-side view and ignoring
whitespace change might be helpful).

Test Plan: CI

Differential Revision: D20842886

Pulled By: dreiss

fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed
2020-04-22 09:23:14 -07:00
Nathan Goldbaum
84101f353e Avoid problematic pickle usages on Python 3.8.0 and 3.8.1 (#33824)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/32289

This has been fixed upstream as of Python 3.8.2. I think the easiest and least invasive way to ameliorate this is to catch the error condition and print a more informative error asking the user to update their Python version. It might be possible to buffer the data into memory and then read from memory, but that would be an invasive change and might cause memory exhaustion for very large models.

Suggestions for alternate fixes or ways to improve the error message wording are very welcome.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33824

Differential Revision: D20131722

Pulled By: ezyang

fbshipit-source-id: a6e3fbf4bf7f9dcce5772b36f7a622cbf14b5ae4
2020-02-26 21:15:38 -08:00
davidriazati
74ce3a032c Fix some bugs with zipfile serialization (#32244)
Summary:
Stacked PRs
 * #32958 - Make zip serialization the default
 * **#32244 - Fix some bugs with zipfile serialization**

It includes the following changes:
* Split up tests so that we can test both serialization methods
    * Loading something within a buffer doesn't work anymore, so those tests are only on the old serialization method (it's possible but introduces a big slowdown since it requires a linear scan of the entire zipfile to find the magic number at the end)
* Call `readinto` on a buffer if possible instead of `read` + a copy
* Disable CRC-32 checks on read (there was some issue where miniz said the CRC was wrong but `zipinfo` and `unzip` said the zip file was fine)
](https://our.intern.facebook.com/intern/diff/19418935/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32244

Pulled By: driazati

Reviewed By: eellison

Differential Revision: D19418935

fbshipit-source-id: df140854f52ecd04236225417d625374fd99f573
2020-02-05 15:32:14 -08:00
davidriazati
2060e0a9dd Split serialization tests to their own file (#32241)
Summary:
Stacked PRs
 * #32244 - Make zip serialization the default
 * **#32241 - Split serialization tests to their own file**

This makes them all easier to run as a batch. This PR is just a code move / fixing up imports. There are still some serialization tests in `test_torch.py` as part of `TestDeviceType`.
](https://our.intern.facebook.com/intern/diff/19415826/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32241

Pulled By: driazati

Differential Revision: D19415826

fbshipit-source-id: a3f6cfe1626ff2f9b9631c409bf525bd32e4639b
2020-01-28 15:04:05 -08:00