Fixes #144976
Using approach ① `IO[bytes]`, but could also try with a protocol.
## Notes:
- Moved `torch.serialization.FILE_LIKE` to `torch.types.FileLike`.
- Used the `FileLike` annotation where it makes sense.
- Made sure those functions also support `os.PathLike`.
- Replaced `isinstance(x, io.BytesIO)` with `isinstance(x, (io.IOBase, IO))` where appropriate.
- Replaced `BinaryIO` with `IO[bytes]` (the two ABCs are almost identical; the only difference is that `BinaryIO` allows `bytearray` input to `write`, whereas `IO[bytes]` only allows `bytes`).
- Needed to make `torch.serialization._opener` generic to avoid LSP violations.
- Skipped `torch/onnx/verification` for now (its functions use `BytesIO.getvalue`, which is not part of the `IO[bytes]` ABC; this seems redundant anyway, since e.g. `onnx.load` supports `str | PathLike[str] | IO[bytes]` directly).
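To illustrate the notes above, here is a minimal sketch of what such a path-or-stream alias and a function annotated with it might look like (`FileLike` and `write_blob` are illustrative names, not the exact `torch.types` definitions):

```python
import os
from typing import IO, Union

# Illustrative alias in the spirit of torch.types.FileLike:
# accept a filesystem path or an already-open binary stream.
FileLike = Union[str, os.PathLike, IO[bytes]]

def write_blob(data: bytes, f: FileLike) -> None:
    # Paths (str or os.PathLike) are opened and closed here;
    # file-like objects are written to and left open for the caller.
    if isinstance(f, (str, os.PathLike)):
        with open(f, "wb") as fh:
            fh.write(data)
    else:
        f.write(data)
```

With this shape, both `write_blob(b"x", "out.bin")` and `write_blob(b"x", io.BytesIO())` type-check and behave sensibly.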
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144994
Approved by: https://github.com/ezyang, https://github.com/Skylion007
Adds a ruff lint rule to ban raising raw exceptions. Most of these should, at the very least, be a `RuntimeError`, `ValueError`, `TypeError`, or some other more specific error. There are hundreds of instances of these bad exception types already in the codebase, so I have noqa'd most of them. Hopefully this error code will get committers to rethink which exception type they should raise when they submit a PR.
I also encourage people to gradually fix the existing noqas that have been added, so they can be removed over time and our exception typing can be improved.
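For illustration, a minimal before/after of the pattern such a rule flags (the specific rule is likely ruff's TRY002, raise-vanilla-class, though that code is an assumption here, as are the function names):

```python
# Before: flagged by the lint rule -- a bare Exception forces callers
# to catch everything just to handle this one failure mode.
def set_rate(rate):
    if rate < 0:
        raise Exception("negative rate")
    return rate

# After: a specific exception type that callers can catch precisely.
def set_rate_checked(rate):
    if rate < 0:
        raise ValueError(f"rate must be non-negative, got {rate}")
    return rate
```

A caller can now write `except ValueError:` around the call instead of a blanket `except Exception:` that would also swallow unrelated bugs.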
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124570
Approved by: https://github.com/ezyang
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57661
The pickle "specification" (pickletools.py) states that the argument to
a BINUNICODE opcode must be UTF-8 encoded. However, if a PyTorch custom
class returns a non-UTF-8 std::string from its pickle method, the
libtorch Pickler will write it to the output pickle without complaining.
Python's _Unpickler (the Python implementation of Unpickler) always
throws an exception when trying to deserialize these invalid pickles.
We still want to be able to dump these pickle files. Update
DumpUnpickler to create its own opcode dispatch table (initialized as a
clone of the _Unpickler dispatch table) and patch in a custom function
for the BINUNICODE op. We try to emulate the default behavior, but any
UnicodeDecodeError is caught and replaced with a dummy object. This
could violate the assumptions of a user that expects a str in that
position, so we disable this behavior by default.
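The dispatch-table cloning described above can be sketched against Python's pure-Python `_Unpickler` (a simplified illustration of the technique, not the actual DumpUnpickler code; `pickle._Unpickler` is an internal CPython class):

```python
import io
import pickle
import struct

class LenientUnpickler(pickle._Unpickler):
    # Clone the dispatch table so the patch below does not leak into
    # the base class used by everyone else.
    dispatch = dict(pickle._Unpickler.dispatch)

def _load_binunicode_lenient(self):
    # Emulate the stock BINUNICODE handler: a 4-byte little-endian
    # length, then that many bytes of (supposedly) UTF-8 data.
    (length,) = struct.unpack("<I", self.read(4))
    data = self.read(length)
    try:
        self.append(str(data, "utf-8", "surrogatepass"))
    except UnicodeDecodeError:
        # Replace undecodable data with a dummy object instead of failing.
        self.append({"__invalid_utf8__": data})

# Patch only the BINUNICODE ('X') opcode in the cloned table.
LenientUnpickler.dispatch[pickle.BINUNICODE[0]] = _load_binunicode_lenient
```

A hand-built pickle carrying invalid UTF-8 then loads as the dummy object rather than raising, while valid strings decode as usual.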
Update model_dump to recognize this special object and allow it to be
rendered.
Test Plan: Dumped and viewed a model with an invalid string in an object state.
Reviewed By: malfet
Differential Revision: D28531392
Pulled By: dreiss
fbshipit-source-id: ab5aea20975a0ef53ef52a880deaa2c5a626e4a2
Summary:
For some of the end to end flow projects, we will need the capabilities to read module information during model validation or model publishing.
This diff creates model_reader.py with utilities for reading model content; it includes the following functionality:
1. read the model bytecode version;
2. check if a model is lite PyTorch script module;
3. check if a model is PyTorch script module.
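As a sketch of check (2): one observable difference is that lite-interpreter archives carry a `bytecode.pkl` record, while eager TorchScript archives only contain records like `data.pkl` and `constants.pkl`. This relies on an assumption about the archive layout and is not the actual model_reader.py code:

```python
import zipfile

def is_lite_module(f) -> bool:
    # TorchScript models are zip archives; lite-interpreter models
    # additionally serialize their bytecode into a bytecode.pkl record.
    with zipfile.ZipFile(f) as zf:
        return any(name.endswith("bytecode.pkl") for name in zf.namelist())
```

`f` can be a path or a file-like object, since `zipfile.ZipFile` accepts both.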
This diff is recreated from the reverted diff: D24655999 (7f056e99dd).
Test Plan:
```
[xcheng16@devvm1099]/data/users/xcheng16/fbsource/fbcode% buck test //caffe2/torch/fb/mobile/tests:mobile_model_reader_tests
Action graph will be rebuilt because files have been added or removed.
Parsing buck files: finished in 10.4 sec
Creating action graph: finished in 22.2 sec
Building: finished in 01:29.1 min (100%) 10619/10619 jobs, 1145 updated
Total time: 02:01.8 min
More details at https://www.internalfb.com/intern/buck/build/f962dfad-76f9-457a-aca3-768ce20f0c31
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 172633f6-6b5b-49e9-a632-b4efa083a001
Trace available for this run at /tmp/tpx-20201109-165156.109798/trace.log
Started reporting to test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649712677511
✓ ListingSuccess: caffe2/torch/fb/mobile/tests:mobile_model_reader_tests - main (18.229)
✓ Pass: caffe2/torch/fb/mobile/tests:mobile_model_reader_tests - test_is_pytorch_lite_module (caffe2.torch.fb.mobile.tests.test_model_reader.TestModelLoader) (8.975)
✓ Pass: caffe2/torch/fb/mobile/tests:mobile_model_reader_tests - test_is_pytorch_script_module (caffe2.torch.fb.mobile.tests.test_model_reader.TestModelLoader) (9.136)
✓ Pass: caffe2/torch/fb/mobile/tests:mobile_model_reader_tests - test_read_module_bytecode_version (caffe2.torch.fb.mobile.tests.test_model_reader.TestModelLoader) (9.152)
Summary
Pass: 3
ListingSuccess: 1
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649712677511
```
Reviewed By: husthyc
Differential Revision: D24848563
fbshipit-source-id: ab3371e111206a4bb4d07715c3314596cdc38d2c
Summary:
For some of the end to end flow projects, we will need the capabilities to read module information during model validation or model publishing.
This diff creates model_reader.py with utilities for reading model content; it includes the following functionality:
1. read the model bytecode version;
2. check if a model is lite PyTorch script module;
3. check if a model is PyTorch script module.
Test Plan:
```
[xcheng16@devvm1099]/data/users/xcheng16/fbsource/fbcode% buck test pytorch_mobile/utils/tests:mobile_model_reader_tests
Processing filesystem changes: finished in 1.5 sec
Parsing buck files: finished in 1.6 sec
Building: finished in 4.9 sec (100%) 9249/43504 jobs, 2 updated
Total time: 6.5 sec
More details at https://www.internalfb.com/intern/buck/build/6d0e2c23-d86d-4248-811f-31cb1aa7eab3
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 2ffccd62-ece5-44b5-8350-3a292243fad9
Trace available for this run at /tmp/tpx-20201030-122220.664763/trace.log
Started reporting to test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649711969390
✓ ListingSuccess: pytorch_mobile/utils/tests:mobile_model_reader_tests - main (10.234)
✓ Pass: pytorch_mobile/utils/tests:mobile_model_reader_tests - test_is_pytorch_lite_module (pytorch_mobile.utils.tests.test_model_reader.TestModelLoader) (7.039)
✓ Pass: pytorch_mobile/utils/tests:mobile_model_reader_tests - test_is_pytorch_script_module (pytorch_mobile.utils.tests.test_model_reader.TestModelLoader) (7.205)
✓ Pass: pytorch_mobile/utils/tests:mobile_model_reader_tests - test_read_module_bytecode_version (pytorch_mobile.utils.tests.test_model_reader.TestModelLoader) (7.223)
Summary
Pass: 3
ListingSuccess: 1
Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/3940649711969390
Reviewed By: husthyc
Differential Revision: D24655999
fbshipit-source-id: 5095ca158d89231fb17285d445548f91ddb89bab
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35168
Sometimes when a saved model isn't working, it's nice to be able to look
at the contents of the pickle files. Unfortunately, pickletools output
isn't particularly readable, and unpickling is often either not possible
or runs so much post-processing code that it's not possible to tell
exactly what is present in the pickled data.
This script uses a custom Unpickler to unpickle (almost) any data into
stub objects that have no dependency on torch or any other runtime types
and suppress (almost) any postprocessing code.
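A toy version of the stub-object idea looks like this (illustrative only; the real implementation lives in `torch.utils.show_pickle`):

```python
import io
import pickle

class Stub:
    # Stands in for any global the pickle references; records how it
    # was called instead of running real constructors.
    def __init__(self, module, name):
        self.module = module
        self.name = name
        self.args = None

    def __call__(self, *args):
        self.args = args
        return self

    def __repr__(self):
        return f"{self.module}.{self.name}{self.args if self.args else ''}"

class StubUnpickler(pickle.Unpickler):
    # Never import the real class; hand back a stub instead, so
    # unpickling has no dependency on torch or other runtime types.
    def find_class(self, module, name):
        return Stub(module, name)
```

Because `find_class` is the single hook through which pickle resolves globals, every class reference and reconstruction call is captured by a `Stub` rather than executing postprocessing code.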
As a convenience, the wrapper can search through zip files, supporting
command lines like
`python -m torch.utils.show_pickle /path/to/model.pt1@*/data.pkl`
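The `archive@pattern` search can be approximated with stdlib `zipfile` plus `fnmatch` (a hypothetical helper, not the actual wrapper):

```python
import fnmatch
import zipfile

def find_pickles(archive, pattern):
    # Return the names of archive members matching a glob pattern,
    # e.g. "*/data.pkl" inside a .pt1 zip.
    with zipfile.ZipFile(archive) as zf:
        return [n for n in zf.namelist() if fnmatch.fnmatch(n, pattern)]
```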
When the module is invoked as main, we also install a hack in pprint to
allow semi-reasonable formatting of our stub objects.
Test Plan: Ran it on a data.pkl, constants.pkl, and a debug pkl
Differential Revision: D20842550
Pulled By: dreiss
fbshipit-source-id: ef662d8915fc5795039054d1f8fef2e1c51cf40a