Previously, calling `torch.fx.wrap()` twice on the same function added two entries to `_wrapped_fns_to_patch`. When tracing, the patcher would process both; on the second entry, it would double-wrap the function (e.g. `wrap(wrap(orig_fn))`).
This made the wrapping observable after the trace: while a Patcher instance normally "reverts" the wrapping after tracing, reverting a double-wrapped function only goes from `wrap(wrap(orig_fn))` to `wrap(orig_fn)`.
This happens to work in normal fx usage (after all, the wrapper function behaves exactly like the original function), but it upsets torch.package, which does not expect to see a weird wrapper function in the graph.
This PR adds a dictionary to deduplicate `wrap()` calls, ensuring that the patcher operates only once per frame-fn pair.
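A minimal sketch of the deduplication idea, with illustrative names rather than the actual torch.fx internals:
```python
# Key each wrap() request by (frame globals, function name) so a repeated
# torch.fx.wrap() call on the same function registers only one patch.
# Names here are illustrative, not the real torch.fx implementation.
_wrapped_fns_to_patch: dict = {}

def wrap(fn_name: str, frame_globals: dict) -> None:
    key = (id(frame_globals), fn_name)
    if key not in _wrapped_fns_to_patch:  # a second wrap() is now a no-op
        _wrapped_fns_to_patch[key] = (frame_globals, fn_name)
```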
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104838
Approved by: https://github.com/Chillee
Summary:
In order to better track models after serialization, this change writes a serialization_id as a UUID to the inline container. Having this ID enables traceability of a model across saving and loading events.
The serialization_id is generated as a new UUID every time serialization takes place. It can be thought of as a model snapshot identifier at the time of serialization.
Test Plan:
```
buck2 test @//mode/dev //caffe2/caffe2/serialize:inline_container_test
```
Local tests:
```
buck2 run @//mode/opt //scripts/atannous:example_pytorch_package
buck2 run @//mode/opt //scripts/atannous:example_pytorch
buck2 run @//mode/opt //scripts/atannous:example_pytorch_script
```
```
$ unzip -l output.pt
Archive:  output.pt
   Length      Date    Time    Name
---------  ---------- -----    ----
       36  00-00-1980 00:00    output/.data/serialization_id
      358  00-00-1980 00:00    output/extra/producer_info.json
       58  00-00-1980 00:00    output/data.pkl
      261  00-00-1980 00:00    output/code/__torch__.py
      326  00-00-1980 00:00    output/code/__torch__.py.debug_pkl
        4  00-00-1980 00:00    output/constants.pkl
        2  00-00-1980 00:00    output/version
---------                      -------
     1045                      7 files
```
```
unzip -p output.pt "output/.data/serialization_id"
a9f903df-cbf6-40e3-8068-68086167ec60
```
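For reference, the same check done in Python with the standard library (assuming the archive layout shown above):
```python
import zipfile

# Read the serialization_id record back out of the saved archive.
with zipfile.ZipFile("output.pt") as zf:
    sid = zf.read("output/.data/serialization_id").decode()
print(sid)  # e.g. a9f903df-cbf6-40e3-8068-68086167ec60
```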
Differential Revision: D45683657
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100994
Approved by: https://github.com/davidberard98
Summary:
Make it a bit easier to run the tests anywhere (and avoid skipping them) by using in-memory buffers instead of temporary files.
[Er, still figuring out how the sync tooling works, I'll send this against github once the first diff is landed]
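For illustration, the same buffer-instead-of-tempfile pattern in Python (the actual change is in the inline container tests, not this code):
```python
import io
import torch

# Serialize into an in-memory buffer instead of a temporary file,
# so the round-trip test needs no filesystem access at all.
buf = io.BytesIO()
torch.save({"w": torch.ones(2, 2)}, buf)
buf.seek(0)
state = torch.load(buf)
assert torch.equal(state["w"], torch.ones(2, 2))
```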
Test Plan: buck2 test
Reviewed By: fluckydog232
Differential Revision: D44818261
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98798
Approved by: https://github.com/ezyang
Summary: IL generates massive function names, which meant that the pickle opcode used was BINUNICODE instead of the short version -- and the name would then silently get skipped while pickling with protocol 4.
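A small demonstration of the opcode split (not the torch.package code itself): protocol 4 emits SHORT_BINUNICODE for short strings but falls back to BINUNICODE for long names.
```python
import pickle
import pickletools

# Short names fit in SHORT_BINUNICODE; very long ones need BINUNICODE.
pickletools.dis(pickle.dumps("f", protocol=4))        # SHORT_BINUNICODE 'f'
pickletools.dis(pickle.dumps("f" * 300, protocol=4))  # BINUNICODE 'fff...'
```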
Differential Revision: D44815351
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98674
Approved by: https://github.com/ezyang
Applies the remaining flake8-comprehensions fixes and checks. This change replaces all remaining unnecessary generator expressions with list/dict/set comprehensions, which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as `set(a for a in b)`, resolving them into just the `set` call.
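Representative examples of the rewrites:
```python
nums = [3, 1, 2]

squares = list(x * x for x in nums)  # before: generator wrapped in list()
squares = [x * x for x in nums]      # after: list comprehension

unique = set(a for a in nums)        # before: useless generator
unique = set(nums)                   # after: direct set() call
```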
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676
Approved by: https://github.com/ezyang
I applied some flake8 fixes and enabled checking for them in the linter. I also enabled some checks for my previous comprehensions PR.
This is a follow-up to #94323, where I enable the flake8 checkers for the fixes I made and fix a few more of them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94601
Approved by: https://github.com/ezyang
Summary:
Makes torch.package debugging more transparent by
1. Pointing out modules in the standard library that are not implicitly externed.
2. Creating a debug mode for users to find the source of broken modules.
Test Plan: Run package tests
Differential Revision: D42728753
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92939
Approved by: https://github.com/kurman
Summary: To get the source for a particular module, the "correct" thing to do is to check the module's spec and use `get_source` if the loader is a SourceFileLoader, since subclasses may look elsewhere than `__file__` and the spec is the source of truth. For the torch packager, however, we prefer to use linecache; but since the loader could still change the file, we figure out the file for the module using the spec's loader rather than `module.__file__`, when possible.
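A sketch of the lookup order described above (illustrative, not the exact torch.package code):
```python
import linecache

def get_module_source(module):
    # Prefer the spec's loader, which is the source of truth for the file
    # path; fall back to __file__ when no loader information is available.
    filename = getattr(module, "__file__", None)
    spec = getattr(module, "__spec__", None)
    if spec is not None and hasattr(spec.loader, "get_filename"):
        filename = spec.loader.get_filename(spec.name)
    if filename is None:
        return None
    # Read through linecache, as the packager prefers.
    return "".join(linecache.getlines(filename))
```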
Test Plan: This code path will get exercised by CI. Also added a test for remapped files.
Differential Revision: D41412983
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90258
Approved by: https://github.com/PaliC
Summary:
This pr addresses https://github.com/pytorch/multipy/issues/82 and https://github.com/pytorch/multipy/issues/44. The changes will be copied over to [pytorch/multipy](https://github.com/pytorch/multipy) as well.
A C extension module behaves a bit differently from a normal Python package, as it does not contain a `__path__` attribute. However, these modules still have information about their submodules. This PR therefore checks whether a module is a C extension module and, if so, whether the module we are looking for is among its children.
For example, when importing `torch._C._nn`, we check whether the parent `torch._C` is a C extension module if necessary, and then check whether `torch._C._nn` is a proper child of `torch._C`.
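A sketch of that child check (the helper name is illustrative):
```python
import importlib

def is_extension_submodule(parent_name: str, child_name: str) -> bool:
    # C extension modules have no __path__, but their submodules are still
    # reachable as attributes of the parent (e.g. torch._C._nn on torch._C).
    parent = importlib.import_module(parent_name)
    if hasattr(parent, "__path__"):
        return False  # regular package; the normal import machinery applies
    attr = child_name[len(parent_name) + 1:]
    return hasattr(parent, attr)

# is_extension_submodule("torch._C", "torch._C._nn")  -> True
```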
Differential Revision: D37630120
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80917
Approved by: https://github.com/d4l3k
If installed with pep517 support, `torchvision` will be built against the released version of PyTorch rather than against the one currently installed on the system.
Also update `torchvision` hash to 8a45147f9d and:
- Added `maskrcnn_resnet50_fpn_v2`, `retinanet_resnet50_fpn_v2`, `ssd300_vgg16`, `fcos_resnet50_fpn` and `ssdlite320_mobilenet_v3_large` to the list of untraceable models
- Set default input size to (1, 3, 16, 224, 224) for `mvit_v1_b` model
- Skipped the `test_roi_aligned`, `test_batched_nms`, `test_roi_pooled` and `test_roi_align_aligned` ONNX tests (tracked in https://github.com/pytorch/pytorch/issues/81121)
- Skipped TorchVision integration tests in `test_package` (tracked in https://github.com/pytorch/pytorch/issues/81115)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81074
Approved by: https://github.com/kit1980
Summary:
Applies new import merging and sorting from µsort v1.0.
When merging imports, µsort will make a best effort to move associated
comments to match merged elements, but there are known limitations due to
the dynamic nature of Python and developer tooling. These changes should
not produce any dangerous runtime changes, but may require touch-ups to
satisfy linters and other tooling.
Note that µsort uses case-insensitive, lexicographical sorting, which
results in a different ordering compared to isort. This provides a more
consistent sorting order, matching the case-insensitive order used when
sorting import statements by module name, and ensures that "frog", "FROG",
and "Frog" always sort next to each other.
For details on µsort's sorting and merging semantics, see the user guide:
https://usort.readthedocs.io/en/stable/guide.html#sorting
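For example (hypothetical module names, purely to illustrate the ordering):
```python
# before: isort's case-sensitive ASCII order splits mixed-case names apart
import Bar
import Foo
import baz

# after: µsort's case-insensitive order keeps them together
import Bar
import baz
import Foo
```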
Test Plan: S271899
Reviewed By: lisroach
Differential Revision: D36402110
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78973
Approved by: https://github.com/osalpekar
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72237
Add a generic zip file reader/writer to torch.package in order to remove the dependency on torch for non-TorchScript / tensor-related usages of package. This also enables users to create a derived class from the zip file reader/writer classes to supply their own serialization/deserialization if desired for performance reasons.
https://www.internalfb.com/intern/diff/D35423079/ was reverted because this refactor changed where most of the implementation components of PackageExporter/PackageImporter (like ModuleActionType_, etc.) come from.
This diff also changes the import paths of these components to point to the correct file compared to D35423079.
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D35423079
Pulled By: PaliC
fbshipit-source-id: 31abc4364d5fd007911cfb67cf36ebfac5d786f4
(cherry picked from commit 023b0d1445e0b1e1bb7a03c660cd62eb9d26d2a6)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74702
- Add a flag in the constructor.
- Add an if condition routing to extern C extension modules in the `_intern_module` method.
- Add a unit test for the new condition.
Test Plan: Imported from OSS
Reviewed By: PaliC
Differential Revision: D35124731
fbshipit-source-id: a4b7fdf3210e0ad4bfd1ea30fd94595d10405987
(cherry picked from commit 57239a77ae099328025ab2d634e7880bd14a473b)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74610
Adding the Python version to the exported package and reading it on import, as per this GitHub issue: https://github.com/pytorch/pytorch/issues/74068
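A hedged sketch of how the recorded version might be surfaced on the import side (the `python_version()` accessor here is an assumption, not a confirmed API):
```python
from torch.package import PackageImporter

imp = PackageImporter("model.pt")
# Version of the interpreter that exported the package, e.g. "3.8.10";
# the accessor name is assumed for illustration.
print(imp.python_version())
```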
ghstack-source-id: 152003088
Test Plan: CI Tests
Reviewed By: PaliC
Differential Revision: D35062709
fbshipit-source-id: 04091a1255a09b96255112a60d31df127c424193
(cherry picked from commit ed39fd54b8b20918dac89a2873ecccf06aafd724)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74315
Now, instead of spitting out a NotImplementedError when the PackageExporter finds an object from a mocked module, it combines these errors with the rest of the packaging errors of PackageExporter, producing something like
```
torch.package.package_exporter.PackagingError:
* Module was mocked out, but is still being used in the package.Please intern or extern the mocked modules if objects are supposed to be inthe package.
package_a
Context: Object(s) '['PackageASubpackageObject']' from module package_a was mocked out during packaging but is being used in resource - obj.pkl in package obj.
package_a.subpackage
Context: Object(s) '['PackageASubpackageObject']' from module package_a.subpackage was mocked out during packaging but is being used in resource - obj.pkl in package obj.
```
This makes it significantly easier to fix mocked-object errors, as they all appear at once.
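A minimal way to provoke the aggregated error (module and object names are placeholders):
```python
import torch.package

# obj is assumed to be an object whose class lives in package_a.
with torch.package.PackageExporter("obj.pt") as exporter:
    exporter.mock("package_a")                   # mock out package_a
    exporter.save_pickle("obj", "obj.pkl", obj)  # obj still depends on it
# On exit, the exporter raises a single PackagingError listing every mocked
# module that is still in use, rather than one NotImplementedError at a time.
```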
Test Plan: Imported from OSS
Reviewed By: d4l3k
Differential Revision: D34951973
Pulled By: PaliC
fbshipit-source-id: 01ee4ba3767967ef9a9bcd69ad86362ebc100b2d
(cherry picked from commit 900edd270ee8f5802fc6e56df08fff6b073ac6f2)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74315
Now, instead of spitting out a NotImplementedError when the PackageExporter finds an object from a mocked module, it combines these errors with the rest of the packaging errors of PackageExporter, producing something like
```
torch.package.package_exporter.PackagingError:
* Module was mocked out, but is still being used in the package.Please intern or extern the mocked modules if objects are supposed to be inthe package.
package_a
Context: Object(s) '['PackageASubpackageObject']' from module package_a was mocked out during packaging but is being used in resource - obj.pkl in package obj.
package_a.subpackage
Context: Object(s) '['PackageASubpackageObject']' from module package_a.subpackage was mocked out during packaging but is being used in resource - obj.pkl in package obj.
```
This makes it significantly easier to fix mocked-object errors, as they all appear at once.
Test Plan: Imported from OSS
Reviewed By: aivanou
Differential Revision: D34932200
Pulled By: PaliC
fbshipit-source-id: 7f12bd88dbfbad974fd04b5dcaba3203b5c68a04
(cherry picked from commit 73df434ddd3e26f0e4c5ea3dd2ca1b6984736213)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70641
Raises a NotImplementedError if we attempt to pickle an object which uses a mocked module. We no longer have to load the object to get this check; instead, it happens right on the saving path.
Review history is on https://github.com/pytorch/pytorch/pull/69793; the PR was moved to a different branch because the original branch got corrupted.
Test Plan: Imported from OSS
Reviewed By: suo
Differential Revision: D33414365
Pulled By: PaliC
fbshipit-source-id: 6d72ddb05c47a3d060e9622ec0b6e5cd6c6c71c8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68028
Today, we demangle a typename before passing it to the TorchScript
compiler. This breaks compilation of torch classes in cases where we are
attempting to script the same class name from both inside a package and
outside of it, since we would return the same qualified name for both.
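An illustration of the collision (package and class names are placeholders):
```python
from torch.package import PackageImporter
import my_pkg  # the same source, importable normally

imp = PackageImporter("archive.pt")
PackagedModel = imp.import_module("my_pkg").Model

print(my_pkg.Model.__module__)   # "my_pkg"
print(PackagedModel.__module__)  # e.g. "<torch_package_0>.my_pkg"

# Demangling before TorchScript compilation collapsed both names to
# "my_pkg.Model", so scripting both classes produced the same qualified
# name and collided; keeping the names distinct avoids that.
```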
Differential Revision: D32261907
Test Plan: Imported from OSS
Reviewed By: saketh-are
Pulled By: suo
fbshipit-source-id: 921bc03ad385d94b9279fbc6f3b7dcd0ddbe5bc7
Summary:
Fixes https://github.com/pytorch/pytorch/issues/65154: tests backwards compatibility of torch.package by checking that packages created with earlier versions can still be loaded.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66739
Reviewed By: suo
Differential Revision: D31771526
Pulled By: PaliC
fbshipit-source-id: ba8c652c647b94114a058e4c7d7f1c7ce6033d84
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62030
Remove dtype tracking from Python Storage interface, remove all the different `<type>Storage` classes except for `ByteStorage`, and update serialization accordingly, while maintaining as much FC/BC as possible
Fixes https://github.com/pytorch/pytorch/issues/47442
* **THE SERIALIZATION FORMAT IS FULLY FC/BC.** We worked very hard to make sure this is the case. We will probably want to break FC at some point to make the serialization structure of tensors make more sense, but not today.
* There is now only a single torch.ByteStorage class. Methods like `Tensor.set_` no longer check that the dtype of storage is appropriate.
* As we no longer know what the dtype of a storage is, we've **removed** the `size` method from Storage, replacing it with `nbytes`. This helps catch otherwise silent errors where you confuse the number of elements with the number of bytes.
* `Storage._new_shared` takes a `nbytes` kwarg and will reject previous positional only calls. `Storage._new_with_file` and `_set_from_file` require explicit element size arguments.
* It's no longer possible to convert storages to different types using the float/double/etc methods. Instead, do the conversion using a tensor.
* It's no longer possible to allocate a typed storage directly using FloatStorage/DoubleStorage/etc constructors. Instead, construct a tensor and extract its storage. The classes still exist but they are used purely for unpickling.
* The preexisting serialization format stores the dtype with the storage, and in fact this dtype is used to determine the dtype of the tensor overall.
To accommodate this case, we introduce a new TypedStorage concept that exists only during unpickling time and is used to temporarily store the dtype so we can construct a tensor. **If you overrode the handling of pickling/unpickling, you MUST add handling for TypedStorage**, or your serialization code will degrade to standard file-based serialization.
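A minimal sketch of the new semantics described above (values illustrative):
```python
import torch

x = torch.arange(4, dtype=torch.float32)
s = x.storage()
print(s.nbytes())  # 16 -- storages now count bytes, not elements

# dtype conversion now goes through a tensor, not storage.float()/double():
y = x.to(torch.float64)
```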
Original pull request: https://github.com/pytorch/pytorch/pull/59671
Reviewed By: soulitzer, ngimel
Differential Revision: D29466819
Pulled By: ezyang
fbshipit-source-id: 4a14e5d3c2b08e06e558683d97f7378a3180b00e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65101
As title. Previously this was guarded against for implementation
simplicity, as we didn't really think there was a use case for saving a
mangled module name directly.
But people started doing stuff like:
```
exporter.save_module(my_imported_obj.__module__)
```
which implicitly passes along the mangled module name.
This PR makes it so that given `PackageImporter` instance can always
import modules that it created, and changes `PackageExporter` to
properly demangle the resulting module name when writing the package to
the export archive.
Differential Revision: D30975712
Test Plan: Imported from OSS
Pulled By: suo
fbshipit-source-id: d9e849bf651713890e72dccdcef74fa52d377149
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63121
Re-introducing this diff with a small change: skip setting Tracer classes on GraphModules when the Tracer class is not defined at module level (such classes cannot be pickled).
Previous (reverted) pull request: https://github.com/pytorch/pytorch/pull/62497
Reviewed By: houseroad
Differential Revision: D30252776
fbshipit-source-id: 42d2bc846e4b32d00563419c38c02b63cd0986e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63053
Original commit changeset: eca09424ad30
The original diff, D30019214 (6286d33878), broke the publish flow in model saving.
Test Plan: ci
Differential Revision: D30236517
fbshipit-source-id: 3e05db02fc1cbbc2ed262c83bf56d555277abb34
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62497
Previously named: add support for custom tracer in `__reduce_package__`.
Stores the Tracer class on a Graph created by a Tracer, and copies the Tracer class into the GraphModule's state so that when a GraphModule is packaged by torch.package, it can be reconstructed with the same Tracer and GraphModule class name.
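A sketch of the behavior (class names illustrative):
```python
import torch
import torch.fx

class MyTracer(torch.fx.Tracer):  # defined at module level, so picklable
    pass

class Net(torch.nn.Module):
    def forward(self, x):
        return x + 1

tracer = MyTracer()
graph = tracer.trace(Net())
gm = torch.fx.GraphModule(tracer.root, graph)
# gm's state now records MyTracer, so torch.package can reconstruct the
# GraphModule with the same Tracer class via __reduce_package__.
```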
Reviewed By: suo
Differential Revision: D30019214
fbshipit-source-id: eca09424ad30feb93524d481268b066ea55b892a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61807
These shouldn't be separate files; they test the same thing.
Differential Revision: D29748967
Test Plan: Imported from OSS
Reviewed By: Lilyjjo
Pulled By: suo
fbshipit-source-id: 177f40fa460d00d064dfd1f33a0b6656b214a296
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61806
Currently, if you do `save_pickle` on a ScriptModule and then `save_pickle`
on a tensor, a `0.storage` tensor would be written *twice* to the zip
archive. This caused weird bugs on the serializing side (it presented as an
ASAN-detected heap buffer overflow, because we tried to read more memory
from a tensor than we actually had).
Turns out this was because when we did:
```
self.storage_context = self.script_module_serializer.storage_context()
```
it returned a new copy of the storage context, so we weren't actually
assigning unique names to tensors!
This PR fixes the issue by making `(De)SerializationStorageContext`
non-copyable and fixing up the parts of the bindings that returned it by
copy.
Differential Revision: D29748969
Test Plan: Imported from OSS
Reviewed By: Lilyjjo
Pulled By: suo
fbshipit-source-id: c2f89ab270e07e7a111fb35c545b5e07b804dc3c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61469
This feature is not supported; error out early.
Differential Revision: D29639797
Test Plan: Imported from OSS
Reviewed By: Lilyjjo
Pulled By: suo
fbshipit-source-id: 775ed78638fb6da8f830b632726b00c0533ed176