Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44324
**Summary**
This commit adds reference semantics to TorchScript class types;
modifications made to them within TorchScript will be visible in Python.
**Test Plan**
This commit adds a unit test to `TestClassType` that checks that
modifications made to a class type instance passed into TorchScript are
visible in Python after executing the scripted function or module.
**Fixes**
This commit closes#41421.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D24912807
Pulled By: SplitInfinity
fbshipit-source-id: d64ac6211012425b040b987e3358253016e84ca0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60384
Currently inlining module graph will drop module hierarchy info on Python side. Here we retrieve the module hierarchy from cpp side and expose it to a new Python API on Node called `moduleHierarchy()`.
Test Plan:
Usage:
```
torch._C._jit_pass_inline(module.graph)
torch._C._jit_pass_propagate_shapes_on_graph(module.graph)
node = module.graph.findNode("quantized::conv2d_relu")
'top(' + module.original_name + ').' + node.moduleHierarchy() + '.' + node.kind()
```
Output:
```
'top(QuantWrapper).module(FBNetHR).0(Sequential).xif0_0(ConvBNRelu).conv(ConvReLU2d).quantized::conv2d_relu'
```
Reviewed By: kimishpatel
Differential Revision: D29252169
fbshipit-source-id: 74163a87f919e061e5e75dfebc4c5cdbe8489d93
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57334
Here's a possibly controversial PR. These counters got in the way of
generalizing the fuser tests to handle arbitrary devices, and I guess I'm just
generally skeptical that they provide much value. While true that they let us
observe whether fusion groups were created, we already have assertions based on
the shape of the graph, and I'm not sure that I trust those any less than these
counters.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D29471484
Pulled By: bertmaher
fbshipit-source-id: f6d76f6e72dbfb581acff1d834b0c74500941b57
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59735
1. Fixes ABA storage identity problem during serialization for `torch.package` by keeping reference of serialized storages through lifetime of `PackageExporter` to prevent reuse of memory address. Achieved by extending logic used in solution to mobile's same issue.
2. Adds determinism to naming scheme of serialized storages in export code paths which utilize `tensor_cdata_naming_scheme`(introduced 2nd mapping in `StorageContext`, now maps `storage cdata ptr` -> `unique id`, `unique id` -> `c10::Storage`)
3. Additionally uses presence of a storage in the `StorageContext` instance as marker for if a storage has been serialized or not, removing the need to scan the `PythonStreamWriter` for presence of the storage's serialization file
Test Plan: Imported from OSS
Reviewed By: suo
Differential Revision: D29075276
Pulled By: Lilyjjo
fbshipit-source-id: 15a5c30b1de99c5bd7079388f2db9b6ece2eca12
Summary:
Description:
- Before this, logging level could only be changed by changing the env
variable "PYTORCH_JIT_LOG_LEVEL"
- Can change the level from python now
- Have not added stream configuration for now
- Configuration is stored in a singleton class managing the options
Issue Link: https://github.com/pytorch/pytorch/issues/54188
Gotchas:
- Created separate functions
`::torch::jit::get_jit_logging_levels/set_jit_logging_levels` instead of
using the singleton class's method directly
- This is because when running test cases, two different instances
of the singleton are created for the test suite and the actual code
(`jit_log.cpp`)
- On using these methods directly, `is_enabled` calls the singleton
in `jit_log.cpp` while we are setting the config using another
singleton
- See: https://stackoverflow.com/questions/55467246/my-singleton-can-be-called-multiple-times
API:
- To set the level: `torch._C._jit_set_logging_option("level")`
- To get the level: `torch._C._jit_get_logging_option()`
Testing:
- UTs were added for C++
- A very simple UT was added for python to just check if the API is
being called correctly
- The API was checked by running trace in a sample python file
- Set env variable to "" and used `_jit_set_logging_option` in python to set the variable to `>dead_code_elimination`
- The error output had logs of form [DUMP..] [UPDATE...] etc
Fixes https://github.com/pytorch/pytorch/issues/54188
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58821
Reviewed By: soulitzer
Differential Revision: D29116712
Pulled By: ZolotukhinM
fbshipit-source-id: 8f2861ee2bd567fb63b405953d035ca657a3200f
Summary:
This commit removes the warning that suggests that users script their
dictionaries before passing them into TorchScript code. The ScriptDict feature
is not fully ready, so it does not make sense to recommend this yet.
Test Plan:
Sandcastle.
In addition, the PyPER test broken by the original diff passes:
```
buck test mode/opt //caffe2/torch/fb/training_toolkit/backend/tests:test_model_materializer_full_sync_lwt -- --exact 'caffe2/torch/fb/training_toolkit/backend/tests:test_model_materializer_full_sync_lwt - caffe2.torch.fb.training_toolkit.backend.tests.test_model_materializer_full_sync_lwt.ModelMaterializerFullSyncLwtTest: test_materialization_determinism_cpu' --run-disabled
```
Differential Revision: D28891351
fbshipit-source-id: 2a3a00cde935d670fb1dc7fd8c709ae9c2ad8cdc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59287
D27211605 added a warning in `toIValue` that warns users to script their
dictionaries before passing them to TorchScript functions in order to get some
performance benefits and reference semantics. However, this warning is emitted
every time `toIValue` is called (e.g. when a dictionary is passed to
TorchScript function), which can lead to noisy log output. This diff changes
this changes to use `TORCH_WARN_ONCE` instead.
Test Plan: Sandcastle, OSS CI.
Reviewed By: hyuen
Differential Revision: D28824468
fbshipit-source-id: e651eade4380abaf77c6c8a81ec4e565b0c2c714
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56966
This PR adds a toggle to shape analysis which won't inline complete tensor shapes as constants into the shape compute graph, which is a good stress test on the partial evaluation pipeline.
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D28444664
Pulled By: eellison
fbshipit-source-id: a62e424515a8837a4b596546efa93af5e8e61f10
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52659
**Summary**
This commit adds `torch._C.ScriptDict`, a dictionary type that has reference
semantics across the Python/TorchScript boundary. That is, modifications
made to instances of `torch._C.ScriptDict` in TorchScript are visible in
Python even when it is not returned from the function. Instances can be
constructed by passing an instance of a Python dictionary to
`torch.jit.script`. In the case of an empty dictionary, its type is
assumed to be `Dict[str, Tensor]` to be consistent with the handling of
empty dictionaries in TorchScript source code.
`torch._C.ScriptDict` is implemented using a modified version of pybind's `stl_bind.h`-style bindings attached to `ScriptDict`, `ScriptDictIterator` and `ScriptDictKeyIterator`, wrapper classes around `c10::impl::GenericDict` and `c10::impl::GenericDict::iterator`. These bindings allow instances of `torch._C.ScriptDict` to be used as if it were a regular `dict` Python. Reference semantics are achieved by simply retrieving the `IValue` contained in `ScriptDict` in `toIValue` (invoked when converting Python arguments to `IValues` before calling TorchScript code).
**Test Plan**
This commit adds `TestScriptDict` to `test_list_dict.py`, a set of tests
that check that all of the common dictionary operations are supported
and that instances have reference semantics across the
Python/TorchScript boundary.
Differential Revision:
D27211605
D27211605
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Pulled By: SplitInfinity
fbshipit-source-id: 446d4e5328375791aa73eb9e8b04dfe3465af960
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58300
Current state: During graph rewriting that can fuse nodes or add nodes
result in new nodes without debug information that was available in
original node. Thus we lose this information during graph rewrite.
This PR changes graph rewriting API to let user specify how the values
in the replacement pattern map to values in the pattern to be matched.
Then the graph rewriting will copy source range and inlined callstack
from the matched nodes onto the nodes being inserted.
(Note: this ignores all push blocking failures!)
Test Plan:
python test/test_jit.py
TestJit.test_pattern_based_rewrite_with_source_range_preserved
Imported from OSS
Reviewed By: malfet
Differential Revision: D28512465
fbshipit-source-id: 863173c29de726be85b3acbd3ddf3257eea36d13
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58887
There are some callsites of `torch.distributed.rpc.XXX` APIs that are compiled
or not based on `USE_RPC`. However, `torch::deploy`, at least for now,
is compiled with `USE_RPC=1`, but the `torch.distributed.rpc.XXX` APIs used by
the aforementioned pieces of code are not available (i.e.
`torch.distributed.rpc.is_available()` returns `False`). This can cause
Torchscript compilation to fail, even if the code being compiled doesn't use
RPC.
This commit fixes this problem (at least temporarily) by predicating the use
all thse `torch.distributed.rpc` APIs on the value of
`torch.distributed.rpc.is_available()`.
Test Plan: Ran packaged XLM-R model with C++ benchmark.
Reviewed By: suo
Differential Revision: D28660925
fbshipit-source-id: fbff7c7ef9596549105e79f702987a53b04ba6f9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57501
Add an api _get_model_ops_and_info to get root operators and versioning info of a model in both cxx and python, and the input can be from a file path or buffer.
ghstack-source-id: 129620112
Test Plan: unit test.
Reviewed By: xcheng16, raziel
Differential Revision: D28162765
fbshipit-source-id: 4413c1e906b8a872e4a717d849da37347adbbea4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55926
This is necessary for code like conv2d where we wish to share a generic convolution shape function logic with that of conv2d but for conv2d always infer the output is dimension 4. I'm also hoping the refinement algorithm here could be refactored out and used to support refining tensor types from user annotations. i have a length comment explaining how this works, and the logic outside of data structures is pretty small and contained. Additionally, you might check out https://fb.quip.com/X7EVAdQ99Zzm for a very similar description of how to refine values based on comparison operators.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D27750997
Pulled By: eellison
fbshipit-source-id: d962415af519ac37ebc9de88f2e1ea60a1374f7c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55925
This sets up the initial handling of symbolic shapes. As in the test, it doesn't work perfectly yet because it needs a couple other optimization passes. The basic description is pretty simple: we resolve tensor dimension indices to the same Value *, and before extracting out the output Tensor shape we substitute in symbolic shapes. We don't substitute during optimization because they are represented as negative numbers so we don't want them inadvertently used in Constant prop or something else.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D27750996
Pulled By: eellison
fbshipit-source-id: 6984e7276b578f96b00fc2025cef0e13f594b6e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54809
I'm going to post on dev-discuss soon with a more thorough explanation of the design and advantages of this shape analysis, so I'm leaving out that for now.
There is still a ton left to do, I'm posting this initial version so we can get something on master multiple can work on. List of many remaining steps to do:
- [ ] Add symbolic shapes support
- [ ] Bind shape functions for operators in C++
- [ ] Make classes of operators share the same shape function (e.g. pointwise, broadcast two inputs)
- [ ] Refactor APIs
- [ ] Only iteratively optimize shape function while a change has been made
- [ ] Expand coverage of coverage to common ops
- [ ] Add shape analysis pass on Graph that handles Ifs and Loops
- [ ] Allow concurrent reads to the operator map
- [ ] Successive applications of same inputs to same shape function (e.g. series of pointwise ops)
For this review, I am mostly looking for comments related to the implementation of symolic_shape_analysis.cpp, with the caveats listed above. I am not really looking for comments related to api/registration/graph level analysis as those are all planned to be changed. I am fine landing this as is or waiting until necessary components of the TODOs above are finished.
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D27750998
Pulled By: eellison
fbshipit-source-id: 4338b99e8651df076291c6b781c0e36a1bcbec03
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57397
Introduces two main classes in C++ runtime:
ScriptProfile is the implementation for enalbing and disabling interpreter
profiling in C++. This should be only used from Python, and we will add
corresponding Python API in the next diff.
InstructionSpan is a utility class to instrument execution of each single
instruction. A start timestamp is recorded in the consturctor, and an end
timestamp is recorded in the destructor. During destruction, this will send
runtime data to all enabled ScriptProfile instances.
Test Plan:
build/bin/test_jit --gtest_filter='ScriptProfileTest.Basic'
Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D28133579
fbshipit-source-id: e7e30e96151367022793ab3ad323f01c51ad4a3b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58382
Calling markCompleted on a Future now first acquires the Future's mutex (as usual) but then sometimes tries to acquire the GIL during the DataPtr extraction while still holding the Future's mutex. (This happens when the value passed to markCompleted is a Python object). This can cause a deadlock if someone else calls any of the other methods of Future while holding the GIL.
There are two solutions to this: avoid holding the Future's mutex when extracting DataPtrs, and avoid holding the GIL while invoking the Future's method. In this PR I'm going for the latter, because it's a very simple immediate fix, but I believe this is brittle and that we should probably also consider the former fix.
ghstack-source-id: 129105358
Test Plan: The repro in https://github.com/pytorch/pytorch/issues/58239 now doesn't deadlock.
Reviewed By: mrshenli
Differential Revision: D28472816
fbshipit-source-id: 1bc9bca426dd004f9eb2568db1ffd38f014450e2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56152
Currently, the Bundled Inputs API mutates the module in-place. It adds class methods and not instance methods. This results in a small problem that one can't re-run an already executed cell in Bento if the class has already been subject to bundled inputs.
In addition, there is no way to add bundled inputs to a module that has bundled inputs added already. This API provides a way to solve this problem as well by adding an `ignored_methods` to the call to `clone()` by allowing the implementation of bundled inputs to pass in the methods that it will add as `ignored_methods` so that when it does try to add those methods, it will be able to do so successfully.
We'll have to be careful when ignoring those methods during the call to `torch.jit._clone_module_with_class` since any bundled input that relies on a user-provided method will need to be preserved and not ignored during the clone.
Looking for feedback on whether this is an acceptable direction.
ghstack-source-id: 128908360
Test Plan:
Added unit test and ran it as `buck test //caffe2/test:mobile`
Also see this Bento Notebook: https://www.internalfb.com/intern/anp/view/?id=550829
Reviewed By: gmagogsfm
Differential Revision: D27788394
fbshipit-source-id: 48109cd4583506d4efdb345e4ba31385db23a273
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57079
Testing onnx 1.9 release, we see that the old bug is triggered for the caffe2 test:
`pytest test/onnx/test_pytorch_onnx_caffe2_quantized.py::TestQuantizedOps::test_small_model`
This is because the graph inputs
```python
graph(%x.1 : Tensor,
%conv1._packed_params : __torch__.torch.classes.quantized.Conv2dPackedParamsBase,
%conv2._packed_params : __torch__.torch.classes.quantized.Conv2dPackedParamsBase,
%fc.bias : Float(10, strides=[1], requires_grad=0, device=cpu),
%fc.weight : Float(10, 72, strides=[72, 1], requires_grad=0, device=cpu)):
```
contains `Conv2dPackedParamsBase` which is a PackedParams.
When we do flatten, we will flatten to several tensors, then the shape inference for input misaligned.
This PR record how may tensors got flattened in PackeParams, and skip by these number rather than 1, then the UT passed.
Note that tuple case should still follow the original logic.
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision: D28393949
Pulled By: malfet
fbshipit-source-id: 98d48aad27e5ca03fb10d260f8e625478d996ee2
Co-authored-by: David <jiafa@microsoft.com>
Summary:
Add an api to backport a model vn to model vi. It accept an input model (file or buffer) and output a model (file or buffer) with an expected bytecode version.
In this change, the input is a model and it can come from a file or buffer. The output is a model and can be either file path or buffer.
When backport fails, function return false with a warning message :
```
/Users/chenlai/pytorch/cmake-build-debug/bin/test_jit --gtest_filter=LiteInterpreterTest.BackPortByteCodeModelV4:LiteInterpreterTest/*.BackPortByteCodeModelV4:*/LiteInterpreterTest.BackPortByteCodeModelV4/*:*/LiteInterpreterTest/*.BackPortByteCodeModelV4 --gtest_color=no
Testing started at 2:32 PM ...
CUDA not available. Disabling CUDA and MultiCUDA tests
[W backport.cpp:419] Warning: Backport doesn't support backport to version3 (function _backport_for_mobile_impl)
Process finished with exit code 0
```
## Test
1. Run both `caffe2/test/cpp/jit/test_lite_interpreter.cpp` and `caffe2/test/mobile/test_bytecode.py`.
2. Run all prod models with backport api.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56802
ghstack-source-id: 128425510
Test Plan: CI
Reviewed By: raziel, iseeyuan
Differential Revision: D27844651
fbshipit-source-id: 8a803cf6c76433ee0a3049b1a5570585d569f8d6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57635
Note: this PR looks massive, but it's just one simple change, codemodded many times.
In many cases, a callback needs to access the value/error produced by the parent future. In Python this was easy because the callback was invoked with the parent future as argument, and could thus inspect it. In C++ the callbacks didn't take any arguments, thus in many cases we worked around this by capturing the future in its own callback. This is risky (leads to reference cycle and thus memory leak) and must be done carefully (spoiler: sometimes we weren't).
ghstack-source-id: 128296580
Test Plan: CI
Reviewed By: wanchaol
Differential Revision: D28178783
fbshipit-source-id: 6de02c4568be42123372edc008f630d5ddae0081
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56918
Re-importing a Python module each time is a bit expensive, and it's unnecessary because this is a private module which won't change and thus we can cache the value once we first extract it.
ghstack-source-id: 128184666
Test Plan: CI
Reviewed By: mrshenli
Differential Revision: D27985910
fbshipit-source-id: be40ae9b67ab8ea6c07bc2cb9a78d2c2c30b35d3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57688
P412982836 says that `torch::jit::toIValue()` will also touch GIL through `torch::jit::createGenericDict()` (P412848640)
So we have to move `torch::jit::toIValue()` out of multithreading execution
Reviewed By: hyuen
Differential Revision: D28236527
fbshipit-source-id: 43a33dbcfc828cc42c5e1230c8f5cb415bf7bde4
Summary:
Add an api `_get_bytecode_version` to get version number given a bytecode model in both cxx and python, and the input can be both from file path and buffer.
## Test
CI (new added unit test will run as part of `pytorch_core-buck`)
1. run test_lite_interpreter.cpp
2. `python test/mobile/test_bytecode.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56801
ghstack-source-id: 128169647
Test Plan:
CI (new added unit test will run as part of `pytorch_core-buck`)
1. run test_lite_interpreter.cpp
2. `python test/mobile/test_bytecode.py`
Reviewed By: iseeyuan
Differential Revision: D27961417
fbshipit-source-id: f786cc9573d855feecff0b4fe8e5363e25f5728c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57292
In Future (and soon in other places too) we need to receive a list of devices from Python-land. We don't want to just take their indices because we need full devices in order to infer the type from them. torch.device is not defined through pybind, it's defined through a plain `PyModule_AddObject` call with CPython, thus pybind isn't naturally able to understand and convert it. However we can provide a custom type caster which fixes that. We have this already for at::Tensor, at::Generator, ...
ghstack-source-id: 127916268
Test Plan: CI
Reviewed By: mrshenli
Differential Revision: D28092732
fbshipit-source-id: 1c31d0b85a4d5c9e7bde8161efbb7574d505157c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56807
If I understand correctly, there's no reason to create your own instance of these global singleton types.
ghstack-source-id: 127312270
Test Plan: CI
Reviewed By: SplitInfinity
Differential Revision: D27973447
fbshipit-source-id: f12df69d185f1baaa45f2ac6eac70570a7a65912
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57052
This PR caps a stack whose goal was to merge CUDAFuture into ivalue::Future. CUDAFuture used to be a subclass of ivalue::Future, which was already pretty good, but it meant that in several places we needed `#ifdef`s or registries in order to create the right type of class, which was annoying. We've made CUDAFuture device-agnostic, by using generic helpers, so that it doesn't depend on CUDA. Now all its code can be inserted into ivalue::Future.
This PR does this very naively, by copy-pasting CUDAFuture's code into the (previously empty) virtual methods of ivalue::Future. This helps ensure the correctness of this PR, as it's straightforward to see it behaves exactly like before. However we probably want to polish it a bit later to iron out so wrinkles.
ghstack-source-id: 127713138
(Note: this ignores all push blocking failures!)
Test Plan: CI
Reviewed By: mrshenli
Differential Revision: D28036829
fbshipit-source-id: 3e5b16402f5dc245c1fcb9d7bf06db64dcb0d2a3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57051
Make CUDAFuture autodetect the devicetype from its arguments (which thus change from DeviceIndices to full Devices). This in fact transforms CUDAFuture into a AnythingFuture, since it's not tied to CUDA in any way anymore. Having made it fully device-agnostic, we'll merge it into ivalue::Future in the next PR.
ghstack-source-id: 127713134
(Note: this ignores all push blocking failures!)
Test Plan: CI
Reviewed By: mrshenli
Differential Revision: D28032711
fbshipit-source-id: 8ba23b1b0d97f61db8693cd5f3c7bae7989a9bcd
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os
def get_compiled_files_list():
import json
with open("build/compile_commands.json") as f:
data = json.load(f)
files = [os.path.relpath(node['file']) for node in data]
for idx, fname in enumerate(files):
if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
return files
def run_clang_tidy(fname):
check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
changes = check_output(["git", "ls-files", "-m"])
if len(changes) == 0:
return
check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])
def main():
git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
compiled_files = get_compiled_files_list()
for idx, fname in enumerate(git_files):
if fname not in compiled_files:
continue
if fname.startswith("caffe2/contrib/aten/"):
continue
print(f"[{idx}/{len(git_files)}] Processing {fname}")
run_clang_tidy(fname)
if __name__ == "__main__":
main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57045
Went back and adjusted the previous optimizations to just be applied to every function.
Cleaned up api to match.
ghstack-source-id: 127214412
ghstack-source-id: 127536155
Test Plan: unit test
Reviewed By: kimishpatel
Differential Revision: D27950859
fbshipit-source-id: 214e83d5a19b452747fe223615815c10fa4aee58
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56717
The signal_handler was under the caffe2 namespacee but was being used
by PyTorch as well.
I've fixed this my moving it to the c10 namespace where now both C2 and PyTorch
can use it.
The signal_handler interface in caffe2/utils/signal_handler.h is kept the same
for backward compatiblity for C2, but most of the commmon code is moved to c10.
ghstack-source-id: 127446929
Test Plan: waitforbuildbot
Reviewed By: ezyang
Differential Revision: D27946738
fbshipit-source-id: d6228d1a0108f4c807d405e7a0bb799c5375388f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56516
One problem with CUDAFuture's extraction of DataPtrs from IValues is that it only supported Python objects that could be converted to "regular" IValues (e.g., lists/dicts/tuples of ints/strings/tensors/...). One notable exception are custom Python classes, which are in fact a very common data type transferred over RPC. The only solution we found for those is to use the Python pickler to extract the tensors contained in them.
We can't insert a Python dependency directly into CUDAFuture, so instead I'm proposing to use the same indirection technique used to support `getSubValues` on Python objects: define some methods on the abstract class `PyObjectHolder` (which can be used by CUDAFuture) but only implement them in the concrete subclass `ConcretePyObjectHolder` (which is only built when Python support is enabled).
I am a bit worried about the performance toll of this (pickling isn't exactly known to be cheap) but I think we should start by providing a functionally complete API. We already have ideas on how to make this faster if needed, for example by having users provide a custom DataPtr extractor tailored to their class via a decorator. (Or just use TorchScript).
ghstack-source-id: 127295014
Test Plan: Added a test later in the stack
Reviewed By: mrshenli
Differential Revision: D27887189
fbshipit-source-id: 9d27e4e62390b836e5bb4f06f401cc002f0cf95b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56517
Currently a torch.futures.Future could wrap a CUDAFuture, but it could not create one from scratch. This prevented users from using CUDAFutures in some occasions, for example when using `rpc.functions.async_execution`, or in their own code. I don't see any reason for such a limitation, hence here I add support for this.
ghstack-source-id: 127261554
Test Plan: Added a test later in the stack
Reviewed By: mrshenli
Differential Revision: D27887190
fbshipit-source-id: ecbb39c1ad7cd189d478ded9c361448f05a270ad
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56172
Enable the standardOps include **Add\Sub\Mul\Div\Gemm\Pow\Mod** with low precision input in ORT
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D27866136
Pulled By: SplitInfinity
fbshipit-source-id: f2cf5649fffefd68c0cc7b6dce94198751636727
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56120
This reverts commit ad17fadbfc (D27786457).
The big annoyance here is that depending on the threading mode you may not be
able to toggle num_threads at will, so the fusion tests won't fail.
I hate this solution, but I'm adding a secondary override for the TE fuser.
Now you need to both turn on fusion (_jit_override_can_fuse_on_cpu), and you're
OK if you're running with 1 thread, or you can add
`_jit_set_texpr_parallel_cpu_enabled` to enable it anyways.
This is (a) mainly for tests, since a real user probably won't fiddle aimlessly
with the thread count, and (b) will go away once NNC's threading support is
fully baked.
Test Plan: Imported from OSS
Reviewed By: Krovatkin
Differential Revision: D27788199
Pulled By: bertmaher
fbshipit-source-id: 070d04474f15e9689dbdf8cc1fde43050c6506b1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55799
I'm going to change the implementation of cdata soon so I need to
abstract over cdata access with a function. Additionally, many
users are casting manually casting to THPVariable to access
the member so I can remove these unsafe casts in the client code
(the implementation, of course, is still doing an unsafe cast.)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D27712130
Pulled By: ezyang
fbshipit-source-id: 95fcc013bf3913d67f2c634068eb5b3aab144cb3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54619
Minor refactor to conv batchnorm folding to work on other functions besides forward
ghstack-source-id: 125767010
Test Plan: unit test and {P339453712}
Reviewed By: kimishpatel
Differential Revision: D27301452
fbshipit-source-id: 4e0cc544a171a970583979a496b2908935124497
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55124
**Summary**
This commit modifies type inference (used by the module scripting code)
so that it tries to script the type of any class instances that it
encounters. This enables recursive, automatic scripting of class type
module attributes.
**Test Plan**
This commit adds a test case for this to `TestClassType`.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D23971883
Pulled By: SplitInfinity
fbshipit-source-id: 7a5a2e7c12ee68cbdeb0a07e6aaf98734a79cb06
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54476
Per title. For `add_done_callback`, we log but swallow exceptions in order to keep consistent with what concurrent.futures python library does, see discussion in https://github.com/pytorch/pytorch/pull/45675.
Although, it would be good to improve the verbosity here as this can be a source of confusion if users are setting a different future via `add_done_callback`, and an error is hit resulting in an unexpected hang (see https://github.com/pytorch/pytorch/issues/52132 for more details on how this can happen).
ghstack-source-id: 125300389
Test Plan: CI
Reviewed By: lw
Differential Revision: D27253004
fbshipit-source-id: 72ed21c8fb6d27de5797c17fc46b762f893e6fea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54915
TorchScript and torch.package have different mangling schemes. To avoid
them interfering with each other, we should undo the torch.package
mangling before processing anything with TorchScript (since TS
independently makes sure that no names collide).
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision: D27410472
Pulled By: suo
fbshipit-source-id: d1cc013c532d9abb7fb9615122bc465ded4785bb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52881
**This PR adds:**
1. logic to parse complex constants (complex literals of the form `bj`)
2. logic to parse complex lists
3. support for complex constructors: `complex(tensor/int/float/bool, tensor/int/float/bool)`
4. Limited operator support
- `add`, `sub`, `mul`, `torch.tensor`, `torch.as_tensor`
**Follow-up work:**
1. Add complex support for unary and other registered ops.
2. support complex constructor with string as input (this is supported in Python eager mode).
3. Test all emitXYZ for all XYZ in `ir_emitter.cpp` (currently only emitConst, emitValueToTensor are tested). e.g., test loops etc.
4. onnx doesn't support complex tensors, so we should error out with a clear and descriptive error message.
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D27245059
Pulled By: anjali411
fbshipit-source-id: af043b5159ae99a9cc8691b5a8401503fa8d6f05
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53314
Introduction of api for optimizing non forward functions for mobile. As of this diff, all functions that you say to optimize will be preserved, and those functions will be run through canonical optimization. The intention is to stack each further optimization onto separate diffs since they touch multiple files, and it seems like it'd be a nightmare to review.
ghstack-source-id: 123909414
Test Plan:
torch.utils.mobile_optimizer.optimize_for_mobile(net, methods_to_optimize=["forward", "foo"]) runs fine
torch.utils.mobile_optimizer.optimize_for_mobile(net, methods_to_optimize={"foo"}) optimizes just foo if the model doesnt define forward otherwise optimizes foo and forward
torch.utils.mobile_optimizer.optimize_for_mobile(net, methods_to_optimize=["forward"]) runs fine
torch.utils.mobile_optimizer.optimize_for_mobile(net) runs fine if the model defines forward, Throws otherwise
Reviewed By: kimishpatel
Differential Revision: D26618689
fbshipit-source-id: 5bff1fb3f3f6085c4a649a8128af9c10f0fa9400
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53306
* [ONNX] Fix for sequence of mutations in blocks (#51577)
Fixes consecutive mutations in a tensor inside blocks.
Also, support append and pop in blocks.
* Support inplace operations + indexing
* Clean up old pass for remove mutations
* Add loop test
* Fixes for set attr in loops
* Removing the new jit API flag
* [ONNX] Redesign onnx pass to enable shape type dependent pattern conversion - cont (#51795)
With the introduction of ONNX shape inference, shape and type are inferred on the fly as operators get converted from ATen to ONNX when running symbolic function. This resolves the shape/type requirement for the symbolic functions. The pre-onnx passes however, can not be supported by shape inference, since at that stage the operators in the graph are still ATen operators.
This PR is to update the design of ONNX pass, to enable a mechanism of capturing subgraphs of ATen operators of certain patterns, and convert them later, when shape/type information of upstream operators are available.
The new design will require pre-onnx passes that need shape/type to be written in two parts, encapsulation and conversion.
The encapsulation part will find the nodes of patterns, like how pre-onnx passes were written previously. But instead of converting the nodes, it will encapsulate them into a sub-block of a new placeholder node. This part is called before onnx pass, so it runs before calling symbolic functions.
The conversion part will be called inside the onnx pass. In onnx pass, run_symbolic_func will be called for each node in topological order. When it reaches the placeholder node, the conversion part will be invoked. It will convert the nodes inside the sub-block based on pattern. By that time, it will have shape/type of upstream operators available. After the conversion is complete, the placeholder node will be removed, and nodes inside its sub-block converted. Run_symbolic_func will be called for these nodes, and they will be converted from ATen operator to ONNX operator.
This PR includes several other fixes, listed below.
* ~~replace helper.cpp with onnx_utils.cpp for holding utility functions.~~
* fix EraseNumberTypes on Bool type, the code was outdated that back then Bool type doesn't exist.
* ~~enable onnx shape inference in export with parameter/initializer data.~~
* other code clean ups.
* fix insertion of identity nodes for loop opset 13 sequence output.
~~PR depends on #51603~~
* Fix after merge
* clang
* Fix clang
* Fix clang
* Fix warning message.
* Fixes for non-model param attributes
* Fix for caffe2
* Additional test
* clang
* Skip test for lower opsets
* fix clang-tidy
* Update init.cpp
* Update remove_inplace_ops_for_onnx.cpp
* Update remove_inplace_ops_for_onnx.cpp
* Update remove_inplace_ops_for_onnx.cpp
* Fix for clang formatting
Test Plan: Imported from OSS
Reviewed By: pbelevich, malfet
Differential Revision: D26922416
Pulled By: SplitInfinity
fbshipit-source-id: e7108620b39b6404c594910786c4d275fee59d84
Co-authored-by: Bowen Bao <bowbao@microsoft.com>
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53304
With the introduction of ONNX shape inference, shape and type are inferred on the fly as operators get converted from ATen to ONNX when running symbolic function. This resolves the shape/type requirement for the symbolic functions. The pre-onnx passes however, can not be supported by shape inference, since at that stage the operators in the graph are still ATen operators.
This PR is to update the design of ONNX pass, to enable a mechanism of capturing subgraphs of ATen operators of certain patterns, and convert them later, when shape/type information of upstream operators are available.
The new design will require pre-onnx passes that need shape/type to be written in two parts, encapsulation and conversion.
The encapsulation part will find the nodes of patterns, like how pre-onnx passes were written previously. But instead of converting the nodes, it will encapsulate them into a sub-block of a new placeholder node. This part is called before onnx pass, so it runs before calling symbolic functions.
The conversion part will be called inside the onnx pass. In onnx pass, run_symbolic_func will be called for each node in topological order. When it reaches the placeholder node, the conversion part will be invoked. It will convert the nodes inside the sub-block based on pattern. By that time, it will have shape/type of upstream operators available. After the conversion is complete, the placeholder node will be removed, and nodes inside its sub-block converted. Run_symbolic_func will be called for these nodes, and they will be converted from ATen operator to ONNX operator.
This PR includes several other fixes, listed below.
* ~~replace helper.cpp with onnx_utils.cpp for holding utility functions.~~
* fix EraseNumberTypes on Bool type, the code was outdated that back then Bool type doesn't exist.
* ~~enable onnx shape inference in export with parameter/initializer data.~~
* other code clean ups.
* fix insertion of identity nodes for loop opset 13 sequence output.
~~PR depends on #51603~~
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision: D26922417
Pulled By: malfet
fbshipit-source-id: 14ed06158d539e2451c2e5e63ba1b32fb0f75095
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53410
**Summary**
This commit enables indexing into `ModuleList` using a non-literal
index if the LHS of the assignment statement of which the indexing is
the RHS is annotated with an interface type.
This feature already exists for `ModuleDict`, and this commit builds on
top of that implementation. A `prim::ModuleContainerIndex` operator is
emitted for any statement of the form `lhs: InterfaceType =
module_container[idx]`. The same operator has to be used for both
`ModuleDict` and `ModuleList` because serialization does not preserve
the metadata that indicates whether a `Module` is a `ModuleDict` or
`ModuleList`.
**Testing**
This commit extends the existing unit tests for non-literal `ModuleDict`
indexing to test non-literal `ModuleList` indexing.
**Fixes**
This commit fixes#47496.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D26857597
Pulled By: SplitInfinity
fbshipit-source-id: d56678700a264d79aae3de37ad6b08b080175f7c
Summary:
The code uses `torch::jit::jit_log_prefix` for handling recursive
indenting in most places in this function. There was one place that was
using "level", but it was buggy -- it would result in a compounding
superlinear indent. Note that changing it to "level+1" doesn't fix the
bug.
Before/after:
https://gist.github.com/silvasean/8ee3ef115a48de6c9c54fbc40838d8d7
The new code establishes a recursive invariant for
`Module::dump_to_str`: the function returns the module printed at the
base indent level (i.e. no indent). `torch::jit:log_prefix` is used
to prefix recursive calls. The code was already nearly there, except for
this spurious use of "level".
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52539
Reviewed By: navahgar
Differential Revision: D26773657
Pulled By: gmagogsfm
fbshipit-source-id: ab476f0738bf07de9f40d168dd038dbf62a9a79e
Summary:
This PR adds an implementation for `aten::cat` in NNC without any conditionals. This version is not enabled by default.
Here is the performance of some micro benchmarks with and without conditionals. There is up to 50% improvement in performance without conditionals for some of the shapes.
aten::cat implementation in NNC **with** conditionals
```
$ python -m benchmarks.tensorexpr --device cpu --mode fwd --jit_mode trace --cpu_fusion concat
pt: concat2d2input_fwd_cpu_1_160_1_14_1: 5.44 us, SOL 0.26 GB/s, algorithmic 0.51 GB/s
pt: concat2d2input_fwd_cpu_1_580_1_174_1: 5.75 us, SOL 1.05 GB/s, algorithmic 2.10 GB/s
pt: concat2d2input_fwd_cpu_20_160_20_14_1: 6.87 us, SOL 4.05 GB/s, algorithmic 8.11 GB/s
pt: concat2d2input_fwd_cpu_20_580_20_174_1: 14.52 us, SOL 8.31 GB/s, algorithmic 16.62 GB/s
pt: concat2d2input_fwd_cpu_8_512_8_512_1: 9.58 us, SOL 6.84 GB/s, algorithmic 13.68 GB/s
```
aten::cat implementation in NNC **without** conditionals
```
$ python -m benchmarks.tensorexpr --device cpu --mode fwd --jit_mode trace --cpu_fusion --cat_wo_conditionals concat
pt: concat2d2input_fwd_cpu_1_160_1_14_1: 4.67 us, SOL 0.30 GB/s, algorithmic 0.60 GB/s
pt: concat2d2input_fwd_cpu_1_580_1_174_1: 5.65 us, SOL 1.07 GB/s, algorithmic 2.14 GB/s
pt: concat2d2input_fwd_cpu_20_160_20_14_1: 6.10 us, SOL 4.56 GB/s, algorithmic 9.12 GB/s
pt: concat2d2input_fwd_cpu_20_580_20_174_1: 7.44 us, SOL 16.22 GB/s, algorithmic 32.44 GB/s
pt: concat2d2input_fwd_cpu_8_512_8_512_1: 6.46 us, SOL 10.14 GB/s, algorithmic 20.29 GB/s
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53128
Reviewed By: bertmaher
Differential Revision: D26758613
Pulled By: navahgar
fbshipit-source-id: 00f56b7da630b42bc6e7ddd4444bae0cf3a5780a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51564
Constructor logic was spread throughout InferenceModule and StaticRuntime. This diff unifies the two. After a lot of discussion on this diff D25961626 it became apparent that `clone` is uglier than a cheap StaticRuntime.
This means StaticRuntime is effectively StaticModule and the only code in the new StaticRuntime is the `run` functions.
```
graph, schema = PrepareForStaticModule(torchscript_module)
sm = StaticModule(graph, schema, options)
sm(inputs)
// or create many cheap runtimes with the module
sr = StaticRuntime(sm)
sr(inputs)
```
Changelist:
- Rename InferenceModule StaticModule
- Move all logic for construction into StaticModule
- Create a new StaticRuntime that only has a unique memory planner (everything else is in StaticModule)
- Update comments with explanation
- Propagate all changes to predictor integration
- Propagate all changes to python integration
- Change semantics to be a bit more PyTorch-standard (no "run" calls, no "get_" getters).
Test Plan:
buck test //caffe2/test:static_runtime
buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest
Reviewed By: hlu1
Differential Revision: D25592967
fbshipit-source-id: 8233bed03137ce129137af2d44bce0095033ef0f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50670
This PR adds property support to Torchbind. There are two cases that it needs to work:
**Torchscript**
Inside Torchscript, we don't go through pybind so there is no issue with accessing properties through ClassType.
**Eager Mode**
In Eager Mode, Torchbind creates ScriptObject which we cannot dynamically add (aka access) properties after initializing it. (https://stackoverflow.com/questions/1325673/how-to-add-property-to-a-class-dynamically
) Therefore we created a Python wrapper (ScriptObjectWrapper) around ScriptObject where we can use property method to set properties. By doing so, we can look up wrapped object's property through __getattr__ method of the ScriptObjectWrapper. This logic is inspired from https://github.com/pytorch/pytorch/pull/44324
Test Plan:
test cases in test_torchbind.py
Imported from OSS
Reviewed By: pbelevich
Differential Revision: D26632781
fbshipit-source-id: dd690887cfda0c48ff0d104aa240ce0ab09055bc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51483
This PR moves the conv weights of a frozen model to MKLDNN, and AOT reorders the weights. When the weights are already in MKLDNN, just computing a single conv by converting the input and output from/to mkldnn provides large speedups. I benchmark'd the results of the top 200 shapes in predictor [here](https://www.internalfb.com/phabricator/paste/view/P171537938), as well as verified that it sped up popular models in torchvision.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D26696703
Pulled By: eellison
fbshipit-source-id: 0b4441bee4f6e0890a4540fbca3bb5e58b8c5adf
Summary:
This is a second attempt to use graph executor to run forward on a gradient. This allows a secondary chance to profile intermediate tensor introduced by autodiff.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52136
Reviewed By: pbelevich
Differential Revision: D26693978
Pulled By: Krovatkin
fbshipit-source-id: 91dde8009a210950af8e5173668ada241e16dd52
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52910
**Summary**
PR #52158 tried to move all JIT bindings from `torch._C` to a new
submodule `torch._C._jit`, but that...did not go well. This pull request
adds the new `torch._C._jit` submodule, but does not migrate the
existing bindings. Instead, it adds a unit test that fails if any new
bindings are added to `torch._C`. A comment in the test instructs
developers to add their new binding to the allowlist if it really should
be in `torch._C`, or to add it to the appropriate submodule (e.g
`torch._C._jit`, for example). The idea is to prevent the issue
described in #51691 from getting *worse* if it cannot be fixed.
**Test Plan**
Continuous integration.
**Fixes**
This commit fixes#51691.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D26698373
Pulled By: SplitInfinity
fbshipit-source-id: ec9f5426051227a513d4fd09512b624420e0100b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51754
This API allows you to manage multiple python interpreters in a single
process to deploy PyTorch models packaged with torch.package.
torch/csrc/deploy/deploy.h contains the API definition
torch/csrc/deploy/test_deploy.cpp has some examples.
Notes:
* mutex is added to PyTorchStreamReader to make it safe to use from multiple threads at once.
* USE_DEPLOY is only true for the special libtorch_deployinterpreter.so library, when enabled
we use a hash table to maintain PyObject <> at::Tensor mappping rather than the internal pointer
in Tensor since >1 interpreter may have a reference to the tensor.
* serialization.py has some additional functions for creating pickle objects
but keeping storages in memory for use transfering tensors between interpreters
Test Plan: Imported from OSS
Reviewed By: wconstab
Differential Revision: D26329468
Pulled By: zdevito
fbshipit-source-id: d75f4ebb9a27f1d911179d9996041bcb3ca04a07
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51177
**Summary**
This commit adds support for static methods to TorchBind. Just like
pybind, the API for declaring a static method is `def_static(...)`. A
static method must be called on the class directly, and can be called
both in Python as well as TorchScript.
Support for static methods is implemented in a manner similar to that of
instance methods. Registered static functions are wrapped in a layer of
unboxing logic, their schemas are inferred using templates and
metaprogramming, and they are added to the `ClassType` object
corresponding to the TorchBind class on which they are registered.
ScriptClass has been extended to support a `__getattr__` function so
that static methods of TorchBind classes can be invoked in Python. The
implementation of `__getattr__` returns `ScriptClassFunctionPtr`, a
version of `StrongFunctionPtr` without a compilation unit (since the
functions of a TorchBind class live inside the TorchBind registry).
Within TorchScript, TorchBind static functions are desugared in
`PythonClassValue::attr` by looking them up on the class type of the
`PythonClassValue` instance.
**Test Plan**
This commit adds a unit test that tests a simple static method on a
TorchBind class.
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D26356942
Pulled By: SplitInfinity
fbshipit-source-id: 1b6a9bc2e5f3e22071ad78e331a0201fbbf7ab30
Summary:
Previously `torch.jit.trace` relies on AutoGrad hooks to infer name of tensors in computation, including those of function/method arguments. This often doesn't work out because:
- These names often do not exist
- Tracer uses argument name of first tensor operation on each tensor as inferred argument names. These tensor operations have programmatically-generated names like `argument_1`
This PR extracts argument names directly from Python functions and pass them down to tracer, which then assigns them to correct graph inputs. This way, we always have the correct argument names captured in IR.
This is useful for both debugging and supporting using `InterfaceType` to represent traced modules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51775
Reviewed By: izdeby
Differential Revision: D26273105
Pulled By: gmagogsfm
fbshipit-source-id: 934a385041137dc3731bb6fa8657b11532fed9e5
Summary:
Previously, the graph might have been delete while Python still has iterators, leading to segfaults.
This does not fully work for iterators from Nodes and Blocks as they may be invalidated when the owning graph goes out of scope. I will look into these separately.
Fixes https://github.com/pytorch/pytorch/issues/50454
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51951
Reviewed By: mrshenli
Differential Revision: D26352629
Pulled By: SplitInfinity
fbshipit-source-id: 67299b6cbf1ac7ab77f8703a0ca8f1162e03fcd4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51826
Looks like this:
```
resnet.pt
├── .data # Data folder named so it can't clash with torch.package codemodules.
│ │ # Names/extensions automatically added to avoid namingconflicts.
│ ├── 94286146172688.storage # tensor data
│ ├── 94286146172784.storage
│ ├── extern_modules # torch.package metadata
│ ├── version # version metadata
│ └── ...
├── model # package pickled model created w/
│ │ # exporter.save_pickel('model','model.pkl', resnet_model)
│ └── model.pkl
└── torchvision # all code dependencies for packaged picked
└── models # models are captured as source files
├── resnet.py
└── utils.py
```
Since `version` is hardcoded in our zip reader/writer implementation,
add it as an option that defaults to "version" but accepts other
locations for putting the version metadata.
Test Plan: Imported from OSS
Reviewed By: zdevito
Differential Revision: D26295649
Pulled By: suo
fbshipit-source-id: 2d75feeb7de0f78196b4d0b6e2b814a7d58bd1dd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50983
There is currently no way to handle/propagate errors with the python-based futures API (they are raised correctly if set with an error, but this is only possible from C++).
This diff allows the Future's `unwrap_func` to be set in python optionally, so users can set futures completed with an exception and the error will throw as expected. This is mostly to support the following use case in the next diff:
```
ret_fut = torch.futures.Future(unwrap_func = lambda python_result: {
# throw exception if needed
if isinstance(python_result, Exception):
throw python_result
})
rpc_fut = rpc.rpc_async(...) # RPC future that times out
# Goal is to propagate RPC error to this future
rpc_fut.add_done_callback(
res => {
# Note that ret_fut.set_result(res.wait()) won't propagate the error
try:
ret_fut.set_result(res.wait())
except Exception as e:
ret_fut.set_result(e)
}
)
```
ghstack-source-id: 121021434
Test Plan:
unittest
```
buck test mode/dev-nosan mode/no-gpu //caffe2/test:futures -- te
st_unwrap --print-passing-details
```
Reviewed By: mrshenli
Differential Revision: D25950304
fbshipit-source-id: 7ee61e98fcd783b3f515706fa141d538e6d2174d
Summary:
Previously we might have gotten segfaults and all, now it raises an exception.
Thread safety hasn't been an objective.
I have a followup to expand the Python interface for the API.
Fixes https://github.com/pytorch/pytorch/issues/49969.
wanchaol
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50326
Reviewed By: pbelevich
Differential Revision: D26096234
Pulled By: gmagogsfm
fbshipit-source-id: 5425772002eb4deb3830ed51eaa3964f22505840
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51340
**Summary**
`toIValue` assumes that any value passed for an argument of type
`torch.device` is a valid device object, even when it is not. This can
lead to device type arguments of functions being assigned incorrect
values (see #51098).
This commit adds an explicit check that the passed in object is indeed a
`torch.device` using `THPDevice_Check` and only then does is it
converted to an `IValue`. Since implicit conversion from strings to
devices is generally allowed, if `THPDevice_Check` fails, it is assumed
that the object is a string and an `IValue` containing a `c10::Device`
containing the passed in string is returned.
**Test Plan**
This commit adds a unit test to `test_jit.py` to test that invalid
strings passed as devices are not longer silently accepted.
**Fixes**
This commit fixes#51098.
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D26187190
Pulled By: SplitInfinity
fbshipit-source-id: 48c990203431da30f9f09381cbec8218d763325b