Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74927
The move ctor was broken because `BlockRunner` stores a reference to `values_`. When moving runtime instances, the pointer to the root block would be moved, but the reference inside it would not be updated.
Pass `BlockRunner` a raw pointer to the heap-allocated IValues instead to avoid this issue.
ghstack-source-id: 153168602
Test Plan: New unit test/CI
Reviewed By: navahgar
Differential Revision: D35228467
fbshipit-source-id: 04e198b39f898b82677a0e41e1cdf00c2b0c09f3
(cherry picked from commit 03e2c591ac3a907d68025eae9500ed7226dec17e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69835
`StaticRuntimeBlockRunner` moves its outputs to the return value at the end of `run_impl`. However, there's a corner case where this can cause problems. If we return a constant, then the only reference in the `constants_` array can be destroyed by this move. We could add special logic to handle this in `run_impl`. But since this is a relatively rare corner case, it's simpler to just add an op that does nothing but create an owned reference to its input. This owned reference can be safely moved out of `StaticRuntimeBlockRunner`.
Note that this also applies to returned values in sub-blocks that are from outer scopes.
ghstack-source-id: 148186452
Test Plan:
`buck test caffe2/benchmarks/static_runtime/...`
Added a new unit test with a graph that simply returns a constant.
Tests with sub-blocks at top of stack.
Reviewed By: d1jang
Differential Revision: D33047519
fbshipit-source-id: 22b6058f0d1da8a6d1d61a6f2866bc518bff482b
(cherry picked from commit a8f89a12ee)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69819
We should skip ReplaceWithCopy if the inputs to the operator can be updated during inference. For a set of tensors that share data, ReplaceWithCopy should not happen to any of them if there exists updates to any of them.
Currently, the check in place has missed some cases (suppose there exists updates, and uses <= 1). This diff addresses the missing cases by querying AliasDB.
Test Plan:
- Added test cases, including a one that is problematic before this diff
- CI
Reviewed By: mikeiovine
Differential Revision: D33052562
fbshipit-source-id: 61f87e471805f41d071a28212f2f457e8c6785e7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68795
This change improves static runtime exception safety. Added a scope exit guard that invokes `MemoryPlanner::deallocate` in its destructor.
Caveat: we have to be really careful with the exception behavior of `MemoryPlanner::deallocate` and `MemoryPlanner`'s constructor, because they're now both potentially called in the destructor of the scope exit guard. Letting exceptions potentially escape destructors is playing with fire since 1) the destructor of `Deallocator` is (implicitly) `noexcept`, 2) even if it wasn't, `std::terminate` will be called if an exception escapes and the stack is already unwinding. To get around this, we wrap the deallocation stuff in a try/catch. If deallocation throws, then we simply reset all of the memory planner stuff and carry on.
There's a catch: the code path that we take when handling the deallocation exception can't throw. However, this code path is much simpler than memory planner construction/deallocation, so it's much easier to manually audit the correctness here.
Test Plan:
**New unit tests**
`buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest`
Reviewed By: hlu1
Differential Revision: D32609915
fbshipit-source-id: 71fbe6994fd573ca6b7dd859b2e6fbd7eeabcd9e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67941
I just found out that due to the round up of the Tensor storage sizes to multiples of 64 bytes, resizing is not actually triggered for a lot of our unit tests (23 OSS, 16 internal). Now they should be all fixed. Also moved a bunch of tests to `test_static_module.cc` so that `test_static_runtime.cc` now only contains operator tests.
From now on, by default if `args2` is passed to `test_static_runtime`, at the end of the second iteration, it would check that the managed buffer's size is bigger than the previous size and enforce that. You can bypass the check for ops with constant output sizes, such as `aten::sum` without `dim` passed in.
Test Plan:
Facebook
```
buck test //caffe2/benchmarks/static_runtime:static_runtime_cpptest
buck test //caffe2/benchmarks/static_runtime/fb:test_fb_operators
```
Reviewed By: swolchok
Differential Revision: D32196204
fbshipit-source-id: 8425d9efe6b9a1c1e3807e576b1143efd7561c71
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65741
This op previously assumed `axis == 1`, causing graphs that would otherwise be valid to return incorrect results after fusing.
Reviewed By: hlu1
Differential Revision: D31234944
fbshipit-source-id: 89885a3b119357698ebd9fd429b009813260a2f4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62335
This change ensures that unittests only use out variants or native ops.
- Our unittests currently assume that a graph fed to the static runtime correctly replaces an interpreter op for its corresponding out variant / native op, but it's not checked by the unittest. This change ensures that.
- We relied on manual inspection of log messages to see if an out variant is used for a specific workload even for unittesting. This change frees us from doing that.
- `aten::add` is excluded from this check since it's only enabled for an internal workload. Also some unittests are excluded by using `expect_interpreter_op = true` since they are written to use interpreter ops by design.
Test Plan: Ran `buck run //caffe2/benchmarks/static_runtime:static_runtime_cpptest` successfully.
Reviewed By: mikeiovine, hlu1
Differential Revision: D29952381
fbshipit-source-id: e60e70b80ccf45e91c6654b4ad53f92ffd5ab702
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62064
`testStaticRuntime` was previously only available in `test_static_runtime.cc`. It has been moved to a common library `test_utils` to facilitate code re-use. This also lets us test dynamic shapes in `test_fb_operators`
Reviewed By: hlu1
Differential Revision: D29858928
fbshipit-source-id: 68a94760166ddb745972b0f1fc24bed594937d1c