Commit Graph

31 Commits

Author SHA1 Message Date
cyy
f4dcf2ae93 [1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301
Approved by: https://github.com/ezyang, https://github.com/r-barnes
2024-07-08 07:03:53 +00:00
PyTorch MergeBot
846bb30e13 Revert "[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301)"
This reverts commit bd72e28314.

Reverted https://github.com/pytorch/pytorch/pull/128301 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it fails XLA build bd72e28314. Please rebase your PR before relanding because I think the failure is hidden by an unrelated broken trunk XLA failure from your current base commit ([comment](https://github.com/pytorch/pytorch/pull/128301#issuecomment-2169035822))
2024-06-15 01:58:20 +00:00
cyy
bd72e28314 [1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301
Approved by: https://github.com/ezyang
2024-06-14 23:21:01 +00:00
Richard Barnes
ed327876f5 [codemod] c10:optional -> std::optional (#126135)
Generated by running the following from PyTorch root:
```
find . -regex ".*\.\(cpp\|h\|cu\|hpp\|cc\|cxx\)$" | grep -v "build/" | xargs -n 50 -P 4 perl -pi -e 's/c10::optional/std::optional/'
```

`c10::optional` is just an alias for `std::optional`. This removes usages of that alias in preparation for eliminating it entirely.
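
A minimal sketch of the kind of mechanical rewrite the codemod performs; the stub namespace below is illustrative, not the real c10 header, and assumes `c10::optional` is a pure alias template:
```
#include <optional>

namespace c10 {
template <typename T>
using optional = std::optional<T>; // alias only; no behavioral difference
} // namespace c10

// Before the codemod:
c10::optional<int> parse_before(bool ok) {
  return ok ? c10::optional<int>(42) : c10::optional<int>();
}

// After the codemod (identical behavior, standard spelling):
std::optional<int> parse_after(bool ok) {
  return ok ? std::optional<int>(42) : std::optional<int>();
}
```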

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126135
Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi
2024-05-14 19:35:51 +00:00
Nikita Shulga
8ddc549c0f [BE][JIT] Do not wrap shared_ptr with optional (#115473)
While reviewing https://github.com/pytorch/pytorch/pull/115381 I noticed that `torch::jit::GraphFunction::optimized_graph_` is an `std::array<c10::optional<std::shared_ptr<Graph>>, N>`, which feels excessive as `shared_ptr` is already nullable and has `operator bool()`. Looking at https://github.com/pytorch/pytorch/pull/26488, which introduced the change, also gives no hint that this indirection is necessary.
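
A minimal sketch of the redundancy (member names are illustrative, not the actual GraphFunction fields): a `shared_ptr` already has an empty state, so wrapping it in `optional` adds a second, redundant layer of emptiness to check.
```
#include <array>
#include <cstddef>
#include <memory>
#include <optional>

struct Graph {};

// Before: two layers of "emptiness" that both have to be checked.
std::array<std::optional<std::shared_ptr<Graph>>, 3> cache_before;
bool is_cached_before(std::size_t i) {
  return cache_before[i].has_value() && *cache_before[i] != nullptr;
}

// After: a null shared_ptr alone expresses "not yet computed".
std::array<std::shared_ptr<Graph>, 3> cache_after;
bool is_cached_after(std::size_t i) {
  return static_cast<bool>(cache_after[i]);
}
```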

Test plan: CI

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115473
Approved by: https://github.com/davidberard98, https://github.com/Skylion007
2023-12-09 20:43:40 +00:00
Behrang Javaherian
b3b5bd51ea [raas][torch][jit] Allow not storing the optimized graph (#115381)
Summary:
GraphFunction internally stores the optimized graph after generating it, and the graph is then passed into the executor, which makes its own copy. So we effectively store the optimized graph twice.

This diff adds a flag to not store the optimized graph inside the GraphFunction.

The change is a no-op until the flag is enabled.
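
A hedged sketch of the idea (the flag and member names below are illustrative, not the real GraphFunction implementation): when the flag is cleared, the optimized graph is handed to the executor without also being cached locally, avoiding the second copy.
```
#include <memory>

struct Graph {};

struct Executor {
  std::shared_ptr<Graph> plan_;
  void bind(const std::shared_ptr<Graph>& g) {
    plan_ = std::make_shared<Graph>(*g); // the executor copies the graph it is given
  }
};

struct GraphFunctionSketch {
  bool store_optimized_graph = true; // stands in for the new flag (default keeps old behavior)
  std::shared_ptr<Graph> optimized_; // the local cache that duplicates the executor's copy
  Executor executor_;

  void prepare() {
    auto g = std::make_shared<Graph>(); // pretend this is the freshly optimized graph
    if (store_optimized_graph) {
      optimized_ = g; // old behavior: hold our own copy as well
    }
    executor_.bind(g);
  }
};
```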

Test Plan:
I ran SL with this on raas, with good memory savings on the raas server. From the command line:

example model run
```
buck run mode/opt-clang  sigrid/predictor/client/localnet:run_model -- --model_id_to_load=953556500 --model_snapshot_to_load=362

I1207 11:04:58.657143 3556226 SigridPredictorLocalModelFactory.cpp:32] Memory usage for 953556500_362 is 255646 Kb
```

then with flag enabled:
```
buck run mode/opt-clang  sigrid/predictor/client/localnet:run_model -- --model_id_to_load=953556500 --model_snapshot_to_load=362 --torch_jit_do_not_store_optimized_graph=true
I1207 11:06:25.245779 3577383 SigridPredictorLocalModelFactory.cpp:32] Memory usage for 953556500_362 is 165167 Kb
```
And combining this flag with the flag from D51950418:
```
buck run mode/opt-clang  sigrid/predictor/client/localnet:run_model -- --model_id_to_load=953556500 --model_snapshot_to_load=362 --torch_jit_do_not_store_optimized_graph=true --torch_jit_enable_profiling_graph_executor=false

I1207 11:09:17.502743 3592345 SigridPredictorLocalModelFactory.cpp:32] Memory usage for 953556500_362 is 114848 Kb
```

Differential Revision: D51931895

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115381
Approved by: https://github.com/malfet
2023-12-08 16:29:13 +00:00
Nikita Shulga
ad8aef0f98 [BE] [3/N] Use nested namespaces (#110314)
Mostly in torch/csrc/jit/runtime and in `ATen/cuda/`
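
For reference, a tiny illustration of the C++17 nested-namespace syntax this cleanup moves to (the namespace and function names below are placeholders):
```
// Pre-C++17 spelling:
namespace torch { namespace jit { namespace detail_old {
inline int answer_old() { return 42; }
}}} // namespace torch::jit::detail_old

// C++17 nested namespace definition, the form adopted by this change:
namespace torch::jit::detail_new {
inline int answer_new() { return 42; }
} // namespace torch::jit::detail_new
```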

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110314
Approved by: https://github.com/seemethere
2023-09-30 02:23:48 +00:00
cyy
e9e93c5350 [Reland] Move torch::make_unique to std::make_unique (#109780)
We can first try to move torch::make_unique to std::make_unique despite the revert of #108866.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109780
Approved by: https://github.com/ezyang
2023-09-21 18:30:21 +00:00
PyTorch MergeBot
525e4f42d0 Revert "replace torch::make_unique with std::make_unique (#108866)"
This reverts commit 03e35efbf7.

Reverted https://github.com/pytorch/pytorch/pull/108866 on behalf of https://github.com/clee2000 due to Sorry but I found more usages of `torch::make_unique` internally, I can go change all of these, but I'd prefer if that gets done before this gets merged ([comment](https://github.com/pytorch/pytorch/pull/108866#issuecomment-1722577925))
2023-09-17 21:57:30 +00:00
cyy
03e35efbf7 replace torch::make_unique with std::make_unique (#108866)
It should be safe to remove the old torch::make_unique functions.
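
An illustrative before/after for the migration; the stub below merely stands in for the old helper, assuming it was a thin forwarding wrapper around the C++14 facility:
```
#include <memory>
#include <utility>

namespace torch {
template <typename T, typename... Args>
std::unique_ptr<T> make_unique(Args&&... args) {
  return std::make_unique<T>(std::forward<Args>(args)...);
}
} // namespace torch

struct Node { int id = 0; };

std::unique_ptr<Node> before() { return torch::make_unique<Node>(); } // old spelling, now removable
std::unique_ptr<Node> after()  { return std::make_unique<Node>(); }   // standard C++14 equivalent
```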

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108866
Approved by: https://github.com/albanD
2023-09-14 20:52:26 +00:00
Aaron Gokaslan
77c2a8a11f Clang-Tidy: Improve ctors by removing unnecessary copies and initializations (#91538)
Apply clang-tidy fixups to prefer member initializers and modernize-pass-by-value. This is mostly a noop, but it should make a few ctors slightly more readable and more efficient. Also drops in some missing moves that prevent a lot of unnecessary copying.
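
A tiny illustration of the two patterns mentioned, on a made-up class (not from the codebase):
```
#include <string>
#include <utility>

// Before: assignment in the constructor body, and a const-ref parameter that
// forces a copy even when the caller passes a temporary.
struct WidgetBefore {
  std::string name;
  int count;
  explicit WidgetBefore(const std::string& n) { name = n; count = 0; }
};

// After: member initializers plus pass-by-value-and-move.
struct WidgetAfter {
  std::string name;
  int count = 0;
  explicit WidgetAfter(std::string n) : name(std::move(n)) {}
};
```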

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91538
Approved by: https://github.com/ezyang
2022-12-31 07:19:30 +00:00
Elias Ellison
05ce0f9be6 Add option to disable autocast pass
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77566

Approved by: https://github.com/anijain2305, https://github.com/davidberard98
2022-05-18 14:57:25 +00:00
Elias Ellison
9c4a63787b Add api for changing function executor settings, hook up execution with decomposition registry (#74186)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74186

Make the execution settings mutable on function_impl so that they can be set for running op decompositions. Add a mapping to function objects and show an example, in a test, of executing op decompositions.

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D34938125

Pulled By: eellison

fbshipit-source-id: adf108b2f6c1bd166910c6d7b94245661d67ce0d
(cherry picked from commit 9957e33803002d9e71abe4ff802769270b6960d3)
2022-03-29 18:38:52 +00:00
Elias Ellison
0ecf1add1b Introduce function-local settings for executor, expose in c++ (#74012)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74012

This allows setting an executor on a function. The first use case is running decompositions in C++ without additional fusion passes, etc., which might not work with custom tensors like batched tensors/vmap. A subsequent use case might be taking advantage of invokers of JIT execution that guard on certain properties before invocation (such as complete shapes in AOT autograd, or rank in lazy tensor).
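
An illustrative sketch only (the enum, struct, and function names are assumptions, not the real PyTorch API): the point is that a per-function setting overrides the process-wide default, e.g. to run decompositions with the simple executor and no fusion passes.
```
#include <optional>

enum class ExecutorMode { Profiling, Legacy, Simple };

// Placeholder for the existing process-wide setting.
inline ExecutorMode& global_executor_mode() {
  static ExecutorMode mode = ExecutorMode::Profiling;
  return mode;
}

struct FunctionExecutorSettings {
  std::optional<ExecutorMode> mode; // unset -> fall back to the global default
};

ExecutorMode effective_mode(const FunctionExecutorSettings& s) {
  return s.mode.value_or(global_executor_mode());
}
```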

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D34938124

Pulled By: eellison

fbshipit-source-id: cf7a45416457942b872322cab47d871a8336bdb5
(cherry picked from commit 9c600eb9ad0f2173f003e511268e97584edae36d)
2022-03-29 18:38:52 +00:00
Elias Ellison
6694fdaccd Clean up profiling mode and profiling executor strategy (#73875)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73875

Previously we had a few settings:
- getExecutor - which toggled between the Profiling Executor and Legacy
- getGraphOptimize - if true, overrides PE/Legacy to run with the simple executor (no optimizations)
and then...
- getProfilingMode - which would set PE to 0 specializations.

The last mode is redundant with getGraphOptimize; we should just remove it and use getGraphOptimize in these cases. It could lead to potentially invalid combinations of logic - what does it mean if getProfilingMode is true but getExecutor is set to false? This would lead to a bug in specialize_autograd_zero in this case, see: https://github.com/pytorch/pytorch/blob/master/torch%2Fcsrc%2Fjit%2Fpasses%2Fspecialize_autogradzero.cpp#L93.
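
A hedged sketch of the decision logic after the cleanup, as described above (getter names and storage are illustrative, not the real JIT API): with getProfilingMode gone, only two switches remain.
```
// Placeholders for the two remaining switches.
inline bool& profiling_executor_enabled() { static bool v = true;  return v; } // the getExecutor toggle
inline bool& run_without_optimizations()  { static bool v = false; return v; } // the getGraphOptimize override

enum class Strategy { Simple, Profiling, Legacy };

Strategy pick_strategy() {
  if (run_without_optimizations()) {
    return Strategy::Simple; // simple executor: no specializations, no optimizations
  }
  return profiling_executor_enabled() ? Strategy::Profiling : Strategy::Legacy;
}
```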

The tests here are failing but get fixed with the PR above it, so I'll squash for landing.

Test Plan: Imported from OSS

Reviewed By: cpuhrsch

Differential Revision: D34938130

Pulled By: eellison

fbshipit-source-id: 1a9c0ae7f6d1cfddc2ed3499a5af611053ae5e1b
(cherry picked from commit cf69ce3d155ba7d334022c42fb2cee54bb088c23)
2022-03-29 18:38:51 +00:00
Lucian Grijincu
4bf1be898d caffe: fix warning: overloaded virtual function "torch::jit::Function::call" is only partially overridden in class "torch::jit::GraphFunction"
Summary:
Need to bring in all signatures

https://www.internalfb.com/code/fbsource/[36035b9e4e41813e215ffd5f4377d65b7259237e]/fbcode/caffe2/aten/src/ATen/core/function.h?lines=91-101

Test Plan:
```
Action Failed for fbcode//accelerators/pytorch/lib/cuda:ngram_repeat_block_cuda (ovr_config//platform/linux:x86_64-fbcode-platform010-clang-6dbc4bb1b9a32829)#5:
cxx_compile ngram_repeat_block_cuda_kernel.cu (pic) failed with non-zero exit code 1
debug information: action_digest=988629a726bc4eabcaf334db2317a969958d5fd2:94
stdout:
stderr:
fbcode/caffe2/torch/csrc/jit/api/function_impl.h(11): warning: overloaded virtual function "torch::jit::Function::call" is only partially overridden in class "torch::jit::GraphFunction"

fbcode/caffe2/torch/csrc/jit/api/function_impl.h(11): warning: overloaded virtual function "torch::jit::Function::call" is only partially overridden in class "torch::jit::GraphFunction"

fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h: In instantiation of 'static torch::jit::Maybe<T> torch::jit::Maybe<T>::create(const torch::jit::SourceRange&, const T&) [with T = torch::jit::List<torch::jit::Property>]':
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:505:117:   required from here
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:220:33: error: cannot convert 'const torch::jit::List<torch::jit::Property>' to 'torch::jit::TreeList&&' {aka 'c10::SmallVector<c10::intrusive_ptr<torch::jit::Tree>, 4>&&'}
  220 |     return Maybe<T>(Compound::create(TK_OPTION, range, {value}));
      |                ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
fbcode/caffe2/torch/csrc/jit/frontend/tree.h:144:1: note:   initializing argument 3 of 'static torch::jit::TreeRef torch::jit::Compound::create(int, const torch::jit::SourceRange&, torch::jit::TreeList&&)'
  143 |       const SourceRange& range_,
      |         ~~~~~~~~~~~~~~~~~~~~~~~~
  144 |       TreeList&& trees_) {
      | ^
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h: In instantiation of 'static torch::jit::Maybe<T> torch::jit::Maybe<T>::create(const torch::jit::SourceRange&, const T&) [with T = torch::jit::List<torch::jit::Assign>]':
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:505:171:   required from here
fbcode/caffe2/torch/csrc/jit/frontend/tree_views.h:220:33: error: cannot convert 'const torch::jit::List<torch::jit::Assign>' to 'torch::jit::TreeList&&' {aka 'c10::SmallVector<c10::intrusive_ptr<torch::jit::Tree>, 4>&&'}
  220 |     return Maybe<T>(Compound::create(TK_OPTION, range, {value}));
      |                ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
fbcode/caffe2/torch/csrc/jit/frontend/tree.h:144:1: note:   initializing argument 3 of 'static torch::jit::TreeRef torch::jit::Compound::create(int, const torch::jit::SourceRange&, torch::jit::TreeList&&)'
  143 |       const SourceRange& range_,
      |         ~~~~~~~~~~~~~~~~~~~~~~~~
  144 |       TreeList&& trees_) {
      | ^
cc1plus: note: unrecognized command-line option '-Wno-ignored-optimization-argument' may have been intended to silence earlier diagnostics
cc1plus: note: unrecognized command-line option '-Wno-ambiguous-reversed-operator' may have been intended to silence earlier diagnostics
cc1plus: note: unrecognized command-line option '-Wno-ignored-optimization-argument' may have been intended to silence earlier diagnostics
cc1plus: note: unrecognized command-line option '-Wno-ambiguous-reversed-operator' may have been intended to silence earlier diagnostics
command: buck-out/v2/gen/fbcode/999b02f9444004c1/tools/build/__wrap_nvcc.py__/wrap_nvcc.py -_NVCC_BIN_ fbcode ...<omitted>... ors/pytorch/lib/cuda/__ngram_repeat_block_cuda__/__objects__/ngram_repeat_block_cuda_kernel.cu.pic.o (rerun with -v to view the untruncated command)
```

Differential Revision: D33579670

fbshipit-source-id: 9acb443732feb3e921ce0fa5f38f21ed44f64114
2022-01-14 20:27:09 -08:00
Han Qi
4eb772fde6 Refactor saving jit::Module to mobile .pt in 2 steps: (#66494)
Summary:
1. Convert Function -> mobile::Function
2. Serialize mobile::Function

This also opens the opportunity to create a mobile::Module without saving/reloading.

Fixes #{issue number}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66494

Reviewed By: zhxchen17

Differential Revision: D32293022

Pulled By: qihqi

fbshipit-source-id: 29b43d47ff86071d5e2f9d6ca4dba4445711ce3d
2021-11-17 12:02:20 -08:00
jjsjann123
0dc3f829d9 Nvfuser code bump 11 5 (#67943)
Summary:
nvfuser code update:
1. Tuning heuristics on schedulers for reduction/normalization kernels;
2. bfloat16 on IO tensor support;
3. Refactored memory format support; we can now support dimension collapsing for non-coherent input tensors with different memory formats, e.g. a channels-last tensor input to batch normalization. Note that we are currently limiting memory formats to Contiguous and Channels Last only;
4. Refactored nvfuser graph partitioning in `graph_fuser.cpp`, separated node merge and profile node API. Updated `profiling_record.cpp`.

Things that are reverted from our local branch:
1. changes on some entries in autodiff
2. aten::gelu with approximation
3. native_dropout(_backward)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67943

Reviewed By: ngimel

Differential Revision: D32288709

Pulled By: dzhulgakov

fbshipit-source-id: fc9491182ea7e0158bc112c66f096823c588eaf1
2021-11-17 01:22:17 -08:00
David Berard
5cfca5524c [JIT] clear GraphFunction.optimized_graphs_ after freezing a module (#68316)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68316

Consider the following:
```
class Mod(nn.Module):
    def __init__(self, val):
        super().__init__()
        self.param = nn.Parameter(val)

    def forward(self, x):
        # this method will change during freezing
        return x + self.param

    @torch.jit.export
    def make_prediction(self, x):
        y = x + x
        return self.forward(y)

param = torch.rand([2, 2])

unscripted_mod = Mod(param)
mod = torch.jit.script(unscripted_mod)
mod.eval()
mod = torch.jit.freeze(mod, preserved_attrs=["make_prediction"])
```

During freezing the following will occur:
1. do some pre-freezing, including inlining; in particular, forward will be inlined into make_prediction. During inlining, forward.optimized_graph() is called, and the result is cached
2. freeze some methods. While freezing forward, the graph associated with the function will get updated. The cached optimized_graphs_ are not updated.

Previously, a call to `mod.forward(x)` would return an executor that would run on the old cached optimized_graph(). This would mean that the freezing optimizations would not apply, and potentially that the execution would fail because of parameters removed from the module.

This change clears the optimized_graphs_ cache after running freezing to prevent executing an old version of the graph.
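
A minimal sketch of the caching hazard described above (types heavily simplified; not the real GraphFunction): if the cached optimized graph is not dropped after freezing rewrites the underlying graph, later calls run the stale copy.
```
#include <memory>
#include <optional>

struct Graph { int version = 0; };

struct GraphFunctionSketch {
  std::shared_ptr<Graph> graph_ = std::make_shared<Graph>();
  std::optional<std::shared_ptr<Graph>> optimized_graph_;

  std::shared_ptr<Graph> optimized_graph() {
    if (!optimized_graph_) {
      optimized_graph_ = graph_; // cached on first use, e.g. during inlining in step 1
    }
    return *optimized_graph_;
  }

  void freeze() {
    auto rewritten = std::make_shared<Graph>();
    rewritten->version = graph_->version + 1; // freezing rewrites the graph
    graph_ = rewritten;
    optimized_graph_.reset(); // the fix: clear the stale cached copy
  }
};
```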

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D32410862

Pulled By: davidberard98

fbshipit-source-id: dd8bfe86ec2898b7c72813ab32c08f25c38e4cea
2021-11-16 17:15:29 -08:00
Zhengxu Chen
5ef62c88a9 [jit] Replace get_executor() with call() in abstract Function interface. (#65969)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65969

ghstack-source-id: 141759210

Test Plan: no behavior change.

Reviewed By: anjali411

Differential Revision: D31326151

fbshipit-source-id: 201f6dc4c23fdb2531f6b8c73d26127f9e212de4
2021-10-28 13:11:29 -07:00
Zhengxu Chen
0795735351 [jit] Clean up unneeded virtual methods from Function interface. (#65968)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65968

tryToGraphFunction() should cover all cases and is more composable than
ad-hoc virtual methods.
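
A hedged sketch of the pattern (simplified types): instead of the abstract Function interface growing one virtual hook per graph-related query, callers ask for the concrete graph-backed type and handle nullptr otherwise.
```
struct Function {
  virtual ~Function() = default;
};

struct GraphFunction : Function {
  // graph-specific API lives on the concrete type, not on Function
};

GraphFunction* try_to_graph_function(Function& fn) {
  return dynamic_cast<GraphFunction*>(&fn); // nullptr for non-graph functions
}
```
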
ghstack-source-id: 141759214

Test Plan: no behavior change.

Reviewed By: gmagogsfm

Differential Revision: D31326154

fbshipit-source-id: 692a35df424f7d4f777a96489c4cbb24b3ae7807
2021-10-28 12:28:48 -07:00
jjsjann123
1ec732bc46 Add fp16/fp32 autocasting to JIT/TorchScript (#63939)
Summary:
Adds mixed-precision autocasting support between fp32/fp16 to TorchScript/JIT. A more in-depth description can be found at [torch/csrc/jit/JIT-AUTOCAST.md](https://github.com/pytorch/pytorch/pull/63939/files#diff-1f1772aaa508841c5bb58b74ab98f49a1e577612cd9ea5c386c8714a75db830b)

This PR implements an autocast optimization pass that inserts casting ops per the AMP rules (torch/csrc/jit/passes/autocast.cpp), mimicking the behavior of eager autocast. The pass also takes into consideration the context of `torch.cuda.amp.autocast` and only inserts casting ops within the enabled context manager, giving feature parity with eager AMP autocast.
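
A very rough conceptual sketch of what a cast-insertion pass does; the toy IR types, op names, and allowlist below are placeholders, not the real JIT IR or AMP rule set, and the "cast" is modeled as a dtype change rather than a new node:
```
#include <string>
#include <vector>

struct Value { std::string dtype = "float32"; };
struct Node  { std::string kind; std::vector<Value*> inputs; };

bool prefers_fp16(const std::string& kind) {
  return kind == "matmul" || kind == "conv2d"; // stand-in for the AMP allowlist
}

// Inside an enabled autocast scope, eligible ops get their fp32 inputs cast down.
void insert_casts(std::vector<Node>& graph, bool autocast_enabled) {
  if (!autocast_enabled) return;
  for (Node& n : graph) {
    if (!prefers_fp16(n.kind)) continue;
    for (Value* v : n.inputs) {
      if (v->dtype == "float32") {
        v->dtype = "float16"; // stands in for inserting a cast op before n
      }
    }
  }
}
```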

We currently provide JIT AMP autocast as a prototyping feature, so it is off by default and can be turned on via `torch._C._jit_set_autocast_mode(True)`.

The JIT support for autocast is subject to different constraints compared to the eager mode implementation (mostly related to the fact that TorchScript is statically typed); restrictions on the user-facing Python code are described in torch/csrc/jit/JIT-AUTOCAST.md.

This is a prototype; there are also implementation limitations necessary to keep this PR small and get something functioning quickly upstream, so we can iterate on designs.

A few limitations/challenges that are not properly resolved in this PR:
1. Autocast inserts cast operations, which affect the scalar type of output tensors feeding downstream operations. We are not currently propagating the updated scalar types, which would give issues/wrong results for operations subject to promotion rules.

2. Backward for autodiff in JIT misses the casting of dgrad to the input scalar type, which autograd does in eager mode. This forces us to explicitly mark the casting operation for certain operations (e.g. binary ops); otherwise, we might feed dgrad with a mismatched scalar type to the input. This could potentially break gradient functions consuming dgrad (e.g. gemm backwards, which assumes grad_output to be of the same scalar type as the input).

3. The `torch.autocast` API has an optional `dtype` argument, which is not currently supported in JIT autocast; we require a static value.

Credit goes mostly to:
tlemo
kevinstephano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63939

Reviewed By: navahgar

Differential Revision: D31093381

Pulled By: eellison

fbshipit-source-id: da6e26c668c38b01e296f304507048d6c1794314
2021-10-27 12:11:36 -07:00
Zhengxu Chen
b55a2500d2 [jit] Remove graph() call from abstract Function interface. (#65967)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65967

Graph is an implementation detail. If a user wants access to the
underlying graph, they should explicitly dynamic-cast instead.
ghstack-source-id: 141659819

Test Plan: no behavior change.

Reviewed By: gmagogsfm

Differential Revision: D31326153

fbshipit-source-id: a0e984f57c6013494b92a7095bf5bb660035eb84
2021-10-27 11:54:26 -07:00
Mike Guo
6ecc1a4c4f Make pytorch clang-tidy clean (#60649)
Summary:
This PR suppresses clang-tidy warnings in the codebase (for now) so that we can re-enable clang-tidy checks on master.

I ran this script to add the `NOLINTNEXTLINE` comments (on a devserver):
```bash
python3 setup.py develop

# Uses same script that's run on CI and adds the -j (parallel), -s (add comments), -k (continue if diagnostic errors are found) options
python3 tools/clang_tidy.py \
  -j \
  -s \
  -k \
  -v \
  --paths torch/csrc/ \
  -g"-torch/csrc/jit/passes/onnx/helper.cpp" \
  -g"-torch/csrc/jit/passes/onnx/shape_type_inference.cpp" \
  -g"-torch/csrc/jit/serialization/onnx.cpp" \
  -g"-torch/csrc/jit/serialization/export.cpp" \
  -g"-torch/csrc/jit/serialization/import.cpp" \
  -g"-torch/csrc/jit/serialization/import_legacy.cpp" \
  -g"-torch/csrc/onnx/init.cpp" \
  -g"-torch/csrc/cuda/nccl.*" \
  -g"-torch/csrc/cuda/python_nccl.cpp" \
  -g"-torch/csrc/autograd/FunctionsManual.cpp" \
  -g"-torch/csrc/generic/*.cpp" \
  -g"-torch/csrc/jit/codegen/cuda/runtime/*" \
  -g"-torch/csrc/deploy/interpreter/interpreter.cpp" \
  -g"-torch/csrc/deploy/interpreter/interpreter.h" \
  -g"-torch/csrc/deploy/interpreter/interpreter_impl.h" \
  -g"-torch/csrc/deploy/interpreter/test_main.cpp"
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60649

Test Plan: Verified changes by re-running the script (without the `-s` option) and seeing no warnings/errors.

Reviewed By: walterddr, janeyx99

Differential Revision: D29504258

Pulled By: 1ntEgr8

fbshipit-source-id: 78310b30ee8213b73ddb4771ad874665323e7a4e
2021-07-01 12:21:07 -07:00
Gaoxiang Liu
735f8cc6c2 [DI] Allow explicit taskLauncher for torchscript interpreter (#46865)
Summary:
By default, TorchScript execution is single-threaded and uses the caller's thread pool. For the distributed inference use case, we want a way to customize this behavior so that the TorchScript interpreter can execute elsewhere. This diff allows an explicit taskLauncher for the TorchScript interpreter.
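
A hedged sketch of the "task launcher" idea (the alias and function names below are illustrative, not the actual PyTorch types): the caller supplies a callable that decides where interpreter work actually runs.
```
#include <functional>
#include <thread>
#include <utility>

// A launcher receives a closure and decides where it runs.
using TaskLauncher = std::function<void(std::function<void()>)>;

// Default behavior: run inline on the caller's thread.
void launch_inline(std::function<void()> task) { task(); }

// A custom launcher: hand the work to some other execution context.
void launch_on_new_thread(std::function<void()> task) {
  std::thread(std::move(task)).detach();
}

void run_interpreter_sketch(const TaskLauncher& launcher) {
  launcher([] { /* pretend this resumes a TorchScript interpreter frame */ });
}
```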

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46865

Test Plan:
Unit tests pass.

fbshipit-source-id: 1d7b003926c0d1f8facc53206efb960cff8897ac

Fixes #{issue number}

Reviewed By: houseroad

Differential Revision: D24616102

Pulled By: garroud

fbshipit-source-id: 79202b62f92d0b0baf72e4bf7aa3f05e0da91d59
2020-11-04 17:07:55 -08:00
Ansha Yu
aac36a89ff [model transform] tuple to arglist jit pass (#36093)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36093

Unwrap any tuples (including NamedTuples) in the module forward
function input list into an arglist.
1. Supports multiple tuple inputs, and traces their use through CallMethods and
TupleIndex
2. Does not unwrap inner use of other tuples that did not show up in the
original toplevel graph inputs

We work from the ScriptModule level instead of the Graph level because:
1. If the ScriptModule was previously called with the original set of inputs, the GraphExecutor caches the ExecutionPlan (specifically, ArgumentSpecCreator is derived from the Graph and type checks the inputs passed in)
2. Since we are changing this graph's inputs, we clone the module and clear the GraphExecutor.

Since we work from the ScriptModule level, we cannot take advantage of JIT-level syntactic sugar like run_pass(), so I exposed this as a cpp extension. Let me know if there are other ideas about this.

Test Plan:
buck test caffe2/torch/fb/model_transform:signature_translation_test
Todo: Verify use in bento

Untranslated graph:
```
> graph(%self : __torch__.test_jit.SparseNNWrapper,
>       %inputs.1 : NamedTuple(dense : Tensor, sparse : Dict(int, Tensor))):
>   %2 : __torch__.test_jit.SparseNN = prim::GetAttr[name="main_module"](%self)
>   %4 : Tensor = prim::CallMethod[name="forward"](%2, %inputs.1) # /data/users/ansha/fbsource/fbcode/buck-out/dev/gen/caffe2/test/jit#binary,link-tree/test_jit.py:12141:23
>   return (%4)
```

Translated graph:
```
> graph(%self : __torch__.test_jit.___torch_mangle_1.SparseNNWrapper,
>       %inputs.1_0 : Tensor,
>       %inputs.1_1 : Dict(int, Tensor)):
>   %2 : __torch__.test_jit.___torch_mangle_2.SparseNN = prim::GetAttr[name="main_module"](%self)
>   %3 : Tensor = prim::CallMethod[name="forward"](%2, %inputs.1_0, %inputs.1_1) # /data/users/ansha/fbsource/fbcode/buck-out/dev/gen/caffe2/test/jit#binary,link-tree/test_jit.py:12141:23
>   return (%3)
```

Reviewed By: houseroad

Differential Revision: D20313673

fbshipit-source-id: fddd07c9537dc8b6f480a14d697bea10ecc74470
2020-04-09 22:05:43 -07:00
Jeremy Lilley
8d64a3848c [jit] In RPC Server, handle TorchScript continuations asynchronously (#34109)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34109

This change adds glue to GraphExecutor to give the RPC server
access to the future-based Interpreter::runAsync() api.

Previously, if a server encountered a TorchScript continuation-based block
with fork/wait, it would simply block in the server thread until the handler
completed, since it uses the synchronous Interpreter::run() api.

With the ivalue::Future returned by the Interpreter, we can run the
TorchScript code asynchronously from c++ simply by connecting its
callback to the server callback.
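
A hedged sketch of the callback-chaining idea with a toy future type (the real code chains the c10::ivalue::Future returned by Interpreter::runAsync() into the RPC response path):
```
#include <functional>
#include <utility>
#include <vector>

struct ToyFuture {
  bool done = false;
  std::vector<std::function<void()>> callbacks;

  void addCallback(std::function<void()> cb) {
    if (done) { cb(); return; }
    callbacks.push_back(std::move(cb));
  }
  void markCompleted() {
    done = true;
    for (auto& cb : callbacks) cb();
    callbacks.clear();
  }
};

// Old behavior: block the server thread until the script finished.
// New behavior: connect the script future's completion to the RPC response.
void handle_request(ToyFuture& scriptResult, std::function<void()> sendResponse) {
  scriptResult.addCallback(std::move(sendResponse));
}
```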

We add test cases to cover the new logic, both rpc_async and remote.

ghstack-source-id: 101245438

Test Plan: buck test mode/dev-nosan caffe2/test/distributed/rpc/...

Differential Revision: D20194321

fbshipit-source-id: 16785ec5d9ed0b16cb1ffab0a9771a77de30fcb0
2020-03-31 17:21:46 -07:00
Ilia Cherniavskii
800d5617c0 Recording of TorchScript functions (#34710)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34710

Extending the RecordFunction API to support new recording scopes (such as TorchScript functions), as well as giving more flexibility in setting the sampling rate.

Test Plan: unit test (test_misc.cpp/testRecordFunction)

Reviewed By: gdankel, dzhulgakov

Differential Revision: D20158523

fbshipit-source-id: a9e0819d21cc06f4952d92d43246587c36137582
2020-03-31 00:33:23 -07:00
Hong Xu
027d7f7ba5 Delete AT_WARN and replace all AT_WARN with TORCH_WARN (#34623)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34623

The bandaid of "AT_WARN" keeps introducing new warnings. Let's get rid
of it entirely.

Close #34502

Test Plan: Imported from OSS

Differential Revision: D20420112

Pulled By: albanD

fbshipit-source-id: 7160c113cb4deb2d2f50a375356f423fe5e86f50
2020-03-13 12:27:22 -07:00
James Reed
45a504dd2d [JIT] Introduce BuiltinOpFunction and integrate into torchbind (#34098)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34098

* #33900 [JIT] Move stuff out of class_type.cpp

Test Plan: Imported from OSS

Differential Revision: D20229166

Pulled By: jamesr66a

fbshipit-source-id: d658a63a5d6e372e675f35b8456adc8de82b49f3
2020-03-07 10:03:56 -08:00
James Reed
60e8615a6d [JIT] Virtualize Function (#33921)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33921

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.intern.facebook.com/intern/diff/D20153092/)!

Test Plan: Imported from OSS

Differential Revision: D20177227

Pulled By: jamesr66a

fbshipit-source-id: 87f3e484c4f873d60f76f50f6789c1b4a73bdfde
2020-03-07 10:03:50 -08:00