Commit Graph

9042 Commits

Author SHA1 Message Date
albanD
98afce3c56 Remove unnecessary assert in autograd engine (#34307)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34307

Test Plan: Imported from OSS

Differential Revision: D20283401

Pulled By: albanD

fbshipit-source-id: 34f6eb8955b7d9cb259260abc1056ddd9f354107
2020-03-06 11:45:46 -08:00
Shihao Xu
17ceb6941f [RPC] Create local RRef<ModuleInterface> remotely in Python, use it remotely in TorchScript (#34183)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34183

https://github.com/pytorch/pytorch/pull/33263 enhanced the RRef Python constructor to infer most types via `jit::tryToInferType(..)`.

But this helper function can't infer `ScriptModule` type due to `ScriptModule`'s special per-Module type singleton logic, so it's still not possible for a Python-created RRef to know the JIT type of its contained `ScriptModule`.

Instead of inferring the specific type of a Module, which could lead to too many candidate types (due to the possibility of a Module's multiple inheritance), it's more straightforward to set its type as a user-specified `ModuleInterface` type.

We added an optional argument `type_hint` for users to mark which `ModuleInterface` type an `RRef` holds.
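
A rough sketch of the intended usage (run inside an initialized RPC worker; the interface here is invented for illustration, and the exact signature may differ from this diff):

```python
import torch
import torch.distributed.rpc as rpc

# Hypothetical interface; any @torch.jit.interface matching the module's
# forward signature would work.
@torch.jit.interface
class MyModuleInterface(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pass

scripted = torch.jit.script(torch.nn.Linear(2, 2))
# Mark the RRef with the interface type so TorchScript can use it remotely.
rref = rpc.RRef(scripted, type_hint=MyModuleInterface)
```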

ghstack-source-id: 99649379

(Note: this ignores all push blocking failures!)

Test Plan:
Aspects that need to be confirmed in the test cases

https://fb.quip.com/aGxRAh2lCg05

```
buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork

buck build mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork \
&& buck-out/gen/caffe2/test/distributed/rpc/jit/rpc_fork\#binary.par -r test_create_local_script_class_rref

buck build mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork \
&& buck-out/gen/caffe2/test/distributed/rpc/jit/rpc_fork\#binary.par -r test_create_local_script_module_rref

buck build mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork \
&& buck-out/gen/caffe2/test/distributed/rpc/jit/rpc_fork\#binary.par -r test_return_local_script_class_rref_in_py_and_use_in_script

buck build mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork \
&& buck-out/gen/caffe2/test/distributed/rpc/jit/rpc_fork\#binary.par -r test_return_local_script_module_rref_in_py_and_use_in_script

buck build mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork \
&& buck-out/gen/caffe2/test/distributed/rpc/jit/rpc_fork\#binary.par -r test_torchscript_function_exception
```

Differential Revision: D7065050

fbshipit-source-id: e10210c0996622969e499e4a35b0659b36787c1c
2020-03-06 08:28:22 -08:00
Shen Li
30680196e4 Revert D20121915: [JIT] Add support for list()
Test Plan: revert-hammer

Differential Revision:
D20121915

Original commit changeset: c6c4ef444dbf

fbshipit-source-id: 829adb58780f4d0f41acebb3e7640a9c68bdbc1b
2020-03-06 07:16:40 -08:00
lixinyu
f9f135c5d8 ChannelsLast3d support is_contiguous, contiguous, suggest_memory_format, caching (#33033)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33033

Test Plan: Imported from OSS

Differential Revision: D19759661

Pulled By: glaringlee

fbshipit-source-id: 6c4798fa93589338c0c71c5308b9fd1151330245
2020-03-06 06:02:03 -08:00
Will Feng
415595ace4 [C++ API] Remove init-list form of at::indexing::Slice (#34255)
Summary:
The init-list form of `at::indexing::Slice` in the C++ API (i.e. `tensor.index({{1, None, 2}, ...})` instead of `tensor.index({Slice(1, None, 2), ...})`) can be easily confused with the list-form indexing in the Python API (e.g. `tensor[[1, 3, 2], ...]`), which hurts readability. This PR removes the init-list form of `at::indexing::Slice` to make the API less confusing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34255

Test Plan: Imported from GitHub, without a `Test Plan:` line.

Differential Revision: D20290166

Pulled By: yf225

fbshipit-source-id: abbcbeca0b179219e5e1f196a33ef8aec87ebb76
2020-03-06 05:51:53 -08:00
meganset
b8fd88319a C++ make torch::nn::Sequential push_back(AnyModule) methods public (#34208)
Summary:
Issue https://github.com/pytorch/pytorch/issues/33192
Moves the Sequential::push_back methods that take an AnyModule from private to public.
Allows adding an existing AnyModule via something like:

```
  torch::nn::Sequential q;
  auto a = torch::nn::AnyModule(torch::nn::Linear(1, 2));
  q->push_back(a);
  q->push_back("fc", a);
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34208

Differential Revision: D20300278

Pulled By: yf225

fbshipit-source-id: 4525319bb7fb6667e43a006c9f446a2193781005
2020-03-06 05:47:14 -08:00
Ilia Cherniavskii
b50825e011 Make RecordFunction more robust for async use cases (#34122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34122

Earlier work added support for async RPC cases where RecordFunction's
end callbacks might be called in a different thread; in addition, some
extra care was needed to handle the pointer to the parent function.

This PR makes RecordFunction aware of potentially multiple threads in
use, removes the unused parent() call, and restricts current()
RecordFunction to scope-based record functions (the RECORD_FUNCTION macro).

Test Plan: unit tests

Differential Revision: D20297709

Pulled By: ilia-cher

fbshipit-source-id: 46a59e1b2eea0bbd8a59630385e193b38d30f9d1
2020-03-05 22:28:53 -08:00
anjali411
76035f050b [C++ API Parity] Adam: updated step and class design (#33730)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33730

Differential Revision: D20292073

Pulled By: anjali411

fbshipit-source-id: a7b4a70f29027ab355aebb91873ea55d5cb51783
2020-03-05 19:15:24 -08:00
Shihao Xu
f4da78f1b3 Remove RPC TorchScript private API (#33978)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33978

We can directly pass a user callable to the rpc_async API in TorchScript. There is no need to have a private API for taking a qualified name.
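
A minimal sketch of the pattern this enables (modeled on the use case in this stack; exact argument handling may differ from the version in this diff):

```python
import torch
import torch.distributed.rpc as rpc

@torch.jit.script
def two_tensor_sum(t1: torch.Tensor, t2: torch.Tensor) -> torch.Tensor:
    return t1 + t2

@torch.jit.script
def send_rpc_async(dst_worker_name: str, tensor: torch.Tensor) -> torch.Tensor:
    # The user callable is passed directly; no qualified-name private API.
    fut = rpc.rpc_async(dst_worker_name, two_tensor_sum, args=(tensor, tensor))
    return fut.wait()
```
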
ghstack-source-id: 99600360

Test Plan:
```
buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork

buck build mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork \
&& buck-out/gen/caffe2/test/distributed/rpc/jit/rpc_fork\#binary.par -r test_torchscript_functions_not_supported
```

Differential Revision: D7420993

fbshipit-source-id: 228c15b21848e67418fab780e3fd6a1c6da5142d
2020-03-05 18:35:05 -08:00
Kimish Patel
02478984d6 Add support to dump unsupported ops. Add lite_interpter_load test. (#34278)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34278

This diff helps check all the ops not supported by the lite interpreter. It is
mainly helpful for finding all the ops that need to be added at once, instead
of adding them one by one.

Test Plan:
buck run caffe2/binaries:lite_interpreter_model_load -- --model=<bytecode-model-path>

Reviewed By: iseeyuan

Differential Revision: D20266341

fbshipit-source-id: 5a6c7a5bc52f910cea82a72045870da8105ccb87
2020-03-05 18:31:31 -08:00
Supriya Rao
434af5d94a [quant] Speed up per-channel min-max observer (#34118)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34118

Previously calc_per_channel_qparams used for loops and Python primitives, which called `item` many times, causing a slowdown during training. These changes use torch primitives on the tensor to speed up the operation over 60x (see the sketch after the perf numbers).

Perf results on MobileNetV2 during training using the autograd profiler:

FP32 forward call -
Self CPU time total: 47.222ms
CUDA time total: 124.001ms

before change
FakeQuant Model -
Self CPU time total: 19.107s
CUDA time total: 27.177s

after change
FakeQuant Model -
Self CPU time total: 404.667ms
CUDA time total: 446.344ms
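
A sketch of the vectorized idea (illustrative only, not the observer's actual code): reduce over all dimensions except the channel axis with tensor ops instead of Python loops and `.item()` calls.

```python
import torch

def per_channel_min_max(x: torch.Tensor, ch_axis: int = 0):
    # Bring the channel axis to the front and flatten the rest, then reduce
    # once per channel on-device instead of iterating in Python.
    y = x.transpose(0, ch_axis).reshape(x.size(ch_axis), -1)
    return y.min(dim=1).values, y.max(dim=1).values

w = torch.randn(8, 3, 3, 3)          # e.g. a conv weight, per output channel
mins, maxs = per_channel_min_max(w)  # shapes: (8,), (8,)
```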

Test Plan:
python test/test_quantization.py

Imported from OSS

Differential Revision: D20287841

fbshipit-source-id: 6b706b8206e0d0da3c3c217b014e8da5b71b870d
2020-03-05 18:29:41 -08:00
neginraoof
d2b5eb2a45 [ONNX] Fix for random generators export (#33789)
Summary:
Export random generator with dynamic input size
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33789

Reviewed By: hl475

Differential Revision: D20121175

Pulled By: houseroad

fbshipit-source-id: c16d11eb07678166d125759d97aadfcd7c80ef14
2020-03-05 17:58:54 -08:00
Supriya Rao
1cf12b7e53 [quant] Fix histogram observer to work with QAT on GPU (#34232)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34232

By default `torch.zeros` creates the tensor on CPU. Need to specify the device argument to get it to work correctly on GPU during QAT.
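
The fix pattern, as a minimal sketch (the helper below is illustrative, not the observer's actual code):

```python
import torch

def make_histogram_buffer(x: torch.Tensor, bins: int = 2048) -> torch.Tensor:
    # Without device=x.device this buffer would silently land on CPU and
    # mix devices during QAT on GPU.
    return torch.zeros(bins, device=x.device)
```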

Test Plan:
1. Tested by running QAT on GPU

2. python test/test_quantization.py

Imported from OSS

Differential Revision: D20286351

fbshipit-source-id: 745723c85d902870c56c1c7492f26cb027ae9dc6
2020-03-05 17:19:12 -08:00
Elias Ellison
78aebbcb88 [JIT] add other module apis (#34106)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34106

Test Plan: Imported from OSS

Differential Revision: D20283996

Pulled By: eellison

fbshipit-source-id: 88e7bc4547e96717d6c8efe0b25ede0d198d9e68
2020-03-05 16:12:29 -08:00
Martin Yuan
ccf4d69b75 [Lite Interpreter] Enable __setstate__ (#33294)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33294

1. Serialize the bytecode of __setstate__ and run it when loading the model.
2. One use case is quantization. To test this use case, a few operators are temporarily registered for the lite interpreter. The "_"-prefix registrations will be removed when the operators have all been migrated to mobile.
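
A hedged sketch of the kind of __getstate__/__setstate__ pair whose bytecode can now be serialized and replayed on load (the module and attribute names are invented):

```python
import torch

class Scaler(torch.nn.Module):
    def __init__(self):
        super(Scaler, self).__init__()
        self.scale = 2.0

    @torch.jit.export
    def __getstate__(self):
        return self.scale

    @torch.jit.export
    def __setstate__(self, state: float):
        self.scale = state

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.scale

scripted = torch.jit.script(Scaler())
# When this module is saved for the lite interpreter, the bytecode of
# __setstate__ is now serialized as well and executed when the model loads.
```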

Test Plan: Imported from OSS

Differential Revision: D20162898

Pulled By: iseeyuan

fbshipit-source-id: 7a3180807bf38fbce594d86993896861f12bb58c
2020-03-05 15:24:21 -08:00
Elias Ellison
f218842f2e [JIT] Add support for list() (#33818)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33818

Test Plan: Imported from OSS

Differential Revision: D20121915

Pulled By: eellison

fbshipit-source-id: c6c4ef444dbf1d4134dccb28c13315e225945b64
2020-03-05 14:48:20 -08:00
Elias Ellison
479c3b0aa5 [JIT] add support for torch.norm (#33783)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33783

Fix for https://github.com/pytorch/pytorch/issues/20113

Test Plan: Imported from OSS

Differential Revision: D20121917

Pulled By: eellison

fbshipit-source-id: ffedcc40678cd80f5529ff9323088eed544e5158
2020-03-05 14:46:24 -08:00
Adam Paszke
3a4bac5c76 Throw a proper error when parsing local variable annotations without assignments (#34133)
Summary:
Currently, putting `outputs: List[Tensor]` instead of `outputs: List[Tensor] = []` in your JITed code results in:
```
Traceback (most recent call last):
  File "custom_lstms.py", line 453, in <module>
    test_script_stacked_bidir_rnn(5, 2, 3, 7, 4)
  File "custom_lstms.py", line 404, in test_script_stacked_bidir_rnn
    rnn = script_lstm(input_size, hidden_size, num_layers, bidirectional=True)
  File "custom_lstms.py", line 62, in script_lstm
    other_layer_args=[LSTMCell, hidden_size * dirs, hidden_size]))
  File "/home/apaszke/pytorch/torch/jit/__init__.py", line 1267, in script
    return torch.jit._recursive.create_script_module(obj, torch.jit._recursive.infer_methods_to_compile)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 305, in create_script_module
    return create_script_module_impl(nn_module, concrete_type, stubs_fn)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 348, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "/home/apaszke/pytorch/torch/jit/__init__.py", line 1612, in _construct
    init_fn(script_module)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 340, in init_fn
    scripted = create_script_module_impl(orig_value, sub_concrete_type, infer_methods_to_compile)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 348, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "/home/apaszke/pytorch/torch/jit/__init__.py", line 1612, in _construct
    init_fn(script_module)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 340, in init_fn
    scripted = create_script_module_impl(orig_value, sub_concrete_type, infer_methods_to_compile)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 348, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "/home/apaszke/pytorch/torch/jit/__init__.py", line 1612, in _construct
    init_fn(script_module)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 340, in init_fn
    scripted = create_script_module_impl(orig_value, sub_concrete_type, infer_methods_to_compile)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 348, in create_script_module_impl
    script_module = torch.jit.RecursiveScriptModule._construct(cpp_module, init_fn)
  File "/home/apaszke/pytorch/torch/jit/__init__.py", line 1612, in _construct
    init_fn(script_module)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 340, in init_fn
    scripted = create_script_module_impl(orig_value, sub_concrete_type, infer_methods_to_compile)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 317, in create_script_module_impl
    stubs = stubs_fn(nn_module)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 511, in infer_methods_to_compile
    stubs.append(make_stub_from_method(nn_module, method))
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 41, in make_stub_from_method
    return make_stub(func)
  File "/home/apaszke/pytorch/torch/jit/_recursive.py", line 34, in make_stub
    ast = torch.jit.get_jit_def(func, self_name="RecursiveScriptModule")
  File "/home/apaszke/pytorch/torch/jit/frontend.py", line 173, in get_jit_def
    return build_def(ctx, py_ast.body[0], type_line, self_name)
  File "/home/apaszke/pytorch/torch/jit/frontend.py", line 206, in build_def
    build_stmts(ctx, body))
  File "/home/apaszke/pytorch/torch/jit/frontend.py", line 129, in build_stmts
    stmts = [build_stmt(ctx, s) for s in stmts]
  File "/home/apaszke/pytorch/torch/jit/frontend.py", line 129, in <listcomp>
    stmts = [build_stmt(ctx, s) for s in stmts]
  File "/home/apaszke/pytorch/torch/jit/frontend.py", line 181, in __call__
    return method(ctx, node)
  File "/home/apaszke/pytorch/torch/jit/frontend.py", line 294, in build_AnnAssign
    rhs = build_expr(ctx, stmt.value)
  File "/home/apaszke/pytorch/torch/jit/frontend.py", line 180, in __call__
    raise UnsupportedNodeError(ctx, node)
  File "/home/apaszke/pytorch/torch/jit/frontend.py", line 116, in __init__
    source_range = ctx.make_range(offending_node.lineno,
AttributeError: 'NoneType' object has no attribute 'lineno'
```

This patch makes the error message more reasonable:
```
torch.jit.frontend.UnsupportedNodeError: annotated assignments without assigned value aren't supported:
  File "custom_lstms.py", line 221
        # type: (Tensor, Tuple[Tensor, Tensor]) -> Tuple[Tensor, Tuple[Tensor, Tensor]]
        inputs = reverse(input.unbind(0))
        outputs: List[Tensor]
        ~ <--- HERE
        for i in range(len(inputs)):
            out, state = self.cell(inputs[i], state)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34133

Differential Revision: D20249076

Pulled By: ezyang

fbshipit-source-id: 40ec34ad38859f9fe56f379d3f8d08644b00fab9
2020-03-05 11:23:07 -08:00
Lara Haidar
8216d9ae64 ONNX Export Support for NLLLoss (#33509)
Summary:
Adding ONNX export support for torch.nn.NLLLoss().
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33509

Reviewed By: hl475

Differential Revision: D20052212

Pulled By: houseroad

fbshipit-source-id: 62efcff4efa1e0e97c65ad1b670c2fc1da08d28f
2020-03-05 11:13:21 -08:00
Rohan Varma
d98bd5e1f5 [test all] Back out "Revert D20171428: [profiler] fix chrome tracing for profiler run with cuda"
Summary:
There was an error in
https://github.com/pytorch/pytorch/pull/30724/files that resulted in
export_chrome_trace generating invalid JSON. This only came up when the
profiler is run with use_cuda=True from what it looks like. In the future, we
should have tests that ensure we generate valid JSON because we no longer use
the json library.
ghstack-source-id: 99508836

Test Plan: Added a unit test.

Differential Revision: D20237040

fbshipit-source-id: 510befbdf4ec39632ac56544afcddee6c8cc3aca
2020-03-05 09:05:56 -08:00
Jie
2b79bab029 [CUDA_FUSER] Fork CUDA fuser (#33527)
Summary:
Separating CUDA fuser from CPU fuser.

1. New node in IR - prim::CudaFusionGroup:
   This enables the CUDA fuser to co-exist alongside the old fuser, allowing us
   to incrementally build and expand the CUDA fuser.

2. Copied the FuseGraph optimization passes to CudaFuserGraph:
   We will refactor & reuse Chunk/Concat from the old fuser logic, which is
   handled in the optimization pass at this moment. Unfortunately much of the
   code in the pass is tightly bound to the legacy fuser, which makes code
   sharing difficult.
   The CudaFusionGraph will support only a subset of operations compared to the
   legacy fuser (CUDA only). It is registered as a custom post-fusion pass via
     ```torch._C._jit_register_cuda_fuser()```
   To have it take effect, you should also turn off fusion on GPU via
     ```torch._C._jit_override_can_fuse_on_gpu(False)```

3. We don't have codegen in this PR yet (WIP). Currently we just fall back to
   the old fuser.
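
Putting the two calls above together (both are taken from the summary; this requires a build that includes the CUDA fuser):

```python
import torch

# Register the CUDA fusion pass (runs after the regular fusion passes) and
# turn off the legacy GPU fuser so prim::CudaFusionGroup takes over.
torch._C._jit_register_cuda_fuser()
torch._C._jit_override_can_fuse_on_gpu(False)
```
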
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33527

Differential Revision: D20171598

Pulled By: ZolotukhinM

fbshipit-source-id: 9a3c0f06f46da7eaa80ae7551c04869f5b03ef71
2020-03-04 20:25:08 -08:00
Elias Ellison
e132047f1b [JIT] fix alias assertion (#34268)
Summary:
[This check](019ffdca31/torch/csrc/jit/ir/alias_analysis.cpp (L772)) wasn't being triggered for None outputs of tuples, because `mustBeNone` would return false if `num_outputs != 1`.  This caused an assertion to fail in alias analysis. It's kind of a convoluted case to repro and I wasn't able to make a succinct one, but I tested internally and it fixed the bug.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34268

Differential Revision: D20261539

Pulled By: eellison

fbshipit-source-id: 95edea10e2971727cfd3f3bc2b6bdf9dbadca6a9
2020-03-04 19:00:58 -08:00
Pritam Damania
c62de4286e Add test to verify dist_autograd doesn't populate .grad field. (#33949)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33949

ghstack-source-id: 99419830

Test Plan: waitforbuildbot

Differential Revision: D20165254

fbshipit-source-id: ef4413637b1568d81e4aca053838230025df6bba
2020-03-04 17:08:48 -08:00
Supriya Rao
e236e15934 [quant] Run weight_post_process for QAT (#33852)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33852

This fixes an issue for QAT models. During eval, if we call `prepare_qat` and `convert` before calling `load_state_dict`, it throws an error because the weight info (number of channels) is not updated in the observer module.
It is not an issue for the per-tensor case.

Fixes issue #33830
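
A sketch of the eval flow described above (the model and qconfig choices are illustrative):

```python
import torch
import torch.quantization as tq

model = torch.nn.Sequential(tq.QuantStub(), torch.nn.Conv2d(3, 8, 3), tq.DeQuantStub())
model.train()
model.qconfig = tq.get_default_qat_qconfig("fbgemm")  # per-channel weight observer
tq.prepare_qat(model, inplace=True)

model.eval()
# Pre-fix, convert could error here for per-channel qconfigs when no forward
# pass had run yet, because the weight observer had no channel info.
quantized = tq.convert(model, inplace=False)
quantized.load_state_dict(quantized.state_dict())  # eval-time flow now works
```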

Test Plan:
python test/test_quantization.py EagerModePostTrainingQuantTest.test_eval_after_train
python test/test_quantization.py EagerModeQuantizationAwareTrainingTest.test_eval_after_train

Imported from OSS

Differential Revision: D20212996

fbshipit-source-id: a04af8fe4df2e555270ae4d6693f5777d86f8a46
2020-03-04 14:01:32 -08:00
Shen Li
d59e036f4d Revert D20194092: Add support to dump unsupported ops. Add lite_interpter_load test.
Test Plan: revert-hammer

Differential Revision:
D20194092

Original commit changeset: 0d596cd02043

fbshipit-source-id: 17b4bae27543f231bd6c12d90368d399ca55ebdf
2020-03-04 13:53:58 -08:00
Kimish Patel
17a5c67796 Add support to dump unsupported ops. Add lite_interpter_load test. (#34072)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34072

This diff helps check all the ops not supported by the lite interpreter. It is
mainly helpful for finding all the ops that need to be added at once, instead
of adding them one by one.

Test Plan:
buck run caffe2/binaries:lite_interpreter_model_load -- --model=<bytecode-model-path>

Reviewed By: iseeyuan

Differential Revision: D20194092

fbshipit-source-id: 0d596cd0204308027194af7ed738551d0c32a374
2020-03-04 13:18:12 -08:00
Dmytro Dzhulgakov
67608cc018 Fix MKLDNN conv2d 5d weight handling (#34115)
Summary:
Effectively backporting c5c00c119f before that PR lands.

The bug didn't manifest itself earlier because the MkldnnConv2d constructor didn't reorder the weights, so the issue arose only on the second serialization/deserialization. This also fixes the constructor to deliver better perf right away.

Note that I still serialize a 5d tensor - it was the previous behavior, we have to handle it anyway, and with https://github.com/pytorch/pytorch/issues/32422 the output of `mkldnn_reorder_conv2d_weight` will always be 4d.
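
A sketch of the double round trip described above, assuming an MKLDNN-enabled build (file names are illustrative):

```python
import torch
from torch.utils import mkldnn as mkldnn_utils

conv = torch.nn.Conv2d(3, 8, kernel_size=3).eval()
m = mkldnn_utils.to_mkldnn(conv)        # constructor now reorders the weight up front
torch.jit.save(m, "conv_mkldnn.pt")     # first save
m2 = torch.jit.load("conv_mkldnn.pt")
torch.jit.save(m2, "conv_mkldnn2.pt")   # second round trip used to hit the 5d-weight bug
```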

cc pinzhenx
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34115

Reviewed By: wanchaol

Differential Revision: D20224685

Pulled By: dzhulgakov

fbshipit-source-id: 24ca9227c4eb4c139096a64ae348808d7478d7dc
2020-03-04 11:26:38 -08:00
Shen Li
ac6e75a165 Revert D20195053: [pytorch][PR] Add API for listing functions overridable by __torch_function__
Test Plan: revert-hammer

Differential Revision:
D20195053

Original commit changeset: 1585f4e405f5

fbshipit-source-id: 3c1aab9c60e3138d40d200ae4238bda0cddf8896
2020-03-04 10:13:54 -08:00
Omkar Salpekar
78b81dad83 [Dist Autograd][Better Engineering] Enhanced Error Reporting in Dist Autograd/RPC (#34179)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34179

Fixes: https://github.com/pytorch/pytorch/issues/27644

Test Plan: Asserted `test_backward_autograd_engine_error` throws an exception with node information.

Differential Revision: D20238150

fbshipit-source-id: a49b279b77416a7e0e09043aa44ed616023d8e70
2020-03-04 10:13:49 -08:00
Nikita Shulga
45b8c8dbcb [torch] Fix sign-compare warning in torch::utils::rnn:pack_sequence (#34185)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34185

ArrayRef<T>::size() is size_t

Test Plan: CI

Reviewed By: EscapeZero

Differential Revision: D20241552

fbshipit-source-id: 73cd062db810ebc5a4e34e094dfe6c7e6571ef2d
2020-03-04 10:13:45 -08:00
Jerry Zhang
6f52562e75 [quant][graphmode] Add add_relu pattern in skip values (#32816)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32816

att

Test Plan:
python test/test_jit.py

Imported from OSS

Differential Revision: D20208786

fbshipit-source-id: ef84b77f46f88b192a75c123aabaa203836a7dfb
2020-03-04 09:36:02 -08:00
Martin Yuan
fdd771c90f Make tracing in code gen optional (#33715)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33715

Tracing code depends on the full JIT, which is not available in the lite interpreter. Use `-c pt.disable_gen_tracing=1` to turn off generation of the tracing part.
ghstack-source-id: 99252322

Test Plan:
```
buck build xplat/caffe2:torch -c pt.disable_gen_tracing=1
```
The tracing part of generated/VariableType_?.cpp will not be generated.

Reviewed By: smessmer

Differential Revision: D19684577

fbshipit-source-id: a1e5b80eca5e51c7bf72b5cc8f0e36c2135fabc2
2020-03-04 08:16:31 -08:00
Martin Yuan
f097ca503d Add and test training in lite interpreter. (#32359)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32359

Test Plan: Imported from OSS

Differential Revision: D19450614

Pulled By: iseeyuan

fbshipit-source-id: 6bafff39d7880a5b7fb9cd70c33a4e584812be12
2020-03-03 23:33:43 -08:00
Shihao Xu
7d01888a75 [JIT] Register rpc.rpc_async(..) as a JIT operator (#33329)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33329

# Use case

```
@torch.jit.script
def send_rpc_async(dst_worker_name, user_callable_qual_name, tensor):
    # type: (str, str, Tensor) -> None
    rpc._rpc_async_torchscript(
        dst_worker_name, user_callable_qual_name, args=(tensor,)
    )
```

# Problem

```
torch.jit.frontend.NotSupportedError: keyword-arg expansion is not supported:
  File "/data/users/shihaoxu/fbsource/fbcode/buck-out/dev/gen/caffe2/test/distributed/rpc/rpc_spawn#binary,link-tree/torch/distributed/rpc/api.py", line 722
    args = args if args else ()
    kwargs = kwargs if kwargs else {}
    fut = _invoke_rpc_torchscript(to, qualified_name, *args, **kwargs)
                                                               ~~~~~~ <--- HERE
    return fut
```

# Solution

Register `rpc.rpc_async(..)` as a JIT operator to handle variable-length argument list.

# Plan

This PR contains the required changes to make `rpc.rpc_async(..)` a JIT prim operator, which can dynamically handle different numbers of arguments.

- Register "prim::rpc_async" as a `Symbol` in "interned_string.h".
- Add an if branch in the "python_sugared_value.cpp" `toSugarValue(py::object, ..)` entry utility function to set up how the JIT frontend converts the `torch.distributed.rpc.rpc_async(..)` Python function (a Python object) into a `SpecialFormValue` (an IR SugaredValue).
- Add a switch case for the "prim::rpc_async" Symbol in "ir_emitter.cpp" and `emitApplySpecialForm(..)` to set up how the JIT compiler provides inputs to the "prim::rpc_async" Operator.
- Register "prim::rpc_async" as a `jit::Operator` and provide its implementation in "register_distributed_ops.cpp".

Note: since the distributed module is an optional part of a PyTorch build, the code added in this PR should be wrapped within a preprocessor macro.
```
#ifdef USE_DISTRIBUTED
new code here
#endif
```

Test Plan:
Items that need to be confirmed in the test cases

https://fb.quip.com/DCvdA9ZLjeO0

```
buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork

buck build mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork \
&& buck-out/gen/caffe2/test/distributed/rpc/jit/rpc_fork\#binary.par -r test_call_python_function_remotely_from_script_not_supported
```

```
buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_spawn
```

```
buck test mode/dev-nosan //caffe2/caffe2/python/operator_test:layer_norm_op_test-2.7 -- test_layer_norm_op_jit
```

Differential Revision: D5738300

fbshipit-source-id: a4604fe762e00be062dc8232ca9790df31fb2074
2020-03-03 19:57:42 -08:00
Jerry Zhang
e5bbd23ca7 [quant][graphmode] Skip quantizing input and output in matched module (#32814)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32814

We skip quantization for the intermediate values for patterns like `Conv - ReLU`,
but currently we don't skip quantizing the input/output of the graphs of matched
modules. Since we have changed the way we add observers, this also needs to be updated.

Test Plan:
python test/test_jit.py -- 'TestJit.test_insert_observers_skip_values'

Imported from OSS

Differential Revision: D20208785

fbshipit-source-id: ce30f2c4c8ce737500d0b41357c80ec8b33aecf9
2020-03-03 18:38:36 -08:00
Emilio Castillo
31cc311143 Expose CUDACachingAllocator raw_alloc and raw_delete to python (#33860)
Summary:
This PR aims to improve interoperability with [CuPy](https://github.com/cupy/cupy/pulls).

Instead of having two separate and conflicting memory pools, with this PR CuPy can directly allocate memory from the PyTorch allocator by means of this proposal: https://github.com/cupy/cupy/pull/3126

We would like to gather feedback on whether this approach makes sense for PyTorch, or whether alternative designs would be preferable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33860

Differential Revision: D20212788

Pulled By: ngimel

fbshipit-source-id: bc1e08a66da1992d26021147bf645dc65239581c
2020-03-03 17:50:11 -08:00
davidriazati
99e211e661 [jit] Add type tags to lists/dicts in pickle (#33255)
Summary:
Stacked PRs
 * #33474 - [jit] Remove list specializations from pickler
 * **#33255 - [jit] Add type tags to lists/dicts in pickle**

This adds a global call to `torch.jit._pickle.restore_type_tags` for
lists and dicts so that we can preserve their types after serialization.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33255

Pulled By: driazati

Reviewed By: xman1979, Tianshu-Bao

Differential Revision: D19868637

fbshipit-source-id: 2f1826e6679a786ca209198690269f399a542c04
2020-03-03 16:48:21 -08:00
Shen Li
7da24b36b1 Apply clang-format to RPC files (#34139)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34139

Test Plan: Imported from OSS

Differential Revision: D20227342

Pulled By: mrshenli

fbshipit-source-id: 01b478bde1f6a51f69eb5277fa90ba6ac2d4b5dc
2020-03-03 16:44:35 -08:00
Shen Li
3af0dffe84 Use double quotes in C++ to stay consistent with Python RPC docs (#34095)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34095

Test Plan: Imported from OSS

Differential Revision: D20227343

Pulled By: mrshenli

fbshipit-source-id: 69c556beee1f9e944eb1053b5ff0ac368dd99c60
2020-03-03 16:44:30 -08:00
Shen Li
f1085a8e41 Improve ProcessGroup RpcBackendOptions Constructor API (#34081)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34081

Before this commit, applications had to do the following to configure the
number of threads in the ProcessGroup RPC backend:

```
op = ProcessGroupRpcBackendOptions()
op.rpc_timeout = rpc_timeout
op.init_method = init_method
op.num_send_recv_threads = 32
init_rpc(...., rpc_backend_options=op)
```

After this commit, it can be simplified to:

```
init_rpc(...., rpc_backend_options=ProcessGroupRpcBackendOptions(num_send_recv_threads=32))
```

Fixes #34075

Test Plan: Imported from OSS

Differential Revision: D20227344

Pulled By: mrshenli

fbshipit-source-id: def4318e987179b8c8ecca44d7ff935702c8a6e7
2020-03-03 16:43:29 -08:00
lixinyu
7cda964e20 Remove deprecated codepath for old-style autograd.Function (#30696) (#33956)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33956

Test Plan: Imported from OSS

Differential Revision: D20167359

Pulled By: glaringlee

fbshipit-source-id: 9b323bd29eca97bce0475225ad2b3b2ded29005d
2020-03-03 14:58:02 -08:00
Elias Ellison
04378eb618 [JIT] Add modulelist indexing for integer literal (#29236)
Summary:
Allow indexing into modulelists for integer literals.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29236

Differential Revision: D19583935

Pulled By: eellison

fbshipit-source-id: 24d54051422a69769dac5e82f3bf622ded2bd8a6
2020-03-03 14:47:31 -08:00
Edward Yang
ba1bd41767 Turn on strict dtype checking for test_torch.py (#33825)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33825

Partially addresses #20376

I do this by overriding assertEqual in classes that opt into
this.  This means I have to fix #33821.  The fix is a little
unsatisfactory as idiomatic Python 2 super() calls don't work
(since the class is no longer in scope); hopefully this will just
work when we go to Python 3.

General approach taken:
- A lot of dtype mismatches are because we specified tensor constants
  that infer to some dtype, but the actual dtype needed is something else.
  Those are easy: just annotate the tensor() constructor (often a legacy
  Tensor/FloatTensor call) with dtype (see the sketch after this list)
- There are a few cases where the promotion rules are nontrivial. Some of them
  I just typed out the expected promotion rules manually (based on trial
  and error)
- There are some more complex cases; if it gets too hairy I just
  set exact_dtype=False and nope the fuck out
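
For the first bullet, the fix pattern looks roughly like this (a generic illustration, not a line from the diff):

```python
import torch

# A constant like torch.tensor([1., 2., 3.]) infers float32; if the op under
# test returns float64, strict checking fails unless the dtype is pinned.
expected = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float64)
actual = torch.arange(1.0, 4.0, dtype=torch.float64)
assert actual.dtype == expected.dtype and torch.equal(actual, expected)
```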

I don't have time to do it for all the other classes.  But the setup
should work if people just incrementally add the overrides to classes,
and then eventually flip the default.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D20125791

Pulled By: ezyang

fbshipit-source-id: 389c2d1efbd93172af02f13e38ac5e92fe730c57
2020-03-03 14:45:53 -08:00
Rohan Varma
c579976603 Revert D20171428: [profiler] fix chrome tracing for profiler run with cuda
Test Plan: revert-hammer

Differential Revision:
D20171428

Original commit changeset: ec135a154ce3

fbshipit-source-id: 51ef4351a0df33fd087edbca1b7cd753cdbf1fdf
2020-03-03 14:36:01 -08:00
Rohan Varma
92083f31b5 [gloo] dont hold locks in calls to buffer in ProcessGroupGloo:RecvWork::wait() and (#33926)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33926

The UnboundBuffer calls here are already protected by a mutex. We only
need to hold the lock while writing the shared structures completed_ and
exception_.
ghstack-source-id: 99315427

Test Plan: CI

Differential Revision: D20154546

fbshipit-source-id: d1b74508c917b21acdcd0f6a914eb0455437ca0e
2020-03-03 13:28:45 -08:00
Rohan Varma
c93b1d427c [profiler] fix chrome tracing for profiler run with cuda (#33987)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33987

There was an error in
https://github.com/pytorch/pytorch/pull/30724/files that resulted in
`export_chrome_trace` generating invalid JSON. This only came up when the
profiler is run with `use_cuda=True` from what it looks like. In the future, we
should have tests that ensure we generate valid JSON because we no longer use
the json library.

Test Plan: Add UT to validate JSON.

Differential Revision: D20171428

fbshipit-source-id: ec135a154ce33f62b78d98468174dce4cf01fedf
2020-03-03 13:27:26 -08:00
Eleanor Dwight Holland
6a97777f72 Remove use of .data from optimizers (#33640)
Summary:
Removes all uses of `.data` from optimizers.

Or tries to.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33640

Reviewed By: vincentqb

Differential Revision: D20203216

Pulled By: albanD

fbshipit-source-id: 9bfe78bbed00fd4aaa690801cff0201f0bd680a0
2020-03-03 13:21:55 -08:00
Dmytro Dzhulgakov
a8fc3d8c2a Fix HistogramObserver to not do detach on input (#34114)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/33545; added a unit test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34114

Differential Revision: D20224719

Pulled By: dzhulgakov

fbshipit-source-id: 053d3b3b0c86340027ba1b95b5f3c247aa151aee
2020-03-03 13:15:22 -08:00
Nathan Goldbaum
ad2825a2c9 Add API for listing functions overridable by __torch_function__ (#33791)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/33182

This adds private API functions that developers of types that implement `__torch_function__` can use to ensure full coverage of the subset of the PyTorch API that can be overridden.

I've refactored some of the code in the tests into a new `torch._overrides.get_overridable_functions` function. I've also changed `TENSOR_LIKE_TORCH_OVERRIDES` into `torch._overrides.get_testing_overrides` and `IGNORED_TORCH_FUNCTIONS` into `torch._overrides.get_ignored_functions`. Making these two static global variables in the tests into functions should allow rewriting their implementation to construct their return values instead of just statically defining the return value as is done here. Currently that is blocked on not being able to inspect function signatures of compiled kernels in PyTorch (see https://github.com/pytorch/pytorch/issues/28233). See the docs I've added for usage examples of these new functions. I also refactored the existing override tests to make use of these new functions, which should be a good forcing function to make sure they're kept up-to-date.

Finally, while working on this I discovered that `TestTorchFunctionOverrides.test_mean` and `TestTorchFunctionOverrides.test_mm` weren't ever being run because they were getting clobbered by the other dynamically generated override tests. I fixed that by renaming the tests and then fixing the actual test code. I've verified that all the subclassing semantics is correct and that the updated test answers are correct. I'm happy to put the fixes to the existing tests in as a separate pull request if that would be easier to review.
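
A sketch of how a developer might exercise the new functions (names are taken from the summary; the return types are assumed):

```python
import torch
from torch._overrides import (
    get_overridable_functions,
    get_testing_overrides,
    get_ignored_functions,
)

overridable = get_overridable_functions()  # namespace -> list of functions
dummies = get_testing_overrides()          # function -> signature-matching lambda
ignored = get_ignored_functions()          # functions that can't be overridden
print(sum(len(v) for v in overridable.values()), len(dummies), len(ignored))
```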

ping cpuhrsch since the feature request originally came from them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33791

Differential Revision: D20195053

Pulled By: cpuhrsch

fbshipit-source-id: 1585f4e405f5223932b410eae03a288dc8eb627e
2020-03-03 12:40:34 -08:00
Zachary DeVito
358450e02b improved TorchScript traceback (#33834)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33834

This changes how we report Tracebacks to make them more clear when
there are both serialized and non-serialized ranges. It now looks like:

```
Traceback (most recent call last):
  File "foo.py", line 25, in <module>
    s2(a, b)
  File "/scratch/zdevito/pytorch/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__.py", line 7, in forward
    x: Tensor,
    y: Tensor) -> Tensor:
    return (self).bar(x, y, )
            ~~~~~~~~~ <--- HERE
  def bar(self: __torch__.Moo,
    x: Tensor,
  File "code/__torch__.py", line 11, in bar
    x: Tensor,
    y: Tensor) -> Tensor:
    _0 = (self).baz(x, y, )
          ~~~~~~~~~ <--- HERE
    _1 = torch.ones([3], dtype=None, layout=None, device=None, pin_memory=None)
    return torch.add(_0, _1, alpha=1)
  File "code/__torch__.py", line 17, in baz
    x: Tensor,
    y: Tensor) -> Tensor:
    return torch.add(x, y, alpha=1)
           ~~~~~~~~~ <--- HERE

Traceback of TorchScript, original code (most recent call last):
  File "foo.py", line 11, in forward
    def forward(self, x, y):
        return self.bar(x, y)
               ~~~~~~~~ <--- HERE
  File "foo.py", line 9, in bar
    def bar(self, x, y):
        return self.baz(x, y) + torch.ones(3)
               ~~~~~~~~ <--- HERE
  File "foo.py", line 7, in baz
    def baz(self, x, y):
        return x + y
               ~~~~~ <--- HERE
RuntimeError: The size of tensor a (4) must match the size of tensor b (5) at non-singleton dimension 1
```

It follows the Python convention of putting the most important information last
and reading from the bottom up.

Changes:
* Moved the error message to the end, to match Python
* Report the original traceback separately from the serialized traceback
* Make sure root functions have names in the interpreter trace.

Test Plan: Imported from OSS

Differential Revision: D20126136

Pulled By: zdevito

fbshipit-source-id: fd01f9985e5d74e04c4d064c02e8bc320f4fac13
2020-03-03 12:27:38 -08:00