Commit Graph

1823 Commits

Jie
2b79bab029 [CUDA_FUSER] Fork CUDA fuser (#33527)
Summary:
Separating CUDA fuser from CPU fuser.

1. New node in IR - prim::CudaFusionGroup:
   This enables the CUDA fuser to co-exist alongside the old fuser, allowing us
   to incrementally build and expand the CUDA fuser.

2. Copied the FuseGraph optimization passes to CudaFuserGraph:
   We will refactor & reuse the Chunk/Concat logic of the old fuser, which is
   handled in the optimization pass at the moment. Unfortunately, much of the
   code in the pass is tightly bound to the legacy fuser, which makes code
   sharing difficult.
   The CudaFusionGraph will support only a subset of operations compared to the
   legacy fuser (CUDA only). It is registered as a custom pass post fusion via
     ```torch._C._jit_register_cuda_fuser()```
   For it to take effect, you should also turn off fusion on the GPU via
     ```torch._C._jit_override_can_fuse_on_gpu(False)```
   (a combined sketch follows this list).

3. We don't have codegen in this PR yet (WIP). Currently we just fall back to
   the old fuser.
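
A minimal sketch combining the two calls above (assuming a build that includes this PR; `_jit_register_cuda_fuser` is the name given in this message):
```python
import torch

# register the forked CUDA fuser as a custom post-fusion pass
torch._C._jit_register_cuda_fuser()
# disable the legacy GPU fuser so prim::CudaFusionGroup nodes are used instead
torch._C._jit_override_can_fuse_on_gpu(False)
```
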
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33527

Differential Revision: D20171598

Pulled By: ZolotukhinM

fbshipit-source-id: 9a3c0f06f46da7eaa80ae7551c04869f5b03ef71
2020-03-04 20:25:08 -08:00
Edward Yang
93990bab58 Make use of our S3 mirror if Yann LeCun's website is not accessible (#34215)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34215

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D20251538

Pulled By: ezyang

fbshipit-source-id: c419f0ce869aca4dede7e37ebd274a08632d10bf
2020-03-04 11:35:34 -08:00
Shen Li
78ad3dc174 Fix Lint (#34218)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34218

Test Plan: Imported from OSS

Differential Revision: D20249788

Pulled By: mrshenli

fbshipit-source-id: 5ca2acaff5344fc4455c70af60576f8e93e54cbf
2020-03-04 09:48:57 -08:00
Martin Yuan
fdd771c90f Make tracing in code gen optional (#33715)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33715

Tracing code depends on the full JIT, which is not available in the lite interpreter. Use `-c pt.disable_gen_tracing=1` to turn off generating the tracing part.
ghstack-source-id: 99252322

Test Plan:
```
buck build xplat/caffe2:torch -c pt.disable_gen_tracing=1
```
The tracing part of generated/VariableType_?.cpp will not be generated.

Reviewed By: smessmer

Differential Revision: D19684577

fbshipit-source-id: a1e5b80eca5e51c7bf72b5cc8f0e36c2135fabc2
2020-03-04 08:16:31 -08:00
Martin Yuan
f097ca503d Add and test training in lite interpreter. (#32359)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32359

Test Plan: Imported from OSS

Differential Revision: D19450614

Pulled By: iseeyuan

fbshipit-source-id: 6bafff39d7880a5b7fb9cd70c33a4e584812be12
2020-03-03 23:33:43 -08:00
Shihao Xu
7d01888a75 [JIT] Register rpc.rpc_async(..) as a JIT operator (#33329)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33329

# Use case

```
@torch.jit.script
def send_rpc_async(dst_worker_name, user_callable_qual_name, tensor):
    # type: (str, str, Tensor) -> None
    rpc._rpc_async_torchscript(
        dst_worker_name, user_callable_qual_name, args=(tensor,)
    )
```

# Problem

```
torch.jit.frontend.NotSupportedError: keyword-arg expansion is not supported:
  File "/data/users/shihaoxu/fbsource/fbcode/buck-out/dev/gen/caffe2/test/distributed/rpc/rpc_spawn#binary,link-tree/torch/distributed/rpc/api.py", line 722
    args = args if args else ()
    kwargs = kwargs if kwargs else {}
    fut = _invoke_rpc_torchscript(to, qualified_name, *args, **kwargs)
                                                               ~~~~~~ <--- HERE
    return fut
```

# Solution

Register `rpc.rpc_async(..)` as a JIT operator to handle variable-length argument list.

# Plan

This PR contains the required changes to make `rpc.rpc_async(..)` a JIT prim operator, which can dynamically handle different numbers of arguments.

- Register "prim::rpc_async" as a `Symbol` in "interned_strings.h".
- Add an if branch in "python_sugared_value.cpp" `toSugarValue(py::object, ..)`, the entry utility function, to set up how the JIT frontend converts the `torch.distributed.rpc.rpc_async(..)` Python function (a Python object) into a `SpecialFormValue` (an IR SugaredValue).
- Add a switch case for the "prim::rpc_async" Symbol in "ir_emitter.cpp" `emitApplySpecialForm(..)` to set up how the JIT compiler provides inputs to the "prim::rpc_async" Operator.
- Register "prim::rpc_async" as a `jit::Operator` and provide its implementation in "register_distributed_ops.cpp".

Note that since the distributed module is an optional part of the PyTorch build, the code added in this PR should be wrapped in a preprocessor macro:
```
#ifdef USE_DISTRIBUTED
new code here
#endif
```

Test Plan:
Items that need to be confirmed in the test cases

https://fb.quip.com/DCvdA9ZLjeO0

```
buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork

buck build mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork  \
\
&& buck-out/gen/caffe2/test/distributed/rpc/jit/rpc_fork\#binary.par -r test_call_python_function_remotely_from_script_not_supported
```

```
buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_spawn
```

```
buck test mode/dev-nosan //caffe2/caffe2/python/operator_test:layer_norm_op_test-2.7 -- test_layer_norm_op_jit
```

Differential Revision: D5738300

fbshipit-source-id: a4604fe762e00be062dc8232ca9790df31fb2074
2020-03-03 19:57:42 -08:00
Jiakai Liu
3c042a6ab9 [pytorch][mobile] support for custom mobile build with dynamic dispatch (#34055)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34055

Enable custom mobile build with dynamic dispatch for OSS build.

It calls a Python utility script to calculate transitive dependencies from
the op dependency graph and the list of used root ops, then passes the
result as the op registration whitelist to the ATen codegen, so that only
these used ops are registered and kept at link time.

For custom build with dynamic dispatch to work correctly, it's critical
to have an accurate list of used ops. The current assumption is that only
those ops referenced by the TorchScript model are used. This works well if
client code doesn't call the libtorch API (e.g. tensor methods) directly;
otherwise the extra used ops need to be added to the whitelist manually,
as shown by the HACK in prepare_model.py.

Also, if JIT starts calling extra ops independent of specific model,
then the extra ops need to be added to the whitelist as well.

Verified the correctness of the whole process with MobileNetV2:
```
TEST_CUSTOM_BUILD_DYNAMIC=1 test/mobile/custom_build/build.sh
```

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D20193327

Pulled By: ljk53

fbshipit-source-id: 9d369b8864856b098342aea79e0ac8eec04149aa
2020-03-03 19:25:16 -08:00
cyy
5be8a4e027 find mkl installed by nuget (#34031)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34031

Differential Revision: D20221807

Pulled By: ezyang

fbshipit-source-id: 827e2775956f408febb287676bbf9a96a70fe2d4
2020-03-03 07:44:20 -08:00
Michael Ranieri
51d969e86a preprocessor cleanup (#33957)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33957

Lots of small preprocessor warning cleanups for Windows.

Test Plan: CI green

Reviewed By: malfet, albanD

Differential Revision: D20153582

fbshipit-source-id: 18fd61c466fd1f55ededdae4448b3009a9cedc04
2020-03-02 13:37:19 -08:00
Zino Benaissa
cab8772c6c Freezing Torchscript modules (#32178)
Summary:
This patch enables folding GetAttr nodes with their corresponding
values. The _jit_pass_freeze_module API returns a new TorchScript module
where all function calls and attribute accesses are inlined.
Usage:

```
frozen_model = torch._C._freeze_module(scripted_model._c)
frozen_model.forward(...)
```
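
A runnable sketch of the flow above (the module here is illustrative):
```python
import torch

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.rand(3, 3))

    def forward(self, x):
        return x @ self.weight

scripted_model = torch.jit.script(M())
# freeze: inline calls and fold attributes into the forward graph
frozen_model = torch._C._freeze_module(scripted_model._c)
print(frozen_model.forward(torch.rand(3, 3)))
```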

This API currently optimizes the forward method. We will follow up to
preserve and optimize methods and attributes that are annotated as
torch.jit.interface.

Several future improvements to JIT optimizations are required to fully
clean up / de-sugar the graph and eliminate redundancies.
Ideally, we want to produce a graph that can easily be lowered to
GLOW and other low-level backends.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32178

Differential Revision: D19419640

Pulled By: bzinodev

fbshipit-source-id: 52baffaba9bca2cd60a8e747baa68d57711ad42b
2020-03-02 11:38:36 -08:00
Basil Hosmer
ad769d74d9 Collapse _like overloads into a single overload. (#33705)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33705

The fact that there were two overloads appears to be a historical
artifact that dates back to when goldsborough originally added these
bindings in the first place.  If TensorOptions is made optional,
then you only need one overload, not two, as they are exactly redundant
with each other. When MemoryFormat was added, this became a little
harder to do, as the C++ syntax at::empty_like(t, memory_format) would
not work if you collapsed the overload; but now it works because TensorOptions
supports MemoryFormat.

The upshot is, I can get rid of all the overloads and just have one overload.
Amazingly, this change is backwards compatible, as the test attests.  While
I was at it, I also deleted the overload name from the functions entirely.
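
For illustration, a sketch of the user-visible behavior from the Python side, where the single overload now covers both cases (the tensors are illustrative):
```python
import torch

t = torch.randn(2, 3)
a = torch.empty_like(t)                                         # options defaulted
b = torch.empty_like(t, memory_format=torch.contiguous_format)  # memory format via options
```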

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D20073355

Pulled By: bhosmer

fbshipit-source-id: c6a8908213b32ccf6737ea864d135e2cce34f56b
2020-03-01 19:40:22 -08:00
Jiakai Liu
7f7ea685c0 Revert D18672405: Use codegen'ed unboxing wrappers
Test Plan: revert-hammer

Differential Revision:
D18672405

Original commit changeset: bf2a7056082d

fbshipit-source-id: b7ef1529fc266b4856e49e4dbd1fe8c7ba3d455d
2020-02-29 15:27:54 -08:00
Jiakai Liu
3acfccafbb Revert D20172782: Fix mobile build
Test Plan: revert-hammer

Differential Revision:
D20172782

Original commit changeset: e4bfca2a6076

fbshipit-source-id: 3093efd4a135f8d6c3174887ad1e3362aad9aa7c
2020-02-29 15:21:07 -08:00
Jiakai Liu
595445e889 Revert D20178827: Fix mobile build
Test Plan: revert-hammer

Differential Revision:
D20178827

Original commit changeset: 980ac3d1ab3d

fbshipit-source-id: 9af6cb319e80c9b6a916bbdeffd69920075c7aec
2020-02-29 15:04:35 -08:00
Jiakai Liu
c596ec7eb3 [pytorch] update code analyzer script to cover new c10::Module::def API (#33975)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33975

Currently the code analysis script doesn't go beyond the scope of the
registration API call, i.e. calling registration via a wrapper will not
be covered by the analysis - and the new API is essentially a
wrapper around the old API.

Simply adding the new API signature to the registration API pattern
solves the problem for now. We might need to change the analyzer code if
things change significantly in the future.

Test Plan:
- update test project to use the new API;
- run analyzer against pytorch codebase;

Differential Revision: D20169549

Pulled By: ljk53

fbshipit-source-id: c7925fa0486eee18f07e791a38c32152fee59004
2020-02-29 10:29:45 -08:00
Sebastian Messmer
5a8562a6af Fix mobile build (#34000)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34000

-
ghstack-source-id: 99241400

Test Plan: liujiakai

Differential Revision: D20178827

fbshipit-source-id: 980ac3d1ab3d47c12613c20ee9b8dc7d083f56a9
2020-02-28 23:28:00 -08:00
Sebastian Messmer
6e70b2da62 Fix mobile build (#33985)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33985

This was broken by https://github.com/pytorch/pytorch/pull/32521 but only showed up in master CI builds
ghstack-source-id: 99220995

Test Plan: CI

Differential Revision: D20172782

fbshipit-source-id: e4bfca2a6076f1bc1c562fca9c7dfcb156bfbf3e
2020-02-28 18:43:18 -08:00
Ailing Zhang
de55e47a4b Pass all ops to XLA with additional info about whether it's compound (#33908)
Summary:
This PR prepares us to allow XLA to use `XLAPreAutograd` to override compound ops.
To do this we'll need to pass all ops, with additional information about whether each op is compound or not, for XLA to parse.
Companion PR: https://github.com/pytorch/xla/pull/1698
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33908

Differential Revision: D20149585

Pulled By: ailzhang

fbshipit-source-id: a93140e8a34548fcabcea454386d15df58177c1d
2020-02-28 18:17:23 -08:00
Ailing Zhang
69d2741480 Add list of view ops to public doc. (#32560)
Summary:
This PR comes from a discussion with albanD in https://fb.quip.com/npBHAXaPfnbu. The main goal is to clarify how view ops differ from general out-of-place/in-place ops and remind users about the difference.
For reference, this information was previously only available in code, which is internal and hard to find. Also, changes to this list actually affect users, so we think it's better to expose it as public information. It's also helpful for new backends like XLA when implementing PyTorch ops. 19bbb4fccb/tools/autograd/gen_autograd.py (L32-L68)
Please feel free to comment!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32560

Differential Revision: D20161069

Pulled By: ailzhang

fbshipit-source-id: b5f1fd4353fe7594a427784db288aeb5a37dc521
2020-02-28 15:05:55 -08:00
Sebastian Messmer
3c5677a676 Use codegen'ed unboxing wrappers (#32521)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32521

Not all ops support the templated unboxing wrappers yet. For the ones that don't,
let's use the codegen'ed unboxing wrappers from register_aten_ops.cpp, but register
them with c10 directly instead of JIT.

The `use_c10_dispatcher` setting in `native_functions.yaml` now has a new option, 'with_codegenerated_unboxing_wrapper', which means we take the codegened unboxing wrapper from register_aten_ops.cpp and stuff it into c10. This new option is the default; 'unboxed_only' is not the default anymore. For the (very few) ops that don't support boxed dispatch yet (i.e. ops taking TensorOptions arguments), we set them to 'unboxed_only' and they follow the old behavior of having register_aten_ops.cpp register the jit op.

Next steps here are (1) to make TensorOptions work with boxed dispatch and remove the `unboxed_only` option from `use_c10_dispatcher`, so that all ops go through the new path and (2) make the new path template-only and remove codegen from it (see https://github.com/pytorch/pytorch/issues/32366).

First experiments show that:
- For a small JITted model that calls add (i.e. an op with just two arguments that are both tensors) on two tensors in a loop, we see a 2-4% performance improvement (~35-50ns) compared to the old path. This is a simple op that takes two tensor arguments and no non-tensor arguments, so iterating over it in boxed dispatch is cheap.
- For a small JITted model that calls avgpool1d (i.e. an op that has one tensor arg and 5 non-tensor args) on a tensor in a loop, we see a 3-4% performance regression (~60ns) compared to the old path. Unboxed dispatch doesn't have to look at the non-tensor arguments, but boxed dispatch still needs to iterate over them.

This performance difference is likely due to boxed dispatch iterating over all arguments in a loop and unboxed dispatch not having to look at non-tensor arguments.
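
A minimal sketch of the kind of microbenchmark described above (the loop count and timing method are illustrative, not the exact benchmark used):
```python
import time
import torch

@torch.jit.script
def add_loop(a: torch.Tensor, b: torch.Tensor, n: int) -> torch.Tensor:
    # a tight loop over a two-tensor op, so dispatch overhead dominates
    for _ in range(n):
        a = a + b
    return a

x, y = torch.rand(8), torch.rand(8)
add_loop(x, y, 100)  # warm-up so compilation/profiling cost is excluded
start = time.perf_counter()
add_loop(x, y, 100000)
print((time.perf_counter() - start) / 100000, "s per iteration (rough)")
```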

ghstack-source-id: 99161484

Test Plan: unit tests that call existing ops through JIT

Differential Revision: D18672405

fbshipit-source-id: bf2a7056082dfad61e7e83e9eeff337060eb6944
2020-02-28 14:48:25 -08:00
lixinyu
d66c320b10 disable leaky_relu_ backward calculation with negative slope (#33639)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33639

Test Plan: Imported from OSS

Differential Revision: D20045735

Pulled By: glaringlee

fbshipit-source-id: b3becf30a8fe9ee178792bd88f6ee10102504ed5
2020-02-27 18:54:57 -08:00
Michael Suo
dbe850af5b [jit] do the code reorg (#33851)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33851

Rationale and context described in #33828.

Script to reproduce the move:
https://gist.github.com/suo/16cbefaaeb67ca5a7c6caffd49b7f6e9
ghstack-source-id: 99079645

Test Plan: Make sure CI passes

Reviewed By: jamesr66a

Differential Revision: D20133869

fbshipit-source-id: 390e9241a9c85366d9005c492ac31f10aa96488e
2020-02-27 13:02:51 -08:00
JeongUkJae
b10761d890 fix type stub errors (#33762)
Summary:
I've been using PyTorch with type hints, and I found errors that can be easily fixed, so I'm creating this PR to fix these type bugs.

I expected the code below to type-check without any errors.

```python
import torch
from torch.nn import Linear
from torch.autograd import Variable
from torch.optim import AdamW
from torch.utils import hooks

# nn.Module should have training attribute
module = Linear(10, 20)
module.training

# torch should have dtype bfloat16
tensor2 = torch.tensor([1,2,3], dtype=torch.bfloat16)

# torch.Tensor.cuda should accept int or str value
torch.randn(5).cuda(1)
torch.tensor(5).cuda('cuda:0')

# optimizer should have default attribute
module = Linear(10, 20)
print(AdamW(module.weight).default)

# torch.Tensor should have these boolean attributes
torch.tensor([1]).is_sparse
torch.tensor([1]).is_quantized
torch.tensor([1]).is_mkldnn

# Size class should be a tuple of ints
a, b = torch.tensor([[1,2,3]]).size()

# check modules can be accessed
torch.nn.parallel
torch.autograd.profiler
torch.multiprocessing
torch.sparse
torch.onnx
torch.jit
torch.hub
torch.random
torch.distributions
torch.quantization
torch.__config__
torch.__future__

torch.ops
torch.classes

# Variable class's constructor should return Tensor
def fn_to_test_variable(t: torch.Tensor):
    return None

v = Variable(torch.tensor(1))
fn_to_test_variable(v)

# check RemovableHandle attributes can be accessed
handle = hooks.RemovableHandle({})
handle.id
handle.next_id

# check torch function hints
torch.is_grad_enabled()
```

But the current master branch raises errors (I checked with pyright):

```
$ pyright test.py
Searching for source files
Found 1 source file
test.py
  12:45 - error: 'bfloat16' is not a known member of module
  15:21 - error: Argument of type 'Literal[1]' cannot be assigned to parameter 'device' of type 'Optional[device]'
  'int' is incompatible with 'device'
  Cannot assign to 'None'
  16:22 - error: Argument of type 'Literal['cuda:0']' cannot be assigned to parameter 'device' of type 'Optional[device]'
  'str' is incompatible with 'device'
  Cannot assign to 'None'
  23:19 - error: Cannot access member 'is_sparse' for type 'Tensor'
  Member 'is_sparse' is unknown
  24:19 - error: Cannot access member 'is_quantized' for type 'Tensor'
  Member 'is_quantized' is unknown
  25:19 - error: Cannot access member 'is_mkldnn' for type 'Tensor'
  Member 'is_mkldnn' is unknown
  32:7 - error: 'autograd' is not a known member of module
  33:7 - error: 'multiprocessing' is not a known member of module
  34:7 - error: 'sparse' is not a known member of module
  35:7 - error: 'onnx' is not a known member of module
  36:7 - error: 'jit' is not a known member of module
  37:7 - error: 'hub' is not a known member of module
  38:7 - error: 'random' is not a known member of module
  39:7 - error: 'distributions' is not a known member of module
  40:7 - error: 'quantization' is not a known member of module
  41:7 - error: '__config__' is not a known member of module
  42:7 - error: '__future__' is not a known member of module
  44:7 - error: 'ops' is not a known member of module
  45:7 - error: 'classes' is not a known member of module
  60:7 - error: 'is_grad_enabled' is not a known member of module
20 errors, 0 warnings
Completed in 1.436sec
```

The list below is not flagged as errors, but I think these are errors too:

* `nn.Module.training` is not boolean
* return type of `torch.Tensor.size()` is `Tuple[Unknown]`.

 ---

related issues.

https://github.com/pytorch/pytorch/issues/23731, https://github.com/pytorch/pytorch/issues/32824, https://github.com/pytorch/pytorch/issues/31753
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33762

Differential Revision: D20118884

Pulled By: albanD

fbshipit-source-id: 41557d66674a11b8e7503a48476d4cdd0f278eab
2020-02-27 06:58:53 -08:00
Pavel Belevich
095de1e872 Migrate random_ from the TH to Aten (CPU and CUDA) (#33663)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33663

Test Plan: Imported from OSS

Differential Revision: D20056350

Pulled By: pbelevich

fbshipit-source-id: f9859b79ffdec70c48d6ee3ec70fd6fad593a9f5
2020-02-27 05:05:42 -08:00
Shihao Xu
9733711394 [JIT] Support calling Tensor.element_size() in TorchScript (#33808)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33808

# Problem

https://github.com/pytorch/pytorch/issues/33620
ghstack-source-id: 99073701
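
A minimal sketch of the call this enables inside TorchScript (the function is illustrative):
```python
import torch

@torch.jit.script
def element_size_of(t: torch.Tensor) -> int:
    # Tensor.element_size() is now a valid call in TorchScript
    return t.element_size()

print(element_size_of(torch.zeros(2, dtype=torch.float32)))  # 4
```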

Test Plan:
```
buck test mode/dev-nosan //caffe2/test:jit -- test_numel

buck test mode/dev-nosan //caffe2/test:jit -- test_element_size

buck build mode/dev-nosan //caffe2/test:jit \
&& buck-out/gen/caffe2/test/jit\#binary.par -r test_numel

buck build mode/dev-nosan //caffe2/test:jit \
&& buck-out/gen/caffe2/test/jit\#binary.par -r test_element_size
```

Compile error

P126667043

Generated code,
```
buck-out/dev/gen/caffe2/generate-code=register_aten_ops_0.cpp/register_aten_ops_0.cpp

buck-out/dev/gen/caffe2/generate-code=register_aten_ops_2.cpp/register_aten_ops_2.cpp
```
P126667064

Differential Revision: D7050644

fbshipit-source-id: 20dbdb9c500b6d7683c23e3049d43ed0ca06d831
2020-02-26 22:30:44 -08:00
Emilio Castillo
a836c4ca78 Skip manual backward for cdist with case p=2 (#31167)
Summary:
Fixes an issue with the `cdist` backward calculation for large inputs in the Euclidean case.

The grid size when launching the kernel exceeded the 2^16 limit for the second dimension, resulting in `RuntimeError: CUDA error: invalid configuration argument`

Code to reproduce:

```
import torch

h, w, d = 800, 1216, 12
n = 133
A = torch.randn(n, d).cuda()
B = torch.randn(h, w, d).cuda()
A.requires_grad = True
B.requires_grad = True

B = B.reshape(-1, d).contiguous()
dist = torch.cdist(A, B)
loss = dist.sum()
loss.backward()
```

Thanks to tkerola for the bug report, the reproduction, and for suggesting a solution.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31167

Differential Revision: D20035605

Pulled By: ngimel

fbshipit-source-id: ae28ba4b549ee07a8bd937bb1de2438dc24eaa17
2020-02-25 18:19:30 -08:00
Edward Yang
0e74cbcc54 Revert "Revert "Revert D19975411: Remove special case codegen for tril_indices/triu_indices." (#33572)" (#33742)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33742

This reverts commit 90f4c5695e.

Test Plan: Imported from OSS

Differential Revision: D20095103

Pulled By: ezyang

fbshipit-source-id: ff47dae21c278570b4ca497d76deedb75823d6d7
2020-02-25 12:09:49 -08:00
Jeong Ukjae
819ca2c285 add bfloat16 conversion method in type stub (__init__.pyi) (#33747)
Summary:
Resolves https://github.com/pytorch/pytorch/issues/33699

`torch/__init__.pyi` will be generated like:

```python
# TODO: One downside of doing it this way, is direct use of
# torch.tensor.Tensor doesn't get type annotations.  Nobody
# should really do that, so maybe this is not so bad.
class Tensor:
    requires_grad: _bool = ...
    grad: Optional[Tensor] = ...

    # some methods here...

    @overload
    def bernoulli_(self, p: _float=0.5, *, generator: Generator=None) -> Tensor: ...
    def bfloat16(self) -> Tensor: ...
    def bincount(self, weights: Optional[Tensor]=None, minlength: _int=0) -> Tensor: ...

    # some methods here...
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33747

Differential Revision: D20090316

Pulled By: ngimel

fbshipit-source-id: b9ce4c0d4ef720c94ccac0a0342a012e8cf3af0c
2020-02-25 08:49:47 -08:00
Jeong Ukjae
fd175fa8a2 fix bugs in gen_pyi.py (#33748)
Summary:
This loop should generate type hints for the in-place binary operator methods (the `binop` variable) but had been using the `name` variable. That's why the wrong type hints were generated.

Resolves https://github.com/pytorch/pytorch/issues/33698

 ---

The current `__init__.pyi` has these type hints:

```python
class Tensor:

    # some code here...

    @overload
    def zeros_like_(self, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def zeros_like_(self, value: Number, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def zeros_like_(self, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...
    @overload
    def zeros_like_(self, value: Number, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...
    @overload
    def zeros_like__(self, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def zeros_like__(self, value: Number, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def zeros_like__(self, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...
    @overload
    def zeros_like__(self, value: Number, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...
    @overload
    def zeros_like___(self, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def zeros_like___(self, value: Number, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def zeros_like___(self, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...
    @overload
    def zeros_like___(self, value: Number, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...
    @overload
    def zeros_like____(self, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def zeros_like____(self, value: Number, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def zeros_like____(self, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...
    @overload
    def zeros_like____(self, value: Number, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...

    # some code here...
```

But `__init__.pyi` should contain these type hints instead:

```python
class Tensor:

    # some code here...

    @overload
    def add_(self, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def add_(self, value: Number, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def add_(self, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...
    @overload
    def add_(self, value: Number, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...

    # some code here...

    @overload
    def div_(self, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def div_(self, value: Number, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def div_(self, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...
    @overload
    def div_(self, value: Number, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...

    # some code here...

    @overload
    def mul_(self, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def mul_(self, value: Number, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def mul_(self, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...
    @overload
    def mul_(self, value: Number, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...

    # some code here...

    @overload
    def sub_(self, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def sub_(self, value: Number, other: Union[Tensor, Number]) -> Tensor: ...
    @overload
    def sub_(self, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...
    @overload
    def sub_(self, value: Number, other: Union[Tensor, Number], *, out: Optional[Tensor]=None) -> Tensor: ...

    # some code here...
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33748

Differential Revision: D20090444

Pulled By: ngimel

fbshipit-source-id: e4a5dd08126629ec4c54b630a87ee540e669ec9a
2020-02-25 08:45:19 -08:00
Mikhail Zolotukhin
bf00b4d305 [TensorExpr] Add a boilerplate pass for future TensorExpr fusion pass. (#33464)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33464

I added a Python-exposed knob to register this pass in the custom passes pipeline. If the knob is not used, the pass is not registered and thus not run at all.

Differential Revision: D19958217

Test Plan: Imported from OSS

Pulled By: ZolotukhinM

fbshipit-source-id: fecdd98567fcda069fbdf8995c796899a3dbfa5c
2020-02-24 18:47:31 -08:00
Yanli Zhao
5090d7082b add propagate flag USE_DISTRIBUTED for libtorch_python_source
Reviewed By: pritamdamania87

Differential Revision: D20070789

fbshipit-source-id: fdb8a2eefb5bfc1ae1d80e29bd15eb1d70920c87
2020-02-24 16:02:47 -08:00
Pavel Belevich
312627a7c3 Revert D19776613: Migrate random_ from the TH to Aten (CPU)
Test Plan: revert-hammer

Differential Revision:
D19776613

Original commit changeset: a8d262bccf5f

fbshipit-source-id: 36389ffa3d8377743f55f97221d7a7ee25a409f6
2020-02-22 08:15:27 -08:00
Pavel Belevich
d971007c29 Migrate random_ from the TH to Aten (CPU) (#32534)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32534

Fixes #24752
Fixes #32510

Test Plan: Imported from OSS

Differential Revision: D19776613

Pulled By: pbelevich

fbshipit-source-id: a8d262bccf5f2807f6125c83080aa16d77491b19
2020-02-21 16:13:58 -08:00
Edward Yang
a72946dbab Stop generating out full function type for registration, use decltype or infer it (#33097)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33097

Previously, we had to specify full types because the functions we were registering
might be overloaded, and the type was necessary to resolve the ambiguity. I
disambiguate all of these names by mangling the names of the methods we
place on CPUType/CUDAType/TypeDefault with the overload name (these are
*internal* wrappers which are not user visible), and can then strip
the generation of full function types from the registration.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D19837898

Pulled By: ezyang

fbshipit-source-id: 5f557184f6ec84cb0613d4eb2e33b83fd1712090
2020-02-21 14:26:14 -08:00
Edward Yang
22963f42ec Delete unnecessary aliasAnalysis specification from operator registrations. (#33093)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33093

In #30187 the aliasAnalysis field on operator registration was updated
so that alias analysis could be specified in only some registration call
sites, rather than requiring it be consistently specified in all call
sites.  With this change, we can eliminate the requirement that all
registrations specify aliasAnalysis; as long as we know *one* site
specifies the correct aliasAnalysis, we don't have to specify it at
any of the other sites.

In this patch, the "one site" is TypeDefault.cpp (previously we only
generated these stub declarations for manually registered functions,
but now we generate the stubs for everything).  Then I delete aliasAnalysis
anywhere we register an op for an existing function (which is a lot
of places).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D19837897

Pulled By: ezyang

fbshipit-source-id: 26a7fbc809ec1553da89ea5c0361f3e81526d4c2
2020-02-21 14:24:44 -08:00
Mikhail Zolotukhin
bb5181b716 [TensorExpr] Add IR Printer. (#33220)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33220

Test Plan: Imported from OSS

Differential Revision: D19848379

Pulled By: ZolotukhinM

fbshipit-source-id: 1c6ab4f63080d4506dedc3c47938de92fb4bfba2
2020-02-21 13:10:26 -08:00
Mikhail Zolotukhin
fc70fc3610 [TensorExpr] Add IR visitor, IR mutator, and IR evaluator. (#33219)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33219

Test Plan: Imported from OSS

Differential Revision: D19848381

Pulled By: ZolotukhinM

fbshipit-source-id: 44ca7cd99c25e290a8ffd8146785c19f9c785dfd
2020-02-21 13:10:22 -08:00
Mikhail Zolotukhin
49af9425a7 [TensorExpr] Add core classes for representing expressions and statements. (#33218)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33218

Test Plan: Imported from OSS

Differential Revision: D19848378

Pulled By: ZolotukhinM

fbshipit-source-id: 48399f8651324d5ad0607e08573d5d7b2026bb23
2020-02-21 13:10:17 -08:00
Mikhail Zolotukhin
1a4f997178 [TensorExpr] Add a class for representing data type. (#33217)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33217

Test Plan: Imported from OSS

Differential Revision: D19848380

Pulled By: ZolotukhinM

fbshipit-source-id: d8683f8fc4555d2456cd2a7c827d8e8231915b49
2020-02-21 13:10:12 -08:00
Mikhail Zolotukhin
089d658153 [TensorExpr] Add classes for memory management in tensor expressions. (#33216)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33216

All tensor expressions belong to a kernel arena and are freed when the
arena is destroyed. Until it is destroyed, all expressions stay valid.
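
As a toy sketch of the ownership model described above (illustrative Python, not the C++ implementation):
```python
# expressions register with an arena and stay valid until the arena is destroyed
class KernelArena:
    def __init__(self):
        self._nodes = []

    def register(self, node):
        # the arena, not the caller, controls the expression's lifetime
        self._nodes.append(node)

class Expr:
    def __init__(self, arena):
        arena.register(self)

arena = KernelArena()
e = Expr(arena)  # 'e' remains valid for as long as 'arena' exists
```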

Test Plan: Imported from OSS

Differential Revision: D19848382

Pulled By: ZolotukhinM

fbshipit-source-id: a581ea2b635b9ba2cc53949616a13d8d3a47caae
2020-02-21 13:08:50 -08:00
Nathan Goldbaum
fa80299bdf __torch_function__ overrides for torch.functional and torch.nn.functional (#32799)
Summary:
This adds `__torch_function__` support for all functions in `torch.functional` and `torch.nn.functional`.

The changes to C++ code and codegen scripts are to facilitate adding `__torch_function__` support for the native functions in `torch._C._nn`. Note that I moved the `handle_torch_function` C++ function to a header that both `python_torch_functions.cpp` and `python_nn_functions.cpp` include. The changes to `python_nn_functions.cpp` mirror the changes I made to `python_torch_functions.cpp` when `__torch_function__` support was first added in https://github.com/pytorch/pytorch/issues/27064. Due to the somewhat different way the `torch._C` and `torch._C._nn` namespaces are initialized I needed to create a new static reference to the `torch._C._nn` namespace (`THPNNVariableFunctions`). I'm not sure if that is the best way to do this. In principle I could import these namespaces in each kernel and avoid the global variable but that would have a runtime cost.

I added `__torch_function__` support to the Python functions in `torch.nn.functional` following the approach in https://github.com/pytorch/pytorch/issues/32194.

I re-enabled the test that checks if all functions in the `torch` namespace are explicitly tested for `__torch_function__` support. I also generalized the check to work for `torch.functional` and `torch.nn.functional` as well. This test was explicitly disabled in https://github.com/pytorch/pytorch/issues/30730 and I'm happy to disable it again if you think that's appropriate. I figured now was as good a time as any to try to re-enable it.

Finally I adjusted the existing torch API tests to suppress deprecation warnings and add keyword arguments used by some of the code in `torch.nn.functional` that were missed when I originally added the tests in https://github.com/pytorch/pytorch/issues/27064.
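
As a brief sketch of the override protocol being extended here (using the protocol's present-day classmethod signature, which may differ slightly from the signature at the time of this PR; the wrapper class is hypothetical):
```python
import torch

class TensorWrapper:
    def __init__(self, t):
        self._t = t

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        if kwargs is None:
            kwargs = {}
        # unwrap our wrapper, call the real torch function, rewrap the result
        args = [a._t if isinstance(a, TensorWrapper) else a for a in args]
        return TensorWrapper(func(*args, **kwargs))

# with this PR, functions in torch.nn.functional dispatch through the protocol too
out = torch.nn.functional.relu(TensorWrapper(torch.tensor([-1.0, 2.0])))
print(out._t)  # tensor([0., 2.])
```
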
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32799

Differential Revision: D19956809

Pulled By: ezyang

fbshipit-source-id: 40d34e0109cc4b9f3ef62f409d2d35a1d84e3d22
2020-02-21 08:38:37 -08:00
Edward Yang
90f4c5695e Revert "Revert D19975411: Remove special case codegen for tril_indices/triu_indices." (#33572)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33572

This reverts commit 687a7e4a25.

Original PR #33305

Reland with BC tests whitelisted. See https://github.com/pytorch/pytorch/issues/33580 for reasoning why this change is not actually BC breaking.

Test Plan: Imported from OSS

Differential Revision: D20011011

Pulled By: ezyang

fbshipit-source-id: 116374efc93af12b8ad738a0989d6f0daa9569e2
2020-02-21 08:36:32 -08:00
Michael Suo
0bde610c14 Re-sync with internal repository (#33591)
2020-02-20 16:46:16 -08:00
Vitaly Fedyunin
687a7e4a25 Revert D19975411: Remove special case codegen for tril_indices/triu_indices.
Test Plan: revert-hammer

Differential Revision:
D19975411

Original commit changeset: 996598759bed

fbshipit-source-id: 6bdb4b8f903e13815fc146e6f3260e5bb04c1045
2020-02-20 11:29:53 -08:00
Edward Yang
196fda5a79 Remove special case codegen for tril_indices/triu_indices. (#33305)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33305

The current TensorOptions code is written to extract TensorOptions
based on an exact struct match, including default arguments.
That meant that tril_indices/triu_indices, which had a different
default argument, didn't match and thus needed a special case.

I resolve this special case by instead replacing the explicit long
default argument with a None default argument, and then adjusting
the actual implementations to select the correct dtype when none
was specified.  I think the general rule I'm following here is that
it is always acceptable to replace an explicit default argument
with a None argument (assuming the backend will compute it appropriately);
the documentation gets modestly worse, but everything that was
previously expressible continues to be expressible.  Maybe later
we should switch the default argument back to long, but for now
the simplification in code is worth it.
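
A sketch of the user-visible behavior this preserves (the dtype values are illustrative):
```python
import torch

# with dtype left unspecified (None), the backend still selects torch.long
idx = torch.tril_indices(3, 3)
assert idx.dtype == torch.long

# an explicit dtype continues to be honored
idx32 = torch.tril_indices(3, 3, dtype=torch.int32)
assert idx32.dtype == torch.int32
```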

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D19975411

Pulled By: ezyang

fbshipit-source-id: 996598759bed9e8d54fe61e19354ad038ed0e852
2020-02-20 09:34:28 -08:00
Edward Z. Yang
883b18ea70 Delete build_variables.bzl following configerator change.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2020-02-20 10:26:49 -05:00
Vitaly Fedyunin
ea514c819a Make slow_conv_transpose2d_backward tensors contiguous (#33462)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33462

Test Plan: Imported from OSS

Differential Revision: D19956516

Pulled By: VitalyFedyunin

fbshipit-source-id: 4fa9dcba0dd02b891ab36e6ecee8fc59e049c15c
2020-02-19 16:44:14 -08:00
albanD
8908b62fb2 Clean views created inside no_grad that are modified inplace (#32839)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32839

As mentioned in the updated comment in `variable.h`, this disambiguates code like:
```python
import torch

var = torch.rand((), requires_grad=True)
base = torch.rand(10, requires_grad=True)
with torch.no_grad():
    view = base[1]
view.copy_(var)
torch.autograd.grad(base.sum(), var)  # <- what should it return?
```
Given that there is no consensus on what should happen here (does the gradient flow through the view created in the no_grad or not?), this special case is detected and forbidden.
As mentioned in the error message:
- If you want it to be tracked: move both out of the no_grad
- If you do not want them to be tracked: move both inside the no_grad
(a sketch of both patterns follows)
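
A sketch of the two accepted patterns (shapes are illustrative):
```python
import torch

var = torch.rand((), requires_grad=True)

# tracked: create the view and modify it outside no_grad
base = torch.rand(10, requires_grad=True).clone()  # non-leaf, so in-place is allowed
view = base[1]
view.copy_(var)
torch.autograd.grad(base.sum(), var)  # gradient flows through the view

# not tracked: create the view and modify it inside no_grad
base2 = torch.rand(10, requires_grad=True)
with torch.no_grad():
    view2 = base2[1]
    view2.copy_(var)  # nothing is recorded by autograd
```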

This implies that any custom Function that returns views does not allow in-place modification of its output. I'll add a PR to the stack to relax this to a DeprecationWarning for now, and we will make it an actual error in 1.6.

This replaces https://github.com/pytorch/pytorch/pull/26607
cc sublee

Test Plan: Imported from OSS

Differential Revision: D19814114

Pulled By: albanD

fbshipit-source-id: ff2c9d97c8f876d9c31773a2170e37b06d88bed7
2020-02-19 14:55:53 -08:00
Michael Suo
20c1e25832 Re-sync with internal repository (#33519)
2020-02-19 14:33:44 -08:00
Gregory Chanan
d7f00b1b45 Remove using declaration from widely-used header file. (#33293)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33293

Test Plan: Imported from OSS

Differential Revision: D19904992

Pulled By: gchanan

fbshipit-source-id: b5ac76db2e5cdb422671c6c5424858e1d97c323e
2020-02-19 08:19:11 -08:00