Commit Graph

685 Commits

Author SHA1 Message Date
David Riazati
a23863fd6f Add Pooling modules to Script (#14527)
Summary:
Depends on #14584
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14527

Differential Revision: D13270773

Pulled By: driazati

fbshipit-source-id: e4acd43ccbce0f4b62d41c30ce8d5c721171e19a
2018-12-03 23:55:04 -08:00
David Riazati
d429e78a9a Add fractional_max_pool2d to standard lib
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14591

Differential Revision: D13270755

Pulled By: driazati

fbshipit-source-id: 138a60256795f5ef8d236c75be2cfd929059b98f
2018-12-03 23:49:38 -08:00
Michael Suo
95e5a5ae0c basic testing of builtin alias annotations (#14588)
Summary:
Check whether the codegen'd alias annotations actually track alias creation and writes correctly. This could be made more exhaustive, but it's good enough for now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14588

Differential Revision: D13312653

Pulled By: suo

fbshipit-source-id: 98de1610ea86deada71957c75c222fff331a0888
2018-12-03 22:31:02 -08:00
Wanchao Liang
119f9ec291 enable NoneValue parameter assignment for WeakScriptModule (#14715)
Summary:
This PR:

1. Handles None-valued attributes in the WeakScriptModuleProxy
2. Adds back module tests that are now passing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14715

Differential Revision: D13313573

Pulled By: wanchaol

fbshipit-source-id: a6b7892707350290a6d69b6f6270ad089bfc954b
2018-12-03 20:40:55 -08:00
Zachary DeVito
bb546b2e5b WAR for self.training (#14719)
Summary:
To enable self.training in script modules, this PR automatically adds a buffer called 'training' if a script method requests self.training. Assignment to self.training is overloaded to assign both to the boolean property and the tensor value.
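
For illustration, a minimal hedged sketch (not part of this commit) of the kind of usage this enables; the module and values are hypothetical:
```python
import torch

class ScaleInTraining(torch.jit.ScriptModule):
    @torch.jit.script_method
    def forward(self, x):
        # self.training is backed by the automatically-added 'training' buffer
        if self.training:
            return x * 0.5
        return x

m = ScaleInTraining()
m.eval()  # flips the underlying training flag
print(m(torch.ones(2, 2)))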
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14719

Differential Revision: D13310569

Pulled By: zdevito

fbshipit-source-id: 406387bb602f8ce5794eeff37642863c75928be5
2018-12-03 20:32:16 -08:00
Zachary DeVito
78d594f46c Implement Device as a type in the script (#14666)
Summary:
[ note: stacked on expect file changes, will unstack once they land ]
This adds DeviceObjType (we cannot use DeviceType since it is already an enum)
to the type hierarchy and an isDevice/toDevice pair to IValue.
Previous hacks which used an int[] to represent Device are removed
and at::Device is used instead.

Note: the behavior of .to is only a subset of Python's; we need to
fix the aten op so that it accepts Optional[Device] and Optional[ScalarType].
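
For illustration, a hedged sketch (not part of this commit) of a Device value appearing in a scripted function; as noted above, .to coverage in script was still a subset of Python's at the time:
```python
import torch

@torch.jit.script
def move_to_cpu(x):
    # torch.device(...) produces a Device-typed value inside the script
    return x.to(torch.device("cpu"))

print(move_to_cpu(torch.randn(2)).device)
```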
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14666

Reviewed By: suo

Differential Revision: D13290405

Pulled By: zdevito

fbshipit-source-id: 68b4381b292f5418a6a46aaa077f1c902750b134
2018-12-03 16:54:40 -08:00
Wanchao Liang
4b31572375 Meta programming on If Stmt cond to enable conditional emit blocks (#14533)
Summary:
This PR is part of the task to unblock standard library export. Basically, we want to enable the ability to meta-program the If statement to dynamically emit different branches based on `cond`. This is primarily used to disable compilation of certain branches of an If, like the example below:

```python
import torch

class Test(torch.jit.ScriptModule):
    def __init__(self, b=None):
        super(Test, self).__init__()
        self.b = b

    @torch.jit.script_method
    def forward(self, input):
        x = input
        # this branch is only compiled when self.b is not None
        if self.b is not None:
            x = self.b(input)
        return x

Test()(torch.randn(2, 3))
```
This is also the first step for us to bridge the gap between the None simple value and other sugared values in the JIT.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14533

Differential Revision: D13310526

Pulled By: wanchaol

fbshipit-source-id: 78d1a8127acda5e44d2a8a88f7627c43d29ff244
2018-12-03 15:47:15 -08:00
Michael Suo
9ac845f734 Revert D13280899: [pytorch][PR] Reduce broadcasted inputs in derivative code
Differential Revision:
D13280899

Original commit changeset: 80cc5ec9331b

fbshipit-source-id: 2335093cca8fd7db95470fd83b9299adfa17aa8e
2018-12-03 14:55:02 -08:00
Lu Fang
e0f68671bd Restore device when import jit script module (#14454)
Summary:
We align the restore logic with `torch.load`: we try to restore to the right device, and if the device is not available, an exception is raised. We allow the user to remap the device through a `map_location` parameter; it can be 1) a string like `'cuda:0'` or `'cpu'`, 2) a device, e.g. `torch.device('cpu')`, 3) a dict, e.g. `{'cuda:1': 'cuda:0'}`, or 4) a function whose signature looks like `string map_location(tensor, saved_device_string)`.
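
For illustration, a hedged usage sketch of the remapping described above; `model.pt` is a hypothetical path, and per the summary a dict such as `{'cuda:1': 'cuda:0'}` or a remapping function may also be accepted:
```python
import torch

# Remap everything to CPU when loading a saved script module.
m_cpu = torch.jit.load("model.pt", map_location="cpu")
m_dev = torch.jit.load("model.pt", map_location=torch.device("cpu"))
```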
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14454

Reviewed By: zrphercule

Differential Revision: D13271956

Pulled By: houseroad

fbshipit-source-id: dfd6b6049b0dc07549ddeddf2dea03ac53ba6d49
2018-12-03 14:10:30 -08:00
David Riazati
b8da44dc13 Add linear + pixelshuffle modules to standard lib
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14654

Differential Revision: D13300968

Pulled By: driazati

fbshipit-source-id: 2c36aab91ea99681687f8da6d318981fee49785b
2018-12-03 14:01:16 -08:00
Adam Paszke
68ffe46991 Reduce broadcasted inputs in derivative code (#14485)
Summary:
Previously symbolic AD formulas assumed that no broadcasting happened,
and would return gradients of incorrect shapes (possibly leading to
silent errors later).

Fixes a few bugs (known and unknown):
- #11736
- ArgumentSpec didn't compute the input types correctly [(it didn't advance the offset for non-tensor args)](https://github.com/pytorch/pytorch/pull/14485/files#diff-4fd3157a056596aefb8cdf41022a208bR153)
- Symbolic AD could suffer from use after free (dangling pointers in grad map), because [`EliminateDeadCode` could have removed nodes](https://github.com/pytorch/pytorch/pull/14485/files#diff-25d33ad1ed6855684dec79d927ca6142L781) that referenced gradients of certain values.
- Undefined behavior in `aten::size`

During my tests I've also found a few new problems, and I have opened issues for them:
- FusionGroup seems to think that cat nodes broadcast their inputs (#14483)
- `prim::ConstantChunk` derivative formula doesn't handle undefined inputs (#14484)

This patch unfortunately deoptimizes some of our code (Fusion doesn't happen past chunk nodes, and outputs more tensors only because we have to get their size). I know how to fix those issues, but wanted to fix this terrible bug quickly.

cc zou3519 zdevito ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14485

Differential Revision: D13280899

Pulled By: soumith

fbshipit-source-id: 80cc5ec9331be80e1bb9ddfe85b81c2b997e0b0c
2018-12-03 13:44:18 -08:00
Michael Suo
b768db0810 Allow DCE to clean up some mutable ops (#14601)
Summary:
This PR makes DCE a little smarter in the presence of mutable ops. Previously mutable ops could never be cleaned up, now they can be cleaned up if we can prove there are no live uses of any alias sets that the op writes to.

This behavior is optional; if you pass DCE a block instead of a graph, it will do the same thing as before. Also changed `InlineAutographSubgraph` to use the common subgraph utils.

Tested on traced ResNet, and it gets rid of the dead code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14601

Differential Revision: D13309118

Pulled By: suo

fbshipit-source-id: dac2791e7d2ecf219ae717a2759b83c1e927f254
2018-12-03 13:31:08 -08:00
Michael Suo
9783ce3825 Revert D13272203: [pytorch][PR] [jit] Meta programming on If Stmt cond to enable conditional emit blocks
Differential Revision:
D13272203

Original commit changeset: 44a545abb766

fbshipit-source-id: 8861eb4810a6c9ea4aba8427b3a07d2fa0d69a15
2018-12-03 13:28:52 -08:00
Wanchao Liang
5a2f5a216f Make convertable to list also accepts optional
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14598

Differential Revision: D13308254

Pulled By: wanchaol

fbshipit-source-id: bd0b6f9f20294d3d589cf68732dbd8c57b67e0e9
2018-12-03 13:09:11 -08:00
Wanchao Liang
4b90702037 Meta programming on If Stmt cond to enable conditional emit blocks (#14533)
Summary:
This PR is part of the task to unblock standard library export. Basically, we want to enable the ability to meta-program the If statement to dynamically emit different branches based on `cond`. This is primarily used to disable compilation of certain branches of an If, like the example below:

```python
import torch

class Test(torch.jit.ScriptModule):
    def __init__(self, b=None):
        super(Test, self).__init__()
        self.b = b

    @torch.jit.script_method
    def forward(self, input):
        x = input
        # this branch is only compiled when self.b is not None
        if self.b is not None:
            x = self.b(input)
        return x

Test()(torch.randn(2, 3))
```
This is also the first step for us to bridge the gap between the None simple value and other sugared values in the JIT.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14533

Differential Revision: D13272203

Pulled By: wanchaol

fbshipit-source-id: 44a545abb766bbd39b762a6e19f9ebaa295e324b
2018-12-03 12:14:52 -08:00
Zachary DeVito
4c11dee0e8 Use Type::str() in Type::operator<< (#14657)
Summary:
Stacked on the zip commit because it also changes expect files; read only the last commit.

This reduces the number of ways we can print a Type from 3 (python_str, str, operator<<) to 2.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14657

Differential Revision: D13288912

Pulled By: zdevito

fbshipit-source-id: f8dd610cea798c511c1d4327395bba54b1aa1697
2018-12-01 00:53:27 -08:00
Zachary DeVito
170ff7764f Use a zip archive as our container format (#14521)
Summary:
After consulting with Owen, who pointed out the existence of the miniz library, I decided to take one last shot at using zip as our container format.
miniz makes this surprisingly feasible and I think the benefits of using zip are large enough that we should do it.

This replaces our custom container format with a zip archive, preserving all of the
desirable features of our custom format, such as append-oriented writing, and
mmap'able tensor data while adding a bunch of debugging advantages:

1. You can unzip and explore the container to debug what is going on with a model.
2. You can edit the model using a text editor (e.g. change the definition of a method,
   or edit the json-serialized metadata), re-zip the file using OSX's native 'Compress'
   option, and re-load the result into pytorch. Note: this enables you to, e.g., print-debug
   serialized models.
3. We can easily enable features like compression in the future.
4. Stock Python, without pytorch installed, and other programming languages
   can reasonably consume this format using the json and zipfile packages, which enables
   people to build tools like visualizers without those visualizers depending on pytorch.
   This will be especially useful if you want to, for instance, write a visualizer in javascript
   (see the sketch after these notes).

Notes:

*  This adds miniz (https://github.com/richgel999/miniz) as a dependency. miniz is a self-contained
   library for reading/writing zipfiles that unlike other zip libraries also includes libz
   compatible compress/decompress support. It is a single header and a single C file without
   any other dependencies. Note that the instructions for miniz explicitly state:

   > Please use the files from the releases page in your projects. Do not use the git checkout directly!

   So we have checked in the 'release' source. Miniz supports zip64, and its API is amenable
   to doing zip-align style things to align data.

*  Removes 'size' from RecordRef. This allows you to edit files in the zip archive without
   editing the meta-data file. Very important if you want to print-debug serialized models.

*  PyTorchStreamReader/PyTorchStreamWriter keep mostly the same API (though keys become strings)
   However, their implementation is completely swapped out to use miniz.

*  Code exists to check for the old magic number to give a decent warning to our preview users
   after we change the format.

*  Container version information is now put in a stand-alone 'version' file in the archive
   and serves a similar purpose to the other container version info.

*  All files in the zip archive start at 64-byte boundaries, using an approach similar to
   zip-align. Tests check that this property remains true. While the writer does this,
   the reader doesn't depend on it, allowing user-created archives that can use compression,
   and do not have to align data.

*  Added test to check for > 4GB files and archives. Disabled by default because it takes
   almost 2 minutes to run.

*  torchscript files are now optional: if a submodule does not have methods, it will
   not be written.
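
For illustration, a minimal sketch of point 4 above, assuming a module saved in this zip container at a hypothetical path `model.pt`; it uses only stock Python's `zipfile`:
```python
import zipfile

# List the entries of the archive (tensor data records, json metadata, the
# 'version' file) without importing pytorch at all.
with zipfile.ZipFile("model.pt") as archive:
    for name in archive.namelist():
        print(name)
```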
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14521

Reviewed By: jamesr66a

Differential Revision: D13252945

Pulled By: zdevito

fbshipit-source-id: 01209294c0f6543d0fd716f85a38532249c52f8c
2018-11-30 19:19:29 -08:00
Elias Ellison
404ad939e5 Revert existing no_grad_embedding_renorm_ from aten (#14639)
Summary:
Remove no_grad_embedding_renorm_ from aten. Setting the derivatives of the inputs to false has different semantics from calling with no_grad(), because it will not error if an input is modified and then has its grad accessed.

Instead, make a custom op, and use NoGradGuard.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14639

Differential Revision: D13285604

Pulled By: eellison

fbshipit-source-id: c7d343fe8f22e369669e92799f167674f124ffe7
2018-11-30 16:57:51 -08:00
David Riazati
814b5715ba Move module tests to common_nn (#14578)
Summary:
This moves `new_module_tests` from `test_nn.py` to `common_nn.py` so
that they can be used in `test_jit.py` without running any of
`test_nn.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14578

Differential Revision: D13268286

Pulled By: driazati

fbshipit-source-id: 6e8654a4c29ab754d656ac83820c14d1c1843e03
2018-11-30 12:14:59 -08:00
David Riazati
89c3dbcad8 Add binary cross entropy to standard lib
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14583

Differential Revision: D13269423

Pulled By: driazati

fbshipit-source-id: 7cc1594d8189c3e8f2d4ce0462fdc0a03683006e
2018-11-29 22:23:13 -08:00
James Reed
1975917d0e fix copy_ (#14593)
Summary:
Closes https://github.com/pytorch/pytorch/issues/14590
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14593

Differential Revision: D13272510

Pulled By: jamesr66a

fbshipit-source-id: b6921a98460c371d435277c416dad0b5ab0fec8c
2018-11-29 20:31:53 -08:00
Zachary DeVito
fd31eae9ad Switch import/export to python printing (#14400)
Summary:
Stacked on https://github.com/pytorch/pytorch/pull/14378, only look at the last commit.

This changes the way methods are defined in TorchScript archives to use
PythonPrint rather than ONNX protobufs.

It also updates torch.proto to directly document the tensor data
structure actually being serialized.

Notes:
* because PythonPrint prints all the methods at once per module, this
  removes MethodDef in favor of a single torchscript_area and a separate
  caffe2_graphs entry. Note that NetDef's already have method names,
  so there is no need for a separate method name entry.
* This switches cpp/pickle area to RecordRef (references to a file in
  the container format) since it is possible the data in these arenas
  may be large and not suited to json output.
* Removes 'annotations' -- annotations should be re-added on the first
  commit that actually has a practical use for them. In the current state
  it is unlikely they are representing the right information.
* Some expect files have changed because PythonPrint is preserving more
  debug name information for parameter names.
* MethodEncoder (the ONNX output format) has been deleted. There is still
  some cleanup possible combining EncoderBase and GraphEncode now that there
  is only a single pathway using EncoderBase.
* This incorporates the changes from #14397
  to define TensorDef
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14400

Reviewed By: suo

Differential Revision: D13231800

Pulled By: zdevito

fbshipit-source-id: af5c1152d0bd6bca8b06c4703f59b161bb19f571
2018-11-29 17:53:49 -08:00
David Riazati
666d383a00 Add broadcast list default arg support (#14361)
Summary:
To convert `max_unpool` functions to weak script, this PR adds support
for `T` as default arguments for `BroadcastingListN[T]`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14361

Differential Revision: D13192231

Pulled By: driazati

fbshipit-source-id: a25b75a0e88ba3dfa22d6a83775e9778d735e249
2018-11-29 15:15:47 -08:00
Adam Paszke
31b3d81714 Broadcast prim::FusedConcat inputs independently when checking kernels (#14503)
Summary:
Fixes #14483.

cc zou3519 mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14503

Differential Revision: D13256343

Pulled By: zou3519

fbshipit-source-id: 1c68a23f425be067a742bada7ee8cdfab7fc3fa2
2018-11-29 13:05:00 -08:00
David Riazati
9e93a02624 Use nn module tests in test_jit (#14238)
Summary:
This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types`

Also depends on #14379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238

Differential Revision: D13252887

Pulled By: driazati

fbshipit-source-id: e9638cf74089884a32b8f0f38396cf432c02c988
2018-11-28 23:31:25 -08:00
Elias Ellison
6d63e9dbff Support Embedding + EmbeddingBag in Script + (Ignore flakey test) (#14509)
Summary:
Resubmitting PR #14415

The tests added for Embedding + EmbeddingBag had random numbers as input, which affected the random number generator and caused the flaky test to break.

Everything but the last two commits have already been accepted
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14509

Differential Revision: D13247917

Pulled By: eellison

fbshipit-source-id: ea6963c47f666c07687787e2fa82020cddc6aa15
2018-11-28 19:16:38 -08:00
Elias Ellison
105fa58748 pointwise_loss (#14134)
Summary:
Adding pointwise loss ops to weak_script
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14134

Differential Revision: D13209455

Pulled By: eellison

fbshipit-source-id: 87fc0222121f34a2f4edb24c2da2a11124b097d8
2018-11-28 18:14:38 -08:00
Edward Yang
5f07b33857 Revert D13219647: [pytorch][PR] Support Embedding + EmbeddingBag in Script
Differential Revision:
D13219647

Original commit changeset: c90706aa6fbd

fbshipit-source-id: d189e717ba0773de43d633876bc3a688830a9303
2018-11-28 13:38:58 -08:00
Elias Ellison
7749804099 Support Embedding + EmbeddingBag in Script (#14415)
Summary:
Add support for Embedding and EmbeddingBag in script. Both functions require `with torch.no_grad()`, which we don't have any plans to support in the near future. To work around this, I added an embedding_renorm function without derivatives.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14415

Reviewed By: wanchaol

Differential Revision: D13219647

Pulled By: eellison

fbshipit-source-id: c90706aa6fbd48686eb10f3efdb65844be7b8717
2018-11-28 10:52:30 -08:00
David Riazati
3d98810fbd Revert D13192230: [pytorch][PR] [jit] Use nn module tests in test_jit
Differential Revision:
D13192230

Original commit changeset: 36488960b6c9

fbshipit-source-id: 63b68bd909b9ef0548f52c986c84f549aecb8909
2018-11-28 00:23:09 -08:00
David Riazati
4cdcbbf410 Use nn module tests in test_jit (#14238)
Summary:
This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types`

Also depends on #14379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238

Differential Revision: D13192230

Pulled By: driazati

fbshipit-source-id: 36488960b6c91448b38c0fa65422539a93af8c5e
2018-11-27 21:19:51 -08:00
David Riazati
662f66ebb9 Add poisson_nll_loss to script
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14420

Differential Revision: D13220726

Pulled By: driazati

fbshipit-source-id: 6c08a0050075beafcc8ba413c9603b273870c70c
2018-11-27 19:39:16 -08:00
David Riazati
d75f751bec Add boolean dispatch for function overloading (#14425)
Summary:
This PR allows overloading functions based on the value of a parameter (so long as it is a constant). See max_pool1d for an example usage.

This is the first step in enabling the use of max_pool functions for the standard library that can return `Tensor` or `Tuple[Tensor, Tensor]` based on the `return_indices` flag. This will give the JIT identical results to the Python versions of the functions.

Fixes #14081
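
For illustration, a hedged sketch (not part of this commit) of the behavior this enables: the same `max_pool1d` call returns a `Tensor` or a `(Tensor, Tensor)` tuple depending on the constant `return_indices` flag, and script picks the matching overload:
```python
import torch
import torch.nn.functional as F

@torch.jit.script
def pool(x):
    return F.max_pool1d(x, kernel_size=2)

@torch.jit.script
def pool_with_indices(x):
    return F.max_pool1d(x, kernel_size=2, return_indices=True)

print(pool(torch.randn(1, 4, 8)).shape)
out, idx = pool_with_indices(torch.randn(1, 4, 8))
```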
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14425

Differential Revision: D13222104

Pulled By: driazati

fbshipit-source-id: 8cb676b8b13ebcec3262234698edf4a7d7dcbbe1
2018-11-27 19:36:47 -08:00
Zachary DeVito
23f901a737 fix enable_cpu_fuser
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14440

Differential Revision: D13226354

Pulled By: zdevito

fbshipit-source-id: e4ed023eece8b5b670a4a27d24a8688907b36b90
2018-11-27 19:14:10 -08:00
Elias Ellison
82175f31b4 Move Affine grid to C++ (#14392)
Summary:
Port AffineGrid to C++, because script does not support compiling Function classes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14392

Differential Revision: D13219698

Pulled By: eellison

fbshipit-source-id: 3ddad8a84c72010b5a6c6f7f9712be614202faa6
2018-11-27 18:38:11 -08:00
Zachary DeVito
226a01e5a1 Handling of pretty-printing methods (#14378)
Summary:
Stacked on #14176, review only the last commit.
* Print parameters to methods as self.weight rather than as extra inputs.
* Print entire set of methods out as a single string
* Update test code to test the module-at-a-time export/import
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14378

Differential Revision: D13198463

Pulled By: zdevito

fbshipit-source-id: 3fab02e8239cfd6f40d6ab6399047bd02cf0a8c8
2018-11-27 17:10:23 -08:00
zrphercule
ba6c49cb9c Add test of ONNX_ATEN (#14259)
Summary:
In #14239 we fixed ONNX_ATEN.
In order to make sure of its correctness in the future, we should add a related test case.
We use torch.fmod() to test ONNX_ATEN.
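
For illustration, a hedged sketch (not the PR's actual test) of exporting a model that uses `torch.fmod()` with the ONNX_ATEN operator export type; the model and shapes are hypothetical:
```python
import io
import torch

class Fmod(torch.nn.Module):
    def forward(self, x, y):
        return torch.fmod(x, y)

buf = io.BytesIO()
torch.onnx.export(
    Fmod(), (torch.randn(3, 4), torch.randn(3, 4)), buf,
    operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN,
)
```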
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14259

Differential Revision: D13204610

Pulled By: zrphercule

fbshipit-source-id: e4660c346e5edd201f1458b7d74d7dfac49b94c7
2018-11-27 13:51:51 -08:00
David Riazati
1b80644b4d Revert D13192228: [pytorch][PR] [jit] Add boolean dispatch for function overloading
Differential Revision:
D13192228

Original commit changeset: fce33c400c1f

fbshipit-source-id: 75c9991dc7097f9513c6c89d16eff2de6e287c3b
2018-11-27 13:14:42 -08:00
Michael Suo
3fca4bde50 Trace in-place ops (#14254)
Summary:
This PR adds a `try_outplace` option to the tracer. When `try_outplace` is true, the tracer will attempt to emit out-of-place ops (similar to how things are done today). When it's false, the correct in-place op is emitted.

I made `try_outplace` false by default, but flipped it to true for ONNX export utils. zdevito jamesr66a, anywhere else I should preserve the existing behavior?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14254

Reviewed By: eellison

Differential Revision: D13166691

Pulled By: suo

fbshipit-source-id: ce39fdf73ac39811c55100e567466d53108e856b
2018-11-27 12:40:56 -08:00
Zachary DeVito
e22cc7c072 Print default values and introduce ir view classes (#14176)
Summary:
[Stacked commit, only review the last commit]

This PR adds support for printing default values in python printing as well as the logic
for parsing default values back in using the parser. For simplicity, this PR simply
creates a subgraph of the constant expressions and then runs that graph to generate the defaults.
A more lightweight approach should be possible later, but would require more machinery.

To make reading code in the printer easier, this also adds ir_views.h.
Similar to tree_views.h these classes can provide views of some commonly used IR nodes
that have complicated structure and common operations on that structure.

Currently it has only read-only views for prim::If and prim::Loop,
but we should eventually add helpers to manipulate If/Loop nodes as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14176

Differential Revision: D13198455

Pulled By: zdevito

fbshipit-source-id: dc99ab9692804ccaedb60a55040c0b89ac7a6a6d
2018-11-27 11:48:27 -08:00
Thomas Viehmann
8408dff55a Add Type support to the fuser, fuse more (#14336)
Summary:
This adds scalar type support to the fuser, both internally (instead of auto / assuming float) and for the inputs/outputs.
We can now fuse things with inputs/outputs of arbitrary scalar type; in particular, comparisons and `where` work well. This fixes #13384 by returning tensors of the right type (and adds a test where byte and double tensors are returned).
The type inference is done by re-calling PropagateTensorShapeOnNode during compilation; I would venture that it isn't prohibitively expensive compared to the actual compilation. (Propagation was fixed for `where` to return the second argument's type and amended to handle FusedConcat.)
I'm not sure how to add a check for the code generated by the fuser, but I am not sure we absolutely need to (we'd see if it is invalid / produces wrong results).

Thanks in particular to apaszke, fmassa, mruberry for advice and encouragement! All the errors are my own.

I have discussed order of PRs briefly with mruberry, if this goes in before he submits the PR, he graciously agreed to rebasing his, but I'd happily rebase, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14336

Differential Revision: D13202620

Pulled By: soumith

fbshipit-source-id: 855159e261fa15f21aca3053bfc05fb3f720a8ef
2018-11-27 11:33:11 -08:00
David Riazati
66c8bbf021 Add boolean dispatch for function overloading (#14081)
Summary:
This PR allows to overload functions based on the value of a parameter (so long as it is a constant). See `max_pool1d` for an example usage.

This is the first step in enabling the use of `max_pool` functions for the standard library that can return `Tensor` or `Tuple[Tensor, Tensor]` based on the `return_indices` flag. This will give the JIT identical results to the Python versions of the functions.

Depends on #14232 for `Optional[BroadcastingList[T]]`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14081

Differential Revision: D13192228

Pulled By: driazati

fbshipit-source-id: fce33c400c1fd06e59747d98507c5fdcd8d4c113
2018-11-27 10:51:32 -08:00
Richard Zou
b13f91dbd9 Allow graph fuser to move chunks past multiple nodes. (#14055)
Summary:
Fixes #12290. Also speeds up JIT LSTM forward pass from 8.8ms to 7.8ms; previously, each JIT lstm cell used 2 fused kernels. Now, it only uses one fused kernel (which is how many kernels cudnn uses).

Explanation:

Let f, g, h be fusible ops.
```
x = f(v, w)
z = g(x, y)
a, b = chunk(z)
c = h(a, b)
```
becomes (before this PR):
```
x = f(v, w)
x', y' = broadcast_tensors([x, y])
ax, bx = chunk(x')
ay, by = chunk(y')
a = g(ax, ay)
b = g(bx, by)
c = h(a, b)
```
The graph fuser then puts g, g, and h into one FusionGroup and is unable
to move `x = f(v, w)` into the FusionGroup.

This PR lets the graph fuser move `x = f(v, w)` into the FusionGroup.
It does this by abstracting the broadcast_tensors + multiple chunk nodes
into one intermediate `prim::BroadcastingChunk[chunks, dim]` node.

A `BroadcastingChunk[chunks, dim](*inputs)` node is equivalent to:
- broadcasting all of *inputs
- chunk-ing each broadcasted input into `chunks` chunks along dim `dim`.

Abstracting the broadcasting chunk behavior away, it is now a lot easier
for the graph fuser to move (broadcast + chunk) past an operation. After
this PR, the above graph becomes:
```
x = f(v, w)
ax, bx, ay, by = BroadcastingChunk(x, y)
a = g(ax, ay)
b = g(bx, by)
c = h(a, b)
```
Now, to move `x = f(v, w)` after the BroadcastingChunk, one just needs
to add f's operands to the BroadcastingChunk:
```
ay, by, av, bv, aw, bw = BroadcastingChunk(y, v, w)
ax = f(av, aw)
bx = f(bv, bw)
a = g(ax, ay)
b = g(bx, by)
c = h(a, b)
```

cc apaszke mruberry zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14055

Differential Revision: D13159259

Pulled By: zou3519

fbshipit-source-id: 134e9e645c950384d9be6a06a883a10e17a73d7d
2018-11-26 12:31:49 -08:00
Michael Suo
2fa3c8327c fix tensor advanced indexing with assignment (#14311)
Summary:
Fix a mishandling of `foo[a] = b` when `a` was a tensor. We were assigning to a copy of `foo`, not a view of it.
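
For illustration, a hedged sketch (not part of this commit) of the pattern being fixed: indexing with a tensor on the left-hand side should write into `foo` itself, not into a copy; names and shapes are hypothetical:
```python
import torch

@torch.jit.script
def assign_rows(foo, idx, b):
    # with the fix, this writes through a view of foo
    foo[idx] = b
    return foo

foo = torch.zeros(4, 3)
print(assign_rows(foo, torch.tensor([0, 2]), torch.ones(2, 3)))
```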
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14311

Differential Revision: D13196109

Pulled By: suo

fbshipit-source-id: c929401fda7c4a27622d3fe2b11278b08a7f17f1
2018-11-26 12:10:48 -08:00
Adam Paszke
a60368982b Batch more matrix multiplies (#13456)
Summary:
This handles the input pre-multiplication in RNNs, yielding pretty significant speedups in backward times. This pass depends on loop unrolling, so we'll batch only as many elements as the unrolling factor allows.

cc mruberry ngimel zou3519 zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13456

Differential Revision: D12920339

Pulled By: zou3519

fbshipit-source-id: 5bcd6d259c054a6dea02ae09a9fdf9f030856443
2018-11-26 09:20:35 -08:00
Wanchao Liang
7fc34a4122 Convert gumbel_softmax, lp pooling weak functions and modules (#14232)
Summary:
1. Support `Optional[BroadcastingList1[int]]`-like type annotations, to accept an int or a list[int]
2. Convert the gumbel_softmax and lp pooling weak functions and modules
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14232

Differential Revision: D13164506

Pulled By: wanchaol

fbshipit-source-id: 6c2a2b9a0613bfe907dbb5934122656ce2b05700
2018-11-21 23:44:24 -08:00
David Riazati
d9cdcc9a3b Add list inequality operator (#14129)
Summary:
This PR adds `aten::neq` for list inequality comparisons and converts
`nll_loss` to weak script
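
For illustration, a hedged sketch (not part of this commit) of list inequality inside a scripted function; the values are illustrative:
```python
import torch

@torch.jit.script
def lists_differ():
    # type: () -> bool
    # dispatched to the new list inequality builtin
    return [1, 2, 3] != [1, 2, 4]

print(lists_differ())  # True
```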
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14129

Differential Revision: D13123894

Pulled By: driazati

fbshipit-source-id: 8c1edf7c163217ec00eb653f95d196db3998613f
2018-11-21 16:32:58 -08:00
Zachary DeVito
788d2e87bd Address jittering issues in python_print (#14064)
Summary:
export - print a method with python_print
import - import a method with import_method

We want to ensure:

    export(g) == export(import(export(g)))

That is, after exporting/importing once, the graph will stay exactly
the same. This is less strict than g == import(export(g)), which would
require us to maintain a lot more information about the structure of the
IR and about the names of debug symbols.

This PR addresses this with the following fixes:
* print out double-precision numbers with high enough precision such
  that they always parse in the same way
* when creating loop-carried dependencies, sort them
  by variable name, ensuring a consistent order
* parse nan correctly
* DCE: remove unused outputs of if statements, and loop-carried dependencies
  in loops that are dead both after the loop and inside the body of the
  loop.
* Do not set uniqueName for variables whose names are _[0-9]+, these
  are probably rare in user code, and we need a way to communicate
  that we do not care about a variable name when re-parsing the graph.
  Otherwise temporary variable names will jitter around.
* Expand the definition of a constant in printing code to None,
  and family.
* Allow re-treeing to work as long as the only thing in its way is a
  constant node. These do not have side effects but are sometimes
  inserted in a different order when tracing compared to how we print them.
* Print all constant nodes out first in the order in which they are used
 (or, if they are inlined, ensure they get assigned CONSTANT.cX number
  in a consistent order). Cleanup tuples (this is done in the compiler,
  but not in the tracer, leading to some tuple indexing jitter if not
  done).
* use strtod_l, not std::stod which can throw exceptions

Other:
* Add REL_WITH_DEB_INFO to setup.py. It already existed for the
  cmake files. Threading it into setup.py allows us to turn on
  debug symbols with optimization everywhere.
* enable round trip testing for all generated graphs. This only adds
  ~6 seconds to total build time but tests printing for every graph.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14064

Differential Revision: D13094637

Pulled By: zdevito

fbshipit-source-id: 0a1c6912194d965f15d6b0c6cf838ccc551f161d
2018-11-21 06:38:29 -08:00
David Riazati
8f20d40bb7 Allow undefined tensors as constants (#14120)
Summary:
This PR inserts `prim::None` constants for undefined tensors. This comes up in the standard library if an `Optional[Tensor]` is statically determined to be `None`:

```python
@torch.jit.script
def fn(x=None):
    # type: (Optional[Tensor]) -> Tensor
    return torch.jit._unwrap_optional(x)

@torch.jit.script
def fn2():
    # type: () -> Tensor
    return fn()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14120

Differential Revision: D13124625

Pulled By: driazati

fbshipit-source-id: 9eaa82e478c49c503f68ed89d8c770e8273ea569
2018-11-20 16:54:27 -08:00
Wanchao Liang
d6bfc53b9e Export BatchNorm functional and module, add necessary JIT support (#14016)
Summary:
This PR did three things:

1. It exports the BatchNorm functional and module, and rewrites some of the components to stay aligned with the currently supported JIT features
2. In the process of exporting, it adds the necessary compiler support for in-place op augmented assignment
3. It changes the test_jit behavior in add_module_test to use a single rng state during module initialization
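
For illustration, a hedged sketch (not part of this commit) of what the export enables: a BatchNorm submodule used from a script method; the module and sizes are hypothetical:
```python
import torch

class Net(torch.jit.ScriptModule):
    def __init__(self):
        super(Net, self).__init__()
        self.bn = torch.nn.BatchNorm2d(8)

    @torch.jit.script_method
    def forward(self, x):
        return self.bn(x)

print(Net()(torch.randn(2, 8, 4, 4)).shape)
```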
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14016

Differential Revision: D13112064

Pulled By: wanchaol

fbshipit-source-id: 31e3aee5fbb509673c781e7dbb6d8884cfa55d91
2018-11-20 14:15:06 -08:00
Thomas Viehmann
1256cbaa69 Relax limits for gradients in test_jit's checkGraph (#14094)
Summary:
- This should help TestJit.test_lstm_fusion_concat_cuda
  to be less flaky. (Checked on manual_seed 0..99)
  Fixes: #14026
- Revert the renaming of test_fused_abs that was introduced
  to game the order of tests to avoid the flakiness above.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14094

Differential Revision: D13100174

Pulled By: soumith

fbshipit-source-id: 91bb63b07a960a81dddfc0bf25c67696c0f6c46d
2018-11-16 11:43:52 -08:00
David Riazati
0d29846d5e Convert more weak functions (#14003)
Summary:
Same deal as #13707
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14003

Differential Revision: D13076403

Pulled By: driazati

fbshipit-source-id: eb3cb3b2c31caf1de591b613bdc4c9a6ed4e1767
2018-11-15 16:45:50 -08:00
Zachary DeVito
0573169e23 Import a method from an python_print string (#13959)
Summary:
* Add hooks to get a callback whenever a valid graph is produced in the compiler or through tracing. These hooks can be used to pretty_print and then reparse every graph our tests produce to check that the serialization function works correctly. Currently this is guarded by an environment variable since there are a few remaining failures.
* Fix printing bugs: True and False rather than 1 and 0, print 0. for floating point zero
* Change behavior of NoneType. It is now no longer a subtype of Optional but instead implicitly converts to it, returning a prim::Node with an Optional[T] type for some specific T. This allows functions like `_unwrap_optional` to correctly match against a None while still deriving the right type.
* Fix a bug where empty blocks did not correctly emit "pass" in the printer.
* Fix a bug where prim::Undefined sometimes cannot be printed as None because it is being used in a schema-less op. This should be fixable once Optional[T] always uses the same None object.
* Other minor printing bugs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13959

Reviewed By: jamesr66a

Differential Revision: D13073519

Pulled By: zdevito

fbshipit-source-id: 4167a6b614f2e87b4d21823275a26be5ba4fc3dd
2018-11-15 16:11:37 -08:00
Thomas Viehmann
c7e0db140e use fabs instead of absf in fuser code for aten::abs (#13985)
Summary:
absf didn't work for CUDA

Fixes: #13971
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13985

Differential Revision: D13084601

Pulled By: soumith

fbshipit-source-id: 0027ee719ae2b6a2bfce9c26f21db9c5e6159686
2018-11-15 13:23:59 -08:00
Xiang Gao
143ba72264 Move cosine_similarity to ATen (#12199)
Summary:
I'm now traveling and don't have access to a good computer to compile and test by myself. We'll see the outcome of CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12199

Differential Revision: D13062326

Pulled By: nairbv

fbshipit-source-id: 85873525caa94906ccaf2c739eb4cd55a72a4ffd
2018-11-14 10:41:44 -08:00
Zachary DeVito
30676bdcd3 Finish up TODOs in python printer (#13879)
Summary:
* Correctly adds annotate when needed for lists
* Parser/Emitter handles octal escapes so we do not fail for some strings.
* more complete keyword list in pretty printer
* floating point numbers are always printed with a decimal to ensure
  we never mistake them in parsing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13879

Differential Revision: D13037860

Pulled By: zdevito

fbshipit-source-id: f09ab174fc33402a429b21a5bfaf72e15c802cad
2018-11-13 16:39:46 -08:00
Elias Ellison
f649d8b3a9 add floordiv and bitwise ops
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13873

Reviewed By: driazati, wanchaol

Differential Revision: D13033709

Pulled By: eellison

fbshipit-source-id: df7edee0f790038fb2a806d20640ad25c70b50eb
2018-11-13 16:32:22 -08:00
David Riazati
5163a28917 Convert more weak functions (#13707)
Summary:
Convert some more functions to match up with features added. Some
conversions were unsuccessful but the type line was left in for later.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13707

Differential Revision: D13030210

Pulled By: driazati

fbshipit-source-id: 02d5712779b83b7f18d0d55539e336321335e0cc
2018-11-13 13:50:57 -08:00
David Riazati
53bc5fb043 Support nn.Sequential in script (#13889)
Summary:
This PR makes weak modules in `nn.Sequential` get properly compiled
when used
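
For illustration, a minimal hedged sketch (not part of this commit) of a weak-module `nn.Sequential` called from a script method; the layer sizes are hypothetical:
```python
import torch

class Net(torch.jit.ScriptModule):
    def __init__(self):
        super(Net, self).__init__()
        self.layers = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU())

    @torch.jit.script_method
    def forward(self, x):
        # the weak modules inside the Sequential get compiled when used here
        return self.layers(x)

print(Net()(torch.randn(2, 4)).shape)
```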
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13889

Differential Revision: D13039559

Pulled By: driazati

fbshipit-source-id: d3266305f0e206b2a19b63230ac2ab8f02faa603
2018-11-13 13:48:58 -08:00
Elias Ellison
686e83223f add ops between float & int, and change list equality output to be a boolean
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13793

Reviewed By: wanchaol

Differential Revision: D13010872

Pulled By: eellison

fbshipit-source-id: 2c8248f30b51eab1a87290711f99b7ceb6df2009
2018-11-12 14:39:47 -08:00
David Riazati
0c375571f5 Support OptionalType export and type match (#13647)
Summary:
* Adds `OptionalType` support for import/export
    * Optionals get exported along with their contained type, i.e. 'Optional[int]'
* Allows concrete types and `None` to be passed to an op that takes an optional
* Converts `softmax`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13647

Differential Revision: D12954672

Pulled By: driazati

fbshipit-source-id: 159e9bfb7f3e398bec3912d414c393098cc7455a
2018-11-12 12:15:25 -08:00
Zachary DeVito
aef9e76283 Get pretty printer ready for use as a serialization format (#13616)
Summary:
Get pretty printer ready for use as a serialization format

This PR adds a bunch of functionality to the pretty printer (now called python_printer to reflect
the fact that it will be used to output valid python source). The idea is to get the printer
ready for use as serialization format.  This PR does not have tests beyond what the pretty
printer already had. PRs stacked on this one will do round-trip export/import to test this functionality more robustly.

Notes:
* PythonPrinter is an evolution of the original pretty printer. However, much of it has changed so it is best just to
  read it as a new implementation. Trying to correlate it to the original implementation is probably not much help.
* The printer tries to get reasonably close to how the original function was likely written, such as
  writing expressions rather than making intermediates when possible. We may decide to turn this off
  for the actual serialization, but it is useful for pretty printing.
* tensor field access was changed so that prim::device and family have schema
* fixed a bug in the compiler where setUniqueName gets called even when a value already has one.
  this sometimes assigned really poor names to graph inputs
* Graph::insert gains an optional range argument to make range-preserving inserts easier.
* prim:: ops that can have schema now have schema. This is because when we parse them back in,
  we will need the schema to correctly set their output types.
* there is code in the python printer to complain if you try to add a prim op and do not update the printer.
* BuiltinModule is generalized to take an operator namespace and a version number for work in future commits.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13616

Reviewed By: goldsborough

Differential Revision: D13008252

Pulled By: zdevito

fbshipit-source-id: 32b33bc6410d6ca1c6f02bd6e050f8d5eea32083
2018-11-12 10:21:30 -08:00
Wanchao Liang
79ceecec8e Optional undefined tensor support (#13650)
Summary:
This PR is a part of task to unblock standard library export.
* We treat None differently from Tensor and other types: when passing None as a Tensor, it's an undefined tensor rather than the None IValue.
* Refine the type system so that we have a correct tensor type hierarchy (Dynamic/Tensor/CompleteTensor); Dynamic should be at the top of the inheritance hierarchy.
* It also tries to export bilinear as an example of undefined tensor (None) input.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13650

Differential Revision: D12967026

Pulled By: wanchaol

fbshipit-source-id: 6aedccc7ce2a12fadd13d9e620c03e1260103a5a
2018-11-09 11:29:57 -08:00
Thomas Viehmann
9ffabcfcaa Use nested variant of getValueTrace to allow more flexible tracing script modules (#13597)
Summary:
When tracing scripted functions, we used to only allow Tensor arguments.
This enables tracing script modules with List[Tensor] or Tuple[Tensor, Tensor] arguments (passing
tuples).

Fixes: #13566
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13597

Differential Revision: D12990464

Pulled By: soumith

fbshipit-source-id: fdce3afcb1e09f3c26d6ce834c01bf18d261f47c
2018-11-09 06:24:02 -08:00
James Sun
dca3c2c60f Save and execute futures in a task queue (#13212)
Summary:
Upon calling wait(), save the forked thread and the current thread to a
task queue. An idling thread (which currently is single-threaded) should
pick a ready task and run until there is nothing in the task queue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13212

Differential Revision: D12884522

Pulled By: highker

fbshipit-source-id: b3942a0ee63c148e05f5f41bdc73007fa3c3368e
2018-11-09 01:46:35 -08:00
Zachary DeVito
44fb23a2f5 Add ability to annotate jit types inside function (#13752)
Summary:
This adds torch.jit.annotate for annotating the type of an intermediate.
This is Py2/3 compatible, e.g.:

```
import torch
from torch.jit import annotate
from typing import List

@torch.jit.script
def foo():
  a = annotate(List[int], [])
```

This is needed to output valid python programs from our IR. It removes
the need for the empty list constructors.

A future patch can add support to the C++ parser and Python 3,
via desugaring:

```
a : int = b
a = annotate(int, b)
```

But this functionality is not required for serialization so is not added in this patch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13752

Differential Revision: D12989885

Pulled By: zdevito

fbshipit-source-id: 161573a7352094543dc0d33a892f2a3b9103d847
2018-11-08 20:25:00 -08:00
James Reed
85bde3801b Tracer now records Python variable names (#13441)
Summary:
This is probably slow but it should make the traces more understandable and make debugging easier. Any suggestions for how to make it faster (i.e. make it so we don't have to traverse all of locals() and globals()) would be appreciated
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13441

Differential Revision: D12879763

Pulled By: jamesr66a

fbshipit-source-id: b84133dc2ef9ca6cfbfaf2e3f9106784cc42951e
2018-11-08 13:08:42 -08:00
David Riazati
556ff8e7b7 Add builtins for size() and list with defaults (#13639)
Summary:
* `aten::size()` to match `torch.Tensor.size`
* `aten::list_with_default` for semantics of `torch.nn.modules.utils.list_with_default`
* converts `adaptive_avg_pool2d` and `adaptive_avg_pool3d`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13639

Differential Revision: D12954670

Pulled By: driazati

fbshipit-source-id: 68c30af0efc02c60af5fb8c9715b2435cc01a0d9
2018-11-08 11:26:35 -08:00
David Riazati
4472ad3b2f Move functional _Reduction to its own module (#13401)
Summary:
To support `_Reduction` in the JIT, this PR moves it out to a new file so that it goes through the paths for Python modules in the script compiler, and converts `F.ctc_loss` to weak script.

Depends on #13484 for saving rng state
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13401

Differential Revision: D12868501

Pulled By: driazati

fbshipit-source-id: 23cec0fb135744578c73e31ac825e238db495d27
2018-11-08 01:04:10 -08:00
Michael Suo
21991c05a9 Support assignment to subscripted lhs expr (#13486)
Summary:
Support things like `foo[0] = bar` in script.
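
For illustration, a hedged sketch (not part of this commit) of subscripted assignment in script; the list and values are illustrative:
```python
import torch
from typing import List

@torch.jit.script
def set_first(xs):
    # type: (List[int]) -> List[int]
    xs[0] = 42  # assignment to a subscripted lhs
    return xs

print(set_first([1, 2, 3]))
```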
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13486

Differential Revision: D12964550

Pulled By: suo

fbshipit-source-id: 3dda8ffd683d1b045787c65bfa0c7d43b0455658
2018-11-07 23:07:57 -08:00
Zachary DeVito
c8bb665b5d Fix a bug in tuple assignment (#13656)
Summary:
Previously, we did not distinguish between `a = b` (simple assignment),
and `a, = b` (tuple destructuring of a singleton tuple).

The second case would fail in the string frontend, and would not unpack
in the python frontend. This patch fixes both issues and also cleans up
the error reporting for unexpected expressions on the LHS.

Will likely conflict with #13486
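
For illustration, a hedged sketch (not part of this commit) contrasting singleton-tuple destructuring with simple assignment:
```python
import torch

@torch.jit.script
def unpack_one():
    b = (5,)
    a, = b  # destructures the singleton tuple, unlike plain a = b
    return a

print(unpack_one())  # 5
```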
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13656

Differential Revision: D12964566

Pulled By: zdevito

fbshipit-source-id: 992b19e5068aef59a78cd23cb0e59a9eeb7755d1
2018-11-07 16:44:22 -08:00
Peter Goldsborough
9403eddce4 Fix tracing bug for custom ops (#13654)
Summary:
Due to a logic bug, tracing is broken for custom ops. Unfortunately, there also weren't any tests for tracing custom ops.

The fix is a single line change of moving `pop(stack, std::get<Is>(arguments)...);` before `node = getTracedNode<Is...>(schema, arguments);`. Other changes are added tests and improved commenting/formatting.

Fixes https://github.com/pytorch/pytorch/issues/13564

CC fmassa

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13654

Differential Revision: D12952887

Pulled By: goldsborough

fbshipit-source-id: 87d256576f787c58e8d8f5c13a0fecd0ec62a602
2018-11-07 09:22:44 -08:00
Gregory Chanan
7341ab0a33 Fix range of target examples and JIT test case for CTC loss.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13644

Differential Revision: D12949733

Pulled By: gchanan

fbshipit-source-id: 1c4cacbb6a50d5002165bdd0a7881883db5c8249
2018-11-07 07:04:31 -08:00
Alex Şuhan
a132a7d9ce Add autodiff support for a few additional operators (#13288)
Summary:
Added aten::{avg_pool2d, log_softmax, max_pool2d_with_indices, threshold},
enabled aten::{expand, view}.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13288

Differential Revision: D12954929

Pulled By: soumith

fbshipit-source-id: 6fba58af82cafbc7446705d8c8145cdeaf4954ca
2018-11-06 23:24:12 -08:00
David Riazati
dbc467545f Update weak script modules to match fns (#13631)
Summary:
Add weak modules for those that use weak script functions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13631

Differential Revision: D12945328

Pulled By: driazati

fbshipit-source-id: 6cb235763bf5ab35c7b32e0f734f08d22418594f
2018-11-06 21:22:52 -08:00
Elias Ellison
6cf450744f propagate python op error msg (#13624)
Summary:
Correctly propagate the error msg from a python op to the JIT interpreter. In the interpreter we wrap the exception and re-throw it as a Runtime Exception. Potentially in a future diff we can throw the same type of python exception as was originally thrown.

Fix for https://github.com/pytorch/pytorch/issues/13560
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13624

Differential Revision: D12948756

Pulled By: eellison

fbshipit-source-id: 94cdf4c376143c5e40dcb9716aefb3c1e2d957db
2018-11-06 16:28:39 -08:00
Elias Ellison
137150be88 add unwrap optional operator (#13599)
Summary:
Add a builtin to refine the type of Optional[T] -> T. This is a short-term solution to unblock porting of the standard library.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13599

Reviewed By: driazati, wanchaol

Differential Revision: D12943193

Pulled By: eellison

fbshipit-source-id: 31c893a78d813313bbbc1d8212b5c04e403cfb4d
2018-11-06 11:54:56 -08:00
Soumith Chintala
a7ee632dff Various Test and build fixes (#13556)
Summary:
- fixes weights-contiguous requirement for THCUNN Convolutions
- Add tests that conv backward pass works for non-contiguous weights
- fix RNN tests / error messages to be consistent and pass
- relax weight grad precision for fp16 for a particular test
- fix regression of CMAKE_PREFIX_PATH not passing through
- add missing skipIfNoLapack annotations where needed

Differential Revision: D12918456

Pulled By: soumith

fbshipit-source-id: 8642d36bffcc6f2957800d6afa1e10bef2a91d05
2018-11-06 07:13:47 -08:00
David Riazati
fc6a9a19ea Add torch._C._nn built-in, more weak fns (#13322)
Summary:
This PR adds functions defined in `torch._C._nn` as builtin functions (including inplace variants). This allows for the conversion of more functions to weak script

NB: many `torch.nn.functional` functions will have to be slightly rewritten to avoid early returns (as with `threshold` in this PR)

Converts these functions to weak script:
* `threshold`
* `relu`
* `hardtanh`
* `relu6`
* `elu`
* `selu`
* `celu`
* `leaky_relu`
* `rrelu`
* `tanh`
* `sigmoid`
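
For illustration, a hedged sketch (not part of this commit) of calling two of the converted functions from a scripted function, assuming the current `torch.jit.script` and `torch.nn.functional` APIs:
```python
import torch
import torch.nn.functional as F

@torch.jit.script
def activate(x):
    # both calls resolve to weak-script / builtin functions
    return F.relu(F.hardtanh(x))

print(activate(torch.randn(3)))
```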
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13322

Differential Revision: D12852203

Pulled By: driazati

fbshipit-source-id: 220670df32cb1ff39d120bdc04aa1bd41209c809
2018-11-05 21:02:18 -08:00
Wanchao Liang
af4a228426 Fix erase_number_type pass, negative indices in c2 and some onnx symbolics (#12888)
Summary:
The PR did two things:

1. fix the bug in erase_number_type on node inputs
2. handle negative indices for dim-reduce in caffe2
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12888

Reviewed By: houseroad

Differential Revision: D12833486

Pulled By: wanchaol

fbshipit-source-id: c3ceb400d91f0173b73ad95e392b010c3c14db7d
2018-11-05 19:13:49 -08:00
David Riazati
1969898647 Convert functional dropouts to weak script (#13484)
Summary:
To convert `nn.functional.dropout`
* `_VF` had to be exposed as a Python module so this PR adds a module class to forward to `torch._C._VariableFunctions`
* rng state between calls in the tests needed to be made consistent
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13484

Differential Revision: D12929622

Pulled By: driazati

fbshipit-source-id: 78b455db9c8856b94d2dda573fb7dc74d5784f56
2018-11-05 17:13:07 -08:00
David Riazati
23e3a12d5e Add pass support to script (#13535)
Summary:
This PR adds basic support for `pass` statements
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13535

Differential Revision: D12929529

Pulled By: driazati

fbshipit-source-id: 70c7c52630d46e76366c4caa875d6c5419a1e03f
2018-11-05 17:13:06 -08:00
David Riazati
df67d4180a Validate schema with no returns (#13525)
Summary:
If there is no return type then the returns of the schema are not
checked against the returns in the graph, so this PR adds an error if
that case is detected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13525

Differential Revision: D12929524

Pulled By: driazati

fbshipit-source-id: da562e979482393098830bbded26729a2499152a
2018-11-05 16:51:55 -08:00
Adam Paszke
e988dc621b Stop depending on static analysis of tensor types in graph fuser (#13387)
Summary:
Built on top of #13108, so please review only the last commit.

This makes the graph fuser ignore input types (device/scalar type) when considering graphs for fusion, making it much more robust to shape-prop failures. Those properties are now checked at run time, as part of the kernel validation. This should enable graph fusions in `jit_premul` and `jit_multilayer` timelines in our benchmarks.

One regression is that I've disabled fusions of comparison ops (and `type_as`). That's because there's really no good way to ensure that those are really valid, and are a source of bugs (I filed #13384).

cc ngimel mruberry zdevito zou3519
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13387

Differential Revision: D12888104

Pulled By: zou3519

fbshipit-source-id: c233ea599679c34ac70fb4d8b8497c60aad9e480
2018-11-05 06:32:08 -08:00
Zachary DeVito
86192301b3 Fix a few bugs in format and vararg handling (#13492)
Summary:
There are a couple subtle bugs in the way varargs is implemented:

1. it fails if you pass 0 arguments, because it doesn't handle the case when there are 0 varargs, and because Operator::matches was not updated.
2. it breaks all the name-based lookups on nodes. For instance node->get<int>(attr::value)
   will return a single entry of the varargs if you look it up by name.

Furthermore it complicates some assumptions about the positional arguments (e.g. they used to be
1-to-1 with node inputs but with varargs they are not).

Because varargs are only being used for format, this diff instead
just allows format to take any value as input, regardless of type. It just provides a way to set is_vararg
from the schema but does not restrict the type of the varargs things. This is inline with
the pre-existing behavior for is_vararg so it doesn't require Operator::matches changes.

This also keeps format in line with how print works, and is closer to the python implementation of format. Note that the implementation
of format already worked with arbitrary IValues so restricting to strings was just making it more conservative than needed.

This also fixes the implementation of format to work when there are 0 arguments or text before and after a format string, where it would not print things.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13492

Differential Revision: D12896989

Pulled By: zdevito

fbshipit-source-id: 21425bac8edc81709030a7408180494edea0a54b
2018-11-02 00:07:00 -07:00
Michael Suo
5fbaf0eaf8 add augmented assignment ops (#13364)
Summary:
This PR changes the compiler to correctly emit in-place operators for augmented assignments (`+=` and friends).
- To better match the Python AST structure, add an `AugAssign` tree view and make `Assign` apply only to `=` assignments.
- Emit those `AugAssign` exprs in the compiler, dispatching to in-place aten ops for tensors and lowering to simple assignments for scalar types.
- In order to preserve (suspect) ONNX export semantics, add a pass to lower the in-place operators to out-of-place operators.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13364

Differential Revision: D12899734

Pulled By: suo

fbshipit-source-id: bec83be0062cb0235eb129aed78d6110a9e2c146
2018-11-02 00:01:07 -07:00
Wanchao Liang
0fd176fea4 Add operator is, not, is not to script (#13336)
Summary:
As titled, this PR is part of the tasks to unblock exporting the standard library.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13336

Differential Revision: D12888912

Pulled By: wanchaol

fbshipit-source-id: 6213a17a75a593ae45999994fd9562f29b7d42df
2018-11-01 16:55:28 -07:00
Elias Ellison
421f3f3e52 add npair builtins (#13473)
Summary:
Add npair builtins to unblock the standard library. As with broadcasting lists, the only occurrences are with ints/floats.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13473

Differential Revision: D12890844

Pulled By: eellison

fbshipit-source-id: c360bb581d0f967cb51b858b6f964c300992d62a
2018-11-01 15:42:52 -07:00
Elias Ellison
edc6d721e0 fix flake (#13463)
Summary:
fix flake on test/test_jit.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13463

Differential Revision: D12886532

Pulled By: eellison

fbshipit-source-id: 1cd2a736663d5037bb4bdcd1d8ca1f201cf6a1cf
2018-11-01 13:39:39 -07:00
David Riazati
99ce499bfe Revert D12852205: [pytorch][PR] [jit] Add str() builtin
Differential Revision:
D12852205

Original commit changeset: 3e0e9218afdf

fbshipit-source-id: 114b4873504109394fe9d489200d39764ecc638e
2018-11-01 12:48:48 -07:00
David Riazati
8f2bc1bc56 Add str() builtin (#13278)
Summary:
Allow casting to string from any IValue type
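
A small usage sketch (assuming the current script frontend, not code from the PR):

```python
import torch

@torch.jit.script
def report(x):
    # str() turns the integer element count into a string for concatenation
    return "numel=" + str(x.numel())

print(report(torch.ones(2, 2)))  # numel=4
```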
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13278

Differential Revision: D12852205

Pulled By: driazati

fbshipit-source-id: 3e0e9218afdf27569da3ebf155f25e77e9f12984
2018-11-01 12:01:50 -07:00
Elias Ellison
70db53661b expose fixed length list argument (#13142)
Summary:
Arguments have an optional fixed length list field which allows either a list or a single element that will be broadcast to a fixed length.

This PR exposes that as a denotable argument, mostly to cover the many instances in which this is used in the standard library. It appears in the standard library with ints & floats. Since this is not really a pattern we want to promote moving forward, I did not expose this for booleans or tensors.

We could consider making the optional static length part of the list type, instead of the argument, which would make some of this code much nicer.
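
An illustrative sketch (not from the PR, and assuming `F.max_pool2d` is scriptable): `kernel_size` is declared as a fixed-length int list in the schema, so the single int 2 is broadcast to (2, 2) when the call is compiled:

```python
import torch
import torch.nn.functional as F

@torch.jit.script
def pool(x):
    # the int 2 is broadcast to the fixed-length list [2, 2]
    return F.max_pool2d(x, 2)

print(pool(torch.randn(1, 1, 4, 4)).shape)  # torch.Size([1, 1, 2, 2])
```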
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13142

Differential Revision: D12876047

Pulled By: eellison

fbshipit-source-id: e7359d2a878b4627fc2b9ebc090f9849ee524693
2018-11-01 10:34:52 -07:00
Elias Ellison
a5b627a0bf add assert statements (#13408)
Summary:
Adding assert statements to unblock standard library.

The same limitations that apply to the existing implementation of Exceptions apply to this as well
(No control-flow logic, & we ignore the specific Exception thrown).
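
A minimal sketch of an assert in script within the limitations above (the message surfaces, the specific exception type does not):

```python
import torch

@torch.jit.script
def positive_mean(x):
    assert x.numel() > 0, "expected a non-empty tensor"
    return x.mean()

print(positive_mean(torch.arange(4.0)))  # tensor(1.5000)
```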
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13408

Reviewed By: driazati

Differential Revision: D12876451

Pulled By: eellison

fbshipit-source-id: 767ba5a50ba7c5dd6a857ed4845ac076a81cf305
2018-11-01 10:01:07 -07:00
David Riazati
f9c0a08eed Fix len() for tensors (#13398)
Summary:
Fixes #13376: `len(tensor)` was converting the tensor to a 1-element list and returning 1 every time.
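
A sketch of the intended behaviour after the fix (len of a tensor is the size of its first dimension):

```python
import torch

@torch.jit.script
def first_dim(x):
    return len(x)

print(first_dim(torch.zeros(5, 3)))  # 5, not 1
```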
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13398

Differential Revision: D12867630

Pulled By: driazati

fbshipit-source-id: 28f3580a072d763df0980b3149c49d1894842ec9
2018-10-31 13:13:21 -07:00
David Riazati
404f8660e7 Add string.format() (#13157)
Summary:
This PR adds `aten::format` as a builtin op for strings with the basic formatting semantics of Python.

It also adds varargs to the schema parser (with the limitation that the varargs item is the last argument, i.e. `(*args, **kwargs)` is not supported) and to the compiler
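
A usage sketch (assuming the current script frontend, not code from the PR):

```python
import torch

@torch.jit.script
def describe(x):
    return "a {}x{} tensor with {} elements".format(x.size(0), x.size(1), x.numel())

print(describe(torch.ones(2, 3)))  # a 2x3 tensor with 6 elements
```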
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13157

Differential Revision: D12832537

Pulled By: driazati

fbshipit-source-id: 17c1a5615bb286c648fc9e38f2ebe501b064c732
2018-10-31 12:50:56 -07:00
David Riazati
bc74ec80d0 Add support for torch.backends.cudnn.enabled (#13057)
Summary:
This is used commonly in `nn` functions. This PR adds it as a weak
module (and also alters the conversion of weak modules to strong modules
to accept ordinary `object`s)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13057

Differential Revision: D10846618

Pulled By: driazati

fbshipit-source-id: 028b9f852d40e2e53ee85b93282c98cef8cd336b
2018-10-31 09:31:09 -07:00
Elias Ellison
59f8e8ada7 First step at adding exceptions (#12789)
Summary:
This is a first step towards adding exceptions. We need minimal support in order to begin converting the torch library to weak script mode (which is the main goal here).

Some limitations (that are documented in the tests & compiler):
1. Cannot assign exceptions to variables
2. Any name after raise is being treated as a valid Exception
3. No control flow analysis yet. Below, `a` will be undefined:

```python
if True:
     a = 1
else:
     raise Exception("Hi")
return a
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12789

Differential Revision: D12848936

Pulled By: eellison

fbshipit-source-id: 1f60ceef2381040486123ec797e97d65b074862d
2018-10-30 20:25:50 -07:00
James Reed
7d9ab140bf Fix aten::to symbolic + add expand_as (#13325)
Summary:
https://github.com/pytorch/pytorch/pull/13146 broke some cases of ONNX export, this fixes them
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13325

Differential Revision: D12844294

Pulled By: jamesr66a

fbshipit-source-id: f98dd0685820b2a1e5fcd49733cfa5c19c48a4e7
2018-10-30 17:28:15 -07:00
David Riazati
ac64724ed9 Add support for tuple constants (#13086)
Summary:
Depends on #13072

Adds support for tuples as variables instead of just as literals. Before, tuples would give the error `python value of type 'tuple' cannot be used as a value`. This PR adds a flag on `SugaredValue` to determine if a value is a tuple or not.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13086

Differential Revision: D10846632

Pulled By: driazati

fbshipit-source-id: 7b5d6ae9426ca3dd476fee3f929357d7b180faa7
2018-10-30 09:01:17 -07:00
Richard Zou
8c2d0c831f Speed up tensor.storage_offset (#13267)
Summary:
This PR special cases tensor.storage_offset to avoid dispatches in the
common case. tensor.storage_offset is important for torch.as_strided
performance, because as_strided(sizes, strides) shares an implementation
with as_strided(sizes, strides, storage_offset) and it might not be the
best if there were two separate implementations (including backward
implementations).

This PR reduces times on a tensor.storage_offset
microbenchmark from 22ns to 2ns (these numbers are pretty stable). For
a torch.as_strided benchmark, this PR reduces numbers from 1042 to
928ns, a 100ns improvement, but this number is noisy and goes up and
down.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13267

Reviewed By: ezyang

Differential Revision: D12829828

Pulled By: zou3519

fbshipit-source-id: df907731e2398ce2baf1c8b1860a561ccc456f78
2018-10-30 07:36:21 -07:00
mruberry
955a01562d Removes debug spew in test_jit.py (#13280)
Summary:
Looks like a print() snuck in by accident with a recent PR and it's printing a lot of spew when the tests are run.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13280

Differential Revision: D12833449

Pulled By: michaelsuo

fbshipit-source-id: 5b50fd4b03bb73e5ca44cabdc99609c10017ff55
2018-10-29 18:25:30 -07:00
James Reed
db0b5c7ab7 ArgumentStash for int64_t arguments (#12939)
Summary:
Closes https://github.com/pytorch/pytorch/issues/12906. https://github.com/pytorch/pytorch/issues/12580 is still open because the schema is marked as `traceable=false` in the arg parser constructor, I think.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12939

Differential Revision: D10492031

Pulled By: jamesr66a

fbshipit-source-id: ca5376de3997b5fb62b493e2e6a9bb0d6c3b9687
2018-10-29 13:55:24 -07:00
Elias Ellison
9e6a695116 Add string equality test, string concat (#12992)
Summary:
Adds string equality comparison and concatenation; both are used in the standard library.
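
A minimal sketch exercising both new ops (assuming the `# type:` comment annotation for strings added earlier):

```python
import torch

@torch.jit.script
def greet(name):
    # type: (str) -> str
    if name == "":                  # string equality
        return "hello, stranger"
    return "hello, " + name         # string concatenation

print(greet("jit"))  # hello, jit
print(greet(""))     # hello, stranger
```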
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12992

Differential Revision: D10513681

Pulled By: eellison

fbshipit-source-id: 1f845ef50be7850fdd3366951b20dc2a805c21fd
2018-10-29 10:13:21 -07:00
James Sun
4d62eef505 Add Future to IValue (#12976)
Summary:
Future is now an IValue, and prim::Wait is now replaced by aten::wait.

This PR is built on top of #12925
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12976

Differential Revision: D10861483

Pulled By: highker

fbshipit-source-id: 9e17926a625bc502fb12335ef9ce819f25776be7
2018-10-27 10:00:35 -07:00
Zachary DeVito
dae7616078 Shard all of the tests based on how many tests exist. (#13160)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13160

Reduces pytorch_core build from 2 hours to 30 minutes

Reviewed By: soumith, dzhulgakov

Differential Revision: D10524261

fbshipit-source-id: 97270ac73404b5ea4c264cd0e9d8d4b1be79b0e9
2018-10-26 18:20:34 -07:00
Wanchao Liang
7ca995c815 Add optional default type annotation to support JIT None default value (#13161)
Summary:
As titled, this PR is part of the tasks to unblock exporting the standard library.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13161

Differential Revision: D10866927

Pulled By: wanchaol

fbshipit-source-id: 50038dbe6840b097b98cbed9d46a189a64e82302
2018-10-26 11:38:50 -07:00
Zachary DeVito
ce0d3e9b35 Bind inplace and _out variants into JIT (#13093)
Summary:
This commit is a minimal initial pass at adding inplace and _out variants to the JIT.
It changes gen_jit_dispatch.py to add bindings for these operators, and it also
supplements the FunctionSchema with alias information for these operators and for
viewing operators.

Tests are very minimal and will need to be improved in future commits.

Notes:

* Custom operator tests needed to be changed since _out variants add overloads, which
  the custom operator pipeline does not handle when called from python. This commit
  registers special test ops in the _test namespace for this purpose.
* Extends the schema parser to parse alias annotations more robustly.
* Extends FunctionSchema with `writes()` a set of alias set names that the op will write to,
  and `annotatedType()` which will return AnnotatedType objects which contain the alias_set
  information that was parsed from the schema.
* Disables all optimizations in graph executor when a mutable operator is found. This
  is something that will be improved in the future but is necessary for correctness now.
* Adds annotate_ops to gen_jit_dispatch which adds aliasing information to all of the
  aten ops.
* Adds AnnotatedType to the type hierarchy which is used to mark List and Tensor types
  with their alias_set. These types only appear in schema when you call annotatedType
  and are erased from types in normal use.
* Extends jit::Type with .containedTypes() and .withContained(new_types). The first returns all types contained
  within the type (e.g. T for T[], or {T,L} for a tuple (T, L)). The second constructs a new
  version of the same type, replacing the contained types with new_types. This simplifies
  a lot of logic for recursively cleaning up types.
* Refactor List[T] into a common part that is shared with Annotated[T] and can be shared
  with Optional[T] and Future[T] when they are merged.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13093

Differential Revision: D10848176

Pulled By: zdevito

fbshipit-source-id: d057f23eeb99cde8881129b42d3f151ed5e7655d
2018-10-26 10:37:20 -07:00
Richard Zou
efab8e8fdf Speed up tensor.get_device(), is_cuda(), is_sparse() by avoiding dispatches (#12841)
Summary:
`tensor.get_device()` went through two dispatches: once to the native function `get_device()`, and another when `get_device` calls `_th_get_device()`. This PR avoids the dispatch by directly implementing the `get_device` function as a method on Tensor.

Future Work:
- Investigate caching Device on TensorImpl. This will probably bring the
  tensor.get_device down to 2ns, but I'm not sure it's worth it.

before:
```
------------------------------------------------------------------------
Benchmark                                 Time           CPU Iterations
------------------------------------------------------------------------
BM_TensorTypeId                           0 ns          0 ns 1000000000
BM_TensorType                             8 ns          8 ns   89407911
BM_TensorIsCuda                          24 ns         24 ns   29313017
BM_TensorIsSparse                        27 ns         27 ns   26083160
BM_TensorTypeIsCuda                      11 ns         11 ns   65128120
BM_TensorNumel                           11 ns         11 ns   68314492
BM_TensorGetDevice                       71 ns         71 ns    9633125
BM_DeviceGuardCtor                      173 ns        173 ns    4067173
BM_DeviceGuard                          232 ns        232 ns    3009690
```

after:
```
------------------------------------------------------------------------
Benchmark                                 Time           CPU Iterations
------------------------------------------------------------------------
BM_TensorTypeId                           0 ns          0 ns 1000000000
BM_TensorType                            10 ns         10 ns   69803872
BM_TensorIsCuda                           2 ns          2 ns  321626683
BM_TensorIsSparse                         6 ns          6 ns  177045382
BM_TensorNumel                           12 ns         12 ns   58770533
BM_TensorGetDevice                        4 ns          4 ns  128113396
BM_DeviceGuardCtor                       52 ns         52 ns   14997278
BM_DeviceGuard                          158 ns        158 ns    5767248

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12841

Differential Revision: D10489353

Pulled By: zou3519

fbshipit-source-id: a596bc77352f21d5d35433c6de02c2f65aab5f9e
2018-10-25 19:57:52 -07:00
Wanchao Liang
4e1c64caee Add c10::optional to type syntax (#12582)
Summary:
This PR adds optional type to ATen native, autograd, JIT schema and Python Arg parser, closes #9513. It allows us to use optional default values (including None) for function signatures and implementations like clamp, etc., and also lets us remove the python_default_init hack.

Follow up:

remove python_default_init completely.
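
A small sketch of what the optional machinery enables at the Python surface, using clamp as mentioned above (one bound left as None):

```python
import torch

x = torch.randn(4)
# min is optional; passing None (or omitting it) clamps only from above
print(torch.clamp(x, min=None, max=0.5))
```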
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12582

Differential Revision: D10417423

Pulled By: wanchaol

fbshipit-source-id: 1c80f0727bb528188b47c595629e2996be269b89
2018-10-25 16:08:29 -07:00
David Riazati
14ea4bf0d1 Make 7 nn modules into weak modules (#12966)
Summary:
Depends on #12682 ([stacked diff](https://github.com/driazati/pytorch/compare/weak_mod...driazati:mod_conv1))

* Adds tests for weak module conversion that creates a `ScriptModule` that uses the weak module and checks its graph
* Adds `torch._jit_internal.weak_module` tags to modules that already work
  * `Sigmoid`
  * `Tanh`
  * `Hardshrink`
  * `PReLU`
  * `Softsign`
  * `Tanhshrink`
  * `PairwiseDistance`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12966

Differential Revision: D10559557

Pulled By: driazati

fbshipit-source-id: dc4bea3aa744b3c44d4fa7dceefd97e951f824d0
2018-10-25 13:59:34 -07:00
David Riazati
eac3e7ab7c improve constants error message (#13072)
Summary:
Adds the attribute name to the error message and fixes the corresponding
test to actually run
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13072

Differential Revision: D10846622

Pulled By: driazati

fbshipit-source-id: a7eee6320c28140c4937ede3d4e4685cfce08d84
2018-10-25 10:45:42 -07:00
David Riazati
6727133f3d Support warnings.warn (#12964)
Summary:
`warnings.warn` is used commonly throughout `nn.functional`, so this adds
support for it by forwarding its arguments to `print`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12964

Differential Revision: D10559427

Pulled By: driazati

fbshipit-source-id: 5b591f6f446c906418f9fc7730c17e301f263d9b
2018-10-24 16:48:02 -07:00
Soumith Chintala
cf235e0894 fix lint after new flake8 release added new style constraints (#13047)
Summary:
fix lint after new flake8 release added new style constraints
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13047

Differential Revision: D10527804

Pulled By: soumith

fbshipit-source-id: 6f4d02662570b6339f69117b61037c8394b0bbd8
2018-10-24 09:03:38 -07:00
Elias Ellison
f9b7ce9c99 Add tuple indexing support for constant integers (#11492)
Summary:
Add support for indexing tuples with constant integers by creating a new prim::TupleIndex operator.
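
A minimal sketch of the new capability; the constant index lowers to prim::TupleIndex:

```python
import torch

@torch.jit.script
def second(x):
    pair = (x.sum(), x.mean())
    return pair[1]   # constant integer index into a tuple

print(second(torch.arange(4.0)))  # tensor(1.5000)
```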
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11492

Differential Revision: D9811996

Pulled By: eellison

fbshipit-source-id: a458c2522b3c81476252d920e27a8d6c7b9a036b
2018-10-23 17:52:03 -07:00
David Riazati
af78d4cd49 Add weak script modules (#12682)
Summary:
Adds support for weak script modules created that get compiled to `ScriptModule`s once added as a submodule of a `ScriptModule`:

```python
@weak_module
class Test(torch.nn.Module):
	...
	@weak_script_method
	def forward(self, x):
		...
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12682

Differential Revision: D10458626

Pulled By: driazati

fbshipit-source-id: 10ae23cb83cdafc4646cee58f399e14b2e60acd4
2018-10-23 09:06:02 -07:00
Edward Yang
bc1d96ca98 Add support for inline expect tests. (#12825)
Summary:
expecttest and test_expecttest are the implementation and tests
for this functionality.  I wired it up to the --accept flag,
but there's also a new environment variable EXPECTTEST_ACCEPT
which may be more convenient to trigger.  Haven't tested if this
works in fbcode.

There may be a few expect tests which will benefit from inline
treatment, but I just did one to show it works.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12825

Reviewed By: teng-li

Differential Revision: D10448630

Pulled By: ezyang

fbshipit-source-id: 3d339f82e2d00891309620a60e13039fa1ed8b46
2018-10-22 19:29:04 -07:00
David Riazati
1e8064dec0 Convert 2 nn.functional functions to weak script (#12723)
Summary:
* Moves the `weak_script` annotation to `torch/_jit_internal.py` to resolve a dependency issue between `torch.jit` and `torch.nn`
* Add `torch._jit.weak_script` to `tanhshrink` and `softsign`, their tests now pass instead of giving an `unknown builtin op` error
* Blacklist converted `torch.nn.functional` functions from appearing in the builtin op list if they don't actually have corresponding `aten` ops
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12723

Differential Revision: D10452986

Pulled By: driazati

fbshipit-source-id: c7842bc2d3ba0aaf7ca6e1e228523dbed3d63c36
2018-10-21 14:09:55 -07:00
Elias Ellison
f3e1fe5ca5 add string as supported input / output of script functions (#12731)
Summary:
Add strings to our set of built-in types for annotations. This is used in the functional library.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12731

Differential Revision: D10453153

Pulled By: eellison

fbshipit-source-id: f54177c0c529f2e09f7ff380ddb476c3545ba5b0
2018-10-19 11:17:19 -07:00
Zachary DeVito
87d3d209a6 Enable JIT tests in fbcode (#12777)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12777

Enables JIT tests in FBCode. Changes pybind11 code to avoid mixing py::args with positionally matched arguments because old versions of PyBind11 leak memory in this case.

Reviewed By: jamesr66a

Differential Revision: D10419708

fbshipit-source-id: 74bc466001b5d363132d1af32e96841b38601827
2018-10-18 18:18:37 -07:00
James Sun
f4944f0f8a Rename test/common.py to test/common_utils.py (#12794)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12794

common.py is used in base_module for almost all tests in test/. The
name of this file is so common that it can easily conflict with other dependencies
if they happen to have another common.py in the base module. Rename the file to
avoid conflict.

Reviewed By: orionr

Differential Revision: D10438204

fbshipit-source-id: 6a996c14980722330be0a9fd3a54c20af4b3d380
2018-10-17 23:04:29 -07:00
Sepehr Sameni
cffeb03a2d fix forward and backward for norm with negative infinity norm (#12722)
Summary:
I found a bug in norm() and fixed it (and added tests to make sure it's fixed).
Here is how to reproduce it:
```python
import torch
x = torch.FloatTensor([[10, 12, 13], [4, 0, 12]])
print(torch.norm(x, -40, dim=0, keepdim=True)) #output is tensor([[ 4.0000,  0.0000, 11.9853]])
print(torch.norm(x, float('-inf'), dim=0, keepdim=True)) #output is tensor([[1., 1., 1.]]) which is wrong!
from numpy.linalg import norm as np_norm
x = x.numpy()
print(np_norm(x, ord=-40, axis=0)) #output is array([[4., 0., 11.985261]])
print(np_norm(x, ord=float('-inf'), axis=0)) #output is array([[4., 0., 12.0]])
```
it's related to [#6817](https://github.com/pytorch/pytorch/issues/6817) and [#6969](https://github.com/pytorch/pytorch/pull/6969)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12722

Differential Revision: D10427687

Pulled By: soumith

fbshipit-source-id: 936a7491d1e2625410513ee9c39f8c910e8e6803
2018-10-17 21:07:43 -07:00
Zachary DeVito
c8ac878b98 Fix bug in script for where (#12385)
Summary:
Where is declared as:

```
where(Tensor condition, Tensor self, Tensor other)
```

Previously the compiler assumed that self must be the first argument.
But this is not true in practice for `where` and for a few other exceptions.

This changes the compiler to take an explicit self argument which gets matched
to the `self` that appears in the schema.

Note that this requires renaming a variant of pow, which referred to
an exponent Tensor as `self` because otherwise that would cause `t^3`
to match against `t` being the exponent.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12385

Differential Revision: D10364658

Pulled By: zdevito

fbshipit-source-id: 39e030c6912dd19b4b0b9e35fcbabc167b4cc255
2018-10-16 21:05:14 -07:00
Natalia Gimelshein
a98958d3bd dtype option for softmax (#11719)
Summary:
Add dtype argument to softmax/log_softmax functions.
Computing softmax in fp32 precision is necessary for mixed precision training, and converting output of the previous layer into fp32 and then reading it as fp32 in softmax is expensive, memory and perf-wise, this PR allows one to avoid it.
For most input data/dtype combinations, input data is converted to dtype and then softmax is computed. If input data is half type and dtype is fp32, kernels with the corresponding template arguments are called.
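
A usage sketch (assuming a CUDA device with fp16 support) of the new dtype argument:

```python
import torch
import torch.nn.functional as F

x = torch.randn(8, 16, device="cuda", dtype=torch.half)
# softmax is computed in fp32 without the caller materialising an fp32 copy of the input
probs = F.softmax(x, dim=-1, dtype=torch.float32)
print(probs.dtype)  # torch.float32
```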
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11719

Reviewed By: ezyang

Differential Revision: D10175514

Pulled By: zou3519

fbshipit-source-id: 06d285af91a0b659932236d41ad63b787eeed243
2018-10-13 17:57:10 -07:00
Xiang Gao
97eec33f80 Allow tensor.device, tensor.dtype, and tensor.shape in JIT (#12363)
Summary:
Closes https://github.com/pytorch/pytorch/issues/12364
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12363

Differential Revision: D10362491

Pulled By: ezyang

fbshipit-source-id: f2716e656977370c5ec51cb15f62b6376798e617
2018-10-12 11:29:04 -07:00
James Reed
2279299c6c Implement aten::contiguous (#12541)
Summary:
Implement contiguous as `aten::contiguous` so it can be recorded during tracing. Not recording it was causing issues both with the trace checker and when a `contiguous()`-ed tensor was used downstream in a view that expected certain strides.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12541

Differential Revision: D10304028

Pulled By: jamesr66a

fbshipit-source-id: dc4c878771d052f5a0e9674f610fdec3c6782c41
2018-10-11 23:39:39 -07:00
David Riazati
eb5fdc5fb5 Add default values in script (#12345)
Summary:
Add support for default values on script functions and Modules

Followup to #11962
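
A small sketch (assuming the current script frontend) of a default argument on a script function:

```python
import torch

@torch.jit.script
def scale(x, factor=2.0):
    # type: (Tensor, float) -> Tensor
    return x * factor

print(scale(torch.ones(2)))       # tensor([2., 2.]) -- default used
print(scale(torch.ones(2), 3.0))  # tensor([3., 3.])
```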
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12345

Reviewed By: michaelsuo

Differential Revision: D10263613

Pulled By: driazati

fbshipit-source-id: 9b380d8c3f8c4abb2d24c33b23c00ec5896ca372
2018-10-11 20:49:23 -07:00
Richard Zou
a1487bf874 Smarter differentiable subgraph slicing (#12175)
Summary:
If any inputs require_grad then the graph executor does differential subgraph slicing. The existing algorithm combines adjacent differentiable Node*.

There are two major motivations. The first is improving fusion opportunities: the graph fusion pass runs after differential subgraph slicing. This means that only nodes that are a part of the same differential subgraph may be considered for fusion. If something like the following happens,
```
y = f(x)
k = not_differentiable_op(m)
z = g(y)
```
and f and g are both fusible and differentiable operations, then they will be inserted into different differential subgraphs and not fused together.

The second is to enable JIT optimizations on backward passes for things like an (automatically) unrolled LSTM. Right now, in an unrolled LSTM, we see something like the following:
```
lstm_cell()
non_differentiable_list_op()
lstm_cell()
non_differentiable_list_op()
lstm_cell()
non_differentiable_list_op()
```
Each lstm_cell itself is differentiable and gets put into a separate differential subgraph. During the backwards pass, each prim::DifferentiableSubgraph has its own graph executor: these graph executors cannot talk to each other. It is better if we combined all of the lstm_cells (where applicable) into one differential subgraph so their backward passes are combined into one graph executor that can perform better optimizations than several separate graph executors.

Think about the computation graph as a DAG where edges are data dependencies and vertices are operations (the nodes). Each vertex is either black or red; a vertex is colored black if it is differentiable and red otherwise. The goal is to contract edges (merge nodes) to have the fewest black vertices remaining such that the graph is still a DAG.

The algorithm is the following:
- Take the Graph& and create a shadow "DynamicDAG" object to wrap Node* and edges. Each Vertex holds multiple Node* (but starts out holding one Node*) and each edge is a data dependency.
- Greedily contract vertices in the DynamicDAG if they are "differentiable". This operation is unrelated to the Graph&.
  - A Vertex is "differentiable" if all the nodes it holds is differentiable.
  - When contracting vertices, combine their Node* contents.
  - The DynamicDAG keeps its vertices in topological order and complains if the contraction is invalid so everything is good.
- Take the DynamicDAG: reorder the nodes in the Graph& to match the topological order in the DynamicDAG.
- Finally, go through each Vertex in the DynamicDAG: if it contains multiple Node* then merge all of them into a prim::DifferentiableGraph.

The DynamicDAG is based off of the dynamic top sort algorithm in [this paper](https://www.doc.ic.ac.uk/~phjk/Publications/DynamicTopoSortAlg-JEA-07.pdf) by Pearce and Kelly.

Each contractEdge(producer, consumer) call is `O(|AR| log |AR| * min(|out_edges(producer)|, |in_edges(consumer)|)` where `AR` is the "affected region" (defined as the set of nodes that, in topological order, are between producer and consumer). By only considering contractions such that `|ord(producer) - ord(consumer)| < threshold1` and `|out_edges(producer)| < threshold2` we can make each contractEdge(producer, consumer) call take constant time. The resulting algorithm is linear in the number of nodes.

Added a lot of small test cases.

Looking for suggestions on the following:
- what big computation graphs should I run this on to test how fast or slow it is?
- what things other than correctness should I be thinking about when I test this?

cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12175

Differential Revision: D10302564

Pulled By: zou3519

fbshipit-source-id: 8a94d130d82f8a1713cc28483afef9a72d83d61a
2018-10-11 16:20:53 -07:00
James Reed
0f9807ee61 Enable addmm fusion for ONNX export only (#12538)
Summary:
There are some action-at-a-distance issues, and not having this disables quantization in C2 for prod use cases.

ref T34831022
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12538

Differential Revision: D10302931

Pulled By: jamesr66a

fbshipit-source-id: 700dc8c5c4297e942171992266ffb67b815be754
2018-10-11 13:57:50 -07:00
James Reed
a4120fa132 Get rid of emitApplyIdent (#12504)
Summary:
And reroute builtin/CompilationUnit function resolution through one resolution pathway
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12504

Differential Revision: D10319920

Pulled By: jamesr66a

fbshipit-source-id: 3ab9877664dd32b97136a7625d0688e1adc0c022
2018-10-11 10:53:53 -07:00
Roy Li
1a0d82e4f4 fix import for script module with control flow blocks (#12351)
Summary:
The value_info proto field was being processed in BuildGraph, but control flow blocks used buildBlocks instead. This PR moves that step to BuildBlock.

I removed DecoderBase because it was making the code confusing and we never needed it in the first place.

closes #12319
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12351

Differential Revision: D10212411

Pulled By: li-roy

fbshipit-source-id: 47f289a462a1ab7391ff57368185401673980233
2018-10-08 22:25:14 -07:00
Elias Ellison
00aedfc0e2 constant pooling pass (#12222)
Summary:
Add a pass to move all constants to the beginning of the graph, and deduplicate.

This extends https://github.com/pytorch/pytorch/pull/10231 to also handle constants introduced in inlining, constant propagation, etc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12222

Reviewed By: driazati

Differential Revision: D10201616

Pulled By: eellison

fbshipit-source-id: bc9c5be26868c8b5414257a0d4462de025aeb9bd
2018-10-08 11:55:02 -07:00
David Riazati
92b0e7026e Add weak script mode for script functions (#11963)
Summary:
This PR is the start of weak script mode for functions

Weak scripts allow you to compile a graph from Python code at runtime by annotating with `torch.jit.weak_script` for use in the JIT without affecting eager execution. Scripts are compiled lazily on the first call in a graph to avoid long Python startup times.

apaszke zdevito ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11963

Differential Revision: D10183451

Pulled By: driazati

fbshipit-source-id: 128750994d5eb148a984f8aba4113525c3e248c8
2018-10-05 18:55:49 -07:00
Zachary DeVito
b937cbb776 Fix a bug that would resize tensor storage on export
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12377

Differential Revision: D10219213

Pulled By: zdevito

fbshipit-source-id: 85cfa4467c672ff5a718e58cfae7e8c8b1cfc532
2018-10-05 16:24:54 -07:00
David Riazati
f0b73ff790 Pretty printer improvements (#12179)
Summary:
* Replaces `prim::PythonOp` with the name of the function being called
* Delays printing values used in `prim::Return` nodes until the return
node itself if that is the only place the value is used to remove some
useless assigns

zdevito apaszke ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12179

Differential Revision: D10132661

Pulled By: driazati

fbshipit-source-id: cbc4ac34137ed5872049082e25d19eb1ebc71208
2018-10-04 15:14:51 -07:00
David Riazati
c9f9df002d Properly catch errors in PythonOps (#12243)
Summary:
If a PythonOp throws an error it raises an exception to the interpreter and also releases the GIL which causes [pybind to segfault](https://github.com/potassco/clingo/issues/42)

This fix catches pybind errors while the GIL is still held and throws a `python_error` to re-capture the GIL

Fixes #12118

apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12243

Differential Revision: D10182787

Pulled By: driazati

fbshipit-source-id: 719d4a7c3294af201e061cf7141bec3ca0fb1f04
2018-10-03 17:25:03 -07:00
David Riazati
d1ac1eba3b Add bool type to IR (#11834)
Summary:
This PR adds a bool type to `IValue` and puts it into place.

* changes conds for `prim::If` and `prim::Loop` to use `bool` type
* changes operators that take `bool`s to match their native ops
* fixes ambiguous `aten` ops `aten::std` and `aten::var`
	* fixes tests in `test_jit.py TestJitGenerated`
		```
		'test_std_dim',
		'test_std_dim_1d',
		'test_std_dim_1d_neg0',
		'test_std_dim_neg0',
		'test_var_dim',
		'test_var_dim_1d',
		'test_var_dim_1d_neg0',
		'test_var_dim_neg0'
		```
* adds `prim::BoolToTensor` and `prim::TensorToBool`

apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11834

Differential Revision: D9928570

Pulled By: driazati

fbshipit-source-id: 373c53df2f1a8ffa9e33d9a517002fbeef25f3eb
2018-10-03 12:40:03 -07:00
Elias Ellison
fed91f873f (Very small) allow trailing commas in assign or tuples (#11723)
Summary:
Allow trailing commas in assign statements or tuples, which also allows single element tuples.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11723

Differential Revision: D10052162

Pulled By: eellison

fbshipit-source-id: 344d908a3ad942a23ebd9f341794bc9734226aa8
2018-10-01 10:10:13 -07:00
iotamudelta
a2ebbccc9f fix unit tests on CI
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12187

Differential Revision: D10118483

Pulled By: bddppq

fbshipit-source-id: 986c8fb48d61e00103c713548a50e74489a0e442
2018-09-28 23:11:55 -07:00
mruberry
7b2c0a09e4 Adds support for NaN, +inf, -inf float scalars to CPU and CUDA fusers (#12070)
Summary:
In current upstream float scalars are always written into kernels with:

`out << std::scientific << v << "f";`

When the floats are special values like NaN, +inf, or -inf this produces nonsense that causes compilation to fail. This fix updates the conversion of float scalars to device-specific special values. The appropriate macros are added to the CPU and CUDA resource strings. Note that a NAN macro was not necessary on the CPU since math.h defines NAN.

To verify this fix I updated the test_clamp_fusion test in test_jit.py. I wanted to test -inf, too, but -inf is not currently accepted by the interpreter.

Edit:

Forgot to mention, this partially addresses issue #12067.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12070

Reviewed By: ezyang

Differential Revision: D10044704

Pulled By: soumith

fbshipit-source-id: 8f4a930862d66a7d37d985e3f6a6fb724579e74c
2018-09-28 14:11:49 -07:00
Luca Antiga
5be0baefa2 Use streams in JIT serialization, allow JIT serialization to/from buffer (#11932)
Summary:
This PR replaces the use of `std::FILE` with `istream`/`ostream` for JIT serialization.
It uses this mechanism to add the possibility to serialize to/from binary buffers, in addition to files, both in `libtorch` and from Python.

`getExportImportCopy` in `test_jit.py` has been updated so that both file and buffer codepaths are exercised during tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11932

Differential Revision: D10084303

Pulled By: apaszke

fbshipit-source-id: b850801b3932922fa1dbac6fdaed5063d58bc20d
2018-09-28 07:54:27 -07:00
Michael Suo
7f35e92af2 mutable lists (#10700)
Summary:
This PR implements the design that we discussed. Changes:
- Added a World token IValue and type. The IValue is basically a dummy struct for now; in the future we may extend it (say, add thread-local state).
- Effectful ops explicitly declare they are mutable by having World tokens as inputs and outputs in their schema.
- Purely functional ops that use mutable values will get "fenced" and the world token will be threaded through the fences
- AnnotateEffects pass which wires up all the world tokens together.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10700

Reviewed By: eellison

Differential Revision: D9547881

Pulled By: michaelsuo

fbshipit-source-id: ebbd786c31f15bf45e2ddb0c188438ff2f5f3c88
2018-09-27 19:25:13 -07:00
Zachary DeVito
478803a75f Introduce type variables to implement generic list operators (#12040)
Summary:
We generate specialized list operations for int, float, and Tensor lists so that small lists of integers like the arguments to conv do not involve tons of boxing code.

This PR adds a fallback GenericList for List types that contain any other type. It does so by adding type variables to `jit::Type`, and machinery for matching/replacing the type variables during `tryMatchSchema` and operator lookup.

It also modifies the builtin list ops to include a fallback that works on a GenericList object that simply holds IValues. This is distinguished from IValue's tuple type so that conversion to/from Python still happens losslessly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12040

Differential Revision: D10037098

Pulled By: zdevito

fbshipit-source-id: 0c5f2864d12e7d33554bf34cc29e5fb700dde150
2018-09-26 17:02:51 -07:00
Adam Paszke
18f9c07b18 Enable tracing of tensor factories with an out argument
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12051

Differential Revision: D10044890

Pulled By: apaszke

fbshipit-source-id: 2d794bf408875600bc71f354f0b4961d6b715094
2018-09-26 09:40:34 -07:00
Richard Zou
c8a0b11b7f add autodiff expressions for common operations (#11832)
Summary:
This PR does a few things:

Previously test_jit.py only tested autograd on backward graphs.
This is because we borrow from test_autograd and construct graphs with a small
number of nodes. Because the number of nodes is small (typically 1-2), those graph
do not end up containing autodiff subgraphs, so autodiff never gets tested.

This PR enables autodiff testing by doing the following:
- added disableDebugAutodiffSubgraphInlining fn to graph_executor to disable
  autodiff subgraph inlining.
- (implementation) added autodiffSubgraphNodeThreshold and autodiffSubgraphInlineThreshold.
  These are set to their default values (2, 5) but disableDebugAutodiffSubgraphInlining()
  sets both to 1, disabling subgraph inlining and allowing 1-node autodiff subgraphs.
- The relevant backward jit tests disable autodiff subgraph inlining so they
  will test the autodiff versions of the operators instead of autograd whenever
  an autodiff variant exists.
- We don't run the tests that do inline autodiff subgraphs anymore.
  This has no impact on testing correctness because the assumption is
  that autograd functions are correct and are tested in test_autograd.py

This allows the graph fuser to work better because a lot of these ops were previously not autodiff-compatible but fusible. On a more concrete example, lstm backward contains a lot of tensor-scalar operations; these autodiff formulas help its double backward pass.

Included:
- arithmetic overloads
- abs, acos, asin, atan, ceil, cos, cosh, exp, expm1, floor, fmod, frac, log, log10, log1p, log2 reciprocal, remainder, round, sin, sinh, tan, trunc, rsqrt

TestJitGenerated tests autodiff for all of the added operations.

cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11832

Differential Revision: D10031256

Pulled By: zou3519

fbshipit-source-id: 9daf9900a5ad187743609cd0fbbd10b15411ad93
2018-09-26 08:10:04 -07:00
Adam Paszke
a830964007 Eliminate no-op adds and muls in peephole pass (#11801)
Summary:
Because we emit a lot of them in our symbolic AD. This brings down the backward time of an LSTM I'm testing from 14.2ms to 12.5ms (a 15% improvement).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11801

Differential Revision: D9916815

Pulled By: apaszke

fbshipit-source-id: 2d9cb886c424ccd43b9f996aad89950d3bddf494
2018-09-24 17:48:48 -07:00
Adam Paszke
51414822f5 Stop moving constants into DifferentiableSubgraphs (#11809)
Summary:
Or even taking them as inputs. This prevents optimizations to happen
either inside the differentiable subgraphs, or in the surrounding graph.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11809

Differential Revision: D10009680

Pulled By: apaszke

fbshipit-source-id: face638566228e470a6deec48dc2aa3a1cce26d4
2018-09-24 13:24:53 -07:00
Richard Zou
b5f60af94c Shape prop view/reshape/as_strided through prim::ListConstructs (#11877)
Summary:
Previously, aten::view returned a Dynamic type when attr::size is a prim::ListConstruct.
See [this for a repro](https://gist.github.com/zou3519/cbd610472ba3369f556fa612a7d93b28).
This prevented a pre-multiplied lstm input graph from being fusible (aten::view is necessary
to do premultiplication).

If aten::view is passed an output of a prim::ListConstruct node, then shape prop should
be able to figure out its TensorType because we statically know the number of inputs to
prim::ListConstruct. This PR implements that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11877

Differential Revision: D9972356

Pulled By: zou3519

fbshipit-source-id: cb87786f6e7f222d4b8f07d8f2a9de34859cb6a5
2018-09-21 14:20:01 -07:00
Adam Paszke
7efbf3a827 Specialize ArgumentSpecs on tuple elements too (#11863)
Summary:
This is pretty important because a common situation of passing LSTM hidden states as a tuple completely trashes performance of a network.

Cleans up all our propagation/undef specialization passes, at a cost of increased complexity of `ArgumentSpec` and `GraphExecutor`. An alternative would be to simply flatten all tuple inputs to a graph ahead of time, but that might just end up being confusing in the future (you never know if you're working with a graph that can have tuple or not).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11863

Differential Revision: D9992814

Pulled By: apaszke

fbshipit-source-id: 0a565a3b23e32f8fa72c0534e07c1ce6187739fc
2018-09-21 14:19:58 -07:00
Adam Paszke
1ad7e0c5ec Minor JIT improvements (#11654)
Summary:
- Disable addmm fusion. The reason for this is explained in the comment.
- Tiny change in `stack.h` that lets us avoid constructing an unnecessary temporary `IValue` on the (C++) stack (it will only get created on the interpreter stack directly).
- Fixed a correctness issue in requires grad propagation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11654

Reviewed By: colesbury

Differential Revision: D9813739

Pulled By: apaszke

fbshipit-source-id: 23e83bc8605802f39bfecf447efad9239b9421c3
2018-09-21 14:19:54 -07:00
David Riazati
4e65fbfee5 Remove tests from EXCLUDE_SCRIPT that pass (#11916)
Summary:
Spuriously added in #11261

I had a PR to catch these automatically (#11279), but it had some issues
passing on some CI environments but not others (e.g. for
`test_nn_group_norm`), any ideas?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11916

Differential Revision: D9992065

Pulled By: driazati

fbshipit-source-id: 05cfa8ed9af939e8ffd5827847ee7bfe0be799b2
2018-09-21 14:19:50 -07:00
Luca Antiga
58d28a5f12 Fix saving loaded module (#11915)
Summary:
This PR fixes #11913.

In order to test for this, the model is serialized twice in `getExportImportCopy`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11915

Differential Revision: D9984697

Pulled By: soumith

fbshipit-source-id: ae0250c179000c03db1522b99410f6ecb9681297
2018-09-21 06:58:16 -07:00
yya007
b91b15d86e Implementing Matrix Norm for torch.norm (#11261)
Summary:
Currently, the norm function only supports vector norms. This PR extends it to also support matrix norms.
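
A usage sketch; the exact set of matrix norms accepted may differ from what this PR lands, but Frobenius and nuclear norms of a 2-D tensor look like:

```python
import torch

A = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
print(torch.norm(A, p="fro"))  # Frobenius norm, sqrt(1 + 4 + 9 + 16)
print(torch.norm(A, p="nuc"))  # nuclear norm (sum of singular values)
```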
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11261

Reviewed By: li-roy

Differential Revision: D9652379

Pulled By: yya007

fbshipit-source-id: 519b3fb80b563c17c56a24675c7b0e46bf5a3a1c
2018-09-20 14:43:13 -07:00
Thomas Viehmann
068eac255b Jit fuse clamp (#11574)
Summary:
This patch adds fused forward and backward for clamp to the jit.
This is one item of #11118. If it's OK, I'd be happy to also add some more of #11118.

The patch depends on #11150, which I merged into master as a base. I'll rebase it when that or #10981 is merged.

This is my first serious JIT patch; thank you ngimel and the others for the guidance. All errors are my own.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11574

Differential Revision: D9943090

Pulled By: apaszke

fbshipit-source-id: c40954b8c28c374baab8d3bd89acc9250580dc67
2018-09-20 14:43:10 -07:00
Richard Zou
8f4601fbac Re-enable test_scalar_fusion
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11378

Differential Revision: D9943578

Pulled By: zou3519

fbshipit-source-id: fb9e4303e844d5e2515acce7869bcbe11526ab56
2018-09-20 07:56:25 -07:00
David Riazati
a79f5d77ad Add pretty printer for JIT IR (#10319)
Summary:
Adds some pretty-printing capability to the IR graph to make debugging easier/more human readable, see `torch/csrc/jit/test_jit.cpp:925` and onwards for example outputs. Results aren't perfect yet but it's a start.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10319

Reviewed By: zdevito

Differential Revision: D9558402

Pulled By: driazati

fbshipit-source-id: 1d61c02818daa4c9bdca36d1477d1734cfc7d043
2018-09-18 17:39:44 -07:00
Wanchao Liang
d4e1fa45d0 allow no-alpha add/sub in onnx symbolic (#10972)
Summary:
The PR fixes #10873

The context is that the aten::add and aten::sub ST overloads don't have an alpha argument, so the ONNX symbolic does not match.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10972

Reviewed By: jamesr66a

Differential Revision: D9724224

Pulled By: wanchaol

fbshipit-source-id: eb5d1b09fa8f1604b288f4a62b8d1f0bc66611af
2018-09-18 13:55:39 -07:00
David Riazati
7671f4ab1c Add math to scope when using inf in tests (#11302)
Summary:
This fixes #8515, which was mostly issues in the tests themselves. As long
as `math` is imported in the scope in which the script runs it resolves
to a `prim::Constant` with value `inf` correctly. This PR adds this to
the `test_jit.py` tests involving `inf` and adds a test to demonstrate
`inf` in a non-generated test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11302

Differential Revision: D9684336

Pulled By: driazati

fbshipit-source-id: 73df2848dfdb45ab50690a7c88df8fda269a64eb
2018-09-17 14:08:32 -07:00
Natalia Gimelshein
336323f53c return aten::gt to the list of fusable operations, add expected graphs (#11150)
Summary:
Fixes one of #11118 issues.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11150

Differential Revision: D9861372

Pulled By: apaszke

fbshipit-source-id: 98b196b89e991d3936360b30568360367fd32e8b
2018-09-17 13:40:41 -07:00
Mike Ruberry
96d3f968eb Splits CPU and CUDA fusion compilers (#10981)
Summary:
This PR splits the CPU and CUDA fusion compilers, putting them into a new jit/fusers/ directory with jit/fusers/common for common components. In particular:

- A fusion interface is created that allows "fusion handles" to be requested
- The CPU and CUDA fusers implement this interface, with dispatch determined by device
- The fusion compilers, fusion function specializations and resource strings are split
- CPU-specific classes like TempFile and DynamicLibrary are in the CPU fuser
- Common classes like TensorDesc and the base fusion function class are in jit/fusers/common
- There is still some specialization in jit/fusers/common, but these specializations are small(-ish)
- Updates the build system to remove the dummy interface on Windows and minimize the use of macros

This structure should allow in-flight PRs to easily rebase while providing a clear interface to the fusers.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10981

Reviewed By: soumith

Differential Revision: D9701999

Pulled By: apaszke

fbshipit-source-id: 3b6bec7b97e0444b2a93caa38d9b897f2e68c1b3
2018-09-14 14:05:34 -07:00
James Reed
278e304c18 Implement elif in string frontend (#11667)
Summary:
Closes #11625
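
A sketch assuming the CompilationUnit API, which compiles source strings directly and therefore goes through the string frontend:

```python
import torch

cu = torch.jit.CompilationUnit("""
def sign(x):
    if bool(x > 0):
        return 1
    elif bool(x < 0):
        return -1
    else:
        return 0
""")
print(cu.sign(torch.tensor(-2.0)))  # -1
```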
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11667

Differential Revision: D9828145

Pulled By: jamesr66a

fbshipit-source-id: c72dc41cb310a4211b4e4c6b33f7e2c1fb3581a0
2018-09-14 10:09:46 -07:00
Adam Paszke
98e04db955 Implement requires_grad propagation in the JIT (#11586)
Summary:
Previously, we would pretty much assume that all floating point tensors do require grad, which might result in some unnecessary compute.

I don't really like the fact that `TensorType` uses `tensor.is_variable() && tensor.requires_grad()` to infer the value of `requires_grad`, but changing constants to keep variables turns out to be pretty hard. I got halfway there, but it would still need some more work.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11586

Reviewed By: ezyang

Differential Revision: D9813648

Pulled By: apaszke

fbshipit-source-id: 77f77756d18ff7632fca3aa68ce855e1d7f3bdb8
2018-09-13 19:25:26 -07:00
James Reed
0f1ca569ce End-to-end dynamic slicing with ONNX DynamicSlice experimental operator (#11255)
Summary:
Requires https://github.com/onnx/onnx/pull/1377

This PR makes it so that slices with dynamic boundary values can be exported from pytorch and run in caffe2 via ONNX.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11255

Differential Revision: D9790216

Pulled By: jamesr66a

fbshipit-source-id: 6adfcddc5788df4d34d7ca98341077140402a3e2
2018-09-13 12:39:52 -07:00
Roy Li
75f49befeb move instance_norm to aten (#10792)
Summary:
This also removes the usage of torch.onnx.symbolic_override in instance_norm. Fixes #8439.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10792

Differential Revision: D9800643

Pulled By: li-roy

fbshipit-source-id: fa13a57de5a31fbfa2d4d02639d214c867b9e1f1
2018-09-13 12:26:22 -07:00
Richard Zou
45e9ee096e Fix test_mnist_training_leaks_no_memory_cuda warning (#11639)
Summary:
Before this PR it would warn that "dropout is non deterministic and can
cause problems when checking trace", so I disabled the trace checking.

cc zdevito apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11639

Differential Revision: D9812493

Pulled By: zou3519

fbshipit-source-id: fab86928a5fba8b218b47543533aaf7c82a10b4a
2018-09-13 12:09:20 -07:00
David Riazati
6f53b4efea Remove implicit bool casts (#11503)
Summary:
In order to comply with Python's rules on implicit casting of
non-booleans to booleans, this PR removes implicit casting in favor of
explicit casts via `bool()`
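
A small sketch of the now-required explicit cast when branching on a tensor condition:

```python
import torch

@torch.jit.script
def clip_positive(x):
    if bool(x.sum() > 0):  # explicit bool() instead of relying on truthiness
        return x
    return torch.zeros_like(x)

print(clip_positive(torch.ones(3)))
```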

cc zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11503

Differential Revision: D9780869

Pulled By: driazati

fbshipit-source-id: c753acaca27f4e79dddf424c6b04674f44a6aad9
2018-09-13 11:26:45 -07:00
Zachary DeVito
ab3a2d25fb Improve error messages when trying to use nested lists.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11606

Differential Revision: D9806949

Pulled By: zdevito

fbshipit-source-id: c38abc4ce745a63d26a64f6aa1b41350e4b1acd5
2018-09-13 11:10:38 -07:00
Roy Li
a861573e36 fix tensor export bug in IR export (#11613)
Differential Revision: D9811094

Pulled By: li-roy

fbshipit-source-id: 012792dbedc70bd3fa242fdf2e39da0b21ce158d
2018-09-13 11:10:35 -07:00
Elias Ellison
77f6998e54 Guard against inputting or returning sparse tensors (#11550)
Summary:
Add guards against using sparse tensors by checking the conversion from IValue -> PyObject & PyObject -> IValue.

This diff also changes the behavior of constant propagation to not run python ops even if all of their inputs are constant, because of possible mutation of global state. This came up in trying to run get_sparse(), and I'm including it here to make it easier to land.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11550

Differential Revision: D9804712

Pulled By: eellison

fbshipit-source-id: 9fe7daf721c6d6e48df4925c0f9c775873bcdc77
2018-09-13 08:58:29 -07:00
Wanchao Liang
44b2b6b150 clean up jit generated tests (#11403)
Summary:
Clean up some generated tests now that we have nice new features like varargs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11403

Differential Revision: D9800545

Pulled By: wanchaol

fbshipit-source-id: e9973b113f78dc38cf99a81b6ede3fa3485f1cfa
2018-09-12 22:55:03 -07:00
Wanchao Liang
739e6af869 Add remainder % to the jit
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11557

Reviewed By: apaszke

Differential Revision: D9784642

Pulled By: wanchaol

fbshipit-source-id: b7c60c3e9534555c9d7db83769965b3f2f277cdf
2018-09-12 12:40:38 -07:00
Zachary DeVito
ad7936e108 Fix reloading modules back into python (#11552)
Summary:
This changes the way module import works so that when a module
is reloaded in python it becomes a ScriptModule and not a _C.ScriptModule
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11552

Differential Revision: D9782751

Pulled By: zdevito

fbshipit-source-id: 9576850b75494b228ce3def94c0d371a4a44b11d
2018-09-12 12:25:15 -07:00
Richard Zou
13b05c8c78 Add EndToEndHybridModel CUDA tests (#11544)
Summary:
Also adds two additional tests that check for memory leaks while the relevant graph executors are alive:
- (minimal test): Create a ScriptModule, keep it alive, and test that it does not leak memory while it is alive
- (large test) Do MNIST training with a traced MNIST module and test that no memory is leaked while the traced module (with graph executor) is alive

cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11544

Reviewed By: apaszke

Differential Revision: D9778479

Pulled By: zou3519

fbshipit-source-id: 2d6cdea81dd1264f2c0396b662f70fdafecb3647
2018-09-12 11:25:18 -07:00
Adam Paszke
62c9d4ac96 Make .to() methods native functions (to fix JIT tracing)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11491

Differential Revision: D9771121

Pulled By: apaszke

fbshipit-source-id: 08d11101fb12093f8cf913b06359adddf3af9da7
2018-09-11 21:55:42 -07:00
Adam Paszke
8b196d671b Allow tracing random functions (only when using default generators) (#11539)
Summary:
Fixes #11504.

zdevito, neerajprad, fritzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11539

Differential Revision: D9777897

Pulled By: apaszke

fbshipit-source-id: 56983260f5b93da7d5540a6242769ea7bd50eb06
2018-09-11 17:56:39 -07:00
Zachary DeVito
289a8c9b7d Allow train/eval, and non-Tensor arguments to python functions (#11505)
Summary:
This whitelists train/eval functions in script modules, and tests that nested nn.Modules still work.

This also changes the code for calling python functions from script to allow non-tensor inputs/outputs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11505

Differential Revision: D9765466

Pulled By: zdevito

fbshipit-source-id: 1177bff931324422b69e18fa0bbaa82e3c98ec69
2018-09-11 15:05:09 -07:00
James Reed
deac304b6b Bugfix for basic slicing
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11428

Differential Revision: D9753999

Pulled By: jamesr66a

fbshipit-source-id: cfc4163a5a06b41beb808a4e24650d71f5d91f4f
2018-09-11 09:39:29 -07:00
Adam Paszke
120d769432 Add support for tracing strings (#11506)
Summary:
This enables `torch.einsum` both in tracing and in script mode. It's used all over Pyro at the moment, and is needed for any use of the JIT in there.

Fixes #11157.
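
A small tracing sketch using einsum's string equation argument:

```python
import torch

def matmul(a, b):
    return torch.einsum("ij,jk->ik", a, b)

traced = torch.jit.trace(matmul, (torch.randn(2, 3), torch.randn(3, 4)))
print(traced(torch.randn(2, 3), torch.randn(3, 4)).shape)  # torch.Size([2, 4])
```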

zdevito fritzo neerajprad
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11506

Differential Revision: D9764787

Pulled By: apaszke

fbshipit-source-id: 9b5251b9e7c5897034602bd07ff67b425d33326c
2018-09-11 06:02:41 -07:00
Adam Paszke
0ddbe668cd Improve shape analysis to cover all most commonly used ops (#11358)
Summary:
[Here's a list](https://gist.github.com/apaszke/f0821840bdcc67a977832dc58acc1b85) of ops that are in `register_aten_ops.cpp`, but aren't supported in shape prop. Everything else should work now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11358

Differential Revision: D9753693

Pulled By: apaszke

fbshipit-source-id: efeae0126ce16cb56b8797fc5246405588bcae3c
2018-09-11 06:02:39 -07:00
James Reed
3ad67c60f0 Traceable explicit Variable instantiation (#11463)
Summary:
There's a bunch of legacy code where people are explicitly instantiating Variable, and these call-sites have thus far been untraceable (appearing as prim::Constant nodes with the tensor value at the time of tracing). This makes it so that the new variable inherits the traced Value* from the tensor it's being constructed from
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11463

Differential Revision: D9756529

Pulled By: jamesr66a

fbshipit-source-id: da99c6a7621957a305f2699ec9cb9def69b1b2d7
2018-09-10 17:03:24 -07:00
Adam Paszke
3e665cc29b Improve support for tracing sizes, add more tracer warnings (#11288)
Summary:
Many constructors like `torch.zeros` or `torch.randn` didn't support
size tracing correctly, which is fixed by this pass. The same issue has been
fixed in the legacy tensor constructors.

Additionally, new tensor constructors, which do not participate in
tracing (most notably `torch.tensor`, `torch.as_tensor` and
`torch.from_numpy`) raise a warning when they are used.

Finally, entering a traceable operation disables the tracing in its body.
This is needed because

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11288

Reviewed By: ezyang

Differential Revision: D9751183

Pulled By: apaszke

fbshipit-source-id: 51444a39d76a3e164adc396c432fd5ee3c8d5f7f
2018-09-10 15:22:48 -07:00
Elias Ellison
2158f4a9c8 add export import test to TestJitGenerated (#10982)
Summary:
Checking assertExportImport for all of the generated jit tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10982

Differential Revision: D9636935

Pulled By: eellison

fbshipit-source-id: f3f1ce77d454848098f2ac7e0fa18bf8564890be
2018-09-10 11:37:05 -07:00
Tongzhou Wang
d3f98b5ffc Add matrix power (#11421)
Summary:
vishwakftw Your patch needed some updates because the default native function dispatches changed from `[function, method]` to `[function]`. The CI was run before that change happened so it still shows green, but the internal test caught it.

I made some changes when rebasing and updating, so I didn't just force push to your branch. Let's see if this passes CI and the internal test. If it does, let me know if you want me to force push to your branch or use this PR instead.

Note to reviewers: patch was already approved at #10068 .
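
Usage sketch (my own example, not from the patch), assuming the function keeps the `torch.matrix_power(input, n)` signature:

```
import torch

A = torch.randn(3, 3)
# Raising a square matrix to an integer power, equivalent to A @ A @ A.
assert torch.allclose(torch.matrix_power(A, 3), A @ A @ A, atol=1e-5)
```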

cc yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11421

Differential Revision: D9733407

Pulled By: SsnL

fbshipit-source-id: cf2ed293bb9942dcc5158934ff4def2f63252599
2018-09-08 15:25:56 -07:00
James Reed
47c1de25e8 Test exporting batch norm, dropout, RNN
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11126

Differential Revision: D9727689

Pulled By: jamesr66a

fbshipit-source-id: f142257a2fba27d86844bf33084174f1f68a8ca5
2018-09-07 19:41:39 -07:00
James Reed
4ae16c9ad9 Recursive descent for validation + convert expands in ATen fallback (#11356)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11356

Differential Revision: D9721002

Pulled By: jamesr66a

fbshipit-source-id: eeb50b56f8a72e929860c5e459a5ab50ac624814
2018-09-07 16:39:36 -07:00
David Riazati
4bf5fc44c8 Fix split_size test failures (#11051)
Summary:
~~This PR fixes #8525 by renaming `split_with_sizes` to `split` so that 2 `aten::split` ops are
generated (previously `aten::split(self, int, int)` and `aten::split_with_sizes(self, int[], int)` were generated)~~

~~`split_with_sizes` was made in PR #5443, but I don't see a reason for it to have
a different name than `split` rather than just overload `split`.~~

This PR fixes #8525 by adding `register_special_ops.cpp` to mirror the Python dispatch from `split` to either `split` or `split_with_sizes`, as done in [tensor.py](https://github.com/pytorch/pytorch/blob/master/torch/tensor.py#L279).

It also fixes #8520 by adding an `int[]` wherever it sees `torch.Size`

In a follow up PR this could also be used to fix some of the other `unknown builtin op` test errors.
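
For reference, a small sketch (not from the PR) of the two Python-level call forms that the dispatch mirrors:

```
import torch

x = torch.arange(6)
# int argument: equal-sized chunks (maps to aten::split)
print([t.tolist() for t in torch.split(x, 2)])          # [[0, 1], [2, 3], [4, 5]]
# list argument: per-chunk sizes (maps to aten::split_with_sizes)
print([t.tolist() for t in torch.split(x, [1, 2, 3])])  # [[0], [1, 2], [3, 4, 5]]
```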
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11051

Differential Revision: D9582443

Pulled By: driazati

fbshipit-source-id: d27201f85937d72e45e851eaa1460dd3dd1b61a9
2018-09-07 15:39:24 -07:00
Wanchao Liang
69b4b45f91 enable missing nn tests with single grad check, minor refactor
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11366

Differential Revision: D9723305

Pulled By: wanchaol

fbshipit-source-id: 9e7e2e7e68cb4919610bccfbf76fa33b647f6eb7
2018-09-07 14:27:46 -07:00
Edward Yang
2946b021e3 Disable flaky test, see #11360 (#11361)
Summary:
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11361

Reviewed By: yf225

Differential Revision: D9696524

Pulled By: ezyang

fbshipit-source-id: f6801d6f4f34090d467b16810db9cf576d5d519b
2018-09-06 20:40:00 -07:00
Richard Zou
4d678790c5 enable advanced indexing with tensors (#10862)
Summary:
On the way to #10774

This PR adds advanced indexing with tensors.
The approach is to desugar advanced indexing into an at::index op.
This is exactly how normal pytorch does it.
[(I used this code as reference)](https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/python_variable_indexing.cpp)

Supporting sequences is a little tricky because JIT script doesn't have
an easy way to turn arbitrary n-dimensional python lists into a tensor
(it would be easy if we supported `torch.tensor`), so that'll come
in a future PR.
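
A minimal sketch (my example, not from the PR) of tensor indexing in a script function, which is desugared into an at::index call:

```
import torch

@torch.jit.script
def gather_rows(x, idx):
    # Advanced indexing with a LongTensor index selects whole rows.
    return x[idx]

x = torch.randn(5, 4)
idx = torch.tensor([0, 2, 4])
print(gather_rows(x, idx).shape)  # torch.Size([3, 4])
```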

cc jamesr66a zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10862

Differential Revision: D9659449

Pulled By: zou3519

fbshipit-source-id: 56d293720d44c0fd27909e18327ab3985ddfced6
2018-09-06 16:41:45 -07:00
Richard Zou
1ad61a18b2 Rename cuda tests to have 'cuda' in their names (#11332)
Summary:
Not a lot changed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11332

Differential Revision: D9683680

Pulled By: zou3519

fbshipit-source-id: 95f444e54049dd268fc10effe425ef2df79c6467
2018-09-06 11:57:52 -07:00
Elias Ellison
4ae95738b2 Ignore FuseGraph Call on Windows (#11015)
Summary:
Fusion is not yet implemented on Windows, so the FuseGraph call is ignored instead of failing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11015

Differential Revision: D9619121

Pulled By: eellison

fbshipit-source-id: ad09aeaa41b7fdeb9ca7bf5e1c166923ca405b15
2018-09-06 09:54:51 -07:00
Richard Zou
656e81db93 Fix scalar tensor assert in fusion compiler (#10952)
Summary:
Fixes #8560.
Unblocks #10715.

The assert (nDim <= uncompressedDims) was being triggered for a scalar
tensor because we compute nDim to be 1 for a scalar tensor but
uncompressedDim = 0.

This PR changes it so that we compute nDim to be 0 for a scalar tensor. This
works because indexing in a kernel depends on nDim. If nDim = 0, then
offset is always 0, which is what we want.

Some other (small) changes were necessary to make this work:
- One cannot define a 0-length array `IndexType arr[0]` so the code
  guards against that
- Needed to change some of the maxTensorInfoSize logic to handle the
  case when uncompressedDim == 0.

cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10952

Differential Revision: D9544607

Pulled By: zou3519

fbshipit-source-id: 2b873f47e2377125e1f94eb1b310a95cda51476c
2018-09-06 07:54:57 -07:00
Richard Zou
68c2e014cb Handling for py2/py3 division differences (#11016)
Summary:
- In Python 2, use of `/` (regardless of int/float/Tensor) causes a compiler error if
  `from __future__ import division` is not imported in the file.
- The / operator is universally set to do "true" division for integers
- Added a `prim::FloorDiv` operator because it is used in loop unrolling.

The error if users use '/' in python 2 without importing from __future__
occurs when building the JIT AST.
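
A hedged sketch (mine, not from the PR, using current annotation syntax) of the resulting semantics: `/` always performs true division on ints, while `//` keeps integer floor-division semantics:

```
import torch

@torch.jit.script
def div_example(a: int, b: int):
    # '/' is always true division; '//' is floor division.
    return a / b, a // b

print(div_example(7, 2))  # (3.5, 3)
```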

cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11016

Differential Revision: D9613527

Pulled By: zou3519

fbshipit-source-id: 0cebf44d5b8c92e203167733692ad33c4ec9dac6
2018-09-05 14:57:38 -07:00
Roy Li
9fc22cb772 Add import export step to end to end tests
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10717

Differential Revision: D9562888

Pulled By: li-roy

fbshipit-source-id: 8f5d62fd0a44aca0a41dc10438e7bb91cc2a972a
2018-09-05 09:39:47 -07:00
Adam Paszke
6d6655e6be Port PackedSequences functions to C++ (#11224)
Summary:
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11224

Differential Revision: D9652703

Pulled By: apaszke

fbshipit-source-id: 558e39457e590cad07516e5bb2ecb12789564950
2018-09-05 06:35:15 -07:00
Adam Paszke
b7038f7c37 Treat numerical differences as warnings instead of errors when tracing (#11246)
Summary:
Also, make `torch.isclose` work with integral tensors and refactor `_check_trace` a bit.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11246

Differential Revision: D9652701

Pulled By: apaszke

fbshipit-source-id: fb0bdbfd1952e45e153541e4d471b423a5659f25
2018-09-05 06:35:13 -07:00
Zachary DeVito
1eed7d5f0b Report an error when trying to record a mutable operator when (#11129)
Summary:
Report an error when trying to record a mutable operator while there are multiple live views of the tensor.

Also adds recording for copy_, because this is the critical in-place op where these views will cause LHS indexing to fail.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11129

Differential Revision: D9600195

Pulled By: zdevito

fbshipit-source-id: bfd8f5befa47377e36d704dbdb11023c608fe9a3
2018-09-04 13:40:51 -07:00
Elias Ellison
539579aa9a Logical short circuit (#11116)
Summary:
Adds short-circuit evaluation to AND and OR. The second expression of an AND or OR gets lifted into an if branch, which is conditionally evaluated.

BatchOps was using the expression `dims = dims1 or dims2`, where dims is often an empty tensor. This now throws an error, because dims1 gets cast to a boolean, and an empty tensor can't be converted to a scalar. It now matches the behavior of PyTorch in Python.

One thing that came up: if the second expression in an and/or gets returned in Python, it is not coerced to a boolean.

`tensor == (False or tensor)`
`tensor == (True and tensor)`

We do not currently support this.
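
For illustration, a hedged sketch (my own example, not from the PR) of short-circuiting: the right-hand side is lifted into a conditionally executed branch, so it never runs when the left-hand side is False:

```
import torch
from typing import Optional

@torch.jit.script
def safe_positive(x: Optional[torch.Tensor]) -> bool:
    # x.norm() is only evaluated when the left-hand side is True.
    return x is not None and bool(x.norm() > 0)

print(safe_positive(None))           # False, right-hand side never runs
print(safe_positive(torch.ones(2)))  # True
```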

edit: wording
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11116

Differential Revision: D9618168

Pulled By: eellison

fbshipit-source-id: 93b202be2f222d41f85d38d9c95f04d1749e8343
2018-09-04 09:25:13 -07:00
iotamudelta
33c7cc13ca improve docker packages, fix bugs, enable tests, enable FFT (#10893)
Summary:
* improve docker packages (install OpenBLAS to have at-compile-time LAPACK functionality w/ optimizations for both Intel and AMD CPUs)
* integrate rocFFT (i.e., enable Fourier functionality)
* fix bugs in ROCm caused by wrong warp size
* enable more test sets, skip the tests that don't work on ROCm yet
* don't disable asserts any longer in hipification
* small improvements
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10893

Differential Revision: D9615053

Pulled By: ezyang

fbshipit-source-id: 864b4d27bf089421f7dfd8065e5017f9ea2f7b3b
2018-09-02 08:54:42 -07:00
James Reed
43e73f85ad Dont optimize slicing dispatch when we are tracing (#11156)
Summary:
Previously when we had a slicing expression like `x[0:5, 0]`, where the sliced tensor was of size `5` in dimension 0, we would skip dispatching the actual slice call as an optimization.

This caused incorrect behavior under tracing, as we would not record the slice op and thus if we encountered an input with a different shape while running the trace, we would get incorrect results.
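
A hedged repro sketch (mine, not from the PR) of the behavior after the fix: the slice is recorded in the trace, so re-running it on a larger input still returns only the first five rows:

```
import torch

def head(x):
    return x[0:5, 0]

traced = torch.jit.trace(head, (torch.randn(5, 3),))
print(traced(torch.randn(8, 3)).shape)  # torch.Size([5])
```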
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11156

Differential Revision: D9622252

Pulled By: jamesr66a

fbshipit-source-id: 822f2e8f01504e131f53bd9ef51c171c7913a7cc
2018-09-01 17:13:03 -07:00
James Reed
03c06ec93d Traceable detach (#11038)
Summary:
This makes it so `detach` and `detach_` are traceable and also adds a pass to erase them before ONNX export
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11038

Differential Revision: D9588038

Pulled By: jamesr66a

fbshipit-source-id: 263dd3147e24fcb0c716743f37fdb9f84c0015e7
2018-08-31 16:40:42 -07:00
Adam Paszke
780d2792c5 Warn about non-traceable behavior when tracing (#11088)
Summary:
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11088

Differential Revision: D9585527

Pulled By: apaszke

fbshipit-source-id: 29a03cb152d83b626f748fff4501ac9e139994c2
2018-08-31 14:27:00 -07:00
Adam Paszke
82aeebb3d9 Fix a bug in addmm fusion in the JIT (#11100)
Summary:
Fixes #10839.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11100

Differential Revision: D9585533

Pulled By: apaszke

fbshipit-source-id: 19e2710c8fc113f577faf14c080d8c89afbe23c4
2018-08-31 07:24:34 -07:00
Adam Paszke
00df09b65d Change specialization rules in GraphExecutors (#10977)
Summary:
**Review last commit only.** Stacked on top of #10949.

This commit fixes a number of issues connected to caching
differentiability status of graphs inside graph executors,
and changes the rules for optimization of differentiable subgraphs.
Previously every one of those was instantiated as a separate graph
executor, but now they are simply heavier-optimized graph regions,
and graph executors are only instantiated for their backward.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10977

Differential Revision: D9600626

Pulled By: apaszke

fbshipit-source-id: dad09a0f586e396afbd5406319c1cd54fbb8a3d3
2018-08-30 22:11:01 -07:00
Adam Paszke
f3c3127c67 Don't flatten output lists in the JIT IR (#10949)
Summary:
Operators like aten::chunk used to return a number of tensors, but
now return a list. To make it easier to do shape prop through
aten::chunk and fuse it, I've also introduced prim::ConstantChunk,
which behaves like the previous implementation (has a variable length
output list).

The downside of this PR is that the introduction of more lists to the IR causes the LSTM and MiLSTM graphs to be considered non-differentiable by the graph executor. I verified that they are still optimized correctly, and my next patch (that changes how the specialization/differentiation works) will restore those.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10949

Reviewed By: zdevito

Differential Revision: D9556823

Pulled By: apaszke

fbshipit-source-id: 33e63b17fc7247cac6cfc05eb7eb9bf069b499ee
2018-08-30 19:54:39 -07:00
Zachary DeVito
93bd291e55 Change torch.jit.trace to no longer be a decorator (#11069)
Summary:
This was done because it is surprising for a decorator to run a function
rather than wrap it, and the decorator did not simplify the syntax for tracing modules.
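
Usage sketch of the call-style API (my example): pass the function and example inputs instead of decorating the function:

```
import torch

def foo(x):
    return x * 2

# torch.jit.trace is now called directly with the function and example inputs.
traced_foo = torch.jit.trace(foo, (torch.randn(3),))
print(traced_foo(torch.randn(3)))
```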
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11069

Reviewed By: jamesr66a

Differential Revision: D9583192

Pulled By: zdevito

fbshipit-source-id: b914b7ab4c73c255086465a6576eef3a22de1e13
2018-08-30 13:56:05 -07:00
Erik Brinkman
611a608517 Add ATen pdist CPU kernel (#10782)
Summary:
Also add single grad whitelist to the jit test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10782

Reviewed By: ezyang

Differential Revision: D9583378

Pulled By: erikbrinkman

fbshipit-source-id: 069e5ae68ea7f3524dec39cf1d5fe9cd53941944
2018-08-30 11:55:27 -07:00
Zachary DeVito
ae635b16f7 Record tensor factory functions in trace (#10935)
Summary:
Things like torch.zeros now appear in traces rather than as constants.

To continue to support our current level of ONNX export, we run
constant prop to turn these back into constants where possible before
export.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10935

Differential Revision: D9527427

Pulled By: zdevito

fbshipit-source-id: 552a8bcc01b911251dab7d7026faafdd7a3c758a
2018-08-29 17:10:24 -07:00
Adam Paszke
d9b74f6540 Make it possible to disable JIT using env variables (#10867)
Summary:
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10867

Differential Revision: D9556882

Pulled By: apaszke

fbshipit-source-id: 04c0ca875d15d37dd9ac05ac7b515cd899ddb7e4
2018-08-29 15:11:05 -07:00
James Reed
beeec47041 Sanity checks for tracing (#10841)
Summary:
TODO: integrate into torch.onnx.export -- separate PR

*Problem:* We have a facility to trace PyTorch operations on Python code, but there are several failure modes where the trace is not representative of the actual underlying computation:

* The tracer encountered dynamic control flow
* Some computation escaped the tracer, and appeared as a Constant tensor node in the graph
* Some stateful function was traced, e.g. someone did an optimization in Python by memoizing function outputs

*Objective*: In an ideal world, this whole process would be automated and the user could trust that the system will magically capture the intended semantics from the program. Realistically speaking, we will likely have to settle for a human-in-the-loop error reporting system, allowing the user to identify problems and modify the source code to allow for tracing.

*Stage 1* (this PR): Output-level checking & graph diff. torch.jit.trace gains a kwarg 'check_inputs', which is a list of tuples of input arguments. We will iterate through the list and trace the function again for each set of check inputs. We'll also interpret the original trace with these inputs and compare output values and graphs, printing a diff of the graph if there is a difference.

Examples:

```
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(4, 5),)])
def foo(x):
    y = torch.arange(0, x.shape[0]).float()
    return x + y.unsqueeze(1)
```

```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Graphs differed across invocations!
	Graph diff:
		  graph(%0 : Dynamic) {
		-   %1 : Dynamic = prim::Constant[value= 0  1  2 [ CPULongType{3} ]]()
		?                                                              ^
		+   %1 : Dynamic = prim::Constant[value= 0  1  2  3 [ CPULongType{4} ]]()
		?                                                +++              ^
		    %2 : int = prim::Constant[value=0]()
		    %3 : Dynamic = aten::_cast_Float(%1, %2)
		    %4 : int = prim::Constant[value=1]()
		    %5 : Dynamic = aten::unsqueeze(%3, %4)
		    %6 : int = prim::Constant[value=1]()
		    %7 : Dynamic = aten::add(%0, %5, %6)
		    return (%7);
		  }
	Node diff:
		- %1 : Dynamic = prim::Constant[value= 0  1  2 [ CPULongType{3} ]]()
		?                                                            ^
		+ %1 : Dynamic = prim::Constant[value= 0  1  2  3 [ CPULongType{4} ]]()
		?                                              +++              ^
	Trace source location:
		dank.py(5): foo
		/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
		dank.py(3): <module>
	Check source location:
		dank.py(5): foo
		/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(281): check_trace
		/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(408): wrapper
		dank.py(3): <module>
ERROR: Tensor-valued Constant nodes differed in value across invocations. This often indicates that the tracer has encountered untraceable code.
	Node:
		%1 : Dynamic = prim::Constant[value= 0  1  2 [ CPULongType{3} ]]()
	Source Location:
		dank.py(5): foo
		/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
		dank.py(3): <module>
	Comparison exception:
		Not equal to tolerance rtol=1e-07, atol=0

		(shapes (3,), (4,) mismatch)
		 x: array([0, 1, 2])
		 y: array([0, 1, 2, 3])

```
==

```
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)])
def foo(x):
    y = x.data
    return x + y
```

```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
ERROR: Tensor-valued Constant nodes differed in value across invocations. This often indicates that the tracer has encountered untraceable code.
	Node:
		%1 : Dynamic = prim::Constant[value=<Tensor>]()
	Source Location:
		dank.py(6): foo
		/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
		dank.py(3): <module>
	Comparison exception:
		Not equal to tolerance rtol=1e-07, atol=0

		(mismatch 100.0%)
		 x: array([0.397137, 0.956105, 0.169478, 0.560292, 0.392568, 0.108441,
		       0.97645 , 0.34412 , 0.951246, 0.793061, 0.557595, 0.770245],
		      dtype=float32)
		 y: array([0.243178, 0.315964, 0.972041, 0.0215  , 0.927751, 0.457512,
		       0.951092, 0.97883 , 0.048688, 0.118066, 0.779345, 0.271272],
		      dtype=float32)
```

==

```
import torch

@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(4, 4),)])
def foo(x):
    for _ in range(x.size(0)):
        x = torch.neg(x)
    return x
```

```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
ERROR: Graphs differed across invocations!
	Graph diff:
		  graph(%0 : Dynamic) {
		    %1 : Dynamic = aten::neg(%0)
		    %2 : Dynamic = aten::neg(%1)
		    %3 : Dynamic = aten::neg(%2)
		+   %4 : Dynamic = aten::neg(%3)
		-   return (%3);
		?            ^
		+   return (%4);
		?            ^
		  }
```

==

```
import torch

def foo(x):
    if not hasattr(foo, 'cache'):
        foo.cache = torch.neg(x)
    return x + foo.cache

traced = torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)])(foo)
```

```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
ERROR: Graphs differed across invocations!
	Graph diff:
		  graph(%0 : Dynamic) {
		-   %1 : Dynamic = aten::neg(%0)
		+   %1 : Dynamic = prim::Constant[value=<Tensor>]()
		    %2 : int = prim::Constant[value=1]()
		    %3 : Dynamic = aten::add(%0, %1, %2)
		    return (%3);
		  }
	Node diff:
		- %1 : Dynamic = aten::neg(%0)
		+ %1 : Dynamic = prim::Constant[value=<Tensor>]()
	Trace source location:
		test.py(5): foo
		/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
		test.py(8): <module>
	Check source location:
		test.py(6): foo
		/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(281): check_trace
		/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(408): wrapper
		test.py(8): <module>
```

The following two examples show instances where program semantics are lost in the Python -> trace transformation, and repeated invocation does not give us useful debug information. Further design is underway for catching these scenarios.

```
import torch

@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)])
def foo(x):
    for i in range(3):
        x[i, :] = torch.zeros(4)
    return x
```

```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
Exception:
Not equal to tolerance rtol=1e-07, atol=0

(mismatch 100.0%)
 x: array([0.830221, 0.915481, 0.940281, 0.555241], dtype=float32)
 y: array([0., 0., 0., 0.], dtype=float32)
```

==

```
import torch

@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(5, 6),)])
def foo(x):
    x.view(-1).add_(-x.view(-1))
    return x
```

```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
Exception:
Not equal to tolerance rtol=1e-07, atol=0

(mismatch 100.0%)
 x: array([0.734441, 0.445327, 0.640592, 0.30076 , 0.891674, 0.124771],
      dtype=float32)
 y: array([0., 0., 0., 0., 0., 0.], dtype=float32)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10841

Differential Revision: D9499945

Pulled By: jamesr66a

fbshipit-source-id: 1f842a32d0b0645259cc43b29700b86d99c59a45
2018-08-28 20:25:26 -07:00
Zachary DeVito
22c9bc3117 Resolve builtins using a dict rather than by name (#10927)
Summary:
Changes the approach for resolving builtin ops so that the following works

```
add = torch.add
@torch.jit.script
def foo(x):
  return add(x, x)
```

This handles cases when people alias torch and torch.nn.functional to
shorter names.

This works by building a table of id -> builtin name for the known builtin
ops in torch, torch.nn.functional, and for any user-defined
op created by accessing torch.ops.foo.bar

This allows us to clean up many SugaredValue types in the compiler.

Notes:
* we now consider any attributes on python modules to be constants
(e.g. math.pi, and torch.double).
* fixes a bug where we incorrectly allowed attribute lookup on arbitrary
Python objects. It is now restricted to modules only.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10927

Differential Revision: D9527522

Pulled By: zdevito

fbshipit-source-id: 0280422af08b4b0f48f302766d5a9c0deee47660
2018-08-28 11:25:11 -07:00
Elias Ellison
58b145f515 Fix negative indices in tracer (#10560)
Summary:
Previously, when tracing slicing & select, negative indices would get normalized, fixing the index to the size of the traced tensor. This makes the behavior the same as script, so aten::select with negative indices is emitted.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10560

Differential Revision: D9493614

Pulled By: eellison

fbshipit-source-id: ce7a8bae59863723247208d86b9f2948051ccc6c
2018-08-27 15:19:41 -07:00
Zachary DeVito
6ce799edd6 Tuples/Lists can now be inputs/outputs to script and other simple fixes. (#10812)
Summary:
* Fix the necessary pathways so that tuples and lists can be inputs to the script (see the sketch after this list).

* prevent linear algebra functions from being run in shape prop because
they frequently will error out for nonsense data.

* favor schema-driven python input conversion where possible.
remaining cases where we directly create Stacks without schema are
only for debugging

* Make the error messages when calling script/trace functions more pythonic

* Simplify FlattenTuples -- now that tuples are supported we can choose to only flatten tuples when needed. This may have to be revisited pending onnx test results, but is necessary for making tuple io work.
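
A minimal sketch (not from the PR, using current annotation syntax) of a script function that takes and returns a tuple of tensors:

```
import torch
from typing import Tuple

@torch.jit.script
def swap(pair: Tuple[torch.Tensor, torch.Tensor]) -> Tuple[torch.Tensor, torch.Tensor]:
    a, b = pair
    return b, a

print(swap((torch.zeros(1), torch.ones(1))))
```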
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10812

Differential Revision: D9477982

Pulled By: zdevito

fbshipit-source-id: ed06fc426e6ef6deb404602a26c435a7fc40ea0c
2018-08-27 14:40:40 -07:00
Richard Zou
35beecfe17 fix xfails involving literals (#10905)
Summary:
I missed these in #10900

cc apaszke jamesr66a zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10905

Differential Revision: D9516748

Pulled By: zou3519

fbshipit-source-id: a5c3e3b65a33c339d5c4e9fc160462c3d35705f3
2018-08-27 12:41:06 -07:00
Richard Zou
67f6f930a8 Remove FIXME_zerol() from test_jit.py (#10900)
Summary:
The scalar situation has gotten a lot better and now we can
remove all instances of FIXME_zerol().

cc zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10900

Differential Revision: D9514206

Pulled By: zou3519

fbshipit-source-id: e4e522f324126c5454cd6de14b832d2d1f6cb0ce
2018-08-27 08:55:08 -07:00
Adam Paszke
c8b246abf3 Prevent JIT from overspecializing to every single size configuration (#10844)
Summary:
Please review the expects carefully to make sure there are no regressions. I tried to go over them one by one when they changed, but it's sometimes easy to miss finer details.

Summary of changes:

- Renamed `TensorType` to `CompleteTensorType`. Added a new `TensorType`, which records only the scalar type, number of dimensions, and device of a value. The argument behind the rename is to encourage people to use `CompleteTensorType` less, as most passes will only have limited information available. To make the transition easier, `complete_type->cast<TensorType>()` works, and makes our passes work with both kinds of specialization if they don't need the extra detail.
- Renamed `ArgumentSpec` to `CompleteArgumentSpec`. Added a new `ArgumentSpec`, which matches argument only at the level of the new `TensorType`.
- Shape analysis can process graphs with both `CompleteTensorType` and `TensorType`.
- Fuser was a part that heavily relied on full shape information being available. Now, we simply try to fuse the largest possible graphs, and have to do run-time checks to make sure they match the code we generate. If they don't, we fall back to regular interpretation. The shape checks are implementing using an optimized method exploiting algebraic properties of shapes with broadcasting, and the relations of broadcasting with pointwise ops. A full written proof of correctness of the shape checking algorithm is included in a comment in `graph_fuser.cpp`.

zdevito ezyang mruberry ngimel csarofeen
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10844

Differential Revision: D9498705

Pulled By: apaszke

fbshipit-source-id: 0c53c2fcebd871cc2a29c260f8d012276479cc61
2018-08-26 09:54:48 -07:00
Elias Ellison
0ef5cfd28c fix ivalue printing for lists (#10777)
Summary:
Fixing the printing of IValue lists, which didn't work previously.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10777

Differential Revision: D9474264

Pulled By: eellison

fbshipit-source-id: 0c7d6e7ecaa3f7908b131ac9f1036f19ac4f8b4f
2018-08-24 16:02:03 -07:00
Elias Ellison
74e6a666b3 If none of the schema match, add ImplicitTensorToNum conversions where needed. (#10180)
Summary:
When matching schema, first try to match without adding TensorToNum conversions. Then make another pass where TensorToNum conversions are allowed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10180

Differential Revision: D9438153

Pulled By: eellison

fbshipit-source-id: 80541b5abd06e9d4187e89dda751f44dab6f58c5
2018-08-24 16:02:00 -07:00
Richard Zou
ca567862b2 Support multidimensional indexing (#10787)
Summary:
Part of #10774.

This PR does the following:
- Support ast.ExtSlice in the frontend. This is done by returning a
  list of ast.Index and ast.Slice.
- Support multidimensional indexing with ints and slices

The general approach is to desugar multidimensional indexing into
at::slice, at::select operations. This is exactly how normal pytorch
does indexing (by desugaring it into at::slice, at::select, and other ops).

I used [this code](https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/python_variable_indexing.cpp) as reference.
We should be able to copy the rest of this to implement the missing
indexing features in script (indexing with ellipses, tensors, sequences, etc).
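
A minimal sketch (my example) of multidimensional indexing with an int and a slice inside script, which is desugared into at::select / at::slice:

```
import torch

@torch.jit.script
def crop(x):
    # Select row 0, then slice columns 1:3.
    return x[0, 1:3]

print(crop(torch.arange(12).view(3, 4)))  # tensor([1, 2])
```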

After I'm done implementing the missing indexing features in future prs, I can try to
templatize python_variable_indexing.cpp so that it can work with both JIT
script and normal pytorch indexing, but right now I'm not sure if that's
a good idea or not.

cc zdevito jamesr66a apaszke wanchaol
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10787

Differential Revision: D9481402

Pulled By: zou3519

fbshipit-source-id: 78c9fa42771a037d157879e23e20b87401cf1837
2018-08-24 08:10:32 -07:00
Zachary DeVito
3d43a82440 Add support for vararg style functions. (#10250)
Summary:
Things like `zeros(1,2,3, dtype=torch.int)` are now supported in the script by altering tryMatchSchema to auto-construct the list `[1,2,3]` when it sees inlined members of the list as the last positional arguments.

I suggest reading the commits individually, since the first two incrementally change how we do tryMatchSchema to get it ready for adding vararg list conversion, while the third actually does the modification.
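
A small sketch (mine, not from the commits) of the vararg style inside script; the trailing positional ints are collected into the `int[]` size argument of the schema:

```
import torch

@torch.jit.script
def make_buffer():
    # Equivalent to torch.zeros([1, 2, 3], dtype=torch.int).
    return torch.zeros(1, 2, 3, dtype=torch.int)

print(make_buffer().shape)  # torch.Size([1, 2, 3])
```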

closes #10632
closes #8516
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10250

Differential Revision: D9478235

Pulled By: zdevito

fbshipit-source-id: 0c48caf7a6184e463d9293d97015e9884758ef9c
2018-08-23 15:10:36 -07:00
Elias Ellison
5c0eece2fd Force types on values returned from if blocks to be equivalent (#10281)
Summary:
When emitting if branches, check that the types of each value returned are equivalent. As with reassignment of values, tensors are not forced to be the same shape or subtype.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10281

Differential Revision: D9466566

Pulled By: eellison

fbshipit-source-id: 746abdeb34a0f68806b8e73726ad5003b536911c
2018-08-22 19:55:38 -07:00
Adam Paszke
f72e813c2f Allow tracing functions that take tuples of tensors as inputs (#10637)
Summary:
And return tuples.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10637

Reviewed By: eellison

Differential Revision: D9385892

Pulled By: apaszke

fbshipit-source-id: 542f4444d909fb246d7f1d88d6fb98345de2d431
2018-08-22 15:37:10 -07:00
Richard Zou
6c84f7fea0 Relax RHS type assert for augassign (#10730)
Summary:
Augassign (i.e., `x += 1`) gets desugared to an assignment of a binop (`x = x + 1`).
Right now we assert that the RHS of the binop is a tensor,
but it really doesn't have to be because we support scalar/scalar ops and also
list-list ops (i.e., `[1, 2] + [2, 3]`).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10730

Differential Revision: D9465110

Pulled By: zou3519

fbshipit-source-id: 7b118622701f09ce356aca81b8db743d9611097b
2018-08-22 15:10:33 -07:00
James Reed
6fcac354c5 Erase ListConstruct nodes for ONNX export (#10713)
Summary:
ONNX doesn't support this. Instead flatten the inputs to the ListConstruct op and inline it into the subsequent usage
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10713

Differential Revision: D9458508

Pulled By: jamesr66a

fbshipit-source-id: 0b41e69320e694bb2f304c6221864a39121e4694
2018-08-22 14:39:58 -07:00
Michael Suo
9e75ec11fb Make empty list literals construct empty Tensor[] (#10705)
Summary:
This will make the common case more natural (no need to do `_construct_empty_tensor_list()`)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10705

Differential Revision: D9411622

Pulled By: michaelsuo

fbshipit-source-id: 2d91fbc5787426748d6e1c8e7bbeee737544dc96
2018-08-20 18:28:28 -07:00
James Reed
585e6b581f Allow method-style casts on tensors (#10641)
Summary:
Closes https://github.com/pytorch/pytorch/issues/10631
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10641

Differential Revision: D9407598

Pulled By: jamesr66a

fbshipit-source-id: a0331f4e9e55d92718cde7a1112fe8c705206b1f
2018-08-20 14:10:21 -07:00
Richard Zou
f1420adfe3 Move at::chunk into the graph fuser (#10178)
Summary:
... to avoid slow at::chunk (it is slow due to tensor initialization). Picking up from #10026

This is done through the following:

1) Absorb starting chunks into FusionGroup as a part of the graph fuser
pass.
2) When compiling a kernel, emit a `std::vector<ConcatDesc>` that describes if an input (of the original graph) will be chunked.
3) When launching a kernel, use `std::vector<ConcatDesc>` to chunk an
input tensor on the CPU. This chunk directly takes in an at::Tensor and creates
four TensorInfo structs in-place in the argument list, bypassing the creation of intermediate Tensors.

- Expect test and correctness test to see if a single chunk is fused
  by the graph fuser
- Correctness test for a variety of chunks (dimension = beginning,
  middle, end) and tensors (contiguous, non-contiguous, edge case
  (splitSize = 1) for both CPU/CUDA
- Expect test for multiple chunks fused into the same kernel and
  correctness test.

cc zdevito apaszke

LSTM forward pass, 1 layer, 512 hidden size and input size, 100 seq length, requires_grad=False on all inputs and weights.

After changes:
```
thnn    cudnn   jit
8.8468  6.5797  9.3470
```

Before changes:
```
thnn    cudnn   jit
9.9221  6.6539  11.2550
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10178

Differential Revision: D9382661

Pulled By: zou3519

fbshipit-source-id: 1f8a749208fbdd45559775ce98cf4eb9558448f8
2018-08-18 16:10:11 -07:00
Richard Zou
e29b5a1ea8 graph fuser inserts explicit expands where necessary (#10325)
Summary:
Fixes #10096

If the only thing preventing a simple mappable operator from being fused
into a fusion group is that its Tensor inputs are not of the same shape as the
output, then the graph fuser inserts explicit expand nodes for those
inputs.
This helps the graph fuser not miss out on any fusion opportunities
involving simple mappable operations that have Tensor inputs. This PR
doesn't do anything for the scalar case; that can be addressed later.

Test Plan
- Simple expect test case
- Added expect tests for a raw LSTMCell. The expands help speed up the
  forwards pass by allowing more operations to be fused into the LSTMCell's single
  FusionGroup.

cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10325

Differential Revision: D9379308

Pulled By: zou3519

fbshipit-source-id: 86d2202eb97e9bb16e511667b7fe177aeaf88245
2018-08-17 16:03:46 -07:00
Richard Zou
86c9856d9c Fuse tensor-scalar ops when scalar is constant (#10511)
Summary:
This is on the way to resolving #9940.

Fixes #10501

This PR modifies graph fuser to fuse operations that have constant
scalar arguments. These constant scalar arguments are directly inlined
into the kernel body.

The context for this is that LSTM backward (in particular, sigmoid
backward) has many add(x, 1.) operations. This PR should be sufficient for
LSTM backward to get fused by the graph fuser.

cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10511

Differential Revision: D9378896

Pulled By: zou3519

fbshipit-source-id: 6a7a2987f5b6e8edaaf4b599cd200df33361650f
2018-08-17 14:10:23 -07:00
Wanchao Liang
52058204d6 Add nn functional tests in JIT (#10409)
Summary:
The PR is the first step to integrate the torch.nn library with the JIT. It adds tests for the nn functional interfaces in trace/script mode, and tries to find the differences between torch.nn.functional ops and the ATen ops, to see what work needs to be done in order to support the full set of nn functionals in script mode.

Some statistics in summary:

- Totally 84 useful functions in torch.nn.functional (the number does not include helper funcs and deprecated funcs in torch.nn.functional).

- 7 functions/ops do not support higher-order gradients, so they are just excluded from the whole test.

- 36 functions differ from the ATen op for various reasons. Among those 36 functions, a bunch (roughly 10-15) differ only in naming or are simple transformations using other ops inside the function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10409

Differential Revision: D9350694

Pulled By: wanchaol

fbshipit-source-id: 8fce6f30d8d25ace5a544a57b219fe61f5a092f8
2018-08-17 11:09:49 -07:00
Elias Ellison
e190505e84 Adding support for inlining if branches (#10084)
Summary:
Inlining if branches which have constant inputs.  If an if node gets inlined, the set of mutated variables returned by its ancestors may have changed. In the following example the block should
return a mutated set of (a) and not (a, b).

```
if cond:
    if True:
        a = a - 1
    else:
        b = b - 1
```
To calculate this we recursively update mutate variables in if branches from the leaf nodes up.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10084

Reviewed By: michaelsuo

Differential Revision: D9340429

Pulled By: eellison

fbshipit-source-id: b0dd638a5cace9fdec3130460428fca655ce4b98
2018-08-17 09:48:47 -07:00
Peter Goldsborough
c101a57a74 Build mechanism for custom operators (#10226)
Summary:
This is the last step in the custom operator implementation: providing a way to build from C++ and Python. For this I:

1. Created a `FindTorch.cmake` taken largely from ebetica with a CMake function to easily create simple custom op libraries
2. Created a ` torch/op.h` header for easy inclusion of necessary headers,
3. Created a test directory `pytorch/test/custom_operator` which includes the basic setup for a custom op.
    1. It defines an op in `op.{h,cpp}`
    2. Registers it with the JIT using `RegisterOperators`
    3. Builds it into a shared library via a `CMakeLists.txt`
    4. Binds it into Python using a `setup.py`. This step makes use of our C++ extension setup that we already have. No work, yey!

The pure C++ and the Python builds are separate and not coupled in any way.

zdevito soumith dzhulgakov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10226

Differential Revision: D9296839

Pulled By: goldsborough

fbshipit-source-id: 32f74cafb6e3d86cada8dfca8136d0dfb1f197a0
2018-08-16 18:56:17 -07:00
Owen Anderson
abf85bf0ef Perform CSE across block boundaries. (#10105)
Summary:
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10105

Differential Revision: D9186678

Pulled By: resistor

fbshipit-source-id: 87b63d4fc0c7d394edb4777acdefa8f022a8bf8d
2018-08-16 00:25:36 -07:00
James Reed
32bb4040dd Unified type annotation parsing for script frontends (#10279)
Summary:
After this, all combinations of {String frontend, Python AST Frontend}{Python 3-style type annotations, MyPy-style type comments}{Script method, Script function} should properly accept type annotations.
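
For illustration, a hedged sketch (my own functions) of the two annotation styles for script functions:

```
import torch
from torch import Tensor

@torch.jit.script
def scale_py3(x: Tensor, factor: float) -> Tensor:
    # Python 3-style annotations.
    return x * factor

@torch.jit.script
def scale_comment(x, factor):
    # type: (Tensor, float) -> Tensor
    # MyPy-style type comment.
    return x * factor

print(scale_py3(torch.ones(2), 3.0), scale_comment(torch.ones(2), 3.0))
```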

Possible TODOs:
- Clean up the functions marked HACK
- Clean up the Subscript tree-view to better match the Python AST versions
- Can we use this for Python functions? That's the only place annotations.get_signature() is still needed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10279

Differential Revision: D9319726

Pulled By: jamesr66a

fbshipit-source-id: b13f7d4f066b0283d4fc1421a1abb9305c3b28fa
2018-08-14 18:13:15 -07:00
Richard Zou
b4462511fd Add LSTMCell backward pass expect tests (#10506)
Summary:
- Exposed get_debug_graph for ScriptModule (gets the debug graph for its
  forward Method)
- Added forward/backward expect tests for lstm and milstm cells. These
  are intended to prevent regressions

cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10506

Differential Revision: D9316590

Pulled By: zou3519

fbshipit-source-id: 3c2510d8363e9733ccbc5c7cc015cd1d028efecf
2018-08-14 11:39:44 -07:00
Zachary DeVito
61bedc96f0 Schema-based creation of graph nodes (#10198)
Summary:
This commit adds the ability to insert a node with inputs, using the schema to check the inputs are valid types, fill in any default values, and perform standard implicit conversions. Since it is schema based, it will discover and use the right overload.
Constructors to `NamedValue` enable it to be constructed using `IValue` constants so it is possible to use constant values in the input list as well:

```
g.insert(aten::add, {v, 3});
```

Keyword arguments are also supported:

```
g.insert(aten::add, {v}, {{"other", t}, {"scalar", 1}});
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10198

Differential Revision: D9307252

Pulled By: zdevito

fbshipit-source-id: 644620aa85047d1eae1288383a619d50fec44d9b
2018-08-14 10:25:38 -07:00
Richard Zou
fed05cf4cf Fix prim::FusedConcat bug (#10466)
Summary:
Fixes #10456

The graph fuser was fusing together groups with prim::FusedConcat (the producer) with other ops (the consumer) if the consumer is fusable. For example,

```
import torch
@torch.jit.script
def fn(x, y, z):
    x1 = x + y
    y1 = x - y
    w = torch.cat([x1, y1])
    return w + z

x = torch.randn(2, 2, dtype=torch.float, device='cpu')
y = torch.randn(2, 2, dtype=torch.float, device='cpu')
z = torch.randn(4, 2, dtype=torch.float, device='cpu')
fn(x, y, z)
fn.graph_for(x, y, z)
```
produced the following graph:
```
graph(%x : Float(2, 2)
      %y : Float(2, 2)
      %z : Float(4, 2)) {
  %3 : int = prim::Constant[value=1]()
  %y1 : Float(2, 2) = aten::sub(%x, %y, %3)
  %8 : int = prim::Constant[value=0]()
  %14 : Float(4, 2) = prim::FusionGroup_0[device=-1](%z, %y1, %x, %y)
  return (%14);
}
with prim::FusionGroup_0 = graph(%1 : Float(4, 2)
      %5 : Float(2, 2)
      %7 : Float(2, 2)
      %8 : Float(2, 2)) {
  %11 : int = prim::Constant[value=1]()
  %9 : int = prim::Constant[value=1]()
  %x1 : Float(2, 2) = aten::add(%7, %8, %9)
  %w : Float(4, 2) = prim::FusedConcat[dim=0](%x1, %5)
  %2 : int = prim::Constant[value=1]()
  %3 : Float(4, 2) = aten::add(%w, %1, %2)
  return (%3);
}
```

this is a problem because it violates two invariants:
1) all inputs to the FusionGroup must have the same size
2) prim::FusedConcat's output must not be used inside the FusionGroup

This PR fixes this problem by checking if the output to a FusionGroup came from a prim::FusedConcat node when deciding whether to fuse the consumer and producer.
If the producer is a value that came from a prim::FusedConcat node in a FusionGroup, then consumer & producer do not get fused.

cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10466

Differential Revision: D9296686

Pulled By: zou3519

fbshipit-source-id: ed826fa9c436b42c04ca7d4d790cece804c162bd
2018-08-13 21:09:25 -07:00
iotamudelta
75651d5b58 improve use of ROCm libraries, enable more tests, small fixes (#10406)
Summary:
* some small leftovers from the last PR review
* enable more unit test sets for CI
* replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND)
* use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2
* use strided_batched gemm interface also from the batched internal interface
* re-enable Dropout.cu as we now have philox w/ rocRAND
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406

Reviewed By: Jorghi12

Differential Revision: D9277093

Pulled By: ezyang

fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2
2018-08-13 11:39:43 -07:00
Roy Li
e9ad74357e Use serialization container in ir import export (#10394)
Summary:
Copy of #10191 because these changes didn't land with the diff.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10394

Differential Revision: D9260816

Pulled By: li-roy

fbshipit-source-id: 7dc16919cfab6221fda1d44e98c5b900cfb40558
2018-08-10 00:09:30 -07:00
Michael Suo
0950d7a98d support list slicing (#10318)
Summary:
As title.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10318

Differential Revision: D9254351

Pulled By: michaelsuo

fbshipit-source-id: be891a584dc295b5e353f7f5257d64a356fb9586
2018-08-09 17:25:13 -07:00
Michael Suo
b6402648f4 fix off-by-one bug in open-ended slicing (#10286)
Summary:
Previously, `tensor[i:]` was transformed to `tensor[i:-1]`. This incorrectly leaves off the last element. Noticed this when implementing slicing for list types.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10286

Differential Revision: D9193292

Pulled By: michaelsuo

fbshipit-source-id: df372b815f9a3b8029830dd9e8769f9985a890e7
2018-08-07 00:39:42 -07:00
Michael Suo
5a7c710548 Support some basic list operations (#10225)
Summary:
Support a few basic operators:
- eq
- add
- len
- select (indexing)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10225

Differential Revision: D9172338

Pulled By: michaelsuo

fbshipit-source-id: 6e75ec1453b9589b0fb4698598ecdba5a5fccff9
2018-08-07 00:39:40 -07:00
iotamudelta
a38b572de3 enable unit tests and other changes (#10266)
Summary:
This PR for the ROCm target does the following:
* enable some unit tests on ROCm
* fix a missing static_cast that breaks BatchNorm call on ROCm
* fix BatchNorm to work on ROCm w/ ROCm warp sizes etc
* improve the pyhipify script by introducing kernel scope to some transpilations and other improvements
* fix a linking issue on ROCm
* for more unit test sets: mark currently broken tests broken (to be fixed)
* enable THINLTO (phase one) to parallelize linking
* address the first failing of the elementwise kernel by removing non-working ROCm specialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10266

Differential Revision: D9184178

Pulled By: ezyang

fbshipit-source-id: 03bcd1fe4ca4dd3241f09634dbd42b6a4c350297
2018-08-06 14:54:01 -07:00
Peter Goldsborough
0c848f4179 Python integration for custom operators (#10149)
Summary:
Adds the Python path to custom operators, including dynamically loading operations into Python.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10149

Reviewed By: ezyang

Differential Revision: D9158380

Pulled By: goldsborough

fbshipit-source-id: 3edffa639e8d2959e9e80d1bd4f20ab4a1b3ca02
2018-08-06 13:54:48 -07:00
Richard Zou
29406a2c4c Fix shared_ptr refcycle in graph executor (#10222)
Summary:
Fixes #10032

When capturing an output, GraphExecutorAutogradFunction creates
SavedVariable with is_output=False and owns it:
https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/graph_executor.cpp#L87

Constructing SavedVariable with is_output=False makes it own a copy of
the shared_ptr<GraphExecutorAutogradFunction>, which causes a reference
cycle:
6456b944fd/torch/csrc/autograd/saved_variable.cpp (L27)

The solution in this PR is to construct the SavedVariable with
is_output=True if the captured value is an output.

Test Plan

Turn on cuda memory checking for JitTestCase. If the test's name
includes "cuda" or "gpu" in it, the cuda memory checking test happens.

cc zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10222

Reviewed By: ezyang

Differential Revision: D9162995

Pulled By: zou3519

fbshipit-source-id: aeace85a09160c7a7e79cf35f6ac61eac87cbf66
2018-08-04 11:39:10 -07:00
Wanchao Liang
50cf326158 Allow type cast between int and float in Script (#10168)
Summary:
The PR allows int→float and float→int casts. Currently we only allow `tensor→int` and `tensor→float` casts.
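
A small sketch (mine, with current annotation syntax) of the newly allowed number casts in script:

```
import torch

@torch.jit.script
def mix(a: int, b: float) -> float:
    # int -> float and float -> int casts between number types.
    return float(a) + b + int(b)

print(mix(2, 1.5))  # 4.5
```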
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10168

Differential Revision: D9141163

Pulled By: wanchaol

fbshipit-source-id: 5e5591a98b4985a675641dfc9a385b2a0bf8e208
2018-08-03 10:56:05 -07:00
Michael Suo
13de6e8dfa Make list literals construct ListType (#10193)
Summary:
Previously, `foo = [bar, baz]` would construct a TupleType of fixed arity. This would cause code like:
```
foo = [2]
if True:
    foo = [2, 2]
```
to fail to compile, since `(int)` is not the same as `(int, int)`.

This PR changes things so that list literals construct ListTypes, which can be resized.

Potentially breaking changes introduced:
- Empty list literals are now disallowed, `_constructEmptyFooList()` builtins are required to replace them.
- Iterable variable unpacking where the rhs is a list is now disallowed. (Tuples still work)
- Lists must have a single type.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10193

Differential Revision: D9147166

Pulled By: michaelsuo

fbshipit-source-id: bbd1b97b0b6b7cb0e6f9d6aefa1ee9c731e63039
2018-08-03 00:55:23 -07:00
Roy Li
0e9c6898cb Export modules in ir with google protobuf
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9746

Differential Revision: D9110006

Pulled By: li-roy

fbshipit-source-id: 8b9744c042f822fdfe959a7a7fef3d0baff4f639
2018-08-02 15:54:51 -07:00
Elias Ellison
170d29769b Strings lexing, parsing, implementation in print (#9324)
Summary:
This PR adds strings to the AST and implements them for print statements. Strings are lifted as attributes to the print node. They must be arguments to print itself, not arguments to an object that is passed to print. If they are encountered elsewhere, a NYI exception will be thrown.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9324

Reviewed By: jramseyer

Differential Revision: D8807128

Pulled By: eellison

fbshipit-source-id: 984401ff458ed18d473c6d1bd86750e56c77d078
2018-08-02 11:09:03 -07:00
James Reed
9c818bfbc7 Refactor PythonValue types + use tryMatchSchema for PythonOp
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10132

Differential Revision: D9121327

Pulled By: jamesr66a

fbshipit-source-id: 6d8bcf6b0dca54106cf9ed740bcff857062a03da
2018-08-02 10:26:58 -07:00
iotamudelta
cfa05706ef ROCm contributions week 29 (#9653)
Summary:
In this changeset:
* improvements to `hipify-python.py`
* marking unit tests broken for ROCm
* reducing the number of jobs for the built to avoid out of memory issues
* switch to Thrust/cub-hip master for the CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9653

Differential Revision: D9117791

Pulled By: ezyang

fbshipit-source-id: a6c3c7b81f2bda9825974bf9bf89a97767244352
2018-08-02 09:09:00 -07:00