Commit Graph

484 Commits

Author SHA1 Message Date
David Riazati
d429e78a9a Add fractional_max_pool2d to standard lib
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14591

Differential Revision: D13270755

Pulled By: driazati

fbshipit-source-id: 138a60256795f5ef8d236c75be2cfd929059b98f
2018-12-03 23:49:38 -08:00
Michael Suo
95e5a5ae0c basic testing of builtin alias annotations (#14588)
Summary:
Check whether the codegen'd alias annotations actually track alias creation and writes correctly. This could be made more exhaustive, but it's good enough for now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14588

Differential Revision: D13312653

Pulled By: suo

fbshipit-source-id: 98de1610ea86deada71957c75c222fff331a0888
2018-12-03 22:31:02 -08:00
Wanchao Liang
119f9ec291 enable NoneValue parameter assignment for WeakScriptModule (#14715)
Summary:
This PR:

1. Handle None value attr in the WeakScriptModuleProxy
2. add back module tests that now passing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14715

Differential Revision: D13313573

Pulled By: wanchaol

fbshipit-source-id: a6b7892707350290a6d69b6f6270ad089bfc954b
2018-12-03 20:40:55 -08:00
Zachary DeVito
bb546b2e5b WAR for self.training (#14719)
Summary:
To enable self.training in script modules, this PR automatically adds a buffer called 'training' if a script method requests self.training. Assignment to self.training is overloaded to assign both to the boolean property and the tensor value.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14719

Differential Revision: D13310569

Pulled By: zdevito

fbshipit-source-id: 406387bb602f8ce5794eeff37642863c75928be5
2018-12-03 20:32:16 -08:00
Zachary DeVito
78d594f46c Implement Device as a type in the script (#14666)
Summary:
[ note:  stacked on expect files changes, will unstack once they land ]
This adds DeviceObjType (cannot use DeviceType it is already an enum)
to the type hierarchy and an isDevice/toDevice pair to IValue.
Previous hacks which used an int[] to represent Device are removed
and at::Device is used instead.

Note: the behavior or .to is only a subset of python, we need to
fix the aten op so that it accepts Option[Device] and Optional[ScalarType].
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14666

Reviewed By: suo

Differential Revision: D13290405

Pulled By: zdevito

fbshipit-source-id: 68b4381b292f5418a6a46aaa077f1c902750b134
2018-12-03 16:54:40 -08:00
Wanchao Liang
4b31572375 Meta programming on If Stmt cond to enable conditional emit blocks (#14533)
Summary:
This PR is a part of task to unblock standard library export. Basically we want enable the ability to meta program IF stmt to dynamically emit different branches base on `cond`. This is primarily used to disable certain branch compilation on If, like the below

```python
import torch

class Test(torch.jit.ScriptModule):
  def __init__(self, b = None):
    self.b = b
  def forward(self, input):
    x = input
    if self.b is not None:
      x = self.b(input)

    return x

  Test()(torch.randn(2, 3))
```
This is also the first step for us to bridge the gap between none simple value and any sugared value in JIT.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14533

Differential Revision: D13310526

Pulled By: wanchaol

fbshipit-source-id: 78d1a8127acda5e44d2a8a88f7627c43d29ff244
2018-12-03 15:47:15 -08:00
Michael Suo
9ac845f734 Revert D13280899: [pytorch][PR] Reduce broadcasted inputs in derivative code
Differential Revision:
D13280899

Original commit changeset: 80cc5ec9331b

fbshipit-source-id: 2335093cca8fd7db95470fd83b9299adfa17aa8e
2018-12-03 14:55:02 -08:00
Lu Fang
e0f68671bd Restore device when import jit script module (#14454)
Summary:
We align the restore logic to `torch.load`, we try to restore to the right device, and if the device is not available, an exception is raised. We allow user to remap the device through a parameter `map_location`, it can be 1) a string like 'cuda:0`, `cpu`, 2) a device, torch.device('cpu'), 3) a dict, {'cuda:1', 'cuda:0'}, and a function, and its signature looks like string map_location(tensor, saved_device_string).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14454

Reviewed By: zrphercule

Differential Revision: D13271956

Pulled By: houseroad

fbshipit-source-id: dfd6b6049b0dc07549ddeddf2dea03ac53ba6d49
2018-12-03 14:10:30 -08:00
David Riazati
b8da44dc13 Add linear + pixelshuffle modules to standard lib
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14654

Differential Revision: D13300968

Pulled By: driazati

fbshipit-source-id: 2c36aab91ea99681687f8da6d318981fee49785b
2018-12-03 14:01:16 -08:00
Adam Paszke
68ffe46991 Reduce broadcasted inputs in derivative code (#14485)
Summary:
Previously symbolic AD formulas assumed that no broadcasting happened,
and would return gradients of incorrect shapes (possibly leading to
silent errors later).

Fixes a few bugs (known and unknown):
- #11736
- ArgumentSpec didn't compute the input types correctly [(it didn't advance the offset for non-tensor args)](https://github.com/pytorch/pytorch/pull/14485/files#diff-4fd3157a056596aefb8cdf41022a208bR153)
- Symbolic AD could suffer from use after free (dangling pointers in grad map), because [`EliminateDeadCode` could have removed nodes](https://github.com/pytorch/pytorch/pull/14485/files#diff-25d33ad1ed6855684dec79d927ca6142L781) that referenced gradients of certain values.
- Undefined behavior in `aten::size`

During my tests I've also found a few new problems, and I have opened issues for them:
- FusionGroup seems to think that cat nodes broadcast their inputs (#14483)
- `prim::ConstantChunk` derivative formula doesn't handle undefined inputs (#14484)

This patch unfortunately deoptimizes some of our code (Fusion doesn't happen past chunk nodes, and outputs more tensors only because we have to get their size). I know how to fix those issues, but wanted to fix this terrible bug quickly.

cc zou3519 zdevito ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14485

Differential Revision: D13280899

Pulled By: soumith

fbshipit-source-id: 80cc5ec9331be80e1bb9ddfe85b81c2b997e0b0c
2018-12-03 13:44:18 -08:00
Michael Suo
b768db0810 Allow DCE to clean up some mutable ops (#14601)
Summary:
This PR makes DCE a little smarter in the presence of mutable ops. Previously mutable ops could never be cleaned up, now they can be cleaned up if we can prove there are no live uses of any alias sets that the op writes to.

This behavior is optional; if you pass DCE a block instead of a graph, it will do the same thing as before. Also changed `InlineAutographSubgraph` to use the common subgraph utils.

Tested on traced ResNet, and it gets rid of the dead code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14601

Differential Revision: D13309118

Pulled By: suo

fbshipit-source-id: dac2791e7d2ecf219ae717a2759b83c1e927f254
2018-12-03 13:31:08 -08:00
Michael Suo
9783ce3825 Revert D13272203: [pytorch][PR] [jit] Meta programming on If Stmt cond to enable conditional emit blocks
Differential Revision:
D13272203

Original commit changeset: 44a545abb766

fbshipit-source-id: 8861eb4810a6c9ea4aba8427b3a07d2fa0d69a15
2018-12-03 13:28:52 -08:00
Wanchao Liang
5a2f5a216f Make convertable to list also accepts optional
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14598

Differential Revision: D13308254

Pulled By: wanchaol

fbshipit-source-id: bd0b6f9f20294d3d589cf68732dbd8c57b67e0e9
2018-12-03 13:09:11 -08:00
Wanchao Liang
4b90702037 Meta programming on If Stmt cond to enable conditional emit blocks (#14533)
Summary:
This PR is a part of task to unblock standard library export. Basically we want enable the ability to meta program IF stmt to dynamically emit different branches base on `cond`. This is primarily used to disable certain branch compilation on If, like the below

```python
import torch

class Test(torch.jit.ScriptModule):
  def __init__(self, b = None):
    self.b = b
  def forward(self, input):
    x = input
    if self.b is not None:
      x = self.b(input)

    return x

  Test()(torch.randn(2, 3))
```
This is also the first step for us to bridge the gap between none simple value and any sugared value in JIT.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14533

Differential Revision: D13272203

Pulled By: wanchaol

fbshipit-source-id: 44a545abb766bbd39b762a6e19f9ebaa295e324b
2018-12-03 12:14:52 -08:00
Zachary DeVito
4c11dee0e8 Use Type::str() in Type::operator<< (#14657)
Summary:
Stacked on zip commit because it also changes expect files, read only the last commit.

This reduces the number of ways we can print a Type from 3 (python_str, str, operator<<) to 2.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14657

Differential Revision: D13288912

Pulled By: zdevito

fbshipit-source-id: f8dd610cea798c511c1d4327395bba54b1aa1697
2018-12-01 00:53:27 -08:00
Zachary DeVito
170ff7764f Use a zip archive as our container format (#14521)
Summary:
After consulting with Owen, who pointed out the existence of the miniz library, I decided to take one last shot at using zip as our container format.
miniz makes this surprisingly feasible and I think the benefits of using zip are large enough that we should do it.

This replaces our custom container format with a zip archive, preserving all of the
desirable features of our custom format, such as append-oriented writing, and
mmap'able tensor data while adding a bunch of debugging advantages:

1. You can unzip and explore the container to debug what is going on with a model.
2. You can edit the model using a text editor (e.g. change the definition of a method,
   or editing the json-serialized meta-data), re-zip the file use OSX's native 'Compress'
   option, and re-load the result into pytorch. Note: this enables you to, e.g., print-debug
   serialized models.
3. We can easily enable features like compression in the future.
4. Stock python , without pytorch installed, and other programming languages
   can reasonably consume this format,using json  and zipfile packages, which enables
   people to build tools like visualizers without those visualizers depending on pytorch.
   This will be especially useful if you want to, for instance, write a visualizer in javascript.

Notes:

*  This add miniz (https://github.com/richgel999/miniz) as a dependency. miniz is a self-contained
   library for reading/writing zipfiles that unlike other zip libraries also includes libz
   compatible compress/decompress support. It is a single header and a single C file without
   any other dependencies. Note that the instructions for miniz explicitly state:

   > Please use the files from the releases page in your projects. Do not use the git checkout directly!

   So we have checked in the 'release' source. Miniz supports zip64, and its API is amenable
   to doing zip-align style things to align data.

*  Removes 'size' from RecordRef. This allows you to edit files in the zip archive without
   editing the meta-data file. Very important if you want to print-debug serialized models.

*  PyTorchStreamReader/PyTorchStreamWriter keep mostly the same API (though keys become strings)
   However, their implementation is completely swapped out to use miniz.

*  Code exists to check for the old magic number to give a decent warning to our preview users
   after we change the format.

*  Container version information is now put in a stand-alone 'version' file in the archive
   and serves a similar purpose to the other container version info.

*  All files in the zip archive start at 64-byte boundaries, using an approach similar to
   zip-align. Tests check that this property remains true. While the writer does this,
   the reader doesn't depend on it, allowing user-created archives that can use compression,
   and do not have to align data.

*  Added test to check for > 4GB files and archives. Disabled by default because it takes
   almost 2 minutes to run.

*  torchscript files are now optional: if a submodule does not have methods, it will
   not be written.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14521

Reviewed By: jamesr66a

Differential Revision: D13252945

Pulled By: zdevito

fbshipit-source-id: 01209294c0f6543d0fd716f85a38532249c52f8c
2018-11-30 19:19:29 -08:00
Elias Ellison
404ad939e5 Revert existing no_grad_embedding_renorm_ from aten (#14639)
Summary:
Remove no_grad_embedding_renorm_ from aten. Setting the derivatives of the inputs to false has different semantics from calling with no_grad(), because it will not error if an input is modified and then has it's grad accessed.

Instead, make a custom op, and use NoGradGuard.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14639

Differential Revision: D13285604

Pulled By: eellison

fbshipit-source-id: c7d343fe8f22e369669e92799f167674f124ffe7
2018-11-30 16:57:51 -08:00
David Riazati
814b5715ba Move module tests to common_nn (#14578)
Summary:
This moves `new_module_tests` from `test_nn.py` to `common_nn.py` so
that they can be used in `test_jit.py` without running any of
`test_nn.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14578

Differential Revision: D13268286

Pulled By: driazati

fbshipit-source-id: 6e8654a4c29ab754d656ac83820c14d1c1843e03
2018-11-30 12:14:59 -08:00
David Riazati
89c3dbcad8 Add binary cross entropy to standard lib
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14583

Differential Revision: D13269423

Pulled By: driazati

fbshipit-source-id: 7cc1594d8189c3e8f2d4ce0462fdc0a03683006e
2018-11-29 22:23:13 -08:00
James Reed
1975917d0e fix copy_ (#14593)
Summary:
Closes https://github.com/pytorch/pytorch/issues/14590
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14593

Differential Revision: D13272510

Pulled By: jamesr66a

fbshipit-source-id: b6921a98460c371d435277c416dad0b5ab0fec8c
2018-11-29 20:31:53 -08:00
Zachary DeVito
fd31eae9ad Switch import/export to python printing (#14400)
Summary:
Stacked on https://github.com/pytorch/pytorch/pull/14378, only look at the last commit.

This changes the way methods are defined in TorchScript archives to use
PythonPrint rather than ONNX protobufs.

It also updates torch.proto to directly document the tensor data
structure actually being serialized.

Notes:
* because PythonPrint prints all the methods at once per module, this
  removes MethodDef in favor of a single torchscript_area and a separate
  caffe2_graphs entry. Note that NetDef's already have method names,
  so there is no need or a separate method name entry.
* This switches cpp/pickle area to RecordRef (references to a file in
  the container format) since it is possible the data in these arenas
  may be large and not suited to json ouput.
* Removes 'annotations' -- annotations should be re-added on the first
  commit that actually has a practical use for them. In the current state
  it is unlikely they are representing the right information.
* Some expect files have changed because PythonPrint is preserving more
  debug name information for parameter names.
* MethodEncoder (the ONNX output format) has been deleted. There is still
  some cleanup possible combining EncoderBase and GraphEncode now that there
  is only a single pathway using EncoderBase.
* This incorporates the changes from #14397
  to define TensorDef
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14400

Reviewed By: suo

Differential Revision: D13231800

Pulled By: zdevito

fbshipit-source-id: af5c1152d0bd6bca8b06c4703f59b161bb19f571
2018-11-29 17:53:49 -08:00
David Riazati
666d383a00 Add broadcast list default arg support (#14361)
Summary:
To convert `max_unpool` functions to weak script, this PR adds support
for `T` as default arguments for `BroadcastingListN[T]`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14361

Differential Revision: D13192231

Pulled By: driazati

fbshipit-source-id: a25b75a0e88ba3dfa22d6a83775e9778d735e249
2018-11-29 15:15:47 -08:00
Adam Paszke
31b3d81714 Broadcast prim::FusedConcat inputs independently when checking kernels (#14503)
Summary:
Fixes #14483.

cc zou3519 mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14503

Differential Revision: D13256343

Pulled By: zou3519

fbshipit-source-id: 1c68a23f425be067a742bada7ee8cdfab7fc3fa2
2018-11-29 13:05:00 -08:00
David Riazati
9e93a02624 Use nn module tests in test_jit (#14238)
Summary:
This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types`

Also depends on #14379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238

Differential Revision: D13252887

Pulled By: driazati

fbshipit-source-id: e9638cf74089884a32b8f0f38396cf432c02c988
2018-11-28 23:31:25 -08:00
Elias Ellison
6d63e9dbff Support Embedding + EmbeddingBag in Script + (Ignore flakey test) (#14509)
Summary:
Resubmitting PR #14415

The tests added for Embedding + EmbeddingBag had random numbers as input, which affected the random number generator & caused the flakey test to break.

Everything but the last two commits have already been accepted
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14509

Differential Revision: D13247917

Pulled By: eellison

fbshipit-source-id: ea6963c47f666c07687787e2fa82020cddc6aa15
2018-11-28 19:16:38 -08:00
Elias Ellison
105fa58748 pointwise_loss (#14134)
Summary:
Adding pointwise loss ops to weak_script
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14134

Differential Revision: D13209455

Pulled By: eellison

fbshipit-source-id: 87fc0222121f34a2f4edb24c2da2a11124b097d8
2018-11-28 18:14:38 -08:00
Edward Yang
5f07b33857 Revert D13219647: [pytorch][PR] Support Embedding + EmbeddingBag in Script
Differential Revision:
D13219647

Original commit changeset: c90706aa6fbd

fbshipit-source-id: d189e717ba0773de43d633876bc3a688830a9303
2018-11-28 13:38:58 -08:00
Elias Ellison
7749804099 Support Embedding + EmbeddingBag in Script (#14415)
Summary:
Add support for Embedding and EmbeddingBag in script. Both functions require with torch.no_grad(), which we don't have any plans to support in the near future. To work around this, I added a embedding_renorm function without derivatives.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14415

Reviewed By: wanchaol

Differential Revision: D13219647

Pulled By: eellison

fbshipit-source-id: c90706aa6fbd48686eb10f3efdb65844be7b8717
2018-11-28 10:52:30 -08:00
David Riazati
3d98810fbd Revert D13192230: [pytorch][PR] [jit] Use nn module tests in test_jit
Differential Revision:
D13192230

Original commit changeset: 36488960b6c9

fbshipit-source-id: 63b68bd909b9ef0548f52c986c84f549aecb8909
2018-11-28 00:23:09 -08:00
David Riazati
4cdcbbf410 Use nn module tests in test_jit (#14238)
Summary:
This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types`

Also depends on #14379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238

Differential Revision: D13192230

Pulled By: driazati

fbshipit-source-id: 36488960b6c91448b38c0fa65422539a93af8c5e
2018-11-27 21:19:51 -08:00
David Riazati
662f66ebb9 Add poisson_nll_loss to script
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14420

Differential Revision: D13220726

Pulled By: driazati

fbshipit-source-id: 6c08a0050075beafcc8ba413c9603b273870c70c
2018-11-27 19:39:16 -08:00
David Riazati
d75f751bec Add boolean dispatch for function overloading (#14425)
Summary:
This PR allows to overload functions based on the value of a parameter (so long as it is a constant). See max_pool1d for an example usage.

This is the first step in enabling the use of max_pool functions for the standard library that can return `Tensor` or `Tuple[Tensor, Tensor]` based on the `return_indices` flag. This will give the JIT identical results to the Python versions of the functions.

Fixes #14081
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14425

Differential Revision: D13222104

Pulled By: driazati

fbshipit-source-id: 8cb676b8b13ebcec3262234698edf4a7d7dcbbe1
2018-11-27 19:36:47 -08:00
Zachary DeVito
23f901a737 fix enable_cpu_fuser
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14440

Differential Revision: D13226354

Pulled By: zdevito

fbshipit-source-id: e4ed023eece8b5b670a4a27d24a8688907b36b90
2018-11-27 19:14:10 -08:00
Elias Ellison
82175f31b4 Move Affine grid to C++ (#14392)
Summary:
Port AffineGrid to C++, because script does not support compiling Function classes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14392

Differential Revision: D13219698

Pulled By: eellison

fbshipit-source-id: 3ddad8a84c72010b5a6c6f7f9712be614202faa6
2018-11-27 18:38:11 -08:00
Zachary DeVito
226a01e5a1 Handling of pretty-printing methods (#14378)
Summary:
Stacked on #14176, review only the last commit.
* Print parameters to methods as self.weight rather than as extra inputs.
* Print entire set of methods out as a single string
* Update test code to test the module-at-a-time export/import
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14378

Differential Revision: D13198463

Pulled By: zdevito

fbshipit-source-id: 3fab02e8239cfd6f40d6ab6399047bd02cf0a8c8
2018-11-27 17:10:23 -08:00
zrphercule
ba6c49cb9c Add test of ONNX_ATEN (#14259)
Summary:
In #14239 we fixed ONNX_ATEN.
In order to make sure its correctness in the future, we should add related test case.
We use torch.fmod() to test ONNX_ATEN.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14259

Differential Revision: D13204610

Pulled By: zrphercule

fbshipit-source-id: e4660c346e5edd201f1458b7d74d7dfac49b94c7
2018-11-27 13:51:51 -08:00
David Riazati
1b80644b4d Revert D13192228: [pytorch][PR] [jit] Add boolean dispatch for function overloading
Differential Revision:
D13192228

Original commit changeset: fce33c400c1f

fbshipit-source-id: 75c9991dc7097f9513c6c89d16eff2de6e287c3b
2018-11-27 13:14:42 -08:00
Michael Suo
3fca4bde50 Trace in-place ops (#14254)
Summary:
This PR adds a `try_outplace` option to the tracer. When `try_outplace` is true, the tracer will attempt to out-of-place ops (similar to how things are done today). When it's false, the correct in-place op is emitted.

I made `try_outplace` false by default, but flipped it to true for ONNX export utils. zdevito jamesr66a, anywhere else I should preserve the existing behavior?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14254

Reviewed By: eellison

Differential Revision: D13166691

Pulled By: suo

fbshipit-source-id: ce39fdf73ac39811c55100e567466d53108e856b
2018-11-27 12:40:56 -08:00
Zachary DeVito
e22cc7c072 Print default values and introduce ir view classes (#14176)
Summary:
[Stacked commit, only review the last commit]

This PR adds support for printing default values in python printing as well as the logic
for parsing default values back in using the parser. For simplicity, this PR simply
creates a subgraph of the constant expressions and then runs that graph to generate the defaults.
A more lightweight approach should be possible later, but would require more machinery.

To make reading code in the printer easier, this also add ir_views.h.
Similar to tree_views.h these classes can provide views of some commonly used IR nodes
that have complicated structure and common operations on that structure.

Currently it has only read-only views for prim::If and prim::Loop,
but we should eventually add helpers to manipulate If/Loop nodes as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14176

Differential Revision: D13198455

Pulled By: zdevito

fbshipit-source-id: dc99ab9692804ccaedb60a55040c0b89ac7a6a6d
2018-11-27 11:48:27 -08:00
Thomas Viehmann
8408dff55a Add Type support to the fuser, fuse more (#14336)
Summary:
This adds scalar type support to the fuser, both internally (instead of auto / assuming float) and for the inputs/outputs.
We can now fuse things with input / output of arbitrary scalar type, in particular comparisons and where work well. So it fixes #13384 by returning the right type tensor (and adds a test where byte and double tensors are returned).
The type inference is done by re-calling PropagateTensorShapeOnNode in the compilation, I would venture that it isn't prohibitively expensive compared to the actual compilation. (Propagation was fixed for where to return the second argument's type and amended to handle FusedConcat.)
I'm not sure how to add a check for the code generated by the fuser, but I am not sure we absolutely need to (we'd see if it is invalid / produces wrong results).

Thanks in particular to apaszke, fmassa, mruberry for advice and encouragement! All the errors are my own.

I have discussed order of PRs briefly with mruberry, if this goes in before he submits the PR, he graciously agreed to rebasing his, but I'd happily rebase, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14336

Differential Revision: D13202620

Pulled By: soumith

fbshipit-source-id: 855159e261fa15f21aca3053bfc05fb3f720a8ef
2018-11-27 11:33:11 -08:00
David Riazati
66c8bbf021 Add boolean dispatch for function overloading (#14081)
Summary:
This PR allows to overload functions based on the value of a parameter (so long as it is a constant). See `max_pool1d` for an example usage.

This is the first step in enabling the use of `max_pool` functions for the standard library that can return `Tensor` or `Tuple[Tensor, Tensor]` based on the `return_indices` flag. This will give the JIT identical results to the Python versions of the functions.

Depends on #14232 for `Optional[BroadcastingList[T]]`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14081

Differential Revision: D13192228

Pulled By: driazati

fbshipit-source-id: fce33c400c1fd06e59747d98507c5fdcd8d4c113
2018-11-27 10:51:32 -08:00
Richard Zou
b13f91dbd9 Allow graph fuser to move chunks past multiple nodes. (#14055)
Summary:
Fixes #12290. Also speeds up JIT LSTM forward pass from 8.8ms to 7.8ms; previously, each JIT lstm cell used 2 fused kernels. Now, it only uses one fused kernel (which is how many kernels cudnn uses).

Explanation:

Let f, g, h be fusible ops.
```
x = f(v, w)
z = g(x, y)
a, b = chunk(z)
c = h(a, b)
```
becomes (before this PR):
```
x = f(v, w)
x', y' = broadcast_tensors([x, y])
ax, bx = chunk(x')
ay, by = chunk(y')
a = g(ax, ay)
b = g(bx, by)
c = h(a, b)
```
The graph fuser then puts g, g, and h into one FusionGroup and is unable
to move `x = f(v, w)` into the FusionGroup.

This PR lets the graph fuser move `x = f(v, w)` into the FusionGroup.
It does this by abstracting the broadcast_tensors + multiple chunk nodes
into one intermediate `prim::BroadcastingChunk[chunks, dim]` node.

A `BroadcastingChunk[chunks, dim](*inputs)` node is equivalent to:
- broadcasting all of *inputs
- chunk-ing each broadcasted input into `chunks` chunks along dim `dim`.

Abstracting the broadcasting chunk behavior away, it is now a lot easier
for the graph fuser to move (broadcast + chunk) past an operation. After
this PR, the above graph becomes:
```
x = f(v, w)
ax, bx, ay, by = BroadcastingChunk(x, y)
a = g(ax, ay)
b = g(bx, by)
c = h(a, b)
```
Now, to move `x = f(v, w)` after the BroadcastingChunk, one just needs
to add f's operands to the BroadcastingChunk:
```
ay, by, av, bv, aw, bw = BroadcastingChunk(y, v, w)
ax = f(av, aw)
by = f(bv, bw)
a = g(ax, ay)
b = g(bx, by)
c = h(a, b)
```

cc apaszke mruberry zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14055

Differential Revision: D13159259

Pulled By: zou3519

fbshipit-source-id: 134e9e645c950384d9be6a06a883a10e17a73d7d
2018-11-26 12:31:49 -08:00
Michael Suo
2fa3c8327c fix tensor advanced indexing with assignment (#14311)
Summary:
Fix a mishandling of `foo[a] = b` when `a` was a tensor. We were assigning to a copy of `foo`, not a view of it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14311

Differential Revision: D13196109

Pulled By: suo

fbshipit-source-id: c929401fda7c4a27622d3fe2b11278b08a7f17f1
2018-11-26 12:10:48 -08:00
Adam Paszke
a60368982b Batch more matrix multiplies (#13456)
Summary:
This handles the input pre-multiplication in RNNs, yielding pretty significant speedups in backward times. This pass depends on loop unrolling, so we'll batch only as many elements as the unrolling factor allows.

cc mruberry ngimel zou3519 zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13456

Differential Revision: D12920339

Pulled By: zou3519

fbshipit-source-id: 5bcd6d259c054a6dea02ae09a9fdf9f030856443
2018-11-26 09:20:35 -08:00
Wanchao Liang
7fc34a4122 Convert gumbel_softmax, lp pooling weak functions and modules (#14232)
Summary:
1. Support `Optional[BroadcastingList1[int]]` like type annotation to accept a int or a list[int]
2. Convert gumbel_softmax, lp pooling weak functions and modules
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14232

Differential Revision: D13164506

Pulled By: wanchaol

fbshipit-source-id: 6c2a2b9a0613bfe907dbb5934122656ce2b05700
2018-11-21 23:44:24 -08:00
David Riazati
d9cdcc9a3b Add list inequality operator (#14129)
Summary:
This PR adds `aten::neq` for list inequality comparisons and converts
`nll_loss` to weak script
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14129

Differential Revision: D13123894

Pulled By: driazati

fbshipit-source-id: 8c1edf7c163217ec00eb653f95d196db3998613f
2018-11-21 16:32:58 -08:00
Zachary DeVito
788d2e87bd Address jittering issues in python_print (#14064)
Summary:
export - print a method with python_print
import - import a method with import_method

We want to ensure:

    export(g) == export(import(export(g)))

That is after after exporting/importing once, the graph will stay exactly
the same. This is less strict that g == import(export(g)) which would
require us to maintain a lot more information about the structure of the
IR and about the names of debug symbols.

This PR addresses this with the following fixes:
* print out double-precision numbers with high enough precision such
  that they always parse in the same way
* when creating loop-carried dependencies, sort them
  by variable name, ensuring a consistent order
* parse nan correctly
* DCE: remove unused outputs of if statements, and loop-carried dependencies
  in loops that are dead both after the loop and inside the body of the
  loop.
* Do not set uniqueName for variables whose names are _[0-9]+, these
  are probably rare in user code, and we need a way to communicate
  that we do not care about a variable name when re-parsing the graph.
  Otherwise temporary variable names will jitter around.
* Expand the definition of a constant in printing code to None,
  and family.
* Allow re-treeing to work as long as the only thing in its way is a
  constant node. These do not have side effects but are sometimes
  inserted in a different order when tracing compared to how we print them.
* Print all constant nodes out first in the order in which they are used_val
 (or, if they are inlined, ensure they get assigned CONSTANT.cX number
  in a consistent order). Cleanup tuples (this is done in the compiler,
  but not in the tracer, leading to some tuple indexing jitter if not
  done).
* use strtod_l, not std::stod which can throw exceptions

Other:
* Add REL_WITH_DEB_INFO to setup.py. It already existed for the
  cmake files. Threading it into setup.py allows us to turn on
  debug symbols with optimization everywhere.
* enable round trip testing for all generated graphs. This only adds
  ~6 seconds to total build time but tests printing for every graph.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14064

Differential Revision: D13094637

Pulled By: zdevito

fbshipit-source-id: 0a1c6912194d965f15d6b0c6cf838ccc551f161d
2018-11-21 06:38:29 -08:00
David Riazati
8f20d40bb7 Allow undefined tensors as constants (#14120)
Summary:
This PR inserts `prim::None` constants for undefined tensors. This comes in the standard library if an `Optional[Tensor]` is statically determined to be `None`:

```python
torch.jit.script
def fn(x=None):
    # type: (Optional[Tensor]) -> Tensor
    return torch.jit._unwrap_optional(x)

torch.jit.script
def fn2():
    # type: () -> Tensor
    return fn()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14120

Differential Revision: D13124625

Pulled By: driazati

fbshipit-source-id: 9eaa82e478c49c503f68ed89d8c770e8273ea569
2018-11-20 16:54:27 -08:00
Wanchao Liang
d6bfc53b9e Export BatchNorm functional and module, add necessary JIT support (#14016)
Summary:
This PR did three things:

1. It export the BatchNorm functional and module, and rewrite some of the components to stay align with the current supported JIT features
2. In the process of export, add necessary compiler support for in_place op aug assign
4. change the test_jit behavior in add_module_test to utilize a single rng state during module initialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14016

Differential Revision: D13112064

Pulled By: wanchaol

fbshipit-source-id: 31e3aee5fbb509673c781e7dbb6d8884cfa55d91
2018-11-20 14:15:06 -08:00
Thomas Viehmann
1256cbaa69 Relax limits for gradients in test_jit's checkGraph (#14094)
Summary:
- This should help TestJit.test_lstm_fusion_concat_cuda
  to be less flaky. (Checked on manual_seed 0..99)
  Fixes: #14026
- Revert the renaming of test_fused_abs that was introduced
  to game the order of tests to avoid the flakiness above.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14094

Differential Revision: D13100174

Pulled By: soumith

fbshipit-source-id: 91bb63b07a960a81dddfc0bf25c67696c0f6c46d
2018-11-16 11:43:52 -08:00