pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
David Riazati	a23863fd6f	Add Pooling modules to Script (#14527 ) Summary: Depends on #14584 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14527 Differential Revision: D13270773 Pulled By: driazati fbshipit-source-id: e4acd43ccbce0f4b62d41c30ce8d5c721171e19a	2018-12-03 23:55:04 -08:00
David Riazati	d429e78a9a	Add fractional_max_pool2d to standard lib Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14591 Differential Revision: D13270755 Pulled By: driazati fbshipit-source-id: 138a60256795f5ef8d236c75be2cfd929059b98f	2018-12-03 23:49:38 -08:00
Michael Suo	95e5a5ae0c	basic testing of builtin alias annotations (#14588 ) Summary: Check whether the codegen'd alias annotations actually track alias creation and writes correctly. This could be made more exhaustive, but it's good enough for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14588 Differential Revision: D13312653 Pulled By: suo fbshipit-source-id: 98de1610ea86deada71957c75c222fff331a0888	2018-12-03 22:31:02 -08:00
Wanchao Liang	119f9ec291	enable NoneValue parameter assignment for WeakScriptModule (#14715 ) Summary: This PR: 1. Handles None value attr in the WeakScriptModuleProxy 2. Adds back module tests that are now passing Pull Request resolved: https://github.com/pytorch/pytorch/pull/14715 Differential Revision: D13313573 Pulled By: wanchaol fbshipit-source-id: a6b7892707350290a6d69b6f6270ad089bfc954b	2018-12-03 20:40:55 -08:00
Zachary DeVito	bb546b2e5b	WAR for self.training (#14719 ) Summary: To enable self.training in script modules, this PR automatically adds a buffer called 'training' if a script method requests self.training. Assignment to self.training is overloaded to assign both to the boolean property and the tensor value. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14719 Differential Revision: D13310569 Pulled By: zdevito fbshipit-source-id: 406387bb602f8ce5794eeff37642863c75928be5	2018-12-03 20:32:16 -08:00
Zachary DeVito	78d594f46c	Implement Device as a type in the script (#14666 ) Summary: [ note: stacked on expect files changes, will unstack once they land ] This adds DeviceObjType (cannot use DeviceType, it is already an enum) to the type hierarchy and an isDevice/toDevice pair to IValue. Previous hacks which used an int[] to represent Device are removed and at::Device is used instead. Note: the behavior of .to is only a subset of python, we need to fix the aten op so that it accepts Optional[Device] and Optional[ScalarType]. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14666 Reviewed By: suo Differential Revision: D13290405 Pulled By: zdevito fbshipit-source-id: 68b4381b292f5418a6a46aaa077f1c902750b134	2018-12-03 16:54:40 -08:00
Wanchao Liang	4b31572375	Meta programming on If Stmt cond to enable conditional emit blocks (#14533 ) Summary: This PR is part of the task to unblock standard library export. Basically we want to enable the ability to meta program an If stmt to dynamically emit different branches based on `cond`. This is primarily used to disable compilation of certain branches of an If, like the below ```python import torch class Test(torch.jit.ScriptModule): def __init__(self, b = None): self.b = b def forward(self, input): x = input if self.b is not None: x = self.b(input) return x Test()(torch.randn(2, 3)) ``` This is also the first step for us to bridge the gap between none simple value and any sugared value in JIT. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14533 Differential Revision: D13310526 Pulled By: wanchaol fbshipit-source-id: 78d1a8127acda5e44d2a8a88f7627c43d29ff244	2018-12-03 15:47:15 -08:00
Michael Suo	9ac845f734	Revert D13280899: [pytorch][PR] Reduce broadcasted inputs in derivative code Differential Revision: D13280899 Original commit changeset: 80cc5ec9331b fbshipit-source-id: 2335093cca8fd7db95470fd83b9299adfa17aa8e	2018-12-03 14:55:02 -08:00
Lu Fang	e0f68671bd	Restore device when import jit script module (#14454 ) Summary: We align the restore logic with `torch.load`: we try to restore to the right device, and if the device is not available, an exception is raised. We allow the user to remap the device through a parameter `map_location`, which can be 1) a string like `cuda:0` or `cpu`, 2) a device, torch.device('cpu'), 3) a dict, {'cuda:1': 'cuda:0'}, or 4) a function whose signature looks like string map_location(tensor, saved_device_string). Pull Request resolved: https://github.com/pytorch/pytorch/pull/14454 Reviewed By: zrphercule Differential Revision: D13271956 Pulled By: houseroad fbshipit-source-id: dfd6b6049b0dc07549ddeddf2dea03ac53ba6d49	2018-12-03 14:10:30 -08:00
David Riazati	b8da44dc13	Add linear + pixelshuffle modules to standard lib Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14654 Differential Revision: D13300968 Pulled By: driazati fbshipit-source-id: 2c36aab91ea99681687f8da6d318981fee49785b	2018-12-03 14:01:16 -08:00
Adam Paszke	68ffe46991	Reduce broadcasted inputs in derivative code (#14485 ) Summary: Previously symbolic AD formulas assumed that no broadcasting happened, and would return gradients of incorrect shapes (possibly leading to silent errors later). Fixes a few bugs (known and unknown): - #11736 - ArgumentSpec didn't compute the input types correctly [(it didn't advance the offset for non-tensor args)](https://github.com/pytorch/pytorch/pull/14485/files#diff-4fd3157a056596aefb8cdf41022a208bR153) - Symbolic AD could suffer from use after free (dangling pointers in grad map), because [`EliminateDeadCode` could have removed nodes](https://github.com/pytorch/pytorch/pull/14485/files#diff-25d33ad1ed6855684dec79d927ca6142L781) that referenced gradients of certain values. - Undefined behavior in `aten::size` During my tests I've also found a few new problems, and I have opened issues for them: - FusionGroup seems to think that cat nodes broadcast their inputs (#14483) - `prim::ConstantChunk` derivative formula doesn't handle undefined inputs (#14484) This patch unfortunately deoptimizes some of our code (Fusion doesn't happen past chunk nodes, and outputs more tensors only because we have to get their size). I know how to fix those issues, but wanted to fix this terrible bug quickly. cc zou3519 zdevito ngimel Pull Request resolved: https://github.com/pytorch/pytorch/pull/14485 Differential Revision: D13280899 Pulled By: soumith fbshipit-source-id: 80cc5ec9331be80e1bb9ddfe85b81c2b997e0b0c	2018-12-03 13:44:18 -08:00
Michael Suo	b768db0810	Allow DCE to clean up some mutable ops (#14601 ) Summary: This PR makes DCE a little smarter in the presence of mutable ops. Previously mutable ops could never be cleaned up, now they can be cleaned up if we can prove there are no live uses of any alias sets that the op writes to. This behavior is optional; if you pass DCE a block instead of a graph, it will do the same thing as before. Also changed `InlineAutographSubgraph` to use the common subgraph utils. Tested on traced ResNet, and it gets rid of the dead code. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14601 Differential Revision: D13309118 Pulled By: suo fbshipit-source-id: dac2791e7d2ecf219ae717a2759b83c1e927f254	2018-12-03 13:31:08 -08:00
Michael Suo	9783ce3825	Revert D13272203: [pytorch][PR] [jit] Meta programming on If Stmt cond to enable conditional emit blocks Differential Revision: D13272203 Original commit changeset: 44a545abb766 fbshipit-source-id: 8861eb4810a6c9ea4aba8427b3a07d2fa0d69a15	2018-12-03 13:28:52 -08:00
Wanchao Liang	5a2f5a216f	Make convertible to list also accept optional Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14598 Differential Revision: D13308254 Pulled By: wanchaol fbshipit-source-id: bd0b6f9f20294d3d589cf68732dbd8c57b67e0e9	2018-12-03 13:09:11 -08:00
Wanchao Liang	4b90702037	Meta programming on If Stmt cond to enable conditional emit blocks (#14533 ) Summary: This PR is part of the task to unblock standard library export. Basically we want to enable the ability to meta program an If stmt to dynamically emit different branches based on `cond`. This is primarily used to disable compilation of certain branches of an If, like the below ```python import torch class Test(torch.jit.ScriptModule): def __init__(self, b = None): self.b = b def forward(self, input): x = input if self.b is not None: x = self.b(input) return x Test()(torch.randn(2, 3)) ``` This is also the first step for us to bridge the gap between none simple value and any sugared value in JIT. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14533 Differential Revision: D13272203 Pulled By: wanchaol fbshipit-source-id: 44a545abb766bbd39b762a6e19f9ebaa295e324b	2018-12-03 12:14:52 -08:00
Zachary DeVito	4c11dee0e8	Use Type::str() in Type::operator<< (#14657 ) Summary: Stacked on zip commit because it also changes expect files, read only the last commit. This reduces the number of ways we can print a Type from 3 (python_str, str, operator<<) to 2. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14657 Differential Revision: D13288912 Pulled By: zdevito fbshipit-source-id: f8dd610cea798c511c1d4327395bba54b1aa1697	2018-12-01 00:53:27 -08:00
Zachary DeVito	170ff7764f	Use a zip archive as our container format (#14521 ) Summary: After consulting with Owen, who pointed out the existence of the miniz library, I decided to take one last shot at using zip as our container format. miniz makes this surprisingly feasible and I think the benefits of using zip are large enough that we should do it. This replaces our custom container format with a zip archive, preserving all of the desirable features of our custom format, such as append-oriented writing, and mmap'able tensor data while adding a bunch of debugging advantages: 1. You can unzip and explore the container to debug what is going on with a model. 2. You can edit the model using a text editor (e.g. change the definition of a method, or edit the json-serialized meta-data), re-zip the file using OSX's native 'Compress' option, and re-load the result into pytorch. Note: this enables you to, e.g., print-debug serialized models. 3. We can easily enable features like compression in the future. 4. Stock python, without pytorch installed, and other programming languages can reasonably consume this format, using json and zipfile packages, which enables people to build tools like visualizers without those visualizers depending on pytorch. This will be especially useful if you want to, for instance, write a visualizer in javascript. Notes: * This adds miniz (https://github.com/richgel999/miniz) as a dependency. miniz is a self-contained library for reading/writing zipfiles that unlike other zip libraries also includes libz compatible compress/decompress support. It is a single header and a single C file without any other dependencies. Note that the instructions for miniz explicitly state: > Please use the files from the releases page in your projects. Do not use the git checkout directly! So we have checked in the 'release' source. Miniz supports zip64, and its API is amenable to doing zip-align style things to align data. * Removes 'size' from RecordRef. This allows you to edit files in the zip archive without editing the meta-data file. Very important if you want to print-debug serialized models. * PyTorchStreamReader/PyTorchStreamWriter keep mostly the same API (though keys become strings). However, their implementation is completely swapped out to use miniz. * Code exists to check for the old magic number to give a decent warning to our preview users after we change the format. * Container version information is now put in a stand-alone 'version' file in the archive and serves a similar purpose to the other container version info. * All files in the zip archive start at 64-byte boundaries, using an approach similar to zip-align. Tests check that this property remains true. While the writer does this, the reader doesn't depend on it, allowing user-created archives that can use compression, and do not have to align data. * Added test to check for > 4GB files and archives. Disabled by default because it takes almost 2 minutes to run. * torchscript files are now optional: if a submodule does not have methods, it will not be written. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14521 Reviewed By: jamesr66a Differential Revision: D13252945 Pulled By: zdevito fbshipit-source-id: 01209294c0f6543d0fd716f85a38532249c52f8c	2018-11-30 19:19:29 -08:00
Elias Ellison	404ad939e5	Revert existing no_grad_embedding_renorm_ from aten (#14639 ) Summary: Remove no_grad_embedding_renorm_ from aten. Setting the derivatives of the inputs to false has different semantics from calling with no_grad(), because it will not error if an input is modified and then has its grad accessed. Instead, make a custom op, and use NoGradGuard. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14639 Differential Revision: D13285604 Pulled By: eellison fbshipit-source-id: c7d343fe8f22e369669e92799f167674f124ffe7	2018-11-30 16:57:51 -08:00
David Riazati	814b5715ba	Move module tests to common_nn (#14578 ) Summary: This moves `new_module_tests` from `test_nn.py` to `common_nn.py` so that they can be used in `test_jit.py` without running any of `test_nn.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/14578 Differential Revision: D13268286 Pulled By: driazati fbshipit-source-id: 6e8654a4c29ab754d656ac83820c14d1c1843e03	2018-11-30 12:14:59 -08:00
David Riazati	89c3dbcad8	Add binary cross entropy to standard lib Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14583 Differential Revision: D13269423 Pulled By: driazati fbshipit-source-id: 7cc1594d8189c3e8f2d4ce0462fdc0a03683006e	2018-11-29 22:23:13 -08:00
James Reed	1975917d0e	fix copy_ (#14593 ) Summary: Closes https://github.com/pytorch/pytorch/issues/14590 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14593 Differential Revision: D13272510 Pulled By: jamesr66a fbshipit-source-id: b6921a98460c371d435277c416dad0b5ab0fec8c	2018-11-29 20:31:53 -08:00
Zachary DeVito	fd31eae9ad	Switch import/export to python printing (#14400 ) Summary: Stacked on https://github.com/pytorch/pytorch/pull/14378, only look at the last commit. This changes the way methods are defined in TorchScript archives to use PythonPrint rather than ONNX protobufs. It also updates torch.proto to directly document the tensor data structure actually being serialized. Notes: * because PythonPrint prints all the methods at once per module, this removes MethodDef in favor of a single torchscript_area and a separate caffe2_graphs entry. Note that NetDef's already have method names, so there is no need for a separate method name entry. * This switches cpp/pickle area to RecordRef (references to a file in the container format) since it is possible the data in these arenas may be large and not suited to json output. * Removes 'annotations' -- annotations should be re-added on the first commit that actually has a practical use for them. In the current state it is unlikely they are representing the right information. * Some expect files have changed because PythonPrint is preserving more debug name information for parameter names. * MethodEncoder (the ONNX output format) has been deleted. There is still some cleanup possible combining EncoderBase and GraphEncode now that there is only a single pathway using EncoderBase. * This incorporates the changes from #14397 to define TensorDef Pull Request resolved: https://github.com/pytorch/pytorch/pull/14400 Reviewed By: suo Differential Revision: D13231800 Pulled By: zdevito fbshipit-source-id: af5c1152d0bd6bca8b06c4703f59b161bb19f571	2018-11-29 17:53:49 -08:00
David Riazati	666d383a00	Add broadcast list default arg support (#14361 ) Summary: To convert `max_unpool` functions to weak script, this PR adds support for `T` as default arguments for `BroadcastingListN[T]`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14361 Differential Revision: D13192231 Pulled By: driazati fbshipit-source-id: a25b75a0e88ba3dfa22d6a83775e9778d735e249	2018-11-29 15:15:47 -08:00
Adam Paszke	31b3d81714	Broadcast prim::FusedConcat inputs independently when checking kernels (#14503 ) Summary: Fixes #14483. cc zou3519 mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/14503 Differential Revision: D13256343 Pulled By: zou3519 fbshipit-source-id: 1c68a23f425be067a742bada7ee8cdfab7fc3fa2	2018-11-29 13:05:00 -08:00
David Riazati	9e93a02624	Use nn module tests in test_jit (#14238 ) Summary: This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types` Also depends on #14379 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238 Differential Revision: D13252887 Pulled By: driazati fbshipit-source-id: e9638cf74089884a32b8f0f38396cf432c02c988	2018-11-28 23:31:25 -08:00
Elias Ellison	6d63e9dbff	Support Embedding + EmbeddingBag in Script + (Ignore flakey test) (#14509 ) Summary: Resubmitting PR #14415 The tests added for Embedding + EmbeddingBag had random numbers as input, which affected the random number generator & caused the flakey test to break. Everything but the last two commits have already been accepted Pull Request resolved: https://github.com/pytorch/pytorch/pull/14509 Differential Revision: D13247917 Pulled By: eellison fbshipit-source-id: ea6963c47f666c07687787e2fa82020cddc6aa15	2018-11-28 19:16:38 -08:00
Elias Ellison	105fa58748	pointwise_loss (#14134 ) Summary: Adding pointwise loss ops to weak_script Pull Request resolved: https://github.com/pytorch/pytorch/pull/14134 Differential Revision: D13209455 Pulled By: eellison fbshipit-source-id: 87fc0222121f34a2f4edb24c2da2a11124b097d8	2018-11-28 18:14:38 -08:00
Edward Yang	5f07b33857	Revert D13219647: [pytorch][PR] Support Embedding + EmbeddingBag in Script Differential Revision: D13219647 Original commit changeset: c90706aa6fbd fbshipit-source-id: d189e717ba0773de43d633876bc3a688830a9303	2018-11-28 13:38:58 -08:00
Elias Ellison	7749804099	Support Embedding + EmbeddingBag in Script (#14415 ) Summary: Add support for Embedding and EmbeddingBag in script. Both functions require `with torch.no_grad()`, which we don't have any plans to support in the near future. To work around this, I added an embedding_renorm function without derivatives. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14415 Reviewed By: wanchaol Differential Revision: D13219647 Pulled By: eellison fbshipit-source-id: c90706aa6fbd48686eb10f3efdb65844be7b8717	2018-11-28 10:52:30 -08:00
David Riazati	3d98810fbd	Revert D13192230: [pytorch][PR] [jit] Use nn module tests in test_jit Differential Revision: D13192230 Original commit changeset: 36488960b6c9 fbshipit-source-id: 63b68bd909b9ef0548f52c986c84f549aecb8909	2018-11-28 00:23:09 -08:00
David Riazati	4cdcbbf410	Use nn module tests in test_jit (#14238 ) Summary: This PR adds weak modules for all activation modules and uses `test_nn` module tests to test weak modules that have been annotated with `weak_module` and therefore are in `torch._jit_internal._weak_types` Also depends on #14379 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14238 Differential Revision: D13192230 Pulled By: driazati fbshipit-source-id: 36488960b6c91448b38c0fa65422539a93af8c5e	2018-11-27 21:19:51 -08:00
David Riazati	662f66ebb9	Add poisson_nll_loss to script Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14420 Differential Revision: D13220726 Pulled By: driazati fbshipit-source-id: 6c08a0050075beafcc8ba413c9603b273870c70c	2018-11-27 19:39:16 -08:00
David Riazati	d75f751bec	Add boolean dispatch for function overloading (#14425 ) Summary: This PR allows overloading functions based on the value of a parameter (so long as it is a constant). See max_pool1d for an example usage. This is the first step in enabling the use of max_pool functions for the standard library that can return `Tensor` or `Tuple[Tensor, Tensor]` based on the `return_indices` flag. This will give the JIT identical results to the Python versions of the functions. Fixes #14081 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14425 Differential Revision: D13222104 Pulled By: driazati fbshipit-source-id: 8cb676b8b13ebcec3262234698edf4a7d7dcbbe1	2018-11-27 19:36:47 -08:00
Zachary DeVito	23f901a737	fix enable_cpu_fuser Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14440 Differential Revision: D13226354 Pulled By: zdevito fbshipit-source-id: e4ed023eece8b5b670a4a27d24a8688907b36b90	2018-11-27 19:14:10 -08:00
Elias Ellison	82175f31b4	Move Affine grid to C++ (#14392 ) Summary: Port AffineGrid to C++, because script does not support compiling Function classes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14392 Differential Revision: D13219698 Pulled By: eellison fbshipit-source-id: 3ddad8a84c72010b5a6c6f7f9712be614202faa6	2018-11-27 18:38:11 -08:00
Zachary DeVito	226a01e5a1	Handling of pretty-printing methods (#14378 ) Summary: Stacked on #14176, review only the last commit. * Print parameters to methods as self.weight rather than as extra inputs. * Print entire set of methods out as a single string * Update test code to test the module-at-a-time export/import Pull Request resolved: https://github.com/pytorch/pytorch/pull/14378 Differential Revision: D13198463 Pulled By: zdevito fbshipit-source-id: 3fab02e8239cfd6f40d6ab6399047bd02cf0a8c8	2018-11-27 17:10:23 -08:00
zrphercule	ba6c49cb9c	Add test of ONNX_ATEN (#14259 ) Summary: In #14239 we fixed ONNX_ATEN. To ensure its correctness in the future, we should add a related test case. We use torch.fmod() to test ONNX_ATEN. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14259 Differential Revision: D13204610 Pulled By: zrphercule fbshipit-source-id: e4660c346e5edd201f1458b7d74d7dfac49b94c7	2018-11-27 13:51:51 -08:00
David Riazati	1b80644b4d	Revert D13192228: [pytorch][PR] [jit] Add boolean dispatch for function overloading Differential Revision: D13192228 Original commit changeset: fce33c400c1f fbshipit-source-id: 75c9991dc7097f9513c6c89d16eff2de6e287c3b	2018-11-27 13:14:42 -08:00
Michael Suo	3fca4bde50	Trace in-place ops (#14254 ) Summary: This PR adds a `try_outplace` option to the tracer. When `try_outplace` is true, the tracer will attempt to emit out-of-place ops (similar to how things are done today). When it's false, the correct in-place op is emitted. I made `try_outplace` false by default, but flipped it to true for ONNX export utils. zdevito jamesr66a, anywhere else I should preserve the existing behavior? Pull Request resolved: https://github.com/pytorch/pytorch/pull/14254 Reviewed By: eellison Differential Revision: D13166691 Pulled By: suo fbshipit-source-id: ce39fdf73ac39811c55100e567466d53108e856b	2018-11-27 12:40:56 -08:00
Zachary DeVito	e22cc7c072	Print default values and introduce ir view classes (#14176 ) Summary: [Stacked commit, only review the last commit] This PR adds support for printing default values in python printing as well as the logic for parsing default values back in using the parser. For simplicity, this PR simply creates a subgraph of the constant expressions and then runs that graph to generate the defaults. A more lightweight approach should be possible later, but would require more machinery. To make reading code in the printer easier, this also adds ir_views.h. Similar to tree_views.h, these classes can provide views of some commonly used IR nodes that have complicated structure and common operations on that structure. Currently it has only read-only views for prim::If and prim::Loop, but we should eventually add helpers to manipulate If/Loop nodes as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14176 Differential Revision: D13198455 Pulled By: zdevito fbshipit-source-id: dc99ab9692804ccaedb60a55040c0b89ac7a6a6d	2018-11-27 11:48:27 -08:00
Thomas Viehmann	8408dff55a	Add Type support to the fuser, fuse more (#14336 ) Summary: This adds scalar type support to the fuser, both internally (instead of auto / assuming float) and for the inputs/outputs. We can now fuse things with input / output of arbitrary scalar type, in particular comparisons and where work well. So it fixes #13384 by returning the right type tensor (and adds a test where byte and double tensors are returned). The type inference is done by re-calling PropagateTensorShapeOnNode in the compilation, I would venture that it isn't prohibitively expensive compared to the actual compilation. (Propagation was fixed for where to return the second argument's type and amended to handle FusedConcat.) I'm not sure how to add a check for the code generated by the fuser, but I am not sure we absolutely need to (we'd see if it is invalid / produces wrong results). Thanks in particular to apaszke, fmassa, mruberry for advice and encouragement! All the errors are my own. I have discussed order of PRs briefly with mruberry, if this goes in before he submits the PR, he graciously agreed to rebasing his, but I'd happily rebase, too. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14336 Differential Revision: D13202620 Pulled By: soumith fbshipit-source-id: 855159e261fa15f21aca3053bfc05fb3f720a8ef	2018-11-27 11:33:11 -08:00
David Riazati	66c8bbf021	Add boolean dispatch for function overloading (#14081 ) Summary: This PR allows overloading functions based on the value of a parameter (so long as it is a constant). See `max_pool1d` for an example usage. This is the first step in enabling the use of `max_pool` functions for the standard library that can return `Tensor` or `Tuple[Tensor, Tensor]` based on the `return_indices` flag. This will give the JIT identical results to the Python versions of the functions. Depends on #14232 for `Optional[BroadcastingList[T]]` Pull Request resolved: https://github.com/pytorch/pytorch/pull/14081 Differential Revision: D13192228 Pulled By: driazati fbshipit-source-id: fce33c400c1fd06e59747d98507c5fdcd8d4c113	2018-11-27 10:51:32 -08:00
Richard Zou	b13f91dbd9	Allow graph fuser to move chunks past multiple nodes. (#14055 ) Summary: Fixes #12290. Also speeds up JIT LSTM forward pass from 8.8ms to 7.8ms; previously, each JIT lstm cell used 2 fused kernels. Now, it only uses one fused kernel (which is how many kernels cudnn uses). Explanation: Let f, g, h be fusible ops. ``` x = f(v, w) z = g(x, y) a, b = chunk(z) c = h(a, b) ``` becomes (before this PR): ``` x = f(v, w) x', y' = broadcast_tensors([x, y]) ax, bx = chunk(x') ay, by = chunk(y') a = g(ax, ay) b = g(bx, by) c = h(a, b) ``` The graph fuser then puts g, g, and h into one FusionGroup and is unable to move `x = f(v, w)` into the FusionGroup. This PR lets the graph fuser move `x = f(v, w)` into the FusionGroup. It does this by abstracting the broadcast_tensors + multiple chunk nodes into one intermediate `prim::BroadcastingChunk[chunks, dim]` node. A `BroadcastingChunk[chunks, dim](inputs)` node is equivalent to: - broadcasting all of inputs - chunk-ing each broadcasted input into `chunks` chunks along dim `dim`. Abstracting the broadcasting chunk behavior away, it is now a lot easier for the graph fuser to move (broadcast + chunk) past an operation. After this PR, the above graph becomes: ``` x = f(v, w) ax, bx, ay, by = BroadcastingChunk(x, y) a = g(ax, ay) b = g(bx, by) c = h(a, b) ``` Now, to move `x = f(v, w)` after the BroadcastingChunk, one just needs to add f's operands to the BroadcastingChunk: ``` ay, by, av, bv, aw, bw = BroadcastingChunk(y, v, w) ax = f(av, aw) bx = f(bv, bw) a = g(ax, ay) b = g(bx, by) c = h(a, b) ``` cc apaszke mruberry zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/14055 Differential Revision: D13159259 Pulled By: zou3519 fbshipit-source-id: 134e9e645c950384d9be6a06a883a10e17a73d7d	2018-11-26 12:31:49 -08:00
Michael Suo	2fa3c8327c	fix tensor advanced indexing with assignment (#14311 ) Summary: Fix a mishandling of `foo[a] = b` when `a` was a tensor. We were assigning to a copy of `foo`, not a view of it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14311 Differential Revision: D13196109 Pulled By: suo fbshipit-source-id: c929401fda7c4a27622d3fe2b11278b08a7f17f1	2018-11-26 12:10:48 -08:00
Adam Paszke	a60368982b	Batch more matrix multiplies (#13456 ) Summary: This handles the input pre-multiplication in RNNs, yielding pretty significant speedups in backward times. This pass depends on loop unrolling, so we'll batch only as many elements as the unrolling factor allows. cc mruberry ngimel zou3519 zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/13456 Differential Revision: D12920339 Pulled By: zou3519 fbshipit-source-id: 5bcd6d259c054a6dea02ae09a9fdf9f030856443	2018-11-26 09:20:35 -08:00
Wanchao Liang	7fc34a4122	Convert gumbel_softmax, lp pooling weak functions and modules (#14232 ) Summary: 1. Support `Optional[BroadcastingList1[int]]`-like type annotations to accept an int or a list[int] 2. Convert gumbel_softmax, lp pooling weak functions and modules Pull Request resolved: https://github.com/pytorch/pytorch/pull/14232 Differential Revision: D13164506 Pulled By: wanchaol fbshipit-source-id: 6c2a2b9a0613bfe907dbb5934122656ce2b05700	2018-11-21 23:44:24 -08:00
David Riazati	d9cdcc9a3b	Add list inequality operator (#14129 ) Summary: This PR adds `aten::neq` for list inequality comparisons and converts `nll_loss` to weak script Pull Request resolved: https://github.com/pytorch/pytorch/pull/14129 Differential Revision: D13123894 Pulled By: driazati fbshipit-source-id: 8c1edf7c163217ec00eb653f95d196db3998613f	2018-11-21 16:32:58 -08:00
Zachary DeVito	788d2e87bd	Address jittering issues in python_print (#14064 ) Summary: export - print a method with python_print import - import a method with import_method We want to ensure: export(g) == export(import(export(g))) That is, after exporting/importing once, the graph will stay exactly the same. This is less strict than g == import(export(g)) which would require us to maintain a lot more information about the structure of the IR and about the names of debug symbols. This PR addresses this with the following fixes: * print out double-precision numbers with high enough precision such that they always parse in the same way * when creating loop-carried dependencies, sort them by variable name, ensuring a consistent order * parse nan correctly * DCE: remove unused outputs of if statements, and loop-carried dependencies in loops that are dead both after the loop and inside the body of the loop. * Do not set uniqueName for variables whose names are _[0-9]+, these are probably rare in user code, and we need a way to communicate that we do not care about a variable name when re-parsing the graph. Otherwise temporary variable names will jitter around. * Expand the definition of a constant in printing code to None, and family. * Allow re-treeing to work as long as the only thing in its way is a constant node. These do not have side effects but are sometimes inserted in a different order when tracing compared to how we print them. * Print all constant nodes out first in the order in which they are used_val (or, if they are inlined, ensure they get assigned CONSTANT.cX number in a consistent order). Cleanup tuples (this is done in the compiler, but not in the tracer, leading to some tuple indexing jitter if not done). * use strtod_l, not std::stod which can throw exceptions Other: * Add REL_WITH_DEB_INFO to setup.py. It already existed for the cmake files. Threading it into setup.py allows us to turn on debug symbols with optimization everywhere. * enable round trip testing for all generated graphs. This only adds ~6 seconds to total build time but tests printing for every graph. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14064 Differential Revision: D13094637 Pulled By: zdevito fbshipit-source-id: 0a1c6912194d965f15d6b0c6cf838ccc551f161d	2018-11-21 06:38:29 -08:00
David Riazati	8f20d40bb7	Allow undefined tensors as constants (#14120 ) Summary: This PR inserts `prim::None` constants for undefined tensors. This comes up in the standard library if an `Optional[Tensor]` is statically determined to be `None`:
```python
@torch.jit.script
def fn(x=None):
    # type: (Optional[Tensor]) -> Tensor
    return torch.jit._unwrap_optional(x)

@torch.jit.script
def fn2():
    # type: () -> Tensor
    return fn()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14120 Differential Revision: D13124625 Pulled By: driazati fbshipit-source-id: 9eaa82e478c49c503f68ed89d8c770e8273ea569	2018-11-20 16:54:27 -08:00
Wanchao Liang	d6bfc53b9e	Export BatchNorm functional and module, add necessary JIT support (#14016 ) Summary: This PR did three things: 1. It exports the BatchNorm functional and module, and rewrites some of the components to stay aligned with the currently supported JIT features 2. In the process of export, adds necessary compiler support for in_place op aug assign 3. changes the test_jit behavior in add_module_test to utilize a single rng state during module initialization Pull Request resolved: https://github.com/pytorch/pytorch/pull/14016 Differential Revision: D13112064 Pulled By: wanchaol fbshipit-source-id: 31e3aee5fbb509673c781e7dbb6d8884cfa55d91	2018-11-20 14:15:06 -08:00
Thomas Viehmann	1256cbaa69	Relax limits for gradients in test_jit's checkGraph (#14094 ) Summary: - This should help TestJit.test_lstm_fusion_concat_cuda to be less flaky. (Checked on manual_seed 0..99) Fixes: #14026 - Revert the renaming of test_fused_abs that was introduced to game the order of tests to avoid the flakiness above. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14094 Differential Revision: D13100174 Pulled By: soumith fbshipit-source-id: 91bb63b07a960a81dddfc0bf25c67696c0f6c46d	2018-11-16 11:43:52 -08:00
David Riazati	0d29846d5e	Convert more weak functions (#14003 ) Summary: Same deal as #13707 Pull Request resolved: https://github.com/pytorch/pytorch/pull/14003 Differential Revision: D13076403 Pulled By: driazati fbshipit-source-id: eb3cb3b2c31caf1de591b613bdc4c9a6ed4e1767	2018-11-15 16:45:50 -08:00
Zachary DeVito	0573169e23	Import a method from a python_print string (#13959 ) Summary: * Add hooks to get a callback whenever a valid graph is produced in the compiler or through tracing. These hooks can be used to pretty_print and then reparse every graph our tests produce to check that the serialization function works correctly. Currently this is guarded by an environment variable since there are a few remaining failures. * Fix printing bugs: True and False rather than 1 and 0, print 0. for floating point zero * Change behavior of NoneType. It is now no longer a subtype of Optional but instead implicitly converts to it, returning a prim::Node with an Option[T] type for some specific T. This allows functions like `_unwrap_optional` to correctly match against a None while still deriving the right type. * Fix a bug where empty blocks did not correctly emit "pass" in printer. * Fix a bug where prim::Undefine sometimes cannot be printed as None because it is being used in a schema-less op. This should be fixable once Optional[T] always uses the same None object. * Other minor printing bugs Pull Request resolved: https://github.com/pytorch/pytorch/pull/13959 Reviewed By: jamesr66a Differential Revision: D13073519 Pulled By: zdevito fbshipit-source-id: 4167a6b614f2e87b4d21823275a26be5ba4fc3dd	2018-11-15 16:11:37 -08:00
Thomas Viehmann	c7e0db140e	use fabs instead of absf in fuser code for aten::abs (#13985 ) Summary: absf didn't work for CUDA Fixes: #13971 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13985 Differential Revision: D13084601 Pulled By: soumith fbshipit-source-id: 0027ee719ae2b6a2bfce9c26f21db9c5e6159686	2018-11-15 13:23:59 -08:00
Xiang Gao	143ba72264	Move cosine_similarity to ATen (#12199 ) Summary: I'm now traveling and don't have access to a good computer to compile test by myself. Will see the outcome of CI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12199 Differential Revision: D13062326 Pulled By: nairbv fbshipit-source-id: 85873525caa94906ccaf2c739eb4cd55a72a4ffd	2018-11-14 10:41:44 -08:00
Zachary DeVito	30676bdcd3	Finish up TODOs in python printer (#13879 ) Summary: * Correctly adds annotate when needed for lists * Parser/Emitter handles octal escapes so we do not fail for some strings. * more complete keyword list in pretty printer * floating point numbers are always printed with a decimal to ensure we never mistake them in parsing Pull Request resolved: https://github.com/pytorch/pytorch/pull/13879 Differential Revision: D13037860 Pulled By: zdevito fbshipit-source-id: f09ab174fc33402a429b21a5bfaf72e15c802cad	2018-11-13 16:39:46 -08:00
Elias Ellison	f649d8b3a9	add floordiv and bitwise ops Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13873 Reviewed By: driazati, wanchaol Differential Revision: D13033709 Pulled By: eellison fbshipit-source-id: df7edee0f790038fb2a806d20640ad25c70b50eb	2018-11-13 16:32:22 -08:00
David Riazati	5163a28917	Convert more weak functions (#13707 ) Summary: Convert some more functions to match up with features added. Some conversions were unsuccessful but the type line was left in for later. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13707 Differential Revision: D13030210 Pulled By: driazati fbshipit-source-id: 02d5712779b83b7f18d0d55539e336321335e0cc	2018-11-13 13:50:57 -08:00
David Riazati	53bc5fb043	Support nn.Sequential in script (#13889 ) Summary: This PR makes weak modules in `nn.Sequential` get properly compiled when used Pull Request resolved: https://github.com/pytorch/pytorch/pull/13889 Differential Revision: D13039559 Pulled By: driazati fbshipit-source-id: d3266305f0e206b2a19b63230ac2ab8f02faa603	2018-11-13 13:48:58 -08:00
Elias Ellison	686e83223f	add ops between float & int, and change list equality output to be a boolean Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13793 Reviewed By: wanchaol Differential Revision: D13010872 Pulled By: eellison fbshipit-source-id: 2c8248f30b51eab1a87290711f99b7ceb6df2009	2018-11-12 14:39:47 -08:00
David Riazati	0c375571f5	Support OptionalType export and type match (#13647 ) Summary: * Adds `OptionalType` support for import/export * Optionals get exported along with their contained type, i.e. 'Optional[int]' * Allows concrete types and `None` to be passed to an op that takes an optional * Converts `softmax` Pull Request resolved: https://github.com/pytorch/pytorch/pull/13647 Differential Revision: D12954672 Pulled By: driazati fbshipit-source-id: 159e9bfb7f3e398bec3912d414c393098cc7455a	2018-11-12 12:15:25 -08:00
Zachary DeVito	aef9e76283	Get pretty printer ready for use as a serialization format (#13616 ) Summary: Get pretty printer ready for use as a serialization format This PR adds a bunch of functionality to the pretty printer (now called python_printer to reflect the fact that it will be used to output valid python source). The idea is to get the printer ready for use as serialization format. This PR does not have tests beyond what the pretty printer already had. PRs stacked on this one will do round-trip export/import to test this functionality more robustly. Notes: * PythonPrinter is an evolution of the original pretty printer. However, much of it has changed so it is best just to read it as a new implementation. Trying to correlate it to the original implementation is probably not much help. * The printer tries to get reasonably close to how the original function was likely written, such as writing expressions rather than making intermediates when possible. We may decide to turn this off for the actual serialization, but it is useful for pretty printing. * tensor field access was changed so that prim::device and family have schema * fixed a bug in the compiler where setUniqueName gets called even when a value already has one. this sometimes assigned really poor names to graph inputs * Graph::insert gains an optional range argument to make range-preserving inserts easier. * prim:: ops that can have schema now have schema. This is because when we parse them back in, we will need the schema to correctly set their output types. * there is code in the python printer to complain if you try to add a prim op and do not update the printer. * BuiltinModule is generalized to take an operator namespace and a version number for work in future commits. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13616 Reviewed By: goldsborough Differential Revision: D13008252 Pulled By: zdevito fbshipit-source-id: 32b33bc6410d6ca1c6f02bd6e050f8d5eea32083	2018-11-12 10:21:30 -08:00
Wanchao Liang	79ceecec8e	Optional undefined tensor support (#13650 ) Summary: This PR is a part of task to unblock standard library export. * we treat None differently from Tensor and other types, when passing None as Tensor, it's an undefined tensor rather than the None IValue. * Refine the type system so that we have correct tensor types hierarchy (Dynamic/Tensor/CompleteTensor), Dynamic should be at the top of the inheritance hierarchy. * It also tries to export bilinear as an example of undefined tensor(None) input. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13650 Differential Revision: D12967026 Pulled By: wanchaol fbshipit-source-id: 6aedccc7ce2a12fadd13d9e620c03e1260103a5a	2018-11-09 11:29:57 -08:00
Thomas Viehmann	9ffabcfcaa	Use nested variant of getValueTrace to allow more flexible tracing script modules (#13597 ) Summary: When tracing scripted functions, we used to only allow Tensor arguments. This enables tracing script modules with List[Tensor] or Tuple[Tensor, Tensor] arguments (passing tuples). Fixes: #13566 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13597 Differential Revision: D12990464 Pulled By: soumith fbshipit-source-id: fdce3afcb1e09f3c26d6ce834c01bf18d261f47c	2018-11-09 06:24:02 -08:00
James Sun	dca3c2c60f	Save and execute futures in a task queue (#13212 ) Summary: Upon calling wait(), save the forked thread and the current thread to a task queue. An idle thread (currently there is only one) should pick a ready task and run until there is nothing left in the task queue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13212 Differential Revision: D12884522 Pulled By: highker fbshipit-source-id: b3942a0ee63c148e05f5f41bdc73007fa3c3368e	2018-11-09 01:46:35 -08:00
Zachary DeVito	44fb23a2f5	Add ability to annotate jit types inside function (#13752 ) Summary: This adds torch.jit.annotate for annotating the type of an intermediate. This is Py2/3 compatible, e.g.:
```
import torch
from torch.jit import annotate
from typing import List

@torch.jit.script
def foo():
    a = annotate(List[int], [])
```
This is needed to output valid python programs from our IR. It removes the need for the empty list constructors. A future patch can add support to the C++ parser and Python 3, via desugaring:
```
a : int = b
a = annotate(int, b)
```
But this functionality is not required for serialization so is not added in this patch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13752 Differential Revision: D12989885 Pulled By: zdevito fbshipit-source-id: 161573a7352094543dc0d33a892f2a3b9103d847	2018-11-08 20:25:00 -08:00
James Reed	85bde3801b	Tracer now records Python variable names (#13441 ) Summary: This is probably slow but it should make the traces more understandable and make debugging easier. Any suggestions for how to make it faster (i.e. make it so we don't have to traverse all of locals() and globals()) would be appreciated Pull Request resolved: https://github.com/pytorch/pytorch/pull/13441 Differential Revision: D12879763 Pulled By: jamesr66a fbshipit-source-id: b84133dc2ef9ca6cfbfaf2e3f9106784cc42951e	2018-11-08 13:08:42 -08:00
David Riazati	556ff8e7b7	Add builtins for `size()` and list with defaults (#13639 ) Summary: * `aten::size()` to match `torch.Tensor.size` * `aten::list_with_default` for semantics of `torch.nn.modules.utils.list_with_default` * converts `adaptive_avg_pool2d` and `adaptive_avg_pool3d` Pull Request resolved: https://github.com/pytorch/pytorch/pull/13639 Differential Revision: D12954670 Pulled By: driazati fbshipit-source-id: 68c30af0efc02c60af5fb8c9715b2435cc01a0d9	2018-11-08 11:26:35 -08:00
David Riazati	4472ad3b2f	Move functional _Reduction to its own module (#13401 ) Summary: To support `_Reduction` in the jit this PR moves it out to a new file so that it goes through the paths for python modules in the script compiler and converts `F.ctc_loss` to weak script Depends on #13484 for saving rng state Pull Request resolved: https://github.com/pytorch/pytorch/pull/13401 Differential Revision: D12868501 Pulled By: driazati fbshipit-source-id: 23cec0fb135744578c73e31ac825e238db495d27	2018-11-08 01:04:10 -08:00
Michael Suo	21991c05a9	Support assignment to subscripted lhs expr (#13486 ) Summary: Support things like `foo[0] = bar` in script. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13486 Differential Revision: D12964550 Pulled By: suo fbshipit-source-id: 3dda8ffd683d1b045787c65bfa0c7d43b0455658	2018-11-07 23:07:57 -08:00
Zachary DeVito	c8bb665b5d	Fix a bug in tuple assignment (#13656 ) Summary: Previously, we did not distinguish between `a = b` (simple assignment), and `a, = b` (tuple destructuring of a singleton tuple). The second case would fail in the string frontend, and would not unpack in the python frontend. This patch fixes both issues and also cleans up the error reporting for unexpected expressions on the LHS. Will likely conflict with #13486 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13656 Differential Revision: D12964566 Pulled By: zdevito fbshipit-source-id: 992b19e5068aef59a78cd23cb0e59a9eeb7755d1	2018-11-07 16:44:22 -08:00
Peter Goldsborough	9403eddce4	Fix tracing bug for custom ops (#13654 ) Summary: Due to a logic bug, tracing is broken for custom ops. Unfortunately, there also weren't any tests for tracing custom ops. The fix is a single line change of moving `pop(stack, std::get<Is>(arguments)...);` before `node = getTracedNode<Is...>(schema, arguments);`. Other changes are added tests and improved commenting/formatting. Fixes https://github.com/pytorch/pytorch/issues/13564 CC fmassa zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/13654 Differential Revision: D12952887 Pulled By: goldsborough fbshipit-source-id: 87d256576f787c58e8d8f5c13a0fecd0ec62a602	2018-11-07 09:22:44 -08:00
Gregory Chanan	7341ab0a33	Fix range of target examples and JIT test case for CTC loss. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13644 Differential Revision: D12949733 Pulled By: gchanan fbshipit-source-id: 1c4cacbb6a50d5002165bdd0a7881883db5c8249	2018-11-07 07:04:31 -08:00
Alex Şuhan	a132a7d9ce	Add autodiff support for a few additional operators (#13288 ) Summary: Added aten::{avg_pool2d, log_softmax, max_pool2d_with_indices, threshold}, enabled aten::{expand, view}. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13288 Differential Revision: D12954929 Pulled By: soumith fbshipit-source-id: 6fba58af82cafbc7446705d8c8145cdeaf4954ca	2018-11-06 23:24:12 -08:00
David Riazati	dbc467545f	Update weak script modules to match fns (#13631 ) Summary: Add weak modules for those that use weak script functions Pull Request resolved: https://github.com/pytorch/pytorch/pull/13631 Differential Revision: D12945328 Pulled By: driazati fbshipit-source-id: 6cb235763bf5ab35c7b32e0f734f08d22418594f	2018-11-06 21:22:52 -08:00
Elias Ellison	6cf450744f	propagate python op error msg (#13624 ) Summary: Correctly propagate the error msg from a python op to the JIT interpreter. In the interpreter we wrap the exception and re-throw it as a Runtime Exception. Potentially in a future diff we can throw the same type of python exception as was originally thrown. Fix for https://github.com/pytorch/pytorch/issues/13560 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13624 Differential Revision: D12948756 Pulled By: eellison fbshipit-source-id: 94cdf4c376143c5e40dcb9716aefb3c1e2d957db	2018-11-06 16:28:39 -08:00
Elias Ellison	137150be88	add unwrap optional operator (#13599 ) Summary: Add a builtin to refine the type of Optional[T] -> T. This is a short-term solution to unblock porting of the standard library. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13599 Reviewed By: driazati, wanchaol Differential Revision: D12943193 Pulled By: eellison fbshipit-source-id: 31c893a78d813313bbbc1d8212b5c04e403cfb4d	2018-11-06 11:54:56 -08:00
Soumith Chintala	a7ee632dff	Various Test and build fixes (#13556 ) Summary: - fixes weights-contiguous requirement for THCUNN Convolutions - Add tests that conv backward pass works for non-contiguous weights - fix RNN tests / error messages to be consistent and pass - relax weight grad precision for fp16 for a particular test - fix regression of CMAKE_PREFIX_PATH not passing through - add missing skipIfNoLapack annotations where needed Differential Revision: D12918456 Pulled By: soumith fbshipit-source-id: 8642d36bffcc6f2957800d6afa1e10bef2a91d05	2018-11-06 07:13:47 -08:00
David Riazati	fc6a9a19ea	Add torch._C._nn built-in, more weak fns (#13322 ) Summary: This PR adds functions defined in `torch._C._nn` as builtin functions (including inplace variants). This allows for the conversion of more functions to weak script NB: many `torch.nn.functional` functions will have to be slightly rewritten to avoid early returns (as with `threshold` in this PR) Converts these functions to weak script: * `threshold` * `relu` * `hardtanh` * `relu6` * `elu` * `selu` * `celu` * `leaky_relu` * `rrelu` * `tanh` * `sigmoid` Pull Request resolved: https://github.com/pytorch/pytorch/pull/13322 Differential Revision: D12852203 Pulled By: driazati fbshipit-source-id: 220670df32cb1ff39d120bdc04aa1bd41209c809	2018-11-05 21:02:18 -08:00
Wanchao Liang	af4a228426	Fix erase_number_type pass, negative indices in c2 and some onnx symbolics (#12888 ) Summary: The PR did two things: 1. fix the bug in erase_number_type on node inputs 2. handle negative indices for dim-reduce in caffe2 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12888 Reviewed By: houseroad Differential Revision: D12833486 Pulled By: wanchaol fbshipit-source-id: c3ceb400d91f0173b73ad95e392b010c3c14db7d	2018-11-05 19:13:49 -08:00
David Riazati	1969898647	Convert functional dropouts to weak script (#13484 ) Summary: To convert `nn.functional.dropout` * `_VF` had to be exposed as a Python module so this PR adds a module class to forward to `torch._C._VariableFunctions` * rng state between calls in the tests needed to be made consistent Pull Request resolved: https://github.com/pytorch/pytorch/pull/13484 Differential Revision: D12929622 Pulled By: driazati fbshipit-source-id: 78b455db9c8856b94d2dda573fb7dc74d5784f56	2018-11-05 17:13:07 -08:00
David Riazati	23e3a12d5e	Add `pass` support to script (#13535 ) Summary: This PR adds basic support for `pass` statements Pull Request resolved: https://github.com/pytorch/pytorch/pull/13535 Differential Revision: D12929529 Pulled By: driazati fbshipit-source-id: 70c7c52630d46e76366c4caa875d6c5419a1e03f	2018-11-05 17:13:06 -08:00
David Riazati	df67d4180a	Validate schema with no returns (#13525 ) Summary: If there is no return type then the returns of the schema are not checked against the returns in the graph, so this PR adds an error if that case is detected. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13525 Differential Revision: D12929524 Pulled By: driazati fbshipit-source-id: da562e979482393098830bbded26729a2499152a	2018-11-05 16:51:55 -08:00
Adam Paszke	e988dc621b	Stop depending on static analysis of tensor types in graph fuser (#13387 ) Summary: Built on top of #13108, so please review only the last commit. This makes the graph fuser ignore input types (device/scalar type) when considering graphs for fusion, making it much more robust to shape-prop failures. Those properties are now checked at run time, as part of the kernel validation. This should enable graph fusions in `jit_premul` and `jit_multilayer` timelines in our benchmarks. One regression is that I've disabled fusions of comparison ops (and `type_as`). That's because there's really no good way to ensure that those are really valid, and are a source of bugs (I filed #13384). cc ngimel mruberry zdevito zou3519 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13387 Differential Revision: D12888104 Pulled By: zou3519 fbshipit-source-id: c233ea599679c34ac70fb4d8b8497c60aad9e480	2018-11-05 06:32:08 -08:00
Zachary DeVito	86192301b3	Fix a few bugs in format and vararg handling (#13492 ) Summary: There are a couple subtle bugs in the way varargs is implemented: 1. it fails if you pass 0 arguments, because it doesn't handle the case when there are 0 varargs, and because Operator::matches was not updated. 2. it breaks all the name-based lookups on nodes. For instance node->get<int>(attr::value) will return a single entry of the varargs if you look it up by name. Furthermore it complicates some assumptions about the positional arguments (e.g. they used to be 1-to-1 with node inputs but with varargs they are not). Because varargs are only being used for format, this diff instead just allows format to take any value as input, regardless of type. It just provides a way to set is_vararg from the schema but does not restrict the type of the varargs things. This is in line with the pre-existing behavior for is_vararg so it doesn't require Operator::matches changes. This also keeps format in line with how print works, and is closer to the python implementation of format. Note that the implementation of format already worked with arbitrary IValues so restricting to strings was just making it more conservative than needed. This also fixes the implementation of format to work when there are 0 arguments or text before and after a format string, where it would not print things. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13492 Differential Revision: D12896989 Pulled By: zdevito fbshipit-source-id: 21425bac8edc81709030a7408180494edea0a54b	2018-11-02 00:07:00 -07:00
Michael Suo	5fbaf0eaf8	add augmented assignment ops (#13364 ) Summary: This PR changes the compiler to correctly emit in-place operators for augmented assignments (`+=` and friends). - To better match the Python AST structure, add an `AugAssign` tree view and make `Assign` apply only to `=` assignments. - Emit those `AugAssign` exprs in the compiler, dispatching to in-place aten ops for tensors and lowering to simple assignments for scalar types. - In order to preserve (suspect) ONNX export semantics, add a pass to lower the in-place operators to out-of-place operators. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13364 Differential Revision: D12899734 Pulled By: suo fbshipit-source-id: bec83be0062cb0235eb129aed78d6110a9e2c146	2018-11-02 00:01:07 -07:00
Wanchao Liang	0fd176fea4	Add operator is, not, is not to script (#13336 ) Summary: As titled, this PR is a part of tasks to unblock exporting the standard library. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13336 Differential Revision: D12888912 Pulled By: wanchaol fbshipit-source-id: 6213a17a75a593ae45999994fd9562f29b7d42df	2018-11-01 16:55:28 -07:00
Elias Ellison	421f3f3e52	add npair builtins (#13473 ) Summary: Add npair builtins to unblock standard library. As with broadcasting list, the only occurrences are with int/floats. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13473 Differential Revision: D12890844 Pulled By: eellison fbshipit-source-id: c360bb581d0f967cb51b858b6f964c300992d62a	2018-11-01 15:42:52 -07:00
Elias Ellison	edc6d721e0	fix flake (#13463 ) Summary: fix flake on test/test_jit.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/13463 Differential Revision: D12886532 Pulled By: eellison fbshipit-source-id: 1cd2a736663d5037bb4bdcd1d8ca1f201cf6a1cf	2018-11-01 13:39:39 -07:00
David Riazati	99ce499bfe	Revert D12852205: [pytorch][PR] [jit] Add str() builtin Differential Revision: D12852205 Original commit changeset: 3e0e9218afdf fbshipit-source-id: 114b4873504109394fe9d489200d39764ecc638e	2018-11-01 12:48:48 -07:00
David Riazati	8f2bc1bc56	Add str() builtin (#13278 ) Summary: Allow casting to string from any IValue type Pull Request resolved: https://github.com/pytorch/pytorch/pull/13278 Differential Revision: D12852205 Pulled By: driazati fbshipit-source-id: 3e0e9218afdf27569da3ebf155f25e77e9f12984	2018-11-01 12:01:50 -07:00
Elias Ellison	70db53661b	expose fixed length list argument (#13142 ) Summary: Arguments have an optional fixed length list field which allows either a list or a single element that will be broadcast to a fixed length. This PR exposes that as a denotable argument, mostly to cover the many instances in which this is used in the standard library. It appears in the standard library with ints & floats. Since this is not really a pattern we want to promote moving forward, I did not expose this for booleans or tensors. We could consider making the optional static length part of the list type, instead of the argument, which would make some of this code much nicer. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13142 Differential Revision: D12876047 Pulled By: eellison fbshipit-source-id: e7359d2a878b4627fc2b9ebc090f9849ee524693	2018-11-01 10:34:52 -07:00
Elias Ellison	a5b627a0bf	add assert statements (#13408 ) Summary: Adding assert statements to unblock standard library. The same limitations that apply to the existing implementation of Exceptions apply to this as well (No control-flow logic, & we ignore the specific Exception thrown). Pull Request resolved: https://github.com/pytorch/pytorch/pull/13408 Reviewed By: driazati Differential Revision: D12876451 Pulled By: eellison fbshipit-source-id: 767ba5a50ba7c5dd6a857ed4845ac076a81cf305	2018-11-01 10:01:07 -07:00
David Riazati	f9c0a08eed	Fix len() for tensors (#13398 ) Summary: Fixes #13376, `len(tensor)` was converting tensor to a 1 element list and returning 1 every time. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13398 Differential Revision: D12867630 Pulled By: driazati fbshipit-source-id: 28f3580a072d763df0980b3149c49d1894842ec9	2018-10-31 13:13:21 -07:00
David Riazati	404f8660e7	Add string.format() (#13157 ) Summary: This PR adds `aten::format` as a builtin op for strings with the basic formatting semantics of Python. It also adds varargs to the schema parser (with the limitation that the varargs item is the last argument, i.e. `(*args, **kwargs)` is not supported) and to the compiler Pull Request resolved: https://github.com/pytorch/pytorch/pull/13157 Differential Revision: D12832537 Pulled By: driazati fbshipit-source-id: 17c1a5615bb286c648fc9e38f2ebe501b064c732	2018-10-31 12:50:56 -07:00
David Riazati	bc74ec80d0	Add support for torch.backends.cudnn.enabled (#13057 ) Summary: This is used commonly in `nn` functions. This PR adds it as a weak module (and also alters the conversion of weak modules to strong modules to accept ordinary `object`s) Pull Request resolved: https://github.com/pytorch/pytorch/pull/13057 Differential Revision: D10846618 Pulled By: driazati fbshipit-source-id: 028b9f852d40e2e53ee85b93282c98cef8cd336b	2018-10-31 09:31:09 -07:00
Elias Ellison	59f8e8ada7	First step at adding exceptions (#12789 ) Summary: This is a first step towards adding exceptions. We need minimal support in order to begin converting the torch library to weak script mode (which is the main goal here). Some limitations (that are documented in the tests & compiler): 1. Cannot assign exceptions to variables 2. Any name after raise is being treated as a valid Exception 3. No control flow analysis yet. Below, a will be undefined:
    if True:
        a = 1
    else:
        raise Exception("Hi")
    return a
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12789 Differential Revision: D12848936 Pulled By: eellison fbshipit-source-id: 1f60ceef2381040486123ec797e97d65b074862d	2018-10-30 20:25:50 -07:00
James Reed	7d9ab140bf	Fix aten::to symbolic + add expand_as (#13325 ) Summary: https://github.com/pytorch/pytorch/pull/13146 broke some cases of ONNX export, this fixes them Pull Request resolved: https://github.com/pytorch/pytorch/pull/13325 Differential Revision: D12844294 Pulled By: jamesr66a fbshipit-source-id: f98dd0685820b2a1e5fcd49733cfa5c19c48a4e7	2018-10-30 17:28:15 -07:00
David Riazati	ac64724ed9	Add support for tuple constants (#13086 ) Summary: Depends on #13072 Adds support for tuples as variables instead of just as literals. Before, tuples would give the error `python value of type 'tuple' cannot be used as a value`. This PR adds a flag on `SugaredValue` to determine if a value is a tuple or not. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13086 Differential Revision: D10846632 Pulled By: driazati fbshipit-source-id: 7b5d6ae9426ca3dd476fee3f929357d7b180faa7	2018-10-30 09:01:17 -07:00
Richard Zou	8c2d0c831f	Speed up tensor.storage_offset (#13267 ) Summary: This PR special cases tensor.storage_offset to avoid dispatches in the common case. tensor.storage_offset is important for torch.as_strided performance, because as_strided(sizes, strides) shares an implementation with as_strided(sizes, strides, storage_offset) and it might not be the best if there were two separate implementations (including backward implementations). This PR reduces times on a tensor.storage_offset microbenchmark from 22ns to 2ns (these numbers are pretty stable). For a torch.as_strided benchmark, this PR reduces numbers from 1042 to 928ns, a 100ns improvement, but this number is noisy and goes up and down. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13267 Reviewed By: ezyang Differential Revision: D12829828 Pulled By: zou3519 fbshipit-source-id: df907731e2398ce2baf1c8b1860a561ccc456f78	2018-10-30 07:36:21 -07:00
mruberry	955a01562d	Removes debug spew in test_jit.py (#13280 ) Summary: Looks like a print() snuck in by accident with a recent PR and it's printing a lot of spew when the tests are run. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13280 Differential Revision: D12833449 Pulled By: michaelsuo fbshipit-source-id: 5b50fd4b03bb73e5ca44cabdc99609c10017ff55	2018-10-29 18:25:30 -07:00
James Reed	db0b5c7ab7	ArgumentStash for int64_t arguments (#12939 ) Summary: Closes https://github.com/pytorch/pytorch/issues/12906. https://github.com/pytorch/pytorch/issues/12580 is still open because the schema is marked as `traceable=false` in the arg parser constructor, I think. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12939 Differential Revision: D10492031 Pulled By: jamesr66a fbshipit-source-id: ca5376de3997b5fb62b493e2e6a9bb0d6c3b9687	2018-10-29 13:55:24 -07:00
Elias Ellison	9e6a695116	Add string equality test, string concat (#12992 ) Summary: Adding string equality comparison, and concat. Both are used in the standard library Pull Request resolved: https://github.com/pytorch/pytorch/pull/12992 Differential Revision: D10513681 Pulled By: eellison fbshipit-source-id: 1f845ef50be7850fdd3366951b20dc2a805c21fd	2018-10-29 10:13:21 -07:00
James Sun	4d62eef505	Add Future to IValue (#12976 ) Summary: Future now is an IValue. prim::Wait now is replaced by aten::wait This PR is built on top of #12925 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12976 Differential Revision: D10861483 Pulled By: highker fbshipit-source-id: 9e17926a625bc502fb12335ef9ce819f25776be7	2018-10-27 10:00:35 -07:00
Zachary DeVito	dae7616078	Shard all of tests based on how many tests exist. (#13160 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13160 Reduces pytorch_core build from 2 hours to 30 minutes Reviewed By: soumith, dzhulgakov Differential Revision: D10524261 fbshipit-source-id: 97270ac73404b5ea4c264cd0e9d8d4b1be79b0e9	2018-10-26 18:20:34 -07:00
Wanchao Liang	7ca995c815	Add optional default type annotation to support JIT None default value (#13161 ) Summary: As titled, this PR is a part of tasks to unblock exporting the standard library Pull Request resolved: https://github.com/pytorch/pytorch/pull/13161 Differential Revision: D10866927 Pulled By: wanchaol fbshipit-source-id: 50038dbe6840b097b98cbed9d46a189a64e82302	2018-10-26 11:38:50 -07:00
Zachary DeVito	ce0d3e9b35	Bind inplace and _out variants into JIT (#13093 ) Summary: This commit is a minimal initial pass at adding inplace and _out variants to the JIT. It changes gen_jit_dispatch.py to add bindings for these operators, and it also supplements the FunctionSchema with alias information for these operators and for viewing operators. Tests are very minimal and will need to be improved in future commits. Notes: * Custom operator tests needed to be changed since _out variants add overloads, which the custom operator pipeline does not handle when called from python. This commit registers special test ops in the _test namespace for this purpose. * Extends the schema parser to parse alias annotations more robustly. * Extends FunctionSchema with `writes()` a set of alias set names that the op will write to, and `annotatedType()` which will return AnnotatedType objects which contain the alias_set information that was parsed from the schema. * Disables all optimizations in graph executor when a mutable operator is found. This is something that will be improved in the future but is necessary for correctness now. * Adds annotate_ops to gen_jit_dispatch which adds aliasing information to all of the aten ops. * Adds AnnotatedType to the type hierarchy which is used to mark List and Tensor types with their alias_set. These types only appear in schema when you call annotatedType and are erased from types in normal use. * Extends jit::Type with .containedTypes() and .withContained(new_types). The first returns all types contained within the type (e.g. T for T[], or {T,L} for a tuple (T, L)). The second constructs a new version of the same type, replacing the contained types with new_types. This simplifies a lot of logic for recursively cleaning up types. * Refactor List[T] into a common part that is shared with Annotated[T] and can be shared with Optional[T] and Future[T] when they are merged. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13093 Differential Revision: D10848176 Pulled By: zdevito fbshipit-source-id: d057f23eeb99cde8881129b42d3f151ed5e7655d	2018-10-26 10:37:20 -07:00
Richard Zou	efab8e8fdf	Speed up tensor.get_device(), is_cuda(), is_sparse() by avoiding dispatches (#12841 ) Summary: `tensor.get_device()` went through two dispatches: once to the native function `get_device()`, and another when `get_device` calls `_th_get_device()`. This PR avoids the dispatch by directly implementing the `get_device` function as a method on Tensor. Future Work: - Investigate caching Device on TensorImpl. This will probably bring the tensor.get_device down to 2ns, but I'm not sure it's worth it. before: ``` ------------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------------ BM_TensorTypeId 0 ns 0 ns 1000000000 BM_TensorType 8 ns 8 ns 89407911 BM_TensorIsCuda 24 ns 24 ns 29313017 BM_TensorIsSparse 27 ns 27 ns 26083160 BM_TensorTypeIsCuda 11 ns 11 ns 65128120 BM_TensorNumel 11 ns 11 ns 68314492 BM_TensorGetDevice 71 ns 71 ns 9633125 BM_DeviceGuardCtor 173 ns 173 ns 4067173 BM_DeviceGuard 232 ns 232 ns 3009690 ``` after: ``` ------------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------------ BM_TensorTypeId 0 ns 0 ns 1000000000 BM_TensorType 10 ns 10 ns 69803872 BM_TensorIsCuda 2 ns 2 ns 321626683 BM_TensorIsSparse 6 ns 6 ns 177045382 BM_TensorNumel 12 ns 12 ns 58770533 BM_TensorGetDevice 4 ns 4 ns 128113396 BM_DeviceGuardCtor 52 ns 52 ns 14997278 BM_DeviceGuard 158 ns 158 ns 5767248 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/12841 Differential Revision: D10489353 Pulled By: zou3519 fbshipit-source-id: a596bc77352f21d5d35433c6de02c2f65aab5f9e	2018-10-25 19:57:52 -07:00
Wanchao Liang	4e1c64caee	Add c10::optional to type syntax (#12582 ) Summary: This PR adds optional type to ATen native, autograd, JIT schema and Python Arg parser, closes #9513. It allows us to use optional default values (including None) for function signature and implementations like clamp, etc., and also let us remove the python_default_init hack. Follow up: remove python_default_init completely. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12582 Differential Revision: D10417423 Pulled By: wanchaol fbshipit-source-id: 1c80f0727bb528188b47c595629e2996be269b89	2018-10-25 16:08:29 -07:00
David Riazati	14ea4bf0d1	Make 7 nn modules into weak modules (#12966 ) Summary: Depends on #12682 ([stacked diff](https://github.com/driazati/pytorch/compare/weak_mod...driazati:mod_conv1)) * Adds tests for weak module conversion that creates a `ScriptModule` that uses the weak module and checks its graph * Adds `torch._jit_internal.weak_module` tags to modules that already work * `Sigmoid` * `Tanh` * `Hardshrink` * `PReLU` * `Softsign` * `Tanhshrink` * `PairwiseDistance` Pull Request resolved: https://github.com/pytorch/pytorch/pull/12966 Differential Revision: D10559557 Pulled By: driazati fbshipit-source-id: dc4bea3aa744b3c44d4fa7dceefd97e951f824d0	2018-10-25 13:59:34 -07:00
David Riazati	eac3e7ab7c	improve constants error message (#13072 ) Summary: Adds the attribute name to the error message and fixes the corresponding test to actually run Pull Request resolved: https://github.com/pytorch/pytorch/pull/13072 Differential Revision: D10846622 Pulled By: driazati fbshipit-source-id: a7eee6320c28140c4937ede3d4e4685cfce08d84	2018-10-25 10:45:42 -07:00
David Riazati	6727133f3d	Support warnings.warn (#12964 ) Summary: `warnings.warn` is used commonly throughout `nn.functional`, so this adds support for it by forwarding its arguments to `print` Pull Request resolved: https://github.com/pytorch/pytorch/pull/12964 Differential Revision: D10559427 Pulled By: driazati fbshipit-source-id: 5b591f6f446c906418f9fc7730c17e301f263d9b	2018-10-24 16:48:02 -07:00
Soumith Chintala	cf235e0894	fix lint after new flake8 release added new style constraints (#13047 ) Summary: fix lint after new flake8 release added new style constraints Pull Request resolved: https://github.com/pytorch/pytorch/pull/13047 Differential Revision: D10527804 Pulled By: soumith fbshipit-source-id: 6f4d02662570b6339f69117b61037c8394b0bbd8	2018-10-24 09:03:38 -07:00
Elias Ellison	f9b7ce9c99	Add tuple indexing support for constant integers (#11492 ) Summary: Add support indexing tuples with constant integers by creating a new prim::TupleIndex operator. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11492 Differential Revision: D9811996 Pulled By: eellison fbshipit-source-id: a458c2522b3c81476252d920e27a8d6c7b9a036b	2018-10-23 17:52:03 -07:00
David Riazati	af78d4cd49	Add weak script modules (#12682 ) Summary: Adds support for weak script modules that get compiled to `ScriptModule`s once added as a submodule of a `ScriptModule`: ```python weak_module class Test(torch.nn.Module): ... weak_script_method def forward(self, x): ... ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/12682 Differential Revision: D10458626 Pulled By: driazati fbshipit-source-id: 10ae23cb83cdafc4646cee58f399e14b2e60acd4	2018-10-23 09:06:02 -07:00
Edward Yang	bc1d96ca98	Add support for inline expect tests. (#12825 ) Summary: expecttest and test_expecttest are the implementation and tests for this functionality. I wired it up to the --accept flag, but there's also a new environment variable EXPECTTEST_ACCEPT which may be more convenient to trigger. Haven't tested if this works in fbcode. There may be a few expect tests which will benefit from inline treatment, but I just did one to show it works. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/12825 Reviewed By: teng-li Differential Revision: D10448630 Pulled By: ezyang fbshipit-source-id: 3d339f82e2d00891309620a60e13039fa1ed8b46	2018-10-22 19:29:04 -07:00
David Riazati	1e8064dec0	Convert 2 nn.functional functions to weak script (#12723 ) Summary: * Moves `weak_script` annotation to `torch/_jit_internal.py` to resolve dependency issue between `torch.jit` and `torch.nn` * Add `torch._jit.weak_script` to `tanhshrink` and `softsign`, their tests now pass instead of giving an `unknown builtin op` error * Blacklist converted `torch.nn.functional` functions from appearing in the builtin op list if they don't actually have corresponding `aten` ops Pull Request resolved: https://github.com/pytorch/pytorch/pull/12723 Differential Revision: D10452986 Pulled By: driazati fbshipit-source-id: c7842bc2d3ba0aaf7ca6e1e228523dbed3d63c36	2018-10-21 14:09:55 -07:00
Elias Ellison	f3e1fe5ca5	add string as supported input / output of script functions (#12731 ) Summary: Add strings to our set of built-in types for annotations. This is used in the functional library. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12731 Differential Revision: D10453153 Pulled By: eellison fbshipit-source-id: f54177c0c529f2e09f7ff380ddb476c3545ba5b0	2018-10-19 11:17:19 -07:00
Zachary DeVito	87d3d209a6	Enable JIT tests in fbcode (#12777 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12777 Enables JIT tests in FBCode. Changes pybind11 code to avoid mixing py::args with positionally matched arguments because old versions of PyBind11 leak memory in this case. Reviewed By: jamesr66a Differential Revision: D10419708 fbshipit-source-id: 74bc466001b5d363132d1af32e96841b38601827	2018-10-18 18:18:37 -07:00
James Sun	f4944f0f8a	Rename test/common.py to test/common_utils.py (#12794 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12794 common.py is used in base_module for almost all tests in test/. The name of this file is so common that it can easily conflict with other dependencies if they happen to have another common.py in the base module. Rename the file to avoid conflict. Reviewed By: orionr Differential Revision: D10438204 fbshipit-source-id: 6a996c14980722330be0a9fd3a54c20af4b3d380	2018-10-17 23:04:29 -07:00
Sepehr Sameni	cffeb03a2d	fix forward and backward for norm with negative infinity norm (#12722 ) Summary: I found a bug in norm() and fixed it (and added tests to make sure it's fixed) here is how to reproduce it: ```python import torch x = torch.FloatTensor([[10, 12, 13], [4, 0, 12]]) print(torch.norm(x, -40, dim=0, keepdim=True)) #output is tensor([[ 4.0000, 0.0000, 11.9853]]) print(torch.norm(x, float('-inf'), dim=0, keepdim=True)) #output is tensor([[1., 1., 1.]]) which is wrong! from numpy.linalg import norm as np_norm x = x.numpy() print(np_norm(x, ord=-40, axis=0)) #output is array([[4., 0., 11.985261]]) print(np_norm(x, ord=float('-inf'), axis=0)) #output is array([[4., 0., 12.0]]) ``` it's related to [#6817](https://github.com/pytorch/pytorch/issues/6817) and [#6969](https://github.com/pytorch/pytorch/pull/6969) Pull Request resolved: https://github.com/pytorch/pytorch/pull/12722 Differential Revision: D10427687 Pulled By: soumith fbshipit-source-id: 936a7491d1e2625410513ee9c39f8c910e8e6803	2018-10-17 21:07:43 -07:00
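The expected values in the reproduction above follow from the limiting cases of the p-norm: as p goes to +inf the norm approaches the largest absolute value, and as p goes to -inf it approaches the smallest. A minimal pure-Python sketch of that definition is given here; `vector_norm` is a hypothetical helper for illustration only, not PyTorch's or NumPy's implementation, and it works in double precision rather than float32.

```python
import math


def vector_norm(values, p):
    """p-norm of a sequence of numbers, treating p = +/-inf as limiting cases.

    Hypothetical helper for illustration; not the PyTorch implementation.
    """
    magnitudes = [abs(v) for v in values]
    if p == math.inf:
        return max(magnitudes)
    if p == -math.inf:
        return min(magnitudes)
    if p < 0 and any(m == 0 for m in magnitudes):
        # A zero entry dominates any negative-order norm (0 ** p would divide by zero).
        return 0.0
    return sum(m ** p for m in magnitudes) ** (1.0 / p)


# Columns of [[10, 12, 13], [4, 0, 12]], i.e. reducing over dim=0.
columns = [(10, 4), (12, 0), (13, 12)]
neg_inf_norms = [vector_norm(col, -math.inf) for col in columns]
print(neg_inf_norms)
```

For the matrix in the commit message this gives the column-wise minimum absolute values, which agrees with the NumPy result quoted there.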
Zachary DeVito	c8ac878b98	Fix bug in script for where (#12385 ) Summary: Where is declared as: ``` where(Tensor condition, Tensor self, Tensor other) ``` Previously the compiler assumed that self must be the first argument. But this is not true in practice for `where` and for a few other exceptions. This changes the compiler to take an explicit self argument which gets matched to the `self` that appears in the schema. Note that this requires renaming a variant of pow, which referred to an exponent Tensor as `self` because otherwise that would cause `t^3` to match against `t` being the exponent. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12385 Differential Revision: D10364658 Pulled By: zdevito fbshipit-source-id: 39e030c6912dd19b4b0b9e35fcbabc167b4cc255	2018-10-16 21:05:14 -07:00
Natalia Gimelshein	a98958d3bd	dtype option for softmax (#11719 ) Summary: Add dtype argument to softmax/log_softmax functions. Computing softmax in fp32 precision is necessary for mixed precision training, and converting output of the previous layer into fp32 and then reading it as fp32 in softmax is expensive, memory and perf-wise, this PR allows one to avoid it. For most input data/dtype combinations, input data is converted to dtype and then softmax is computed. If input data is half type and dtype is fp32, kernels with the corresponding template arguments are called. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11719 Reviewed By: ezyang Differential Revision: D10175514 Pulled By: zou3519 fbshipit-source-id: 06d285af91a0b659932236d41ad63b787eeed243	2018-10-13 17:57:10 -07:00
Xiang Gao	97eec33f80	Allow tensor.device, tensor.dtype, and tensor.shape in JIT (#12363 ) Summary: Closes https://github.com/pytorch/pytorch/issues/12364 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12363 Differential Revision: D10362491 Pulled By: ezyang fbshipit-source-id: f2716e656977370c5ec51cb15f62b6376798e617	2018-10-12 11:29:04 -07:00
James Reed	2279299c6c	Implement aten::contiguous (#12541 ) Summary: Implement contiguous as `aten::contiguous` so it can be recorded during tracing. This was causing issues with both the trace checker as well as when a `contiguous()`-ed tensor was used downstream in a view that expected certain strides Pull Request resolved: https://github.com/pytorch/pytorch/pull/12541 Differential Revision: D10304028 Pulled By: jamesr66a fbshipit-source-id: dc4c878771d052f5a0e9674f610fdec3c6782c41	2018-10-11 23:39:39 -07:00
David Riazati	eb5fdc5fb5	Add default values in script (#12345 ) Summary: Add support for default values on script functions and Modules Followup to #11962 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12345 Reviewed By: michaelsuo Differential Revision: D10263613 Pulled By: driazati fbshipit-source-id: 9b380d8c3f8c4abb2d24c33b23c00ec5896ca372	2018-10-11 20:49:23 -07:00
Richard Zou	a1487bf874	Smarter differentiable subgraph slicing (#12175 ) Summary: If any inputs require_grad then the graph executor does differential subgraph slicing. The existing algorithm combines adjacent differentiable Node. There are two major motivations. The first is improving fusion opportunities: the graph fusion pass runs after differential subgraph slicing. This means that only nodes that are a part of the same differential subgraph may be considered for fusion. If something like the following happens, ``` y = f(x) k = not_differentiable_op(m) z = g(y) ``` and f and g are both fusible and differentiable operations, then they will be inserted into different differential subgraphs and not fused together. The second is to enable JIT optimizations on backward passes for things like an (automatically) unrolled LSTM. Right now, in an unrolled LSTM, we see something like the following: ``` lstm_cell() non_differentiable_list_op() lstm_cell() non_differentiable_list_op() lstm_cell() non_differentiable_list_op() ``` Each lstm_cell itself is differentiable and gets put into a separate differential subgraph. During the backwards pass, each prim::DifferentiableSubgraph has its own graph executor: these graph executors cannot talk to each other. It is better if we combined all of the lstm_cells (where applicable) into one differential subgraph so their backward passes are combined into one graph executor that can perform better optimizations than several separate graph executors. Think about the computation graph as a DAG where edges are data dependencies and vertices are operations (the nodes). Each vertex is either black or red; a vertex is colored black if it is differentiable and red otherwise. The goal is to contract edges (merge nodes) to have the fewest black vertices remaining such that the graph is still a DAG. The algorithm is the following: - Take the Graph& and create a shadow "DynamicDAG" object to wrap Node and edges. 
Each Vertex holds multiple Node* (but starts out holding one Node) and each edge is a data dependency. - Greedily contract vertices in the DynamicDAG if they are "differentiable". This operation is unrelated to the Graph&. - A Vertex is "differentiable" if all the nodes it holds are differentiable. - When contracting vertices, combine their Node contents. - The DynamicDAG keeps its vertices in topological order and complains if the contraction is invalid so everything is good. - Take the DynamicDAG: reorder the nodes in the Graph& to match the topological order in the DynamicDAG. - Finally, go through each Vertex in the DynamicDAG: if it contains multiple Node* then merge all of them into a prim::DifferentiableGraph. The DynamicDAG is based off of the dynamic top sort algorithm in [this paper](https://www.doc.ic.ac.uk/~phjk/Publications/DynamicTopoSortAlg-JEA-07.pdf) by Pearce and Kelly. Each contractEdge(producer, consumer) call is `O(|AR| log |AR| * min(|out_edges(producer)|, |in_edges(consumer)|))` where `AR` is the "affected region" (defined as the set of nodes that, in topological order, are between producer and consumer). By only considering contractions such that `|ord(producer) - ord(consumer)| < threshold1` and `|out_edges(producer)| < threshold2` we can make each contractEdge(producer, consumer) call take constant time. The resulting algorithm is linear in the number of nodes. Added a lot of small test cases. Looking for suggestions on the following: - what big computation graphs should I run this on to test how fast or slow it is? - what things other than correctness should I be thinking about when I test this? cc apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/12175 Differential Revision: D10302564 Pulled By: zou3519 fbshipit-source-id: 8a94d130d82f8a1713cc28483afef9a72d83d61a	2018-10-11 16:20:53 -07:00
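The contraction idea in the commit above can be illustrated with a small pure-Python sketch. It is a toy under stated assumptions: it checks for cycles with a plain graph search instead of the Pearce and Kelly dynamic topological sort, so it is not linear time, and the names `has_other_path` and `contract_differentiable` are hypothetical, not taken from the PyTorch source. An edge between two differentiable vertices is contracted only when no second path connects them, since merging across such a path would create a cycle.

```python
def has_other_path(succ, src, dst):
    """True if dst is reachable from src without using the direct edge src -> dst."""
    stack = [n for n in succ[src] if n != dst]
    seen = set(stack)
    while stack:
        cur = stack.pop()
        if cur == dst:
            return True
        for nxt in succ[cur]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def contract_differentiable(nodes, edges, differentiable):
    """Greedily merge differentiable vertices joined by an edge, keeping the graph acyclic."""
    members = {n: [n] for n in nodes}   # vertex -> original nodes it holds
    succ = {n: set() for n in nodes}    # vertex -> successor vertices
    for src, dst in edges:
        succ[src].add(dst)

    changed = True
    while changed:
        changed = False
        for src in list(succ):
            if src not in differentiable:
                continue
            for dst in sorted(succ[src]):
                if dst in differentiable and not has_other_path(succ, src, dst):
                    # Merge dst into src: combine contents and rewire edges.
                    members[src] += members.pop(dst)
                    succ[src].discard(dst)
                    succ[src] |= succ.pop(dst)
                    for outs in succ.values():
                        if dst in outs:
                            outs.discard(dst)
                            outs.add(src)
                    succ[src].discard(src)
                    changed = True
                    break
            if changed:
                break
    return sorted(members.values())


# y = f(x); k = not_differentiable_op(m); z = g(y): f and g end up together.
groups = contract_differentiable(["f", "nd", "g"], [("f", "g")], {"f", "g"})
print(groups)

# a -> b -> c with b non-differentiable, plus a -> c: merging a and c would form a cycle.
blocked = contract_differentiable(
    ["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")], {"a", "c"}
)
print(blocked)
```

In the first example the non-differentiable op sits between `f` and `g` in program order but shares no data dependency with them, so they can be merged into one group; in the second the merge is refused.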
James Reed	0f9807ee61	Enable addmm fusion for ONNX export only (#12538 ) Summary: There's some action at a distance issues and not having this is disabling quantization in C2 for prod use cases ref T34831022 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12538 Differential Revision: D10302931 Pulled By: jamesr66a fbshipit-source-id: 700dc8c5c4297e942171992266ffb67b815be754	2018-10-11 13:57:50 -07:00
James Reed	a4120fa132	Get rid of emitApplyIdent (#12504 ) Summary: And reroute builtin/CompilationUnit function resolution through one resolution pathway Pull Request resolved: https://github.com/pytorch/pytorch/pull/12504 Differential Revision: D10319920 Pulled By: jamesr66a fbshipit-source-id: 3ab9877664dd32b97136a7625d0688e1adc0c022	2018-10-11 10:53:53 -07:00
Roy Li	1a0d82e4f4	fix import for script module with control flow blocks (#12351 ) Summary: The value_info proto field was being processed in BuildGraph, but control flow blocks used buildBlocks instead. This PR moves that step to BuildBlock. I removed DecoderBase because it was making the code confusing and we never needed it in the first place. closes #12319 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12351 Differential Revision: D10212411 Pulled By: li-roy fbshipit-source-id: 47f289a462a1ab7391ff57368185401673980233	2018-10-08 22:25:14 -07:00
Elias Ellison	00aedfc0e2	constant pooling pass (#12222 ) Summary: Add a pass to move all constants to the beginning of the graph, and deduplicate. This extends https://github.com/pytorch/pytorch/pull/10231 to also handle constants introduced in inlining, constant propagation, etc. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12222 Reviewed By: driazati Differential Revision: D10201616 Pulled By: eellison fbshipit-source-id: bc9c5be26868c8b5414257a0d4462de025aeb9bd	2018-10-08 11:55:02 -07:00
David Riazati	92b0e7026e	Add weak script mode for script functions (#11963 ) Summary: This PR is the start of weak script mode for functions Weak scripts allow you to compile a graph from Python code at runtime by annotating with `torch.jit.weak_script` for use in the JIT without affecting eager execution. Scripts are compiled lazily on the first call in a graph to avoid long Python startup times. apaszke zdevito ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/11963 Differential Revision: D10183451 Pulled By: driazati fbshipit-source-id: 128750994d5eb148a984f8aba4113525c3e248c8	2018-10-05 18:55:49 -07:00
Zachary DeVito	b937cbb776	Fix a bug that would resize tensor storage on export Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12377 Differential Revision: D10219213 Pulled By: zdevito fbshipit-source-id: 85cfa4467c672ff5a718e58cfae7e8c8b1cfc532	2018-10-05 16:24:54 -07:00
David Riazati	f0b73ff790	Pretty printer improvements (#12179 ) Summary: * Replaces `prim::PythonOp` with the name of the function being called * Delays printing values used in `prim::Return` nodes until the return node itself if that is the only place the value is used to remove some useless assigns zdevito apaszke ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/12179 Differential Revision: D10132661 Pulled By: driazati fbshipit-source-id: cbc4ac34137ed5872049082e25d19eb1ebc71208	2018-10-04 15:14:51 -07:00
David Riazati	c9f9df002d	Properly catch errors in PythonOps (#12243 ) Summary: If a PythonOp throws an error it raises an exception to the interpreter and also releases the GIL which causes [pybind to segfault](https://github.com/potassco/clingo/issues/42) This fix catches pybind errors while the GIL is still held and throws a `python_error` to re-capture the GIL Fixes #12118 apaszke Pull Request resolved: https://github.com/pytorch/pytorch/pull/12243 Differential Revision: D10182787 Pulled By: driazati fbshipit-source-id: 719d4a7c3294af201e061cf7141bec3ca0fb1f04	2018-10-03 17:25:03 -07:00
David Riazati	d1ac1eba3b	Add `bool` type to IR (#11834 ) Summary: This PR adds a bool type to `IValue` and puts it into place. * changes conds for `prim::If` and `prim::Loop` to use `bool` type * changes operators that take `bool`s to match their native ops * fixes ambiguous `aten` ops `aten::std` and `aten::var` * fixes tests in `test_jit.py TestJitGenerated` ``` 'test_std_dim', 'test_std_dim_1d', 'test_std_dim_1d_neg0', 'test_std_dim_neg0', 'test_var_dim', 'test_var_dim_1d', 'test_var_dim_1d_neg0', 'test_var_dim_neg0' ``` * adds `prim::BoolToTensor` and `prim::TensorToBool` apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11834 Differential Revision: D9928570 Pulled By: driazati fbshipit-source-id: 373c53df2f1a8ffa9e33d9a517002fbeef25f3eb	2018-10-03 12:40:03 -07:00
Elias Ellison	fed91f873f	(Very small) allow trailing commas in assign or tuples (#11723 ) Summary: Allow trailing commas in assign statements or tuples, which also allows single element tuples. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11723 Differential Revision: D10052162 Pulled By: eellison fbshipit-source-id: 344d908a3ad942a23ebd9f341794bc9734226aa8	2018-10-01 10:10:13 -07:00
iotamudelta	a2ebbccc9f	fix unit tests on CI Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12187 Differential Revision: D10118483 Pulled By: bddppq fbshipit-source-id: 986c8fb48d61e00103c713548a50e74489a0e442	2018-09-28 23:11:55 -07:00
mruberry	7b2c0a09e4	Adds support for NaN, +inf, -inf float scalars to CPU and CUDA fusers (#12070 ) Summary: In current upstream float scalars are always written into kernels with: `out << std::scientific << v << "f";` When the floats are special values like NaN, +inf, or -inf this produces nonsense that causes compilation to fail. This fix updates the conversion of float scalars to device-specific special values. The appropriate macros are added to the CPU and CUDA resource strings. Note that a NAN macro was not necessary on the CPU since math.h defines NAN. To verify this fix I updated the test_clamp_fusion test in test_jit.py. I wanted to test -inf, too, but -inf is not currently accepted by the interpreter. Edit: Forgot to mention, this partially addresses issue #12067. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12070 Reviewed By: ezyang Differential Revision: D10044704 Pulled By: soumith fbshipit-source-id: 8f4a930862d66a7d37d985e3f6a6fb724579e74c	2018-09-28 14:11:49 -07:00
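The problem described above can be sketched in Python: formatting a float in scientific notation with an `f` suffix works for ordinary values but yields invalid source text for NaN and the infinities, so those need to map to named constants. The macro names `NAN`, `POS_INFINITY` and `NEG_INFINITY` and the function `format_float_scalar` below are assumptions for illustration; the commit message does not state the exact names used in the fusers' resource strings, apart from noting that `math.h` defines `NAN`.

```python
import math


def format_float_scalar(value):
    """Render a Python float as a C-style float literal for generated kernel code.

    Hypothetical helper; the special-value macro names are assumed.
    """
    if math.isnan(value):
        return "NAN"
    if math.isinf(value):
        return "POS_INFINITY" if value > 0 else "NEG_INFINITY"
    # Mirrors `out << std::scientific << v << "f"` for ordinary values.
    return f"{value:e}f"


literals = [format_float_scalar(v) for v in (1.5, float("nan"), float("inf"), float("-inf"))]
print(literals)
```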
Luca Antiga	5be0baefa2	Use streams in JIT serialization, allow JIT serialization to/from buffer (#11932 ) Summary: This PR replaces the use of `std::FILE` with `istream`/`ostream` for JIT serialization. It uses this mechanism to add the possibility to serialize to/from binary buffers, in addition to files, both in `libtorch` and from Python. `getExportImportCopy` in `test_jit.py` has been updated so that both file and buffer codepaths are exercised during tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11932 Differential Revision: D10084303 Pulled By: apaszke fbshipit-source-id: b850801b3932922fa1dbac6fdaed5063d58bc20d	2018-09-28 07:54:27 -07:00
Michael Suo	7f35e92af2	mutable lists (#10700 ) Summary: This PR implements the design that we discussed. Changes: - Added a World token IValue and type. The IValue is basically a dummy struct for now, in the future we may extend it (say, add thread-local state). - Effectful ops explicitly declare they are mutable by having World tokens as inputs and outputs in their schema. - Purely functional ops that use mutable values will get "fenced" and the world token will be threaded through the fences - AnnotateEffects pass which wires up all the world tokens together. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10700 Reviewed By: eellison Differential Revision: D9547881 Pulled By: michaelsuo fbshipit-source-id: ebbd786c31f15bf45e2ddb0c188438ff2f5f3c88	2018-09-27 19:25:13 -07:00
Zachary DeVito	478803a75f	Introduce type variables to implement generic list operators (#12040 ) Summary: We generate specialized list operations for int, float, and Tensor lists so that small lists of integers like the arguments to conv do not involve tons of boxing code. This PR adds a fallback GenericList for List types that contain any other type. It does so by adding type variables to `jit::Type`, and machinery for matching/replacing the type variables during `tryMatchSchema` and operator lookup. It also modifies the builtin list ops to include a fallback that works on a GenericList object that simply holds IValues. This is distinguished from IValue's tuple type so that conversion to/from Python still happens losslessly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12040 Differential Revision: D10037098 Pulled By: zdevito fbshipit-source-id: 0c5f2864d12e7d33554bf34cc29e5fb700dde150	2018-09-26 17:02:51 -07:00
Adam Paszke	18f9c07b18	Enable tracing of tensor factories with an out argument Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12051 Differential Revision: D10044890 Pulled By: apaszke fbshipit-source-id: 2d794bf408875600bc71f354f0b4961d6b715094	2018-09-26 09:40:34 -07:00
Richard Zou	c8a0b11b7f	add autodiff expressions for common operations (#11832 ) Summary: This PR does a few things: Previously test_jit.py only tested autograd on backward graphs. This is because we borrow from test_autograd and construct graphs with a small number of nodes. Because the number of nodes is small (typically 1-2), those graphs do not end up containing autodiff subgraphs, so autodiff never gets tested. This PR enables autodiff testing by doing the following: - added disableDebugAutodiffSubgraphInlining fn to graph_executor to disable autodiff subgraph inlining. - (implementation) added autodiffSubgraphNodeThreshold and autodiffSubgraphInlineThreshold. These are set to their default values (2, 5) but disableDebugAutodiffSubgraphInlining() sets both to 1, disabling subgraph inlining and allowing 1-node autodiff subgraphs. - The relevant backward jit tests disable autodiff subgraph inlining so they will test the autodiff versions of the operators instead of autograd whenever an autodiff variant exists. - We don't run the tests that do inline autodiff subgraphs anymore. This has no impact on testing correctness because the assumption is that autograd functions are correct and are tested in test_autograd.py This allows the graph fuser to work better because a lot of these ops were previously not autodiff-compatible but fusible. On a more concrete example, lstm backward contains a lot of tensor-scalar operations; these autodiff formulas help its double backward pass. Included: - arithmetic overloads - abs, acos, asin, atan, ceil, cos, cosh, exp, expm1, floor, fmod, frac, log, log10, log1p, log2, reciprocal, remainder, round, sin, sinh, tan, trunc, rsqrt TestJitGenerated tests autodiff for all of the added operations. cc apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11832 Differential Revision: D10031256 Pulled By: zou3519 fbshipit-source-id: 9daf9900a5ad187743609cd0fbbd10b15411ad93	2018-09-26 08:10:04 -07:00
Adam Paszke	a830964007	Eliminate no-op adds and muls in peephole pass (#11801 ) Summary: Because we emit a lot of them in our symbolic AD. This brings down the backward time of an LSTM I'm testing from 14.2ms to 12.5ms (a 15% improvement). Pull Request resolved: https://github.com/pytorch/pytorch/pull/11801 Differential Revision: D9916815 Pulled By: apaszke fbshipit-source-id: 2d9cb886c424ccd43b9f996aad89950d3bddf494	2018-09-24 17:48:48 -07:00
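As a rough illustration of what a peephole pass of this kind does, the sketch below removes additions of 0 and multiplications by 1 from a toy expression tree. The tuple-based representation and the `simplify` function are hypothetical and unrelated to the JIT's actual IR; a real pass on tensors would also have to confirm that dropping the op does not change the result's shape or dtype.

```python
def is_const(expr, value):
    """True if expr is a numeric constant equal to value (bools excluded)."""
    return isinstance(expr, (int, float)) and not isinstance(expr, bool) and expr == value


def simplify(expr):
    """Recursively drop `x + 0` and `x * 1` from a nested (op, lhs, rhs) tuple tree."""
    if not isinstance(expr, tuple):
        return expr  # leaf: a variable name or a constant
    op, lhs, rhs = expr
    lhs, rhs = simplify(lhs), simplify(rhs)
    identity = {"add": 0, "mul": 1}.get(op)
    if identity is not None:
        if is_const(rhs, identity):
            return lhs
        if is_const(lhs, identity):
            return rhs
    return (op, lhs, rhs)


# (x * 1) + 0 collapses to x; 2 * x is left alone.
collapsed = simplify(("add", ("mul", "x", 1), 0))
untouched = simplify(("mul", 2, "x"))
print(collapsed, untouched)
```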
Adam Paszke	51414822f5	Stop moving constants into DifferentiableSubgraphs (#11809 ) Summary: Or even taking them as inputs. This prevents optimizations to happen either inside the differentiable subgraphs, or in the surrounding graph. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11809 Differential Revision: D10009680 Pulled By: apaszke fbshipit-source-id: face638566228e470a6deec48dc2aa3a1cce26d4	2018-09-24 13:24:53 -07:00
Richard Zou	b5f60af94c	Shape prop view/reshape/as_strided through prim::ListConstructs (#11877 ) Summary: Previously, aten::view returned a Dynamic type when attr::size is a prim::ListConstruct. See [this for a repro](https://gist.github.com/zou3519/cbd610472ba3369f556fa612a7d93b28). This prevented a pre-multiplied lstm input graph from being fusible (aten::view is necessary to do premultiplication). If aten::view is passed an output of a prim::ListConstruct node, then shape prop should be able to figure out its TensorType because we statically know the number of inputs to prim::ListConstruct. This PR implements that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11877 Differential Revision: D9972356 Pulled By: zou3519 fbshipit-source-id: cb87786f6e7f222d4b8f07d8f2a9de34859cb6a5	2018-09-21 14:20:01 -07:00
Adam Paszke	7efbf3a827	Specialize ArgumentSpecs on tuple elements too (#11863 ) Summary: This is pretty important because a common situation of passing LSTM hidden states as a tuple completely trashes performance of a network. Cleans up all our propagation/undef specialization passes, at a cost of increased complexity of `ArgumentSpec` and `GraphExecutor`. An alternative would be to simply flatten all tuple inputs to a graph ahead of time, but that might just end up being confusing in the future (you never know if you're working with a graph that can have tuple or not). Pull Request resolved: https://github.com/pytorch/pytorch/pull/11863 Differential Revision: D9992814 Pulled By: apaszke fbshipit-source-id: 0a565a3b23e32f8fa72c0534e07c1ce6187739fc	2018-09-21 14:19:58 -07:00
Adam Paszke	1ad7e0c5ec	Minor JIT improvements (#11654 ) Summary: - Disable addmm fusion. The reason for this is explained in the comment. - Tiny change in `stack.h` that lets us avoid constructing an unnecessary temporary `IValue` on the (C++) stack (it will only get created on the interpreter stack directly). - Fixed a correctness issue in requires grad propagation Pull Request resolved: https://github.com/pytorch/pytorch/pull/11654 Reviewed By: colesbury Differential Revision: D9813739 Pulled By: apaszke fbshipit-source-id: 23e83bc8605802f39bfecf447efad9239b9421c3	2018-09-21 14:19:54 -07:00
David Riazati	4e65fbfee5	Remove tests from EXCLUDE_SCRIPT that pass (#11916 ) Summary: Spuriously added in #11261 I had a PR to catch these automatically (#11279), but it had some issues passing on some CI environments but not others (e.g. for `test_nn_group_norm`), any ideas? Pull Request resolved: https://github.com/pytorch/pytorch/pull/11916 Differential Revision: D9992065 Pulled By: driazati fbshipit-source-id: 05cfa8ed9af939e8ffd5827847ee7bfe0be799b2	2018-09-21 14:19:50 -07:00
Luca Antiga	58d28a5f12	Fix saving loaded module (#11915 ) Summary: This PR fixes #11913. In order to test for this, the model is serialized twice in `getExportImportCopy`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11915 Differential Revision: D9984697 Pulled By: soumith fbshipit-source-id: ae0250c179000c03db1522b99410f6ecb9681297	2018-09-21 06:58:16 -07:00
yya007	b91b15d86e	Implementing Matrix Norm for torch.norm (#11261 ) Summary: Currently, norm function only supports vector norm. This PR extends vector norm to matrix norm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11261 Reviewed By: li-roy Differential Revision: D9652379 Pulled By: yya007 fbshipit-source-id: 519b3fb80b563c17c56a24675c7b0e46bf5a3a1c	2018-09-20 14:43:13 -07:00
Thomas Viehmann	068eac255b	Jit fuse clamp (#11574 ) Summary: This patch adds fused forward and backward for clamp to the jit. This is one item of #11118 . If it's OK, I'd be happy to also add some more of #11118 . The patch depends on #11150 , which I merged into master as a base. I'll rebase it when that or #10981 is merged. This is first serious jit patch, thank you, ngimel and the others for their guidance. All errors are my own. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11574 Differential Revision: D9943090 Pulled By: apaszke fbshipit-source-id: c40954b8c28c374baab8d3bd89acc9250580dc67	2018-09-20 14:43:10 -07:00
Richard Zou	8f4601fbac	re-enable test_scalar_fusion Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11378 Differential Revision: D9943578 Pulled By: zou3519 fbshipit-source-id: fb9e4303e844d5e2515acce7869bcbe11526ab56	2018-09-20 07:56:25 -07:00
David Riazati	a79f5d77ad	Add pretty printer for JIT IR (#10319 ) Summary: Adds some pretty-printing capability to the IR graph to make debugging easier/more human readable, see `torch/csrc/jit/test_jit.cpp:925` and onwards for example outputs. Results aren't perfect yet but it's a start. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10319 Reviewed By: zdevito Differential Revision: D9558402 Pulled By: driazati fbshipit-source-id: 1d61c02818daa4c9bdca36d1477d1734cfc7d043	2018-09-18 17:39:44 -07:00
Wanchao Liang	d4e1fa45d0	allow no-alpha add/sub in onnx symbolic (#10972 ) Summary: The PR fixes #10873 The context is aten::add and aten::sub ST overloads don't have alpha, so onnx symbolic does not match. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10972 Reviewed By: jamesr66a Differential Revision: D9724224 Pulled By: wanchaol fbshipit-source-id: eb5d1b09fa8f1604b288f4a62b8d1f0bc66611af	2018-09-18 13:55:39 -07:00
David Riazati	7671f4ab1c	Add `math` to scope when using inf in tests (#11302 ) Summary: This fixes #8515 which was mostly issues in the tests themselves. As long as `math` is imported in the scope in which the script runs it resolves to a `prim::Constant` with value `inf` correctly. This PR adds this to the `test_jit.py` tests involving `inf` and adds a test to demonstrate `inf` in a non-generated test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11302 Differential Revision: D9684336 Pulled By: driazati fbshipit-source-id: 73df2848dfdb45ab50690a7c88df8fda269a64eb	2018-09-17 14:08:32 -07:00
Natalia Gimelshein	336323f53c	return aten::gt to the list of fusable operations, add expected graphs (#11150 ) Summary: Fixes one of #11118 issues. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11150 Differential Revision: D9861372 Pulled By: apaszke fbshipit-source-id: 98b196b89e991d3936360b30568360367fd32e8b	2018-09-17 13:40:41 -07:00
Mike Ruberry	96d3f968eb	Splits CPU and CUDA fusion compilers (#10981 ) Summary: This PR splits the CPU and CUDA fusion compilers, putting them into a new jit/fusers/ directory with jit/fusers/common for common components. In particular: - A fusion interface is created that allows "fusion handles" to be requested - The CPU and CUDA fusers implement this interface, with dispatch determined by device - The fusion compilers, fusion function specializations and resource strings are split - CPU-specific classes like TempFile and DynamicLibrary are in the CPU fuser - Common classes like TensorDesc and the base fusion function class are in jit/fusers/common - There is still some specialization in jit/fusers/common, but these specializations are small(-ish) - Updates the build system to remove the dummy interface on Windows and minimize the use of macros This structure should allow in-flight PRs to easily rebase while providing a clear interface to the fusers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10981 Reviewed By: soumith Differential Revision: D9701999 Pulled By: apaszke fbshipit-source-id: 3b6bec7b97e0444b2a93caa38d9b897f2e68c1b3	2018-09-14 14:05:34 -07:00
James Reed	278e304c18	Implement elif in string frontend (#11667 ) Summary: Closes #11625 Pull Request resolved: https://github.com/pytorch/pytorch/pull/11667 Differential Revision: D9828145 Pulled By: jamesr66a fbshipit-source-id: c72dc41cb310a4211b4e4c6b33f7e2c1fb3581a0	2018-09-14 10:09:46 -07:00
Adam Paszke	98e04db955	Implement requires_grad propagation in the JIT (#11586 ) Summary: Previously, we would pretty much assume that all floating point tensors do require grad, which might result in some unnecessary compute. I don't really like the fact that `TensorType` uses `tensor.is_variable() && tensor.requires_grad()` to infer the value of `requires_grad`, but changing constants to keep variables turns out to be pretty hard. I got halfway there, but it would still need some more work. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11586 Reviewed By: ezyang Differential Revision: D9813648 Pulled By: apaszke fbshipit-source-id: 77f77756d18ff7632fca3aa68ce855e1d7f3bdb8	2018-09-13 19:25:26 -07:00
James Reed	0f1ca569ce	End-to-end dynamic slicing with ONNX DynamicSlice experimental operator (#11255 ) Summary: Requires https://github.com/onnx/onnx/pull/1377 This PR makes it so that slices with dynamic boundary values can be exported from pytorch and run in caffe2 via ONNX. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11255 Differential Revision: D9790216 Pulled By: jamesr66a fbshipit-source-id: 6adfcddc5788df4d34d7ca98341077140402a3e2	2018-09-13 12:39:52 -07:00
Roy Li	75f49befeb	move instance_norm to aten (#10792 ) Summary: This also removes the usage of torch.onnx.symbolic_override in instance_norm. Fixes #8439. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10792 Differential Revision: D9800643 Pulled By: li-roy fbshipit-source-id: fa13a57de5a31fbfa2d4d02639d214c867b9e1f1	2018-09-13 12:26:22 -07:00
Richard Zou	45e9ee096e	Fix test_mnist_training_leaks_no_memory_cuda warning (#11639 ) Summary: Before this PR it would warn that "dropout is non deterministic and can cause problems when checking trace", so I disabled the trace checking. cc zdevito apaszke Pull Request resolved: https://github.com/pytorch/pytorch/pull/11639 Differential Revision: D9812493 Pulled By: zou3519 fbshipit-source-id: fab86928a5fba8b218b47543533aaf7c82a10b4a	2018-09-13 12:09:20 -07:00
David Riazati	6f53b4efea	Remove implicit bool casts (#11503 ) Summary: In order to comply with Python's rules on implicit casting of non-booleans to booleans, this PR removes implicit casting in favor of explicit casts via `bool()` cc zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11503 Differential Revision: D9780869 Pulled By: driazati fbshipit-source-id: c753acaca27f4e79dddf424c6b04674f44a6aad9	2018-09-13 11:26:45 -07:00
Zachary DeVito	ab3a2d25fb	Improve error messages when trying to use nested lists. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11606 Differential Revision: D9806949 Pulled By: zdevito fbshipit-source-id: c38abc4ce745a63d26a64f6aa1b41350e4b1acd5	2018-09-13 11:10:38 -07:00
Roy Li	a861573e36	fix tensor export bug in IR export (#11613 ) Differential Revision: D9811094 Pulled By: li-roy fbshipit-source-id: 012792dbedc70bd3fa242fdf2e39da0b21ce158d	2018-09-13 11:10:35 -07:00
Elias Ellison	77f6998e54	Guard against inputting or returning sparse tensors (#11550 ) Summary: Add guards against using sparse tensor by checking the conversion from IValue -> PyObject & PyObject -> IValue. This diff also changes the behavior in constant propagation to not run python ops even if all ops are constant because of possible mutation to global state. This came up in trying to run get_sparse(), and I'm including it here to make it easier to land. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11550 Differential Revision: D9804712 Pulled By: eellison fbshipit-source-id: 9fe7daf721c6d6e48df4925c0f9c775873bcdc77	2018-09-13 08:58:29 -07:00
Wanchao Liang	44b2b6b150	clean up jit generated tests (#11403 ) Summary: Clean up some generated tests now that we have nice new features like var args. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11403 Differential Revision: D9800545 Pulled By: wanchaol fbshipit-source-id: e9973b113f78dc38cf99a81b6ede3fa3485f1cfa	2018-09-12 22:55:03 -07:00
Wanchao Liang	739e6af869	Add remainder % to the jit Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11557 Reviewed By: apaszke Differential Revision: D9784642 Pulled By: wanchaol fbshipit-source-id: b7c60c3e9534555c9d7db83769965b3f2f277cdf	2018-09-12 12:40:38 -07:00
Zachary DeVito	ad7936e108	Fix reloading modules back into python (#11552 ) Summary: This changes the way module import works so that when a module is reloaded in python it becomes a ScriptModule and not a _C.ScriptModule Pull Request resolved: https://github.com/pytorch/pytorch/pull/11552 Differential Revision: D9782751 Pulled By: zdevito fbshipit-source-id: 9576850b75494b228ce3def94c0d371a4a44b11d	2018-09-12 12:25:15 -07:00
Richard Zou	13b05c8c78	Add EndToEndHybridModel CUDA tests (#11544 ) Summary: Also adds two additional tests that check for memory leaks while the relevant graph executors are alive: - (minimal test): Create a ScriptModule, keep it alive, and test that it does not leak memory while it is alive - (large test) Do MNIST training with a traced MNIST module and test that no memory is leaked while the traced module (with graph executor) is alive cc apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11544 Reviewed By: apaszke Differential Revision: D9778479 Pulled By: zou3519 fbshipit-source-id: 2d6cdea81dd1264f2c0396b662f70fdafecb3647	2018-09-12 11:25:18 -07:00
Adam Paszke	62c9d4ac96	Make .to() methods native functions (to fix JIT tracing) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11491 Differential Revision: D9771121 Pulled By: apaszke fbshipit-source-id: 08d11101fb12093f8cf913b06359adddf3af9da7	2018-09-11 21:55:42 -07:00
Adam Paszke	8b196d671b	Allow tracing random functions (only when using default generators) (#11539 ) Summary: Fixes #11504. zdevito, neerajprad, fritzo Pull Request resolved: https://github.com/pytorch/pytorch/pull/11539 Differential Revision: D9777897 Pulled By: apaszke fbshipit-source-id: 56983260f5b93da7d5540a6242769ea7bd50eb06	2018-09-11 17:56:39 -07:00
Zachary DeVito	289a8c9b7d	Allow train/eval, and non-Tensor arguments to python functions (#11505 ) Summary: This whitelists train/eval functions in script modules, and tests that nested nn.Modules still work. This also changes the code for calling python functions from script to allow non-tensor inputs/outputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11505 Differential Revision: D9765466 Pulled By: zdevito fbshipit-source-id: 1177bff931324422b69e18fa0bbaa82e3c98ec69	2018-09-11 15:05:09 -07:00
James Reed	deac304b6b	Bugfix for basic slicing Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11428 Differential Revision: D9753999 Pulled By: jamesr66a fbshipit-source-id: cfc4163a5a06b41beb808a4e24650d71f5d91f4f	2018-09-11 09:39:29 -07:00
Adam Paszke	120d769432	Add support for tracing strings (#11506 ) Summary: This enabled `torch.einsum` both in tracing and in script mode. It's used all over Pyro at the moment, and is needed for any use of the JIT in there. Fixes #11157. zdevito fritzo neerajprad Pull Request resolved: https://github.com/pytorch/pytorch/pull/11506 Differential Revision: D9764787 Pulled By: apaszke fbshipit-source-id: 9b5251b9e7c5897034602bd07ff67b425d33326c	2018-09-11 06:02:41 -07:00
Adam Paszke	0ddbe668cd	Improve shape analysis to cover all most commonly used ops (#11358 ) Summary: [Here's a list](https://gist.github.com/apaszke/f0821840bdcc67a977832dc58acc1b85) of ops that are in `register_aten_ops.cpp`, but aren't supported in shape prop. Everything else should work now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11358 Differential Revision: D9753693 Pulled By: apaszke fbshipit-source-id: efeae0126ce16cb56b8797fc5246405588bcae3c	2018-09-11 06:02:39 -07:00
James Reed	3ad67c60f0	Traceable explicit Variable instantiation (#11463 ) Summary: There's a bunch of legacy code where people are explicitly instantiating Variable, and these call-sites have thus far been untraceable (appearing as prim::Constant nodes with the tensor value at the time of tracing). This makes it so that the new variable inherits the traced Value* from the tensor it's being constructed from Pull Request resolved: https://github.com/pytorch/pytorch/pull/11463 Differential Revision: D9756529 Pulled By: jamesr66a fbshipit-source-id: da99c6a7621957a305f2699ec9cb9def69b1b2d7	2018-09-10 17:03:24 -07:00
Adam Paszke	3e665cc29b	Improve support for tracing sizes, add more tracer warnings (#11288 ) Summary: Many constructors like `torch.zeros` or `torch.randn` didn't support size tracing correctly which is fixed by this pass. Same issue has been fixed in legacy tensor constructors. Additionally, new tensor constructors, which do not participate in tracing (most notably `torch.tensor`, `torch.as_tensor` and `torch.from_numpy`) raise a warning when they are used. Finally, entering a traceable operation disables the tracing in its body. This is needed because zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11288 Reviewed By: ezyang Differential Revision: D9751183 Pulled By: apaszke fbshipit-source-id: 51444a39d76a3e164adc396c432fd5ee3c8d5f7f	2018-09-10 15:22:48 -07:00
Elias Ellison	2158f4a9c8	add export import test to TestJitGenerated (#10982 ) Summary: Checking assertExportImport for all of the generated test jit tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10982 Differential Revision: D9636935 Pulled By: eellison fbshipit-source-id: f3f1ce77d454848098f2ac7e0fa18bf8564890be	2018-09-10 11:37:05 -07:00
Tongzhou Wang	d3f98b5ffc	Add matrix power (#11421 ) Summary: vishwakftw Your patch needed some updates because the default native function dispatches changed from `[function, method]` to `[function]`. The CI was run before that change happened so it still shows green, but the internal test caught it. I did some changes when rebasing and updating so I didn't just force push to your branch. Let's see if this passes CI and internal test. If it does, let me know if you want me to force push to your branch or use this PR instead. Note to reviewers: patch was already approved at #10068 . cc yf225 Pull Request resolved: https://github.com/pytorch/pytorch/pull/11421 Differential Revision: D9733407 Pulled By: SsnL fbshipit-source-id: cf2ed293bb9942dcc5158934ff4def2f63252599	2018-09-08 15:25:56 -07:00
James Reed	47c1de25e8	Test exporting batch norm, dropout, RNN Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11126 Differential Revision: D9727689 Pulled By: jamesr66a fbshipit-source-id: f142257a2fba27d86844bf33084174f1f68a8ca5	2018-09-07 19:41:39 -07:00
James Reed	4ae16c9ad9	Recursive descent for validation + convert expands in ATen fal… (#11356 ) Summary: …lback Pull Request resolved: https://github.com/pytorch/pytorch/pull/11356 Differential Revision: D9721002 Pulled By: jamesr66a fbshipit-source-id: eeb50b56f8a72e929860c5e459a5ab50ac624814	2018-09-07 16:39:36 -07:00
David Riazati	4bf5fc44c8	Fix split_size test failures (#11051 ) Summary: ~~This PR fixes #8525 by renaming `split_with_sizes` to `split` so that 2 `aten::split` ops are generated (previously `aten::split(self, int, int)` and `aten::split_with_sizes(self, int[], int)` were generated)~~ ~~`split_with_sizes` was made in PR #5443, but I don't see a reason for it to have a different name than `split` rather than just overload `split`.~~ This PR fixes #8525 by adding `register_special_ops.cpp` to mirror Python dispatching from `split` to `split` and `split_with_sizes` in [tensor.py](https://github.com/pytorch/pytorch/blob/master/torch/tensor.py#L279). It also fixes #8520 by adding an `int[]` wherever it sees `torch.Size` In a follow up PR this could also be used to fix some of the other `unknown builtin op` test errors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11051 Differential Revision: D9582443 Pulled By: driazati fbshipit-source-id: d27201f85937d72e45e851eaa1460dd3dd1b61a9	2018-09-07 15:39:24 -07:00
Wanchao Liang	69b4b45f91	enable missing nn tests with single grad check, minor refactor Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11366 Differential Revision: D9723305 Pulled By: wanchaol fbshipit-source-id: 9e7e2e7e68cb4919610bccfbf76fa33b647f6eb7	2018-09-07 14:27:46 -07:00
Edward Yang	2946b021e3	Disable flaky test, see #11360 (#11361 ) Summary: Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/11361 Reviewed By: yf225 Differential Revision: D9696524 Pulled By: ezyang fbshipit-source-id: f6801d6f4f34090d467b16810db9cf576d5d519b	2018-09-06 20:40:00 -07:00
Richard Zou	4d678790c5	enable advanced indexing with tensors (#10862 ) Summary: On the way to #10774 This PR adds advanced indexing with tensors. The approach is to desugar advanced indexing into an at::index op. This is exactly how normal pytorch does it. [(I used this code as reference)](https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/python_variable_indexing.cpp) Supporting sequences is a little tricky because JIT script doesn't have an easy way to turn arbitrary n-dimensional python lists into a tensor (it would be easy if we supported `torch.tensor`), so that'll come in a future PR. cc jamesr66a zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10862 Differential Revision: D9659449 Pulled By: zou3519 fbshipit-source-id: 56d293720d44c0fd27909e18327ab3985ddfced6	2018-09-06 16:41:45 -07:00
Richard Zou	1ad61a18b2	Rename cuda tests to have 'cuda' in their names (#11332 ) Summary: Not a lot changed Pull Request resolved: https://github.com/pytorch/pytorch/pull/11332 Differential Revision: D9683680 Pulled By: zou3519 fbshipit-source-id: 95f444e54049dd268fc10effe425ef2df79c6467	2018-09-06 11:57:52 -07:00
Elias Ellison	4ae95738b2	Ignore FuseGraph Call on Windows (#11015 ) Summary: Fusion is not yet implemented (NYI) on Windows, so ignore FuseGraph call instead of failing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11015 Differential Revision: D9619121 Pulled By: eellison fbshipit-source-id: ad09aeaa41b7fdeb9ca7bf5e1c166923ca405b15	2018-09-06 09:54:51 -07:00
Richard Zou	656e81db93	Fix scalar tensor assert in fusion compiler (#10952 ) Summary: Fixes #8560. Unblocks #10715. The assert (nDim <= uncompressedDims) was being triggered for a scalar tensor because we compute nDim to be 1 for a scalar tensor but uncompressedDim = 0. This PR changes it so that we compute nDim to be 0 for a scalar tensor. This works because indexing in a kernel depends on nDim. If nDim = 0, then offset is always 0, which is what we want. Some other (small) changes were necessary to make this work: - One cannot define a 0-length array `IndexType arr[0]` so the code guards against that - Needed to change some of the maxTensorInfoSize logic to handle the case when uncompressedDim == 0. cc apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10952 Differential Revision: D9544607 Pulled By: zou3519 fbshipit-source-id: 2b873f47e2377125e1f94eb1b310a95cda51476c	2018-09-06 07:54:57 -07:00
Richard Zou	68c2e014cb	Handling for py2/py3 division differences (#11016 ) Summary: - In Python 2, use of `/` (regardless of int/float/Tensor) causes a compiler error if `from __future__ import division` is not imported in the file. - The / operator is universally set to do "true" division for integers - Added a `prim::FloorDiv` operator because it is used in loop unrolling. The error if users use '/' in python 2 without importing from __future__ occurs when building the JIT AST. cc apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11016 Differential Revision: D9613527 Pulled By: zou3519 fbshipit-source-id: 0cebf44d5b8c92e203167733692ad33c4ec9dac6	2018-09-05 14:57:38 -07:00
Roy Li	9fc22cb772	Add import export step to end to end tests Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10717 Differential Revision: D9562888 Pulled By: li-roy fbshipit-source-id: 8f5d62fd0a44aca0a41dc10438e7bb91cc2a972a	2018-09-05 09:39:47 -07:00
Adam Paszke	6d6655e6be	Port PackedSequences functions to C++ (#11224 ) Summary: zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11224 Differential Revision: D9652703 Pulled By: apaszke fbshipit-source-id: 558e39457e590cad07516e5bb2ecb12789564950	2018-09-05 06:35:15 -07:00
Adam Paszke	b7038f7c37	Treat numerical differences as warnings instead of errors when tracing (#11246 ) Summary: Also, make `torch.isclose` work with integral tensors and refactor `_check_trace` a bit. zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11246 Differential Revision: D9652701 Pulled By: apaszke fbshipit-source-id: fb0bdbfd1952e45e153541e4d471b423a5659f25	2018-09-05 06:35:13 -07:00
Zachary DeVito	1eed7d5f0b	Report an error when trying to record a mutable operator when (#11129 ) Summary: there are multiple views of the tensor live. Also adds recording for copy_ because this is the critical in place op where these views will cause LHS indexing to fail. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11129 Differential Revision: D9600195 Pulled By: zdevito fbshipit-source-id: bfd8f5befa47377e36d704dbdb11023c608fe9a3	2018-09-04 13:40:51 -07:00
Elias Ellison	539579aa9a	Logical short circuit (#11116 ) Summary: Adding short circuit evaluation to AND or OR. The second expression of an AND or OR gets lifted into an if branch, which is conditionally evaluated. BatchOps was using the expression `dims = dims1 or dims2`, where dims is often an empty tensor. This now throws an error, because dims1 gets cast to a boolean, and you can't convert an empty tensor to a scalar. It now matches the behavior of pytorch in python. One thing that came up is if the second expression in an and/or in python gets returned, it does not get coerced to a boolean. `tensor == (False or tensor)` `tensor == (True and tensor)` We do not currently support this. edit: wording Pull Request resolved: https://github.com/pytorch/pytorch/pull/11116 Differential Revision: D9618168 Pulled By: eellison fbshipit-source-id: 93b202be2f222d41f85d38d9c95f04d1749e8343	2018-09-04 09:25:13 -07:00
iotamudelta	33c7cc13ca	improve docker packages, fix bugs, enable tests, enable FFT (#10893 ) Summary: * improve docker packages (install OpenBLAS to have at-compile-time LAPACK functionality w/ optimizations for both Intel and AMD CPUs) * integrate rocFFT (i.e., enable Fourier functionality) * fix bugs in ROCm caused by wrong warp size * enable more test sets, skip the tests that don't work on ROCm yet * don't disable asserts any longer in hipification * small improvements Pull Request resolved: https://github.com/pytorch/pytorch/pull/10893 Differential Revision: D9615053 Pulled By: ezyang fbshipit-source-id: 864b4d27bf089421f7dfd8065e5017f9ea2f7b3b	2018-09-02 08:54:42 -07:00
James Reed	43e73f85ad	Dont optimize slicing dispatch when we are tracing (#11156 ) Summary: Previously when we had a slicing expression like `x[0:5, 0]`, where the sliced tensor was of size `5` in dimension 0, we would skip dispatching the actual slice call as an optimization. This caused incorrect behavior under tracing, as we would not record the slice op and thus if we encountered an input with a different shape while running the trace, we would get incorrect results. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11156 Differential Revision: D9622252 Pulled By: jamesr66a fbshipit-source-id: 822f2e8f01504e131f53bd9ef51c171c7913a7cc	2018-09-01 17:13:03 -07:00
James Reed	03c06ec93d	Traceable detach (#11038 ) Summary: This makes it so `detach` and `detach_` are traceable and also adds a pass to erase them before ONNX export Pull Request resolved: https://github.com/pytorch/pytorch/pull/11038 Differential Revision: D9588038 Pulled By: jamesr66a fbshipit-source-id: 263dd3147e24fcb0c716743f37fdb9f84c0015e7	2018-08-31 16:40:42 -07:00
Adam Paszke	780d2792c5	Warn about non-traceable behavior when tracing (#11088 ) Summary: zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11088 Differential Revision: D9585527 Pulled By: apaszke fbshipit-source-id: 29a03cb152d83b626f748fff4501ac9e139994c2	2018-08-31 14:27:00 -07:00
Adam Paszke	82aeebb3d9	Fix a bug in addmm fusion in the JIT (#11100 ) Summary: Fixes #10839. zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/11100 Differential Revision: D9585533 Pulled By: apaszke fbshipit-source-id: 19e2710c8fc113f577faf14c080d8c89afbe23c4	2018-08-31 07:24:34 -07:00
Adam Paszke	00df09b65d	Change specialization rules in GraphExecutors (#10977 ) Summary: Review last commit only. Stacked on top of #10949. This commit fixes a number of issues connected to caching differentiability status of graphs inside graph executors, and changes the rules for optimization of differentiable subgraphs. Previously every one of those was instantiated as a separate graph executor, but now they are simply heavier-optimized graph regions, and graph executors are only instantiated for their backward. zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10977 Differential Revision: D9600626 Pulled By: apaszke fbshipit-source-id: dad09a0f586e396afbd5406319c1cd54fbb8a3d3	2018-08-30 22:11:01 -07:00
Adam Paszke	f3c3127c67	Don't flatten output lists in the JIT IR (#10949 ) Summary: Operators like aten::chunk used to return a number of tensors, but now return a list. To make it easier to do shape prop through aten::chunk and fuse it, I've also introduced prim::ConstantChunk, which behaves like the previous implementation (has a variable length output list). The downside of this PR is that the introduction of more lists to the IR causes the LSTM and MiLSTM graphs to be considered as non-differentiable by the graph executor. I verified that they are still optimized correctly, and my next patch (that changes how the specializations/differentiation works) will restore those. zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10949 Reviewed By: zdevito Differential Revision: D9556823 Pulled By: apaszke fbshipit-source-id: 33e63b17fc7247cac6cfc05eb7eb9bf069b499ee	2018-08-30 19:54:39 -07:00
Zachary DeVito	93bd291e55	Change torch.jit.trace to no longer be a decorator (#11069 ) Summary: This was done because it is surprising for a decorator to run a function rather than wrap it, and it did not simplify the syntax for tracing modules. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11069 Reviewed By: jamesr66a Differential Revision: D9583192 Pulled By: zdevito fbshipit-source-id: b914b7ab4c73c255086465a6576eef3a22de1e13	2018-08-30 13:56:05 -07:00
Erik Brinkman	611a608517	Add ATen pdist CPU kernel (#10782 ) Summary: Also add single grad whitelist to the jit test Pull Request resolved: https://github.com/pytorch/pytorch/pull/10782 Reviewed By: ezyang Differential Revision: D9583378 Pulled By: erikbrinkman fbshipit-source-id: 069e5ae68ea7f3524dec39cf1d5fe9cd53941944	2018-08-30 11:55:27 -07:00
Zachary DeVito	ae635b16f7	Record tensor factory functions in trace (#10935 ) Summary: Things like torch.zeros now appear in traces rather than constants. To continue to support our current level of ONNX export, we run constant prop to turn these back into constants where possible before export. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10935 Differential Revision: D9527427 Pulled By: zdevito fbshipit-source-id: 552a8bcc01b911251dab7d7026faafdd7a3c758a	2018-08-29 17:10:24 -07:00
Adam Paszke	d9b74f6540	Make it possible to disable JIT using env variables (#10867 ) Summary: zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10867 Differential Revision: D9556882 Pulled By: apaszke fbshipit-source-id: 04c0ca875d15d37dd9ac05ac7b515cd899ddb7e4	2018-08-29 15:11:05 -07:00
James Reed	beeec47041	Sanity checks for tracing (#10841 ) Summary: TODO: integrate into torch.onnx.export -- separate PR Problem: We have a facility to trace PyTorch operations on Python code, but there are several failure modes where the trace is not representative of the actual underlying computation: * The tracer encountered dynamic control flow * Some computation escaped the tracer, and appeared as a Constant tensor node in the graph * Some stateful function was traced, e.g. someone did an optimization in Python by memoizing function outputs Objective: In an ideal world, this whole process would be automated and the user can trust that the system will magically capture the intended semantics from the program. Realistically speaking, we will likely have to settle with a human-in-the-loop error reporting system, allowing for the user to identify problems and modify the source code to allow for tracing. Stage 1 (this PR): Output-level checking & graph diff. torch.jit.trace gains a kwarg 'check_inputs', which is a list of tuples of input arguments. We will iterate through the list and trace the function again for each set of check inputs. We'll also interpret the original trace with these inputs and compare output values and graphs, printing a diff of the graph if there is a difference. Examples: ``` torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(4, 5),)]) def foo(x): y = torch.arange(0, x.shape[0]).float() return x + y.unsqueeze(1) ``` ``` torch.jit.TracingCheckError: Tracing failed sanity checks! ERROR: Graphs differed across invocations! Graph diff: graph(%0 : Dynamic) { - %1 : Dynamic = prim::Constant[value= 0 1 2 [ CPULongType{3} ]]() ? ^ + %1 : Dynamic = prim::Constant[value= 0 1 2 3 [ CPULongType{4} ]]() ? 
+++ ^ %2 : int = prim::Constant[value=0]() %3 : Dynamic = aten::_cast_Float(%1, %2) %4 : int = prim::Constant[value=1]() %5 : Dynamic = aten::unsqueeze(%3, %4) %6 : int = prim::Constant[value=1]() %7 : Dynamic = aten::add(%0, %5, %6) return (%7); } Node diff: - %1 : Dynamic = prim::Constant[value= 0 1 2 [ CPULongType{3} ]]() ? ^ + %1 : Dynamic = prim::Constant[value= 0 1 2 3 [ CPULongType{4} ]]() ? +++ ^ Trace source location: dank.py(5): foo /Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper dank.py(3): <module> Check source location: dank.py(5): foo /Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(281): check_trace /Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(408): wrapper dank.py(3): <module> ERROR: Tensor-valued Constant nodes differed in value across invocations. This often indicates that the tracer has encountered untraceable code. Node: %1 : Dynamic = prim::Constant[value= 0 1 2 [ CPULongType{3} ]]() Source Location: dank.py(5): foo /Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper dank.py(3): <module> Comparison exception: Not equal to tolerance rtol=1e-07, atol=0 (shapes (3,), (4,) mismatch) x: array([0, 1, 2]) y: array([0, 1, 2, 3]) ``` == ``` torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)]) def foo(x): y = x.data return x + y ``` ``` torch.jit.TracingCheckError: Tracing failed sanity checks! ERROR: Traced function outputs do not match the Python function outputs. ERROR: Tensor-valued Constant nodes differed in value across invocations. This often indicates that the tracer has encountered untraceable code. 
Node: %1 : Dynamic = prim::Constant[value=<Tensor>]() Source Location: dank.py(6): foo /Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper dank.py(3): <module> Comparison exception: Not equal to tolerance rtol=1e-07, atol=0 (mismatch 100.0%) x: array([0.397137, 0.956105, 0.169478, 0.560292, 0.392568, 0.108441, 0.97645 , 0.34412 , 0.951246, 0.793061, 0.557595, 0.770245], dtype=float32) y: array([0.243178, 0.315964, 0.972041, 0.0215 , 0.927751, 0.457512, 0.951092, 0.97883 , 0.048688, 0.118066, 0.779345, 0.271272], dtype=float32) ``` == ``` import torch @torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(4, 4),)]) def foo(x): for _ in range(x.size(0)): x = torch.neg(x) return x ``` ``` torch.jit.TracingCheckError: Tracing failed sanity checks! ERROR: Traced function outputs do not match the Python function outputs. ERROR: Graphs differed across invocations! Graph diff: graph(%0 : Dynamic) { %1 : Dynamic = aten::neg(%0) %2 : Dynamic = aten::neg(%1) %3 : Dynamic = aten::neg(%2) + %4 : Dynamic = aten::neg(%3) - return (%3); ? ^ + return (%4); ? ^ } ``` == ``` import torch def foo(x): if not hasattr(foo, 'cache'): foo.cache = torch.neg(x) return x + foo.cache traced = torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)])(foo) ``` ``` torch.jit.TracingCheckError: Tracing failed sanity checks! ERROR: Traced function outputs do not match the Python function outputs. ERROR: Graphs differed across invocations! 
Graph diff: graph(%0 : Dynamic) { - %1 : Dynamic = aten::neg(%0) + %1 : Dynamic = prim::Constant[value=<Tensor>]() %2 : int = prim::Constant[value=1]() %3 : Dynamic = aten::add(%0, %1, %2) return (%3); } Node diff: - %1 : Dynamic = aten::neg(%0) + %1 : Dynamic = prim::Constant[value=<Tensor>]() Trace source location: test.py(5): foo /Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper test.py(8): <module> Check source location: test.py(6): foo /Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(281): check_trace /Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(408): wrapper test.py(8): <module> ``` The following two examples show instances where program semantics are lost in the Python -> trace transformation, and repeated invocation does not give us useful debug information. Further design is underway for catching these scenarios. ``` import torch @torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)]) def foo(x): for i in range(3): x[i, :] = torch.zeros(4) return x ``` ``` torch.jit.TracingCheckError: Tracing failed sanity checks! ERROR: Traced function outputs do not match the Python function outputs. Exception: Not equal to tolerance rtol=1e-07, atol=0 (mismatch 100.0%) x: array([0.830221, 0.915481, 0.940281, 0.555241], dtype=float32) y: array([0., 0., 0., 0.], dtype=float32) ``` == ``` import torch @torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(5, 6),)]) def foo(x): x.view(-1).add_(-x.view(-1)) return x ``` ``` torch.jit.TracingCheckError: Tracing failed sanity checks! ERROR: Traced function outputs do not match the Python function outputs. 
Exception: Not equal to tolerance rtol=1e-07, atol=0 (mismatch 100.0%) x: array([0.734441, 0.445327, 0.640592, 0.30076 , 0.891674, 0.124771], dtype=float32) y: array([0., 0., 0., 0., 0., 0.], dtype=float32) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/10841 Differential Revision: D9499945 Pulled By: jamesr66a fbshipit-source-id: 1f842a32d0b0645259cc43b29700b86d99c59a45	2018-08-28 20:25:26 -07:00
Zachary DeVito	22c9bc3117	Resolve builtins using a dict rather than by name (#10927 ) Summary: Changes the approach for resolving builtin ops so that the following works ``` add = torch.add @script def foo(x): return add(x, x) ``` This handles cases when people alias torch and torch.nn.functional to shorter names. This works by building a table of id -> builtin name for the known builtin ops in torch, torch.nn.functional, and for any user-defined op created by accessing torch.ops.foo.bar This allows us to clean up many SugaredValue types in the compiler. Notes: * we now consider any attributes on python modules to be constants (e.g. math.pi, and torch.double). * fixes a bug where we incorrectly allowed attribute lookup on arbitrary python objects. It is now restricted to modules only. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10927 Differential Revision: D9527522 Pulled By: zdevito fbshipit-source-id: 0280422af08b4b0f48f302766d5a9c0deee47660	2018-08-28 11:25:11 -07:00
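The id -> builtin name table described in the commit above can be illustrated with a minimal pure-Python sketch. It is not the PyTorch implementation: the standard `math` module stands in for `torch`, and `build_builtin_table`, `BUILTIN_TABLE`, `resolve`, and the `math::` name format are hypothetical names chosen for illustration.

```python
import math


def build_builtin_table(module, prefix):
    """Map id(callable) -> qualified name for each public callable in a module."""
    table = {}
    for name in dir(module):
        if name.startswith("_"):
            continue
        obj = getattr(module, name)
        if callable(obj):
            table[id(obj)] = f"{prefix}::{name}"
    return table


# Hypothetical table; `math` is a stand-in for torch / torch.nn.functional.
BUILTIN_TABLE = build_builtin_table(math, "math")


def resolve(fn):
    """Return the builtin name for fn, or None if it is not a known builtin."""
    return BUILTIN_TABLE.get(id(fn))


root = math.sqrt  # alias under a different local name
print(resolve(root))  # lookup goes through object identity, not the name "root"
print(resolve(len))   # not in the table
```

Because the lookup keys on object identity, the alias resolves to the same entry no matter what local name it is bound to, which is the property the commit message relies on.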
Elias Ellison	58b145f515	Fix negative indices in tracer (#10560 ) Summary: Previously when tracing slicing & select negative indices would get normalized, fixing the index to the size of the traced tensor. This makes the behavior the same as script so aten::select with negative indices is emitted. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10560 Differential Revision: D9493614 Pulled By: eellison fbshipit-source-id: ce7a8bae59863723247208d86b9f2948051ccc6c	2018-08-27 15:19:41 -07:00
Zachary DeVito	6ce799edd6	Tuples/Lists can now be inputs/outputs to script and other simple fixes. (#10812 ) Summary: * Fix the necessary pathways so that tuples and lists can be inputs to the script. * prevent linear algebra functions from being run in shape prop because they frequently will error out for nonsense data. * favor schema-driven python input conversion where possible. remaining cases where we directly create Stacks without schema are only for debugging * Make the error messages when calling script/trace functions more pythonic * Simplify FlattenTuples -- now that tuples are supported we can choose to only flatten tuples when needed. This may have to be revisited pending onnx test results, but is necessary for making tuple io work. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10812 Differential Revision: D9477982 Pulled By: zdevito fbshipit-source-id: ed06fc426e6ef6deb404602a26c435a7fc40ea0c	2018-08-27 14:40:40 -07:00
Richard Zou	35beecfe17	fix xfails involving literals (#10905 ) Summary: I missed these in #10900 cc apaszke jamesr66a zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10905 Differential Revision: D9516748 Pulled By: zou3519 fbshipit-source-id: a5c3e3b65a33c339d5c4e9fc160462c3d35705f3	2018-08-27 12:41:06 -07:00
Richard Zou	67f6f930a8	Remove FIXME_zerol() from test_jit.py (#10900 ) Summary: The scalar situation has gotten a lot better and now we can remove all instances of FIXME_zerol(). cc zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10900 Differential Revision: D9514206 Pulled By: zou3519 fbshipit-source-id: e4e522f324126c5454cd6de14b832d2d1f6cb0ce	2018-08-27 08:55:08 -07:00
Adam Paszke	c8b246abf3	Prevent JIT from overspecializing to every single size configuration (#10844 ) Summary: Please review the expects carefully to make sure there are no regressions. I tried to go over them one by one when they changed, but it's sometimes easy to miss finer details. Summary of changes: - Renamed `TensorType` to `CompleteTensorType`. Added a new `TensorType` which records only the scalar type, number of dimensions, and device of a value. The argument behind the rename is to encourage people to use `CompleteTensorType` less, as most passes will only have limited information available. To make transition easier `complete_type->cast<TensorType>()` works, and makes our passes work with both kinds of specialization if they don't need the extra detail. - Renamed `ArgumentSpec` to `CompleteArgumentSpec`. Added a new `ArgumentSpec`, which matches argument only at the level of the new `TensorType`. - Shape analysis can process graphs with both `CompleteTensorType` and `TensorType`. - Fuser was a part that heavily relied on full shape information being available. Now, we simply try to fuse the largest possible graphs, and have to do run-time checks to make sure they match the code we generate. If they don't, we fall back to regular interpretation. The shape checks are implemented using an optimized method exploiting algebraic properties of shapes with broadcasting, and the relations of broadcasting with pointwise ops. A full written proof of correctness of the shape checking algorithm is included in a comment in `graph_fuser.cpp`. zdevito ezyang mruberry ngimel csarofeen Pull Request resolved: https://github.com/pytorch/pytorch/pull/10844 Differential Revision: D9498705 Pulled By: apaszke fbshipit-source-id: 0c53c2fcebd871cc2a29c260f8d012276479cc61	2018-08-26 09:54:48 -07:00
Elias Ellison	0ef5cfd28c	fix ivalue printing for lists (#10777 ) Summary: Fixing the printing of IValue lists, which didn't work previously. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10777 Differential Revision: D9474264 Pulled By: eellison fbshipit-source-id: 0c7d6e7ecaa3f7908b131ac9f1036f19ac4f8b4f	2018-08-24 16:02:03 -07:00
Elias Ellison	74e6a666b3	If none of the schema match, add ImplicitTensorToNum conversions where needed. (#10180 ) Summary: When matching schema, first try to match without adding TensorToNum conversions. Then make another pass where TensorToNum conversions are allowed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10180 Differential Revision: D9438153 Pulled By: eellison fbshipit-source-id: 80541b5abd06e9d4187e89dda751f44dab6f58c5	2018-08-24 16:02:00 -07:00
Richard Zou	ca567862b2	Support multidimensional indexing (#10787 ) Summary: Part of #10774. This PR does the following: - Support ast.ExtSlice in the frontend. This is done by returning a list of ast.Index and ast.Slice. - Support multidimensional indexing with ints and slices The general approach is to desugar multidimensional indexing into at::slice, at::select operations. This is exactly how normal pytorch does indexing (by desugaring it into at::slice, at::select, and other ops). I used [this code](https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/python_variable_indexing.cpp) as reference. We should be able to copy the rest of this to implement the missing indexing features in script (indexing with ellipses, tensors, sequences, etc). After I'm done implementing the missing indexing features in future prs, I can try to templatize python_variable_indexing.cpp so that it can work with both JIT script and normal pytorch indexing, but right now I'm not sure if that's a good idea or not. cc zdevito jamesr66a apaszke wanchaol Pull Request resolved: https://github.com/pytorch/pytorch/pull/10787 Differential Revision: D9481402 Pulled By: zou3519 fbshipit-source-id: 78c9fa42771a037d157879e23e20b87401cf1837	2018-08-24 08:10:32 -07:00
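The desugaring of multidimensional indexing into select and slice operations, as described in the commit above, can be sketched on nested Python lists. This only illustrates the idea and is not the JIT frontend code: `select`, `slice_dim`, and `index_nd` are hypothetical helpers, nested lists stand in for tensors, and slice steps are ignored.

```python
def select(x, dim, index):
    """Pick one entry along `dim`, removing that dimension."""
    if dim == 0:
        return x[index]
    return [select(row, dim - 1, index) for row in x]


def slice_dim(x, dim, start, stop):
    """Keep the range [start, stop) along `dim`, preserving that dimension."""
    if dim == 0:
        return x[start:stop]
    return [slice_dim(row, dim - 1, start, stop) for row in x]


def index_nd(x, indices):
    """Desugar x[i0, i1, ...] into a chain of select / slice_dim calls."""
    dim = 0
    for idx in indices:
        if isinstance(idx, int):
            # select removes the dimension, so `dim` does not advance
            x = select(x, dim, idx)
        else:
            # slice keeps the dimension, so move on to the next one
            x = slice_dim(x, dim, idx.start, idx.stop)
            dim += 1
    return x


m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(index_nd(m, (slice(0, 2), 1)))     # analogous to m[0:2, 1] on a tensor
print(index_nd(m, (1, slice(1, None))))  # analogous to m[1, 1:] on a tensor
```

The point the sketch tries to make visible is the bookkeeping: an integer index consumes a dimension, while a slice keeps it, so the running dimension counter only advances on slices.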
Zachary DeVito	3d43a82440	Add support for vararg style functions. (#10250 ) Summary: Things like `zeros(1,2,3, dtype=torch.int)` are now supported in the script by altering tryMatchSchema to auto-construct the list `[1,2,3]` when it sees inlined members of the list as the last positional arguments. I suggest reading the commits individually, since the first two incrementally change how we do tryMatchSchema to get it ready for adding vararg list conversion, while the third actually does the modification. closes #10632 closes #8516 Pull Request resolved: https://github.com/pytorch/pytorch/pull/10250 Differential Revision: D9478235 Pulled By: zdevito fbshipit-source-id: 0c48caf7a6184e463d9293d97015e9884758ef9c	2018-08-23 15:10:36 -07:00
Elias Ellison	5c0eece2fd	Force types on values returned from if blocks to be equivalent (#10281 ) Summary: When emitting if Branches, check that the types on each value returned are equivalent. As with reassignment of values, tensors are not forced to be the same shape or subtype. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10281 Differential Revision: D9466566 Pulled By: eellison fbshipit-source-id: 746abdeb34a0f68806b8e73726ad5003b536911c	2018-08-22 19:55:38 -07:00
Adam Paszke	f72e813c2f	Allow tracing functions that take tuples of tensors as inputs (#10637 ) Summary: And return tuples. zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10637 Reviewed By: eellison Differential Revision: D9385892 Pulled By: apaszke fbshipit-source-id: 542f4444d909fb246d7f1d88d6fb98345de2d431	2018-08-22 15:37:10 -07:00
Richard Zou	6c84f7fea0	Relax RHS type assert for augassign (#10730 ) Summary: Augassign (i.e., `x += 1`) gets desugared to an assignment of a binop (`x = x + 1`). Right now we assert that the RHS of the binop is a tensor, but it really doesn't have to be because we support scalar/scalar ops and also list-list ops (i.e., `[1, 2] + [2, 3]`). Pull Request resolved: https://github.com/pytorch/pytorch/pull/10730 Differential Revision: D9465110 Pulled By: zou3519 fbshipit-source-id: 7b118622701f09ce356aca81b8db743d9611097b	2018-08-22 15:10:33 -07:00
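The augmented-assignment desugaring described in the commit above (`x += 1` becoming `x = x + 1`) can be sketched with Python's standard `ast` module. This is an illustration on the Python AST, not the script compiler's own tree types; `DesugarAugAssign`, `desugar`, and `run` are hypothetical names, and only simple name targets are handled.

```python
import ast


class DesugarAugAssign(ast.NodeTransformer):
    """Rewrite `x op= rhs` into `x = x op rhs` for simple name targets."""

    def visit_AugAssign(self, node):
        if not isinstance(node.target, ast.Name):
            return node  # attribute/subscript targets are left untouched here
        load = ast.Name(id=node.target.id, ctx=ast.Load())
        store = ast.Name(id=node.target.id, ctx=ast.Store())
        binop = ast.BinOp(left=load, op=node.op, right=node.value)
        return ast.copy_location(ast.Assign(targets=[store], value=binop), node)


def desugar(source):
    """Parse source and return a module AST with augmented assignments expanded."""
    tree = DesugarAugAssign().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return tree


def run(source):
    """Execute the desugared program and return its globals."""
    env = {}
    exec(compile(desugar(source), "<desugared>", "exec"), env)
    return env


# The right-hand side need not be a tensor: scalars and lists work too.
print(run("x = 1\nx += 1")["x"])
print(run("x = [1, 2]\nx += [2, 3]")["x"])
```

After the rewrite no restriction on the right-hand operand remains beyond what the binary operator itself accepts, which matches the relaxation the commit describes.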
James Reed	6fcac354c5	Erase ListConstruct nodes for ONNX export (#10713 ) Summary: ONNX doesn't support this. Instead flatten the inputs to the ListConstruct op and inline it into the subsequent usage Pull Request resolved: https://github.com/pytorch/pytorch/pull/10713 Differential Revision: D9458508 Pulled By: jamesr66a fbshipit-source-id: 0b41e69320e694bb2f304c6221864a39121e4694	2018-08-22 14:39:58 -07:00
Michael Suo	9e75ec11fb	Make empty list literals construct empty Tensor[] (#10705 ) Summary: This will make the common case more natural (no need to do `_construct_empty_tensor_list()`) Pull Request resolved: https://github.com/pytorch/pytorch/pull/10705 Differential Revision: D9411622 Pulled By: michaelsuo fbshipit-source-id: 2d91fbc5787426748d6e1c8e7bbeee737544dc96	2018-08-20 18:28:28 -07:00
James Reed	585e6b581f	Allow method-style casts on tensors (#10641 ) Summary: Closes https://github.com/pytorch/pytorch/issues/10631 Pull Request resolved: https://github.com/pytorch/pytorch/pull/10641 Differential Revision: D9407598 Pulled By: jamesr66a fbshipit-source-id: a0331f4e9e55d92718cde7a1112fe8c705206b1f	2018-08-20 14:10:21 -07:00
Richard Zou	f1420adfe3	Move at::chunk into the graph fuser (#10178 ) Summary: ... to avoid slow at::chunk (it is slow due to tensor initialization). Picking up from #10026 This is done through the following: 1) Absorb starting chunks into FusionGroup as a part of the graph fuser pass. 2) When compiling a kernel, emit a `std::vector<ConcatDesc>` that describes if an input (of the original graph) will be chunked. 3) When launching a kernel, use `std::vector<ConcatDesc>` to chunk an input tensor on the CPU. This chunk directly takes in an at::Tensor and creates four TensorInfo structs in-place in the argument list, bypassing the creation of intermediate Tensors. - Expect test and correctness test to see if a single chunk is fused by the graph fuser - Correctness test for a variety of chunks (dimension = beginning, middle, end) and tensors (contiguous, non-contiguous, edge case (splitSize = 1)) for both CPU/CUDA - Expect test for multiple chunks fused into the same kernel and correctness test. cc zdevito apaszke LSTM forward pass, 1 layer, 512 hidden size and input size, 100 seq length, requires_grad=False on all inputs and weights. After changes: ``` thnn cudnn jit 8.8468 6.5797 9.3470 ``` Before changes: ``` thnn cudnn jit 9.9221 6.6539 11.2550 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/10178 Differential Revision: D9382661 Pulled By: zou3519 fbshipit-source-id: 1f8a749208fbdd45559775ce98cf4eb9558448f8	2018-08-18 16:10:11 -07:00
Richard Zou	e29b5a1ea8	graph fuser inserts explicit expands where necessary (#10325 ) Summary: Fixes #10096 If the only thing preventing a simple mappable operator from being fused into a fusion group is that its Tensor inputs are not of the same shape as the output, then the graph fuser inserts explicit expand nodes for those inputs. This helps the graph fuser not miss out on any fusion opportunities involving simple mappable operations that have Tensor inputs. This PR doesn't do anything for the scalar case; that can be addressed later. Test Plan - Simple expect test case - Added expect tests for a raw LSTMCell. The expands help speed up the forwards pass by allowing more operations to be fused into the LSTMCell's single FusionGroup. cc apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10325 Differential Revision: D9379308 Pulled By: zou3519 fbshipit-source-id: 86d2202eb97e9bb16e511667b7fe177aeaf88245	2018-08-17 16:03:46 -07:00
Richard Zou	86c9856d9c	Fuse tensor-scalar ops when scalar is constant (#10511 ) Summary: This is on the way to resolving #9940. Fixes #10501 This PR modifies graph fuser to fuse operations that have constant scalar arguments. These constant scalar arguments are directly inlined into the kernel body. The context for this is that LSTM backward (in particular, sigmoid backward) has many add(x, 1.) operations. This PR should be sufficient for LSTM backward to get fused by the graph fuser. cc apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10511 Differential Revision: D9378896 Pulled By: zou3519 fbshipit-source-id: 6a7a2987f5b6e8edaaf4b599cd200df33361650f	2018-08-17 14:10:23 -07:00
Wanchao Liang	52058204d6	Add nn functional tests in JIT (#10409 ) Summary: This PR is the first step toward integrating the torch.nn library with the JIT. It adds tests for the nn functional interfaces in trace/script mode and identifies the differences between torch.nn.functional ops and the ATen ops, to gauge the work needed to support the full set of nn functionals in script mode. Some statistics in summary: - There are 84 useful functions in total in torch.nn.functional (this number does not include helper functions and deprecated functions in torch.nn.functional). - 7 functions/ops do not support higher-order gradients, so they are excluded from the whole test. - 36 functions differ from the ATen op for various reasons. Among those 36 functions, a number of them (roughly 10-15) differ only in naming or are simple transformations using other ops inside the function. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10409 Differential Revision: D9350694 Pulled By: wanchaol fbshipit-source-id: 8fce6f30d8d25ace5a544a57b219fe61f5a092f8	2018-08-17 11:09:49 -07:00
Elias Ellison	e190505e84	Adding support for inlining if branches (#10084 ) Summary: Inlining if branches which have constant inputs. If an if node gets inlined, the set of mutated variables returned by its ancestors may have changed. In the following example the block should return a mutated set of (a) and not (a, b). ``` if cond: if True: a = a - 1 else: b = b - 1 ``` To calculate this we recursively update mutated variables in if branches from the leaf nodes up. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10084 Reviewed By: michaelsuo Differential Revision: D9340429 Pulled By: eellison fbshipit-source-id: b0dd638a5cace9fdec3130460428fca655ce4b98	2018-08-17 09:48:47 -07:00
Peter Goldsborough	c101a57a74	Build mechanism for custom operators (#10226 ) Summary: This is the last step in the custom operator implementation: providing a way to build from C++ and Python. For this I: 1. Created a `FindTorch.cmake` taken largely from ebetica with a CMake function to easily create simple custom op libraries 2. Created a `torch/op.h` header for easy inclusion of necessary headers, 3. Created a test directory `pytorch/test/custom_operator` which includes the basic setup for a custom op. 1. It defines an op in `op.{h,cpp}` 2. Registers it with the JIT using `RegisterOperators` 3. Builds it into a shared library via a `CMakeLists.txt` 4. Binds it into Python using a `setup.py`. This step makes use of our C++ extension setup that we already have. No work, yey! The pure C++ and the Python builds are separate and not coupled in any way. zdevito soumith dzhulgakov Pull Request resolved: https://github.com/pytorch/pytorch/pull/10226 Differential Revision: D9296839 Pulled By: goldsborough fbshipit-source-id: 32f74cafb6e3d86cada8dfca8136d0dfb1f197a0	2018-08-16 18:56:17 -07:00
Owen Anderson	abf85bf0ef	Perform CSE across block boundaries. (#10105 ) Summary: zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10105 Differential Revision: D9186678 Pulled By: resistor fbshipit-source-id: 87b63d4fc0c7d394edb4777acdefa8f022a8bf8d	2018-08-16 00:25:36 -07:00
James Reed	32bb4040dd	Unified type annotation parsing for script frontends (#10279 ) Summary: After this, all combinations of {String frontend, Python AST Frontend}{Python 3-style type annotations, MyPy-style type comments}{Script method, Script function} should properly accept type annotations. Possible TODOs: - Clean up the functions marked HACK - Clean up the Subscript tree-view to better match the Python AST versions - Can we use this for Python functions? That's the only place annotations.get_signature() is still needed Pull Request resolved: https://github.com/pytorch/pytorch/pull/10279 Differential Revision: D9319726 Pulled By: jamesr66a fbshipit-source-id: b13f7d4f066b0283d4fc1421a1abb9305c3b28fa	2018-08-14 18:13:15 -07:00
Richard Zou	b4462511fd	Add LSTMCell backward pass expect tests (#10506 ) Summary: - Exposed get_debug_graph for ScriptModule (gets the debug graph for its forward Method) - Added forward/backward expect tests for lstm and milstm cells. These are intended to prevent regressions cc apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10506 Differential Revision: D9316590 Pulled By: zou3519 fbshipit-source-id: 3c2510d8363e9733ccbc5c7cc015cd1d028efecf	2018-08-14 11:39:44 -07:00
Zachary DeVito	61bedc96f0	Schema-based creation of graph nodes (#10198 ) Summary: This commit adds the ability to insert a node with inputs, using the schema to check the inputs are valid types, fill in any default values, and perform standard implicit conversions. Since it is schema based, it will discover and use the right overload. Constructors to `NamedValue` enable it to be constructed using `IValue` constants so it is possible to use constant values in the input list as well: ``` g.insert(aten::add, {v, 3}); ``` Keyword arguments are also supported: ``` g.insert(aten::add, {v}, {{"other", t}, {"scalar", 1}}); ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/10198 Differential Revision: D9307252 Pulled By: zdevito fbshipit-source-id: 644620aa85047d1eae1288383a619d50fec44d9b	2018-08-14 10:25:38 -07:00
Richard Zou	fed05cf4cf	Fix prim::FusedConcat bug (#10466 ) Summary: Fixes #10456 The graph fuser was fusing together groups with prim::FusedConcat (the producer) with other ops (the consumer) if the consumer is fusable. For example, ``` import torch @torch.jit.script def fn(x, y, z): x1 = x + y y1 = x - y w = torch.cat([x1, y1]) return w + z x = torch.randn(2, 2, dtype=torch.float, device='cpu') y = torch.randn(2, 2, dtype=torch.float, device='cpu') z = torch.randn(4, 2, dtype=torch.float, device='cpu') fn(x, y, z) fn.graph_for(x, y, z) ``` produced the following graph: ``` graph(%x : Float(2, 2) %y : Float(2, 2) %z : Float(4, 2)) { %3 : int = prim::Constant[value=1]() %y1 : Float(2, 2) = aten::sub(%x, %y, %3) %8 : int = prim::Constant[value=0]() %14 : Float(4, 2) = prim::FusionGroup_0[device=-1](%z, %y1, %x, %y) return (%14); } with prim::FusionGroup_0 = graph(%1 : Float(4, 2) %5 : Float(2, 2) %7 : Float(2, 2) %8 : Float(2, 2)) { %11 : int = prim::Constant[value=1]() %9 : int = prim::Constant[value=1]() %x1 : Float(2, 2) = aten::add(%7, %8, %9) %w : Float(4, 2) = prim::FusedConcat[dim=0](%x1, %5) %2 : int = prim::Constant[value=1]() %3 : Float(4, 2) = aten::add(%w, %1, %2) return (%3); } ``` This is a problem because it violates two invariants: 1) all inputs to the FusionGroup must have the same size 2) prim::FusedConcat's output must not be used inside the FusionGroup This PR fixes this problem by checking if the output to a FusionGroup came from a prim::FusedConcat node when deciding whether to fuse the consumer and producer. If the producer is a value that came from a prim::FusedConcat node in a FusionGroup, then consumer & producer do not get fused. cc apaszke zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10466 Differential Revision: D9296686 Pulled By: zou3519 fbshipit-source-id: ed826fa9c436b42c04ca7d4d790cece804c162bd	2018-08-13 21:09:25 -07:00
iotamudelta	75651d5b58	improve use of ROCm libraries, enable more tests, small fixes (#10406 ) Summary: * some small leftovers from the last PR review * enable more unit test sets for CI * replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND) * use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2 * use strided_batched gemm interface also from the batched internal interface * re-enable Dropout.cu as we now have philox w/ rocRAND Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406 Reviewed By: Jorghi12 Differential Revision: D9277093 Pulled By: ezyang fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2	2018-08-13 11:39:43 -07:00
Roy Li	e9ad74357e	Use serialization container in ir import export (#10394 ) Summary: Copy of #10191 because these changes didn't land with the diff. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10394 Differential Revision: D9260816 Pulled By: li-roy fbshipit-source-id: 7dc16919cfab6221fda1d44e98c5b900cfb40558	2018-08-10 00:09:30 -07:00
Michael Suo	0950d7a98d	support list slicing (#10318 ) Summary: As title. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10318 Differential Revision: D9254351 Pulled By: michaelsuo fbshipit-source-id: be891a584dc295b5e353f7f5257d64a356fb9586	2018-08-09 17:25:13 -07:00
Michael Suo	b6402648f4	fix off-by-one bug in open-ended slicing (#10286 ) Summary: Previously, `tensor[i:]` was transformed to `tensor[i:-1]`. This incorrectly leaves off the last element. Noticed this when implementing slicing for list types. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10286 Differential Revision: D9193292 Pulled By: michaelsuo fbshipit-source-id: df372b815f9a3b8029830dd9e8769f9985a890e7	2018-08-07 00:39:42 -07:00
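The off-by-one described in the commit above is easy to reproduce with plain Python slicing. The sketch below uses a list as a stand-in for a tensor; `open_ended_slice_buggy` and `open_ended_slice_fixed` are hypothetical helpers that mimic the old and new lowering of `tensor[i:]`, not code from the PR.

```python
def open_ended_slice_buggy(seq, start):
    """Mimics the old lowering: seq[start:] treated as seq[start:-1]."""
    return seq[start:-1]


def open_ended_slice_fixed(seq, start):
    """Open-ended slice runs to the end of the sequence."""
    return seq[start:len(seq)]


data = [10, 20, 30, 40]
print(open_ended_slice_buggy(data, 1))  # last element is dropped
print(open_ended_slice_fixed(data, 1))  # matches data[1:]
```

An end index of -1 means "one before the end" rather than "the end", which is why the old lowering silently lost the final element.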
Michael Suo	5a7c710548	Support some basic list operations (#10225 ) Summary: Support a few basic operators: - eq - add - len - select (indexing) Pull Request resolved: https://github.com/pytorch/pytorch/pull/10225 Differential Revision: D9172338 Pulled By: michaelsuo fbshipit-source-id: 6e75ec1453b9589b0fb4698598ecdba5a5fccff9	2018-08-07 00:39:40 -07:00
iotamudelta	a38b572de3	enable unit tests and other changes (#10266 ) Summary: This PR for the ROCm target does the following: * enable some unit tests on ROCm * fix a missing static_cast that breaks BatchNorm call on ROCm * fix BatchNorm to work on ROCm w/ ROCm warp sizes etc * improve the pyhipify script by introducing kernel scope to some transpilations and other improvements * fix a linking issue on ROCm * for more unit test sets: mark currently broken tests broken (to be fixed) * enable THINLTO (phase one) to parallelize linking * address the first failing of the elementwise kernel by removing non-working ROCm specialization Pull Request resolved: https://github.com/pytorch/pytorch/pull/10266 Differential Revision: D9184178 Pulled By: ezyang fbshipit-source-id: 03bcd1fe4ca4dd3241f09634dbd42b6a4c350297	2018-08-06 14:54:01 -07:00
Peter Goldsborough	0c848f4179	Python integration for custom operators (#10149 ) Summary: Adds the Python path to custom operators, including dynamically loading operations into Python. zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10149 Reviewed By: ezyang Differential Revision: D9158380 Pulled By: goldsborough fbshipit-source-id: 3edffa639e8d2959e9e80d1bd4f20ab4a1b3ca02	2018-08-06 13:54:48 -07:00
Richard Zou	29406a2c4c	Fix shared_ptr refcycle in graph executor (#10222 ) Summary: Fixes #10032 When capturing an output, GraphExecutorAutogradFunction creates SavedVariable with is_output=False and owns it: https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/graph_executor.cpp#L87 Constructing SavedVariable with is_output=False makes it own a copy of the shared_ptr<GraphExecutorAutogradFunction>, which causes a reference cycle: `6456b944fd/torch/csrc/autograd/saved_variable.cpp (L27)` The solution in this PR is to construct the SavedVariable with is_output=True if the captured value is an output. Test Plan Turn on cuda memory checking for JitTestCase. If the test's name includes "cuda" or "gpu" in it, the cuda memory checking test happens. cc zdevito Pull Request resolved: https://github.com/pytorch/pytorch/pull/10222 Reviewed By: ezyang Differential Revision: D9162995 Pulled By: zou3519 fbshipit-source-id: aeace85a09160c7a7e79cf35f6ac61eac87cbf66	2018-08-04 11:39:10 -07:00
Wanchao Liang	50cf326158	Allow type cast between int and float in Script (#10168 ) Summary: The PR allows int→float and float→int casts. Currently we only allow `tensor→int` and `tensor→float` casts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10168 Differential Revision: D9141163 Pulled By: wanchaol fbshipit-source-id: 5e5591a98b4985a675641dfc9a385b2a0bf8e208	2018-08-03 10:56:05 -07:00
Michael Suo	13de6e8dfa	Make list literals construct ListType (#10193 ) Summary: Previously, `foo = [bar, baz]` would construct a TupleType of fixed arity. This would cause code like: ``` foo = [2] if True: foo = [2, 2] ``` to fail to compile, since `(int)` is not the same as `(int, int)`. This PR changes things so that list literals construct ListTypes, which can be resized. Potentially breaking changes introduced: - Empty list literals are now disallowed, `_constructEmptyFooList()` builtins are required to replace them. - Iterable variable unpacking where the rhs is a list is now disallowed. (Tuples still work) - Lists must have a single type. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10193 Differential Revision: D9147166 Pulled By: michaelsuo fbshipit-source-id: bbd1b97b0b6b7cb0e6f9d6aefa1ee9c731e63039	2018-08-03 00:55:23 -07:00
Roy Li	0e9c6898cb	Export modules in ir with google protobuf Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9746 Differential Revision: D9110006 Pulled By: li-roy fbshipit-source-id: 8b9744c042f822fdfe959a7a7fef3d0baff4f639	2018-08-02 15:54:51 -07:00
Elias Ellison	170d29769b	Strings lexing, parsing, implementation in print (#9324 ) Summary: This PR adds strings to the ast and implements them for print statements. Strings are lifted as attributes to the print node. They must be arguments to print itself, not as an argument for an object that is passed to print. If they are encountered elsewhere a NYI exception will be thrown. Pull Request resolved: https://github.com/pytorch/pytorch/pull/9324 Reviewed By: jramseyer Differential Revision: D8807128 Pulled By: eellison fbshipit-source-id: 984401ff458ed18d473c6d1bd86750e56c77d078	2018-08-02 11:09:03 -07:00
James Reed	9c818bfbc7	Refactor PythonValue types + use tryMatchSchema for PythonOp Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10132 Differential Revision: D9121327 Pulled By: jamesr66a fbshipit-source-id: 6d8bcf6b0dca54106cf9ed740bcff857062a03da	2018-08-02 10:26:58 -07:00
iotamudelta	cfa05706ef	ROCm contributions week 29 (#9653 ) Summary: In this changeset: * improvements to `hipify-python.py` * marking unit tests broken for ROCm * reducing the number of jobs for the build to avoid out of memory issues * switch to Thrust/cub-hip master for the CI Pull Request resolved: https://github.com/pytorch/pytorch/pull/9653 Differential Revision: D9117791 Pulled By: ezyang fbshipit-source-id: a6c3c7b81f2bda9825974bf9bf89a97767244352	2018-08-02 09:09:00 -07:00

685 Commits