Summary:
Also, make `torch.isclose` work with integral tensors and refactor `_check_trace` a bit.
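For illustration (not from the PR), the integral-tensor case now behaves like:
```
import torch

# isclose now accepts integral tensors as well as floating-point ones;
# the comparison is still elementwise against rtol/atol.
a = torch.tensor([1, 2, 3])
b = torch.tensor([1, 2, 4])
torch.isclose(a, b)  # -> [True, True, False] elementwise
```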
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11246
Differential Revision: D9652701
Pulled By: apaszke
fbshipit-source-id: fb0bdbfd1952e45e153541e4d471b423a5659f25
Summary:
there are multiple views of the tensor live.
Also adds recording for copy_, because this is the critical in-place
op where these views will cause LHS indexing to fail.
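A hypothetical sketch (not from the PR) of the LHS-indexing pattern this recording is for:
```
import torch

def fill_first_row(x, y):
    # x[0] is a view produced by select(); the assignment is a copy_
    # into that view, so recording copy_ keeps the trace faithful.
    x[0] = y
    return x
```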
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11129
Differential Revision: D9600195
Pulled By: zdevito
fbshipit-source-id: bfd8f5befa47377e36d704dbdb11023c608fe9a3
Summary:
Adds short-circuit evaluation for AND and OR. The second expression of an AND or OR gets lifted into an if branch, which is conditionally evaluated.
BatchOps was using the expression `dims = dims1 or dims2`, where dims is often an empty tensor. This now throws an error, because dims1 gets cast to a boolean, and you can't convert an empty tensor to a scalar. It now matches the behavior of PyTorch in Python.
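A minimal sketch of that failure mode (illustrative, assuming current PyTorch semantics for bool() on an empty tensor):
```
import torch

dims1 = torch.tensor([])   # empty tensor
dims2 = torch.tensor([0])
# `or` first coerces dims1 to bool; an empty tensor cannot be
# converted to a scalar, so this raises a RuntimeError, matching eager.
dims = dims1 or dims2
```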
One thing that came up: if the second expression of an and/or in Python gets returned, it does not get coerced to a boolean.
`tensor == (False or tensor)`
`tensor == (True and tensor)`
We do not currently support this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11116
Differential Revision: D9618168
Pulled By: eellison
fbshipit-source-id: 93b202be2f222d41f85d38d9c95f04d1749e8343
Summary:
* improve docker packages (install OpenBLAS to have at-compile-time LAPACK functionality w/ optimizations for both Intel and AMD CPUs)
* integrate rocFFT (i.e., enable Fourier functionality)
* fix bugs in ROCm caused by wrong warp size
* enable more test sets, skip the tests that don't work on ROCm yet
* don't disable asserts any longer in hipification
* small improvements
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10893
Differential Revision: D9615053
Pulled By: ezyang
fbshipit-source-id: 864b4d27bf089421f7dfd8065e5017f9ea2f7b3b
Summary:
Previously when we had a slicing expression like `x[0:5, 0]`, where the sliced tensor was of size `5` in dimension 0, we would skip dispatching the actual slice call as an optimization.
This caused incorrect behavior under tracing, as we would not record the slice op and thus if we encountered an input with a different shape while running the trace, we would get incorrect results.
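A minimal sketch of the problematic pattern (illustrative, not from the PR):
```
import torch

def foo(x):
    # for a 5-row example input the slice spans all of dim 0, so the old
    # optimization skipped dispatching aten::slice and the op never
    # entered the trace
    return x[0:5, 0]
```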
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11156
Differential Revision: D9622252
Pulled By: jamesr66a
fbshipit-source-id: 822f2e8f01504e131f53bd9ef51c171c7913a7cc
Summary:
This makes it so `detach` and `detach_` are traceable and also adds a pass to erase them before ONNX export
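A minimal sketch (illustrative) of a function this makes traceable:
```
import torch

def foo(x):
    # aten::detach is now recorded when this is traced; the ONNX export
    # path erases those nodes in a separate pass before emitting the model
    return x.detach() + 1
```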
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11038
Differential Revision: D9588038
Pulled By: jamesr66a
fbshipit-source-id: 263dd3147e24fcb0c716743f37fdb9f84c0015e7
Summary:
**Review last commit only.** Stacked on top of #10949.
This commit fixes a number of issues related to caching the
differentiability status of graphs inside graph executors,
and changes the rules for optimization of differentiable subgraphs.
Previously every one of those was instantiated as a separate graph
executor, but now they are simply heavier-optimized graph regions,
and graph executors are only instantiated for their backward.
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10977
Differential Revision: D9600626
Pulled By: apaszke
fbshipit-source-id: dad09a0f586e396afbd5406319c1cd54fbb8a3d3
Summary:
Operators like aten::chunk used to return a number of tensors, but
now return a list. To make it easier to do shape prop through
aten::chunk and fuse it, I've also introduced prim::ConstantChunk,
which behaves like the previous implementation (has a variable length
output list).
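For illustration, the Python-level behavior is unchanged; only the IR representation of the outputs differs:
```
import torch

x = torch.randn(6, 4)
# chunk's outputs are now modeled in the IR as a single list value
# (unpacked here) rather than a variable number of tensor outputs.
a, b, c = torch.chunk(x, 3, dim=0)
```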
The downside of this PR is that the introduction of more lists to the IR causes the LSTM and MiLSTM graphs to be considered non-differentiable by the graph executor. I verified that they are still optimized correctly, and my next patch (which changes how the specialization/differentiation works) will restore those.
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10949
Reviewed By: zdevito
Differential Revision: D9556823
Pulled By: apaszke
fbshipit-source-id: 33e63b17fc7247cac6cfc05eb7eb9bf069b499ee
Summary:
This was done because it is surprising for a decorator to run a function
rather than wrap it, and it did not simplify the syntax for tracing modules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11069
Reviewed By: jamesr66a
Differential Revision: D9583192
Pulled By: zdevito
fbshipit-source-id: b914b7ab4c73c255086465a6576eef3a22de1e13
Summary:
Also adds a single-grad whitelist to the JIT test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10782
Reviewed By: ezyang
Differential Revision: D9583378
Pulled By: erikbrinkman
fbshipit-source-id: 069e5ae68ea7f3524dec39cf1d5fe9cd53941944
Summary:
Things like torch.zeros now appear in traces rather than as constants.
To continue to support our current level of ONNX export, we run
constant prop to turn these back into constants where possible before
export.
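A minimal sketch (illustrative, not from the PR):
```
import torch

def foo(x):
    # when traced, torch.zeros is recorded as an op in the graph rather
    # than baked in as a constant; constant prop folds it back into a
    # constant before ONNX export where possible
    return x + torch.zeros(3)
```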
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10935
Differential Revision: D9527427
Pulled By: zdevito
fbshipit-source-id: 552a8bcc01b911251dab7d7026faafdd7a3c758a
Summary:
TODO: integrate into torch.onnx.export -- separate PR
*Problem:* We have a facility to trace PyTorch operations from Python code, but there are several failure modes where the trace is not representative of the actual underlying computation:
* The tracer encountered dynamic control flow
* Some computation escaped the tracer, and appeared as a Constant tensor node in the graph
* Some stateful function was traced, e.g. someone did an optimization in Python by memoizing function outputs
*Objective*: In an ideal world, this whole process would be automated and the user could trust that the system will magically capture the intended semantics from the program. Realistically speaking, we will likely have to settle for a human-in-the-loop error reporting system, allowing the user to identify problems and modify the source code to allow for tracing.
*Stage 1* (this PR): Output-level checking & graph diff. torch.jit.trace gains a kwarg 'check_inputs', which is a list of tuples of input arguments. We will iterate through the list and trace the function again for each set of check inputs. We'll also interpret the original trace with these inputs and compare output values and graphs, printing a diff of the graph if there is a difference.
Examples:
```
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(4, 5),)])
def foo(x):
    y = torch.arange(0, x.shape[0]).float()
    return x + y.unsqueeze(1)
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Graphs differed across invocations!
Graph diff:
graph(%0 : Dynamic) {
- %1 : Dynamic = prim::Constant[value= 0 1 2 [ CPULongType{3} ]]()
? ^
+ %1 : Dynamic = prim::Constant[value= 0 1 2 3 [ CPULongType{4} ]]()
? +++ ^
%2 : int = prim::Constant[value=0]()
%3 : Dynamic = aten::_cast_Float(%1, %2)
%4 : int = prim::Constant[value=1]()
%5 : Dynamic = aten::unsqueeze(%3, %4)
%6 : int = prim::Constant[value=1]()
%7 : Dynamic = aten::add(%0, %5, %6)
return (%7);
}
Node diff:
- %1 : Dynamic = prim::Constant[value= 0 1 2 [ CPULongType{3} ]]()
? ^
+ %1 : Dynamic = prim::Constant[value= 0 1 2 3 [ CPULongType{4} ]]()
? +++ ^
Trace source location:
dank.py(5): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
dank.py(3): <module>
Check source location:
dank.py(5): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(281): check_trace
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(408): wrapper
dank.py(3): <module>
ERROR: Tensor-valued Constant nodes differed in value across invocations. This often indicates that the tracer has encountered untraceable code.
Node:
%1 : Dynamic = prim::Constant[value= 0 1 2 [ CPULongType{3} ]]()
Source Location:
dank.py(5): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
dank.py(3): <module>
Comparison exception:
Not equal to tolerance rtol=1e-07, atol=0
(shapes (3,), (4,) mismatch)
x: array([0, 1, 2])
y: array([0, 1, 2, 3])
```
==
```
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)])
def foo(x):
    y = x.data
    return x + y
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
ERROR: Tensor-valued Constant nodes differed in value across invocations. This often indicates that the tracer has encountered untraceable code.
Node:
%1 : Dynamic = prim::Constant[value=<Tensor>]()
Source Location:
dank.py(6): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
dank.py(3): <module>
Comparison exception:
Not equal to tolerance rtol=1e-07, atol=0
(mismatch 100.0%)
x: array([0.397137, 0.956105, 0.169478, 0.560292, 0.392568, 0.108441,
0.97645 , 0.34412 , 0.951246, 0.793061, 0.557595, 0.770245],
dtype=float32)
y: array([0.243178, 0.315964, 0.972041, 0.0215 , 0.927751, 0.457512,
0.951092, 0.97883 , 0.048688, 0.118066, 0.779345, 0.271272],
dtype=float32)
```
==
```
import torch
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(4, 4),)])
def foo(x):
    for _ in range(x.size(0)):
        x = torch.neg(x)
    return x
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
ERROR: Graphs differed across invocations!
Graph diff:
graph(%0 : Dynamic) {
%1 : Dynamic = aten::neg(%0)
%2 : Dynamic = aten::neg(%1)
%3 : Dynamic = aten::neg(%2)
+ %4 : Dynamic = aten::neg(%3)
- return (%3);
? ^
+ return (%4);
? ^
}
```
==
```
import torch
def foo(x):
    if not hasattr(foo, 'cache'):
        foo.cache = torch.neg(x)
    return x + foo.cache

traced = torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)])(foo)
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
ERROR: Graphs differed across invocations!
Graph diff:
graph(%0 : Dynamic) {
- %1 : Dynamic = aten::neg(%0)
+ %1 : Dynamic = prim::Constant[value=<Tensor>]()
%2 : int = prim::Constant[value=1]()
%3 : Dynamic = aten::add(%0, %1, %2)
return (%3);
}
Node diff:
- %1 : Dynamic = aten::neg(%0)
+ %1 : Dynamic = prim::Constant[value=<Tensor>]()
Trace source location:
test.py(5): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(402): wrapper
test.py(8): <module>
Check source location:
test.py(6): foo
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(281): check_trace
/Users/jamesreed/onnx-fairseq/pytorch/torch/jit/__init__.py(408): wrapper
test.py(8): <module>
```
The following two examples show instances where program semantics are lost in the Python -> trace transformation, and repeated invocation does not give us useful debug information. Further design is underway for catching these scenarios.
```
import torch
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(3, 4),)])
def foo(x):
    for i in range(3):
        x[i, :] = torch.zeros(4)
    return x
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
Exception:
Not equal to tolerance rtol=1e-07, atol=0
(mismatch 100.0%)
x: array([0.830221, 0.915481, 0.940281, 0.555241], dtype=float32)
y: array([0., 0., 0., 0.], dtype=float32)
```
==
```
import torch
@torch.jit.trace(torch.rand(3, 4), check_inputs=[(torch.rand(5, 6),)])
def foo(x):
    x.view(-1).add_(-x.view(-1))
    return x
```
```
torch.jit.TracingCheckError: Tracing failed sanity checks!
ERROR: Traced function outputs do not match the Python function outputs.
Exception:
Not equal to tolerance rtol=1e-07, atol=0
(mismatch 100.0%)
x: array([0.734441, 0.445327, 0.640592, 0.30076 , 0.891674, 0.124771],
dtype=float32)
y: array([0., 0., 0., 0., 0., 0.], dtype=float32)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10841
Differential Revision: D9499945
Pulled By: jamesr66a
fbshipit-source-id: 1f842a32d0b0645259cc43b29700b86d99c59a45
Summary:
Changes the approach for resolving builtin ops so that the following works
```
add = torch.add
@script
def foo(x):
    return add(x, x)
```
This handles cases when people alias torch and torch.nn.functional to
shorter names.
This works by building a table of id -> builtin name for the known builtin
ops in torch, torch.nn.functional, and for any user-defined
op created by accessing torch.ops.foo.bar.
This allows us to clean up many SugaredValue types in the compiler.
Notes:
* we now consider any attributes on python modules to be constants
(e.g. math.pi, and torch.double).
* fixes a bug where we incorrectly allowed attribute lookup on arbitrary
Python objects. It is now restricted to modules only.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10927
Differential Revision: D9527522
Pulled By: zdevito
fbshipit-source-id: 0280422af08b4b0f48f302766d5a9c0deee47660
Summary:
Previously, when tracing slicing and select, negative indices were normalized, fixing the index to the size of the traced tensor. This change makes the tracing behavior match script, so aten::select with negative indices is emitted.
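A minimal sketch (illustrative):
```
import torch

def last_row(x):
    # when traced, this now records aten::select with index -1 rather
    # than normalizing it to x.size(0) - 1 from the example input
    return x[-1]
```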
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10560
Differential Revision: D9493614
Pulled By: eellison
fbshipit-source-id: ce7a8bae59863723247208d86b9f2948051ccc6c
Summary:
* Fix the necessary pathways so that tuples and lists can be inputs to the script.
* Prevent linear algebra functions from being run in shape prop, because
they frequently error out on nonsense data.
* Favor schema-driven Python input conversion where possible. The
remaining cases where we directly create Stacks without schema are
only for debugging.
* Make the error messages when calling script/trace functions more Pythonic.
* Simplify FlattenTuples -- now that tuples are supported, we can choose to only flatten tuples when needed. This may have to be revisited pending ONNX test results, but is necessary for making tuple I/O work.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10812
Differential Revision: D9477982
Pulled By: zdevito
fbshipit-source-id: ed06fc426e6ef6deb404602a26c435a7fc40ea0c
Summary:
The scalar situation has gotten a lot better and now we can
remove all instances of FIXME_zerol().
cc zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10900
Differential Revision: D9514206
Pulled By: zou3519
fbshipit-source-id: e4e522f324126c5454cd6de14b832d2d1f6cb0ce
Summary:
Please review the expects carefully to make sure there are no regressions. I tried to go over them one by one when they changed, but it's sometimes easy to miss finer details.
Summary of changes:
- Renamed `TensorType` to `CompleteTensorType`. Added a new `TensorType` which records only the scalar type, number of dimensions, and device of a value. The motivation behind the rename is to encourage people to use `CompleteTensorType` less, as most passes will only have limited information available. To make the transition easier, `complete_type->cast<TensorType>()` works, and makes our passes work with both kinds of specialization if they don't need the extra detail.
- Renamed `ArgumentSpec` to `CompleteArgumentSpec`. Added a new `ArgumentSpec`, which matches argument only at the level of the new `TensorType`.
- Shape analysis can process graphs with both `CompleteTensorType` and `TensorType`.
- The fuser was a part that heavily relied on full shape information being available. Now, we simply try to fuse the largest possible graphs, and have to do run-time checks to make sure they match the code we generate. If they don't, we fall back to regular interpretation. The shape checks are implemented using an optimized method exploiting algebraic properties of shapes with broadcasting, and the relations of broadcasting with pointwise ops. A full written proof of correctness of the shape-checking algorithm is included in a comment in `graph_fuser.cpp`.
zdevito ezyang mruberry ngimel csarofeen
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10844
Differential Revision: D9498705
Pulled By: apaszke
fbshipit-source-id: 0c53c2fcebd871cc2a29c260f8d012276479cc61
Summary:
When matching schema, first try to match without adding TensorToNum conversions. Then make another pass where TensorToNum conversions are allowed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10180
Differential Revision: D9438153
Pulled By: eellison
fbshipit-source-id: 80541b5abd06e9d4187e89dda751f44dab6f58c5
Summary:
Part of #10774.
This PR does the following:
- Support ast.ExtSlice in the frontend. This is done by returning a
list of ast.Index and ast.Slice.
- Support multidimensional indexing with ints and slices
The general approach is to desugar multidimensional indexing into
at::slice, at::select operations. This is exactly how normal pytorch
does indexing (by desugaring it into at::slice, at::select, and other ops).
I used [this code](https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/python_variable_indexing.cpp) as reference.
We should be able to copy the rest of this to implement the missing
indexing features in script (indexing with ellipses, tensors, sequences, etc).
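A minimal sketch of the indexing now supported in script:
```
import torch

@torch.jit.script
def foo(x):
    # desugared to at::slice along dim 0 followed by at::select,
    # mirroring how eager-mode indexing is implemented
    return x[1:3, 0]
```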
After I'm done implementing the missing indexing features in future prs, I can try to
templatize python_variable_indexing.cpp so that it can work with both JIT
script and normal pytorch indexing, but right now I'm not sure if that's
a good idea or not.
cc zdevito jamesr66a apaszke wanchaol
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10787
Differential Revision: D9481402
Pulled By: zou3519
fbshipit-source-id: 78c9fa42771a037d157879e23e20b87401cf1837
Summary:
Things like `zeros(1,2,3, dtype=torch.int)` are now supported in the script by altering tryMatchSchema to auto-construct the list `[1,2,3]` when it sees inlined members of the list as the last positional arguments.
I suggest reading the commits individually, since the first two incrementally change how we do tryMatchSchema to get it ready for adding vararg list conversion, while the third actually does the modification.
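A minimal sketch of what now compiles (illustrative):
```
import torch

@torch.jit.script
def foo():
    # the trailing positional ints are auto-packed into the size list
    # [1, 2, 3] when matched against the zeros(int[] size, ...) schema
    return torch.zeros(1, 2, 3, dtype=torch.int)
```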
closes #10632, closes #8516
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10250
Differential Revision: D9478235
Pulled By: zdevito
fbshipit-source-id: 0c48caf7a6184e463d9293d97015e9884758ef9c
Summary:
When emitting if branches, check that the types of each value returned are equivalent. As with reassignment of values, tensors are not forced to be the same shape or subtype.
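A hedged sketch (assuming current script syntax) of code this check rejects:
```
import torch

@torch.jit.script
def foo(x, cond):
    if bool(cond):
        y = x    # Tensor in this branch
    else:
        y = 1    # int in this branch -- now reported as a type mismatch
    return y
```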
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10281
Differential Revision: D9466566
Pulled By: eellison
fbshipit-source-id: 746abdeb34a0f68806b8e73726ad5003b536911c
Summary:
Augassign (i.e., `x += 1`) gets desugared to an assignment of a binop (`x = x + 1`).
Right now we assert that the RHS of the binop is a tensor,
but it really doesn't have to be because we support scalar/scalar ops and also
list-list ops (i.e., `[1, 2] + [2, 3]`).
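A minimal sketch (illustrative):
```
import torch

@torch.jit.script
def foo():
    x = 1
    x += 2          # scalar/scalar: desugars to x = x + 2
    xs = [1, 2]
    xs += [3, 4]    # list/list: desugars to xs = xs + [3, 4]
    return x, xs
```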
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10730
Differential Revision: D9465110
Pulled By: zou3519
fbshipit-source-id: 7b118622701f09ce356aca81b8db743d9611097b
Summary:
ONNX doesn't support this. Instead, we flatten the inputs to the ListConstruct op and inline them into the subsequent usage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10713
Differential Revision: D9458508
Pulled By: jamesr66a
fbshipit-source-id: 0b41e69320e694bb2f304c6221864a39121e4694
Summary:
This will make the common case more natural (no need to do `_construct_empty_tensor_list()`)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10705
Differential Revision: D9411622
Pulled By: michaelsuo
fbshipit-source-id: 2d91fbc5787426748d6e1c8e7bbeee737544dc96
Summary:
... to avoid slow at::chunk (it is slow due to tensor initialization). Picking up from #10026
This is done through the following:
1) Absorb starting chunks into FusionGroup as a part of the graph fuser
pass.
2) When compiling a kernel, emit a `std::vector<ConcatDesc>` that describes if an input (of the original graph) will be chunked.
3) When launching a kernel, use `std::vector<ConcatDesc>` to chunk an
input tensor on the CPU. This chunk directly takes in an at::Tensor and creates
four TensorInfo structs in-place in the argument list, bypassing the creation of intermediate Tensors.
- Expect test and correctness test to see if a single chunk is fused
by the graph fuser
- Correctness test for a variety of chunks (dimension = beginning,
middle, end) and tensors (contiguous, non-contiguous, edge case
(splitSize = 1) for both CPU/CUDA
- Expect test for multiple chunks fused into the same kernel and
correctness test.
cc zdevito apaszke
LSTM forward pass, 1 layer, 512 hidden size and input size, 100 seq length, requires_grad=False on all inputs and weights.
After changes:
```
thnn cudnn jit
8.8468 6.5797 9.3470
```
Before changes:
```
thnn cudnn jit
9.9221 6.6539 11.2550
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10178
Differential Revision: D9382661
Pulled By: zou3519
fbshipit-source-id: 1f8a749208fbdd45559775ce98cf4eb9558448f8
Summary:
Fixes #10096
If the only thing preventing a simple mappable operator from being fused
into a fusion group is that its Tensor inputs are not of the same shape as the
output, then the graph fuser inserts explicit expand nodes for those
inputs.
This helps the graph fuser not miss out on any fusion opportunities
involving simple mappable operations that have Tensor inputs. This PR
doesn't do anything for the scalar case; that can be addressed later.
Test Plan
- Simple expect test case
- Added expect tests for a raw LSTMCell. The expands help speed up the
forwards pass by allowing more operations to be fused into the LSTMCell's single
FusionGroup.
cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10325
Differential Revision: D9379308
Pulled By: zou3519
fbshipit-source-id: 86d2202eb97e9bb16e511667b7fe177aeaf88245
Summary:
This is on the way to resolving #9940.
Fixes #10501
This PR modifies graph fuser to fuse operations that have constant
scalar arguments. These constant scalar arguments are directly inlined
into the kernel body.
The context for this is that LSTM backward (in particular, sigmoid
backward) has many add(x, 1.) operations. This PR should be sufficient for
LSTM backward to get fused by the graph fuser.
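A hedged sketch of the kind of pointwise expression this enables fusing (the function is illustrative, not from the PR):
```
import torch

@torch.jit.script
def sigmoid_backward_ish(grad, out):
    # the constant scalar 1. is inlined directly into the generated
    # kernel instead of blocking fusion
    return grad * out * (1. - out)
```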
cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10511
Differential Revision: D9378896
Pulled By: zou3519
fbshipit-source-id: 6a7a2987f5b6e8edaaf4b599cd200df33361650f
Summary:
The PR is the first step to integrate the torch.nn library with the JIT. It adds tests for the nn functional interfaces in trace/script mode, and tries to find the differences between torch.nn.functional ops and the ATen ops, to see the work that needs to be done in order to support the full set of nn functionals in script mode.
Some statistics in summary:
- 84 useful functions in torch.nn.functional in total (the number does not include helper funcs and deprecated funcs in torch.nn.functional).
- 7 functions/ops do not support higher-order gradients, so they are excluded from the whole test.
- 36 functions differ from the corresponding ATen op for various reasons. Among those 36 functions, roughly 10-15 are just naming differences or simple transformations using other ops inside the function.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10409
Differential Revision: D9350694
Pulled By: wanchaol
fbshipit-source-id: 8fce6f30d8d25ace5a544a57b219fe61f5a092f8
Summary:
Inlines if branches that have constant conditions. When an if node gets inlined, the set of mutated variables returned by its ancestors may change. In the following example the block should
return a mutated set of (a) and not (a, b).
```
if cond:
if True:
a = a - 1
else:
b = b - 1
```
To calculate this, we recursively update mutated variables in if branches from the leaf nodes up.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10084
Reviewed By: michaelsuo
Differential Revision: D9340429
Pulled By: eellison
fbshipit-source-id: b0dd638a5cace9fdec3130460428fca655ce4b98
Summary:
This is the last step in the custom operator implementation: providing a way to build from C++ and Python. For this I:
1. Created a `FindTorch.cmake` taken largely from ebetica with a CMake function to easily create simple custom op libraries
2. Created a `torch/op.h` header for easy inclusion of necessary headers,
3. Created a test directory `pytorch/test/custom_operator` which includes the basic setup for a custom op.
1. It defines an op in `op.{h,cpp}`
2. Registers it with the JIT using `RegisterOperators`
3. Builds it into a shared library via a `CMakeLists.txt`
4. Binds it into Python using a `setup.py`. This step makes use of our C++ extension setup that we already have. No work, yay!
The pure C++ and the Python builds are separate and not coupled in any way.
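A hypothetical sketch of the Python side, using the torch.ops.load_library API from current PyTorch; the library path and op namespace below are made up:
```
import torch

# load the shared library produced by the CMakeLists.txt build
torch.ops.load_library("build/libcustom_op.so")
# call the registered op through the torch.ops namespace
out = torch.ops.custom.my_op(torch.rand(3))
```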
zdevito soumith dzhulgakov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10226
Differential Revision: D9296839
Pulled By: goldsborough
fbshipit-source-id: 32f74cafb6e3d86cada8dfca8136d0dfb1f197a0
Summary:
After this, all combinations of {String frontend, Python AST Frontend}{Python 3-style type annotations, MyPy-style type comments}{Script method, Script function} should properly accept type annotations.
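A minimal sketch of the two annotation styles (illustrative):
```
import torch
from typing import List

@torch.jit.script
def foo(x: torch.Tensor, ys: List[int]) -> torch.Tensor:  # Python 3-style
    return x + ys[0]

@torch.jit.script
def bar(x, y):
    # type: (torch.Tensor, int) -> torch.Tensor
    return x + y
```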
Possible TODOs:
- Clean up the functions marked HACK
- Clean up the Subscript tree-view to better match the Python AST versions
- Can we use this for Python functions? That's the only place annotations.get_signature() is still needed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10279
Differential Revision: D9319726
Pulled By: jamesr66a
fbshipit-source-id: b13f7d4f066b0283d4fc1421a1abb9305c3b28fa
Summary:
- Exposed get_debug_graph for ScriptModule (gets the debug graph for its
forward Method)
- Added forward/backward expect tests for lstm and milstm cells. These
are intended to prevent regressions
cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10506
Differential Revision: D9316590
Pulled By: zou3519
fbshipit-source-id: 3c2510d8363e9733ccbc5c7cc015cd1d028efecf
Summary:
This commit adds the ability to insert a node with inputs, using the schema to check that the inputs have valid types, fill in any default values, and perform standard implicit conversions. Since it is schema-based, it will discover and use the right overload.
Constructors for `NamedValue` enable it to be constructed from `IValue` constants, so it is possible to use constant values in the input list as well:
```
g.insert(aten::add, {v, 3});
```
Keyword arguments are also supported:
```
g.insert(aten::add, {v}, {{"other", t}, {"scalar", 1}});
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10198
Differential Revision: D9307252
Pulled By: zdevito
fbshipit-source-id: 644620aa85047d1eae1288383a619d50fec44d9b
Summary:
Fixes#10456
The graph fuser was fusing groups containing prim::FusedConcat (the producer) with other ops (the consumer) if the consumer was fusable. For example,
```
import torch
@torch.jit.script
def fn(x, y, z):
    x1 = x + y
    y1 = x - y
    w = torch.cat([x1, y1])
    return w + z

x = torch.randn(2, 2, dtype=torch.float, device='cpu')
y = torch.randn(2, 2, dtype=torch.float, device='cpu')
z = torch.randn(4, 2, dtype=torch.float, device='cpu')
fn(x, y, z)
fn.graph_for(x, y, z)
```
produced the following graph:
```
graph(%x : Float(2, 2)
      %y : Float(2, 2)
      %z : Float(4, 2)) {
  %3 : int = prim::Constant[value=1]()
  %y1 : Float(2, 2) = aten::sub(%x, %y, %3)
  %8 : int = prim::Constant[value=0]()
  %14 : Float(4, 2) = prim::FusionGroup_0[device=-1](%z, %y1, %x, %y)
  return (%14);
}
with prim::FusionGroup_0 = graph(%1 : Float(4, 2)
                                 %5 : Float(2, 2)
                                 %7 : Float(2, 2)
                                 %8 : Float(2, 2)) {
  %11 : int = prim::Constant[value=1]()
  %9 : int = prim::Constant[value=1]()
  %x1 : Float(2, 2) = aten::add(%7, %8, %9)
  %w : Float(4, 2) = prim::FusedConcat[dim=0](%x1, %5)
  %2 : int = prim::Constant[value=1]()
  %3 : Float(4, 2) = aten::add(%w, %1, %2)
  return (%3);
}
```
this is a problem because it violates two invariants:
1) all inputs to the FusionGroup must have the same size
2) prim::FusedConcat's output must not be used inside the FusionGroup
This PR fixes this problem by checking whether the output of a FusionGroup came from a prim::FusedConcat node when deciding whether to fuse the consumer and producer.
If the producer is a value that came from a prim::FusedConcat node in a FusionGroup, then the consumer and producer do not get fused.
cc apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10466
Differential Revision: D9296686
Pulled By: zou3519
fbshipit-source-id: ed826fa9c436b42c04ca7d4d790cece804c162bd
Summary:
* some small leftovers from the last PR review
* enable more unit test sets for CI
* replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND)
* use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2
* use strided_batched gemm interface also from the batched internal interface
* re-enable Dropout.cu as we now have philox w/ rocRAND
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406
Reviewed By: Jorghi12
Differential Revision: D9277093
Pulled By: ezyang
fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2
Summary:
Copy of #10191 because these changes didn't land with the diff.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10394
Differential Revision: D9260816
Pulled By: li-roy
fbshipit-source-id: 7dc16919cfab6221fda1d44e98c5b900cfb40558
Summary:
Previously, `tensor[i:]` was transformed to `tensor[i:-1]`. This incorrectly leaves off the last element. Noticed this when implementing slicing for list types.
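A minimal illustration of the off-by-one:
```
import torch

x = torch.arange(5)
x[2:]    # tensor([2, 3, 4]) -- correct
x[2:-1]  # tensor([2, 3])    -- what the old desugaring produced
```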
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10286
Differential Revision: D9193292
Pulled By: michaelsuo
fbshipit-source-id: df372b815f9a3b8029830dd9e8769f9985a890e7
Summary:
This PR for the ROCm target does the following:
* enable some unit tests on ROCm
* fix a missing static_cast that breaks BatchNorm call on ROCm
* fix BatchNorm to work on ROCm w/ ROCm warp sizes etc
* improve the pyhipify script by introducing kernel scope to some transpilations and other improvements
* fix a linking issue on ROCm
* for more unit test sets: mark currently broken tests broken (to be fixed)
* enable THINLTO (phase one) to parallelize linking
* address the first failing of the elementwise kernel by removing non-working ROCm specialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10266
Differential Revision: D9184178
Pulled By: ezyang
fbshipit-source-id: 03bcd1fe4ca4dd3241f09634dbd42b6a4c350297