* Allow `__constant__` values in a ScriptModule to be used as attributes for builtin functions
* Fix bugs in @script loops
1. While loops run shape propagation multiple times until the shapes have converged.
There were two bugs here. (a) First, the 'changed' condition was not checking whether it actually
changed the output; instead it would mark changed = true whenever the two inputs were different.
This is incorrect because the output of the block and the input of the block may always have different shapes.
Now it actually checks whether it is about to change the output entry that it is writing to.
(b) Expand nodes were being inserted into the graph even inside the while loop body. However, if
we iteratively discover that the input shape to one of these expands is actually dynamic, then
it was incorrect to insert the expand in the first place. This changes it so that we only insert expands
after we have converged on the shapes.
2. The way deleteExtraInputs removed loop-carried dependencies was unsafe because it would look up
Value* elements in the loop body's environment that were previously invalidated when deleteExtraInputs
removed another input to the loop. This changes the way deleteExtraInputs works so that it never has to
read a value out of the loop body's environment, avoiding the invalidated pointers.
* Workaround in onnx to get transposes into init_nets
This adds a pass to ONNX that speculates Transpose
operators so that ONNX's split pass can put them into an init_net.
Also fixes a potential bug in onnx peephole where an optimization
across blocks might move a Value and violate scoping.
* Perform shape propagation when embedding a program into a trace.
This ensures the trace still has type information specific to that trace, which will help onnx export succeed in more cases.
The long-term fix is to remove the handle-creating pathways and
remove all the modes from PythonOp, making it into an op that simply
calls a PyObject. Right now ONNX expects PythonOp to hold a
nn.Function, not a generic callable, so completely removing the legacy
pathway will also require changes to how ONNX symbolics are found.
* [jit][script] Fix a bug combining sized/unsized tensors
This adds an isSubtypeOf method to reflect that sized tensors are a subtype
of Dynamic tensors. It updates the typechecking code to reflect this
relationship.
* Add index_select to shape prop
* Support list and tuple literals: adds support for `[a, b]`, `(a, b)`, and `a,`
* Allow non-tensors to reach emitBuiltinCall; each SugaredValue::call
is now responsible for checking the types of its inputs.
Add support for calling cat with a tuple to emitBuiltinOp
Previously we would see errors like:
variable 'states' previously has type (Tensor, Tensor, Tensor, Tensor, Tensor, Tensor) but is now being assigned to a value of type (Tensor, Tensor, Tensor, Tensor, Tensor, Tensor):
since the default case in the diagnostic printout was "Tensor". This adds a virtual member function to each Type class that returns a human-readable string for better error reporting
* Improve error reporting for tuple type mismatch
* Add better Tensor printout
* Fix performance regression on simple cases of indexing
Dispatches to the old kernels
* Adapt JIT test
The test was expected to fail, but due to the change in the previous diff, it would now dispatch to index_select, which succeeds. I modified the function to go through the advanced indexing codepath
* Only do checks once, properly AutoNoGil, AutoGPU.
We allow variables defined inside of if statements to be defined after
if statements as long as they will be defined unconditionally. This
supports a larger subset of python programs than we supported before.
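A small illustration of the rule (an illustrative sketch, not from the test suite):
```
import torch

@torch.jit.script
def clamp_or_zero(x):
    if bool(x.sum() > 0):
        y = x
    else:
        y = torch.zeros_like(x)
    # `y` is defined on both branches, so it may be used after the if.
    return y + 1
```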
* Eliminate handle_zero_dim when broadcasting is applied earlier.
This ends up not actually doing anything unless all the broadcasted tensors are scalars,
in which case the behavior was inconsistent because the type promotion rules are different.
This is better solved with real type promotion logic.
* Change type of script comparison to long.
* Fix jit tests.
* Fix cpp jit test by being consistent about long-vs-float.
* Consistent float and long.
* Use int64_t rather than long.
This adds the ability to trace script functions while preserving their
control flow. When the trace encounters a script function it inlines
the graph of the function into the trace rather than tracing the
function itself.
This modifies the registration process so that all script methods
in a ScriptModule are defined at once.
Method gains a `method_creator` callback that gets invoked when the
method is first called to define it if it has not already been defined.
Recursive cycles in this `method_creator` are checked.
This approach was chosen over first creating all the graphs and then
inlining the call sites because it will combine better with type
propagation for non-tensor types like tuples. e.g.
```
a = foo(b)
return bar(*a)
```
* Switch JIT passes to take a graph rather than TracingState
* Add pybind11 binding for ONNX pass from graph
* Fix canonicalize pass
* address comment
* Switch ToONNX to explicitly return new graph
* optimize_graph instead of optimize_trace
* Allow tuples to be re-assigned
This commit improves our support of tuples by making them more first-class.
In particular, it allows tuples to be re-assigned across loops and ifs.
It does this by making them first-class values in the Graph IR, and then
removing the tuples in a LowerTuples pass.
An alternative approach would have added more support for desugaring tuples
in the Environment object as they were emitted. Instead,
the current approach was chosen anticipating a future when tuples are
fully supported (including the interpreter). In that future, the current
code can be completely reused, with the LowerTuples pass just becoming
an optimization that removes unneeded tuple allocations.
* Fixes to the way script handles multiple values, and other minor fixes.
This commit improves our handling of operators that return multiple values.
Builtins are now checked so that they return the right number of values,
and support for TupleValue is extended to all things that can return
multiple values.
This resolves issues where the compiler accepted things like:
a, b = c + c
This would cause the interpreter to crash. Now each operator knows
how many results it will produce and can check it against the number
of requested outputs.
Notes:
* Allow True/False literals in constant expressions
* make handling of keyword constants more consistent to support True/False
* make parsing constants match the way we construct constants from python
* improve the error messages when accessing bad graph attributes.
* switch findTensorOp to return an optional.
* check that attribute types are correct in findTensorOp
* Check the correct number of outputs for builtins
This also changes emitExpr to return a single SugaredValue
Rather than possibly returning multiple values, emitExpr now
always returns a single value, which _might_ be a tuple. This approach
more closely follows python making the code easier to follow.
Checks for returning the right number of values are now located in
the assignment operator, and occur when unpacking the tuple.
We still pass `n_binders` to function calls so that calls into python
know how many values they should return.
* Unit test for pack_padded tracing
* Move monkeypatching stuff
* Switch symbolic
* Fix stack traces and update test
* Fixup and confirm e2e working
* lint
* Move monkeypatch back to onnx
* Address comments
* remove extraneous import
* Add gradient checking
* lint
* Address comments
* improve test case
* Something that works
* Tuple sugared value
* Works with commenting out input size check
* support string frontend
* Initial starred assignment
* Fix parser
* Fixup tests
* clang-format
* fix rebase error
* lint
* move star assign test to string frontend to make py2 happy
* Py2 fix: parse starargs from Call node
* Address some comments
* Fixup merge
* Remove overloaded unary operators
* Bugfix and test case
* Address a few more comments
* asValues -> asTuple
* Remove unrolledFor stuff
* Fixup getValues
* Pass CallsiteDescriptor struct and have different behavior for different call types
* Address comments and lint
* some type checks
* Address comments
* lint
* Fix mistake
Like `__slots__`, the `__constants__` property changes the set/getattr behavior of a script module for the keys listed so that they behave as constants.
This enables script methods to use them in ways that are otherwise not allowed.
* Python numbers/bools can be inlined as constants in script code.
* Lists of numbers can be iterated over using for loops
* nn.ModuleLists can be used in for loops as well, unrolling their content.
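A sketch of how these features combine (the module here is a made-up example following the description above):
```
import torch
import torch.nn as nn

class MLP(torch.jit.ScriptModule):
    __constants__ = ['scale', 'layers']   # looked up as constants inside script methods

    def __init__(self):
        super(MLP, self).__init__()
        self.scale = 0.5                  # a plain Python number, inlined as a constant
        self.layers = nn.ModuleList([nn.Linear(10, 10) for _ in range(3)])

    @torch.jit.script_method
    def forward(self, x):
        for layer in self.layers:         # loop over the ModuleList is unrolled
            x = layer(x)
        return x * self.scale
```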
Script functions can now have no return statements, empty
return statements, or return one or more values.
Additionally fix the lexer to always emit TK_NEWLINE before
TK_DEDENT, which simplifies the parser.
Allows you to export an ONNX model as:
- Protobuf file (this is what we have now)
- Uncompressed zip archive
- Compressed zip archive
- Directory
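A hedged sketch of selecting one of these containers; the `export_type` keyword and `torch.onnx.ExportTypes` names are assumptions based on this entry rather than a documented guarantee:
```
import torch
import torch.nn as nn
import torch.onnx

model = nn.Linear(4, 2)
dummy = torch.randn(1, 4)

# Default: a single protobuf file.
torch.onnx.export(model, (dummy,), "linear.onnx")

# Assumed: pick another container format via export_type.
torch.onnx._export(model, (dummy,), "linear_dir",
                   export_type=torch.onnx.ExportTypes.DIRECTORY)
```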
* Experimental support for different ONNX export types
* Remove a copy
* Add comment
* Add test cases
* lint
* fix bug
* address comments
Before, using an unknown binary operator like `@`:
```
import torch

@torch.jit.script
def mm(x, y):
    return x @ y

x = torch.randn(4, 3)
y = torch.randn(3, 2)
mm(x, y)
```
resulted in [this not-so-readable trace](https://gist.github.com/zou3519/052b8998108c4bc0fe0e7c85c6f5758e).
Now, it tells the user that the problem is an unknown binary operator:
```
NotSupportedError: unsupported binary operator: MatMult
@torch.jit.script
def mm(x, y):
    return x @ y
           ~~~ <--- HERE
```
* allow calls to non-script methods, allow python non-script attributes in methods (see the sketch after this list)
* add test to make sure submodules are not reassigned
* Test that we can change python attributes
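A sketch of what this enables (a made-up module; newer TorchScript releases may require extra annotations for the same pattern):
```
import torch

class MyModule(torch.jit.ScriptModule):
    def __init__(self):
        super(MyModule, self).__init__()
        self.offset = 1.0                # plain (non-script) Python attribute

    def python_helper(self, x):
        # ordinary Python method, not compiled as script
        return x + self.offset

    @torch.jit.script_method
    def forward(self, x):
        # script method calling back into the non-script method above
        return self.python_helper(x) * 2
```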
* Deprecate ctx.saved_variables via python warning.
Advises replacing saved_variables with saved_tensors.
Also replaces all instances of ctx.saved_variables with ctx.saved_tensors in the
codebase.
Test by running:
```
import torch
from torch.autograd import Function

class MyFunction(Function):
    @staticmethod
    def forward(ctx, tensor1, tensor2):
        ctx.save_for_backward(tensor1, tensor2)
        return tensor1 + tensor2

    @staticmethod
    def backward(ctx, grad_output):
        var1, var2 = ctx.saved_variables
        return (grad_output, grad_output)

x = torch.randn((3, 3), requires_grad=True)
y = torch.randn((3, 3), requires_grad=True)
MyFunction.apply(x, y).sum().backward()
```
and assert the warning shows up.
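For comparison, the same Function written against the preferred attribute; no warning is emitted:
```
import torch
from torch.autograd import Function

class MyFunction(Function):
    @staticmethod
    def forward(ctx, tensor1, tensor2):
        ctx.save_for_backward(tensor1, tensor2)
        return tensor1 + tensor2

    @staticmethod
    def backward(ctx, grad_output):
        var1, var2 = ctx.saved_tensors   # preferred spelling
        return grad_output, grad_output

x = torch.randn(3, 3, requires_grad=True)
y = torch.randn(3, 3, requires_grad=True)
MyFunction.apply(x, y).sum().backward()
```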
* Address comments
* Add deprecation test for saved_variables
* Implement range for loop in script (see the sketch after this list)
* Fix handling of boolean constants
* Use WithInsertPoint
* Allow dynamic max trip count
* fix symbols
* Fix argument order
* fix test
* Add insert{Input,Output} APIs and use them
* Factor out condition stuff
* clang-format
* Address remaining comments
* Fix tests
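A minimal example of the range-based loop (illustrative; the dynamic trip count comes from a runtime tensor size):
```
import torch

@torch.jit.script
def row_sums(x):
    total = torch.zeros([x.size(1)])
    for i in range(x.size(0)):        # trip count may depend on the input shape
        total = total + x[i]
    return total

print(row_sums(torch.ones(4, 3)))     # tensor([4., 4., 4.])
```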
* Implement script in AST frontend
* Have ScriptModule inherit from Module
This is accomplished by creating replacement _parameters, _buffers,
and _modules which implement the OrderedDict APIs but which
actually get/set their members inside script::Module
* Merge TracedModule with ScriptModule
* Move logic of attribute handling into Python bindings rather than
making script::Module handle it. This was redundant with nn.Module,
which already handles attributes.
* Make TracedModule a subclass of ScriptModule
* Move handling of attribute kind logic into bindings.
* Allow ScriptModule to contain non-script module submodules.
* Namespaced symbols
- Our interned strings now have structure, "ns::symname" rather than just
"symname" before. We support efficient namespace testing for uniques
by encoding the namespace in one byte in the Symbol internal representation.
See torch/csrc/jit/interned_strings.h for a more in-depth implementation
discussion.
- All uses of the old ksymbol-style constants are now attr::symbol (or some appropriate namespace).
The valid namespaces are prim, attr, onnx and aten.
- Symbol is bound in Python as a qualified string "attr::symbol", EXCEPT for the
attribute setting/getting API, whose symbols must always be attr
symbols; they get special cased to assume strings are passed.
There's a little bit of naughtiness in the implementation, maybe you know
how to solve it.
- However, the g.op() convenience function assumes that you're generating
ONNX operators, unless you explicitly qualify.
- All ATen operators and nodes have built-in interned strings generated
for them, so you should never have to write a string literal ever again.
The tracing code is adjusted to use it.
- ONNX exporter now properly tests to see that all operators are in
onnx namespace before accepting the export. This is way more
robust than the previous exporter, which would be willing to
export capitalized operators which were not actually ONNX operators.
- A slight organizational change for symbolic.py; this module now ONLY
contains aten operators. In particular, the exporter for Constant
has moved into utils.py (along with Undefined, from the C++ side),
since primitive ops get "special treatment."
- The un-inplacing logic in recording is more robust, so that we don't
delete a trailing underscore from __and__. This never affected us
before because we didn't have any tests for it.
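The qualified form is visible when inspecting a graph from Python; a small illustration (assuming a build where script graphs are inspectable, as they are today):
```
import torch

@torch.jit.script
def add(x, y):
    return x + y

# Node kinds are namespace-qualified strings such as "prim::Constant"
# or "aten::add" rather than bare names.
for node in add.graph.nodes():
    print(node.kind())
```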
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Add script::Module C++ class to represent script modules
Switch AST -> IR conversion to work on Modules/Methods rather than raw graphs.
Function-only AST -> IR conversion is just a simplified case where there is
only one module with a single method and no parameters.
Introduce SugaredValue in compiler.h to represent values in scope in a script
function that are not first-class and that get desugared. This is used to
represent the module's self parameter, as well as python function calls
and method calls on tensors.
Provide a Python ScriptModule that offers a nice API on top of script::Module,
allowing for the definition of script modules with methods, parameters,
and submodules.
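A sketch of the Python-side API this describes, following the ScriptModule/script_method style used elsewhere in this changelog (details are illustrative):
```
import torch
import torch.nn as nn

class Affine(torch.jit.ScriptModule):
    def __init__(self):
        super(Affine, self).__init__()
        self.weight = nn.Parameter(torch.randn(3, 3))   # parameters work as usual
        self.act = nn.ReLU()                            # submodules too

    @torch.jit.script_method
    def forward(self, x):
        # attribute lookups on `self` are desugared (via SugaredValue) into
        # parameter/submodule accesses during compilation
        return self.act(torch.mm(x, self.weight))
```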
Not in this PR but intended for the future:
- ScriptModule actually subclasses nn.Module, with most methods implemented
- Unification of TracedModule and ScriptModule functionality into one container class.
Detailed changelog:
* Switch compiler over to using Module, but don't
use them yet.
* Remove intermediate attribute encoding in compiler
* Create SugaredValue object to handle resolution
of compiled module.
* switch to_ir to modules, implement Select
* hacky python wrappers
* Private ScriptModule
* Add `define` to script module
* Attributes use TK_LIST_LITERAL
this anticipates adding a real list literal expression to the language.
* Add a metaclass to make sure script stubs are registered
* Add a test
* Doc createResolutionCallback
* Docs and minor editing
* Address PR comments
* Document
* Fix unicode issue
* Check if node output matches in shape propagation
* Fix list attributes and view shape propagation
* fix inferred shapes for view
* Fix shape inference for integrally typed tensors
* Fixes for concat in control flow
* Fix print
* Add Python function calls to script
* Script compiler gains a `Resolver` object that runs when it does not understand a function call. This decouples the python resolution from the conversion to IR.
* Add source information to IR nodes
SourceRange information from the script is now propagated to IR nodes.
This information is only used in two places for now: the interpreter
wraps errors that occur when an instruction executes, and shape
propagation now reports errors on the line where it fails:
Traceback (most recent call last):
  File "test/test_jit.py", line 1655, in test_script_error
    bar(Variable(torch.rand(10), requires_grad=True), Variable(torch.rand(9), requires_grad=True))
RuntimeError:
The size of tensor a (10) must match the size of tensor b (9) at non-singleton dimension 0:
@torch.jit.script
def bar(c, b):
    return c / b
           ~~~~~ <--- HERE
In the future, shape propagation should really not report any size
errors and instead just not propagate shapes and let the actual
execution fail. However, this is hard to accomplish while we still
depend on running the op to do shape propagation.
Support shape propagation with control-flow
* This allows us to enable optimization in the GraphExecutor for most
script tests.
* Changes Type to always be present (non-null) on a Value, removing `hasType()`
and `typeOption()`. A new type kind 'DynamicType' now represents when
a specific type has not been determined.
* If/Loop nodes propagate shapes/types in the simple cases where types of
outputs do not change depending on where control flows. In other
cases, we propagate DynamicType to indicate we do not know what
the shape will be.
* Remove the `cond` input to the body of Loop to simplify handling in
interpreter and shape propagation.
* Bugfix for zero-dim contiguousStridesOf
* torch.jit.trace annotation now creates a GraphExecutor
The other torch.jit.trace, which was used for testing purposes and for onnx to get the trace graph, is now called torch.jit.get_trace_graph.
* @script annotation, and compilation unit for strings (see the sketch after this list)
Additionally:
- add support for calling functions that are not methods in the Python frontend
- add an end-to-end test for the Python frontend
- add a capture_stdout helper for checking that `print` actually works
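A small example of the string-based compilation unit (the function body here is made up):
```
import torch

cu = torch.jit.CompilationUnit('''
def scale_and_shift(x, w, b):
    return x * w + b
''')

out = cu.scale_and_shift(torch.ones(3), torch.full((3,), 2.0), torch.zeros(3))
print(out)   # tensor([2., 2., 2.])
```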
This replaces the torch.Tensor constructors with factories that produce
Variables. Similarly, functions on the torch module (e.g. torch.randn)
now return Variables.
To keep the PR to a reasonable size, I've left most of the unused tensor
code. Subsequent PRs will remove the dead code, clean up calls to
torch.autograd.Variable, and rename Variable to Tensor everywhere.
There are some breaking changes because Variable and Tensors had
slightly different semantics. There's a list of those changes here:
https://github.com/pytorch/pytorch/wiki/Breaking-Changes-from-Variable-and-Tensor-merge
* Use stacks in the interpreter/aten_dispatch
Rather than have separate input/output lists,
the interpreter now works using a single stack.
Operators in the interpreter push/pop from the stack.
This allows ownership of tensors to transfer directly to an operator,
and an operator can drop the reference to a tensor as soon as it is
no longer needed. This is important for the GraphExecutor op,
which recursively runs the interpreter.
Once autograd is updated to pass variables to Function by value,
we will be able to ensure that we release ownership as soon as possible.
This commit also switches the interpreter to use a fake
tensor 'ContainerTensor' rather than at::Retainable to hold non-tensor
data in the interpreter. This allows us to use std::vector<at::Tensor>
for all registers, which is significantly less confusing than the
OwnedRetainables struct it was replacing.
* Add If and Loop to interpreter
* Preprocess loop to calculate where references to tensors should be dropped
* Add control instructions JumpZ/JumpNZ/Jump
* Switch from explicitly having stage structs to having a single list
of instructions with Store/Load instructions to take values off the
initial stack
* Make the interpreter tests executable rather than use expect files
* add a flag to interpreter code so that constants are variables
if the interpreter is running on variables.
* Add tensor_as to its own file
* at::maybe_data_ptr and Check.h => TensorUtils.h
* THNN support for optional BN running_*
* ATen support for optional BN running_*
* Python nn.* support for optional BN running_*; Improve IN and BN doc
* Add tests for IN and BN new option
* Layer Norm
* Fix LRN doc
* functional interface for LN and IN
* Layer norm tests
* fix BN double backward returning undefined tensors
* fix jit test using wrong dim inputs for BN
* add/improve BN, IN and LN GPU tests with half type
* Update docs to be consistent with Conv notation
Fix onnx
Clarified onnx symbolic wrapper
* fix typo
* Address comments
The Tensor and Variable classes are being merged.
autograd.Function.forward is now called on Variables, but with "no-grad"
mode (torch.no_grad()) enabled.
One benefit is that we no longer have to explicitly track shared
storages.
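A quick check of the behavior described above; the assert reflects the claim that forward now runs with grad disabled:
```
import torch
from torch.autograd import Function

class Double(Function):
    @staticmethod
    def forward(ctx, x):
        # forward receives Variables/Tensors but runs in no-grad mode
        assert not torch.is_grad_enabled()
        return x * 2

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output * 2

x = torch.randn(3, requires_grad=True)
Double.apply(x).sum().backward()
print(x.grad)   # tensor([2., 2., 2.])
```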
This adds the initial implementation of graph executor for the new JIT design. It includes a few python tests ensuring that nograd, backward, and double-backward cases work for simple examples and some corner cases. More work needs to be done to optimize performance, as there are many extra copies and places where we hold onto variables longer than we should. These are noted in the comments.
This pass splits differentiable subgraphs into their own Node,
similar to a fusion group.
This initial implementation does not create optimal subgraphs, but
it works well in the case where most things are differentiable,
and has the building blocks (`mergeNodes`) to extend to the
better implementation.
* Fix #4480 by tracing inputs before running the function.
The DCE trick says that if I have y = f(x), and f is internally implemented as
g, it's OK to trace both g and f. Recall the tracing algorithm is:
enter f(x)
compute its result y
trace y = f(x)
return from f
So when you run the example above, you'll do this:
# suppose x is mapped to %1
enter f(x)
enter g(x)
result of g is y
trace y = g(x a.k.a. %1) (mapping y to %2)
return from g
result of f is y
trace y = f(x a.k.a. %1) (remapping y to %3)
return from f
and end up with a trace like this:
%2 = g(%1)
%3 = f(%1)
... only %3 is live, because %2 was killed from the mapping... Subsequent DCE
will eliminate the invocation of g and you'll only see f in the final trace.
However, if f and g are inplace functions, the machinery breaks:
# suppose x is mapped to %1
enter f(x)
enter g(x)
result of g is x
trace x = g(x a.k.a. %1) (remapping x to %2)
return from g
result of f is x
trace x = f(x a.k.a. %2) (remapping x to %3)
return from f
resulting in:
%2 = g(%1)
%3 = f(%2) # OOPS
This commit changes the strategy so we instead do this:
enter f(x)
trace f(x)
compute its result y
trace y = f(x) (computed above)
return from f
Now we get the correct Value before it is overwritten.
Here is what the new trace code looks like:
jit::tracer::PreTraceInfo trace_info;
if (jit::tracer::isTracing( self, index )) {
  trace_info = jit::tracer::preRecordTrace( "index_fill", { self, index } );
  setattr(trace_info.n, jit::Symbol("dim"), dim);
  setattr(trace_info.n, jit::Symbol("value"), value);
}
baseType->index_fill_(self_, dim, index_, value);
increment_version(self);
rebase_history(self, grad_fn);
if (trace_info.state != nullptr) {
  jit::tracer::postRecordTrace( trace_info, { self } );
}
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Revert "Hot patch ONNX _run_symbolic_function"
This reverts commit d1c973fee1.
* lintfix
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add missing expect file
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Implement MM fusion (MM with add reduction tree)
A tree where leaves are matrix multiplies and inner
vertices are adds can be computed as a single mm.
Such subgraphs often appear in backward if a single weight
is reused multiple times (e.g. in RNNs).
NOTE: this seems to be slightly slower on the GPU than the
naive implementation, but it's a huge win on the CPU
(think 100x lower overhead)
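The identity being exploited: an add tree of matmuls that share the output dimension can be rewritten as one mm over concatenated operands. A quick numerical check (illustrative only; the pass itself rewrites the IR in C++):
```
import torch

x  = torch.randn(5, 3)
y  = torch.randn(5, 4)
W1 = torch.randn(3, 2)
W2 = torch.randn(4, 2)

tree  = x.mm(W1) + y.mm(W2)                                      # add tree of two mms
fused = torch.cat([x, y], dim=1).mm(torch.cat([W1, W2], dim=0))  # single mm

print(torch.allclose(tree, fused, atol=1e-6))   # True
```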
cuModuleLoad is only valid for a single device so we need to
compile for the particular device that the fusion group will run on.
CompiledFunction already specializes different traces for tensors,
so we just need to have fusion_compiler produce the cuFunction on
the right device.
* Fix tracking of tracing scopes during ONNX pass
* Use ResourceGuard to manage setting a temporary current scope in Graph
* Add tests for ONNX pass scopes
* Remove unused num_classes argument
* Delete obsolete basic ops.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* More deletion.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Delete some unused utilities.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Delete dead apply_fn
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Delete CppFunction symbolic support.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Delete ForwardFunction
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Batchnorm is 'working'
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
- Rename THNN convolution to have thnn_ prefix.
- Propagate CuDNN benchmark and deterministic to at::Context
- Add 'convolution', 'convNd' and 'conv_transposeNd' native wrappers, with defaults
The conv_transposeNd wrappers are updated to have the same argument
order as Python.
- torch.nn.functional directly dispatches to the native wrappers
- Make it possible to turn off tracing for some native wrappers, so I don't
have to write symbolics for all the functions above
- Spectral ops can now make use of CuDNN convolution if possible
- Better commentary on cudnn_batch_norm
- Turn on DCE for all JIT tests.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Further relax VariableFlags
* Allow a requires_grad=True trace to be used for a requires_grad=False
input by computing the gradient but then not connecting it to the
input.
* Enable CSE to de-duplicate WLM backwards pass code which calls sum twice.
* Fix a bug in the interpreter that frees a register too early when
it appears twice in a use list.
* [fuser] Follow all outputs to check if fusion is safe
This bug was introduced when we allowed fusion groups
to fuse together. Previously producers were forced to have a single
output, but now producers that are fusion groups can have multiple outputs.
So now we check the uses of all the outputs of a producer.
* [JIT] Fix handling of undefined inputs
It is not legal to call .data() on variable objects whose tensors
are undefined.
This removes volatile from Variable. The functionality is mostly
replaced by a global (thread-local) flag, which is controlled by
torch.set_grad_enabled() and the context manager torch.no_grad().
In C++, the flag is exposed through GradMode::is_enabled() and GradMode::set_enabled()
Fixes #3627
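The replacement API in use (a small illustration):
```
import torch

x = torch.randn(3, requires_grad=True)

with torch.no_grad():            # context-manager form
    y = x * 2
print(y.requires_grad)           # False

torch.set_grad_enabled(False)    # global (thread-local) flag form
z = x * 2
torch.set_grad_enabled(True)
print(z.requires_grad)           # False
```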
This method prints a bunch of useful debug information including
the traces that have been recorded, their shapes, and the traced
graphs associated with them.
* Trace ATen non-primitive functions as themselves, not their implementations.
Previously, if I invoked an ATen non-primitive function foo, which in turn
called subfoo, I would always see 'subfoo' in the trace (e.g., tracing
'inlines' all of these operations.) Such inlining is bad for ONNX
(and can be bad for optimization) as it prevents high-level
optimizations from taking advantage of the structure. It might
be right to inline, but give the optimizer a chance to work before
inlining happens!
The implementation here is surprisingly simple, because it uses
the "DCE trick". Essentially, it doesn't matter if the constituent
calls perform tracing, because you can always trace it again, and
override the trace nodes associated with the returned variables.
The original trace becomes dead and can be DCE'd.
While implementing this, I also refactored how 'isTracing' and
'trace_outputs' works:
- isTracing was previously a single function with overloads for
both Tensor and Variable arguments. Unfortunately, such overloads
are not safe, because of how C++ implicit conversions work. You
would think that C++ should never confuse an overload for
Variable with ArrayRef<Tensor>, but this is exactly what can
happen: Tensor is convertible to both Variable and ArrayRef<Tensor>,
thus it's ambiguous and C++ doesn't like it. The last time I ran
into this problem, I applied initializer lists to everything and
called it a day. A more robust fix is to separate out the
Variable and Tensor overloads, which I have done in this patch.
- trace_outputs was fed as an initializer list, which doesn't work
when you have heterogenous inputs. So instead we first feed
everything through 'flatten', which has overloads for each of the
argument patterns in ATen, which then goes on to the recordTrace
(which takes an ArrayRef). This is *no less efficient*, because
we were allocating a vector anyway (to do the conversion from
vector of Tensor to vector of Variable).
These fixes mean that 'index' can properly be traced... although the
JIT still does not support it. A failing test case has been added to
this effect.
Some knock-on effects:
- The fuser now knows about chunk as well as split. They're pretty
similar so there is no problem.
- There is a new 'canonicalize' pass in the JIT which renumbers a graph
so that all structurally equivalent graphs render the same.
- We run DCE before the fuser tests, to make sure dead nodes don't
block fusion.
- There are new ONNX exports for the newly introduced higher level ATen
operations. This includes type_as (no-op case only), chunk, select.
Zach didn't like the extra use of 'native' in the new codegen, so
we've introduced a new concept, 'abstract'. An abstract function
is one that is implemented in derived types (e.g., CPUDoubleType),
where as a concrete one is implemented in the base type (Type).
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This adds a simple fusion backend for the CPU.
* Refactors CompiledFusionFunction to have two subclasses that handle
the compilation details of each backend.
* emit-compile-link-run cycle for the CPU
* simple single core loop to run the operation
* lift CUDA-only restrictions in the fuser, checks that fusion groups
are only on a single backend.
* Add interpreter support for Handles/PythonOp/CppOp
This treats Handles as a first-class type in the interpreter
since this turned out to be conceptually simpler than treating
them as a separate concept, which requires a second channel for
register allocating and moving data from one op to the next.
Notes:
* The refcounting nature of tensors is factored into its own base type
so that it can be shared with other refcounted types such as handle.
* Some methods redundant with TensorBase have been deleted from Tensor
* The interpreter uses raw refcounted handles. In addition to being
able to treat Tensors and Handles as the same base object, it removes
a lot of redundant refcounting as objects moved from tensors to input/
output lists.
* aten_dispatch has been updated to work directly on the raw refcounted
lists to avoid refcounting and duplicate lists.
* Removing jit_closure.cpp; the interpreter can now handle all pathways.
* Functions like `unsafeToTensorShare` describe how
ownership transfers in the interpreter. The `Steal` variants
take rvalue references as arguments, and invalidate those
arguments to prevent potential problems.
* TensorTemporary is not a subtype of Tensor because it is too easy to
do something horribly unsafe:
```
void foo(at::Tensor bar) {
  // bar's destructor calls release on a temporary!
}
foo(TensorTemporary(retainable)); // structure slicing!
```
Occasionally Travis builds would fail on these two tests.
It's not entirely clear where this nondeterminism is coming
from.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This commit adds a Value type similar to the one @ezyang suggested a while
ago for handling multi-return nodes.
Previously if we had a graph like:
a = op1(b)
c, d = op2(a)
Then its in-memory format would look like:
%0 = op1(b)
%1 = op2(%0)
%2 = select(%1, 0)
%3 = select(%1, 1)
Select nodes were used only to handle the multi-output case. In the
single-output case ops referred directly to their uses.
This required special handling for the single- and multi- output cases,
and was confusing when used with ONNX which distinguishes values (the
inputs/outputs of a node) from the nodes themselves (e.g. a Conv).
This commit adds the Node/Value distinction to the IR. In the example
above, `a`, `b`, `c`, and `d` are now Value objects, while `op1` and
`op2` are now Node objects. Inputs/Outputs to the graph are values.
* Nodes now always have multiple outputs, accessible through their `output()`
method.
* Methods exist for adding/removing outputs from a node.
* Nodes own their output Values, destroying a node destroys its outputs and it
is only valid to destroy a node when no uses of its outputs remain.
* Unlike select, Values do not appear in the nodes list.
* The method `node()` on `Value` retrieves its defining node. Calling it
is always valid. For inputs, its kind is "Param". Like "Return" there is a single Param
node representing all inputs.
* For single-output Nodes, the method `output()` retrieves the single
output Value, asserting that the node is in-fact single output.
* Functions are the same, but some functions like `type()` have moved to
Value.
* `replaceAllUsesWith` is now sanely defined for both Values and Nodes.
In the case of Nodes, it replaces all outputs of the node with the outputs
of the replacement node.
* stage is defined both on Node/Value. This is because Inputs require a stage.
* Apart from changing data types from Node->Value most passes remain the same.
Things that previously assumed single-output nodes now have to call output()
to get the Value.
* This removes the uses = [...] field in the outputs because it was
getting confusing even before this commit when uses would refer to nodes,
but we print the names of Values. The lint pass validates the use list,
so printing it out seems less necessary.
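From Python the distinction shows up when walking a graph; a small illustration (assuming a recent build where debugName() is available; older builds named it differently):
```
import torch

@torch.jit.script
def f(x):
    return x * x + x

g = f.graph
for node in g.nodes():
    # Nodes own their output Values; each output is a Value with its own name.
    outs = [v.debugName() for v in node.outputs()]
    print(node.kind(), '->', outs)

# Every Value knows its defining Node; all graph inputs hang off a single Param node.
first_input = list(g.inputs())[0]
print(first_input.node().kind())   # prim::Param
```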
This started off as a minor fix based on Adam's question, "why is printing
a graph not const" and snowballed into a giant yak shaving exercise.
- The Graph and Node APIs now uniformly enforce deep constness; e.g., if you
get a const Node* or const Graph*, it is not possible to get a non-const
Node*/Graph* somewhere else in the graph (even though the member variables
of these are non-const. Hooray for private access specifier.)
- A big pile of functions got const versions, most notably the printing
functions, and functions for accessing inputs().
- REALLY IMPORTANT, BC-BREAKING CHANGE: inputs() now returns a COPY of the
inputs, rather than a reference to the underlying. I was forced to do this
because there is no way to portably turn a std::vector<Node*> into a
std::vector<const Node*>, which is necessary to provide a const-correct
version of inputs() that enforces deep const-correctness. I then justified
this choice to myself with the observation that outputs() returned a
copy (by necessity), so this makes the API more uniform.
But making this change uncovered two very subtle bugs:
1. If you change functions from returning a reference to returning a copy,
the idiom node->inputs().begin() is no longer valid, because the memory
the iterator points to immediately becomes invalid. THIS SUCKS.
Honestly, we should add a lint rule rejecting calling begin()/end() on
temporaries because this is very dangerous. To excise this pattern from
the codebase, I added begin() and end() methods to Graph, so that we got
rid of the graph->nodes().begin() idiom, which happens to be sound,
despite not returning a reference, because graph_node_list is a
non-owning reference.
2. pybind11 doesn't handle std::vector<Node*> cast out of the box.
Fortunately, I found a simple fix in the GitHub issues tracker
that involved adding an extra type converter. And yes, this
does mean that outputs() in Python never worked correctly.
- New const_graph_node_list, which is a graph_node_list that gives you const
Node*
There are some more miscellaneous improvements:
- Applied CR comment fixes on export.cpp; using replaceInput, and renaming
variables for clarity.
- assertValidInput helper method added, and applied to replaceInput
- Use an explicit function to print THPObjectPtr, otherwise we get
the wrong overload.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* update fuser to match ATen-formatted JIT ops
* fix concat optimizations and add test
* allow onnx export to work with single-export functions
* fix onnx handling of multi-return nodes.
* nits, format, vision test update
* fix add constant
* fix driver init issues
* Add missing Neg symbolic.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
The general strategy is there is a new module, torch.onnx.symbolic, which
contains a function for every ATen method name with the ONNX translation.
While implementing this, I took the opportunity to expunge all references
of 'g' from the public API; instead, it is managed by a global variable in
torch.onnx which tracks the "current graph".
Other changes:
- If you pass a Tensor to op as an argument, it will now automatically be
converted into a Constant ONNX node. This lets us remove needing to
implement ONNX
- Rename value to other, wherever there is both a Scalar and Tensor overload.
This way, keyword dispatch can work uniformly in both cases.
- Deleted any autograd Function classes that both had a symbolic and were ported
to the new C++ autograd implementation. There may still be some straggling
classes that didn't have symbolic.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
It's pretty easy to accidentally fail to actually compile
a JITed region, which means that we have accidentally failed
to have test coverage for a number of features. This adds
a secret _assert_compiled kwarg, which will raise an error
if we don't actually hit the compiled codepath.
This is not intended to be user visible; we have some other
ideas for handling this case.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
A few notes about the implementation:
- Need to plumb 'devices' through to the 'fork_rng' calls. You definitely
want these; it makes verify run A LOT faster
- New keyword argument for compiled model execution, '_force_trace', which
forces us to retrace a model.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
The alpha/beta naming in addmm was flipped; this commit fixes that
problem. It also fixes the ONNX export of alpha/beta parameters.
Finally, it supports executing matmul in the JIT.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
torch.jit now contains two user-facing functions: compile and trace
(corresponding to what was previously trace/traced and record_trace).
The non-curried versions of these functions have been eliminated, so
that there is only one function in the API (we *must* have the
curried versions, since these enable their use as decorators). There is
detailed usage documentation in the docblocks for these methods.
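Usage of the curried decorator form might look like this (a sketch of the historical API described here; the nderivs keyword is an assumption from that era and is not part of today's torch.jit):
```
import torch

# Historical sketch: torch.jit.compile used as a curried decorator.
@torch.jit.compile(nderivs=1)
def f(x, y):
    return x * y + y

x = torch.randn(4, requires_grad=True)
y = torch.randn(4, requires_grad=True)
out = f(x, y)            # first call records a trace keyed on the inputs
out.sum().backward()
out2 = f(x, y)           # later matching calls can hit the compiled path
```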
This comes with a complete rewrite of the internals of torch.jit, in the process
fixing a number of bugs. Key points of the new implementation:
- compile and trace both always return a Module representing the underlying
function/module wrapped with compilation/tracing. This makes handling
of the function/module cases more uniform, as we can think of the function
case as creating an on-the-fly module with the parameters explicitly
specified by the user. For technical reasons, we now *require* any parameters
in the function case to be honest-to-goodness Parameters (gory details:
you can't register a Variable as a Parameter to a Module, but you can't
create a Parameter from a Variable while sharing the same underlying
identity.)
- Flattening and unflattening is done a lot more uniformly. We now have
a _flatten and _unflatten function which are inverses of each other:
_flatten always returns both the flat, tuple of Variables, *as well as*
the "proto" (now referred in the code as the "struct") from which we
can unflatten the variables. Low level functions like 'raw_trace'
always work with the flattened inputs/outputs, which keeps their logic
simple.
- JIT trace keying now also includes the "struct" of the input arguments.
This is a step towards accepting non-Variable arguments in functions,
although flatten/unflatten don't currently support it.
- TraceForKey (previously TraceInfo) has had its API reworked to have
fewer degrees of freedom when you are interacting with it.
TODO: Verify, timing, and trace dumping have been temporarily excised. I
plan on adding them back.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Plus a test for Eval nodes in the IR, since we hadn't actually
covered this case now that some nodes are transparently traceable.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
To be honest, this was the whole point of this refactor set.
I noticed that in a lot of code, we were repeatedly copying lots of metadata
from old nodes to new nodes. This was quite concerning because I wanted to
add some more metadata (alias information) and I didn't want to have to
get it right in all cases. Plus, in a lot of cases we were forgetting
to set more optional properties like debug names when we "copied".
To solve this, I first made cloneFrom() copy all of this metadata. Then,
I searched for all occurrences of setType() (a proxy for "I'm cloning this
node"), looked for cases where we really were morally doing a copy, and rewrote
the code to use cloneFrom() instead, allowing us to drop explicit setType()
(and getting more metadata preservation in the process.)
Finally, I refactored tryToMoveChunk. The code is modestly longer,
but the new version has the nice property that the initialization of
selects for input_chunk are next to the creation of the node (as opposed
to delayed for later.) I also added a lot more comments for invariants
I noticed when I was working on the code.
One minor extra change: TensorType grew a new constructor and a withSizesStride
"immutable setter" which returns a new copy of TensorType with different info.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Variables now hold a list of ValueTracingStates and can participate
in multiple traces.
* Refactored Traceable to maintain a list of traces, and only stop
tracing once it records all stages
- BC BREAKING: export now also takes a mandatory file-ish argument, specifying
the file to export the protobuf to. I rewrote the tests to use BytesIO to
get out the string so they could parse it again.
- BC BREAKING: export no longer returns the tensors that were computed. To
get these, use the internal _export function.
- Multiple inputs to models are now supported by passing a tuple to input.
(Old API of a single Variable still works.)
- Keyword arguments to models are now supported via kwargs keyword arg.
- Renamed embed_params to export_params, and it now defaults to True.
- Toffee tests now live in their own test_toffee.py file. I had to
rename a pile of expect files for this.
- Removed defunct torch.toffee imports from autograd to solve module import
cycle.
- Helper function _with_file_like to abstract over opening file-ish arguments,
taken from torch.save()
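The updated call then looks roughly like this (a small sketch; the model is a stand-in):
```
import io
import torch
import torch.nn as nn
import torch.onnx

model = nn.Linear(4, 2)
dummy = torch.randn(1, 4)

buf = io.BytesIO()                       # any file-like object (or a path) works
torch.onnx.export(model, (dummy,), buf,  # inputs passed as a tuple
                  export_params=True)    # renamed from embed_params; defaults to True
proto_bytes = buf.getvalue()
```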
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
- Reduce setup.py diff.
- Expunge WITH_TOFFEE from codebase.
- Elaborate on a comment.
- Move gen_toffee.sh to tools
- Delete densenet test.
- Use 'using' to inherit a constructor.
- Delete outdated comment.
- Comment about why primspecs can return fewer outputs.
- Remove dead, commented out includes.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Along the way I added converters for Variable and TracingInput. Variable should
probably be moved to a more widely known spot.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
I realized we weren't running the linter after ToToffeeIR, so
I added a lint call. It thus emerged that the current implementation
was using "Unused" nodes that were not added to the graph,
which was tripping the lint. I fixed this a few ways:
- BatchNorm and Conv primspecs were returning dead "unused" nodes
for their (implicit) handle parameters. I removed them because
setOutputs handles this already, and a dead unused node which
is not attached to the graph violates the "no dead nodes"
invariant.
- OK, but MaxPool actually needs to return an unused node for
the output which is supported by PyTorch but not Toffee; we need
to error if subsequently in the trace this output is used.
The new strategy is to have MaxPool's primspec return a None
at the unused position, and then immediately *check* if there
are any uses of that output. If there are, that's an error!
- I needed to adjust the Select invariant in the exporter loop:
only if a Select node has *uses* is it mandatory for it to be
defined in env.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Basic idea:
- Pass buffers (marked as non-Variable tensors) as input variables to
the trace. Every buffer gets represented as an input variable
to the trace, and we remember a correspondence of the underlying
TH pointer and an input variable in the trace.
- When we initially trace a function, we DO NOT record the buffers
as edges. This is so autograd doesn't have to know anything about buffers.
If we ever turn buffers into requires_grad=False parameters, then
this problem goes away.
- When we primspec the buffer, NOW we reach into the cached buffers
(now appropriately named) and gin up the buffer information we need.
Other things:
- CppOp execution is now supported (but lightly tested) using
SimpleEval (thanks @apaszke!)
Todo:
- E2E tests need to have their hacks removed.
- Figure out what is going on with backwards
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
General strategy:
- nanopb is statically linked into PyTorch. It must be built
with -fPIC.
- Generated nanopb files for toffee.proto are checked into
our repo.
- Because nanopb generated protobufs are C only, we wrote a
wrapper around it to give a Google C++ style interface.
More on this shortly.
How does the wrapper work?
- It's called "micropb" because it is less small than nanopb :)
- nanopb requires all variable-length fields to be written out
using a "callbacks" mechanism.
- We wrote pre-canned callbacks for all of the types ToffeeIR
writes out and lists; these are micropb_callback and
micropb_callback_list. These operate simply by dynamically
allocating and storing the data to be written out in
data (this defeats the purpose of the callback mechanism,
but it's easy to implement)
- Finally some boilerplate to actually implement the wrapper
classes and have owning pointers to the actual data.
Testing strategy:
- Take the serialized protobuf from nanopb, parse it again
with ToffeeIR and print it. Worked with all of test_jit.py!
These tests don't run without 'toffee' being installed.
TODO:
- Update CI to install ToffeeIR, so we can run the Toffee tests
in CI
- Update E2E with Caffe2 tests so that they work with new stuff.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This addresses the case where bias is disabled, which occurs in torchvision's
alexnet and densenet.
The general strategy is this:
- When we encounter a null variable, we turn this into a Constant
node with an undefined at::Tensor
- Toffee exports for BatchNorm and Conv have special cases for bias,
checking if they are provided by a Constant node with undefined
value, and just omit the input if so.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
The general strategy:
- We put all the toffee files in torch/csrc/toffee; they will only be
added when toffee is enabled
- Toffee is enabled if torch/lib/ToffeeIR is present (since we
don't have a submodule/subtree thing going on)
- The most prevalent place you will need to use WITH_TOFFEE is for
primspec definitions on C++ autograd functions. There is a
macro HAS_PRIMSPEC to ameliorate optionally defining primspec()
virtual overrides on Function classes. HasPrimspec is always
available but will be a zero field class when Toffee is disabled.
NB: We might revert this commit in the future if we figure out a way
to unconditionally enable Toffee that everyone likes.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This commit adds a new exporter pass which takes a graph and returns
a string of the human-readable protobuf representation of a model.
We have two strategies for how conversions are implemented:
- If a Python autograd function has a primspec static method, we invoke
it to get the Toffee conversion. Use torch.toffee.op to generate the
format expected to be returned. The particular data representation is opaque
and subject to change in the future.
- Otherwise, there's a giant if statement in the exporter, which manually
uses the JIT IR C++ API and Toffee IR C++ protobuf API to convert.
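A hedged sketch of the first strategy: an autograd Function that carries its own converter. The primspec hook and torch.toffee.op helper follow this entry's description; exact signatures are illustrative (the later API renames the hook to `symbolic`):
```
import torch
from torch.autograd import Function

class MyRelu(Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        return grad_output * (x > 0).type_as(grad_output)

    # Converter consulted by the exporter; returns the Toffee node(s) that
    # replace this op in the exported graph (illustrative signature).
    @staticmethod
    def primspec(x):
        return torch.toffee.op("Relu", x)
```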
You must check out a copy of the ToffeeIR repo
https://github.com/ProjectToffee/ToffeeIR at torch/lib; at the moment
we don't have a subtree/submodule set up.
Technical debt in this commit:
- To get protobuf headers in scope, we unconditionally add $CONDA_PREFIX/include
to the include path. This needs to be replaced with a more robust mechanism.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
The API works on either functions or models, taking an extra parameter argument
so that functions can pass in additional variables to trace.
Other behavior is folded into boolean options:
- time: collect stats for our own perf debugging
- verify: run the original code, and check it is within threshold
- optimize: run optimization (currently off until fusiongroups pr is accepted).
- enabled: flag to turn off tracing so you can check timing of stuff that cannot be traced.
The approach is based on THC's pointwiseApply{1,2,3} family of kernels,
but doesn't have any dependencies on that code.
Adjacent contiguous dimensions of input tensors are compressed to reduce the complexity of indexing math.
For the completely contiguous case, the indexing logic simplifies to just the linear index.
In simple tests, this code matched or beat the equivalent from THC.
- To test whether or not a multiline string matches some expected
value, you can use assertExpected. This tests that the string
matches the content stored at a file based on the name of the
test (and an optional subname parameter you can pass if you
want to call assertExpected multiple times.)
- Suppose you make a change that modifies the output in a big way.
Instead of manually going through and updating each test, you instead
run python test/test_jit.py --accept. This updates all of the expected
outputs. You can now review them one-by-one and make sure your
changes make sense.
We can add more features later (e.g., munging the output to make it
more stable, more sanity checking) but this is just to get us started
testing. One thing to watch out for is that accept tests on intermediate
representation can be a bit wobbly: it is *extremely* important that
people be able to read the IR. It may be worth introducing niceties
to the printer in order to ensure this is the case.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>