pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Will Feng	60745b3380	Revert #7750 and #7762 to fix Windows CI on master (#7772 ) * Revert "Add missing brace (#7762)" This reverts commit `ea27c5af50`. * Revert "[C++ API] Add backward() to Tensor and Variable (#7750)" This reverts commit `1e2762796f`.	2018-05-22 15:42:52 -07:00
Peter Goldsborough	1e2762796f	[C++ API] Add backward() to Tensor and Variable (#7750 ) * Add backward() to Tensor and Variable * Added a couple tests	2018-05-22 10:43:04 -07:00
Tongzhou Wang	1c01eabd3c	Codemod to update our codebase to 0.4 standard (#6641 ) * Codemod to update our codebase to 0.4 standard * Update some of the test scripts * remove Variable in test_clip_grad_value * fix _symbolic_override_wrapper_maker	2018-04-17 22:06:54 -04:00
Tongzhou Wang	e01569afd7	Restore allow_unused functionality (#6553 )	2018-04-12 21:30:42 +02:00
Priya Goyal	e3196e0ea8	[Re-checkpointing] Autograd container for trading compute for memory (#6467 ) * Autograd container for trading compute for memory * add a unit test for checkpoint * address comments * address review comments * adding some docs for the checkpoint api * more comments * more comments * repro bug * Fix a subtle bug/apply some review comments * Update checkpoint.py * Run everything in grad mode * fix flake and chunk=1 * use imperative backward as per discussion * remove Variable and also add models and test for models * Add a simple thread local variable to check for autograd grad mode * remove models and models test after debugging * address review comments * address more comments * address more comments	2018-04-10 15:26:24 -04:00
Edward Z. Yang	49f2bb7e0b	Extra comment about backward vs. grad in engine. (#6005 ) Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-03-27 10:54:06 -04:00
Luca Antiga	396637cdd6	Python-free build of autograd + jit (#5356 ) This PR adds the possibility to build the C++ parts of autograd and jit, with no dependency on Python. The goal is to allow taking a PyTorch IR representation (a tree s-expr) and running it with provided inputs. Prerequisite: build PyTorch so that codegen runs once. Instructions: cd tools/cpp_build bash build_all.sh This will build libtorchjit and torchjit_test in tools/cpp_build/build/torchjit-build. The latter basically runs the code in test_jit.cpp for now. While writing the PR, it turned out that a few of Python.h includes were redundant. They were removed here (PyTorch tests still pass on my machine, we'll see CI). * Introduce Python-free builds of autograd and jit * Remove NO_PYTHON ifdef in functions/special	2018-03-08 15:13:10 -05:00
Peter Goldsborough	702a7f3864	Improve Function interface (#5221 ) * Improve Function interface * Undo tracer changes * Fix bug in VariableType.set_history * Rename function_counter and sequence_number to sequence_nr * Clarify Function documentation * Replace swap_next_edges with next_edges() getter * Bring back set_gradient_edge * Simplify special.cpp * add_gradient_edge -> create_gradient_edge * Add mutable getters for pre/post hooks * Use make_variable with Edge * Remove remove_gradient_edge in favor of detach_ * Fix documentation and remove create_gradient_edge friend method * Canonicalize some includes	2018-02-21 16:37:52 -05:00
Peter Goldsborough	2d5fbe6e0d	Improve Variable interface (#5127 ) * Improve Variable interface * Address comments from @apaszke and @colesbury * string ::operator= is not noexcept * Remove ir.h from tracer_state.h to improve build times * Make Variable a struct and pack SavedVariable fields * Implement as_variable_ref * grad_fn_ptr() -> grad_fn_unsafe() * Reduce hackiness of set_type hack * Include variable.h and edge.h in tracer_state.h because it uses them * class Variable -> struct Variable because Windows can't even * Make Variable::output_nr uint32_t instead of int * Add comment about tracing state * Replaced more static_cast<Variable&> and improve docs * Remove SavedVariable destructor and construct members in init list * Clarify docs for Variable * Variable::set_version -> set_version_counter	2018-02-12 23:26:26 -05:00
Peter Goldsborough	f38b6f611e	Replace NULL with nullptr in autograd (#5162 )	2018-02-12 12:01:52 -08:00
Peter Goldsborough	25e946bf78	Replace edge_type with Edge and create Variable::gradient_edge() (#5030 )	2018-02-07 10:50:42 -08:00
Adam Paszke	79d15c52cb	Improve the engine support for functional graph execution (#4690 ) Previously the side-effect free grad calculation was performed using callbacks that could also override the decision to run a function. However this had a few problems e.g. it forced us to iterate over pretty much all functions in the graph and drop their buffers. This patch improves the mechanism, by adding explicit support for this kind of evaluation in execute(). It's safer, and the algorithm used to decide which nodes have to be evaluated was replaced with a faster one.	2018-01-18 11:20:30 +01:00
Sam Gross	d605058212	Replace Variable.volatile with torch.no_grad() (#3970 ) This removes volatile from Variable. The functionality is mostly replaced by a global (thread-local) flag, which is controlled by torch.set_grad_enabled() and the context manager torch.no_grad(). In C++, the flag is exposed through GradMode::is_enabled() and GradMode::set_enabled() Fixes #3627	2017-12-18 15:46:13 -05:00
Sam Gross	b79d74aa81	Re-initialize autograd engine in child processes (#4158 ) * Re-initialize autograd engine in child processes The autograd engine uses threads for backwards. These don't exist after forks and they were not being re-initialized because the Engine::start_threads_flag was already set. This re-initializes the engine in child processes, which will cause it to re-create threads when backwards() is called in the child process. Note that we only attempt to handle the common case where fork() is called while the backwards threads are idle. Fixes #3966 * Avoid non-async-signal-safe functions in fork handler	2017-12-18 01:51:27 -05:00
Sam Gross	0a434ff685	Remove Function::is_executable (#3907 ) * Remove Function::is_executable Ensure that grad_fn is null if requires_grad is false. * Assert that grad_fn implies requires_grad=True	2017-11-28 18:29:27 -08:00
Adam Paszke	3128218397	Allow specifying unused inputs to torch.autograd.grad (#2859 )	2017-09-25 14:42:33 -04:00
Sam Gross	f4169260f8	Fix crash when calling backwards on leaf variable which does not require grad (#2788 )	2017-09-20 09:43:20 -04:00
Sam Gross	1290e586fb	Use at::Tensor based autograd Variable (#2676 ) Variable is now a subclass of at::Tensor backed by a VariableImpl* pImpl. The implementation of the ATen functions is defined in the auto-generated VariableType.h/cpp file. Currently, only functions which fall through to the base type, such as sizes() and isCuda() are implemented. Differentiable ops like add() and mul() will be added in a subsequent PR.	2017-09-12 11:36:01 -04:00
Adam Paszke	ea888c1905	Check input flags in Traceable	2017-09-06 21:35:50 -04:00
Adam Paszke	9f0c4c9f9a	Make autograd engine reentrant without creating new threads	2017-09-05 17:48:55 -04:00
Adam Paszke	f83c4fad7b	Fix exception propagation from recursive Engine calls	2017-09-05 17:48:55 -04:00
Adam Paszke	d8e2ab632e	Add support for Constant nodes in AutogradClosureFactory	2017-09-05 17:48:55 -04:00
Adam Paszke	594f98ce16	Support multi-stage AutogradClosures	2017-09-05 17:48:55 -04:00
Adam Paszke	fa308b3183	Improve backward tracing	2017-09-05 17:48:55 -04:00
Adam Paszke	3e0f1608fe	Capture Variables that are not inputs as constants	2017-09-05 17:48:55 -04:00
Adam Paszke	f270973937	Add JIT IR -> Autograd IR converter	2017-09-05 17:48:55 -04:00
Adam Paszke	ea05ac8f41	Move JIT-related files to jit dir. Remove IR interpreter	2017-09-05 17:48:55 -04:00
Zach DeVito	1325fa511c	JIT IR including use-def chains and updated comments.	2017-09-05 17:48:55 -04:00
Zach DeVito	f369f8e80d	simplify IR	2017-09-05 17:48:55 -04:00
Edward Z. Yang	4979359800	Add graphs, trace them. It is not an /expression/ we trace, but it is a /graph/: that is, a closed expression which knows its parameters. Knowing the list of parameters is helpful and helps remove a hack when interpreting. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2017-09-05 17:48:55 -04:00
Edward Z. Yang	a797ab9343	Rewrite AST to a new, more functional representation. Previously, our AST was a DAG, where shared Nodes indicated a computation should be reused. This commit rewrites the IR into a new functional representation which represents sharing explicitly using variable bindings. We offer a few justifications for this new style: 1. The new representation is not all that different from the old one; it is about as easy to construct, and the lack of an explicit graph doesn't negatively impact our ability to interpret the graph, since we've chosen, as a matter of design, to NOT have the IR participate in the actual execution of a graph. 2. The new let-binding representation has an implicit ordering, which we can use to conveniently keep track of the original order the trace showed up as. This automatically gives us a topsort, and gives us an easier to read textual representation of our IR: %14 = Embedding %11, %0, -1, None, 2, False, False %15 = Dropout %14, 0.2, True, False %16 = Index %12, 0 %17 = Index %12, 1 %18 = Index %13, 0 %19 = Index %13, 1 %20 = Index %15, 0 %21 = Linear %20, %1, %3 %22 = Linear %16, %2, %4 3. It moves us closer to a Futhark style language (http://futhark-lang.org/publications/pldi17.pdf). Major aspects of the diff - Node is replaced with Expr and Arg, a pair of mutually recursive structures which represent our new language. In BNF, the language looks like this: a ::= c \| %i e ::= %i, ... = e \| PyOp e, ... \| Ret %i, ... Technically, Ret is not actually a return (no control flow is involved), it just tuples up a series of tensors (identified by variables). One important invariant is that locals are always tensors; they are never constants (this is asymmetric with Args.) - Arguments support Python constants. This is an important piece because many operators take extra Python literals like integers and tuples in order to specify extra parameters about how an operator operates. 
Adding this was essential to getting word_language_model to work. - As both Expr and Arg have multiple variants, there is new infrastructure for doing case on the variants using ExprVisitor and ArgVisitor. The strategy here is adapted from WebAssembly's visitors, although we have generalized to permit arbitrary argument forwarding, which is necessary to support tail-recursive visitor calls. TCO is important because our interpreter may recurse arbitrarily deep into a stack of nested lets. If users wish, they can also manually case on the type tag. - Tracing is now turned on and off using _tracer_enter/_tracer_exit in torch._C. _tracer_enter accepts a list of variables which are to be treated as arguments; _tracer_exit accepts the list of traced variables which should be returned when you reexecute the trace, and returns the trace expression which can be reexecuted. GlobalTracingState is a global variable which tracks whether or not we are tracing or not. - You use run_forward to execute a trace on some set of parameters. - When under tracing, variables keep track, via trace_local, what the name of their variables in the IR are. Here is a simple runner which leaks memory but can be used to JIT models: import torch.autograd.function as F import torch._C def jit(model): import types real_forward = model.forward def forward(self, args): def flatten(x): return tuple(F._iter_variables(x)) if not hasattr(self, "saved_trace"): torch._C._tracer_enter(tuple(self.parameters()) + flatten(args)) out = real_forward(args) self.saved_trace = torch._C._tracer_exit(flatten(out)) self.saved_outs = out return out else: flat_out = Variable._execution_engine.run_forward(self.saved_trace, tuple(self.parameters()) + flatten(args)) return F._unflatten(flat_out, self.saved_outs) Major problems: - Sanity checking is spotty at best, especially when users pass in variables. - The interpreter leaks tensor memory from the store. 
When we add back def-use we should be able to deallocate tensors as soon as we know they are no longer necessary. - The interpreter needs to reach feature parity with the old execution engine. From there, we need to see if backwards can be subsumed as well. - I still have no confidence in having memory managed everything correctly. This requires a close look. - Rather than return an open expression as a trace, we should return a lambda instead, which knows about how many formal parameters it requires. - The IR is not introspectable from Python at the moment, but this is simply a matter of implementing all the binding code. - The tracer is NOT reentrant (you can't trace while you're inside a trace.) Furthermore, no sanity checking is done if you try to incorrectly reuse things from one trace in another. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2017-09-05 17:48:55 -04:00
Edward Z. Yang	1e8bf12b3a	Add an inefficient but working evaluator for forward traces. Simple test: import torch from torch.autograd import Variable import torch._C as _C x = Variable(torch.Tensor([4]), requires_grad=True) y = Variable(torch.Tensor([7]), requires_grad=True) z = x * y z.sum().backward() print(x.grad) print(y.grad) x.data[0] = 2 y.data[0] = 3 (z,) = z._execution_engine.run_forward((x, y), (z,)) z.sum().backward() print(x.grad) print(y.grad) Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2017-09-05 17:48:55 -04:00
Sam Gross	42485d87c2	Set the current device in each engine's thread (#2081 ) Fixes #2017	2017-07-13 16:24:38 -04:00
Adam Paszke	86a065e45b	Add end callbacks to the engine	2017-06-12 21:58:38 -04:00
Edward Z. Yang	565bf7116b	A pile of misc doc fixes. (#1682 ) * A pile of misc doc fixes. Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Handle @apaszke review comments. Signed-off-by: Edward Z. Yang <ezyang@fb.com> * Initial csrc documentation. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2017-06-02 11:59:03 -04:00
Sam Gross	e1d257bc6d	Fix segfault in autograd: (#1644 ) * Fix segfault in autograd: 1) Every "output" variable must have a grad_fn or grad_accumulator 2) compute_partial_exec_callbacks uses Python errors * assertRaisesRegexp was renamed assertRaisesRegex in 3.2 * Use HANDLE_TH_ERRORS macro	2017-05-24 17:13:08 -04:00
gchanan	3d38e4f126	Acquire GIL before THPVariable_wrap (#1625 ) * Acquire GIL before THPVariable_wrap. * mutex not required when GIL is held. * Remove unused mutex.	2017-05-24 15:19:34 -04:00
Adam Paszke	feef54ec34	Don't modify non-volatile grads in zero_grad	2017-05-10 16:43:14 +02:00
Adam Paszke	5c7453447f	Fix bugs, rename differentiate to grad, make it more flexible	2017-05-01 16:44:56 -04:00
Adam Paszke	e5db8f98be	Add torch.autograd.differentiate	2017-05-01 16:44:56 -04:00
Adam Paszke	702a2e3bc5	Make Variables not subclass Function anymore Because of this Variables can no longer appear in the graph. Every usage of a leaf Variable will leave an AccumulateGrad function that has no outputs, but modifies var.grad as a side effect.	2017-05-01 16:44:56 -04:00
Adam Paszke	2ca787fcf4	Refactor attribute names in autograd	2017-05-01 16:44:56 -04:00
albanD	71303b8af4	Autograd deadlock for recent glibc fix (#1243 )	2017-04-12 22:24:31 +02:00
Sam Gross	34ce58c909	Parallelize backwards	2017-03-03 11:26:00 -08:00
Adam Paszke	977630bc15	Handle duplicate backward roots in autograd	2017-03-01 19:42:39 +01:00
Sam Gross	bd5303010d	Refactor autograd package to separate Python dependencies. (#662 ) The core autograd Variable, Function, and Engine no longer depend on the Python API. This lets us implement functions in C++. In the future, we can also multithread engine and release the GIL for most of the non-Python backwards.	2017-02-13 16:00:16 -08:00
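
Commit d605058212 in the log above describes replacing Variable.volatile with a global, thread-local flag controlled by torch.set_grad_enabled() and the torch.no_grad() context manager. The pattern can be sketched in plain Python without PyTorch. This is an illustration only, not PyTorch's implementation: the names `_GradState`, `_state`, and `observe_from_worker` are hypothetical, and the functions below only mimic the names mentioned in the commit message.

```python
import threading
from contextlib import contextmanager


class _GradState(threading.local):
    # Hypothetical stand-in for the thread-local flag; every thread sees
    # this default until it assigns its own value.
    enabled = True


_state = _GradState()


def is_grad_enabled():
    """Return the flag as seen by the calling thread."""
    return _state.enabled


def set_grad_enabled(mode):
    """Set the flag for the calling thread only."""
    _state.enabled = bool(mode)


@contextmanager
def no_grad():
    """Disable the flag inside the block and restore the previous value on exit."""
    previous = is_grad_enabled()
    set_grad_enabled(False)
    try:
        yield
    finally:
        # Restore even if the body raised an exception.
        set_grad_enabled(previous)


def observe_from_worker():
    """Read the flag from a freshly started thread."""
    seen = []
    worker = threading.Thread(target=lambda: seen.append(is_grad_enabled()))
    worker.start()
    worker.join()
    return seen[0]


if __name__ == "__main__":
    with no_grad():
        # The main thread has the flag off; a new thread still sees the default.
        print(is_grad_enabled(), observe_from_worker())
    print(is_grad_enabled())
```

Because the flag lives in thread-local storage, disabling it in one thread does not affect other threads, and restoring the previous value in `finally` keeps nested or failing blocks from leaving the flag in the wrong state.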

46 Commits