Summary:
This PR adds machinery to cache the schema in an IR node, and allows lookups of (possibly) constant inputs by their names (instead of position). The new methods are:
- `at::optional<T> get<T>(Symbol name)` - if the argument called `name` is a constant, casts it to type `T` and returns it. If it's not a constant, returns `nullopt`. Raises an error if there's no argument with that name.
- `at::optional<IValue> get(Symbol name)` - like the above, but packs the result in an `IValue`
- `Value* getValue(Symbol name)` - retrieves a `Value*` for an argument (no need to know its position).
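For illustration, a hypothetical usage sketch of these lookups (the `aten::add` node and the `attr::other`/`attr::alpha` names below are just examples, not something added by this PR):
void inspectAdd(Node* n) {
  // assume n is an aten::add(self, other, alpha) node
  if (auto alpha = n->get<at::Scalar>(attr::alpha)) {
    // the "alpha" argument is a constant input; *alpha holds its value
  }
  Value* other = n->getValue(attr::other);  // no need to know the argument's position
  (void)other;
}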
All of the above functions currently inspect the attributes as well, but that's only so that I could start using them in other places in the JIT without disrupting our current functionality. I wanted this diff to be a preparation that doesn't change the semantics too much, so both the tracer and the script compiler still create nodes with attributes. The next PR will put a stop to that, and the changes we need to make to other components will hopefully be simpler thanks to the groundwork done here.
One more thing I'd like to do before we actually stop creating attributed nodes is to have a convenient way of creating a schema programmatically, matching nodes against it, and creating nodes without having to pack inputs into flat argument lists (which is quite error prone).
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9505
Reviewed By: ezyang
Differential Revision: D8915496
Pulled By: apaszke
fbshipit-source-id: 39d14fc9a9d73d8494f128367bf70357dbba83f5
Summary:
As in the title. Lets us simplify a lot of code.
Depends on #9363, so please review only the last commit.
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9414
Reviewed By: zdevito
Differential Revision: D8836496
Pulled By: apaszke
fbshipit-source-id: 9b3c3d1f001a9dc522f8478abc005b6b86cfa3e3
* this removes the flag controlling whether the interpreter works on variables.
* now the interpreter _always_ works on variables
* constants in the IR are still _always_ non-variables, and an assert was added to ensure this.
* as_tensor was split into as_variable and as_tensor since it is sometimes used
to construct constants in the IR
* I tried changing the IR to also always use variables, but that change was much more
cross-cutting and fragile, and I never got it working
When tracing, we record expand nodes. This is useful in some cases because
it makes it clear that a broadcast happened. However, in future runs
the broadcast may be different or not needed at all. This change adds an
attribute to expand to track whether it was implicitly added; this
takes the form of an unused input to expand with a default value.
The execution engine then removes implicit expands before execution.
Note that shape_analysis will re-add expands when it can prove by
shape analysis that they will exist, and this is useful for the fuser,
so this change should not affect fusion passes.
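As a rough sketch of the kind of cleanup this enables (the isImplicitExpand helper is hypothetical, and the actual pass in the execution engine may look different):
void removeImplicitExpands(Graph& graph) {
  for (auto it = graph.nodes().begin(); it != graph.nodes().end(); ) {
    Node* node = *it;
    ++it;  // advance before possibly destroying the current node
    if (isImplicitExpand(node)) {  // hypothetical: checks the extra "implicit" input
      // forward the original tensor and drop the expand entirely
      node->output()->replaceAllUsesWith(node->inputs()[0]);
      node->destroy();
    }
  }
}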
This makes the JIT tracer much more robust, by allowing it to record
dependencies on tensor sizes. For example, if you were to trace this
function
def fn(x):
    return x.view(x.size(1), -1)
before this patch, then it would embed the actual value of x.size(1)
in the trace as a constant, making it very hard to have e.g. batch-size-independent
traces. Now, this will correctly record the dependency, and
will retrieve the size of x at every run.
The long-term fix is to remove the handle-creating pathways and
remove all the modes from PythonOp, making it into an op that simply
calls a PyObject. Right now ONNX expects PythonOp to hold an
nn.Function, not a generic callable, so completely removing the legacy
pathway will also require changes to how ONNX symbolics are found.
* Have ScriptModule inherit from Module
This is accomplished by creating replacement _parameters, _buffers,
and _modules which implement the OrderedDict APIs but which
actually get/set their members inside script::Module
* Merge TracedModule with ScriptModule
* Move the logic of attribute handling into the Python bindings rather than
make script::Module handle it. This was redundant with nn.Module,
which already handles attributes.
* Make TracedModule a subclass of ScriptModule
* Move handling of attribute kind logic into bindings.
* Allow ScriptModule to contain non-script module submodules.
* Namespaced symbols
- Our interned strings now have structure: "ns::symname" rather than just
"symname" as before. We support efficient namespace testing for uniques
by encoding the namespace in one byte of the Symbol internal representation.
See torch/csrc/jit/interned_strings.h for a more in-depth implementation
discussion.
- All uses of ksymbol are now attr::symbol (or some appropriate namespace).
The valid namespaces are prim, attr, onnx and aten.
- Symbol is bound in Python as a qualified string "attr::symbol", EXCEPT for the
attribute setting/getting API, whose symbols must always be attr
symbols; they get special cased to assume strings are passed.
There's a little bit of naughtiness in the implementation, maybe you know
how to solve it.
- However, the g.op() convenience function assumes that you're generating
ONNX operators, unless you explicitly qualify.
- All ATen operators and nodes have built-in interned strings generated
for them, so you should never have to write a string literal ever again.
The tracing code is adjusted to use it.
- ONNX exporter now properly tests to see that all operators are in
the onnx namespace before accepting the export. This is way more
robust than the previous exporter, which would be willing to
export capitalized operators which were not actually ONNX operators.
- A slight organizational change for symbolic.py; this module now ONLY
contains aten operators. In particular, the exporter for Constant
has moved into utils.py (along with Undefined, from the C++ side),
since primitive ops get "special treatment."
- The un-inplacing logic in recording is more robust, so that we don't
delete a trailing underscore from __and__. This never affected us
before because we didn't have any tests for it.
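A small C++ sketch of what this looks like in practice (the namespace-test accessor below is an assumed name; see torch/csrc/jit/interned_strings.h for the real API):
#include "torch/csrc/jit/interned_strings.h"
using namespace torch::jit;

void example() {
  Symbol add = aten::add;   // built-in symbol generated for the ATen operator, i.e. "aten::add"
  Symbol dim = attr::dim;   // attribute symbol, i.e. "attr::dim"
  // namespace tests are cheap because the namespace is packed into one byte
  // of the Symbol representation (accessor name assumed here)
  bool is_attribute = dim.is_attr();
  (void)add; (void)is_attribute;
}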
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This PR adds the ability to build the C++ parts of autograd and the JIT with no dependency on Python.
The goal is to allow taking a PyTorch IR representation (a tree s-expr) and running it with provided inputs.
Prerequisite: build PyTorch so that codegen runs once.
Instructions:
cd tools/cpp_build
bash build_all.sh
This will build libtorchjit and torchjit_test in tools/cpp_build/build/torchjit-build. The latter basically runs the code in test_jit.cpp for now.
While writing the PR, it turned out that a few Python.h includes were redundant. They were removed here (PyTorch tests still pass on my machine; we'll see what CI says).
* Introduce Python-free builds of autograd and jit
* Remove NO_PYTHON ifdef in functions/special
* Add source information to IR nodes
SourceRange information from the script is now propagated to IR nodes.
This information is only used in two places for now: the interpreter
wraps errors that occur when an instruction executes, and shape
propagation now reports errors on the line where it fails:
Traceback (most recent call last):
  File "test/test_jit.py", line 1655, in test_script_error
    bar(Variable(torch.rand(10), requires_grad=True), Variable(torch.rand(9), requires_grad=True))
RuntimeError:
The size of tensor a (10) must match the size of tensor b (9) at non-singleton dimension 0:
@torch.jit.script
def bar(c, b):
    return c / b
           ~~~~~ <--- HERE
In the future, shape propagation should really not report any size
errors and instead just not propagate shapes and let the actual
execution fail. However, this is hard to accomplish while we still
depend on running the op to do shape propagation.
* Improve Variable interface
* Address comments from @apaszke and @colesbury
* string::operator= is not noexcept
* Remove ir.h from tracer_state.h to improve build times
* Make Variable a struct and pack SavedVariable fields
* Implement as_variable_ref
* grad_fn_ptr() -> grad_fn_unsafe()
* Reduce hackiness of set_type hack
* Include variable.h and edge.h in tracer_state.h because it uses them
* class Variable -> struct Variable because Windows can't even
* Make Variable::output_nr uint32_t instead of int
* Add comment about tracing state
* Replaced more static_cast<Variable&> and improve docs
* Remove SavedVariable destructor and construct members in init list
* Clarify docs for Variable
* Variable::set_version -> set_version_counter
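For illustration, a tiny sketch of a couple of the accessors mentioned above (usage is inferred from the bullet names, not taken from the diff):
void example(at::Tensor& tensor) {
  // as_variable_ref downcasts a Tensor that is known to actually be a Variable
  torch::autograd::Variable& var = torch::autograd::as_variable_ref(tensor);
  uint32_t nr = var.output_nr();  // output_nr is now unsigned
  (void)nr;
}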
The Tensor and Variable classes are being merged.
autograd.Function.forward is now called on Variables, but with "no-grad"
mode (torch.no_grad()) enabled.
One benefit is that we no longer have to explicitly track shared
storages.
* Fix #4480 by tracing inputs before running the function.
The DCE trick says that if I have y = f(x), and f is internally implemented as
g, it's OK to trace both g and f. Recall the tracing algorithm is:
enter f(x)
compute its result y
trace y = f(x)
return from f
So when you run the example above, you'll do this:
# suppose x is mapped to %1
enter f(x)
enter g(x)
result of g is y
trace y = g(x a.k.a. %1) (mapping y to %2)
return from g
result of f is y
trace y = f(x a.k.a. %1) (remapping y to %3)
return from f
and end up with a trace like this:
%2 = g(%1)
%3 = f(%1)
... only %3 is live, because %2 was killed from the mapping... Subsequent DCE
will eliminate the invocation of g and you'll only see f in the final trace.
However, if f and g are inplace functions, the machinery breaks:
# suppose x is mapped to %1
enter f(x)
enter g(x)
result of g is x
trace x = g(x a.k.a. %1) (remapping x to %2)
return from g
result of f is x
trace x = f(x a.k.a. %2) (remapping x to %3)
return from f
resulting in:
%2 = g(%1)
%3 = f(%2) # OOPS
This commit changes the strategy so we instead do this:
enter f(x)
trace f(x)
compute its result y
trace y = f(x) (computed above)
return from f
Now we get the correct Value before it is overwritten.
Here is what the new trace code looks like:
jit::tracer::PreTraceInfo trace_info;
if (jit::tracer::isTracing( self, index )) {
  trace_info = jit::tracer::preRecordTrace( "index_fill", { self, index } );
  setattr(trace_info.n, jit::Symbol("dim"), dim);
  setattr(trace_info.n, jit::Symbol("value"), value);
}
baseType->index_fill_(self_, dim, index_, value);
increment_version(self);
rebase_history(self, grad_fn);
if (trace_info.state != nullptr) {
  jit::tracer::postRecordTrace( trace_info, { self } );
}
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Revert "Hot patch ONNX _run_symbolic_function"
This reverts commit d1c973fee1.
* lintfix
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add missing expect file
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Previously, Symbol was just a uint32_t, and we converted with symbolToString and
stringToSymbol. Now Symbol is a struct with a toString method, and
constructors from either BuiltinSymbol enums (e.g. kParam) or strings.
Symbol is convertible to a uint32_t to ensure it can still be used in
switch statements with BuiltinSymbol case branches.
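A brief usage sketch of the new Symbol (illustrative only, based on the description above):
void example() {
  Symbol from_enum = kParam;                // constructed from a BuiltinSymbol enum value
  Symbol from_string("my_custom_attr");     // constructed (interned) from a string
  const char* name = from_enum.toString();  // back to a string
  switch (from_enum) {                      // still usable in a switch via the uint32_t conversion
    case kParam:
      break;
    default:
      break;
  }
  (void)from_string; (void)name;
}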
This commit adds a Value type similar to the one @ezyang suggested a while
ago for handling multi-return nodes.
Previously if we had a graph like:
a = op1(b)
c, d = op2(a)
Then its in-memory format would look like:
%0 = op1(b)
%1 = op2(%0)
%2 = select(%1, 0)
%3 = select(%1, 1)
Select nodes were used only to handle the multi-output case; in the
single-output case, uses referred directly to the op itself.
This required special handling for the single- and multi-output cases,
and was confusing when used with ONNX, which distinguishes values (the
inputs/outputs of a node) from the nodes themselves (e.g. a Conv).
This commit adds the Node/Value distinction to the IR. In the example
above, `a`, `b`, `c`, and `d` are now Value objects, while `op1` and
`op2` are now Node objects. Inputs/Outputs to the graph are values.
* Nodes now always have multiple outputs, accessible through their `outputs()`
method.
* Methods exist for adding/removing outputs from a node.
* Nodes own their output Values; destroying a node destroys its outputs, and it
is only valid to destroy a node when no uses of its outputs remain.
* Unlike select, Values do not appear in the nodes list.
* The method `node()` on `Value` retrieves its defining node. Calling it
is always valid. For graph inputs, the defining node's kind is "Param". As with "Return", there is a single Param
node representing all inputs.
* For single-output Nodes, the method `output()` retrieves the single
output Value, asserting that the node is in fact single-output.
* Functions are the same, but some functions like `type()` have moved to
Value.
* `replaceAllUsesWith` is now sanely defined for both Values and Nodes.
In the case of Nodes, it replaces all outputs of the node with the outputs
of the replacement node.
* stage is defined on both Node and Value. This is because inputs require a stage.
* Apart from changing data types from Node to Value, most passes remain the same.
Things that previously assumed single-output nodes now have to call output()
to get the output Value.
* This removes the uses = [...] field from the printed output, because it was
getting confusing even before this commit, when uses would refer to nodes
but we print the names of Values. The lint pass validates the use list,
so printing it out seems less necessary.
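A small sketch of the new API in terms of the methods listed above (variable names are illustrative):
void example(Node* add, Node* split) {
  // add is a single-output node: output() asserts this and returns its Value
  Value* sum = add->output();
  // split has multiple outputs, accessed through outputs()
  Value* first = split->outputs()[0];
  // every Value knows its defining Node; for graph inputs, the node's kind is Param
  Node* producer = first->node();
  // redirect all uses of one Value to another
  sum->replaceAllUsesWith(first);
  (void)producer;
}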
* Update comments and size logic
* Record stack traces during JIT tracing
* Use string helper functions and AutoGIL
* Use SourceLocation object instead of storing in debugName
* Address zdevito comments
* Address comments
* update fuser to match ATen-formatted JIT ops
* fix concat optimizations and add test
* allow onnx export to work with single-export functions
* fix onnx handling of multi-return nodes.
* nits, format, vision test update
* fix add constant
* fix driver init issues
* Add missing Neg symbolic.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Fix clang-802.0.42 tuple overload bug, fixes #3234.
Originally, my plan for emit_record_trace was to keep it as
simple as possible, even at the expense of some somewhat ugly
overloads. So this meant we had a 'recordTrace' function
with overloads like this:
recordTrace(..., const Variable& out)
recordTrace(..., const std::tuple<Variable, Variable>& out)
Unfortunately, this triggers a bug in clang-802.0.42
(widely used in macOS Sierra 10.12.6) wherein a Variable is
implicitly convertible into a std::tuple<Variable, Variable>;
a minimal repro can be seen below:
#include <tuple>
struct T {};
void f(const std::tuple<T, T>&) {}
void g(T& x) { f(x); }
To work around this bug, the code generator is a bit more
complicated, and is taught how to handle this situation.
Previously the generated code looked like:
jit::tracer::recordTrace( "min", { self }, ret );
Now it looks like:
jit::tracer::recordTrace( "min", { self }, { std::get<0>(ret), std::get<1>(ret) } );
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* CR comments
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
The general strategy is that there is a new module, torch.onnx.symbolic, which
contains a function for every ATen method name with the ONNX translation.
While implementing this, I took the opportunity to expunge all references
to 'g' from the public API; instead, it is managed by a global variable in
torch.onnx which tracks the "current graph".
Other changes:
- If you pass a Tensor to op as an argument, it will now automatically be
converted into a Constant ONNX node. This lets us remove needing to
implement ONNX
- Rename value to other, wherever there is both a Scalar and Tensor overload.
This way, keyword dispatch can work uniformly in both cases.
- Deleted any autograd Function classes that both had a symbolic and were ported
to the new C++ autograd implementation. There may still be some straggling
classes that didn't have a symbolic.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
The generated tracing code looks like this:
if (jit::tracer::isTracing({ self })) {
  jit::Node *n = jit::tracer::recordTrace( "mean", { self }, ret );
  n->rawSet(jit::stringToSymbol("dim"), dim);
  n->rawSet(jit::stringToSymbol("keepdim"), keepdim);
}
A few design decisions I made:
- Instead of making the assignment of 'n' conditional on whether or not
attributes are present, I just add (void)n if it would not be used
otherwise. This modestly simplifies code generation.
- Tracing of operations that involve Generator or Storage are not supported.
This is fine because such ops don't take any Variable arguments anyway,
so they couldn't trigger tracing.
- Unfortunately, at::ArrayRef is not covariant, so there is some faffing about
to support conversions from at::ArrayRef<Tensor> (aka TensorList) to
at::ArrayRef<Variable>. In the case of 'recordTrace' (the slow path), I just
allocated an intermediate std::vector to get the types correct; in the case
of isTracing (the fast path) there are three overloads to avoid refcount bumping
when possible.
- Tracing is all in one place, rather than scattered between the beginning
and end of an ATen function, as Sam suggested.
- This commit doesn't actually enable ATen definitions.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Variable is now a subclass of at::Tensor backed by a VariableImpl* pImpl. The implementation of the ATen functions is defined in the auto-generated VariableType.h/cpp file.
Currently, only functions which fall through to the base type, such as sizes() and isCuda(), are implemented. Differentiable ops like add() and mul() will be added in a subsequent PR.
* Variables now hold a list of ValueTracingStates and can participate
in multiple traces.
* Refactored Traceable to maintain a list of traces, and only stop
tracing once it records all stages