Summary:
**Review last commit only.** Stacked on top of #10949.
This commit fixes a number of issues connected with caching the
differentiability status of graphs inside graph executors,
and changes the rules for optimizing differentiable subgraphs.
Previously, every differentiable subgraph was instantiated as a separate graph
executor; now they are simply more heavily optimized graph regions,
and graph executors are only instantiated for their backward passes.
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10977
Differential Revision: D9600626
Pulled By: apaszke
fbshipit-source-id: dad09a0f586e396afbd5406319c1cd54fbb8a3d3
Summary:
More clang tidy cleanups in `torch/csrc`. This time:
1. `hicpp-use-equals-default` recommends `= default` instead of `{}` for empty constructors/destructors. This is better practice because it expresses the intent more clearly (https://stackoverflow.com/questions/6502828/what-does-default-mean-after-a-class-function-declaration); see the sketch after this list.
2. `readability-inconsistent-declaration-parameter-name` enforces that parameter names in the declaration match parameter names in the definition. This is just generally useful and can prevent confusion and bugs.
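To illustrate the first check, a small before/after sketch (hypothetical types, not code from this PR):
```cpp
#include <string>

// What hicpp-use-equals-default flags: an empty body hides the fact that the
// destructor does nothing beyond the default behavior.
struct WidgetBefore {
  std::string name;
  ~WidgetBefore() {}  // clang-tidy: use '= default' instead of an empty body
};

// The recommended spelling: same behavior, but the intent is explicit, and a
// defaulted special member can stay trivial when the members allow it.
struct WidgetAfter {
  std::string name;
  ~WidgetAfter() = default;
};
```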
Also updated my script a little bit.
apaszke ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9737
Differential Revision: D9069069
Pulled By: goldsborough
fbshipit-source-id: f7b3f3a4eb4c9fadc30425a153566d3b613a41ae
Summary:
I got some tensor->variable conversion exceptions from `torch/csrc/autograd/variable.h`, which used the `TORCH_ASSERTM` macros instead of `AT_CHECK`, so they didn't have backtraces. This was such a substantial loss for debuggability that I decided to update the whole codebase to use the backtrace-enabled ATen macros instead of `TORCH_ASSERT` and `JIT_ASSERT`, the latter being an alias of the former.
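To illustrate what the backtrace-enabled checks buy, here is a hypothetical stand-in (not ATen's actual implementation of `AT_CHECK`): a check that throws instead of aborting, so the failure carries context and can surface a backtrace.
```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical sketch: fail by throwing an exception that carries file/line
// context, so the error can be caught, reported, and surfaced to Python.
#define CHECK_WITH_CONTEXT(cond, msg)                                  \
  do {                                                                 \
    if (!(cond)) {                                                     \
      std::ostringstream ss;                                           \
      ss << msg << " (" << __FILE__ << ":" << __LINE__ << ")";         \
      /* AT_CHECK additionally captures a stack trace at this point. */\
      throw std::runtime_error(ss.str());                              \
    }                                                                  \
  } while (0)

// Usage mirroring the variable.h case: a tensor->variable conversion guard.
void as_variable_check(bool is_variable) {
  CHECK_WITH_CONTEXT(is_variable, "expected a Variable, got a plain Tensor");
}
```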
ezyang apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9575
Differential Revision: D8924566
Pulled By: goldsborough
fbshipit-source-id: 7a4013b13eec9dbf024cef94cf49fca72f61d441
Summary:
I'm working through the warnings emitted by clang-tidy. This PR addresses the `hicpp-use-override` check, which warns that `virtual` + `override` is redundant, since `override` already signifies that a function overrides a base-class virtual function and is therefore itself virtual.
Where there was `virtual` + `override`, I removed the `virtual`; where there was `virtual` and no `override`, I removed `virtual` and added `override`.
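For illustration (hypothetical classes, not code from this PR), the two situations look like this:
```cpp
struct Function {
  virtual ~Function() = default;
  virtual void apply() {}
};

// Redundant: 'override' already implies the function is virtual.
struct AddBefore : Function {
  virtual void apply() override {}  // flagged: drop 'virtual'
};

// Missing 'override': correct, but nothing guards against signature drift.
struct AddMissing : Function {
  virtual void apply() {}  // rewritten to: void apply() override {}
};

// After the cleanup: 'override' alone documents and enforces the override.
struct AddAfter : Function {
  void apply() override {}
};
```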
ezyang apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9335
Differential Revision: D8807082
Pulled By: goldsborough
fbshipit-source-id: e0a261053f6540a22cc56ec160a24aa285af6319
* this removes the flag controlling whether the interpreter works on variables.
* now the interpreter _always_ works on variables
* constants in the IR are still _always_ non-variables, and an assert was added to ensure this.
* as_tensor was split into as_variable and as_tensor since it is sometimes used
to construct constants in the IR
* I tried changing the IR to also always use variables, but that change was much more
cross-cutting and fragile, and I never got it working
* Fixes to the way script handles multiple values, and other minor fixes.
This commit improves our handling of operators that return multiple values.
Builtins are now checked so that they return the right number of values,
and support for TupleValue is extended to all things that can return
multiple values.
This resolves issues where the compiler accepted things like:
    a, b = c + c
which would cause the interpreter to crash. Now each operator knows
how many results it will produce and can check that against the number
of requested outputs.
Notes:
* Allow True/False literals in constant expressions
* make handling of keyword constants more consistent to support True/False
* make parsing constants match the way we construct constants from python
* improve the error messages when accessing bad graph attributes.
* switch findTensorOp to return an optional.
* check that attribute types are correct in findTensorOp
* Check the correct number of outputs for builtins
This also changes emitExpr to return a single SugaredValue.
Rather than possibly returning multiple values, emitExpr now
always returns a single value, which _might_ be a tuple. This approach
more closely follows Python, making the code easier to follow.
Checks for returning the right number of values are now located in
the assignment operator, and occur when unpacking the tuple.
We still pass `n_binders` to function calls so that calls into python
know how many values they should return.
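A minimal conceptual sketch of that unpacking check (hypothetical types and names, not the actual compiler classes):
```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a SugaredValue: either a single value or a tuple.
struct EmittedValue {
  std::vector<EmittedValue> elements;  // non-empty => this value is a tuple
  bool isTuple() const { return !elements.empty(); }
};

// Assignment is where the arity check now lives: unpack the single value
// returned by emitExpr and verify it matches the number of binders.
std::vector<EmittedValue> unpackAssignment(const EmittedValue& rhs,
                                           std::size_t n_binders) {
  if (!rhs.isTuple()) {
    if (n_binders != 1)
      throw std::runtime_error("expression produces 1 value but " +
                               std::to_string(n_binders) + " are requested");
    return {rhs};
  }
  if (rhs.elements.size() != n_binders)
    throw std::runtime_error("cannot unpack a tuple of " +
                             std::to_string(rhs.elements.size()) +
                             " values into " + std::to_string(n_binders) +
                             " targets");
  return rhs.elements;
}
```
Under this scheme, `a, b = c + c` is rejected when the assignment is compiled, because `+` produces a single non-tuple value, rather than crashing the interpreter later.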
* Namespaced symbols
- Our interned strings now have structure: "ns::symname" rather than just
"symname" as before. We support efficient namespace testing for uniques
by encoding the namespace in one byte of the Symbol internal representation.
See torch/csrc/jit/interned_strings.h for a more in-depth implementation
discussion, and the short sketch after this list.
- All uses of ksymbol are now attr::symbol (or some appropriate namespace).
The valid namespaces are prim, attr, onnx and aten.
- Symbol is bound in Python as a qualified string "attr::symbol", EXCEPT for the
attribute setting/getting API, whose symbols must always be attr
symbols; they get special cased to assume strings are passed.
There's a little bit of naughtiness in the implementation, maybe you know
how to solve it.
- However, the g.op() convenience function assumes that you're generating
ONNX operators, unless you explicitly qualify.
- All ATen operators and nodes have built-in interned strings generated
for them, so you should never have to write a string literal ever again.
The tracing code is adjusted to use them.
- The ONNX exporter now properly checks that all operators are in the
onnx namespace before accepting the export. This is far more
robust than the previous exporter, which was willing to
export capitalized operators that were not actually ONNX operators.
- A slight organizational change for symbolic.py; this module now ONLY
contains aten operators. In particular, the exporter for Constant
has moved into utils.py (along with Undefined, from the C++ side),
since primitive ops get "special treatment."
- The un-inplacing logic in recording is more robust, so that we don't
delete a trailing underscore from `__and__`. This never affected us
before because we didn't have any tests for it.
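A minimal sketch of the one-byte namespace encoding mentioned in the first bullet (hypothetical helper names; the real layout lives in torch/csrc/jit/interned_strings.h):
```cpp
#include <cstdint>

// Hypothetical namespace tags; the valid set is prim, attr, onnx, and aten.
enum class SymbolNamespace : uint8_t { prim = 1, attr = 2, onnx = 3, aten = 4 };

// Pack the namespace into the top byte of the 32-bit unique, so namespace
// membership tests are a shift rather than a string comparison.
constexpr uint32_t makeSymbol(SymbolNamespace ns, uint32_t unique) {
  return (static_cast<uint32_t>(ns) << 24) | (unique & 0x00FFFFFFu);
}

constexpr SymbolNamespace namespaceOf(uint32_t symbol) {
  return static_cast<SymbolNamespace>(symbol >> 24);
}

constexpr bool isAttr(uint32_t symbol) {
  return namespaceOf(symbol) == SymbolNamespace::attr;
}

static_assert(isAttr(makeSymbol(SymbolNamespace::attr, 42)),
              "attr symbols should test as attr");
```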
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Previously, Symbol was just a uint32_t, and we converted with symbolToString and
stringToSymbol. Now Symbol is a struct with a toString method and
constructors from either BuiltinSymbol enums (e.g. kParam) or strings.
Symbol is convertible to a uint32_t to ensure it can still be used in
switch-statement BuiltinSymbol case branches.
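A simplified sketch of that shape (hypothetical and stubbed out, not the real class):
```cpp
#include <cstdint>
#include <string>

enum BuiltinSymbol : uint32_t { kParam = 1, kReturn = 2 /* ... */ };

struct Symbol {
  uint32_t value;

  /* implicit */ Symbol(BuiltinSymbol s) : value(s) {}  // from the builtin enum
  explicit Symbol(const std::string& name) : value(0) {
    (void)name;  // the real constructor interns 'name' into a unique uint32_t
  }
  std::string toString() const { return "<symbol>"; }  // stand-in for the lookup
  operator uint32_t() const { return value; }          // keeps switches working
};

// Because Symbol still converts to uint32_t, existing switches compile unchanged:
int arity(Symbol kind) {
  switch (kind) {
    case kParam:  return 0;
    case kReturn: return 1;
    default:      return -1;
  }
}
```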
* Optimizer: Optimize transposes in a variety of circumstances (a rough sketch follows after this list)
- No-op transposes (remove them)
- Consecutive transposes (fuse them)
- Transposes into Gemm (fuse them into the transA/transB parameters)
* touch up out of date comment
* update fuser to match ATen-formatted JIT ops
* fix concat optimizations and add test
* allow onnx export to work with single-export functions
* fix onnx handling of multi-return nodes.
* nits, format, vision test update
* fix add constant
* fix driver init issues
* Add missing Neg symbolic.
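A rough sketch of the transpose optimizations listed above (hypothetical helpers, not the actual pass):
```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Perm = std::vector<int64_t>;

// If y = Transpose(x, perm = p1) and z = Transpose(y, perm = p2), then
// z.dim(i) = y.dim(p2[i]) = x.dim(p1[p2[i]]), so the two nodes can be
// replaced by a single Transpose with the composed permutation below.
Perm composeTransposes(const Perm& p1, const Perm& p2) {
  Perm out(p2.size());
  for (std::size_t i = 0; i < p2.size(); ++i) out[i] = p1[p2[i]];
  return out;
}

// A transpose whose permutation is the identity is a no-op and can be erased
// (or, when it feeds a Gemm, folded into the transA/transB attribute instead).
bool isNoopTranspose(const Perm& p) {
  for (std::size_t i = 0; i < p.size(); ++i)
    if (p[i] != static_cast<int64_t>(i)) return false;
  return true;
}
```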
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
The generated tracing code looks like this:
```cpp
if (jit::tracer::isTracing({ self })) {
  jit::Node *n = jit::tracer::recordTrace( "mean", { self }, ret );
  n->rawSet(jit::stringToSymbol("dim"), dim);
  n->rawSet(jit::stringToSymbol("keepdim"), keepdim);
}
```
A few design decisions I made:
- Instead of making the assignment of 'n' conditional on whether or not
attributes are present, I just add (void)n if it would not be used
otherwise. This modestly simplifies code generation.
- Tracing of operations that involve Generator or Storage are not supported.
This is fine because such ops don't take any Variable arguments anyway,
so they couldn't trigger tracing.
- Unfortunately, at::ArrayRef is not covariant, so there is some faffing about
to support conversions from at::ArrayRef<Tensor> (aka TensorList) to
at::ArrayRef<Variable>. In the case of 'recordTrace' (slow path), I just
allocated an intermediate std::vector to get the types correct; in the case
of isTracing (fast path) there are three overloads to avoid refcount bumping
when possible. (A minimal sketch of the covariance issue appears after this list.)
- Tracing is all in one place, rather than spattered between the beginning
and end of an ATen function, as Sam suggested.
- This commit doesn't actually enable ATen definitions.
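A minimal sketch of the ArrayRef covariance problem described above (hypothetical types, not ATen's ArrayRef or the real Variable):
```cpp
#include <cstddef>
#include <vector>

struct Tensor  { /* ... */ };
struct Variable : Tensor { /* ... */ };  // a Variable is-a Tensor, and yet:

// A non-owning view like at::ArrayRef: just a pointer and a length.
template <typename T>
struct Span {
  const T* data = nullptr;
  std::size_t len = 0;
  Span(const std::vector<T>& v) : data(v.data()), len(v.size()) {}
};

void record(Span<Variable> /*vars*/) {}

void caller(const std::vector<Tensor>& tensors) {
  // record(tensors);  // does not compile: Span<Tensor> is unrelated to
  //                   // Span<Variable>, regardless of the Tensor/Variable
  //                   // inheritance (element layouts may differ anyway).

  // Slow-path workaround analogous to recordTrace: materialize a vector of
  // the right element type and hand a view of that to the callee.
  std::vector<Variable> as_variables(tensors.size());
  // ... wrap each Tensor as a Variable here ...
  record(Span<Variable>(as_variables));
}
```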
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Since this code has been stable for a while, I think it's
a good opportunity to make it const correct. There is only
a slight increase in code size, which I hope will appease @zdevito.
- consts were added to all methods which are logically const. Most notably,
lint() is now declared const.
- I made extra const versions of Node::iterator(), Node::reverseIterator(),
Graph::nodes(), Attribute::find(), linked_list::begin(), linked_list::end(),
linked_list::rbegin(), linked_list::rend(); in all cases these were one-liners
except for find() (I spent a little time trying to make find() a one-liner
but didn't think of a way to do it).
- graph_node_list got factored out into a new, templated type linked_list<T>
(perhaps we should call it intrusive_list<T>). I had to template the iterator
to define constant and non-constant iterators without duplicating code,
and once I was there, I decided to templatize everything else. The code
nicely factors out, although I wouldn't recommend using it for anything
else without more refactoring.
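A minimal sketch of templating the iterator over constness, as described in the last bullet (hypothetical names, not the JIT's linked_list<T>):
```cpp
#include <type_traits>

template <typename T>
struct node {
  T value;
  node* next = nullptr;
};

// One iterator template serves both cases; IsConst selects the pointee type.
template <typename T, bool IsConst>
class list_iterator {
  using node_ptr  = std::conditional_t<IsConst, const node<T>*, node<T>*>;
  using reference = std::conditional_t<IsConst, const T&, T&>;
  node_ptr cur;

 public:
  explicit list_iterator(node_ptr n) : cur(n) {}
  reference operator*() const { return cur->value; }
  list_iterator& operator++() { cur = cur->next; return *this; }
  bool operator!=(const list_iterator& other) const { return cur != other.cur; }
};

template <typename T>
struct linked_list_sketch {
  node<T>* head = nullptr;

  list_iterator<T, false> begin()       { return list_iterator<T, false>(head); }
  list_iterator<T, false> end()         { return list_iterator<T, false>(nullptr); }
  list_iterator<T, true>  begin() const { return list_iterator<T, true>(head); }
  list_iterator<T, true>  end()   const { return list_iterator<T, true>(nullptr); }
  // rbegin()/rend() and find() follow the same pattern.
};
```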
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
previously:
PythonOp/CppOp Graph -> ToffeeIR, primspecs worked with protobufs
now:
PythonOp/CppOp --ToToffeeIR--> jit::Graph of in-memory ToffeeIR -> protobufs of ToffeeIR
This commit lets primspec functions work directly with JIT IR nodes,
which makes it possible to do a lot more stuff in those functions.