Summary:
Now, running `python test/onnx/test_operators.py --no-onnx` does not introduce any ONNX Python dependency (no onnx/protobuf Python packages need to be installed).
The major changes:
- output pbtxt from the C++ exporter directly, so the floating-point formatting may be slightly different. (This should be fine, since it only exists to guard ONNX exporting.)
- ONNX Python packages are only imported if we run the ONNX-related checks. Those checks are disabled by the `--no-onnx` flag (see the sketch below).
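A minimal sketch of the lazy-import pattern this enables (the flag and helper names below are hypothetical, not the actual test-harness code):
```python
_onnx_checks_enabled = True  # hypothetical flag, flipped to False by --no-onnx

def run_onnx_checks(model_bytes):
    # When --no-onnx is passed, return early so the onnx/protobuf Python
    # packages never need to be installed or imported.
    if not _onnx_checks_enabled:
        return
    import onnx  # imported lazily, only when ONNX-level checks actually run
    onnx.checker.check_model(onnx.load_from_string(model_bytes))
```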
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10151
Reviewed By: jamesr66a
Differential Revision: D9130706
Pulled By: houseroad
fbshipit-source-id: ea28cf5db8399929179698ee535137f209e9ce6f
Summary:
Copy of #10191 because these changes didn't land with the diff.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10394
Differential Revision: D9260816
Pulled By: li-roy
fbshipit-source-id: 7dc16919cfab6221fda1d44e98c5b900cfb40558
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10130
Update some include paths to make them internally consistent
Reviewed By: ezyang
Differential Revision: D9119906
fbshipit-source-id: b44e5cab8e8e795ee18afe9ffc6caf1f2b413467
Summary:
Follow-up to #9584.
Commit 1:
- change expect/cast to return shared pointers instead of raw pointers
- isSubtypeOf now accepts a TypePtr. Use `x->isSubtypeOf(NumberType::get())` rather than `x->isSubtypeOf(*NumberType::get())`
Commit 2:
- to address enable_shared_from_this pitfalls, make the constructor private and expose a factory method, so users can only create instances through the factory.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9786
Reviewed By: zdevito
Differential Revision: D8980441
Pulled By: wanchaol
fbshipit-source-id: e5c923fc57a701014310e77cf29985b43bb25364
Summary:
I got some tensor->variable conversion exceptions from `torch/csrc/autograd/variable.h`, which used the `TORCH_ASSERTM` macros instead of `AT_CHECK`, so they didn't have backtraces. This was such a substantial loss for debuggability that I decided to update the whole codebase to use the backtrace-enabled ATen macros instead of `TORCH_ASSERT` and `JIT_ASSERT`, the latter having been an alias of the former.
ezyang apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9575
Differential Revision: D8924566
Pulled By: goldsborough
fbshipit-source-id: 7a4013b13eec9dbf024cef94cf49fca72f61d441
* Workaround in onnx to get transposes into init_nets
This adds a pass to the ONNX exporter that speculates (hoists) Transpose
operators so that ONNX's split pass can put them into an init_net.
It also fixes a potential bug in the ONNX peephole pass where an optimization
across blocks might move a Value and violate scoping.
* Perform shape propagation when embedding a program into a trace.
This ensures the trace still has type information specific to that trace, which will help ONNX export succeed in more cases.
When no explicit hidden state is provided, a default is created by
constructing a new Variable filled with zeros. This gets traced as a
Constant operator, which hardcodes the batch size.
To fix this, we remove such Constant operators in an 'optimization'
pass. We could also have fixed it by not generating the Constant in the
first place, but this is the least invasive fix from the perspective of
the pure PyTorch codebase.
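A minimal sketch of the situation described above: tracing an RNN without an explicit hidden state, so the default zero state ends up inside the trace (in older tracer versions as a Constant whose shape bakes in the batch size).
```python
import torch

lstm = torch.nn.LSTM(input_size=4, hidden_size=8)
x = torch.randn(5, 3, 4)              # (seq_len, batch=3, feature)

# No (h_0, c_0) is passed, so a zeros default is created inside forward();
# without the removal pass, that default is recorded as a Constant sized
# for batch 3, and the trace only works for that batch size.
traced = torch.jit.trace(lstm, x)
print(traced.graph)
```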
Allows you to export an ONNX model as:
- Protobuf file (this is what we have now)
- Uncompressed zip archive
- Compressed zip archive
- Directory
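A minimal sketch of selecting a format, assuming an `export_type` keyword and a `torch.onnx.ExportTypes` enum with members matching the list above:
```python
import torch

model = torch.nn.Linear(3, 2)
dummy = torch.randn(1, 3)

# Default: a single protobuf file, as before.
torch.onnx.export(model, (dummy,), "model.onnx")

# Assumed keyword/enum added by this change; other members would cover the
# uncompressed/compressed zip archive cases.
torch.onnx.export(model, (dummy,), "model_dir",
                  export_type=torch.onnx.ExportTypes.DIRECTORY)
```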
* Experimental support for different ONNX export types
* Remove a copy
* Add comment
* Add test cases
* lint
* fix bug
* address comments
* Namespaced symbols
- Our interned strings now have structure, "ns::symname" rather than just
"symname" before. We support efficient namespace testing for uniques
by encoding the namespace in one byte in the Symbol internal representation.
See torch/csrc/jit/interned_strings.h for a more in-depth implementation
discussion.
- All uses of ksymbol are now attr::symbol (or some appropriate namespace).
The valid namespaces are prim, attr, onnx and aten.
- Symbol is bound in Python as a qualified string "attr::symbol", EXCEPT for the
attribute setting/getting API, whose symbols must always be attr
symbols; they get special cased to assume strings are passed.
There's a little bit of naughtiness in the implementation, maybe you know
how to solve it.
- However, the g.op() convenience function assumes that you're generating
ONNX operators, unless you explicitly qualify.
- All ATen operators and nodes have built-in interned strings generated
for them, so you should never have to write a string literal ever again.
The tracing code is adjusted to use it.
- ONNX exporter now properly tests to see that all operators are in
onnx namespace before accepting the export. This is way more
robust than the previous exporter, which would be willing to
export capitalized operators which were not actually ONNX operators.
- A slight organizational change for symbolic.py; this module now ONLY
contains aten operators. In particular, the exporter for Constant
has moved into utils.py (along with Undefined, from the C++ side),
since primitive ops get "special treatment."
- The un-inplacing logic in recording is more robust, so that we don't
delete a trailing underscore from __and__. This never affected us
before because we didn't have any tests for it.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
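For illustration, a minimal sketch of the g.op() behavior described above, using the usual symbolic-function shape (the operator chosen and the function name are illustrative only):
```python
def elu_symbolic(g, input, alpha):
    # Unqualified names are taken to be ONNX operators, i.e. this emits
    # onnx::Elu; anything else needs an explicit "ns::" qualification.
    # The "_f" suffix marks a float attribute named "alpha".
    return g.op("Elu", input, alpha_f=alpha)
```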
* PyObject* <--> at::Tensor no longer unwraps variables; instead, we expect end users to always work with variable types, and we only unwrap the variables when we optimize.
* Add torch::CPU, torch::CUDA and torch::getType
* at::CPU -> torch::CPU in extensions
* Add source information to IR nodes
SourceRange information from the script is now propagated to IR nodes.
This information is only used in two places for now: the interpreter
wraps errors that occur when an instruction executes, and shape
propagation now reports errors on the line where it fails:
Traceback (most recent call last):
File "test/test_jit.py", line 1655, in test_script_error
bar(Variable(torch.rand(10), requires_grad=True), Variable(torch.rand(9), requires_grad=True))
RuntimeError:
The size of tensor a (10) must match the size of tensor b (9) at non-singleton dimension 0:
@torch.jit.script
def bar(c, b):
return c / b
~~~~~ <--- HERE
In the future, shape propagation should really not report any size
errors and instead just not propagate shapes and let the actual
execution fail. However, this is hard to accomplish while we still
depend on running the op to do shape propagation.
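A runnable version of the example above (a minimal sketch), showing the error annotated with the offending script line:
```python
import torch

@torch.jit.script
def bar(c, b):
    return c / b

try:
    bar(torch.rand(10), torch.rand(9))
except RuntimeError as err:
    print(err)   # the message points at the `c / b` line in the script
```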
Support shape propagation with control-flow
* This allows us to enable optimization in the GraphExecutor for most
script tests.
* Changes Type to always be present (non-null) on a Value, removing `hasType()`
and `typeOption()`. A new type kind 'DynamicType' now represents when
a specific type has not been determined.
* If/Loop nodes propagate shapes/types in the simple cases where types of
outputs do not change depending on where control flows. In other
cases, we propagate DynamicType to indicate we do not know what
the shape will be.
* Remove the `cond` input to the body of Loop to simplify handling in
interpreter and shape propagation.
* Bugfix for zero-dim contiguousStridesOf
* PackedSequence: store batch_sizes as tensor
rather than converting to a list of Python integers. This maintains
the invariant that a module's inputs/outputs are collections of
Variables.
In particular, this causes the JIT to no longer choke when flattening
and unflattening arguments.
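A minimal sketch of the new invariant (batch_sizes is a Tensor, not a list of Python ints):
```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

padded = torch.randn(4, 2, 3)                  # (seq_len, batch, feature)
packed = pack_padded_sequence(padded, lengths=[4, 2])

# batch_sizes is stored as a Tensor, so every field of the PackedSequence
# is a tensor the JIT can flatten and unflatten.
print(type(packed.batch_sizes))                # <class 'torch.Tensor'>
```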
* Handle sequence lengths correctly when exporting RNNs to ONNX
- when uniform sequence lengths are provided, correctly omit the
argument when constructing the ONNX graph, so as to not fix the
graph to the batch size.
- handle PackedSequences by floating them through the graph and
eliminating them in an optimization pass. ONNX does not have packed
sequences, but operates on a representation equivalent to
PaddedSequence, so we hide the representation-switching from ONNX
- as a preliminary step towards handling PackedSequences, not directly
tied to ONNX export, change batch_sizes from being an argument to
the RNN operators into being an argument to the forward() function
of those RNN operators. This more closely models the reality that
batch_sizes are effectively part of the input sequences.
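A minimal sketch of the first point above: exporting an RNN fed with uniform-length sequences, where the exporter omits the sequence-length input rather than hard-coding one sized to the example batch.
```python
import torch

lstm = torch.nn.LSTM(input_size=4, hidden_size=8)
x = torch.randn(5, 3, 4)                   # uniform lengths, batch of 3

# No sequence-length input is emitted into the ONNX graph here, so the
# exported graph is not pinned to the example batch size of 3.
torch.onnx.export(lstm, (x,), "lstm.onnx")
```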
This commit is getting the IR ready for representing ONNX control flow.
It adds nested blocks to the IR.
* Each node now has blocks(), addBlock(), and eraseBlock() similar to a node's
output list.
* Blocks are a property of every node rather than an attribute, to make
it easier to manage the lifetime of the containing nodes and because
the behavior of cloning Blocks will likely be different from the way we clone other
attributes.
* A block itself has a list of nodes, as well as inputs and outputs.
The meaning of a block's nested inputs/outputs is specific to the particular
node kind containing the block. It is safe to assume inputs to a block will be
in scope in the block.
* Each Block has an owningNode() and each node has an owningBlock().
The owningNode of the top-most block is null.
* Values are lexically scoped: nested blocks can use values from outer blocks
that have been defined in previous nodes. Lint has been updated with these
new scoping rules.
* This change preserves almost all of the pre-Block API. No attempt has been made
to make optimizations aware of Blocks. This will need to be done on a case-by-case
basis as we make optimizations capable of handling Blocks.
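A minimal sketch of nested blocks as they surface in Python graph inspection (assuming the node.blocks() binding; the list above describes the underlying C++ API):
```python
import torch

@torch.jit.script
def pick(x, flag: bool):
    if flag:
        y = x + 1
    else:
        y = x - 1
    return y

for node in pick.graph.nodes():
    if node.kind() == "prim::If":
        blocks = list(node.blocks())     # the then-block and the else-block
        print(node.kind(), len(blocks))
```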
Previously, Symbol was just a uint32_t and we converted with symbolToString and
stringToSymbol. Now Symbol is a struct with a toString method, and
constructors from either BuiltinSymbols enums (e.g. kParam) or strings.
Symbol is convertible to a uint32_t to ensure it can still be used in
switch statement BuiltinSymbol case branches.
* Convolution derivatives in ATen
This PR introduces an ATen implementation of convolution, which dispatches to
THNN/CuDNN/NNPACK based on input parameters. The general strategy is to compose
this function out of the various forward-backward pairs of specific
implementations, rather than write a monolithic function with backwards (which
is what we did before, because the boilerplate of doing it otherwise would have
been very high). The new API provides the following functions:
- _convolution, which is a fully generic, native convolution implementation
that dispatches to various other convolution implementations depending on
input characteristics. This is prefixed with an underscore because it
explicitly takes benchmark, deterministic and cudnn_enabled which are
implementation details for CuDNN. The intent is to eventually provide a
convolution that reads these parameters out of the context using #4104.
- _convolution_nogroup is a convolution implementation for non-CuDNN
algorithms which don't support group convolution natively.
- _convolution_double_backward is the generic double-backwards implementation
for convolution.
In more detail:
- Most functionality from torch/csrc/autograd/functions/convolution.cpp has been
moved into aten/src/ATen/native/Convolution.cpp
- We continue to make use of ConvParams, but we now construct the parameters
upon entry to a function from the function signature (which does not use
ConvParams; having convolution take ConvParams directly would require teaching
the code generator how to accept these as parameters, complicating ATen's API
model) and destruct them when making subprocedure calls.
- I introduce a new idiom, input_r, which represents a const Tensor& reference,
which will subsequently be assigned to a local Tensor input. This is helpful
because a lot of the existing algorithms relied on being able to assign to
locals, which is not permitted with a const reference.
- The native argument parser now supports std::array<bool,2> inputs (NB: there
MUST NOT be a space; this is the same hack as is applied to derivatives.yaml)
- Native parser now supports Tensor? arguments, which indicates a nullable
tensor. Previously this function was only used by NN methods.
- Documentation updates on THNN library
- I added an extra fgradInput argument to VolumetricConvolutionMM_updateOutput
and VolumetricConvolutionMM_accGradParameters so that its buffer list lines up
with the backward argument list. This makes it possible to write a derivative
for conv3d, which previously was not supported (commented out in
derivatives.yaml)
- Extra double_backward declarations for all convolution backwards functions were
added.
- You can now use the syntax Tensor? in native_functions.yaml to indicate that a
tensor argument is nullable. There are adjustments to propagate this to the
Python argument parser.
- NNPACK was ported to ATen, and ATen now builds and links against NNPACK if
possible. New AT_NNPACK_ENABLED macro. The NNPACK functions are
nnpack_spatial_convolution.
- Some modest CuDNN convolution refactoring to remove _forward from names.
- There's a new cudnn_convolution_backward function to deal with the fact that
CuDNN convolution double backward requires you to have computed all gradients
in one go.
- Variable set_flags now checks if the tensor is undefined, fixing a silent memory
corruption.
- checkSameType updated to not raise an exception if called with Variable arguments
- "no ATen declaration found for" error message is improved to say what available declarations are
- make_variable now accepts undefined tensors, and returns an undefined tensor in this case.
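A minimal sketch of the double-backward path this makes possible for convolution (e.g. for gradient penalties):
```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 2, 8, 8, requires_grad=True)
w = torch.randn(4, 2, 3, 3, requires_grad=True)

y = F.conv2d(x, w, padding=1)
# First-order gradient w.r.t. the input, kept in the autograd graph...
grad_x, = torch.autograd.grad(y.sum(), x, create_graph=True)
# ...then differentiated again, exercising convolution double backward.
grad_w, = torch.autograd.grad(grad_x.sum(), w)
print(grad_w.shape)
```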
This commit adds a Value type similar to the one @ezyang suggested a while
ago for handling multi-return nodes.
Previously if we had a graph like:
a = op1(b)
c, d = op2(a)
Then its in-memory format would look like:
%0 = op1(b)
%1 = op2(%0)
%2 = select(%1, 0)
%3 = select(%1, 1)
Select nodes were used only to handle the multi-output case. In the
single-output case ops referred directly to their uses.
This required special handling for the single- and multi- output cases,
and was confusing when used with ONNX which distinguishes values (the
inputs/outputs of a node) from the nodes themselves (e.g. a Conv).
This commit adds the Node/Value distinction to the IR. In the example
above, `a`, `b`, `c`, and `d` are now Value objects, while `op1` and
`op2` are now Node objects. Inputs/Outputs to the graph are values.
* Nodes now always have multiple outputs, accessible through their `output()`
method.
* Methods exist for adding/removing outputs from a node.
* Nodes own their output Values; destroying a node destroys its outputs, and it
is only valid to destroy a node when no uses of its outputs remain.
* Unlike select, Values do not appear in the nodes list.
* The method `node()` on `Value` retrieves its defining node. Calling it
is always valid. For graph inputs, its kind is "Param". Like "Return", there is a single Param
node representing all inputs.
* For single-output Nodes, the method `output()` retrieves the single
output Value, asserting that the node is in-fact single output.
* Functions are the same, but some functions like `type()` have moved to
Value.
* `replaceAllUsesWith` is now sanely defined for both Values and Nodes.
In the case of Nodes, it replaces all outputs of the node with the outputs
of the replacement node.
* stage is defined on both Node and Value. This is because inputs require a stage.
* Apart from changing data types from Node->Value, most passes remain the same.
Things that previously assumed single-output nodes now have to call output()
to get the output Value.
* This removes the uses = [...] field from the printed outputs because it was
getting confusing, even before this commit, when uses would refer to nodes
but we print the names of Values. The lint pass validates the use list,
so printing it out seems less necessary.
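A minimal sketch of the Node/Value split as it surfaces in Python graph inspection (assuming the outputs()/node()/debugName() bindings):
```python
import torch

def f(a, b):
    return (a + b).chunk(2, dim=0)        # an op with multiple outputs

traced = torch.jit.trace(f, (torch.randn(4, 3), torch.randn(4, 3)))
for node in traced.graph.nodes():
    for value in node.outputs():          # Values produced by this Node
        # every Value can name the Node that defines it
        print(value.debugName(), "<-", value.node().kind())
```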
* Update comments and size logic
* Record stack traces during JIT tracing
* Use string helper functions and AutoGIL
* Use SourceLocation object instead of storing in debugName
* Address zdevito comments
* Address comments
* Regenerate ONNX nanopb from latest version.
But don't bump the IR version; we don't handle discriminators
yet.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add discriminator to AttributeProto.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* Add back ONNX definition for permute
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This breaks a lot of the onnx-pytorch tests because the abstraction
barriers are not respected. I'll spin up a patch for that separately.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This started off as a minor fix based on Adam's question, "why is printing
a graph not const?", and snowballed into a giant yak-shaving exercise.
- The Graph and Node APIs now uniformly enforce deep constness; e.g., if you
get a const Node* or const Graph*, it is not possible to get a non-const
Node*/Graph* somewhere else in the graph (even though the member variables
of these are non-const. Hooray for private access specifier.)
- A big pile of functions got const versions, most notably the printing
functions, and functions for accessing inputs().
- REALLY IMPORTANT, BC-BREAKING CHANGE: inputs() now returns a COPY of the
inputs, rather than a reference to the underlying. I was forced to do this
because there is no way to portably turn a std::vector<Node*> into a
std::vector<const Node*>, which is necessary to provide a const-correct
version of inputs() that enforces deep const-correctness. I then justified
this choice to myself with the observation that outputs() returned a
copy (by necessity), so this makes the API more uniform.
But making this change uncovered two very subtle bugs:
1. If you change functions from returning a reference to returning a copy,
the idiom node->inputs().begin() is no longer valid, because the memory
the iterator points to immediately becomes invalid. THIS SUCKS.
Honestly, we should add a lint rule rejecting calling begin()/end() on
temporaries because this is very dangerous. To excise this pattern from
the codebase, I added begin() and end() methods to Graph, so that we got
rid of the graph->nodes().begin() idiom, which happens to be sound,
despite not returning a reference, because graph_node_list is a
non-owning reference.
2. pybind11 doesn't handle std::vector<Node*> cast out of the box.
Fortunately, I found a simple fix in the GitHub issues tracker
that involved adding an extra type converter. And yes, this
does mean that outputs() in Python never worked correctly.
- New const_graph_node_list, which is a graph_node_list that gives you const
Node*
There are some more miscellaneous improvements:
- Applied CR comment fixes on export.cpp; using replaceInput, and renaming
variables for clarity.
- assertValidInput helper method added, and applied to replaceInput
- Use an explicit function to print THPObjectPtr, otherwise we get
the wrong overload.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
* update fuser to match ATen-formatted JIT ops
* fix concat optimizations and add test
* allow onnx export to work with single-export functions
* fix onnx handling of multi-return nodes.
* nits, format, vision test update
* fix add constant
* fix driver init issues
* Add missing Neg symbolic.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
- Deleted the Addmm/Concat Function classes, as these are now native ATen operators
- Resurrected ONNX operator for Concat (now called 'cat')
- Add a "fake" Expand ONNX operator, which we now do the optimization on;
this helps prevent us from emitting a warning that 'expand' is not supported.
We still fail if any of these Expand operators make it to the final model,
until we actually formalize Expand in ONNX. This also simplifies the
fuseBroadcast code, because single-return ONNX nodes don't get select nodes.
- New error reporting strategy. If we fail to export an operator for some
reason, we emit a warning, but otherwise keep going. At the very end,
in export.cpp, we now check if there are any ATen operators left over. If
there are, we bug out. This assumes that ATen operator names are lowercase and
ONNX operator names are capitalized. You're now supposed to 'return _unimplemented(msg)' in these cases.
- New toString() method on Graph, for getting the string graph (useful for
slapping it into error messages.)
- Some of the legacy symbolics (still in the Python symbolic method of a Function
subclass) have been cleaned up for clarity.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
The pieces:
- I improved the lint / asserts to catch some bugs which I
committed while working on my export. There are two new
properties which the linter checks now:
(1) "Anticipated uses". If a node says that is used by
M, M better appear later in the topsort. Previously,
we only checked if it was in all_nodes.
(2) If you are a select node, you better be a multi-type node;
if you're not a select node, you better not be! And you
should never have an input that is multi-type.
- There is a new peephole optimization pass, for simple, local
transformations to graphs. Right now, it implements a simple
optimization: remove 'expand' invocations that are no-ops
(the size before matches the size after), but we can add other
things to it later. I needed this for ONNX because no-op expands
show up in the left-hand argument, which we don't support.
- There is now a broadcast fuser, which fuses ATen expand ops
into broadcastable ONNX ops (Add, Div, Mul, Pow, Sub, Gemm.)
It only fuses when the original size is a suffix of the new
size, as per the ONNX spec.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
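A minimal sketch of the kind of no-op expand the peephole pass targets (hypothetical toy function; the expand below requests the size x already has):
```python
import torch

def f(x, y):
    # x is already (2, 3); this expand is a no-op the peephole pass can drop,
    # which in turn lets the broadcast fuser handle the addition for ONNX.
    return x.expand(2, 3) + y

traced = torch.jit.trace(f, (torch.randn(2, 3), torch.randn(2, 3)))
print(traced.graph)   # check whether aten::expand survives in the trace
```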
In many "non-Python" headers, we include Python.h because we need
to declare a pointer to PyObject, and solely because of that. It
would be a lot better if we had a simpler version of Python.h that
just declared PyObject available for pointers, without anything
else. This is what torch/csrc/utils/python_stub.h does.
The good thing about not including Python.h is that it is easy to
stay warning-free; no more ugly insertions of Python.h into headers
where it has no good reason to be.
This makes PyTorch warning-clean again.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>