Commit Graph

70 Commits

Author SHA1 Message Date
Elias Ellison
e57cb4a1b2 Add a Constant Propagation Pass to the JIT (#8808)
Summary:
Adding a constant propagation pass to the JIT. I have added examples to the expect files.

There are a couple of special cases which have not been implemented here. IF nodes with constant conditions can be inlined with the correct block. WHILE nodes can be removed if the condition is false.  I have added a test for each case in test_jit.py file as expected failures.

To be consistent with DCE, python ops & CPP ops are treated as not having side-effects.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8808

Reviewed By: wanchaol

Differential Revision: D8906770

Pulled By: eellison

fbshipit-source-id: 10ad796d89f80b843566c9ddad6a0abd1f3dc74c
2018-07-30 15:54:31 -07:00
James Reed
851c18dd20 PyTorch File Format API (#9900)
Summary:
This is a follow-up to https://github.com/pytorch/pytorch/pull/9794 that contains only the serialization library and exposes a cleaner API. This should later be incorporated into the module export code
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9900

Reviewed By: zdevito

Differential Revision: D9021057

Pulled By: jamesr66a

fbshipit-source-id: 01af74a7fdd1b90b2f5484644c3121d8ba9eb3b3
2018-07-27 22:24:57 -07:00
Adam Paszke
e39c8043dc Make GraphExecutors work on Stacks instead of variable_tensor_lists (#9763)
Summary:
This is blocking the IR operator unification, because I need to be able to pass scalars to backward functions.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9763

Reviewed By: zou3519

Differential Revision: D8978457

Pulled By: apaszke

fbshipit-source-id: 570b4c3409322459cb0f2592069730a7d586ab20
2018-07-26 12:00:27 -07:00
James Reed
0b16b03b98 Plumb type annotations through script compilation (new) (#9547)
Summary:
Supersedes https://github.com/pytorch/pytorch/pull/9405
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9547

Reviewed By: zdevito

Differential Revision: D8900327

Pulled By: jamesr66a

fbshipit-source-id: a00a94615af4fbaec98ee3ede0cb54bcfd9108dd
2018-07-25 17:10:14 -07:00
Adam Paszke
1d4d9fc7da Prepare to stop using attributes in the JIT (#9505)
Summary:
This PR adds machinery to cache the schema in an IR node, and allows lookups of (possibly) constant inputs by their names (instead of position). The new methods are:

- `at::optional<T> get<T>(Symbol name)` - if the argument called name is a constant, then casts it to type `T` and returns it. If it's not constant returns `nullopt`. Raises an error if there's no argument with that name.
- `at::optional<IValue> get<T>(Symbol name)` - like above, but packs the result in an IValue
- `Value* getValue(Symbol name)` - retrieves a `Value*` for an argument (no need to know its position).

All above functions currently inspect the attributes as well, but that's only so that I could start using them in other places in the JIT without disrupting our current functionality. I wanted this diff to be a preparation that doesn't change the semantics too much, and so both the tracer and script create nodes with attributes. The next PR will put that to a stop, and hopefully the changes we need to make to other components will be simpler thanks to what I did here.

One more thing I'd like to do before actually stopping creating the non-attributed nodes is to have a convenient way of creating a schema programmatically, matching nodes against it, and creating them without having to pack inputs into flat argument lists (which is quite error prone).

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9505

Reviewed By: ezyang

Differential Revision: D8915496

Pulled By: apaszke

fbshipit-source-id: 39d14fc9a9d73d8494f128367bf70357dbba83f5
2018-07-20 10:56:00 -07:00
Chunli Fu
a487b08c2e AutoBatching - IR transformation(basic operators) (#9198)
Summary:
Use decorator `torch.jit.batch` to implement auto-batching (call `to_batch` pass to do IR tranformation).
- `to_batch` pass: "to_batch.h/cpp" in csrc/jit/passess to transform a graph to a new batched graph.
- Write several basic operators for BatchTensor (add, mul, sigmoid, tanh, mm, matmul, select).
- Register the operators in a lookup table `<std::string, std::shared_ptr<Graph>>`. (use the Graph to replace the original node in IR graph)

Move BatchTensor in python from torch.BatchTensor to torch.jit.BatchTensor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9198

Reviewed By: zdevito

Differential Revision: D8744466

Pulled By: ChunliF

fbshipit-source-id: 9ea56a30f55cb870f13a2069a47cc635419763ff
2018-07-11 18:25:07 -07:00
Zachary DeVito
efefd1d7cf Unify aten_dispatch and aten_schema into a single operator abstraction with human-readable schema. (#8885)
Summary:
This is a series of two commits that should probably be read separately. They are stacked on top of #9018 since the second commit requires it for correctness.

Commit 1
=======

This commit is the first in a series that will clean up how we handle declaring operators and intrinsics in the JIT to make it more modular and readable. This introduces readable declarations that can be used to register operators and switches gen_jit_dispatch to generate this schema. A follow up PR will remove the dispatch keys like "add-3" and resolve ops directly based on the registered schema, further simplifying the generation process.

* Switches schema over to parsed declarations, in the future this will allow something like:

```
  registry.register_intrinsic("foo(Tensor a, Tensor b) -> Tensor", [](Stack& stack) {
    ...
  })
```

This will allow the scalable registration of intrinsics for lists, tuples, and other ops, as long as meta-data for these ops (e.g. derivatives and size propagation routines).

The declarations resemble those used by PythonArgParser but have been singificantly cleaned up to minimize the number of types that can appear in the declaration. We should strive to get the other parts of PyTorch switched over to this restricted declaration set when possible, but it is too much to do in a single PR. My hope is that eventually we will use a very similar language to describe declarations in C10, and this can serve as a guide for that.

Parsing is done using the script lexer, so it is very robust to whitespace and extensible for future types.

This removes the other way we encoded schema, and makes it easier to see what schema are registered.

Current generated declarations: https://gist.github.com/zdevito/a96a17766fb3a098d69a91ee00abaaf6

* Switches how we handle attempting to use an integer in the place of a fixed-sized int list, such as in conv (e.g. 'int[3] stride=1'). Now that we can statically distinguish between int and Tensor, we handle the expansion as an implicit conversion in the compiler. This allows us to simplify the interpreter since it no longer needs to handle the conversion itself.

* Schema declarations have been changed so that they match the type system in the IR exactly. In particular, attribute_info which was used by liftConstantAttributes has been dropped and constant attributes are lifted purely based on the type of the input. Type conversions in compiler have been simplified due to this change.

* Error highlighting in ErrorReport now only reports at most 20 lines of code, to make reading where an error occurred easier.

Commit 2
=======

This commit unifies aten_dispatch and aten_schema into a single Operator object that both contains schema and implementation information. In the future we can use this object to also contain functionality like shape prop and autodiff needed by all operators. Operators are registered globally, and dispatch logic uses the schema information to figure out which variant to use. Descriptor keys, a frequent source of inscrutable debug errors, have been removed.

* Introduce Operator, to replace TensorOp. Unlike TensorOp, we use Operator for all op implementations, including primitives that may occur in the graphs. The only exceptions are ops that are only known to the interpreter like jumps, and GraphExecutors where we need to record additional debug info.

* Adds a global registry for Operator implementations. aten_dispatch.cpp turns into register_aten_ops.cpp, which registers all the Operators for aten with the operator registry. register_prim_ops.cpp now contains the implementations for primitive operators that used to be in the interpreter. This means that it is now safe to use `getOperation(node)` to lookup the true interpreter function for the node, which will simplify const-propagation passes.

* Remove addInterpreterOpHandler in favor of global operator registry.

* Instead of descriptors, we match Node arguments directly against FunctionSchema describing expected inputs in `matchSchema`. `matchSchema` knows how parse both attributes and positional inputs from a node and match it to the appropriate registered operator. Debug error messages when we try to run an invalid operator are significantly improved: they now automatically display the schema for the op with the same name that are registered.

* Merge aten_schema into regsiter_aten_ops. Each Operator takes a string schema which is parsed to determine when to dispatch to that op.

* Cleans up gen_jit_dispatch.py now that we do not need to write out descriptors.  In particular, skip_scalar_overloads can be removed since Richard's code sorts declarations to put Tensor, Tensor declarations first.

* remove matchSchemaAndLiftConstantAttributes and use emitBuiltinCall instead to remove code duplication

* refactor stack manipulation functions into a separate header file.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8885

Reviewed By: jamesr66a

Differential Revision: D8751048

Pulled By: zdevito

fbshipit-source-id: 312aabfbf88307c5f6ab947b6caf691468b94557
2018-07-10 10:24:48 -07:00
Soumith Chintala
474fdd7e2d minor pybind for jit (#8890)
Summary:
add two small bindings to recently added attributes.

Also want to leave a reference gist here: https://gist.github.com/soumith/8102ef39530bac09070912b1a5401d0f

It showcases:

- traced a module
- symbolically differentiated the forward graph, to get a forward, backward graph
- executed the subsequent forward + backward graphs correctly
- compared the jit vs non-jit results
Closes https://github.com/pytorch/pytorch/pull/8890

Reviewed By: ezyang

Differential Revision: D8677663

Pulled By: soumith

fbshipit-source-id: a29919c05baad997cd7fb7df718f933a83035118
2018-07-01 21:39:29 -07:00
Chunli Fu
67b21117b7 Add BatchTensor class (#8922)
Summary:
Add BatchTensor class
- construct from data, mask, dims or construct from list of tensors
- can return a list of tensors from an BatchTensor class

next step: do IR level transformation and operators
Closes https://github.com/pytorch/pytorch/pull/8922

Differential Revision: D8668986

Pulled By: ChunliF

fbshipit-source-id: 8b24d2a9f46a3b42dbb397e99e9e059dfb2b326e
2018-06-29 15:57:27 -07:00
James Reed
2e23bc1a20 Switch to emitting ScriptModule for scripted and traced functions (#8876)
Summary:
Solves https://github.com/pytorch/pytorch/issues/8716 and closes https://github.com/pytorch/pytorch/issues/8867

This makes it so that all of {script, traced} {module, function} create ScriptModules and implements proper inlining between them. This also greatly simplifies things and makes clear that tracing is a way to convert regular Python into a ScriptModule
Closes https://github.com/pytorch/pytorch/pull/8876

Differential Revision: D8675996

Pulled By: jamesr66a

fbshipit-source-id: 3b12ad4b758324f558074c27c1f1a9fb616b170a
2018-06-28 16:44:21 -07:00
Soumith Chintala
b5a123c06c
[jit] Add python bindings for Gradient and differentiate (#8830)
* improve assertion error message in jit::differentiate

* add python binding for Graph::copy

* add pybind for jit::differentiate and jit::Gradient
2018-06-25 18:09:29 -04:00
Richard Zou
8489c4cc6e
Better support for literals in jit script (#8687)
Addresses #8177

A design doc can be found here: [gist](https://gist.github.com/zou3519/4b7f13f03cc9f3612bd9363e6405fa0a) version or [quip](https://fb.quip.com/azL1AqUckBdo) version

General approach:
- Add NumberType, FloatType, IntType to represent Python numbers, floats and ints.
- Emit these types for python literals
- Change aten_schema such that Scalars are NumberType, int64_t and bool are IntType.
- Emit aten::type_as, prim::NumToTensor, and prim::TensorToNum nodes for tensor-number math. (see examples below)
- Erase NumberType,  prim::NumToTensor, and prim::TensorToNum for ONNX export

### Tensor/number math
```
import torch
@torch.jit.script
def fn(x):
    return x + 1
```
```
graph(%x : Dynamic) {
  %1 : int = prim::Constant[value={1}]()
  %2 : Dynamic = prim::NumToTensor(%1)
  %3 : Dynamic = aten::type_as(%2, %x)
  %4 : Dynamic = aten::add[alpha={1}](%x, %4)
  return (%5);
}
```

### Number/Number Math
```
import torch
@torch.jit.script
def fn(zero):
    c = 1 + 1
    return zero + c
```
```
graph(%zero : Dynamic) {
  %1 : int = prim::Constant[value={1}]()
  %2 : int = prim::Constant[value={1}]()
  %3 : Dynamic = prim::num_to_tensor(%1)
  %4 : Dynamic = prim::num_to_tensor(%2)
  %5 : Dynamic = aten::add[alpha={1}](%3, %4)
  %c : int = prim::TensorToNum(%6)  # this is the result of the addition
  ...
  return (%13);
}
```

List of squashed commits:

* Introduce Python Number types

Added: IntType, FloatType, NumberType with
IntType <: NumberType
FloatType <: NumberType

Changed aten_schema so arguments have corresponding types

* Emit a NumberType for python literals.

Also emit a NumberType for Scalar default values.

* Add prim::NumToTensor and prim::TensorToNum

* Add DynamicType -> NumberType implicit cast for bc

* Better ensureTensor error message

* Add ensureTensorOrNumber. Allow passing Number to some functions

Like the range() construct and slices

* Patch IntList to work.

IntList is still a DynamicType in the frontend: a tensor gets built from
a List[int].

Also, IntList[1] is a "union between int and IntList" the way it is
implemented. If the frontend sees an int being passed for an IntList[1]
arg, it converts it to a tensor as well.

* Enforce some order on schemas to avoid overload ambiguity

add(Tensor, Tensor) should appear earlier than add(Tensor, Scalar). This
matches the order in which python_arg_parser parses its arguments.

* Disable std_dim and var_dim tests.

With the new schema information, std(input, keepdim) and std(input, dim)
are ambiguous. This will need to be fixed at a later date.

* Add NumberType erasure pass.

This is used for ONNX export and to ensure that NumberType information
doesn't reach the interpreter

* Add support for mixed tensor/number math ops.

* Tests for new functionality.

Includes:
- Tensor/number math
- number/number math
- EraseNumberTypes pass test

* Patch tests

Update expect tests for:
- decompose_addmm
- loop unrolling tests

Because python numbers are now NumberType, they cannot be returned by
functions anymore. Work around this by using "torch.full", or by adding
a tensor([0]) (taken from FIXME_zerol()). Both approaches are used
because torch.full is more readable, but it is broken in some cases.

* Add erase_number_types to torch/CMakeLists.txt

* Move math back to emitSimpleExpr from emitSugaredExpr

* Remove some dead lines

* Renable some excluded script/trace tests that are fixed.

* Move some tests to expected failure

* Address some comments (more addressing to come)

* Erase relevant aten::type_as nodes in EraseNumberTypes

I also changed it so that EraseNumberTypes is only called for ONNX
export. It is no longer used to prevent
prim::NumToTensor/prim::TensorToNum from reaching shape_analysis or
interpreter.cpp.

shape_analysis infers the type of the output of these nodes to be the
same as their input.

intepreter.cpp treats both of these nodes as no-ops.

* Add reminder to fix std/var

* Call EraseNumberTypes only when exporting a script module

* Update expects after rebase
2018-06-21 15:43:38 -04:00
Peter Goldsborough
372d1d6735
Create ATen tensors via TensorOptions (#7869)
* Created TensorOptions

Storing the type in TensorOptions to solve the Variable problem

Created convenience creation functions for TensorOptions and added tests

Converted zeros to TensorOptions

Converted rand to TensorOptions

Fix codegen for TensorOptions and multiple arguments

Put TensorOptions convenience functions into torch namespace too

All factory functions except *_like support TensorOptions

Integrated with recent JIT changes

Support *_like functions

Fix in place modification

Some cleanups and fixes

Support sparse_coo_tensor

Fix bug in Type.cpp

Fix .empty calls in C++ API

Fix bug in Type.cpp

Trying to fix device placement

Make AutoGPU CPU compatible

Remove some auto_gpu.h uses

Fixing some headers

Fix some remaining CUDA/AutoGPU issues

Fix some AutoGPU uses

Fixes to dispatch_tensor_conversion

Reset version of new variables to zero

Implemented parsing device strings

Random fixes to tests

Self review cleanups

flake8

Undo changes to variable.{h,cpp} because they fail on gcc7.2

Add [cuda] tag to tensor_options_cuda.cpp

Move AutoGPU::set_index_from into .cpp file because Windows is stupid and sucks

Fix linker error in AutoGPU.cpp

Fix bad merge conflict in native_functions.yaml

Fixed caffe2/contrib/aten

Fix new window functions added to TensorFactories.cpp

* Removed torch::TensorOptions

Added code to generate wrapper functions for factory methods

Add implicit constructor from Backend to TensorOptions

Remove Var() from C++ API and use torch:: functions

Use torch:: functions more subtly in C++ API

Make AutoGPU::set_device more exception safe

Check status directly in DynamicCUDAHooksInterface

Rename AutoGPU to DeviceGuard

Removed set_requires_grad from python_variables.h and warn appropriately in Variable::set_requires_grad

remove python_default_init: self.type()

Add back original factory functions, but with deprecation warnings

Disable DeviceGuard for a couple functions in ATen

Remove print statement

Fix DeviceGuard construction from undefined tensor

Fixing CUDA device compiler issues

Moved as many methods as possible into header files

Dont generate python functions for deprecated factories

Remove merge conflict artefact

Fix tensor_options_cuda.cpp

Fix set_requires_grad not being checked

Fix tensor_new.h

TEMPORARILY put some methods in .cpp files to see if it solves issues on windows and mac

Fix bug in DeviceGuard.h

Missing includes

TEMPORARILY moving a few more methods into .cpp to see if it fixes windows

Fixing linker errors

* Fix up SummaryOps to use new factories

Undo device agnostic behavior of DeviceGuard

Use -1 instead of optional for default device index

Also move DeviceGuard methods into header

Fixes around device index after optional -> int32_t switch

Fix use of DeviceGuard in new_with_tensor_copy

Fix tensor_options.cpp

* Fix Type::copy(

* Remove test_non_float_params from ONNX tests

* Set requires_grad=False in ONNX tests that use ints

* Put layout/dtype/device on Tensor

* Post merge fixes

* Change behavior of DeviceGuard to match AutoGPU

* Fix C++ API integration tests

* Fix flip functions
2018-06-16 00:40:35 -07:00
Adam Paszke
f45a3d5558
Add a loop unrolling pass to PyTorch JIT (#7672) 2018-06-06 09:36:12 +02:00
Gao, Xiang
fe805794ac docstring support for @script and @script_method (#7898)
* docstring support for @script and @script_method

* make it python2 compatible

* improve according to review

* improve build_stmts

* use filter instead of list comprehension

* improve the way wrap is handled for script_method

* stash the original method instead

* allow dynamic attr for ScriptMethod and GraphExecutor

* a bit comment on build_Expr

* remove _build_wrap

* a bit improve on comments

* rename to __original_methods

* should be _original_methods
2018-06-05 10:36:08 -04:00
Zachary DeVito
185f8fbe7c Removing remaining NO_PYTHON ifdefs (#8067)
* Remove NO_PYTHON in tracing

* Remove NO_PYTHON in ir.h

* Remove NO_PYTHON in test_jit.cpp
2018-06-04 10:53:28 -04:00
Adam Paszke
9232afeffa
Add code for TensorBoard visualization of JIT GraphExecutors (#8050) 2018-06-02 20:55:25 +02:00
Zachary DeVito
23dd033b51 Factor python dependency out of interpreter (#7970)
* Factor python dependency out of interpreter

* Remove NO_PYTHON for the autograd engine

If there is no python bindings, then a default Engine is constructed
the first time it is requested.

If the python libraries are loaded, then they override the default
accessor and the default engine becomes a python Engine.

Note: it is possible for two engines to be generated if a non-python
one gets created before the python bindings are loaded. This case
is rare, and just results in additional threads being spawned.

* Fixing AlexNet test which is skipped in CI
2018-06-01 16:07:21 -04:00
James Reed
1f94a6eab3 [JIT] Fission and fusion passes for addmm (#7938)
* Addmm decomposition pass

* Addmm peephole pass

* Fix handling of output shape in fusion pass

* Add DCE to the peephole passes

* add comments

* maybe bugfix?

* Fix GPU tests

* fix py2/3 test issue
2018-05-30 18:06:58 -04:00
Adam Paszke
b45f2ff1ae
Remove CompiledFunction + clean up JIT tests (#7421) 2018-05-16 20:03:04 +02:00
James Reed
4667983f0f
Fixes for interpreter and ONNX export for translation (#7044)
Fixes for interpreter and ONNX export for translation

Address comments
2018-04-27 22:23:57 -07:00
James Reed
ef76e24f60
[JIT][script][ONNX] ScriptModule ONNX export + ONNX export for control flow nodes (#6608)
* ScriptModule ONNX export

* ScriptModule ONNX export

* Export for control flow nodes

* Add pretty-print capability for ONNX export testing

* Update tests and handling of mutliple GraphProto names

* Maybe bugfix?

* factor out code from export and pretty print
2018-04-19 23:45:03 -07:00
Zachary DeVito
f656301526 Allow traces to call @script functions (#6642)
This adds the ability to trace script functions while preserving their
control flow. When the trace encounters a script function it inlines
the graph of the function into the trace rather than tracing the
function itself.
2018-04-17 15:19:16 -04:00
James Reed
e8d2f05931
[JIT] Switch JIT passes to take a graph rather than TracingState (#6598)
* Switch JIT passes to take a graph rather than TracingState

* Add pybind11 binding for ONNX pass from graph

* Fix canonicalize pass

* address comment

* Switch ToONNX to explicitly return new graph

* optimize_graph instead of optimize_trace
2018-04-13 17:38:22 -07:00
Adam Paszke
c1cd6eab9f Handle broadcasting in the JIT (#6084)
* Add size checks to JIT's fuser

* Handle broadcasting in shape propagation pass

* Fix build errors and add tests
2018-04-05 17:07:52 -07:00
Adam Paszke
da6c3c90d9
Relax constraints on return statements in the script (#6070)
Script functions can now have no return statements, empty
return statements, or return one or more values.

Additionally fix the lexer to always emit TK_NEWLINE before
TK_DEDENT, which simplifies the parser.
2018-03-31 18:35:33 +02:00
Zachary DeVito
c8d1ec02be
[jit] Have ScriptModule inherit from Module (#5769)
* Have ScriptModule inherit from Module
  This is accomplished by created replacement _parameters, _buffers,
  and _modules which implement the OrderedDict APIs but which
  actually get/set their members inside script::Module
* Merge TracedModule with ScriptModule
* Move logic of attribute handling into Python bindings rather than
  make script::Module handle it. This was redundant with nn.Module,
  which already handles attribute.
* Make TracedModule a subclass of ScriptModule
* Move handling of attribute kind logic into bindings.
* Allow ScriptModule to contain non-script module submodules.
2018-03-22 00:17:49 -04:00
Adam Paszke
b239b123e4
Clean up TraceInput (#5743) 2018-03-15 19:38:33 +01:00
Adam Paszke
4afd62db09 Add TracedModule to the JIT (#5409) 2018-02-28 22:50:50 -08:00
Zachary DeVito
c6d47f6386
add @torch.jit.script, @torch.jit.compile, torch.jit.CompilationUnit(str) (#5367)
* torch.jit.trace annotation now creates a GraphExecutor

The other torch.jit.trace, which was used for testing purposes and for onnx to get the trace graph, is now called torch.jit. torch.jit.get_trace_graph.

* @script annotation, and compilation unit for strings
2018-02-26 13:22:45 -08:00
Adam Paszke
a0118533ef
Add a print() function to the JIT script (#5274)
Additionally:
- add support for calling functions that are not methods in the Python frontend
- add an end-to-end test for the Python frontend
- add a capture_stdout helper for checking that `print` actually works
2018-02-24 11:15:55 +01:00
Adam Paszke
cb2fd39fdd
Add Python frontend to the JIT (#5190) 2018-02-15 22:53:19 +01:00
Adam Paszke
8910dd5a81
Fix GraphExecutor and add more AD formulas (#5215) 2018-02-14 16:59:48 +01:00
bddppq
3e85613751 Experimental jit script (#5074) 2018-02-07 20:43:45 +01:00
Zachary DeVito
c308e03f3e
Initial GraphExecutor Implementation. (#4982)
This adds the initial implementation of graph executor for the new JIT design. It includes a few python tests ensuring that nograd, backward, and double-backward cases work for simple examples and some corner cases. More work needs to be done to performance optimize as there are many extra copies and places where we hold onto variables longer than we should. These are noted in the comments.
2018-02-02 17:45:59 -08:00
Zachary DeVito
0ae5498079 [JIT] add create_autodiff_subgraphs (#4822)
This pass splits differentiable subgraphs into their own Node,
similar to a fusion group.

This initial implementation does not create optimal subgraphs, but
it works well in the case where most things are differentiable,
and has the building blocks (`mergeNodes`) to extend to the
better implementation.
2018-01-23 23:46:54 -05:00
Adam Paszke
e6cbe84bf6 Handle repeated inputs in JIT tracer 2018-01-03 17:29:27 +01:00
Edward Z. Yang
6d72c82985
Trace ATen native functions as themselves, not their implementations. (#4127)
* Trace ATen non-primitive functions as themselves, not their implementations.

Previously, if I invoked an ATen non-primitive function foo, which in turn
called subfoo, I would always see 'subfoo' in the trace (e.g., tracing
'inlines' all of these operations.)  Such inlining is bad for ONNX
(and can be bad for optimization) as it prevents high-level
optimizations from taking advantage of the structure.  It might
be right to inline, but give the optimizer a chance to work before
inlining happens!

The implementation here is surprisingly simple, because it uses
the "DCE trick".  Essentially, it doesn't matter if the constituent
calls perform tracing, because you can always trace it again, and
override the trace nodes associated with the returned variables.
The original trace becomes dead and can be DCE'd.

While implementing this, I also refactored how 'isTracing' and
'trace_outputs' works:

- isTracing was previously a single function with overloads for
  both Tensor and Variable arguments.  Unfortunately, such overloads
  are not safe, because of how C++ implicit conversions work.  You
  would think that C++ should never confuse an overload for
  Variable with ArrayRef<Tensor>, but this is exactly what can
  happen: Tensor is convertible to both Variable and ArrayRef<Tensor>,
  thus it's ambiguous and C++ doesn't like it.  The last time I ran
  into this problem, I applied initializer lists to everything and
  called it a day.  A more robust fix is to separate out the
  Variable and Tensor overloads, which I have done in this patch.

- trace_outputs was fed as an initializer list, which doesn't work
  when you have heterogenous inputs.  So instead we first feed
  everything through 'flatten', which has overloads for each of the
  argument patterns in ATen, which then goes on to the recordTrace
  (which takes an ArrayRef).  This is *no less efficient*, because
  we were allocating a vector anyway (to do the conversion from
  vector of Tensor to vector of Variable).

This fixes mean that 'index' can properly be traced... although the
JIT still does not support it.  A failing test case has been added to
this effect.

Some knock-on effects:

- The fuser now knows about chunk as well as split.  They're pretty
  similar so there is no problem.

- There is a new 'canonicalize' pass in the JIT which renumbers a graph
  so that all structurally equivalent graphs render the same.

- We run DCE before the fuser tests, to make sure dead nodes don't
  block fusion.

- There are new ONNX exports for the newly introduced higher level ATen
  operations.  This includes type_as (no-op case only), chunk, select.

Zach didn't like the extra use of 'native' in the new codegen, so
we've introduced a new concept, 'abstract'.  An abstract function
is one that is implemented in derived types (e.g., CPUDoubleType),
where as a concrete one is implemented in the base type (Type).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-12-15 13:50:32 -05:00
Adam Paszke
d1fb8fdf03 Improve IODescriptors in JIT arg checking 2017-11-17 00:13:02 +01:00
Adam Paszke
1f1612ee37 Move _CompiledMixin to C++ 2017-11-10 16:31:44 +01:00
Adam Paszke
621fbd5c4e Move flattening/unflattening JIT logic to C 2017-11-06 19:42:44 -05:00
Edward Z. Yang
d4abaa4b9e Move ONNX broadcast fusion into separate ONNX pass, fixes verbose printing.
This breaks a lot of the onnx-pytorch tests because the abstraction
barriers are not respected.  I'll spin up a patch for that separately.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-11-01 09:49:53 -04:00
Edward Z. Yang
40f7f6e095 Improve handling of 'expand' (broadcasting) in JIT and ONNX
The pieces:

- I improved the lint / asserts to catch some bugs which I
  committed while working on my export.  There are two new
  properties which the linter checks now:

    (1) "Anticipated uses".  If a node says that is used by
    M, M better appear later in the topsort.  Previously,
    we only checked if it was in all_nodes.

    (2) If you are a select node, you better be a multi-type node;
    if you're not a select node, you better not be!  And you
    should never have an input that is multi-type.

- There is a new peephole optimization pass, for simple, local
  transformations to graphs.  Right now, it implements a simple
  optimization: remove 'expand' invocations that are no-ops
  (the size before matches the size after), but we can add other
  things to it later.  I needed this for ONNX because no-op expands
  show up in the left-hand argument, which we don't support.

- There is now a broadcast fuser, which fuses ATen expand ops
  into broadcastable ONNX ops (Add, Div, Mul, Pow, Sub, Gemm.)
  It only fuses when the original size is a suffix of the new
  size, as per the ONNX spec.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-10-29 23:50:34 -04:00
Lu Fang
0a1ac8bfe5 create a cse pass, with very naive support. 2017-09-22 17:06:27 -04:00
Adam Paszke
b708b6de8d Add ONNX pass (JIT trace initialization) 2017-09-19 10:53:32 -04:00
Adam Paszke
0e53fe3a41 Put ONNX files where they belong 2017-09-19 10:53:32 -04:00
Adam Paszke
8dae433de8 Move JIT passes to a separate directory 2017-09-19 10:53:32 -04:00
Zach DeVito
6d8d5bab4c Codemod Toffee -> ONNX, toffee -> onnx. Change file names to match 2017-09-06 13:45:39 -04:00
Adam Paszke
c537aebf5a Always run DCE in Traceable 2017-09-05 17:48:55 -04:00
Edward Z. Yang
d59714e3b1 Code review comment changes.
- Reduce setup.py diff.
- Expunge WITH_TOFFEE from codebase.
- Elaborate on a comment.
- Move gen_toffee.sh to tools
- Delete densenet test.
- Use 'using' to inherit a constructor.
- Delete outdated comment.
- Comment about why primspecs can return fewer outputs.
- Remove dead, commented out includes.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-05 17:48:55 -04:00