* Add a JIT interpreter
The separate interpreter is used to run graphs with lower overhead than
converting them to autograd graphs. Some notes:
* Does not support Handles/PythonOp/CppOp yet; these will be in a future commit.
* jit_closure.cpp still exists, and we fall back to it for now when the
interpreter cannot handle something because of PythonOp/CppOp.
* To support retain_graph=True, the interpreter can be cloned,
creating a copy that can be run with different arguments. This is
assumed to be the non-standard case, so cloning is not particularly optimized.
No tensor _data_ is copied, but the at::Tensor list in the interpreter is.
If we hit problems, there is a lot we could do (such as register allocation)
to minimize what needs to be copied.
* Uses a pImpl pattern to keep implementation details out of its header file.
* Modifies the way getTensorOp works so that it reads/writes to already-existing
vectors. This avoids reallocating these buffers on each call.
* Timings are here: https://gist.github.com/zdevito/5a20ac29fb1b9e449e693b67dc478127
This reduces interpreter overhead to about the same as running the graph in Python.
Running the same thing using ATen directly is about 10us faster.
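
The cloning and pImpl notes above can be illustrated with a minimal sketch.
This is not the code in this commit: Tensor here is a hypothetical stand-in
for at::Tensor (a handle that shares its data), and InterpreterStateImpl,
push, reg, and size are names assumed for illustration only.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for at::Tensor: a cheap handle that shares its data.
using Tensor = std::shared_ptr<std::vector<float>>;

// --- what the header would expose ---
struct InterpreterStateImpl;  // only forward-declared in the header

class InterpreterState {
 public:
  InterpreterState();
  InterpreterState(InterpreterState&&) noexcept;
  ~InterpreterState();

  // Copies the register list (the handles), not the tensor data.
  InterpreterState clone() const;

  void push(Tensor t);                     // illustrative helper
  const Tensor& reg(std::size_t i) const;  // illustrative helper
  std::size_t size() const;                // illustrative helper

 private:
  std::unique_ptr<InterpreterStateImpl> pImpl;
};

// --- what would live in the .cpp file ---
struct InterpreterStateImpl {
  std::vector<Tensor> registers;
};

InterpreterState::InterpreterState() : pImpl(new InterpreterStateImpl()) {}
InterpreterState::InterpreterState(InterpreterState&&) noexcept = default;
InterpreterState::~InterpreterState() = default;

InterpreterState InterpreterState::clone() const {
  InterpreterState copy;
  copy.pImpl->registers = pImpl->registers;  // copies handles only
  return copy;
}

void InterpreterState::push(Tensor t) {
  pImpl->registers.push_back(std::move(t));
}

const Tensor& InterpreterState::reg(std::size_t i) const {
  return pImpl->registers.at(i);
}

std::size_t InterpreterState::size() const { return pImpl->registers.size(); }
```

In this sketch a clone gets its own register list, so running it does not
disturb the original, while both lists still point at the same tensor data.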
* Code Mod
Interpreter -> InterpreterState
Function -> Code
Add other requested comments.
* RegList -> ListHandle<T>
Change the RegList functions to be safer by identifying the type of
each argument list, and checking that list insert does not try
to add to two different lists at once.
* Use exact equality for interp tests
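
The ListHandle<T> change described above can be sketched as follows. This is
an assumption about the general shape, not the code in this commit: ListPool,
begin_list, insert, and get are hypothetical names. The handle is tagged with
its element type so that a list of one type cannot be passed to a pool of
another, and insert checks that the list being extended is the one at the end
of the backing vector.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// A typed view (start, size) into a shared backing vector of T.
template <typename T>
struct ListHandle {
  std::size_t start = 0;
  std::size_t size = 0;
};

// Hypothetical owner of the backing storage for all lists of type T.
template <typename T>
class ListPool {
 public:
  // Start a new, empty list at the current end of the storage.
  ListHandle<T> begin_list() const {
    ListHandle<T> h;
    h.start = data_.size();
    h.size = 0;
    return h;
  }

  // Append to `list`; fails if another list was started in the meantime,
  // because the lists would then interleave in the backing vector.
  void insert(ListHandle<T>& list, const T& value) {
    if (list.start + list.size != data_.size()) {
      throw std::logic_error("another list already started");
    }
    data_.push_back(value);
    ++list.size;
  }

  // Returned by value so this also works for std::vector<bool>.
  T get(const ListHandle<T>& list, std::size_t i) const {
    return data_.at(list.start + i);
  }

 private:
  std::vector<T> data_;
};
```

Passing a ListHandle<bool> to a ListPool<int> is a compile error in this
sketch, and interleaved inserts into two lists are caught at run time.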