* Don't unnecessarily wrap the elem in PythonTensor
Instead of saying that a PythonTensor *has* a regular (e.g., CPU) tensor
and an FX proxy, a PythonTensor *is a* regular CPU tensor that also
carries an FX proxy (which updates as we go along); a rough sketch of this
shape is included below.
This should fix https://github.com/pytorch/functorch/issues/465, and
it also fixes some expected failures in the test suite.
This kills the meta variant logic entirely; maybe some other time we'll
try to bring it back.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
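A minimal sketch of that "is-a" shape, assuming a `PythonTensor` subclass with a `proxy` attribute as described above; the real functorch class has more machinery (e.g. a `__torch_dispatch__` override that keeps the proxy updated as ops run):
```
import torch
import torch.fx as fx

class PythonTensor(torch.Tensor):
    # The subclass *is* the CPU tensor: no inner wrapped "elem" is stored.
    @staticmethod
    def __new__(cls, elem, proxy: fx.Proxy):
        t = torch.Tensor._make_subclass(cls, elem, elem.requires_grad)
        t.proxy = proxy  # the FX proxy carried alongside the real data
        return t
```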
* Added decomposition testing + display infra (an illustrative decomposition sketch follows this list)
* Add a couple more decompositions
* Changed some stuff
* Made some changes
* Added decomposition testing + display infra
* Add a couple more decompositions
* Fix some decompositions
* Changed some stuff
* Updated generation
* Fix test failures
* Removed extraneous files
* Fixed test failures
* Fixed tests
* Updated
* Fixed tests again
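For context, a decomposition rewrites a composite ATen op in terms of simpler ops, so the tracing/compilation stack only has to handle the primitives and the testing infra can check each decomposition against the reference op. A minimal illustration with a hand-rolled registry (functorch's actual registration decorator and op keys differ):
```
import torch

DECOMPOSITIONS = {}  # op name -> python implementation (illustrative registry)

def register_decomposition(op_name):
    def wrapper(fn):
        DECOMPOSITIONS[op_name] = fn
        return fn
    return wrapper

@register_decomposition("aten::silu")
def silu_decomposition(x):
    # silu(x) = x * sigmoid(x), expressed in terms of simpler ops
    return x * torch.sigmoid(x)

# Decomposition testing then amounts to comparing against the reference op:
x = torch.randn(8)
assert torch.allclose(DECOMPOSITIONS["aten::silu"](x),
                      torch.nn.functional.silu(x))
```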
Two main things happened:
- I removed {wrap_key, PythonTensor, pythonkey_trace} from the public
API
- I moved all compilation-related things to the functorch.compile
namespace. This includes nnc_jit, which is now
functorch.compile.nnc_jit (import change shown below)
Concerns:
- nnc_jit was in the functorch namespace for a long time. Should we
leave it there? Are there stakeholders to notify?
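Concretely, the import change for callers looks roughly like this (the `nnc_jit(f)` usage line is an assumption about its call convention, not taken from this PR):
```
# Before: compilation utilities lived at the functorch top level
# from functorch import nnc_jit

# After: everything compilation-related lives under functorch.compile
from functorch.compile import nnc_jit

def f(x):
    return (x.sin() + x.cos()).sum()

compiled_f = nnc_jit(f)  # assumed usage: wrap a traceable function
```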
Summary: Recomputing the forward pass during the backward pass can improve
the performance of pointwise operators, since it reduces memory bandwidth
pressure at the expense of extra computation. This PR adds a new
partitioning function to enable this type of recomputation.
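The new partitioning function makes this choice on the traced graph; as an analogy only (not the partitioner's mechanism), `torch.utils.checkpoint` exposes the same memory-vs-compute trade-off in eager mode:
```
import torch
from torch.utils.checkpoint import checkpoint

def pointwise_chain(x):
    # Cheap pointwise ops: recomputing them in the backward pass costs a
    # little extra compute but avoids saving (and re-reading) intermediates.
    return torch.sigmoid(x).tanh().relu()

x = torch.randn(1024, 1024, requires_grad=True)
y = checkpoint(pointwise_chain, x)  # intermediates recomputed during backward
y.sum().backward()
```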
* Support buffers in compiled_module (see the sketch after this list)
* Don't compute gradients for inputs that don't require grad
* Add a unit test for batchnorm
* Fix eager compilation tests that change requires_grad
* Create new args for tests without recompilation
* Enable some eager fusion opinfo tests that now work (because we stopped asking for unimplemented derivatives)
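A rough sketch of the scenario those bullets cover: a module with buffers (BatchNorm keeps running_mean/running_var as buffers) and an input that does not require grad. The `compiled_module(mod, fw_compiler, bw_compiler)` call and the pass-through compilers below are assumptions about the API, not its documented signature:
```
import torch
from functorch.compile import compiled_module  # assumed import location

def passthrough(fx_module, example_inputs):
    # "Compiler" that just returns the traced GraphModule (it is callable).
    return fx_module

mod = torch.nn.BatchNorm1d(4)  # parameters plus running_mean/running_var buffers
x = torch.randn(8, 4)          # input that does not require grad

compiled = compiled_module(mod, passthrough, passthrough)  # assumed signature
out = compiled(x)
out.sum().backward()           # grads flow to the parameters, not to x
```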
Summary: The existing code assumed a single output; this generalizes to tuple
outputs
Test Plan: Compile a simple test program with multiple outputs and check that
outputs/grads are the same as eager.
```
def foo(a, b):
    return a + b, a * b
```
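A sketch of that test plan, with `compile_fn` standing in for whichever compilation entry point is under test (hypothetical parameter name):
```
import torch

def foo(a, b):
    return a + b, a * b

def check_against_eager(compile_fn):
    a1 = torch.randn(4, requires_grad=True)
    b1 = torch.randn(4, requires_grad=True)
    a2 = a1.detach().clone().requires_grad_()
    b2 = b1.detach().clone().requires_grad_()

    eager_outs = foo(a1, b1)
    compiled_outs = compile_fn(foo)(a2, b2)

    # Both tuple outputs should match eager.
    for e, c in zip(eager_outs, compiled_outs):
        assert torch.allclose(e, c)

    # Grads through both outputs should match eager too.
    sum(o.sum() for o in eager_outs).backward()
    sum(o.sum() for o in compiled_outs).backward()
    assert torch.allclose(a1.grad, a2.grad)
    assert torch.allclose(b1.grad, b2.grad)
```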
* handled some cases of index.Tensor
* fixed merge errors
* Added batching rules for index, both cases are batched (see the vmap sketch after this list)
* fix some issues
* fix tests
* fixed tests
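Roughly the kind of user code these index batching rules enable, assuming the top-level `functorch.vmap` import; whether a given pattern works depends on which index.Tensor cases were handled above:
```
import torch
from functorch import vmap

x = torch.randn(5, 3, 4)                  # batch of 5 (3, 4) tensors
cols = torch.tensor([0, 2])
batched_idx = torch.randint(0, 4, (5, 2))

# Case 1: the tensor being indexed is batched, the index is not.
out1 = vmap(lambda t: t[:, cols])(x)            # -> (5, 3, 2)

# Case 2: the index tensor is batched, the indexed tensor is not.
base = torch.randn(3, 4)
out2 = vmap(lambda i: base[:, i])(batched_idx)  # -> (5, 3, 2)
```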