Commit Graph

139 Commits

Author SHA1 Message Date
Edward Z. Yang
6a18616296 Support for sym_strides() in backwards formulas (#85210)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85210
Approved by: https://github.com/Chillee, https://github.com/voznesenskym
2022-09-19 18:05:09 +00:00
Edward Z. Yang
f1ee162193 Use SymInt signature to compute saved variables (#84354)
This seems to have been accidentally working, but it broke
when I added support for saving optional SymInt directly
from input arguments.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84354
Approved by: https://github.com/Krovatkin
2022-09-01 16:30:00 +00:00
Edward Z. Yang
ad44670fa1 Back out "Revert D38984222: Don't introduce new overload for SymInt (#83628)" (#84173)
Also Back out "Revert D39075159: [acc_tensor] Use SymIntArrayRef for overloaded empty.memory_format's signature"

Original commit changeset: dab4a9dba4fa
Original commit changeset: dcaf16c037a9

Original Phabricator Diff: D38984222
Original Phabricator Diff: D39075159

Also update Metal registrations for C++ registration changes.

Also update NNPI registration to account for tightened schema checking

Differential Revision: [D39084762](https://our.internmc.facebook.com/intern/diff/D39084762/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39084762/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84173
Approved by: https://github.com/Krovatkin
2022-08-29 18:01:07 +00:00
PyTorch MergeBot
c7edcd6968 Revert "Don't introduce new overload for SymInt (#83628)"
This reverts commit 9790d90e4b.

Reverted https://github.com/pytorch/pytorch/pull/83628 on behalf of https://github.com/malfet due to Breaks internal builds, see D39076487
2022-08-27 01:23:17 +00:00
Edward Z. Yang
9790d90e4b Don't introduce new overload for SymInt (#83628)
Previously, we introduced new SymInt overloads for every function we wanted.  This led to a lot of boilerplate, and also a lot of confusion about how the overloads needed to be implemented.

This PR takes a simpler but more risky approach: just take the original function and change its ints to SymInts.

This is BC-breaking in the following ways:

* The C++ API for registering implementations for aten operators will change from int64_t to SymInt whenever you make this change. Code-generated registrations in PyTorch do not change as codegen handles the translation automatically, but manual registrations will need to follow the change. Typically, if you now accept a SymInt where you previously only took int64_t, you have to convert it back manually. This will definitely break XLA; see the companion PR https://github.com/pytorch/xla/pull/3914. Note that not all dispatch keys get the automatic translation: all the composite keys and Meta keys are modified to take SymInt directly (because they should handle them directly), and so there are adjustments for this.

This is not BC-breaking in the following ways:

* The user-facing C++ API remains compatible. Even if a function changes from int to SymInt, the default C++ binding still takes only ints (e.g., `at::empty(IntArrayRef, ...)`). To call with SymInts, you must call `at::empty_symint` instead. This involved adding two more signatures to CppSignatureGroup; in many cases I refactored code to iterate over all signatures in the group instead of hard-coding the two that previously existed.
* This is TorchScript compatible; internally we treat SymInts as ints so there is no change to what happens at runtime in TorchScript. In particular, it's OK to reference an empty schema by its old type (using int types): as long as you're not doing string equality (which you shouldn't be), these parse to the same underlying type.

Structure of the PR:

* The general strategy of this PR is that, even when you write `SymInt` inside `native_functions.yaml`, sometimes, we will treat it *as if* it were an `int`. This idea pervades the codegen changes, where we have a translation from SymInt to c10::SymInt or int64_t, and this is controlled by a symint kwarg which I added and then audited all call sites to decide which I wanted. Here are some of the major places where we pick one or the other:
  * The C++ FunctionSchema representation represents `SymInt` as `int`. There are a few places we do need to know that we actually have a SymInt and we consult `real_type()` to get the real type in this case. In particular:
    * When we do schema validation of C++ operator registration, we must compare against the true schema (as the C++ API will provide `c10::SymInt`, and this will only be accepted if the schema is `SymInt`). This is handled with `cloneWithRealTypes` before we check for schema differences.
    * In `toIValue` argument parsing, we parse against the true schema value. For backwards compatibility reasons, I do still accept ints in many places where Layout/SymInt/etc were expected. (Well, accepting int where SymInt is expected is not BC, it's just the right logic!)
  * In particular, because SymInt never shows up as type() in FunctionSchema, this means that we no longer need a dedicated Tag::SymInt. This is good, because SymInts never show up in mobile anyway.
* Changes to functorch/aten are mostly about tracking changes to the C++ API registration convention. Additionally, since SymInt overloads no longer exist, registrations for SymInt implementations are deleted. In many cases, the old implementations did not properly support SymInts; I did not add any new functionality with this PR, but I did try to annotate with TODOs where there is work to do. Finally, because the signature of the `native::` API changed from int to SymInt, I needed to find alternative APIs for people who were directly calling these functions. Typically, I insert a new dispatch call when perf doesn't matter, or use the `at::compositeexplicitautograd` namespace to handle other cases.
* The change to `make_boxed_from_unboxed_functor.h` is so that we accept a plain IntList IValue anywhere a SymIntList is expected; these are read-only arguments so covariant typing is OK.
* I change how unboxing logic works slightly. Previously, we interpret the C++ type for Layout/etc directly as IntType JIT type, which works well because the incoming IValue is tagged as an integer. Now, we interpret the C++ type for Layout as its true type, e.g., LayoutType (change to `jit_type.h`), but then we accept an int IValue for it anyway. This makes it symmetric with SymInt, where we interpret the C++ type as SymIntType, and then accept SymInt and int IValues for it.
* I renamed the `empty.names` overload to `empty_names` to make it less confusing (I kept mixing it up with the real empty overload)
* I deleted the `empty.SymInt` overload, which ended up killing a pile of functions. (This was originally a separate PR but the profiler expect test was giving me grief so I folded it in.)
* I deleted the LazyDynamicOpsTest tests. These were failing after these changes, and I couldn't figure out why they used to be passing: they make use of `narrow_copy` which didn't actually support SymInts; they were immediately converted to ints.
* I bashed LTC into working. The patches made here are not the end of the story. The big problem is that SymInt translates into Value, but what if you have a list of SymInt? This cannot be conveniently represented in the IR today, since variadic Values are not supported. To work around this, I translate SymInt[] into plain int[] (this is fine for tests because LTC dynamic shapes never actually worked); but this will need to be fixed for proper LTC SymInt support. The LTC codegen also looked somewhat questionable; I added comments based on my code reading.
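
The `symint` switch described in the bullets above can be pictured roughly as follows. This is only a sketch under stated assumptions: `cpp_int_type` and its lookup table are hypothetical names made up for illustration, not the actual torchgen functions. The point is that one schema-level type renders as two different C++ types depending on what the call site asks for.

```python
# Hypothetical illustration; the helper and table are not torchgen's real API.
def cpp_int_type(schema_type: str, *, symint: bool) -> str:
    """Translate a schema-level integer type into a C++ type name."""
    table = {
        "int": "int64_t",
        "int[]": "at::IntArrayRef",
        # SymInt stays symbolic only when the call site requests it.
        "SymInt": "c10::SymInt" if symint else "int64_t",
        "SymInt[]": "c10::SymIntArrayRef" if symint else "at::IntArrayRef",
    }
    return table[schema_type]


# A composite or Meta kernel would ask for the symbolic form,
# while the default user-facing binding would ask for plain ints.
symbolic = cpp_int_type("SymInt[]", symint=True)
plain = cpp_int_type("SymInt[]", symint=False)
```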

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83628
Approved by: https://github.com/albanD, https://github.com/bdhirsh
2022-08-26 01:35:40 +00:00
PyTorch MergeBot
a7edf71360 Revert "Don't introduce new overload for SymInt (#83628)"
This reverts commit 8fae7027b3.

Reverted https://github.com/pytorch/pytorch/pull/83628 on behalf of https://github.com/malfet due to breaking internal builds, see https://www.internalfb.com/diff/D38984222
2022-08-25 00:49:40 +00:00
Sergii Dymchenko
591222f5d9 Fix use-dict-literal lint (#83718)
Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR should do the change for every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor.
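
For illustration, the kind of rewrite this lint suggests looks roughly like the following; the variable names are made up for this sketch. Both spellings build equal dictionaries, and the literal form simply avoids the call to `dict`.

```python
# Flagged by use-dict-literal: the dictionary is built through a call.
empty_call = dict()
kwargs_call = dict(name="gen", visibility="public")

# Preferred spelling: a literal, which produces an equal dictionary.
empty_literal = {}
kwargs_literal = {"name": "gen", "visibility": "public"}
```
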
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718
Approved by: https://github.com/albanD
2022-08-24 00:26:46 +00:00
Edward Z. Yang
8fae7027b3 Don't introduce new overload for SymInt (#83628)
Previously, we introduced new SymInt overloads for every function we wanted.  This led to a lot of boilerplate, and also a lot of confusion about how the overloads needed to be implemented.

This PR takes a simpler but more risky approach: just take the original function and change its ints to SymInts.

This is BC-breaking in the following ways:

* The C++ API for registering implementations for aten operators will change from int64_t to SymInt whenever you make this change. Code-generated registrations in PyTorch do not change as codegen handles the translation automatically, but manual registrations will need to follow the change. Typically, if you now accept a SymInt where you previously only took int64_t, you have to convert it back manually. This will definitely break XLA; see the companion PR https://github.com/pytorch/xla/pull/3914. Note that not all dispatch keys get the automatic translation: all the composite keys and Meta keys are modified to take SymInt directly (because they should handle them directly), and so there are adjustments for this.

This is not BC-breaking in the following ways:

* The user-facing C++ API remains compatible. Even if a function changes from int to SymInt, the default C++ binding still takes only ints (e.g., `at::empty(IntArrayRef, ...)`). To call with SymInts, you must call `at::empty_symint` instead. This involved adding two more signatures to CppSignatureGroup; in many cases I refactored code to iterate over all signatures in the group instead of hard-coding the two that previously existed.
* This is TorchScript compatible; internally we treat SymInts as ints so there is no change to what happens at runtime in TorchScript. In particular, it's OK to reference an empty schema by its old type (using int types): as long as you're not doing string equality (which you shouldn't be), these parse to the same underlying type.

Structure of the PR:

* The general strategy of this PR is that, even when you write `SymInt` inside `native_functions.yaml`, sometimes, we will treat it *as if* it were an `int`. This idea pervades the codegen changes, where we have a translation from SymInt to c10::SymInt or int64_t, and this is controlled by a symint kwarg which I added and then audited all call sites to decide which I wanted. Here are some of the major places where we pick one or the other:
  * The C++ FunctionSchema representation represents `SymInt` as `int`. There are a few places we do need to know that we actually have a SymInt and we consult `real_type()` to get the real type in this case. In particular:
    * When we do schema validation of C++ operator registration, we must compare against the true schema (as the C++ API will provide `c10::SymInt`, and this will only be accepted if the schema is `SymInt`). This is handled with `cloneWithRealTypes` before we check for schema differences.
    * In `toIValue` argument parsing, we parse against the true schema value. For backwards compatibility reasons, I do still accept ints in many places where Layout/SymInt/etc were expected. (Well, accepting int where SymInt is expected is not BC, it's just the right logic!)
  * In particular, because SymInt never shows up as type() in FunctionSchema, this means that we no longer need a dedicated Tag::SymInt. This is good, because SymInts never show up in mobile anyway.
* Changes to functorch/aten are mostly about tracking changes to the C++ API registration convention. Additionally, since SymInt overloads no longer exist, registrations for SymInt implementations are deleted. In many cases, the old implementations did not properly support SymInts; I did not add any new functionality with this PR, but I did try to annotate with TODOs where there is work to do. Finally, because the signature of the `native::` API changed from int to SymInt, I needed to find alternative APIs for people who were directly calling these functions. Typically, I insert a new dispatch call when perf doesn't matter, or use the `at::compositeexplicitautograd` namespace to handle other cases.
* The change to `make_boxed_from_unboxed_functor.h` is so that we accept a plain IntList IValue anywhere a SymIntList is expected; these are read-only arguments so covariant typing is OK.
* I change how unboxing logic works slightly. Previously, we interpret the C++ type for Layout/etc directly as IntType JIT type, which works well because the incoming IValue is tagged as an integer. Now, we interpret the C++ type for Layout as its true type, e.g., LayoutType (change to `jit_type.h`), but then we accept an int IValue for it anyway. This makes it symmetric with SymInt, where we interpret the C++ type as SymIntType, and then accept SymInt and int IValues for it.
* I renamed the `empty.names` overload to `empty_names` to make it less confusing (I kept mixing it up with the real empty overload)
* I deleted the `empty.SymInt` overload, which ended up killing a pile of functions. (This was originally a separate PR but the profiler expect test was giving me grief so I folded it in.)
* I deleted the LazyDynamicOpsTest tests. These were failing after these changes, and I couldn't figure out why they used to be passing: they make use of `narrow_copy` which didn't actually support SymInts; they were immediately converted to ints.
* I bashed LTC into working. The patches made here are not the end of the story. The big problem is that SymInt translates into Value, but what if you have a list of SymInt? This cannot be conveniently represented in the IR today, since variadic Values are not supported. To work around this, I translate SymInt[] into plain int[] (this is fine for tests because LTC dynamic shapes never actually worked); but this will need to be fixed for proper LTC SymInt support. The LTC codegen also looked somewhat questionable; I added comments based on my code reading.
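
As a rough mental model of the `type()` versus `real_type()` split described above, consider the toy below. It is a sketch only: the real logic lives in the C++ `FunctionSchema`, and the Python names `Arg`, `parse_arg`, and `clone_with_real_types` are hypothetical stand-ins chosen for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Arg:
    name: str
    type: str       # what most consumers see; SymInt is reported as int
    real_type: str  # the type actually written in the schema


def parse_arg(name: str, written_type: str) -> Arg:
    """Record the written type but expose SymInt as int by default."""
    return Arg(name, written_type.replace("SymInt", "int"), written_type)


def clone_with_real_types(args: List[Arg]) -> List[Arg]:
    """Expose the written types, as needed when validating a registration."""
    return [Arg(a.name, a.real_type, a.real_type) for a in args]


old_schema = [parse_arg("size", "int[]")]
new_schema = [parse_arg("size", "SymInt[]")]

# The default view of both schemas agrees, so old references keep working.
same_by_default = old_schema[0].type == new_schema[0].type
# Only the real-type view reveals that the new schema takes SymInt.
real_view = clone_with_real_types(new_schema)[0].type
```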

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83628
Approved by: https://github.com/albanD, https://github.com/bdhirsh
2022-08-23 22:04:07 +00:00
Mikayla Gawarecki
e3e33cfae0 Enable codegen of per-dispatch key derivative formulas in derivatives.yaml (#82801)
`derivatives.yaml` can now take a `dispatch` entry which registers per-autograd dispatch key derivatives such as
```
name: foo(Tensor self, Tensor y) -> Tensor
dispatch:
  Default:
    self: grad
    y: grad.expand(y.sizes())
  AutogradNestedTensor:
    self: grad
    y: NestedTensor_foo_backward(grad, y)
output_differentiability: [True]
```

However the old schema where there is no `dispatch` entry is still supported.
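
One way to picture how both schemas could be handled uniformly is to normalize every entry into a mapping from dispatch key to formulas. The sketch below assumes the entry has already been loaded into a Python dict; `normalize_derivative_entry` is a hypothetical helper written for this illustration, not the function used by the autograd codegen.

```python
from typing import Dict

# Keys that carry metadata rather than per-argument formulas (assumed set).
RESERVED = {"name", "dispatch", "output_differentiability"}


def normalize_derivative_entry(entry: dict) -> Dict[str, Dict[str, str]]:
    """Return {dispatch_key: {arg_name: formula}} for either schema."""
    if "dispatch" in entry:
        return {key: dict(formulas) for key, formulas in entry["dispatch"].items()}
    # Old schema: formulas sit at the top level and apply to the default key.
    return {"Default": {k: v for k, v in entry.items() if k not in RESERVED}}


new_style = {
    "name": "foo(Tensor self, Tensor y) -> Tensor",
    "dispatch": {
        "Default": {"self": "grad", "y": "grad.expand(y.sizes())"},
        "AutogradNestedTensor": {
            "self": "grad",
            "y": "NestedTensor_foo_backward(grad, y)",
        },
    },
    "output_differentiability": [True],
}
old_style = {
    "name": "foo(Tensor self, Tensor y) -> Tensor",
    "self": "grad",
    "y": "grad.expand(y.sizes())",
}

new_table = normalize_derivative_entry(new_style)
old_table = normalize_derivative_entry(old_style)
```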

I would greatly appreciate feedback on *how to improve the testing strategy* of this PR. I have currently registered an aten test op in TestOps.cpp with dummy gradients in derivatives.yaml, and have some tests in test_autograd.py:TestAutogradMultipleDispatch, but I am not sure whether these are sufficiently rigorous.

Additionally, this PR also makes the assumption that sets like [VIEW_FUNCTIONS](ff5399e528/tools/autograd/gen_inplace_or_view_type.py (L60)) are per-native-function and not per-native-function-and-dispatch-key. I'm not sure whether this is necessarily the case: *would there ever be a situation where, e.g., a nested_tensor op is a view op but the aten function is not, or vice versa?*

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82801
Approved by: https://github.com/bhosmer, https://github.com/albanD
2022-08-10 19:26:29 +00:00
Nikolay Korovaiko
d2c47d559c Revert "Revert "Enabling SymInt in autograd; take 3 (#81145)"" ; make sure is_intlist checks for symintnodes (#82189)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82189
Approved by: https://github.com/ezyang
2022-07-26 20:47:11 +00:00
PyTorch MergeBot
c078476eb0 Revert "Enabling SymInt in autograd; take 3 (#81145)"
This reverts commit 032facd6e6.

Reverted https://github.com/pytorch/pytorch/pull/81145 on behalf of https://github.com/jeanschmidt due to breaking internal builds
2022-07-22 11:15:20 +00:00
Nikolay Korovaiko
032facd6e6 Enabling SymInt in autograd; take 3 (#81145)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81145
Approved by: https://github.com/ezyang
2022-07-22 00:14:50 +00:00
Huy Do
347b036350 Apply ufmt linter to all py files under tools (#81285)
With ufmt in place https://github.com/pytorch/pytorch/pull/81157, we can now use it to gradually format all files. I'm breaking this down into multiple smaller batches to avoid too many merge conflicts later on.

This batch (as copied from the current BLACK linter config):
* `tools/**/*.py`

Upcoming batches:
* `torchgen/**/*.py`
* `torch/package/**/*.py`
* `torch/onnx/**/*.py`
* `torch/_refs/**/*.py`
* `torch/_prims/**/*.py`
* `torch/_meta_registrations.py`
* `torch/_decomp/**/*.py`
* `test/onnx/**/*.py`

Once they are all formatted, BLACK linter will be removed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/81285
Approved by: https://github.com/suo
2022-07-13 07:59:22 +00:00
Nikolay Korovaiko
efc7343743 Revert "Revert "Put symint overloads on a different name"" (#79680)
This relands https://github.com/pytorch/pytorch/pull/79281

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79680
Approved by: https://github.com/malfet
2022-06-21 07:06:33 +00:00
PyTorch MergeBot
b9bb52d97b Revert "Put symint overloads on a different name"
This reverts commit 213a8fc992.

Reverted https://github.com/pytorch/pytorch/pull/79281 on behalf of https://github.com/bigfootjon due to Diff reverted internally
2022-06-15 17:15:21 +00:00
Edward Z. Yang
213a8fc992 Put symint overloads on a different name
Due to implicit conversion shenanigans, having both IntArrayRef
and SymIntArrayRef overloads makes {} ambiguous.  While we could
fix this by making a single unified type that accepts all the overloads
we want, an easier fix was to just push the SymIntArrayRef overload
to its own name.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79281

Approved by: https://github.com/suo
2022-06-12 14:36:39 +00:00
Brian Hirsh
0161e9eb00 [test] attempt to functionalize ops with mutable positional-only args
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76320

Approved by: https://github.com/ezyang
2022-05-19 18:50:34 +00:00
anjali411
b204ad863f Revert "Revert "Allow specifying tags for aten operators in native_functions.yaml""
This reverts commit ea44645c9a.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76456

Approved by: https://github.com/osalpekar
2022-04-28 02:04:57 +00:00
Brian Hirsh
40d96f0afd Revert "functionalization: add support for zero_()"
This reverts commit 7d44b3675b.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76375

Approved by: https://github.com/datumbox, https://github.com/albanD
2022-04-26 19:27:27 +00:00
Brian Hirsh
7d44b3675b functionalization: add support for zero_()
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75913

Approved by: https://github.com/albanD
2022-04-25 21:31:48 +00:00
Edward Yang
36420b5e8c Rename tools/codegen to torchgen (#76275)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76275

In preparation for addressing
https://github.com/pytorch/pytorch/issues/73212

Diff was generated with:

```
git mv tools/codegen torchgen
git grep -l 'tools.codegen' | xargs sed -i 's/tools.codegen/torchgen/g'
sed -i "s/\${TOOLS_PATH}\/codegen/\${TORCH_ROOT}\/torchgen/g" caffe2/CMakeLists.txt
```

and manual edits to:

* tools/test/test_gen_backend_stubs.py
* torchgen/build.bzl
* torchgen/gen_backend_stubs.py

aka this diff:

```
diff --git a/tools/test/test_gen_backend_stubs.py b/tools/test/test_gen_backend_stubs.py
index 3dc26c6d2d..104054575e 100644
--- a/tools/test/test_gen_backend_stubs.py
+++ b/tools/test/test_gen_backend_stubs.py
@@ -9,7 +9,7 @@ from torchgen.gen_backend_stubs import run
 from torchgen.gen import _GLOBAL_PARSE_NATIVE_YAML_CACHE  # noqa: F401

 path = os.path.dirname(os.path.realpath(__file__))
-gen_backend_stubs_path = os.path.join(path, '../torchgen/gen_backend_stubs.py')
+gen_backend_stubs_path = os.path.join(path, '../../torchgen/gen_backend_stubs.py')

 # gen_backend_stubs.py is an integration point that is called directly by external backends.
 # The tests here are to confirm that badly formed inputs result in reasonable error messages.
diff --git a/torchgen/build.bzl b/torchgen/build.bzl
index ed04e35a43..d00078a3cf 100644
--- a/torchgen/build.bzl
+++ b/torchgen/build.bzl
@@ -1,6 +1,6 @@
 def define_targets(rules):
     rules.py_library(
-        name = "codegen",
+        name = "torchgen",
         srcs = rules.glob(["**/*.py"]),
         deps = [
             rules.requirement("PyYAML"),
@@ -11,6 +11,6 @@ def define_targets(rules):

     rules.py_binary(
         name = "gen",
-        srcs = [":codegen"],
+        srcs = [":torchgen"],
         visibility = ["//visibility:public"],
     )
diff --git a/torchgen/gen_backend_stubs.py b/torchgen/gen_backend_stubs.py
index c1a672a655..beee7a15e0 100644
--- a/torchgen/gen_backend_stubs.py
+++ b/torchgen/gen_backend_stubs.py
@@ -474,7 +474,7 @@ def run(
 ) -> None:

     # Assumes that this file lives at PYTORCH_ROOT/torchgen/gen_backend_stubs.py
-    pytorch_root = pathlib.Path(__file__).parent.parent.parent.absolute()
+    pytorch_root = pathlib.Path(__file__).parent.parent.absolute()
     template_dir = os.path.join(pytorch_root, "aten/src/ATen/templates")

     def make_file_manager(install_dir: str) -> FileManager:
```
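
The two path edits in the diff above follow from the directory moving one level closer to the repository root. The sketch below checks that arithmetic, assuming a made-up checkout location `/src/pytorch` purely for illustration.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical checkout location, used only for illustration.
root = PurePosixPath("/src/pytorch")

old_file = root / "tools" / "codegen" / "gen_backend_stubs.py"
new_file = root / "torchgen" / "gen_backend_stubs.py"

# Before the move: file -> codegen -> tools -> root, i.e. three hops.
old_root = old_file.parent.parent.parent
# After the move: file -> torchgen -> root, i.e. two hops.
new_root = new_file.parent.parent

# The test lives in tools/test, so it must climb two levels to reach torchgen.
test_dir = "/src/pytorch/tools/test"
stubs_path = posixpath.normpath(
    posixpath.join(test_dir, "../../torchgen/gen_backend_stubs.py")
)
# Climbing only one level would land inside tools/ instead.
wrong_path = posixpath.normpath(
    posixpath.join(test_dir, "../torchgen/gen_backend_stubs.py")
)
```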

run_all_fbandroid_tests

Test Plan: sandcastle

Reviewed By: albanD, ngimel

Differential Revision: D35770317

fbshipit-source-id: 153ac4a7fef15b1e750812a90bfafdbc8f1ebcdf
(cherry picked from commit c6d485d1d4648fa1c8a4c14c5bf3d8e899b9b4dd)
2022-04-25 01:38:06 +00:00
Edward Z. Yang
a11c1bbdd0 Run Black on all of tools/
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76089

Approved by: https://github.com/albanD
2022-04-20 17:29:41 +00:00
Ivan Yashchuk
bba4780232 Enable autograd wrt sparse CSR tensors
This pull request enables accumulating gradients for the CSR tensor.
Functions that work and are tested:
- tensor.abs()
- tensor.neg()
- tensor.conj_physical()
- torch.addmm

`torch.mm` also works, but tests will be added later.

In addition, this PR raises an error on attempts to access strides, storage, and contiguity info of a CSR tensor.

`tensor.to_sparse_csr().to_sparse_csr()` was failing and is now fixed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75435
Approved by: https://github.com/cpuhrsch
2022-04-19 18:42:45 +00:00
soulitzer
d7b29f3ee6 Add forward AD codegen support for single formula returning multiple outputs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75583

Approved by: https://github.com/albanD
2022-04-13 15:03:47 +00:00
Brian Hirsh
23b8414391 code-generate non-aliasing {view}_copy kernels (#73442)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73442

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D35016025

Pulled By: bdhirsh

fbshipit-source-id: 2a7f303ec76f5913b744c7822a531d55a57589c9
(cherry picked from commit 3abe13c2a787bcbe9c41b0a335c96e5a3d3642fb)
2022-04-11 19:48:55 +00:00
PyTorch MergeBot
ea44645c9a Revert "Allow specifying tags for aten operators in native_functions.yaml"
This reverts commit 1dab71ab25.

Reverted https://github.com/pytorch/pytorch/pull/72549 on behalf of https://github.com/malfet
2022-03-28 18:04:38 +00:00
anjali411
1dab71ab25 Allow specifying tags for aten operators in native_functions.yaml
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72549

Approved by: https://github.com/ezyang
2022-03-25 21:17:52 +00:00
Joel Schlosser
fc37e5b3ed Hook up general convolution to convolution_backward (#69584)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69584

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D32936380

Pulled By: jbschlosser

fbshipit-source-id: c6fdd88db33bd1a9d0eabea47ae09a4d5b170e92
2021-12-12 17:30:01 -08:00
Edward Yang
ece0221854 Rename int to long, add more C++ types. (#66108)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66108

BC-breaking change: intT is now longT (which aligns it more accurately with how
the types are referred to in C++).  The benefit for this is we can idiomatically
express all C++ dtypes (with intT now mapping to int32_t).  These types are needed
for ufunc codegen in a later patch.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D31385761

Pulled By: ezyang

fbshipit-source-id: ec6f3a0953794313470dbe14911f23ac116be425
2021-10-08 08:25:06 -07:00
Michael Dagitses
543185a0fd support using gradients named for outputs in derivatives (#63947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63947

Fixes #62196

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30541485

Pulled By: dagitses

fbshipit-source-id: ea1dd0edd1a51936a295631e52b85e9c022a9c87
2021-09-18 07:31:45 -07:00
Michael Dagitses
926a3d2e85 clarify implementation of check_grad_usage (#64439)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64439

1) remove unused fully_implemented
2) rename used_grad to uses_grad and make it a boolean
3) rename used_grads to num_grads_uses
4) add comments explaining what some of the checks mean
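
To make the renamed quantities concrete, here is a minimal sketch of what a boolean `uses_grad` and a counter `num_grads_uses` might track over a set of derivative formulas. It assumes formulas are plain strings and uses word-boundary matching; the helper `grad_usage` is hypothetical, and the real `check_grad_usage` performs additional checks not shown here.

```python
import re
from typing import List, Tuple


def grad_usage(formulas: List[str]) -> Tuple[bool, int]:
    """Return (uses_grad, num_grads_uses) over a list of formula strings."""
    uses_grad = False
    num_grads_uses = 0
    for formula in formulas:
        # \b keeps `grad` from matching inside `grads` or `grad_output`.
        uses_grad = uses_grad or bool(re.search(r"\bgrad\b", formula))
        num_grads_uses += len(re.findall(r"\bgrads\b", formula))
    return uses_grad, num_grads_uses


single = grad_usage(["grad * 2"])
indexed = grad_usage(["grads[0] + grads[1]"])
```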

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D30733904

Pulled By: dagitses

fbshipit-source-id: dccbbef8a4be8713215ef91aa97a34124f06a7a1
2021-09-18 07:30:30 -07:00
Michael Dagitses
b737629ff0 simplify op name determination into a single forward pass (#64261)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64261

Note that this does not preserve byte-for-byte compatibility with
existing names.

Test Plan:
* Rely on CI to catch gross errors.
* Merge after release cut to catch subtle issues.

Reviewed By: albanD

Differential Revision: D30700647

Pulled By: dagitses

fbshipit-source-id: 7b02f34b8fae3041240cc78fbc6bcae498c3acd4
2021-09-02 07:32:11 -07:00
Richard Zou
389380ffcc [reland] Refactor Tensor::to to call a primitive that is not copy_. (#62262)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62262

Context
-------
functorch is unable to vmap(grad(f)) when f contains a .to
call. This is because .to (when it is not a no-op) decomposes
to .copy_ under grad and the .copy_ is not compatible with vmap.

Fix
---
The fix for this is to have all Tensor::to variants call a new operator,
`_to_copy`, that always copies and is a primitive w.r.t. autograd so
that autograd decomposes Tensor::to into a call to `_to_copy`.
(This is related to https://github.com/pytorch/pytorch/issues/60956,
please let me know if you want to bikeshed the naming).

In order to get this done I had to do a bit of refactoring. All of the
`::to` implementations now call `to_impl` which may call `_to_copy`.
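
The control flow described above can be sketched with a toy stand-in for a tensor. This is a simplified model under stated assumptions: `FakeTensor` is hypothetical, only dtype and device are considered, and the Python `to_impl` and `_to_copy` here merely mirror the names in the text rather than reproduce the C++ implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FakeTensor:
    dtype: str
    device: str


def _to_copy(t: FakeTensor, dtype: str, device: str) -> FakeTensor:
    # Always produces a new object; this is the single primitive autograd sees.
    return FakeTensor(dtype, device)


def to_impl(
    t: FakeTensor, dtype: Optional[str] = None, device: Optional[str] = None
) -> FakeTensor:
    dtype = dtype or t.dtype
    device = device or t.device
    if dtype == t.dtype and device == t.device:
        return t  # no-op: hand back the same tensor without copying
    return _to_copy(t, dtype, device)


x = FakeTensor("float32", "cpu")
same = to_impl(x)
converted = to_impl(x, dtype="int64")
```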

Autograd codegen changes
------------------------

The second thing I had to do was modify the autograd codegen. Right now,
autograd assumes that every output is either statically known to be
differentiable or not differentiable at codegen time. `_to_copy` is a
little special because its differentiability depends on the output
dtype. e.g. `torch.randn(3, requires_grad=True).to(torch.long)` is non
differentiable. To get this to work:
- I changed how `output_differentiability` in derivatives.yaml works.
- output_differentiability can now accept "conditions" for each of the
output arguments. A "condition" is some C++ code.
- We currently only support `output_differentiability` with conditions
if there is a single output. This is for convenience and can be changed
in the future.
- I added a new `output_differentiability_conditions` field to
DifferentiabilityInfo. This gets populated in load_derivatives.py
- forward-mode and reverse-mode AD take
`output_differentiability_conditions` into account.

Here's how the generated code for `VariableType::_to_copy`
[looks like](https://gist.github.com/zou3519/93462df4bda1837acee345205b7cc849).
No other autogenerated code gets modified by this PR.

Performance benchmarking
------------------------
- I benchmarked [three
cases that demonstrate overhead](https://gist.github.com/zou3519/5b6985e6906b80eec5a0dd94ed5b6a1a).
- Case A: No-op .to(). Instruction count went from 50223 to 25623. I
have no clue why but this is a good thing.
- Case B: not-no-op .to(). Instruction count went from 665291 to 671961.
This is expected; `_to_copy` adds an additional dispatch.
- Case C: not-no-op .to() forward pass and backward pass. Instruction count
went from 4022841 to 4030057. This PR adds
an additional dispatch to .to() (so there should be one additional
dispatch in the forward pass) so this number looks reasonable.
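
For scale, the instruction counts quoted above can be turned into relative changes. The snippet uses only the numbers from the three cases; the helper name `relative_change` is made up for this sketch.

```python
def relative_change(before: int, after: int) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


cases = {
    "A (no-op .to())": (50223, 25623),
    "B (not-no-op .to())": (665291, 671961),
    "C (forward + backward)": (4022841, 4030057),
}

for name, (before, after) in cases.items():
    delta = after - before
    pct = relative_change(before, after)
    print(f"Case {name}: {delta:+d} instructions ({pct:+.2f}%)")
```

Case A drops by roughly 49%, while cases B and C grow by about 1% and 0.2% respectively, which is in line with a single extra dispatch.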

Test Plan
---------
- test_torch.py has a test_to
- test_cuda.py has test_to*
- test_autograd has tests (test_type_conversions) that exercise the
reverse-mode path
- test_ops.py has some tests (like log_softmax) that exercise the
reverse-mode and forward-mode AD path.
- test_quantization, test_namedtensor all exercise tensor.to as well.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D29934998

Pulled By: zou3519

fbshipit-source-id: 820069acd66fd5af97b98f42edfca68572c9eb1c
2021-07-29 10:49:32 -07:00
albanD
ab0354b650 All remaining linear/element-wise formulas (#59993)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59993

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D29914594

Pulled By: albanD

fbshipit-source-id: 2ffc5993cb66586e1458d7016774a03dfe786863
2021-07-27 13:06:46 -07:00
albanD
4a36e2a223 Add forward AD inplace check and fix codegen (#60498)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60498

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D29914593

Pulled By: albanD

fbshipit-source-id: bde649d5a03639a240dfe5fe027c6a3f758428a4
2021-07-27 13:04:55 -07:00
Nikita Shulga
478098aaac Revert D29801652: Refactor Tensor::to to call a primitive that is not copy_.
Test Plan: revert-hammer

Differential Revision:
D29801652 (29bb3f4647)

Original commit changeset: bb01eb1acf3d

fbshipit-source-id: 93693bad8068d47a3a4c16f34f300e03ea573897
2021-07-26 19:37:17 -07:00
Richard Zou
29bb3f4647 Refactor Tensor::to to call a primitive that is not copy_. (#61458)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61458

Context
-------
functorch is unable to vmap(grad(f)) when f contains a .to
call. This is because .to (when it is not a no-op) decomposes
to .copy_ under grad and the .copy_ is not compatible with vmap.

Fix
---
The fix for this is to have all Tensor::to variants call a new operator,
`_to_copy`, that always copies and is a primitive w.r.t. autograd so
that autograd decomposes Tensor::to into a call to `_to_copy`.
(This is related to https://github.com/pytorch/pytorch/issues/60956,
please let me know if you want to bikeshed the naming).

In order to get this done I had to do a bit of refactoring. All of the
`::to` implementations now call `to_impl` which may call `_to_copy`.

Autograd codegen changes
------------------------

The second thing I had to do was modify the autograd codegen. Right now,
autograd assumes that every output is either statically known to be
differentiable or not differentiable at codegen time. `_to_copy` is a
little special because its differentiability depends on the output
dtype. e.g. `torch.randn(3, requires_grad=True).to(torch.long)` is non
differentiable. To get this to work:
- I changed how `output_differentiability` in derivatives.yaml works.
- output_differentiability can now accept "conditions" for each of the
output arguments. A "condition" is some C++ code.
- We currently only support `output_differentiability` with conditions
if there is a single output. This is for convenience and can be changed
in the future.
- I added a new `output_differentiability_conditions` field to
DifferentiabilityInfo. This gets populated in load_derivatives.py.
- forward-mode and reverse-mode AD take
`output_differentiability_conditions` into account.

Here's how the generated code for `VariableType::_to_copy`
[looks like](https://gist.github.com/zou3519/93462df4bda1837acee345205b7cc849).
No other autogenerated code gets modified by this PR.
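
A minimal Python sketch of the idea described above, not the actual codegen: `DifferentiabilityInfoSketch`, `load_output_differentiability`, `emit_requires_grad` and the condition string `output_dtype_is_differentiable` are all hypothetical names chosen for illustration, and the emitted C++ line is only an assumption about what a guarded check could look like.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Union


@dataclass(frozen=True)
class DifferentiabilityInfoSketch:
    """Hypothetical, trimmed-down stand-in for DifferentiabilityInfo."""
    name: str
    output_differentiability: Optional[List[bool]]
    # C++ condition strings; None when the yaml entry has no conditions.
    output_differentiability_conditions: Optional[List[str]]


def load_output_differentiability(
    name: str, raw: Optional[Sequence[Union[bool, str]]]
) -> DifferentiabilityInfoSketch:
    """Split a yaml `output_differentiability` list into bools and conditions."""
    if raw is None:
        return DifferentiabilityInfoSketch(name, None, None)
    conditions = [entry for entry in raw if isinstance(entry, str)]
    if conditions and len(raw) != 1:
        raise ValueError(f"{name}: conditions are only supported for a single output")
    # An output guarded by a condition is treated as potentially differentiable.
    flags = [True if isinstance(entry, str) else entry for entry in raw]
    return DifferentiabilityInfoSketch(name, flags, conditions or None)


def emit_requires_grad(info: DifferentiabilityInfoSketch, inputs: Sequence[str]) -> str:
    """Emit a C++ line deciding whether the op participates in autograd."""
    check = f"compute_requires_grad({', '.join(inputs)})"
    if info.output_differentiability_conditions:
        # The condition is pasted verbatim into the generated C++.
        check += f" && ({info.output_differentiability_conditions[0]})"
    return f"auto _any_requires_grad = {check};"


# `output_dtype_is_differentiable` is a placeholder, not a real condition.
info = load_output_differentiability("_to_copy", ["output_dtype_is_differentiable"])
print(emit_requires_grad(info, ["self"]))
```

The sketch keeps the single-output restriction as an explicit error so that a yaml entry mixing conditions with several outputs fails at load time rather than producing wrong code.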

Performance benchmarking
------------------------
- I benchmarked [three
cases that demonstrate overhead](https://gist.github.com/zou3519/5b6985e6906b80eec5a0dd94ed5b6a1a).
- Case A: No-op .to(). Instruction count went from 50223 to 25623. I
have no clue why but this is a good thing.
- Case B: not-no-op .to(). Instruction count went from 665291 to 671961.
This is expected; `_to_copy` adds an additional dispatch.
- Case C: not-no-op .to() forward pass and backward pass. Instruction count
went from 4022841 to 4030057. This PR adds
an additional dispatch to .to() (so there should be one additional
dispatch in the forward pass) so this number looks reasonable.
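
For scale, the deltas implied by the three cases can be worked out directly. This is only arithmetic on the instruction counts quoted above; nothing is re-measured, and the helper name `delta` is made up for the example.

```python
from typing import Dict, Tuple

# Instruction counts (before, after) copied from cases A, B and C above.
COUNTS: Dict[str, Tuple[int, int]] = {
    "A: no-op .to()": (50223, 25623),
    "B: not-no-op .to()": (665291, 671961),
    "C: not-no-op .to(), forward + backward": (4022841, 4030057),
}


def delta(before: int, after: int) -> Tuple[int, float]:
    """Return the absolute change and the change as a percentage of `before`."""
    diff = after - before
    return diff, 100.0 * diff / before


for case, (before, after) in COUNTS.items():
    diff, pct = delta(before, after)
    print(f"{case}: {diff:+d} instructions ({pct:+.2f}%)")
```

Cases B and C both grow by roughly 7k instructions, which is about 1% of the forward-only case and under 0.2% once the backward pass is included.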

Test Plan
---------
- test_torch.py has a test_to
- test_cuda.py has test_to*
- test_autograd has tests (test_type_conversions) that exercise the
reverse-mode path
- test_ops.py has some tests (like log_softmax) that exercise the
reverse-mode and forward-mode AD path.
- test_quantization, test_namedtensor all exercise tensor.to as well.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D29801652

Pulled By: zou3519

fbshipit-source-id: bb01eb1acf3d79d84f284150d1be4be3b4ace351
2021-07-26 13:02:39 -07:00
albanD
30a18fe318 refactor yaml loader import, no runtime change (#59850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59850

This whole stack does not change anything to the codegened code

Test Plan: Imported from OSS

Reviewed By: ailzhang

Differential Revision: D29063816

Pulled By: albanD

fbshipit-source-id: ca3067443d8e6282c1077d3dafa3b4f330d43b28
2021-06-12 06:58:34 -07:00
albanD
7143a6a189 Avoid unnecessary re-computation autograd codegen 21s -> 15s (#59847)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59847

This whole stack does not change anything to the codegened code

Test Plan: Imported from OSS

Reviewed By: ailzhang

Differential Revision: D29063817

Pulled By: albanD

fbshipit-source-id: 284c3e057029b7a67f43a1b034bb30863bd68c71
2021-06-12 06:57:19 -07:00
anjali411
3607478ecd Conjugate View (#54987)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54987

Based on prototypes by ezyang (https://github.com/pytorch/pytorch/pull/44799) and bdhirsh (https://github.com/pytorch/pytorch/pull/43702).

Here's a summary of the changes in this PR:
This PR adds a new dispatch key called Conjugate. This enables us to make the conjugate operation a view and leverage the specialized library functions that fast path with the hermitian operation (conj + transpose).

1. Conjugate operation will now return a view with conj bit (1) for complex tensors and returns self for non-complex tensors as before. This also means `torch.view_as_real` will no longer be a view on conjugated complex tensors and is hence disabled. To fill the gap, we have added `torch.view_as_real_physical` which would return the real tensor agnostic of the conjugate bit on the input complex tensor. The information about conjugation on the old tensor can be obtained by calling `.is_conj()` on the new tensor.
2. NEW API:
    a) `.conj()` -- now returning a view.
    b) `.conj_physical()` -- does the physical conjugate operation. If the conj bit for input was set, you'd get `self.clone()`, else you'll get a new tensor with conjugated value in its memory.
    c) `.conj_physical_()`, and `out=` variant
    d) `.resolve_conj()`  -- materializes the conjugation. returns self if the conj bit is unset, else returns a new tensor with conjugated values and conj bit set to 0.
    e) `.resolve_conj_()` in-place version of (d)
    f) `view_as_real_physical` -- as described in (1), it's functionally same as `view_as_real`, just that it doesn't error out on conjugated tensors.
    g) `view_as_real` -- existing function, but now errors out on conjugated tensors.
3. Conjugate Fallback
    a) Vast majority of PyTorch functions would currently use this fallback when they are called on a conjugated tensor.
    b) This fallback is well equipped to handle the following cases:
        - functional operation e.g., `torch.sin(input)`
        - Mutable inputs and in-place operations e.g., `tensor.add_(2)`
        - `out=` operation e.g., `torch.sin(input, out=out)`
        - Tensorlist input args
        - NOTE: Meta tensors don't work with conjugate fallback.
4. Autograd
    a) `resolve_conj()` is an identity function w.r.t. autograd
    b) Everything else works as expected.
5. Testing:
    a) All method_tests run with conjugate view tensors.
    b) OpInfo tests that run with conjugate views
        - test_variant_consistency_eager/jit
        - gradcheck, gradgradcheck
        - test_conj_views (that only run for `torch.cfloat` dtype)

NOTE: functions like `empty_like`, `zeros_like`, `randn_like`, `clone` don't propagate the conjugate bit.
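
The conj-bit semantics summarized above can be illustrated with a toy model in plain Python. This is a sketch only: `LazyConjVector` is a hypothetical class, it does not use PyTorch, and it models just `.conj()`, `.is_conj()` and `.resolve_conj()` as they are described in this summary.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LazyConjVector:
    """Toy complex vector whose conjugation is tracked by a bit, not by a copy."""
    storage: List[complex]   # shared between a vector and its conj() views
    conj_bit: bool = False

    def is_conj(self) -> bool:
        return self.conj_bit

    def conj(self) -> "LazyConjVector":
        # A view: same storage object, flipped bit, no data touched.
        return LazyConjVector(self.storage, not self.conj_bit)

    def resolve_conj(self) -> "LazyConjVector":
        # Returns self if the bit is unset; otherwise materializes the
        # conjugated values into new storage with the bit cleared.
        if not self.conj_bit:
            return self
        return LazyConjVector([z.conjugate() for z in self.storage], False)

    def tolist(self) -> List[complex]:
        # Logical values, taking the conj bit into account.
        if self.conj_bit:
            return [z.conjugate() for z in self.storage]
        return list(self.storage)


v = LazyConjVector([1 + 2j, 3 - 4j])
c = v.conj()
print(c.is_conj(), c.storage is v.storage)
print(c.resolve_conj().storage)
```

The design point the sketch tries to show is that `conj()` costs nothing up front; the conjugation is only paid for when something calls `resolve_conj()` or reads the logical values.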

Follow up work:
1. conjugate view RFC
2. Add neg bit to re-enable view operation on conjugated tensors
3. Update linalg functions to call into specialized functions that fast path with the hermitian operation.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D28227315

Pulled By: anjali411

fbshipit-source-id: acab9402b9d6a970c6d512809b627a290c8def5f
2021-06-04 14:12:41 -07:00
albanD
d095ec75a1 Forward AD formulas batch 2 (#57863)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57863

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D28387763

Pulled By: albanD

fbshipit-source-id: e1b60ab728bb05b9e3323ee0dc7e401aaf5b8817
2021-06-03 07:33:04 -07:00
albanD
09a1b1cf87 Forward AD formulas batch 1 (#57768)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57768

Note that this PR implements formulas only for ops that are supported by OpInfo.

Test Plan: Imported from OSS

Reviewed By: zou3519, malfet

Differential Revision: D28387766

Pulled By: albanD

fbshipit-source-id: b4ba1cf1ac1dfd46cdd889385c9c2d5df3cf7a71
2021-05-25 07:29:25 -07:00
Kurt Mohler
fe8e5eb260 Change native functions to take c10::string_view args instead of std::string (#57680)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53546

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57680

Reviewed By: malfet

Differential Revision: D28511799

Pulled By: ezyang

fbshipit-source-id: 43142f994d048b28b3279ccdb7a28cbaa3190973
2021-05-20 18:15:45 -07:00
Brian Hirsh
9354a68e7d [codegen] split out backend-specific information from NativeFunction in the model (#57361)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57361

Data model change in the codegen, which splits backend-specific information out of `NativeFunction`

### Overview
Currently in the codegen, native_functions.yaml has backend-specific information about each operator that is encoded directly into the data model, in the `NativeFunction` object. That's reasonable, since the native_functions.yaml is the source of truth for information about an operator, and the data model encodes that information into types.

Now that external backends can use the codegen though, that information is technically incomplete/inaccurate. In another PR, I tried patching the information on the `NativeFunction` object with the additional external information, by updating the `dispatch` entry to contain the external backend kernel name and dispatch key.

Instead, this PR tries to split out that information. The `NativeFunction` class contains all information about an operator from native_functions.yaml that's backend-independent and is known never to change regardless of what extra information backends provide. We also build up a backend "index", which is basically a mapping from [backend] -> [backend-specific-metadata]. Reading in an external backend yaml just involves updating that index with the new backend.

There were a few places where `NativeFunction` used the dispatch table directly, that I encoded as properties directly on the NativeFunction object (e.g. `is_abstract`). They were mostly around whether or not the operator has a composite kernel, which isn't something that's going to change for any external backends.

This has a few advantages:
- We can more easily re-use the existing logic in `native_function.py` and `register_dispatch_key.py` for both native and external backends, since they both involve a NativeFunction + a particular backend index
- The data in the data model will be the same regardless of how the codegen is run. Running the codegen with a new external backend doesn't change the data inside of NativeFunction or an existing backend index. It just adds a new index for that backend.
- Several areas of the codegen don't care about backend-specific information: mostly the tracing and autograd codegen. We can reason about the codegen there more easily, knowing that backend-specific info is entirely uninvolved.

An alternative to this split would be to augment the NativeFunction objects with external backend information at the time that we create them. So the external codegen could read both native_functions.yaml and the external backend's yaml at the same time, and construct a NativeFunction with a full dispatch table (including the XLA entry), and the correct setting of structured (taking into account both yamls). One disadvantage to this approach is that NativeFunction objects now contain different stuff depending on how you ran the codegen, and you have to make sure that any changes to the codegen can properly handle all the different variants.

### Data Model Changes
Removed three classes that were used by the external codegen:
- ExternalBackendFunction
- ExternalBackendFunctionsGroup
- ExternalBackendMetadata

And added two new ones:
- BackendIndex
- BackendMetadata

`BackendIndex` contains any info that's specific to that backend, plus a mapping from operator names to backend specific metadata about the operator. One example of backend-specific info that's not operator-dependent is the fact that XLA prefers to implement functional kernels instead of out kernels (and so when they eventually mark an op as structured, they're going to mark the functional op and not the out op).

`BackendMetadata` contains info specific to an (operator, backend) pair. Right now, that's just (a) the name of the kernel, and (b) whether or not that operator is structured.
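
A rough sketch of how the two classes could fit together, using plain dataclasses. The field names `prefers_functional_kernels` and `index`, the helper `get_kernel`, and the kernel names are assumptions made for illustration; only the split into per-backend and per-(operator, backend) information comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class BackendMetadata:
    """Info specific to one (operator, backend) pair."""
    kernel: str        # name of the kernel implementing the operator
    structured: bool   # whether the operator is structured for this backend


@dataclass
class BackendIndex:
    """Everything known about a single backend."""
    dispatch_key: str
    # Backend-wide, operator-independent info (hypothetical field name),
    # e.g. XLA preferring functional kernels over out kernels.
    prefers_functional_kernels: bool
    # Mapping from operator name to that operator's backend-specific metadata.
    index: Dict[str, BackendMetadata] = field(default_factory=dict)

    def get_kernel(self, op_name: str) -> Optional[BackendMetadata]:
        # None means this backend has no kernel for the operator.
        return self.index.get(op_name)


backend_indices: Dict[str, BackendIndex] = {
    "CPU": BackendIndex("CPU", False, {"add.out": BackendMetadata("add_out", True)}),
}
# Reading an external backend's yaml only adds a new index;
# the existing CPU index is left untouched.
backend_indices["XLA"] = BackendIndex(
    "XLA", True, {"add.Tensor": BackendMetadata("add", False)}
)
print(backend_indices["XLA"].get_kernel("add.Tensor"))
```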

### Questions
I wanted to get this PR up earlier so I could get feedback, but there are a few things I want to call out:

**Dealing with `structured`.**
This PR separates out the notion of `structured` into two bits of information:
- Does [operator] have a meta() function. This is backend-agnostic, and is represented by the `structured` property on `NativeFunction`, same as before. This is used, e.g., to decide what signatures to add to `MetaFunctions.h`.
- Does [operator, backend] have an impl() function. This is backend dependent; even though technically all in-tree backends are forced to write impl() functions for an operator when we port the op to structured in native_functions.yaml, out-of-tree backends can decide to opt in independently. This is represented as a property on `BackendMetadata`. This is used in most other cases, e.g. in `RegisterDispatchKey` when we're deciding whether or not to gen a structured or unstructured wrapper.

I also baked `is_structured_dispatch_key` directly into each BackendIndex. So for operators marked "structured" in native_functions.yaml, their corresponding CPU/CUDA BackendIndex entries will be marked structured, and all others (except for potentially external backends) will not.

I ended up trying to deal with `structured` in this change since it's technically backend dependent (XLA can opt kernels into structured separately from in-tree ops), but that may have been too ambitious: it's technically not relevant until we actually add support for structured external kernels. If it's not clear that this is the right path for dealing with structured and we want to push that off, I'm fine with backing out the bits of this PR that make `structured` backend-dependent. I don't see anything *too* controversial related to structured in the change, but I tried to call out any areas in the comments.

**Localizing the fact that external backends follow Dispatcher convention.**
Another thing that's sort of backend specific that I didn't totally address in this PR is the fact that in-tree backends follow the Native API while external backends follow the Dispatcher API. I painted over that in `native_functions.py` by adding a helper, `kernel_signature`, that takes in a native function and gives you the "correct" signature for the specified backend: NativeSignature for in-tree backends, and DispatcherSignature for out-of-tree backends. In order to make that fully usable though, we'll need `NativeSignature` and `DispatcherSignature` to have matching interfaces. I didn't bother with that in this PR, which is why `gen_external_aten_fallbacks.py` still has a bunch of direct references to the dispatcher API. Thinking of adding it in a later PR but wanted to see if anyone has other opinions.

Maybe `is_external()` shouldn't even be a property on the BackendMetadata, and anything the codegen does that requires asking for that information should just be better abstracted away.

**Thoughts on the `BackendIndex` / `BackendMetadata` breakdown.**
One thing that's annoying right now is that to query for various pieces of metadata, you call helper functions like `backend_index.structured(f)`, which queries that particular backend and tells you if that specific NativeFunctionGroup is structured for that backend. It has to return an `Optional[bool]` though, since you have to handle the case where that operator doesn't have a kernel for that backend at all. So users of those helpers end up with a bunch of optionals that they need to unpack, even if they know at some point that the result isn't None. I think it would be easier instead to just store the NativeFunction object as a field directly on the BackendMetadata. Curious if there are any other opinions on a better way to model it though.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D28474362

Pulled By: bdhirsh

fbshipit-source-id: 41a00821acf172467d764cb41e771e096542f661
2021-05-17 12:25:35 -07:00
Sam Estep
75024e228c Add lint for unqualified type: ignore (#56290)
Summary:
The other half of https://github.com/pytorch/pytorch/issues/56272.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56290

Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI runs (before this PR was finished) failed:

- https://github.com/pytorch/pytorch/runs/2384511062
- https://github.com/pytorch/pytorch/actions/runs/765036024

Reviewed By: seemethere

Differential Revision: D27867219

Pulled By: samestep

fbshipit-source-id: e648f07b6822867e70833e23ddafe7fb7eaca235
2021-04-21 08:07:23 -07:00
Brian Hirsh
eca98fedb5 split out NamedCType from CType. Remove direct string comparison from autograd codegen (#55334)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55334

The goal of this PR is to clean up some of the autograd codegen to compare C++ types using `CType` objects instead of raw strings. My last PR in the stack made that string comparison a little more fragile, since the raw C++ strings needed to be namespace-aware.

I confirmed byte-for-byte no codegen changes vs. the last PR (which added namespaces to the codegen) by running `diff -qr ../pytorch-common_test/torch/csrc/autograd/generated/ ../pytorch-callgrind_test_after2/torch/csrc/autograd/generated/` and `diff -qr ../pytorch-common_test/build/aten/src/ATen/ ../pytorch-callgrind_test_after2/build/aten/src/ATen/`

Note that a better end-state for the autograd codegen would be to do all of its type pattern matching directly off of JIT types, instead of off of CType’s (which are really just generated from JIT types, incorporating C++ specific semantics). That looks like it’ll require a pretty substantial change though, so I’m not doing it in this PR.

As part of this change (and after talking with ezyang), I split off the `CType` data class into a separate `NamedCType` class, which holds a name and a `CType`. This way, `CType` only knows about actual C++ types, making it easier to compare CType’s to each other in the codegen when we only care about the type. The core change is in `types.py`, but it required a bunch of downstream changes to update all of the places where we create `CType`s to create `NamedCType`s instead.
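
The split can be pictured with two small dataclasses. This is a simplified sketch rather than the contents of `types.py`: the types are held as plain strings here, and the variable names `saved_self` and `saved_other` are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseCType:
    """A C++ type with no argument name attached."""
    type: str

    def cpp_type(self) -> str:
        return self.type


@dataclass(frozen=True)
class NamedCType:
    """Pairs a name with a CType."""
    name: str
    type: BaseCType


saved_self = NamedCType("self", BaseCType("at::Tensor"))
saved_other = NamedCType("other", BaseCType("at::Tensor"))

# Comparing only the types ignores the names entirely.
print(saved_self.type == saved_other.type)
# Comparing the named pairs still distinguishes the two arguments.
print(saved_self == saved_other)
```

Because the name lives outside `BaseCType`, code that only cares about the type can compare `.type` fields instead of matching on raw C++ strings.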

The main change in the autograd codegen was that I updated `SavedAttribute` to store a `NamedCType`. The other autograd changes all pretty much came from that change.

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D27708347

Pulled By: bdhirsh

fbshipit-source-id: 3e07c80569c7b229c638f389e76e319bff6315f9
2021-04-16 11:43:08 -07:00
Brian Hirsh
947c7a8215 add C++ namespacing logic to ctypes (#55047)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55047

Added namespaces to all of the `CTypes` printed in the codegen. This is pretty much required if we want to use codegen externally, since we can no longer assume that we're inside of the `at::` namespace.

Important changes are in `types.py`.

How do we add the notion of namespaces to C++ types without people having to write "at::Tensor" everywhere? Before this PR, `CType` held a raw string representing the type, i.e. `BaseCType("Tensor", binds)`. This PR introduces a set of singleton base C++ types in `types.py`, that know how to print their namespace. Instead, we'd write `BaseCType(tensorT, binds)`, where printing `tensorT` will properly print out "at::Tensor".

This also means that you can't create arbitrary `CTypes`. If we need a new C++ type in the codegen, we need to add it to the list in `types.py`.
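
A minimal sketch of the singleton idea, assuming a small `BaseCppType` class that stores a namespace and a name. Apart from `tensorT` and the `BaseCType(tensorT, binds)` shape quoted above, the names here (`BaseCppType`, `intT`, `cpp_type_without_namespace`) are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseCppType:
    ns: str     # C++ namespace; empty for builtin types
    name: str

    def __str__(self) -> str:
        return f"{self.ns}::{self.name}" if self.ns else self.name


# The fixed set of known C++ types; new ones have to be added here.
tensorT = BaseCppType("at", "Tensor")
intT = BaseCppType("", "int64_t")


@dataclass(frozen=True)
class BaseCType:
    type: BaseCppType
    name: str   # the argument this type binds to

    def cpp_type(self) -> str:
        # Namespaced spelling, safe to print outside of `at::`.
        return str(self.type)

    def cpp_type_without_namespace(self) -> str:
        # Hypothetical helper for output that must stay namespace-free.
        return self.type.name


arg = BaseCType(tensorT, "self")
print(arg.cpp_type(), arg.cpp_type_without_namespace())
```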

One blip in the design: we don't want to change `RegistrationDeclarations.yaml`, since that'll break external backends that ingest it. I added separate functions to display types without the namespace that are used to create `RegistrationDeclarations.yaml`. With an external codegen API though, we can eventually kill it :)

I also didn't realize until this PR that `Declarations.yaml` is still directly in use, by some python/autograd codegen. Rather than keep that yaml byte-for-byte compatible, I just updated the callsites in the autograd codegen to work with namespaces. In the NEXT pr, I try to clean up some of the autograd codegen to stop using raw strings to match against C++ types.

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D27708349

Pulled By: bdhirsh

fbshipit-source-id: 56a4f81fc101795bcb9ee1f722121480fb2356ad
2021-04-16 11:43:06 -07:00
Brian Hirsh
164bee1d09 Return a CType instead of a string for returns, beef up CType (#55046)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55046

Updating `returns` in the codegen to return a CType instead of a raw string.

This has the benefit of putting all stringifying logic through CType, which is useful in the followup PR when I add namespaces.

I also added new CTypes for other templated C++ types: array, vector and tuple. Mostly because it makes the namespacing logic in the next PR significantly easier. It also seems more natural to me that `BaseCType` shouldn't represent specializations of templated types.

There's a little bit of weirdness, types that are currently *only* used for returns, i.e. `TupleCType`. Returns aren't named, so I opted not to give it one- so we can add it in later if we discover that we need it.
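
One way to picture the templated types is as small wrapper classes that each know how to print themselves. The sketch below is an assumption about the shape, not the code from this PR: the class fields and the exact C++ spellings (`std::vector<...>`, `std::array<...>`, `std::tuple<...>`) are chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class BaseCType:
    type: str

    def cpp_type(self) -> str:
        return self.type


@dataclass(frozen=True)
class VectorCType:
    elem: "CType"

    def cpp_type(self) -> str:
        return f"std::vector<{self.elem.cpp_type()}>"


@dataclass(frozen=True)
class ArrayCType:
    elem: "CType"
    size: int

    def cpp_type(self) -> str:
        return f"std::array<{self.elem.cpp_type()},{self.size}>"


@dataclass(frozen=True)
class TupleCType:
    # Unnamed, since returns have no names.
    elems: Tuple["CType", ...]

    def cpp_type(self) -> str:
        return f"std::tuple<{','.join(e.cpp_type() for e in self.elems)}>"


CType = Union[BaseCType, VectorCType, ArrayCType, TupleCType]

ret = TupleCType((BaseCType("Tensor"), VectorCType(BaseCType("int64_t"))))
print(ret.cpp_type())
```

Keeping the template structure in the type tree means a later change, such as adding namespaces, only has to touch the leaf types and the `cpp_type` methods.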

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D27708348

Pulled By: bdhirsh

fbshipit-source-id: 230b210c3e53be1bd362105fbea8451055dc59a8
2021-04-16 11:41:46 -07:00
albanD
1d49fd31c4 [reland] Add formulas and basic tests (#56083)
Summary:
Reland of https://github.com/pytorch/pytorch/pull/49098
See original issue for details.

The only difference from the previous PR is a fix to the _embedding_bag_dense_backward formula so that it stops declaring a backward formula for an argument that does not exist.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56083

Reviewed By: samestep

Differential Revision: D27778221

Pulled By: albanD

fbshipit-source-id: 159ef91ca931ef2ccfbc3d1c46c7880c32919dc9
2021-04-15 07:52:43 -07:00
Sam Estep
817fd932ac Revert D25607505: Add formulas and basic tests
Test Plan: revert-hammer

Differential Revision:
D25607505 (70f5905565)

Original commit changeset: fe2315d58768

fbshipit-source-id: 519d7426a6f32f0db51c4f360e5d5a79dbaac99d
2021-04-14 14:50:43 -07:00
albanD
70f5905565 Add formulas and basic tests (#49098)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49098

RFC: https://github.com/pytorch/rfcs/pull/11

This PR adds:
- Codegen support to define forward grad formulas and a few manual formulas
- Codegen support to automatically generate formulas as well as a few usages
- Tests for basic forward grad components

Codegen generated examples.
For each of them, the only part that is changed is the if statement before the return checking for fw grad defined.

- For manual entry:
```yaml
- name: max(Tensor self) -> Tensor
  self: evenly_distribute_backward(grad, self, result)
  result: max_forward(self_fw_grad, self, result)
```

```cpp
Tensor max(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<MaxBackward1> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<MaxBackward1>(new MaxBackward1(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
    self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::max(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
      set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "max");
  if (isFwGradDefined(self)) {
      auto self_fw_grad = toLegacyFwGrad(self);
      auto self_primal = toLegacyPrimal(self);
      auto result_new_fw_grad = max_forward(self_fw_grad, self_primal, result);
      if (result_new_fw_grad.defined()) {
        result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
      }
  }
  if (grad_fn) {
    grad_fn->result_ = SavedVariable(result, true);
  }
  return result;
}
```

- For element wise entry:
```yaml
- name: abs(Tensor self) -> Tensor
  self: grad * self.sgn()
  result: auto_element_wise
```

```cpp
Tensor abs(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<AbsBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<AbsBackward>(new AbsBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
    self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::abs(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
      set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "abs");
  if (isFwGradDefined(self)) {
      auto self_fw_grad = toLegacyFwGrad(self);
      auto self_primal = toLegacyPrimal(self);
      auto result_new_fw_grad = self_fw_grad * self_primal.sgn();
      if (result_new_fw_grad.defined()) {
        result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
      }
  }
  return result;
}
```
- For linear entry:
```yaml
- name: clone(Tensor self, *, MemoryFormat? memory_format=None) -> Tensor
  self: grad
  result: auto_linear
```

```cpp
Tensor clone(const Tensor & self, c10::optional<MemoryFormat> memory_format) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<CloneBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<CloneBackward>(new CloneBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
    self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::clone(self_, memory_format);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
      set_history(flatten_tensor_args( result ), grad_fn);
  }
  if (isFwGradDefined(self)) {
      auto self_fw_grad = toLegacyFwGrad(self);
      auto result_new_fw_grad = at::clone(self_fw_grad, memory_format);
      if (result_new_fw_grad.defined()) {
        result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
      }
  }
  return result;
}
```

- For no entry:
```yaml
- name: angle(Tensor self) -> Tensor
  self: angle_backward(grad, self)
```

```cpp
Tensor angle(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<AngleBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<AngleBackward>(new AngleBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
    self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::angle(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
      set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "angle");
  TORCH_CHECK(!(isFwGradDefined(self)), "Trying to use forward prop with angle that does not support it.");
  return result;
}
```
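
The relationship between the yaml entries and the forward-grad expressions in the generated code above can be sketched as string rewriting. This is only an illustration that reproduces the two expressions shown for `abs` and `clone`; the function names and the regex approach are assumptions, not a description of how the codegen is implemented.

```python
import re
from typing import Sequence


def auto_element_wise(backward_formula: str, input_name: str = "self") -> str:
    """Reuse an element-wise backward formula as the forward formula."""
    # The incoming backward gradient becomes the input's forward gradient.
    formula = re.sub(r"\bgrad\b", f"{input_name}_fw_grad", backward_formula)
    # The input itself becomes its primal value. `\b` does not match inside
    # `self_fw_grad`, because `_` counts as a word character.
    return re.sub(rf"\b{re.escape(input_name)}\b", f"{input_name}_primal", formula)


def auto_linear(op_name: str, args: Sequence[str], input_name: str = "self") -> str:
    """Apply a linear op to the forward gradient instead of the input."""
    call_args = [f"{a}_fw_grad" if a == input_name else a for a in args]
    return f"at::{op_name}({', '.join(call_args)})"


print(auto_element_wise("grad * self.sgn()"))
print(auto_linear("clone", ["self", "memory_format"]))
```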

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D25607505

Pulled By: albanD

fbshipit-source-id: fe2315d587689af1cd5968536fa26c680b8b8829
2021-04-14 14:13:30 -07:00
Sam Estep
4753100a3b Un-ignore F403 in .flake8 (#55838)
Summary:
Generally wildcard imports are bad for the reasons described here: https://www.flake8rules.com/rules/F403.html

This PR replaces wildcard imports with an explicit list of imported items where possible, and adds a `# noqa: F403` comment in the other cases (mostly re-exports in `__init__.py` files).
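
As a small illustration of the replacement pattern (using a standard-library module rather than anything from this repository):

```python
# Instead of `from os.path import *`, which flake8 reports as F403,
# name exactly what is used.
from os.path import basename, join

# In a re-exporting `__init__.py`, the wildcard can be kept and silenced:
#     from .submodule import *  # noqa: F403

path = join("tools", "codegen", "gen.py")
print(basename(path))
```

With explicit names, both readers and tools can tell where `basename` and `join` come from.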

This is a prerequisite for https://github.com/pytorch/pytorch/issues/55816, because currently [`tools/codegen/dest/register_dispatch_key.py` simply fails if you sort its imports](https://github.com/pytorch/pytorch/actions/runs/742505908).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55838

Test Plan: CI. You can also run `flake8` locally.

Reviewed By: jbschlosser

Differential Revision: D27724232

Pulled By: samestep

fbshipit-source-id: 269fb09cb4168f8a51fd65bfaacc6cda7fb87c34
2021-04-13 09:24:07 -07:00
kedejesu
53d8778b4d Update clang-format linux hash and yaml import calls (#53932)
Summary:
Fixing Bandit security issues.
- yaml_load: Use of unsafe yaml load. Allows instantiation of arbitrary objects. Consider yaml.safe_load().
Test ID: B506
Severity: MEDIUM
Confidence: HIGH
File: ./caffe2/contrib/aten/gen_op.py
More info: https://bandit.readthedocs.io/en/latest/plugins/b506_yaml_load.html
235 if __name__ == '__main__':
236     decls = yaml.load(read(os.path.join(args.yaml_dir, 'Declarations.yaml')), Loader=Loader)
237     factory_methods = find_factory_methods(decls)

- Blacklist: Use of insecure MD2, MD4, MD5, or SHA1 hash function.
Test ID: B303
Severity: MEDIUM
Confidence: HIGH
File: ./tools/clang_format_utils.py
More info: https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b303-md5
36
37     hash = hashlib.sha1()
38
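
One way to address a B303 finding like this is to hash with a SHA-2 function instead. The sketch below is an assumption about what such a change could look like, not the code from this PR; `compute_digest` and its parameters are hypothetical.

```python
import hashlib
import io
from typing import BinaryIO


def compute_digest(stream: BinaryIO, algorithm: str = "sha256",
                   chunk_size: int = 65536) -> str:
    """Hash a binary stream in chunks and return the hex digest."""
    digest = hashlib.new(algorithm)
    # Read fixed-size chunks until read() returns an empty bytes object.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


print(compute_digest(io.BytesIO(b"clang-format binary contents")))
```

Any reference hash stored alongside the file has to be regenerated with the same algorithm, since SHA-1 and SHA-256 digests are not comparable.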

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53932

Reviewed By: jbschlosser

Differential Revision: D27072017

Pulled By: malfet

fbshipit-source-id: 2fef0119388797aee3cacdc880fc345bd2ba68ce
2021-03-18 17:11:58 -07:00
Ailing Zhang
9f75de278f Move common autograd utils functions from gen_variable_type.py to api/autograd.py. (#53340)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53340

Test Plan: Imported from OSS

Reviewed By: nikithamalgifb

Differential Revision: D26973914

Pulled By: ailzhang

fbshipit-source-id: 8367a08b27b25808782c77aadc3c67d07c354957
2021-03-11 19:58:45 -08:00
Edward Yang
93c4f9f972 Split out RegisterDispatchKey to its own file (#51508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51508

No substantive changes.  The codegen for this file was getting a
bit long so I moved it off into tools.codegen.dest submodule (I
wanted to do tools.codegen.gen but that conflicts with the existing
module; oy vey!)  To do this I had to move some other functions around
so that they were more generally accessible.  Otherwise
self-explanatory.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: ljk53

Differential Revision: D26187856

Pulled By: ezyang

fbshipit-source-id: fd3784571d03d01c4acb7ca589fcde4492526408
2021-02-04 09:19:32 -08:00
anjali411
bd3ae117fc Fixes cat backward formula to return correct gradient values for R -> C case (#51681)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51681

Fixes https://github.com/pytorch/pytorch/issues/51627

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D26238748

Pulled By: anjali411

fbshipit-source-id: 1dc47f8ddddbf3f2c176f21e5dcee917f84f4c93
2021-02-03 21:29:55 -08:00
Edward Yang
8e20594b38 Construct CppSignatureGroup from NativeFunction (#49245)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49245

This will make it easier to implement the POC in
d534f7d4c5
see also https://github.com/pytorch/pytorch/pull/45666

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: smessmer

Differential Revision: D25594005

Pulled By: ezyang

fbshipit-source-id: e458d3dc3a765ec77425761b9b17f23769cecf9e
2021-01-04 11:55:28 -08:00
Edward Yang
3efd5d8f01 Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122

cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way.  This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming.  The net result is shorter, cleaner and should be more robust to future changes.

This PR is a bit of a whopper.  Here is the order to review it.

- tools/codegen/api/types.py
  - Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
  - Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care about is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CTypes have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
  - I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
  - All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet.
- tools/codegen/api/cpp.py
  - We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
  - `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation.
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator.
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
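
To give a feel for the backwards deduction idea, here is a toy sketch. It is not the algorithm in tools/codegen/api/translate.py: the `CType` dataclass, the two rewrite rules, and the emitted C++ strings are all simplified assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CType:
    """Toy stand-in: a C++ type plus the argument name it denotes."""
    cpp_type: str
    name: str


def translate(bindings: Dict[CType, str], goals: List[CType]) -> List[str]:
    """Return one C++ expression per goal, deduced backwards from what is in scope."""

    def solve(goal: CType) -> str:
        # Base case: an expression of exactly this semantic type is in scope.
        if goal in bindings:
            return bindings[goal]
        # Illustrative rule: a `const T&` goal is satisfied by solving for `T`.
        if goal.cpp_type.startswith("const ") and goal.cpp_type.endswith("&"):
            inner = goal.cpp_type[len("const "):-1].strip()
            return solve(CType(inner, goal.name))
        # Illustrative rule: a `c10::optional<T>` goal wraps a solution for `T`.
        if goal.cpp_type.startswith("c10::optional<") and goal.cpp_type.endswith(">"):
            inner = goal.cpp_type[len("c10::optional<"):-1]
            return f"{goal.cpp_type}({solve(CType(inner, goal.name))})"
        raise LookupError(f"cannot deduce {goal}")

    return [solve(g) for g in goals]


in_scope = {CType("Tensor", "self"): "self", CType("int64_t", "dim"): "dim"}
exprs = translate(
    in_scope,
    [CType("const Tensor&", "self"), CType("c10::optional<int64_t>", "dim")],
)
```

Each rule strictly shrinks the type string of the goal, so the search terminates; a goal with no matching binding or rule fails loudly instead of producing a wrong expression.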

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: ljk53

Differential Revision: D25455887

Pulled By: ezyang

fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-16 16:18:40 -08:00
Jiakai Liu
de284b6d35 [pytorch][codegen] add autograd data model (#48249)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48249

Introduced autograd related data models at tools.codegen.api.autograd.

Migrated load_derivatives.py to produce the new data models from derivatives.yaml.
It has clean mypy-strict result.

Changed both gen_autograd_functions.py and gen_variable_type.py to consume
the new data model.

Added type annotations to gen_autograd_functions.py - it has clean mypy-strict
result except for the .gen_autograd import (so haven't added it to the strict
config in this PR).

To limit the scope of the PR, gen_variable_type.py is not refactored, and the
main structure of load_derivatives.py / gen_autograd_functions.py is kept. We
only make necessary changes to make it work.

Confirmed byte-for-byte compatible with the old codegen:

```
Run it before and after this PR:
  .jenkins/pytorch/codegen-test.sh <baseline_output_dir>
  .jenkins/pytorch/codegen-test.sh <test_output_dir>

Then run diff to compare the generated files:
  diff -Naur <baseline_output_dir> <test_output_dir>
```

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D25086561

Pulled By: ljk53

fbshipit-source-id: 1f43ab0931d9814c24683b9a48ca497c5fc3d729
2020-11-19 21:47:05 -08:00
Richard Zou
9c8f40516f Batched grad for advanced indexing (index) (#47223)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47223

This PR enables batched gradient computation for advanced indexing.
Previously, the backward formula was writing parts of the grad tensor
in-place to zeros_like(self). Since grad is a BatchedTensor and self is
not a BatchedTensor, this is not possible.

To solve the problem, we instead create a new tensor with
`grad.new_zeros` and then write to that in-place. This new tensor will
have the same batchedness as the `grad` tensor.

To prevent regressions (the autograd codegen special cases zeros_like
to avoid saving the `self` tensor for backward), we teach the autograd
codegen how to save `self.options()`.

Test Plan:
- new tests
- run old indexing tests

Reviewed By: ejguan

Differential Revision: D24741684

Pulled By: zou3519

fbshipit-source-id: e267999dc079f4fe58c3f0bdf5c263f1879dca92
2020-11-05 18:25:33 -08:00
Kurt Mohler
28f8372bf4 Avoid mat1 references in mm_mat1_backward (#45777)
Summary:
Avoiding references to `mat1` in `mm_mat1_backward` is a first step to solving issue https://github.com/pytorch/pytorch/issues/42371

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45777

Reviewed By: malfet

Differential Revision: D24347967

Pulled By: albanD

fbshipit-source-id: f09a8149d9795481b5ed5b48fdd0e598ba027d0b
2020-10-16 13:52:44 -07:00
Xiong Wei
c73255801f Fix the autograd codegen for repeat function (#40766)
Summary:
Fix https://github.com/pytorch/pytorch/issues/40701

A new special case is added to let `dim()` save an int instead of self.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40766

Differential Revision: D22308354

Pulled By: albanD

fbshipit-source-id: 69008230d7398b9e06b8e074a549ae921c2bf603
2020-07-01 15:43:28 -07:00
Edward Yang
d125b5ffa2 Fix C412 lint from flake8-comprehensions update. (#24184)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24184

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D16764168

Pulled By: ezyang

fbshipit-source-id: cc252a860fd7e4b7fb2b95c5d9fcdbf6935ffeb6
2019-08-12 14:34:45 -07:00
Brian Vaughan
97a604ef57 Rereapply optional ScalarType interface changes that were reverted in D16079809 (#22456)
Summary:
re-apply changes reverted in:
https://github.com/pytorch/pytorch/pull/22412

Also change log_softmax to take positional arguments. Long-term we do want the kwarg-only interface, but it currently seems to be incompatible with JIT serialization.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22456

Differential Revision: D16097159

Pulled By: nairbv

fbshipit-source-id: 8cb73e9ca18fc66b35b873cf4a574b167a578b3d
2019-07-03 20:03:25 -07:00
Wanchao Liang
dff2c07183 Manual revert of D16012838
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22412

Reviewed By: nairbv, houseroad

Differential Revision: D16079809

fbshipit-source-id: ee0d805ff7a2bc5f98bcc65f90b8199751c840f6
2019-07-01 19:58:21 -07:00
Brian Vaughan
7707dee761 Re apply optional ScalarType changes (#22237)
Summary:
This is (mostly) the re-application of:
https://github.com/pytorch/pytorch/pull/21088

which was reverted due to an issue conflicting with changes in:
https://github.com/pytorch/pytorch/pull/22104
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22237

Differential Revision: D16012838

Pulled By: nairbv

fbshipit-source-id: 35f4a73c97ab68b4e2648aca96b2176f07b5a883
2019-06-26 13:36:25 -07:00
Michael Suo
e016a424ef Revert D15944971: [pytorch][PR] merge interfaces that have an optional scalartype parameter
Differential Revision:
D15944971

Original commit changeset: 53473c370813

fbshipit-source-id: a18158b448cb8993b12e1a3bf2c2a3e0d6df6b10
2019-06-24 09:41:33 -07:00
Brian Vaughan
142361a7e4 merge interfaces that have an optional scalartype parameter (#21088)
Summary:
This change is backwards incompatible in *C++ only* on mean(), sum(), and prod() interfaces that accepted either of:
```
Tensor sum(IntArrayRef dim, bool keepdim=false) const;
Tensor sum(IntArrayRef dim, ScalarType dtype) const;
```
but specifying both dim and dtype now requires passing the keepdim parameter as well:
```
Tensor sum(IntArrayRef dim, bool keepdim=false, c10::optional<ScalarType> dtype=c10::nullopt) const;
```

[xla ci]
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21088

Reviewed By: ailzhang

Differential Revision: D15944971

Pulled By: nairbv

fbshipit-source-id: 53473c370813d9470b190aa82764d0aea767ed74
2019-06-24 07:17:58 -07:00
Gregory Chanan
dd0ffd6864 Use schema string specification in derivatives.yaml. (#20916)
Summary:
For consistency, derivatives.yaml now uses the same schema specification as native_functions.yaml.

Note that there are some small downsides, e.g. changing the default values or return parameter names in native_functions.yaml also now requires updating derivatives.yaml as well.  But this has a few nice properties:
1) Able to copy-paste definitions from native_functions to derivatives.
2) Makes it impossible to write derivatives for operators without schemas (e.g. old TH operators).
3) Moves us closer to the ideal situation of co-locating forward and backwards declarations.

Note that this doesn't change any generated code; in particular, this has the same behavior of mapping in-place and out-of-place definitions together.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20916

Differential Revision: D15497800

Pulled By: gchanan

fbshipit-source-id: baee5caf56b675ce78dda4aaf6ce6a34575a6432
2019-06-10 13:47:55 -07:00
Gregory Chanan
83373e7755 Hook up non_differentiability in derivatives.yaml when no autograd function is generated. (#19520)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19520
ghimport-source-id: a1272aa0b23692fb189974c4daba7b2e4e0dad50

Differential Revision: D15021380

Pulled By: gchanan

fbshipit-source-id: ec83efd4bb6d17714c060f13a0527a33a10452db
2019-04-21 13:48:55 -07:00
Gregory Chanan
8868a4f20b Move non_differentiable_arg_names from autograd functions to differentiability_info. (#19519)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19519
ghimport-source-id: 74e603688b2e4ed33f6c46c7da9d009336140e74

Differential Revision: D15021378

Pulled By: gchanan

fbshipit-source-id: e366a914c67a90ba0552b67d0bf5b347edbaf189
2019-04-21 11:09:39 -07:00
Gregory Chanan
30b2953b8b Stop generating autograd functions for derivatives.yaml entries that only specify output differentiability. (#19424)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19424
ghimport-source-id: e9d1b86742607f5cbe39fb278fa7f378739cd6ef

Differential Revision: D15003380

Pulled By: gchanan

fbshipit-source-id: 8efb94fbc0b843863021bf25deab57c492086237
2019-04-19 10:56:20 -07:00
Gregory Chanan
ea6c738c8a Rename 'not_differentiable' to 'non_differentiable'. (#19272)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19272
ghimport-source-id: 755e91efa68c5a1c4377a6853f21b3eee3f8cab5

Differential Revision: D15003381

Pulled By: gchanan

fbshipit-source-id: 54db27c5c5e65acf65821543db3217de9dd9bdb5
2019-04-19 07:07:55 -07:00
Christian Puhrsch
eff672ef06 Remove Bool/IndexTensor from schema for native functions with derivatives (#17193)
Summary:
This only deals with four functions, but is an important first step towards removing BoolTensor and IndexTensor entirely.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17193

Differential Revision: D14157829

Pulled By: cpuhrsch

fbshipit-source-id: a36f16d1d88171036c44cc7de60ac9dfed9d14f2
2019-02-26 17:54:33 -08:00
Edward Yang
4404762d7d Rename IntList to IntArrayRef. (#16751)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16751

This was made more complicated by the fact that ivalue::IntList
is a thing.  So I had to fix all of the sites where we were referring
to IValue post facto.

The following codemods were run, in this order:

```
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntList IntArrayRef
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntArrayRef::create IntList::create
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in ivalue::IntArrayRef ivalue::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in Tag::IntArrayRef Tag::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in isIntArrayRef isIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in toIntArrayRef toIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'Shared<IntArrayRef>' 'Shared<IntList>'
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'intrusive_ptr<IntArrayRef>' 'intrusive_ptr<IntList>'
```
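
The order matters because the first codemod also rewrites names that must keep the old spelling, and the later ones put those back. A small sketch of that effect, using plain substring replacement as a stand-in for the `codemod` tool (the `CODEMODS` table and `apply_codemods` helper are hypothetical and cover only the first three rules):

```python
# Ordered (old, new) pairs mirroring the first three codemods above.
CODEMODS = [
    ("IntList", "IntArrayRef"),
    ("IntArrayRef::create", "IntList::create"),
    ("ivalue::IntArrayRef", "ivalue::IntList"),
]


def apply_codemods(source: str) -> str:
    """Apply each substitution in order with plain substring replacement."""
    for old, new in CODEMODS:
        source = source.replace(old, new)
    return source


before = "IntList sizes; ivalue::IntList v; IntList::create(1);"
after = apply_codemods(before)
```

Here only the bare `IntList` ends up renamed; the `ivalue::IntList` and `IntList::create` spellings are restored by the follow-up rules.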

Some manual fixups were done afterwards; they can be reviewed separately
at https://github.com/pytorch/pytorch/pull/16752

Reviewed By: dzhulgakov

Differential Revision: D13954363

fbshipit-source-id: b5c40aacba042402155a2f5a229fa6db7992ac64
2019-02-05 14:54:34 -08:00
Gregory Chanan
fc61f1a1d1 Support named return arguments in native_functions. (#14100)
Summary:
Note there was a hacky way of doing this before by specifying "return:" lists manually; this makes the
return names part of the function declaration itself.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14100

Differential Revision: D13101810

Pulled By: gchanan

fbshipit-source-id: 1c80574cd4e8263764fc65126427b122fe36df35
2018-11-19 08:27:20 -08:00
Wanchao Liang
4e1c64caee Add c10::optional to type syntax (#12582)
Summary:
This PR adds optional type to ATen native, autograd, JIT schema and Python Arg parser, closes #9513. It allows us to use optional default values (including None) for function signature and implementations like clamp, etc., and also lets us remove the python_default_init hack.

Follow up:

remove python_default_init completely.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12582

Differential Revision: D10417423

Pulled By: wanchaol

fbshipit-source-id: 1c80f0727bb528188b47c595629e2996be269b89
2018-10-25 16:08:29 -07:00
Tongzhou Wang
46162ccdb9 Autograd indices/values and sparse_coo ctor (#13001)
Summary:
Reopen of #11253 after fixing bug in index_select
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13001

Differential Revision: D10514987

Pulled By: SsnL

fbshipit-source-id: 399a83a1d3246877a3523baf99aaf1ce8066f33f
2018-10-24 10:00:22 -07:00
Richard Zou
cf2e176049 Fix error message for cat-ing zero-dim tensors (#5819)
Fixes #5552

* Fix error message for cat-ing zero-dim tensors

* Address comments
2018-03-19 16:06:27 -04:00
Richard Zou
11444a7273 Save self.numel() for backward (#5747)
2018-03-13 17:45:29 -04:00
Edward Z. Yang
7bd2db997e Port cuDNN RNN bindings to ATen (#4881)
* Add transpose() to TensorGeometry.

This code is dead; I briefly used it in my RNN patchset but
eventually rewrote it to not be necessary.  However, it seemed
like a useful gadget so I kept it.  In general, it seems that it
would be useful for TensorGeometry to support all operations that
Tensor does, but it only computes the changes to sizes/strides
instead of actually doing the computation.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Turn on wrap_dim behavior for TensorGeometry

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Support for hard-coded differentiable outputs.

Some outputs of functions are nondifferentiable, and should always
be returned with requires_grad=False.  Traditionally, we have used
the presence of 'grad' to signal that only the first output is
differentiable, and the rest are not, but cudnn_rnn (to be
implemented) breaks this pattern; its first three outputs are differentiable,
but its last output is a buffer that is just consumed by backwards.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* TensorGeometry constructor from just sizes

The sizes are assumed to form a contiguous tensor, and we compute
the strides we would get in that case.
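
For reference, the contiguous (row-major) stride computation can be sketched as follows. This is an illustrative Python helper, not the TensorGeometry code itself, and it makes no attempt to special-case zero-sized dimensions:

```python
from typing import List, Sequence


def contiguous_strides(sizes: Sequence[int]) -> List[int]:
    """Row-major strides (in elements) for a contiguous tensor of the given sizes."""
    strides = [1] * len(sizes)
    # Walk from the second-to-last dimension backwards; each stride is the
    # product of all sizes to its right.
    for i in range(len(sizes) - 2, -1, -1):
        strides[i] = strides[i + 1] * sizes[i + 1]
    return strides
```

For sizes `[2, 3, 4]` the innermost dimension has stride 1, the middle one 4, and the outermost 12.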

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Support saving TensorList for backwards.

There is some back story here.  Saved TensorList in backwards will
be used by cudnn_rnn, and it is worth asking, why is it necessary to
save a list of tensors?  Indeed, *technically* speaking a list of
tensors is not necessary, we only need to save the sizes of each
of the weight tensors.  (We need the sizes because cuDNN is only
going to blast the derivative of weights into a flat buffer, but
we need to match the sizes of the views into the buffer when we
eventually return the derivatives.)

However, it was surprisingly awful trying to implement passing just
sizes, because as non-Tensor arguments, the JIT interpreter generation
code is expected to handle all non-Tensor arguments as attributes in the
trace, and our attributes struct doesn't actually know how to do
arrays of arrays.  Saved TensorList code was much easier to get working,
so that's what this patch does.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* MatrixRef - an ArrayRef with a stride, making it a 2D ArrayRef.

Like ArrayRef, this class does not own the underlying data, it is expected
to be used in situations where the data resides in some other buffer.
This is intended to be trivially copyable, so it should be passed by
value.

For now, 2D only (so the copies are actually cheap, without having
to write a SmallVector class) and contiguous only (so we can
return non-strided ArrayRef on index).

The intended use-case (not in this commit) is to make it easier to
work with RNN weights, which form a num_weights x num_layers matrix of
parameters.

P.S. dimension 0 indexes rows, dimension 1 indexes columns
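
A toy Python analogue of the idea, assuming a flat buffer plus a single row stride (the class below is illustrative only; unlike ArrayRef, Python slicing of a list copies the row):

```python
from typing import Generic, Sequence, TypeVar

T = TypeVar("T")


class MatrixRef(Generic[T]):
    """Non-owning 2D view over a flat, contiguous sequence (toy analogue)."""

    def __init__(self, data: Sequence[T], stride0: int) -> None:
        if stride0 <= 0 or len(data) % stride0 != 0:
            raise ValueError("data length must be a multiple of a positive stride0")
        self._data = data
        self._stride0 = stride0

    def size(self, dim: int) -> int:
        # Dimension 0 indexes rows, dimension 1 indexes columns.
        if dim == 0:
            return len(self._data) // self._stride0
        if dim == 1:
            return self._stride0
        raise IndexError("MatrixRef is 2D only")

    def __getitem__(self, row: int) -> Sequence[T]:
        # Indexing a row yields a plain (non-strided) slice of the buffer.
        start = row * self._stride0
        return self._data[start:start + self._stride0]


weights = MatrixRef(["w_ih0", "w_hh0", "w_ih1", "w_hh1"], stride0=2)
```

Passing the flat sequence and `stride0` separately is also how such a view can cross an interface that only understands flat arrays.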

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Generalize getDataType in Descriptors.h

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Change copy_range to take Tensor, and change cat_tensors_backward accordingly

Should a backward function return a Variable or a Tensor?  For the most
part, all of our backward functions return Tensor, except cat_tensors_backward,
which returns a variable_list (which is really the only thing that matters,
because Tensor and Variable are interconvertible).  But this is kind of weird,
because it means that you can't implement a backwards in ATen that returns
a std::vector<Tensor>, and then hook it up transparently with the derivatives
code.  So I switched it over.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Support 5-ary return Tensor tuple.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Support code generation with mixed Tensor/TensorList in output.

I don't think I ended up using this in cudnn_rnn, but it seems like
it might be useful for someone else later.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Support 4-ary boolean array

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Add support for retain_variables in tools/autograd/derivatives.yaml

'retain_variables', a bool which is true if a user has specified
that saved variables should be retained in case the backwards is
run again later.  This allows an optimization where we can
destroy saved buffers if we know variables are not going to be retained,
e.g., it is (will be) used by _cudnn_rnn

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Lazily initialize cuDNN descriptors

Previously, cuDNN descriptors were eagerly allocated as soon
as a FooDescriptor object was created.  However, in some uses
of TensorDescriptor, this is problematic: some tensors are optional
and cuDNN's API expects to be given a nullptr TensorDescriptor
in this case, not an uninitialized (but allocated) descriptor.

Lazily initializing the descriptors makes it less likely for
us to use uninitialized memory and matches the usual semantics of
unique_ptr.  It's good sense!
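
The pattern, sketched in Python under the assumption of a read-only accessor `desc()` and an allocating accessor `mut_desc()` (the class and the `create` callback are hypothetical, not the actual Descriptors.h interface):

```python
from typing import Callable, Generic, Optional, TypeVar

D = TypeVar("D")


class LazyDescriptor(Generic[D]):
    """Holds a descriptor that is only allocated on first mutable access."""

    def __init__(self, create: Callable[[], D]) -> None:
        self._create = create
        self._desc: Optional[D] = None  # nothing allocated yet

    def desc(self) -> Optional[D]:
        # Read-only access never allocates; None plays the role of nullptr.
        return self._desc

    def mut_desc(self) -> D:
        # Allocate on demand, the first time someone needs to write to it.
        if self._desc is None:
            self._desc = self._create()
        return self._desc
```

An optional tensor that is never set therefore reports no descriptor at all, rather than an allocated but uninitialized one.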

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Port cuDNN RNNs to ATen.

This brings three new functions:
  - _cudnn_rnn_flatten_weight: flatten a matrix of weight tensors into
    a single contiguous weight buffer as required by cuDNN
  - _cudnn_rnn: run RNN forwards
  - _cudnn_rnn_backward: run RNN backwards

RNNs have a lot of parameters, so we restructured what was previously
a single 'fn' object that recorded all the parameters into three
objects: RNNDescriptorParams, TensorDescriptorListParams and
DropoutDescriptorParams.

We make use of MatrixRef to organize the weight tensors (which are
weight/bias x number of layers), but I did not teach the codegen
how to pass these as arguments/return values natively, so instead
a MatrixRef is passed as its constituent ArrayRef and int64_t stride0.

cudnn_rnn has three differentiable outputs and one nondifferentiable
one, so it makes use of the support for hard-coded differentiable outputs.

I haven't deleted all of the descriptor code from Python, because dropout
initialization still goes through this codepath. That should be fixed soon,
but I don't see it as essential for this PR.

This commit also removes the last use of NestedIOFunction from PyTorch.

There are some shenanigans with cuDNN dropout descriptor initialization,
see below:

Note [cuDNN dropout descriptor initialization]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In most cases, setting descriptors in cuDNN is cheap (e.g.,
cudnnSetTensorNdDescriptor).  However, this is not the case for
cudnnSetDropoutDescriptor: in cuDNN 6/7 (and possibly others) it does an
expensive precomputation to initialize the random number generator states.  In
cuDNN 6, this is the ONLY official mechanism to initialize a dropout descriptor,
which means that law-abiding clients were expected to generate a dropout
descriptor once and cache it.  However, our ATen interface is (1) stateless (so
we can't cache the descriptors) and (2) does not accept arbitrary user types in
its interface (so we can't pass the descriptor in).  This puts us in a pickle.

In cuDNN 7, a new function, cudnnRestoreDropoutDescriptor was added, which
forgoes the expensive initialization process, and can initialize the
descriptor with a pre-initialized state CUDA tensor.  This is great, because
it means we can simply pass in the state tensor and then initialize the
descriptor internally.  Unfortunately, this function is not available in
cuDNN 6.

To work around this, we break the cuDNN abstraction barrier, and have
the struct layout of the underlying dropout descriptor.  With this struct,
we can reimplement cudnnRestoreDropoutDescriptor from scratch. Great!

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Fix cuDNN 7 behavior.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Delete some unused, controversial methods from MatrixRef.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Add missing filter_dim_a slice

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Replace nested for-loop with itertools.chain.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* CR comment on mut_desc()

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Refactor DropoutDescriptor API.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Use cached CurrentDeviceProperties from Context.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Document _cudnn_rnn outputs.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Improve fmap docs, convert some functions to use it.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Move IndexRange to autograd/function.h

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Elaborate on CUDNN_STATUS_INVALID_VALUE return some more.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Add an all-in-one setter for RNNDescriptorParams.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Print what the unrecognized RNN mode was

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* RNN TensorDescriptor improvements

- Have an explicit size/stride overload for set TensorDescriptor,
  so you don't have to create a goofy view to feed in.

- Change the padding to 3D rather than 5D, which is all you actually
  need (it's just 2D that is not supported by cuDNN API.)

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Fix implementation of cudnnRestoreDropoutDescriptor, plus test.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Better comments about input layout.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Add comment about no-DropoutDescriptor argument RNNDescriptor function.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Rename vocab_size back to input_size.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Don't use backslash in comment.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Bugfix for contiguous TensorGeometry calculation.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Don't allocate a dummy tensor when setting TensorDescriptor for flatten_weight.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Make contiguity errors more user-friendly.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* s/fn.dropout.train/fn_train/

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* s/_cudnn_rnn_backward_grad/_cudnn_rnn_backward_input/

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Make dcx properly undefined when not required.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Remove old TODO.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Add state size check in cudnnRestoreDropoutDescriptor

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Explicitly narrow int64_t to size_t

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Restore copyParams comment.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Update benchmark numbers, and slight engineering improvements.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Typofix.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2018-02-05 13:54:11 -05:00
Edward Z. Yang
a249016044 New index computation strategy in Functions.cpp (Tensor/TensorList) (#4775)
When generating autograd::Function wrappers for ATen functions, we need
to take derivative expressions in derivatives.yaml (identified by name)
and correlate them with the correct index they should take in
grad_inputs (identified positionally only).  Previously, this
computation was done *statically* in load_derivatives.py (set_up_derivatives)
and then we hard-coded indices in the generated Functions.cpp.
This is sufficient for supporting ATen operations which consist solely
of Tensor arguments, or a single TensorList argument.  However, this
strategy will not work for mixed Tensor/TensorList arguments, as the
index of any Tensor after a TensorList is not known at codegen time,
since it will vary depending on the length of the TensorList, e.g.,

  foo({x1, x2}, y)      ==>  y is index 2
  foo({x1, x2, x3}, y)  ==>  y is index 3

This commit introduces a new strategy for generating these indices which
pushes index computation to *runtime* (though any decent C++ optimizer
can re-optimize the index computation back into constants; this was
verified in Godbolt.)  Instead of hard-coding constants, a small
IndexRangeGenerator object is created and used to generate the correct
index ranges (std::pair<size_t, size_t>) for each argument.

Here is an example of mm rewritten in the new codegen format:

  variable_list MmBackward::apply(const variable_list& grads) {
    IndexRangeGenerator gen;
    auto self_ix = gen.range(1);
    auto mat2_ix = gen.range(1);
    variable_list grad_inputs(gen.size());
    auto& grad = grads[0];
    auto self = self_.unpack();
    auto mat2 = mat2_.unpack();
    if (should_compute_output({ mat2_ix })) {
      auto grad_result = mm_mat2_backward(grad, self, mat2_sizes, mat2.strides(), 1);
      copy_range(grad_inputs, mat2_ix, grad_result);
    }
    if (should_compute_output({ self_ix })) {
      auto grad_result = mm_mat1_backward(grad, mat2, self_sizes, self.strides(), 1);
      copy_range(grad_inputs, self_ix, grad_result);
    }
    return grad_inputs;
  }

Unlike before, where self_ix and mat2_ix were hardcoded as 0 and 1,
we derive them by invoking IndexRangeGenerator (which internally
is just a little counter which bumps up each invocation of 'range').
Each _ix variable actually represents a range, as can be seen here.

  variable_list CatBackward::apply(const variable_list& grads) {
    IndexRangeGenerator gen;
    auto tensors_ix = gen.range(tensors_size_);
    variable_list grad_inputs(gen.size());
    auto& grad = grads[0];
    if (should_compute_output({ tensors_ix })) {
      auto grad_result = cat_tensors_backward(grad, tensors_sizes_dim, dim);
      copy_range(grad_inputs, tensors_ix, grad_result);
    }
    return grad_inputs;
  }

The invocation of 'copy_range' reads a TensorList returned by the
backward function into the correct entries in grad_inputs.
tensors_size_ is a new member of CatBackward which is filled with
the size of the forward input tensor when cat is originally invoked.

With this new code generation strategy, we can completely eliminate
the special cases for Tensor and TensorList in index selection, and
we can smoothly support mixed Tensor/TensorList by making multiple
invocations of gen.range() with non-one arguments.
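
As a rough illustration of the counter described above, here is a minimal
standalone sketch. It is not the code from this commit: the IndexRange alias
and the member name next_ are assumptions, int stands in for Variable so the
snippet compiles on its own, and only the list-copying form of copy_range is
shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Half-open [begin, end) range of slots in grad_inputs (alias name assumed).
using IndexRange = std::pair<std::size_t, std::size_t>;

// Sketch of the generator: each call to range() hands out the next
// `count` consecutive indices and advances the counter.
class IndexRangeGenerator {
 public:
  IndexRange range(std::size_t count) {
    const std::size_t begin = next_;
    next_ += count;
    return {begin, next_};
  }
  // Total number of slots handed out so far.
  std::size_t size() const { return next_; }

 private:
  std::size_t next_ = 0;
};

// Stand-in for copy_range: writes `values` into the slots covered by `range`.
template <typename T>
void copy_range(std::vector<T>& out, IndexRange range,
                const std::vector<T>& values) {
  assert(range.first <= range.second && range.second <= out.size());
  assert(values.size() == range.second - range.first);
  std::copy(values.begin(), values.end(),
            out.begin() + static_cast<std::ptrdiff_t>(range.first));
}
```

Under these assumptions, a mixed signature such as (Tensor, TensorList of
length n, Tensor) would call gen.range(1), gen.range(n), gen.range(1) in
order, yielding the ranges [0, 1), [1, n + 1) and [n + 1, n + 2).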

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2018-01-27 21:46:08 +01:00
Sam Gross
57549b7e44 Bind functions with out= arguments in VariableType (#4565)
This adds overrides in VariableType for the xxx_out ATen functions and
implements Python bindings. There is no support for automatic
differentiation. If any of the inputs (or outputs) requires grad, then the
function will throw an exception unless it's running in "no-grad" mode.

The bindings for calling torch.xxx functions on Variables are moved to a
different object. Previously, they were static methods on VariableBase.
This change prevents users from accidentally calling static methods as if
they were instance methods.
2018-01-17 18:27:42 -05:00
Edward Z. Yang
c3b7baecea Fix #4422, use grad for cudnn_batch_norm derivative / don't use toTensor()
This commit fixes double-backwards on batch norm.  There were two
bugs:

- Returned buffers from batchnorm backwards were being marked as differentiable
  when they shouldn't be.  The fix for this is "easy": use 'grad' instead of
  'grads[0]' in cudnn_batch_norm's backward definition.  (More on this below.)

- I was using toTensor on a Scalar, which gives me a Tensor of the wrong
  type when I'm in CUDA world.  Using the Scalar add() overload directly
  solves the problem.

The differentiability of returned buffers was annoyingly subtle and I nearly
went off and implemented a big pile of infrastructure to "tell" the codegen how
to distinguish between differentiable and non-differentiable outputs before
realizing that there must be a way we do this legitimately, because it works for
THNN.  I documented this in derivatives.yaml, and also added tests for the
problem in load_derivatives.py to catch the various ways you could "get it
wrong".  Hope this helps someone else.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2018-01-16 12:24:55 -05:00
Sam Gross
04ad23252a Refactor gen_variable_type (#4487)
The gen_variable_type.py script now is only responsible for generating
VariableType.h/cpp. The parent script, "gen_autograd.py", delegates to
gen_autograd_functions.py, gen_variable_type.py, and
gen_python_functions.py.

I've removed "fallthrough" functions. They are replaced by
DONT_RECORD_TRACE, DONT_PROFILE, and DONT_REQUIRE_DERIVATIVE.

In preparation for binding the _out variants, I changed some static
types to Tensor (from Variable) and we now unpack and name tuple return
values.
2018-01-08 13:43:09 -05:00
Edward Z. Yang
4e3a4bd688 Check for out of bounds grads access in derivatives.yaml
This test would have caught the out-of-bounds (OOB) grads access in
thnn_conv_depthwise2d_backward.
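
The commit message does not show how the check is implemented. Purely as an
illustration of the idea, the sketch below scans a derivative formula for
grads[N] accesses and reports whether every N is below the number of forward
outputs. The function name grads_indices_in_bounds and the regex-based
approach are assumptions, not the actual codegen logic.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: returns true if every grads[N] access in `formula`
// satisfies N < num_outputs, i.e. no access is out of bounds.
bool grads_indices_in_bounds(const std::string& formula,
                             std::size_t num_outputs) {
  static const std::regex grads_access(R"(grads\[(\d+)\])");
  const std::sregex_iterator end;
  for (std::sregex_iterator it(formula.begin(), formula.end(), grads_access);
       it != end; ++it) {
    // Capture group 1 holds the literal index inside the brackets.
    if (std::stoul((*it)[1].str()) >= num_outputs) {
      return false;
    }
  }
  return true;
}
```

With a single forward output, a formula that mentions grads[1] would be
rejected, since only grads[0] exists.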

Fixes #4457

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2018-01-06 10:58:05 -05:00
Sam Gross
6a0c636d4e Don't special case NN functions in gen_variable_type.py (#4395)
This modifies NN binding in ATen so that the xxx_forward functions now
return buffers instead of taking them as inputs. The NN functions with
no suffix are implemented in Type.cpp. They call the xxx_forward
variants and discard any returned buffers.

This simplifies derivatives for NN functions. The derivatives are now
defined on the xxx_forward functions and buffers are treated as any
other input.
2018-01-03 19:22:50 -05:00
Sam Gross
98f71912b0 Fix type signature of in-place NN functions (#4389)
This is a step towards removing the special casing of NN functions in gen_variable_type.py. It fixes the signature of in-place NN functions so that they return Tensor & instead of Tensor.
2017-12-28 16:50:09 -05:00
Sam Gross
f8a4b1a266 Split off load_derivatives and gen_autograd_functions from gen_variable_type (#4370)
2017-12-27 18:59:41 -05:00