Commit Graph

54 Commits

Author SHA1 Message Date
soulitzer
bac0f289a3 Add methods to access data and unpack_hook on SavedVariable (#164358)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164358
Approved by: https://github.com/albanD
2025-10-02 13:05:16 +00:00
Simon Fan
0a2da008f8 [ca] trace saved variable unpacking (#147242)
## Before

Previously, CA will always unpack all saved variables stored in the autograd graph before executing it. This meant that we can't capture unpack hooks as part of the CA graph, and they would fire out of order wrt to other backward hooks. For memory saving APIs built on top of saved tensor hooks like non-reentrant checkpointing and offloading, we couldn't achieve any savings because all activations would be recomputed/loaded and active at the same time, resulting in no-op.

## After

We add unpack hooks into the CA graph so that they can be executed progressively. The python hook and hook input themselves are wrapped by non-traceable code, so CA polyfills the wrapping as:
```python
# pseudocode
class SavedVariable:
  def unpack(self):
    if self.hook:
      return self.hook(self.packed_data)
    else:
      return self.packed_data

# This approach won't directly work when we add support for Forward AD or double-backward.
```

Directly executing the CA graph (without torch.compiling it) under checkpointing/offloading, memory profile is expected to stay the same as when using the eager autograd engine. If AOT backward is in the autograd graph, memory profile is expected to be better than the eager autograd engine, since we can now delay saved activations unpacking into the AOT backward's execution.

All tests pass when running the CA graph directly, the remaining issues are in Dynamo.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147242
Approved by: https://github.com/jansel
2025-02-26 16:37:17 +00:00
PyTorch MergeBot
90e3a3d86d Revert "[ca] trace saved variable unpacking (#147242)"
This reverts commit 68ddca9449.

Reverted https://github.com/pytorch/pytorch/pull/147242 on behalf of https://github.com/wdvr due to failing tests in the slow workflow, see below ([comment](https://github.com/pytorch/pytorch/pull/147242#issuecomment-2683604547))
2025-02-26 00:40:16 +00:00
Simon Fan
68ddca9449 [ca] trace saved variable unpacking (#147242)
## Before

Previously, CA will always unpack all saved variables stored in the autograd graph before executing it. This meant that we can't capture unpack hooks as part of the CA graph, and they would fire out of order wrt to other backward hooks. For memory saving APIs built on top of saved tensor hooks like non-reentrant checkpointing and offloading, we couldn't achieve any savings because all activations would be recomputed/loaded and active at the same time, resulting in no-op.

## After

We add unpack hooks into the CA graph so that they can be executed progressively. The python hook and hook input themselves are wrapped by non-traceable code, so CA polyfills the wrapping as:
```python
# pseudocode
class SavedVariable:
  def unpack(self):
    if self.hook:
      return self.hook(self.packed_data)
    else:
      return self.packed_data

# This approach won't directly work when we add support for Forward AD or double-backward.
```

Directly executing the CA graph (without torch.compiling it) under checkpointing/offloading, memory profile is expected to stay the same as when using the eager autograd engine. If AOT backward is in the autograd graph, memory profile is expected to be better than the eager autograd engine, since we can now delay saved activations unpacking into the AOT backward's execution.

All tests pass when running the CA graph directly, the remaining issues are in Dynamo.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147242
Approved by: https://github.com/jansel
2025-02-25 20:38:51 +00:00
cyy
38d3c27849 [1/N] Enable cppcoreguidelines-special-member-functions (#137405)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137405
Approved by: https://github.com/ezyang
2024-10-23 00:16:53 +00:00
soulitzer
ebf25e128c [autograd] Do not stash version counter for saved tensor (#128545)
Fixes https://github.com/pytorch/pytorch/issues/128611

We detach using tensor_data, which already preserves the version counter, so there is no reason to save it prior to unpacking:
```
at::TensorBase VariableHooks::tensor_data(const at::TensorBase& self) const {
  TORCH_CHECK(self.defined(), "cannot call tensor_data() on undefined tensor");
  auto self_impl_copy = self.unsafeGetTensorImpl()->shallow_copy_and_detach(
      /*version_counter=*/self.unsafeGetTensorImpl()->version_counter(),
      /*allow_tensor_metadata_change=*/
      self.unsafeGetTensorImpl()->allow_tensor_metadata_change());
  return at::Tensor(self_impl_copy);
}
```
This changes the behavior when hooks are involved:
- Previously, if you had a hook that replaced the saved tensor with an entirely new tensor, we would've smashed the saved version counter onto that during unpack, which is not quite correct because the tensor returned by user's pack hook is not necessarily aliased to the tensor originally being saved (unlikely), and even if it were, the version counter would already be shared, if the user did their operations not in inference mode (unlikely).
- In this PR, we restore the version counter using the version counter from the unpack hook's output.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128545
Approved by: https://github.com/albanD
ghstack dependencies: #125795
2024-06-21 18:03:06 +00:00
Richard Barnes
ed327876f5 [codemod] c10:optional -> std::optional (#126135)
Generated by running the following from PyTorch root:
```
find . -regex ".*\.\(cpp\|h\|cu\|hpp\|cc\|cxx\)$" | grep -v "build/" | xargs -n 50 -P 4 perl -pi -e 's/c10::optional/std::optional/'
```

`c10::optional` is just an alias for `std::optional`. This removes usages of that alias in preparation for eliminating it entirely.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126135
Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi
2024-05-14 19:35:51 +00:00
cyy
20f769544c [12/N] Apply clang-tidy and fix warnings in headers of torch/csrc (#116486)
This PR follows #116751.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116486
Approved by: https://github.com/albanD
2024-01-10 08:48:14 +00:00
PyTorch MergeBot
0aa50909f3 Revert "[12/N] Apply clang-tidy and fix warnings in headers of torch/csrc (#116486)"
This reverts commit 5aa258eb09.

Reverted https://github.com/pytorch/pytorch/pull/116486 on behalf of https://github.com/izaitsevfb due to Reverting, as it depends on https://github.com/pytorch/pytorch/pull/116353, which has to be reverted ([comment](https://github.com/pytorch/pytorch/pull/116486#issuecomment-1876042948))
2024-01-03 22:18:54 +00:00
cyy
5aa258eb09 [12/N] Apply clang-tidy and fix warnings in headers of torch/csrc (#116486)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116486
Approved by: https://github.com/albanD
2023-12-30 18:38:53 +00:00
Jason Ansel
e9fd815226 Misc visibility changes for compiled autograd (#105298)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105298
Approved by: https://github.com/albanD, https://github.com/soulitzer
2023-07-18 01:10:04 +00:00
soulitzer
10ad74cbec Update SavedVariable to support saving non-input leafs (#104039)
Fixes https://github.com/pytorch/pytorch/issues/103726
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104039
Approved by: https://github.com/albanD
2023-06-22 21:52:35 +00:00
Kazuaki Ishizaki
69aa6b4bb9 fix typo in comments under torch/csrc/autograd (#96061)
This PR fixes typos in comments of `.cpp` and `.h` files under `torch/csrc/autograd` directory
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96061
Approved by: https://github.com/soulitzer
2023-03-06 18:05:14 +00:00
cyy
85851b1e8f remove useless clang-tidy suppression (#92287)
remove NOLINTNEXTLINE(cppcoreguidelines-pro-type-member-init)
remove NOLINTNEXTLINE(performance-move-const-arg)
remove NOLINTNEXTLINE(performance-no-automatic-move)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92287
Approved by: https://github.com/albanD
2023-01-21 02:33:24 +00:00
Michael Suo
30fb2c4aba [lint] autoformat test/cpp and torch/csrc
Let's have some fun.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78828

Approved by: https://github.com/ezyang
2022-06-11 21:11:16 +00:00
Peter Bell
fa09099ba3 Codegen: TraceType only includes operators being registered (#68691)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68691

TraceType is a sharded file, so by only including specific operator
headers, we ensure that changing one (non-method) operator only needs
one shard to be re-compiled.

This also changes all the included autograd and jit headers from
including `ATen/ATen.h` to just including `ATen/core/Tensor.h`.

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D33336948

Pulled By: albanD

fbshipit-source-id: 4e40371592b9a5a7e7fcd1d8cecae11ffb873113
2022-01-02 13:09:19 -08:00
Nikita Shulga
26e32988bd Revert D32596264: Codegen: TraceType only includes operators being registered
Test Plan: revert-hammer

Differential Revision:
D32596264 (e66a8ab4f5)

Original commit changeset: 2f28b62d7b99

Original Phabricator Diff: D32596264 (e66a8ab4f5)

fbshipit-source-id: 7d18c4e77ce30dd7817a95f9c39b565cb246cd12
2021-12-17 11:20:12 -08:00
Peter Bell
e66a8ab4f5 Codegen: TraceType only includes operators being registered (#68691)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68691

TraceType is a sharded file, so by only including specific operator
headers, we ensure that changing one (non-method) operator only needs
one shard to be re-compiled.

This also changes all the included autograd and jit headers from
including `ATen/ATen.h` to just including `ATen/core/Tensor.h`.

Test Plan: Imported from OSS

Reviewed By: jbschlosser, malfet

Differential Revision: D32596264

Pulled By: albanD

fbshipit-source-id: 2f28b62d7b9932f30fad7daacd8ac5bb7f63c621
2021-12-17 10:35:05 -08:00
Peter Bell
b2e79ed5ec Remove WindowsTorchApiMacro.h in favor of Export.h (#69585)
Summary:
Follow up to https://github.com/pytorch/pytorch/issues/68095

This also changes the files from the ATen folder to include c10's `Export.h` instead since they can't ever be exporting `TORCH_PYTHON_API`.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69585

Reviewed By: mrshenli

Differential Revision: D32958594

Pulled By: albanD

fbshipit-source-id: 1ec7ef63764573fa2b486928955e3a1172150061
2021-12-09 17:30:09 -08:00
Shunting Zhang
289b0f7b04 Resent the reverted PR: Add register_frozenpython.cpp to the torch::deploy interpreter library in the OSS build (#67303)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67303

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D32016061

Pulled By: shunting314

fbshipit-source-id: 9460c90dd4f630f4c81dbfbbd772446ddffbabd0
2021-10-29 14:10:43 -07:00
Victor Quach
3bda4ea842 Avoid unnecessary copying data in Saved Variable (#61927)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61927

This is a refactor of `SavedVariable.cpp` to prevent ever defining the `data_` tensor if default hooks are set.

Before the refactor:

```c++
data_ = variable.tensor_data(); // this is wasteful if hooks are defined
register_hooks(Engine::get_default_engine().get_default_saved_variable_hooks());
```

After the refactor:
```c++
if (get_default_hooks_()) {
  save_metadata_(variable);
  register_hooks_(get_default_hooks_(), variable);
  return;
}
save_metadata_(variable);
data_ = variable.tensor_data(); // only needed if hooks are not defined
```

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D29848524

Pulled By: Varal7

fbshipit-source-id: abca1eee37a17b47841e28d8a576490913fce1ce
2021-08-03 07:09:47 -07:00
Nikita Shulga
a9b0a921d5 Disable avoid-non-const-global-variables lint check (#62008)
Summary:
As GoogleTest `TEST` macro is non-compliant with it as well as `DEFINE_DISPATCH`

All changes but the ones to `.clang-tidy` are generated using following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`;  do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008

Reviewed By: driazati, r-barnes

Differential Revision: D29838584

Pulled By: malfet

fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
2021-07-22 18:04:40 -07:00
Victor Quach
ff82394fc0 Apply saved tensor hooks (#60975)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60975

Fixes #58512

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D29466227

fbshipit-source-id: c1498d52173aceb29638b5c4f521ac05356a5958
2021-07-18 08:42:51 -07:00
Victor Quach
ee5a97de11 Register Saved Tensors hooks (#60663)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60663

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D29466223

fbshipit-source-id: 65dc3a935c18a0e6b93a37e24543c696e6ae0321
2021-07-15 08:09:55 -07:00
Victor Quach
a5e2ea4345 Add noop register hook (#60685)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/60685

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D29466224

fbshipit-source-id: 68c8aa022ccffeefd45062f1443d15c9a6824f3d
2021-06-30 07:46:34 -07:00
Victor Quach
dab1e59652 Remove dead code in SavedVariable (#59838)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59838

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D29069214

fbshipit-source-id: 5debf93a6c3d1c3d585efbe54438e8df92646d62
2021-06-16 16:44:16 -07:00
Victor Quach
1efa863837 Avoid un-necessary unwrapping of Tensor in SavedVariable (#59837)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59837

Fixes #58500

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D29069215

fbshipit-source-id: 603db3c8a64b729e86385ed774825f01c6ce0f20
2021-06-16 16:43:04 -07:00
Nikita Shulga
4cb534f92e Make PyTorch code-base clang-tidy compliant (#56892)
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892

Reviewed By: H-Huang

Differential Revision: D27991944

Pulled By: malfet

fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
2021-04-28 14:10:25 -07:00
albanD
c23808d8e8 Reland: Add base forward grad logic (#49734)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49734

RFC: https://github.com/pytorch/rfcs/pull/11

This PR add the basic logic to handle forward grad as dual Tensors.
It contains the following:
- Mechanism to save dual state on a Tensor and clear it up when the dual level ends
- C++ and python user facing API
- Updated view system that is able to track both forward and backward views

The current PR has the following limitations:
- Extensive tests are in the next PR in the stack as formulas are needed to write full tests.
- Only the manual formulas have been audited and no other formula is actually implemented here (they are in the next PR in the stack)
- Only level 0 is allowed for now. This was discussed and agreed that it is not needed for the first version of this PR.
- We can save one ViewInfo creation when both the forward and backward views have the same base. This can be done by adding a boolean flag to the DifferentiableViewMeta and extra logic in the `as_view` method. This is left out to keep this PR concise.
- We can skip tracking forward views if the base has a forward grad. This can be done by adding extra logic in the `as_view` method. This is left out to keep this PR concise.

Reading guide:
- Updated view handling in [gen_variable_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-f6553cec68caeaea36f6c8b14ff76a6d39dfd774e0ea9ef2f76e8d81fd9af5df), [VariableTypeUtils.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-ec71cfa45954dece1236c661d170e6341879c5be637f4abf52e826d61b40695a), [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285) (skip code below "[Forward Grad View]" for now), [variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-1604bcd0e4350ed99ec45e437cee7ac9ebe337392c9ea16a236247aeeb35b02bR266-R542) and [custom_function.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-dd85f452082b5bb6612bbc12adb496f8827defa228509f7b493de1d517522d5d). This introduces the new ViewInfo to hold view informations shared for forward and backward. It also updates the differentiable view meta to use this. And it updates the as_view function to handle both forward and backward view.
- New forward grad class that handle storing gradients and tracking at each level [forward_grad.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c6c5b9ab2d7e5dde4102495faa1b6bbbfc23aa3e47deb7359c0bfe1eb004c0cb), [forward_grad.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-de2ab54ade7312701850d71a119a4f4ee4b9fc5a9c42a467cdd4e73c033531dd) and [build_variables.bzl](https://github.com/pytorch/pytorch/pull/49097/files#diff-dfdfa2efb17beddfd9094524f95351fd197db6c8857e96b436fb599870359325). EDIT: These files also contain the new flag to globally disable forward AD that allows us to reduce performance issues while this is in development.
- Lowest level API and binding between Tensor and AutogradMeta in [TensorBody.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-7554853205392fa743357bf845ecc350a974ec049383248c12daaf2f4de04911), [TensorImpl.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-052bd9150ef8e09289ddf644b5a6830ede49207201cd41728f6d7cc6d9cead94), [TensorImpl.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-a15aae4cf23da44970db7cece62ff981265575c798c62f7b52d87c8809dfe2e1) and the rest of [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285R557-R677)
- API to access the forward primal that needs to be a differentiable function (and so in native_functions.yaml) [native_functions.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991) [NamedRegistrations.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-69bd3bea510c9b64e1633fa18c3ea63d4b8348dbad3a78ad9de844ab3e43dc1d), [VariableMethodsStub.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-23f5fcb737a2b289811fe0f4b65aef775e7c824b2e629ecd343df51405cd434f), [derivatives.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_python_functions.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_trace_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-54e0b976027bf8debefb959ff360b89ae93466970c843365b1b3a03806d868ce), [TraceTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-f34636741ad4a23d018e0c289bc750c3bad887b45660e1d6eaf440d234a78fbf) and [part of VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R198-R243)
- c++ API [autograd.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-349028fbe8291a965a7a263c323b208fe071c35c66179ee997ef84fa81aa4b1e), [autograd.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-a3fe908d67dfec16a1fcde300de68b0701bf68b88db7451f29f2bee255cf30c9)
- python binding [init.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-c58a67c85191c22c9b3bb439117d8053edfd9dea839fa010cf967d404c3c630d)
- python API [forward_ad.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a4efad4ba18fffdfb264c21e5475997a24a743089a899f8ec1a5ff962c6738d9), [autograd/__init__.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-743abcafd32ad0e69f39ac5a91df4197b7e1921c135cacee7ef6dc829a8a7af8)
- c++ and python printing [Formatting.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-881dba501e71662e2e4818b4b016f739b344c8aed2f5edc6b871eda47a2aced0), [_tensor_str.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a7911f8d5e73adbff914d99fd7818ace2a7030b6a3748abe06ec6fc6e3df9cc3)
- Utility for formulas and updated manual functions to respect new view system as well as forward grad [FunctionsManual.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-6378bb6dc81a64dab676d61731341fa5d1088418f32a1473a33a0ccfc2357dc1), [FunctionsManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-4adbd88239afcd60e8198aab65d4f5e43b62314e34b80551e997a1ea503adea5) [rest of VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R264-R433)
- Ensure SavedVariable save forward grad properly [saved_variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c1b8039d776241abe177d5aa99b79dd9489a9b3e529da8ab24c2e386c1238ae2), [saved_variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-cc9fba479b5beae06b2eea2e390d17796e0341c5b037a20b5bcaccbb0c341030)

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D25678797

Pulled By: albanD

fbshipit-source-id: 3d58550c11b5f58b9b73fd30596d042b857fb9dd
2020-12-22 12:11:27 -08:00
Walter Shen
f5178bf151 Revert D25607503: Add base forward grad logic
Test Plan: revert-hammer

Differential Revision:
D25607503 (fdf02eff3d)

Original commit changeset: f1396290de1d

fbshipit-source-id: 057206e28ff48ee288856adfe3ca577d4880789f
2020-12-21 19:56:28 -08:00
albanD
fdf02eff3d Add base forward grad logic (#49097)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49097

RFC: https://github.com/pytorch/rfcs/pull/11

This PR add the basic logic to handle forward grad as dual Tensors.
It contains the following:
- Mechanism to save dual state on a Tensor and clear it up when the dual level ends
- C++ and python user facing API
- Updated view system that is able to track both forward and backward views

The current PR has the following limitations:
- Extensive tests are in the next PR in the stack as formulas are needed to write full tests.
- Only the manual formulas have been audited and no other formula is actually implemented here (they are in the next PR in the stack)
- Only level 0 is allowed for now. This was discussed and agreed that it is not needed for the first version of this PR.
- We can save one ViewInfo creation when both the forward and backward views have the same base. This can be done by adding a boolean flag to the DifferentiableViewMeta and extra logic in the `as_view` method. This is left out to keep this PR concise.
- We can skip tracking forward views if the base has a forward grad. This can be done by adding extra logic in the `as_view` method. This is left out to keep this PR concise.

Reading guide:
- Updated view handling in [gen_variable_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-f6553cec68caeaea36f6c8b14ff76a6d39dfd774e0ea9ef2f76e8d81fd9af5df), [VariableTypeUtils.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-ec71cfa45954dece1236c661d170e6341879c5be637f4abf52e826d61b40695a), [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285) (skip code below "[Forward Grad View]" for now), [variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-1604bcd0e4350ed99ec45e437cee7ac9ebe337392c9ea16a236247aeeb35b02bR266-R542) and [custom_function.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-dd85f452082b5bb6612bbc12adb496f8827defa228509f7b493de1d517522d5d). This introduces the new ViewInfo to hold view informations shared for forward and backward. It also updates the differentiable view meta to use this. And it updates the as_view function to handle both forward and backward view.
- New forward grad class that handle storing gradients and tracking at each level [forward_grad.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c6c5b9ab2d7e5dde4102495faa1b6bbbfc23aa3e47deb7359c0bfe1eb004c0cb), [forward_grad.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-de2ab54ade7312701850d71a119a4f4ee4b9fc5a9c42a467cdd4e73c033531dd) and [build_variables.bzl](https://github.com/pytorch/pytorch/pull/49097/files#diff-dfdfa2efb17beddfd9094524f95351fd197db6c8857e96b436fb599870359325). EDIT: These files also contain the new flag to globally disable forward AD that allows us to reduce performance issues while this is in development.
- Lowest level API and binding between Tensor and AutogradMeta in [TensorBody.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-7554853205392fa743357bf845ecc350a974ec049383248c12daaf2f4de04911), [TensorImpl.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-052bd9150ef8e09289ddf644b5a6830ede49207201cd41728f6d7cc6d9cead94), [TensorImpl.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-a15aae4cf23da44970db7cece62ff981265575c798c62f7b52d87c8809dfe2e1) and the rest of [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285R557-R677)
- API to access the forward primal that needs to be a differentiable function (and so in native_functions.yaml) [native_functions.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991) [NamedRegistrations.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-69bd3bea510c9b64e1633fa18c3ea63d4b8348dbad3a78ad9de844ab3e43dc1d), [VariableMethodsStub.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-23f5fcb737a2b289811fe0f4b65aef775e7c824b2e629ecd343df51405cd434f), [derivatives.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_python_functions.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_trace_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-54e0b976027bf8debefb959ff360b89ae93466970c843365b1b3a03806d868ce), [TraceTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-f34636741ad4a23d018e0c289bc750c3bad887b45660e1d6eaf440d234a78fbf) and [part of VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R198-R243)
- c++ API [autograd.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-349028fbe8291a965a7a263c323b208fe071c35c66179ee997ef84fa81aa4b1e), [autograd.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-a3fe908d67dfec16a1fcde300de68b0701bf68b88db7451f29f2bee255cf30c9)
- python binding [init.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-c58a67c85191c22c9b3bb439117d8053edfd9dea839fa010cf967d404c3c630d)
- python API [forward_ad.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a4efad4ba18fffdfb264c21e5475997a24a743089a899f8ec1a5ff962c6738d9), [autograd/__init__.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-743abcafd32ad0e69f39ac5a91df4197b7e1921c135cacee7ef6dc829a8a7af8)
- c++ and python printing [Formatting.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-881dba501e71662e2e4818b4b016f739b344c8aed2f5edc6b871eda47a2aced0), [_tensor_str.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a7911f8d5e73adbff914d99fd7818ace2a7030b6a3748abe06ec6fc6e3df9cc3)
- Utility for formulas and updated manual functions to respect new view system as well as forward grad [FunctionsManual.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-6378bb6dc81a64dab676d61731341fa5d1088418f32a1473a33a0ccfc2357dc1), [FunctionsManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-4adbd88239afcd60e8198aab65d4f5e43b62314e34b80551e997a1ea503adea5) [rest of VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R264-R433)
- Ensure SavedVariable save forward grad properly [saved_variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c1b8039d776241abe177d5aa99b79dd9489a9b3e529da8ab24c2e386c1238ae2), [saved_variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-cc9fba479b5beae06b2eea2e390d17796e0341c5b037a20b5bcaccbb0c341030)

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D25607503

Pulled By: albanD

fbshipit-source-id: f1396290de1d75760f3d380c43cdd56e86fa6099
2020-12-21 14:39:43 -08:00
Sebastian Messmer
1542c41a67 Change C++ frontend to take optional<Tensor> arguments (#41947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41947

Previously, if an op took an optional `Tensor?` argument, the C++ frontend (i.e. `at::op()` and `Tensor::op()`)
were generated to take `Tensor`. A previous PR (https://github.com/pytorch/pytorch/pull/41610) changed the kernels
to be written with `c10::optional<Tensor>` instead of `Tensor`, but that did not touch the C++ frontend yet.

This PR changes the C++ frontend API to take `c10::optional<Tensor>` instead of `Tensor` as well.
This should be mostly bc conserving. Since `Tensor` implicitly converts to `c10::optional<Tensor>`, any old code
calling an op with a `Tensor` would still work. There are likely corner cases that get broken though.
For example, C++ only ever does *one* implicit conversion. So if you call an op with a non-tensor object
that gets implicitly converted to a `Tensor`, then that previously worked since the API took a `Tensor` and
C++ allows one implicit conversion. Now it wouldn't work anymore because it would require two implicit conversions
(to `Tensor` and then to `c10::optional<Tensor>`) and C++ doesn't do that.

The main reasons for doing this are
- Make the C++ API more sane. Those arguments are optional and that should be visible from the signature.
- Allow easier integration for XLA and Autocast. Those backends generate code to wrap operators and forward
  operator arguments to calls to at::op(). After https://github.com/pytorch/pytorch/pull/41610, there was
  a mismatch because they had to implement operators with `optional<Tensor>` but call `at::op()` with `Tensor`,
  so they had to manually convert between those. After this PR, they can just forward the `optional<Tensor>`
  in their call to `at::op()`.
ghstack-source-id: 108873705

Test Plan: unit tests

Reviewed By: bhosmer

Differential Revision: D22704832

fbshipit-source-id: f4c00d457b178fbc124be9e884a538a3653aae1f
2020-07-31 16:11:55 -07:00
Edward Yang
9e81616343 Merge Tensor and Variable types. (#28287)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28287

This PR eliminates the static distinction between
Tensor and Variable.  Every Variable is a Tensor, no need to static_cast
or call the Variable constructor.

To do this, I need Tensor to have API parity with Variable. I have already
moved most of the methods I don't want in Tensor off Variable.
These implementations are all placed in Tensor.cpp.

One API difference is that all Variable methods now have const, so we no longer
have faux const-correctness (see https://github.com/zdevito/ATen/issues/27 for
back story)

This diff is BC breaking in a few ways:
- Because torch::autograd::Variable is now just an alias of at::Tensor, ADL for
  `torch::autograd` functions no longer works, you have to explicitly qualify
  them with `torch::autograd` (examples: `torch/nn/parallel/data_parallel.h`)
- Because Variable and Tensor are now the same type, code which assumes that
  they are different types (e.g., for the purposes of templating, or enable_if checks)
  will not work until you delete the (now) redundant overload/specialization.
  (examples: `torch/nn/modules/container/any.h`, `torch/csrc/utils/pybind.h`)

Some other notes:
- I'm not sure what was going with the old template implementation of `extract_vars`,
  but I couldn't get the sfinae version to work. Replacing it with an overloading based version
  made it work.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D18571426

Pulled By: ezyang

fbshipit-source-id: 2ea8151e5f1d8512cdebf1345399642e68b707b8
2019-11-21 09:26:39 -08:00
Stefan Krah
5dd915cd1a @albanD's #15219 augmented with SavedVariable::weak_grad_fn_ (#23502)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/10532 via https://github.com/pytorch/pytorch/issues/15219 and lots of analysis by albanD.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23502

Differential Revision: D16881340

Pulled By: ezyang

fbshipit-source-id: b483fe6c89ed9d27674c3347c043fe509ba80007
2019-08-28 14:58:53 -07:00
mal
e7a9b0d62f Rename torch::autograd::Function to torch::autograd::Node
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23269

Test Plan: Imported from OSS

Differential Revision: D16454878

fbshipit-source-id: b1e840fc2d3901955280d141e5ad6efd5e9d66af
2019-07-23 20:52:22 -07:00
Mikhail Zolotukhin
6ca38d9840 Cleanup includes in torch/csrc/autograd/* (#19923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19923
ghimport-source-id: 54debdd21ca0f4230b1915905673de274807a2e5

Differential Revision: D15125016

Pulled By: ZolotukhinM

fbshipit-source-id: 8d54f436e4508067089a1d05ce192093220aa1bb
2019-05-06 13:48:42 -07:00
Will Feng
4ae59e4744 Move version_counter_ to TensorImpl (#18223)
Summary:
According to https://github.com/pytorch/pytorch/issues/13638#issuecomment-468055428, after the Variable/Tensor merge, we may capture variables without autograd metadata inside an autograd function, and we need a working version counter in these cases. This PR makes it possible by moving `version_counter_` out of autograd metadata and into TensorImpl, so that variables without autograd metadata still have version counters.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18223

Differential Revision: D14735123

Pulled By: yf225

fbshipit-source-id: 15f690311393ffd5a53522a226da82f5abb6c65b
2019-04-11 15:12:45 -07:00
Edward Yang
517c7c9861 Canonicalize all includes in PyTorch. (#14849)
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>
Paths are adjusted to be rooted out of aten/src, torch/lib, or
the root level directory.

I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.

I used the following script to do the canonicalization:

```
  import subprocess
  import re
  import os.path

  files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
  for fn in files:
      if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
          continue
      if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
          continue
      with open(fn, 'r') as f:
          c = f.read()
      def fmt(p):
          return "#include <{}>".format(p)
      def repl(m):
          p = m.group(1)
          if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
              return fmt(p)
          if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
              return fmt(p)
          for root in ["aten/src", "torch/lib", ""]:
              for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                  new_p = os.path.relpath(os.path.join(bad_root, p), root)
                  if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                      return fmt(new_p)
          print("ERROR: ", fn, p)
          return m.group(0)
      new_c = re.sub(r'#include "([^"]+)"', repl, c)
      if new_c != c:
          print(fn)
          with open(fn, 'w') as f:
              f.write(new_c)
```

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849

Reviewed By: dzhulgakov

Differential Revision: D13363445

Pulled By: ezyang

fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
2018-12-08 19:38:30 -08:00
Owen Anderson
89d56ae435 Move function deletion from the stack to the heap. (#11611)
Summary:
This eliminates the need for any heuristics regarding stack size limits.

This is a re-do #11534 with a fix to properly handle cases where
multiple edges exist between a pair of functions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11611

Differential Revision: D9991198

Pulled By: resistor

fbshipit-source-id: fecd2c5cac7e78f82a0f20cf33268bb1617bb4a0
2018-09-21 16:11:03 -07:00
Peter Goldsborough
04939a4745 Match parameter names and = default (#9737)
Summary:
More clang tidy cleanups in `torch/csrc`. This time:

1. `hicpp-use-equals-default` recommends `= default` instead of `{}` for constructors/destructors. This is better practice because it expresses the intent better (https://stackoverflow.com/questions/6502828/what-does-default-mean-after-a-class-function-declaration)
2. `readability-inconsistent-declaration-parameter-name` enforces that parameter names in the declaration match parameter names in the definition. This is just generally useful and can prevent confusion and bugs.

Also updated my script a little bit.

apaszke ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9737

Differential Revision: D9069069

Pulled By: goldsborough

fbshipit-source-id: f7b3f3a4eb4c9fadc30425a153566d3b613a41ae
2018-07-30 14:10:00 -07:00
peter
53083b8353 Remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS and fix CUDA 8 build on Windows (#9491) (#9491)
Summary:
Fixes #9092.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9491
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9693

Differential Revision: D8946850

Pulled By: ezyang

fbshipit-source-id: bd816f459ab70f6b4a0983305a1ce341bb633707
2018-07-23 06:40:39 -07:00
Adam Paszke
aa7af94656 Make JIT tracing a thread-local property (#9414)
Summary:
As in the title. Lets us simplify a lot of code.

Depends on #9363, so please review only the last commit.

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9414

Reviewed By: zdevito

Differential Revision: D8836496

Pulled By: apaszke

fbshipit-source-id: 9b3c3d1f001a9dc522f8478abc005b6b86cfa3e3
2018-07-19 19:09:39 -07:00
Anders Papitto
4c615b1796 Introduce libtorch to setup.py build (#8792)
Summary:
Prior to this diff, there have been two ways of compiling the bulk of the torch codebase. There was no interaction between them - you had to pick one or the other.

1) with setup.py. This method
- used the setuptools C extension functionality
- worked on all platforms
- did not build test_jit/test_api binaries
- did not include the C++ api
- always included python functionality
- produced _C.so

2) with cpp_build. This method
- used CMake
- did not support Windows or ROCM
- was capable of building the test binaries
- included the C++ api
- did not build the python functionality
- produced libtorch.so

This diff combines the two.

1) cpp_build/CMakeLists.txt has become torch/CMakeLists.txt. This build
- is CMake-based
- works on all platforms
- builds the test binaries
- includes the C++ api
- does not include the python functionality
- produces libtorch.so

2) the setup.py build
- compiles the python functionality
- calls into the CMake build to build libtorch.so
- produces _C.so, which has a dependency on libtorch.so

In terms of code changes, this mostly means extending the cmake build to support the full variety of environments and platforms. There are also a small number of changes related to the fact that there are now two shared objects - in particular, windows requires annotating some symbols with dllimport/dllexport, and doesn't allow exposing thread_local globals directly.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8792

Reviewed By: ezyang

Differential Revision: D8764181

Pulled By: anderspapitto

fbshipit-source-id: abec43834f739049da25f4583a0794b38eb0a94f
2018-07-18 14:59:33 -07:00
Peter Goldsborough
2d5fbe6e0d Improve Variable interface (#5127)
* Improve Variable interface

* Address comments from @apaszke and @colesbury

* string ::operator= is not noexcept

* Remove ir.h from tracer_state.h to improve build times

* Make Variable a struct and pack SavedVariable fields

* Implement as_variable_ref

* grad_fn_ptr() -> grad_fn_unsafe()

* Reduce hackiness of set_type hack

* Include variable.h and edge.h in tracer_state.h because it uses them

* class Variable -> struct Variable because Windows cant even

* Make Variable::output_nr uint32_t instead of int

* Add comment about tracing state

* Replaced more static_cast<Variable&> and improve docs

* Remove SavedVariable destructor and construct members in init list

* Clarify docs for Variable

* Variable::set_version -> set_version_counter
2018-02-12 23:26:26 -05:00
Sam Gross
d605058212
Replace Variable.volatile with torch.no_grad() (#3970)
This removes volatile from Variable. The functionality is mostly
replaced by a global (thread-local) flag, which is controlled by
torch.set_grad_enabled() and the context manager torch.no_grad().

In C++, the flag is exposed through GradMode::is_enabled() and GradMode::set_enabled()

Fixes #3627
2017-12-18 15:46:13 -05:00
Edward Z. Yang
1c0fbd27a1
CuDNN bindings rewrite (into ATen) (#3666)
* Comprehensive rewrite of Torch CuDNN bindings / a bit of ATen infra

The executive summary is that this moves the torch/csrc/cudnn
library into ATen, adding a number of new cudnn_ methods to ATen
for batchnorm, convolution, affine grid generator and grid sampler.

ATen infra changes:

- TensorGeometry was moved to ATen
- TensorGeometry was modified to make its interface resemble that of
  Tensor; in particular, sizes is no longer a field, it's a method.
- AT_CUDA_ENABLED macro is set via ATen/Config.h header which is
  generated at cmake configure time.
  Fixes https://github.com/zdevito/ATen/issues/168
- Change AT_CUDA_ENABLED macro to be a function macro, so that we
  error if it is not defined
- Introduce a new TensorArg class, which is a Tensor plus a little
  metadata.  This helps us give good error messages when checking
  dimensions/shapes of tensors.
  Fixes https://github.com/zdevito/ATen/issues/169
- Also introduce a TensorGeometryArg class, for when you don't
  need the actual tensor data (which is most of the time.)
- Add ATen/Check.h, which contains a number of utility functions
  for testing shapes, types and devices of input tensors.  This
  will be particulary useful for native methods, which don't get
  code generated input testing code.  These functions take a
  'CheckedFrom' argument, at the moment just a string, which
  specifies some extra information about what function was
  doing the actual checking; this greatly improves error messages.
    - Many check functions take initializer lists, which let you
      test that all tensors have some property.  This API is
      peculiar, in that we IGNORE undefined tensors in this case.
      This is handled by filterDefined.
- Add AT_CUDNN_ENABLED macro
- CuDNN linking from ATen was improved; for example, we now actually
  add the CuDNN headers to our include path.
- Add some missing override specifiers to some methods
- We now actually build tests with CUDA functionality accessible
  (previously, AT_CUDA_ENABLED was not defined, meaning that
  the headers were missing all CUDA-only functionality.)
- Native functions now support giving explicit names to return
  outputs in yaml.  This makes it possible to hook into the NN
  autogenerated derivatives codepath using native functions.

CuDNN rewrite changes:

- torch/csrc/cudnn now uses ATen (rather than passing around
  THVoidTensor) and lives in ATen.  This lets us remove tensorPointer
  shenanigans.  The functions are exposed to ATen as native functions
  described in aten/src/ATen/cudnn/cuDNN.yaml
- ATen now builds and links against CuDNN when enabled.  The cmake
  package script was taken from Caffe2.
- Some header reorganization was done to help reduce dependencies
  on headers (this reorg is no longer used but I've kept it)
- Rename CHECK to CUDNN_CHECK
- Rip out old shape/type testing code in favor of modern ATen/Check.h
  interface using TensorArg.  In many cases, increase the robustness of
  the checking code.
- Change the inputs of the public facing functions, so that they can
  be bound by ATen
  - Delete THCState*; this is retrieved from the global ATen context
  - Delete cudnnHandle_t, this is retrieved from the global Handles.h
  - Delete cudnnDataType_t, this is retrieved from the Tensor type
  - Delete Convolution class, instead its constituent arguments are
    passed individually
- Change functions to return tensors, rather than take an appropriately
  sized output tensor as an input.
- Redo how transposed convolution / backward convolution is implemented
  (knock on effect of returning tensors).  Previously it was assumed
  that you would always pass an appropriately sized output tensor, but
  we don't want to do this anymore.  For backwards, we instead give
  the desired output tensor (input, really) size, because that is
  readily available.  For *transposed* convolution, however, we take
  output_padding, and otherwise do the shape calculation.
- Redo how legacy group convolution is implemented (knock on effect from
  porting cudnn to ATen.)  Previously, group convolution was implemented
  by manually constructing sizes and strides and then outputting
  appropriate, with macros switching between individual groups and
  all-at-once based on CuDNN version.  Now, the code looks exactly what
  you'd expect: there's a top-level wrapping function that supports
  group convolution no matter the version of CuDNN, and a low-level
  wrapper which supports only what CuDNN supports.  The top-level
  function conditions on CuDNN version, and invokes the low-level
  interface 1 or n times.
- There is now a debugging printer for tensor descriptors.
- Convolution struct is replaced with ConvolutionArgs, which is not
  part of the public API but is used internally to conveniently
  pass around all of the arguments needed for Convolution.
- Add some constexprs for well-known dimensions, reduce amount of
  magic numbers in code.
- Put 'deterministic' in to ConvParams.  Fixes #3659
- Lots more comments.
- Some pessimizations, in the name of code clarity:
  - The descriptors are initialized on every invocation of convolution
    forward/backward.  Previously, the descriptors were cached, so that
    you didn't have to initialize them again on backwards.  This is
    difficult to support in the ATen interface so I didn't support it.
  - Legacy group convolution initializes its workspace for *every* group
    it performs.  I did not feel motivated to fix this because the
    legacy codepath is already quite slow.
- Affine grid generator and grid sampler automatically call contiguous
  on their arguments as necessary.
- Batchnorm input checking is greatly beefed up, it now checks for
  the following input characteristics:
    - Definedness
    - GPU location
    - Type
    - Contiguity
    - Size

PyTorch binding code changes

- batchnorm now uses consistent var/data naming
- batchnorm and convolution make use of new ATen bindings
- Affine grid generator and grid sampler make use of ATen CuDNN
  bindings via derivatives.yaml.  This means I had to restructure
  the code a little, since the THNN bindings still go through
  a legacy Python class.
- I fixed some warnings:
  - s/friend class/friend struct/ on InterpreterStateImpl
  - Removed pessimizing move 'detached' in torch/csrc/autograd/variable.cpp
  - Removed unused pack_list on Scalar

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

GCC 4.8 buildfix

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Add TensorGeometry to ATen.h

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

CUDNN_CHECK

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Update TODO comment

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Delete return in cudnn_grid_sampler

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

s/cudnnSetStreamToCurrent/setCuDNNStreamToCurrent/g

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Don't allocate a new vector when filtering defined.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Remove Check overloads, convert to pass references.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Some more microbenchmarking.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-11-30 23:06:58 -05:00
Sam Gross
b09d66e60d
Fix a reference cycle when in-place ops on views save the output (#3679)
Previously, an in-place operation that saves its output (such as
relu/threshold) would create a reference cycle when applied to the a
view. There were two cycles created:

1) The cycle base.grad_fn.fn.input_.base
   base.grad_fn is a CopySlices
   base.grad_fn.fn is ThresholdBackward
   base.grad_fn.fn.input_ is a SavedVariable with base pointing to base

2) The cycle base.grad_fn.fn.input_.grad_fn.next_functions[0]
   base.grad_fn.fn.input_.grad_fn is AsStridedBackward
   and next_functions[0] points to base.grad_fn

Generally, we avoid cycles because the AD graph is mostly immutable. Two
notable exceptions are:

a) Variable.grad_fn can change to point to a new grad_fn
b) SavedVariables in a function can be set after the function is created

The first case is not a problem if grad_fns do not hold strong references
to Variables. Removing "base" from SavedVariable removes the strong ref.

For the second case, we need to avoid saving the grad_fn of outputs. We
were incorrectly saving the grad_fns of outputs when they were the
result of in-place ops on views.
2017-11-15 15:19:41 -05:00
Sam Gross
fde355f7d4
Allow in-place operations on views (#3384)
Allow in-place operations on views

Adds VariableViewImpl, a subclass of VariableImpl which has a pointer to
the base Variable on which it is a view. In-place operations on views
change the grad_fn of the base.

Note that in-place operations only work on views that are the first output of the function that created them. All C++/ATen implemented functions have this behavior, but it's possible to write Python-implemented autograd functions that do not. In-place operations on these view will raise an exception.

Fixes #3313
2017-11-06 18:19:56 -05:00
Edward Z. Yang
3696300fcf Include Python.h less using a new stub header.
In many "non-Python" headers, we include Python.h because we need
to declare a pointer to PyObject, and solely because of that.  It
would be a lot better if we had a simpler version of Python.h that
just declared PyObject available for pointers, without anything
else.  This is what torch/csrc/utils/python_stub.h does.

The good thing about not including Python.h is that it is easy to
be warning-less; no more ugly insertions of Python.h on headers
where it has no good reason to be.

This makes PyTorch warning clean again.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-10-19 23:04:19 -04:00
Sam Gross
e970d35091 Make VariableVersion refcounting thread-safe (#3184)
I've also made the version counter and the "live" reference count
atomics.

Note that it's not safe to set the version counter (operator=) from
multiple threads, because shared_ptr assignment isn't thread safe.
Currently, the only call sites to these functions are on newly created
variables before they can be accessed from other threads.

See #3111
2017-10-19 17:22:01 -04:00