pytorch/torch/csrc/autograd/cpp_hook.cpp
soulitzer 049838f249 Improve hooks ordering behavior (#85849)
Addresses: https://github.com/pytorch/pytorch/issues/35802

Design doc: https://docs.google.com/document/d/19xSib7FFknRQ5f3ptGFUmiOt3BrgXSUlTQH2xMcZJYg/edit#

### Changes in this PR

#### Implementation
- We now have 3 fields: `pre_hooks`, `retains_grad_hooks`, and `tensor_pre_hooks`, so that we can more precisely define their ordering and when they are executed.
- Since retains grad uses an entirely new field, we cannot reuse the old retains-grad logic. We refactor retains grad to call directly into the variable.cpp logic. Other logic in variable.cpp that handles cpp hooks must also be updated.

#### Hooks ordering and execution:
- Pre-hooks registered on a tensor now run before pre-hooks registered on its grad_fn
- Pre-hooks registered on a tensor now always run, even when the tensor is passed via `inputs=` to `.grad()`
- Post-hooks (and grad_fn pre-hooks) can now observe the modifications to the gradient made by tensor pre-hooks

#### Retains grad hooks
- Retains-grad hooks always execute last, even if other tensor pre-hooks are registered

#### Unchanged:
- pre_hooks registered on a grad_fn are still not expected to execute when the corresponding tensor is passed via `inputs=` to `.grad()`

Follow-ups:
- Simplify the retains_grad field so it is not a vector, since it always holds a single hook
- Potentially merge capture hooks with tensor pre-hooks; this would involve some additional refactoring
- The behavior of Python hooks registered on a tensor is still wrong for in-place operations

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85849
Approved by: https://github.com/albanD
2023-01-17 16:23:21 +00:00


#include <c10/util/irange.h>
#include <torch/csrc/autograd/cpp_hook.h>
#include <torch/csrc/autograd/custom_function.h>
#include <torch/csrc/autograd/variable.h>

namespace {

using torch::autograd::Variable;

void check_single_result(
    const at::TensorBase& value,
    const at::TensorBase& result,
    std::string hook_name) {
  if (!value.defined()) {
    throw std::runtime_error(
        "can't replace an empty gradient with a non-empty value");
  }
  torch::autograd::check_variable_result(value, result, hook_name);
}

} // namespace

namespace torch {
namespace autograd {

// NOLINTNEXTLINE(modernize-pass-by-value)
CppFunctionTensorPreHook::CppFunctionTensorPreHook(
    const std::shared_ptr<hooks_list>& hooks,
    int value_idx)
    : hooks_(hooks), value_idx_(value_idx) {}

variable_list CppFunctionTensorPreHook::operator()(
    const variable_list& values) {
  auto value = values[value_idx_];
  for (const auto i : c10::irange(hooks_->size())) {
    auto& hook = (*hooks_)[i];
    if (!hook) {
      // hook was removed
      continue;
    }
    auto res = hook(value);
    if (!res.defined()) {
      // Don't change gradient
      continue;
    }
    check_single_result(value, res, c10::to_string(i));
    value = std::move(res);
  }
  variable_list results(values);
  results[value_idx_] = value;
  return results;
}

} // namespace autograd
} // namespace torch