This PR is part of a series attempting to re-submit https://github.com/pytorch/pytorch/pull/134592 as smaller PRs.
In jit tests:
- Add and use a common raise_on_run_directly method for when a user directly runs a test file that should not be run that way; print the file the user should have run instead (a sketch follows below).
- Raise a RuntimeError for tests that have been disabled (not run).
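A minimal sketch of the kind of helper this adds, assuming a name and message format based on the description above; the real helper in torch.testing._internal may differ in signature and wording:

```python
# Hypothetical sketch of the helper described above.
def raise_on_run_directly(file_to_run):
    # Called from a test file's `if __name__ == "__main__":` block when that
    # file is only meant to be driven by another runner (e.g. test_jit.py).
    raise RuntimeError(
        "This test file is not meant to be run directly. "
        f"Run it through {file_to_run} instead."
    )


# Example use at the bottom of a jit test module:
if __name__ == "__main__":
    raise_on_run_directly("test/test_jit.py")
```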
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154725
Approved by: https://github.com/clee2000
Without profiled outputs, autodiff can't tell whether the outputs of a DifferentiableGraph should require grad. When no profiled information was available, autodiff defaulted to requires_grad=True, marking tensors as requiring grad when they shouldn't have. This change adds requires_grad info to the output's type when it can be recovered from later uses of the output.
Adds a test for correct autodiff requires_grad behavior, and a test that the output type is correctly annotated in create_autodiff_subgraphs.
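As an illustration of the behavior being tested (a hedged sketch, not the exact test from this PR): with the profiling executor, an output that depends only on a non-requires_grad input should not come back requiring grad once the graph has been specialized.

```python
import torch


@torch.jit.script
def fn(a, b):
    # Two independent outputs: one from a grad-requiring input, one not.
    return a * a, b * b


a = torch.rand(3, requires_grad=True)
b = torch.rand(3, requires_grad=False)

# Run a few times so the profiling executor records tensor properties
# and autodiff can form a DifferentiableGraph.
for _ in range(3):
    out_a, out_b = fn(a, b)

assert out_a.requires_grad        # depends on `a`, which requires grad
assert not out_b.requires_grad    # should not be marked requires_grad=True
```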
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79498
Approved by: https://github.com/eellison
Without profiled outputs, autodiff can't tell whether the outputs of a DifferentiableGraph should require grad. When no profiled information was available, autodiff defaulted to requires_grad=True, marking tensors as requiring grad when they shouldn't have. This change adds requires_grad info to the output's type when it can be recovered from later uses of the output.
Adds a test for correct autodiff requires_grad behavior, and a test that the output type is correctly annotated in create_autodiff_subgraphs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78875
Approved by: https://github.com/eellison
When autodiff constructs the Gradient object, it looks at the forward graph and records in df_input_vjps all the outputs that require grad. At runtime, graph_executor.cpp detaches the tensors before running the autodiff forward graph, and then sets requires_grad back on the outputs that need it.
Previously, the requires_grad check was done simply by calling `output->requires_grad()`. But at the point when autodiff is invoked by the profiling executor, the profiled information is still on the profile nodes, not on the values. So requires_grad would not be set on the output values, and requires_grad() would default to True for all tensors. As a result, more output tensors than expected would require grad.
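A conceptual Python sketch of that runtime step, using simplified, hypothetical names; the real logic lives in C++ in graph_executor.cpp:

```python
# Simplified model of the runtime behavior described above: inputs are
# detached before the autodiff forward graph runs, and requires_grad is
# restored only on the outputs recorded in df_input_vjps.
def run_forward_with_grad_outputs(forward_fn, inputs, df_input_vjps):
    detached = [t.detach() for t in inputs]
    outputs = forward_fn(*detached)
    for i, out in enumerate(outputs):
        if i in df_input_vjps:
            # Recorded at construction time because this output requires grad;
            # with this fix, that decision uses the profiled information
            # instead of defaulting to True for every output.
            out.requires_grad_(True)
    return outputs
```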
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78392
Approved by: https://github.com/eellison