Without profiled outputs, autodiff can't tell whether the outputs of a DifferentiableGraph should have requires_grad set. With no profiled information, autodiff defaulted to requires_grad=True, marking tensors as requiring grad when they shouldn't. This adds requires_grad info onto the type of the output when it can be found in later uses of the output.
Adds a test for correct autodiff requires_grad behavior, and a test that the output type is correctly annotated in create_autodiff_subgraphs.
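As a rough illustration (not the PR's actual test; the function and warm-up count below are made up), the behavior being fixed looks like this:
```
import torch

@torch.jit.script
def fn(x, y):
    # x requires grad, y does not; the second output should not require grad.
    return x * 2, y * 2

x = torch.randn(4, requires_grad=True)
y = torch.randn(4, requires_grad=False)

# Warm up so the profiling executor records requires_grad info and forms
# DifferentiableGraph nodes.
for _ in range(3):
    a, b = fn(x, y)

assert a.requires_grad        # depends on x
assert not b.requires_grad    # depends only on y; without the fix this could wrongly be True
```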
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79498
Approved by: https://github.com/eellison
Without profiled outputs, autodiff can't tell whether the outputs of a DifferentiableGraph should have requires_grad set. With no profiled information, autodiff defaulted to requires_grad=True, marking tensors as requiring grad when they shouldn't. This adds requires_grad info onto the type of the output when it can be found in later uses of the output.
Adds a test for correct autodiff requires_grad behavior, and a test that the output type is correctly annotated in create_autodiff_subgraphs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78875
Approved by: https://github.com/eellison
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57575
This PR does two things:
1. reverts "Manual revert of D27369251 (f88a3fff65) (#56080)" in commit
92a09fb87a.
2. fixes DifferentiableGraph outputs carrying the wrong requires_grad flag.
To fix requires_grad on outputs from a DifferentiableGraph, the proper flag is
retrieved from profiling information. Previously we only retrieved the profiling
information from the first profile node among all uses of an output. However, when
control flow is present, we need to iteratively search for a profile node whose
profiling information is available, since the first use might be in an inactive
code path.
e.g.
```
graph(%0 : Tensor,
      %1 : Bool):
  ..., %2 : Tensor = prim::DifferentiableGraph_0(%0)
  %3 : Tensor = prim::If(%1)
    block0():
      %4 : Tensor = prim::DifferentiableGraph_1(%2)
      -> (%4)
    block1():
      %5 : Tensor = prim::DifferentiableGraph_2(%2)
      -> (%5)
  -> (%3)
with prim::DifferentiableGraph_0 = graph(%0 : Tensor):
  ...
  %out : Tensor = aten::operation(...)
  ...
  return (..., %out)
with prim::DifferentiableGraph_1 = graph(%0 : Tensor):
  %temp : Tensor = prim::profile[profiled_type=Tensor](%0)
  ...
with prim::DifferentiableGraph_2 = graph(%0 : Tensor):
  %temp : Tensor = prim::profile[profiled_type=Float(...)](%0)
  ...
```
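For reference, a hypothetical TorchScript function shaped like the IR above (not taken from the PR's tests): the output of the first differentiable block is used in both branches of an `if`, so its first use may sit in a branch that was never executed during profiling.
```
import torch

@torch.jit.script
def fn(x: torch.Tensor, flag: bool) -> torch.Tensor:
    out = x.relu() * 2        # ends up in DifferentiableGraph_0
    if flag:                  # prim::If
        return out.sin() + 1  # DifferentiableGraph_1, first use of %out
    else:
        return out.cos() + 1  # DifferentiableGraph_2, second use of %out

# If flag is always False while profiling, only the else-branch's prim::profile
# node carries real requires_grad information, so the pass has to keep searching
# past the unprofiled first use.
for _ in range(3):
    fn(torch.randn(4, requires_grad=True), False)
```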
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D29038773
Pulled By: Krovatkin
fbshipit-source-id: 6c0a851119f6b8f2f1afae5c74532407aae238fe
Summary:
Fixes https://github.com/pytorch/pytorch/issues/54783
We need to be extra careful with the pattern in order to legitimately use `unchecked_unwrap_optional` in autodiff.
This at least allows us to start supporting `Optional[Tensor]` in autodiff, which is quite common in composite layers.
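A hedged sketch of the kind of composite pattern this is aimed at (names are illustrative, not from the PR): an `Optional[Tensor]` argument, such as an optional bias, that is unwrapped behind a None-check before use.
```
from typing import Optional

import torch

@torch.jit.script
def linear_like(x: torch.Tensor, w: torch.Tensor, b: Optional[torch.Tensor]) -> torch.Tensor:
    out = x.matmul(w.t())
    if b is not None:   # the None-check refines Optional[Tensor] to Tensor
        out = out + b
    return out

x = torch.randn(2, 3, requires_grad=True)
w = torch.randn(4, 3, requires_grad=True)
print(linear_like(x, w, None).shape)            # works without a bias
print(linear_like(x, w, torch.randn(4)).shape)  # and with one
```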
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55565
Reviewed By: ejguan
Differential Revision: D27825336
Pulled By: Krovatkin
fbshipit-source-id: a8562eb10ea741effff430d7417d313b1eb53dfe
Summary:
Retrieving the profile node is much easier prior to inserting the guard node.
Test cases are updated to reflect the patch on previously failing cases.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55701
Reviewed By: pbelevich
Differential Revision: D27701216
Pulled By: Krovatkin
fbshipit-source-id: e2e6b64b682377e622b75c762e85ff7967e45118
Summary:
Fixes https://github.com/pytorch/pytorch/issues/54040
`prim::RequiresGradCheck` guarantees that the requires_grad properties
of input tensors match the profiled ones; otherwise a fallback path
is triggered. This allows us to prune gradients in the backward
graph for inputs that don't need them. We transfer requires_grad
properties from the inputs of the `prim::DifferentiableGraph` node onto the inputs of the
differentiable graph. Autodiff inspects these properties and prunes
off gradients that aren't required.
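An illustrative sketch of the effect (not the PR's test; the warm-up count is an assumption): once requires_grad is profiled and guarded, no gradient should be produced for an input that doesn't require grad.
```
import torch

@torch.jit.script
def fn(a, b):
    return (a * b).sum()

a = torch.randn(5, requires_grad=True)
b = torch.randn(5, requires_grad=False)

for _ in range(3):   # warm up the profiling executor
    out = fn(a, b)
out.backward()

assert a.grad is not None   # gradient still flows to a
assert b.grad is None       # pruned: b never required grad
```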
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54374
Reviewed By: H-Huang
Differential Revision: D27369251
Pulled By: Krovatkin
fbshipit-source-id: 2bce7a2d7f2ec091db9bf4c4b91d8b29edd5be11
Summary:
This adds guarding for DifferentiableGraph nodes in order to not depend on
It also bails out on required gradients for the CUDA fuser.
Fixes https://github.com/pytorch/pytorch/issues/49299
A handful of failing tests still need investigation, but this can serve as a basis for discussion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49433
Reviewed By: ngimel
Differential Revision: D25681374
Pulled By: Krovatkin
fbshipit-source-id: 8e7be53a335c845560436c0cceeb5e154c9cf296
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42141
Update the alias db in place instead of constructing it from scratch on each change, which caused O(n^2) behavior.
Description from https://github.com/pytorch/pytorch/pull/37106 holds pretty well:
"""
Recomputing the aliasdb on every fusion iteration + in every subblock
is hugely expensive. Instead, update it in-place when doing fusion.
The graph fuser pass operates by pushing nodes into a fusion group. So
we start with
`x, y = f(a, b, c)`
and end with:
```
x_out, y_out = prim::fusionGroup(a, b, c)
x_in, y_in = f(a_in, b_in, c_in)
-> x_in, y_in
```
We destroy the x and y Value*s in the process. This operation is
easy to express as an update to the aliasDb--x_out just takes on all
the aliasing information x used to have. In particular, since we know
f and prim::fusionGroup are purely functional, we don't have to mess
with any write information.
"""
The one difficulty here is that mapping x, y to x_out, y_out is not trivial when merging nodes into the autodiff subgraph node.
There are a few options:
- attempt to make all subgraph utils & ir cloning logic update a map
- mirror the subgraph utils implementation in create_autodiff_subgraph
- uniquely map x, y and x_in, y_in so you can back out the correspondence.
I went with the third option.
This shouldn't affect the results of the pass at all. Let me know if you think there's anything else I should be doing to test; I was considering exposing an option to run create_autodiff_subgraphs without the post processor and checking that the alias db was correctly updated.
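Purely as an illustration of option 3 (this is not the pass's C++ code; every name here is hypothetical), the idea of tagging values with unique ids so the old-to-new correspondence can be backed out afterwards looks roughly like:
```
import itertools

_uid = itertools.count()

def tag(values):
    # Attach a fresh unique id to every value we are about to clone.
    return {next(_uid): v for v in values}

def back_out_correspondence(old_by_id, new_by_id):
    # Pair old and new values that carry the same unique id.
    return {old_by_id[i]: new_by_id[i] for i in old_by_id}

def transfer_alias_info(alias_db, correspondence):
    # Each new value simply takes on all the aliasing information the old value
    # had; since the ops involved are purely functional, no write info changes.
    for old, new in correspondence.items():
        alias_db[new] = alias_db.pop(old)
```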
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision: D22798377
Pulled By: eellison
fbshipit-source-id: 9a133bcaa3b051c0fb565afb23a3eed56dbe71f9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39497
Previously, we didn't consider side effects at all when moving nodes in alias analysis. It is never valid to reorder a node with a side effect. This has led to bugs when used with Bailouts.
Unfortunately this might cause regressions, but the prior behavior wasn't correct :/
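A minimal example of the constraint (hypothetical, not from the PR): the print below has no data dependency on its neighbours, but it still must not be reordered relative to them.
```
import torch

@torch.jit.script
def fn(x):
    y = x * 2
    print("about to add")   # side-effecting node: never valid to reorder
    return y + 1

fn(torch.randn(3))
```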
Test Plan: Imported from OSS
Differential Revision: D21963774
Pulled By: eellison
fbshipit-source-id: 656995d1b82534eca65437ed4e397b2bf08a4dec
Summary:
The existing context manager only conditionally enabled profiling mode, which was counterintuitive. When we changed the default executor, this broke internal benchmarking as a result.
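A minimal sketch of the unconditional behavior, assuming the internal `torch._C._jit_set_profiling_executor` / `torch._C._jit_set_profiling_mode` toggles of that era (the real helper lives in PyTorch's JIT test utilities and may differ):
```
from contextlib import contextmanager

import torch

@contextmanager
def enable_profiling_mode():
    # Always turn profiling on, and restore the previous settings on exit.
    old_executor = torch._C._jit_set_profiling_executor(True)
    old_mode = torch._C._jit_set_profiling_mode(True)
    try:
        yield
    finally:
        torch._C._jit_set_profiling_executor(old_executor)
        torch._C._jit_set_profiling_mode(old_mode)
```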
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37825
Differential Revision: D21404611
Pulled By: eellison
fbshipit-source-id: 306b3c333ef4eb44ab6a6e5ab4e0682e5ce312ce
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445
Create distributed and rpc directories under caffe/test for better management
of unit tests.
Differential Revision: D18702786
fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29249
This splits out all the tests that are "easy", leaving `TestJit`,
`TestScript`, the autogenerated tests, and a small docs test.
Splitting those into reasonable chunks takes more effort and is less mechanical.
Differential Revision: D18339007
Test Plan: Imported from OSS
Pulled By: suo
fbshipit-source-id: 69164b9f9a2c379fe8923a846c98dd3c37ccb70e