Summary:
When we compute contiguity for a tensor with dynamic shapes we first:
1) Try to compute it without guarding.
2) If all shapes hinted, compute it with potentially adding guards.
3) if any input is not hinted, compute it symbolically.
sym_is_contiguous return a SymBool that is then either evaluated or guard_or_false can be called
on it to avoid data dependent errors.
ex:
bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);
is_contiguous_or_false is a helper function that does that.
In this PR I only handle default contiguity, will follow up with changes for other formats like channel_last .
We use this patter in this PR for several locations to avoid DDEs.
Test Plan:
contbuild & OSS CI,
Rollback Plan:
Reviewed By: malfet
Differential Revision: D77639021
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157472
Approved by: https://github.com/aorenste
When we compute contiguity for a tensor with dynamic shapes we first:
1) Try to compute it without guarding.
2) If all shapes hinted, compute it with potentially adding guards.
3) if any input is not hinted, compute it symbolically.
sym_is_contiguous return a SymBool that is then either evaluated or guard_or_false can be called
on it to avoid data dependent errors.
ex:
bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);
is_contiguous_or_false is a helper function that does that.
In this PR I only handle default contiguity, will follow up with changes for other formats like channel_last .
We use this patter in this PR for several locations to avoid DDEs.
Differential Revision: [D77183032](https://our.internmc.facebook.com/intern/diff/D77183032)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155590
Approved by: https://github.com/ezyang
When we compute contiguity for a tensor with dynamic shapes we first:
1) Try to compute it without guarding.
2) If all shapes hinted, compute it with potentially adding guards.
3) if any input is not hinted, compute it symbolically.
sym_is_contiguous return a SymBool that is then either evaluated or guard_or_false can be called
on it to avoid data dependent errors.
ex:
bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);
is_contiguous_or_false is a helper function that does that.
In this PR I only handle default contiguity, will follow up with changes for other formats like channel_last .
We use this patter in this PR for several locations to avoid DDEs.
Differential Revision: [D77183032](https://our.internmc.facebook.com/intern/diff/D77183032)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155590
Approved by: https://github.com/ezyang
Original issue: https://github.com/pytorch/pytorch/issues/154820
The issue happens when there is a mutation for the same input in forward AND in backward.
AOTD emited copy_ after joint_function tracing. This made this fx-node to correspond to the side effects of both mutations (in forward and in backward).
After that partitioner can put it either in forward or in backward.
The fix:
1/ Introduce joint_function.handle that allows to set "post_forward" callback, to be able to check inputs state after forward
We do not want to apply the mutation after joint, if we already applied it in forward. For that we need "mutation_counter" and memorize the version of mutation that we applied for forward mutation.
2/ Exposing mutation_counter to python
We want to keep invariant that copy_ exist only in the end of joint graph.
3/ We memorize mutation_counter and state of the inputs after forward, using the handle post_forward.
Emit post_forward mutations after joint graph fully traced.
add for post_forward mutations "must_be_in_forward" tag (similar to existing "must_be_in_backward") to keep them in forward.
4/ Ban recompute of the source of mutation. Recompute can apply the same op (e.g. add) in forward and backward.
For this set MUST_SAVE for the source of mutation in forward.
proxy_tensor changes:
By default proxy tensor updates tensor_tracker. In this case applied mutations will be chained.
But we want that this copy_ will be independent and applied just to primals.
For this introducing a contextmanager to be able to disable update of tensor_tracker for adding forward mutations.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155354
Approved by: https://github.com/bdhirsh
This PR runs an automatic check as part of dynamo_wrapped to make sure that all unimplemented_v2() callsites are mapped to the JSON file. It also fixes the issue of the CI not able to expand the hints, which was the root cause of the previous workflow failure. If not, the dev gets a message giving them instructions on how to update the JSON file. I also updated a dynamic gb_type to static and updated its test_error_message to include the GBID link for the graph break (before the link would not be produced).
Testing:
I ran the file with the argument to ensure all cases were covered, and also tested the test in CI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156237
Approved by: https://github.com/williamwen42
Original issue: https://github.com/pytorch/pytorch/issues/154820
The issue happens when there is a mutation for the same input in forward AND in backward.
AOTD emited copy_ after joint_function tracing. This made this fx-node to correspond to the side effects of both mutations (in forward and in backward).
After that partitioner can put it either in forward or in backward.
The fix:
1/ Introduce joint_function.handle that allows to set "post_forward" callback, to be able to check inputs state after forward
We do not want to apply the mutation after joint, if we already applied it in forward. For that we need "mutation_counter" and memorize the version of mutation that we applied for forward mutation.
2/ Exposing mutation_counter to python
We want to keep invariant that copy_ exist only in the end of joint graph.
3/ We memorize mutation_counter and state of the inputs after forward, using the handle post_forward.
Emit post_forward mutations after joint graph fully traced.
add for post_forward mutations "must_be_in_forward" tag (similar to existing "must_be_in_backward") to keep them in forward.
4/ Ban recompute of the source of mutation. Recompute can apply the same op (e.g. add) in forward and backward.
For this set MUST_SAVE for the source of mutation in forward.
proxy_tensor changes:
By default proxy tensor updates tensor_tracker. In this case applied mutations will be chained.
But we want that this copy_ will be independent and applied just to primals.
For this introducing a contextmanager to be able to disable update of tensor_tracker for adding forward mutations.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155354
Approved by: https://github.com/bdhirsh
This PR includes the GBID weblink whenever a user encounters a graph break. I also had to include the JSON file in setup.py, so it can be part of the files that are packaged in during CI. It also fixes the issue of the hardcoded error messages stripping away one of the '/' in 'https'.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156033
Approved by: https://github.com/williamwen42
This PR includes the GBID weblink whenever a user encounters a graph break. I also had to include the JSON file in setup.py, so it can be part of the files that are packaged in during CI. It also fixes the issue of the hardcoded error messages stripping away one of the '/' in 'https'.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156033
Approved by: https://github.com/williamwen42
Whose version is controlled by `eigen_pin.txt`, but which will be installed only if BLAS providers could not be found.
Why this is good for CI: we don't really build with Eigen ever and gitlab can be down when github is up, which causes spurious CI failures in the past, for example.
Remove eigen submodule and replace it with eigen_pin.txt
Fixes https://github.com/pytorch/pytorch/issues/108773
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155955
Approved by: https://github.com/atalman