It's not clear to me what's the difference between `unfold` and `unfold_copy`, as this latter one is codegen'd
I also took this chance to clean the implementation of unfold and its reference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85629
Approved by: https://github.com/mruberry
Based on @ezyang's suggestion, mode stack now has "one true mode" which is the _only_ mode that can ever be active at the C++ level. That mode's torch dispatch is just to take the top mode in the stack, reenable itself (if we aren't at the end of the mode stack), and run the top mode's torch_{dispatch|function}
This maintains that in the middle of a mode's torch dispatch, the mode itself will not be active. It changes the function the user has to call to see what the current mode is (no longer queries the C++, it's python only) but allows the user to also see the entire mode stack easily
Removes `enable_torch_dispatch_mode` and `.restore()` since neither makes sense in this new setup
### Background
Why do we want this? Well, a pretty common pattern that was coming up was that users had to do something like
```python
## PRE-PR UX
def f(mode):
with mode.restore(): # user needs to understand this restore thing?
...
with Mode() as m:
pass
f(m)
```
Many users were getting error from forgetting to call `.restore` or from forgetting to add the (tbh weird) "mode instantiation" step where they use the mode as a context manager with an empty body. Really, they wanted to treat modes like context managers and just write
```python
## FROM FEEDBACK, USER DESIRED CODE. POSSIBLE POST-PR
def f(mode):
with mode:
...
f(Mode())
```
** Technical Details **
With the old mode stack, we basically had a linked list so the mode itself could only be used once and had a fixed parent. In this new design, the mode stack is just a python list that we're pushing to and popping from. There's only one mode that's ever active at the C++ level and it runs the next mode in the Python list. The modes don't have state on them anymore
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84774
Approved by: https://github.com/ezyang, https://github.com/zou3519
The output striding channels-last preservation logic differs between cuda and cpu. For the meta kernel, we can peek at the fake tensor device and use that to determine whether to do cpu or cuda.
You could argue there's a leaking of abstraction here but this seems like a pretty minimal leak and I'm not sure there's a much cleaner way forward for device-specific striding tracing logic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82846
Approved by: https://github.com/ezyang
Previously, we would trace through the following with no error:
```
from torch.fx.experimental.proxy_tensor import make_fx
import torch
def f(x, y):
return x[0, y:]
```
Even though the output shape is dependent on the data of `y`. Now, throw on the conversion of `y` to an integer.
It would be nice to not break on constant tensors but I'll do that as the next PR (Edit: done with https://github.com/pytorch/pytorch/pull/84387). Sketching out how that would work (and keep in mind this is applicable Dynamo tracing and not just AOT Autograd)
I think to do that you would need to :
- hold strong refs to a set of constant tensors, and only allow them to be captured from `lift_fresh.copy`
- when you run a mutable op, either remove it from the set of constant tensors or run the operator for real
- limit to small constant tensors
Anything else ?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83567
Approved by: https://github.com/ezyang
Conditional decomposing aten::_to_copy to nvprim::convert_element_type to allow fusion with type casting, which is introduced during type promotion phase at torch decomposition.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83782
Approved by: https://github.com/ngimel
The `ref` property was moved down from `{Unary,Binary}UfuncInfo` into
`OpInfo` quite some time ago, but `OpInfo` uses `None` to signal no
reference is available while the others use `_NOTHING`. This makes
everything consistently use `None`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82348
Approved by: https://github.com/ngimel
The `ref` property was moved down from `{Unary,Binary}UfuncInfo` into
`OpInfo` quite some time ago, but `OpInfo` uses `None` to signal no
reference is available while the others use `_NOTHING`. This makes
everything consistently use `None`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82348
Approved by: https://github.com/ngimel
Ref #82518
Starting small to minimize merge conflicts, this moves the top-level
class definitions and some helper functions into the `opinfos` folder.
It also brings `common_methods_invocations.py` to just below 1MB.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82540
Approved by: https://github.com/albanD