when a tensor has unbacked symbols it can be general enough to represent both contiguous and non contiguous tensors.
in that case we cant really evaluate is_contiguous. In many places in the code base, we check for is_contiguous to take a fast path. but the general path usually works for both contiguous and not contiguous in that case we probably want
to use definitely _contiguous API.
This is appleid for reshape in this PR and also to tensor meta data computation, the meta data now will have an attribute that says that its contiguous when its always contiguous. We would store that only if definitely _contiguous is true now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153432
Approved by: https://github.com/bobrenjc93
#### change 1: if compute_strides stride fail for reshape just clone.
Lets consider the most general case, if torch compile is asked to reshape [u0, u1][u3, u4] -> [u5, u6] what shall it do?
The shape is general enough to represent both contiguous and non contiguous tensors, tensors where a clone free reshape can happen and other where a clone free cant happen. The current algorithm will fail due to data dependent errors.
The general idea is if its impossible to tell if the reshape can happen in place, (because for some concrete inputs
it will and other not) then its ok to take the general path and clone, instead of failing or asking the user to give hints.
**Because the user want a single graph (single compilations)** and this is the only way it can be done.
Had this been a view? then the user is explicitly asking for a copy-free reshape, we would fail asking for more
information (hints in torch.checks form).
with this change reshape works as the following:
1. if we know the input is contiguous we will convert the reshape to view.
2. if compute_strides succeed we will use view. (compute_strides was changed to not fail when when unbacked presented instead it will just return nullptr if it cant compute the strides meaning we shall use a clone).
3. if neither 1, 2 works clone and use a view.
Side note: having a view does not mean that inductor will not clone, for inductor there is a pass that converts all views back to reshapes and inductor has its logic dealing with those.
#### change 2 : skip _reshape_view_helper and fall back to simpler logic if it fail.
We trace _reshape_view_helper when doing fake tensor tracing , but not during proxy tracing. hence such tracing wont effect the graph (only compute output shapes of several operations). We should not fail there, because it should always be possible for us to pass it in case of reshape.
i.e. when reshape_symint was called we would have either cloned, or compute_strides succeeded so the view should pass. What I did is the following: we run _reshape_view_helper, if we fail due to unbacked we call _view_simple which will succeed always for reshapes, (might fail for views when its impossible to do the view, in such case we throw the dde that was thrown by the original algorithm).
Ideally I would want to register _view_simple as the meta for view and avoid calling _reshape_view_helper completely but I am running some issues with the dispatcher with subclasses and I do not have time to debug it. Namely one test
would end up calling some c++ view function that does not support symints during meta dispatch when i register a
python meta decompositions
```python test/dynamo/test_subclasses.py SubclassTests.test_subclass_views_dynamic_True ```
https://github.com/pytorch/pytorch/issues/153303.I will follow up with that change in a separate PR. cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @bdhirsh
Two other alternatives for registering _view_simple as meta and the try catch approach in this PR is:
1. call _view_simple if any input is dynamic see #153521
2. if we make is_compiling works for framework code tracing (does not work rn) we can call _view_simple
is if is_compiling.
#### Note:
Reshape can still fail when is_contiguous is called, Next PR will handle that by calling is_known_contiguous.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153198
Approved by: https://github.com/etaf, https://github.com/bobrenjc93
An internal test case ran into a weird issue when exporting, where the model imported a file which creates tensor constants upon importing [(code ptr)](https://fburl.com/code/xwmhxm7n). This causes the tracer to create some tensor constants even though it's not used in the model code. This PR updates the lift_constant_tensors pass to remove constant nodes that are not being used instead of lifting them as tensor constants.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153800
Approved by: https://github.com/dolpm, https://github.com/pianpwk
so two things other than cleanups and refactoring
1) do not use propagate_real_tensors to resolve eval under guard_or_true/guard_or_false .
2) do not guard for dimensions of type DimDynamic.OBLIVIOUS_SIZE under guard_or_true/guard_or_false .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152657
Approved by: https://github.com/pianpwk
so two things other than cleanups and refactoring
1) do not use propagate_real_tensors to resolve eval under guard_or_true/guard_or_false .
2) do not guard for dimensions of type DimDynamic.OBLIVIOUS_SIZE under guard_or_true/guard_or_false .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152657
Approved by: https://github.com/pianpwk
Basically adds native _IntWrapper support to dynamo. Here's my process of trying to make symint input support work on dynamo, and how I ended up with this approach [(doc)](https://docs.google.com/document/d/1GvNRQd8BnxlMay_hrEVgEta6VUeUW_hcFeRuB7q1nDY/edit?tab=t.0).
What I did was, before passing inputs to dynamo.export, I first wrap them with a class, `_IntWrapper`. When processing dynamic shapes, I will then add the corresponding dynamic shape specification to the `dynamism` field stored on the `_IntWrapper`. If there is no dynamism specified, then this will get unwrapped back to an integer. When dynamo tracing, when we encounter an `_IntWrapper`, we will convert this to a symint if the dynamism was specified as `Dim.DYNAMIC/AUTO`. Dynamo will then trace a graph that contains symint inputs, which will get passed to AOTAutograd and so on.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152677
Approved by: https://github.com/pianpwk
Summary:
1. Adding `input` field to `_adapt_flat_args` function
2. In `process_forward_inputs`, `reorder_kwargs` will now do nothing if no kwargs are provided (previously would error)
3. Pass `args` as input to `_adapt_flat_args`
These changes are made to update the InputAdapter
see more context in D73811508
Test Plan: see D73811508
Differential Revision: D73945419
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152575
Approved by: https://github.com/angelayi
Fixes#151582
example warning for Dim.AUTO:
```
torch/_export/non_strict_utils.py:499] dimension inputs['x'].shape[1] 0/1 specialized; Dim.AUTO was specified along with a sample input with hint = 1.
```
example error when Dim.DYNAMIC specializes:
```
- Received user-specified dim hint Dim.DYNAMIC(min=None, max=None), but export 0/1 specialized due to hint of 0 for dimension inputs['x'].shape[0].
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151827
Approved by: https://github.com/angelayi
We need to make function schema proxyable to trace a the auto_functionalized hop that takes function schema as inputs. The implementation basically follows how we support torchbind object:
1. upon seeing an untracked function schema arg, we creates a constant get_attr node
2. we track the function schema argument in export to support lift/unlift.
3. we need to support serde for functional schema. We'll add support for this in follow-up PRs.
However, compared with torchbind object:
1. we don't need a dynamo implementation, because the function schema is added when we auto_functionalize a hop to the argument of auto_functionalized. One potential use case is users re-traces an exported program with strict mode. Since non-strict is the default now, we don't see a use case yet.
2. we don't need an inductor implementation, because the function schema will go away after auto_functionalized re-inplacing pass.
edit: we greatly simplifies (and generalizes) the implementation following @zou3519 's suggestion of using pytree.register_constant
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152073
Approved by: https://github.com/zou3519
ghstack dependencies: #152072
Summary:
Fixes https://github.com/pytorch/pytorch/issues/151476
The `custom_meta` collected from `mod` has keys that follow name of nodes in `mod`, which are inconsistent with the node names after the naming pass. For example a constant `b` will become `c_b`.
Test Plan: buck2 run caffe2/test:test_export -- -r test_run_decompositions_keep_tensor_constant_metadata
Differential Revision: D73703068
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152241
Approved by: https://github.com/angelayi
With `AdditionalInputs`, the behavior is the same as with tensors:
```python
class M(torch.nn.Module):
def forward(self, x, y):
return x + y
additional_inputs = torch.export.AdditionalInputs()
additional_inputs.add((5, 5))
additional_inputs.add((3, 5))
additional_inputs.add((5, 4))
ep = torch.export.export(
M(), (6, 7), dynamic_shapes=additional_inputs, strict=False
)
```
With `ShapesCollection`, we now need to wrap integer inputs as `_IntWrapper` so that we can have a unique identifier for each integer input.
```python
class M(torch.nn.Module):
def forward(self, x, y):
return x + y
from torch.export.dynamic_shapes import _IntWrapper
args = (_IntWrapper(5), _IntWrapper(5))
# Or we can do `args = pytree.tree_map_only(int, lambda a: _IntWrapper(a), orig_args)`
shapes_collection = torch.export.ShapesCollection()
shapes_collection[args[0]] = Dim.DYNAMIC
shapes_collection[args[1]] = Dim.DYNAMIC
ep = torch.export.export(
M(), args, dynamic_shapes=shapes_collection, strict=False
)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151842
Approved by: https://github.com/pianpwk
Summary: tree_flatten_with_map will internally call unflatten function with user supplied function. But this function was not returning anything causing the leaves to be None. This is wrong when the constructor is sensitive to this behaviour
Test Plan: CI
Differential Revision: D73388529
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151824
Approved by: https://github.com/bdhirsh