Commit Graph

121 Commits

Author SHA1 Message Date
rzou
154f27455a [triton_op] fix autotuning (#131363)
The problem was we were shoving SymInts into the constant_args side
table. The root problem is that torch.fx.node.base_types, which we use
to determine what can be put in the graph, doesn't actually have SymInt
in it. This PR fixes base_types to include SymInt.

Test Plan:
- tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131363
Approved by: https://github.com/oulgen
2024-07-23 16:15:00 +00:00
Shangdi Yu
29e2e2afb6 Revert D59561509: Multisect successfully blamed "D59561509: [FX][export] DCE pass, check schema for node impurity (#130395)" for one test failure (#131341)
Summary:
This diff reverts D59561509
D59561509: [FX][export] DCE pass, check schema for node impurity (#130395) by yushangdi causes the following test failure:

Tests affected:
- [cogwheel:cogwheel_mtia_cmf_m5_shrunk_test#test_flow_with_verification](https://www.internalfb.com/intern/test/844425041436985/)

Here's the Multisect link:
https://www.internalfb.com/multisect/6533402
Here are the tasks that are relevant to this breakage:
T191383430: 10+ tests unhealthy for ads_mtia_inference

The backout may land if someone accepts it.

If this diff has been generated in error, you can Commandeer and Abandon it.

Test Plan: NA

Differential Revision: D60029318

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131341
Approved by: https://github.com/angelayi
2024-07-23 05:23:47 +00:00
Shangdi Yu
27ded03545 [FX][export] DCE pass, check schema for node impurity (#130395)
Change the default DCE pass to check node schema for impure nodes.
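
A minimal sketch of what a schema-based impurity check could look like, assuming "check node schema" means treating an op as impure when its schema marks an argument as written to. The schema strings and the helper names `schema_is_mutable` and `is_impure` are hypothetical and simplified; this is not the code from the PR.

```python
import re

# Hypothetical, simplified schema strings in the style of operator schemas,
# where "(a!)" after a type marks an argument that the op writes to.
SCHEMAS = {
    "add": "add(Tensor self, Tensor other) -> Tensor",
    "add_": "add_(Tensor(a!) self, Tensor other) -> Tensor(a!)",
    "index_put_": "index_put_(Tensor(a!) self, Tensor?[] indices, Tensor values) -> Tensor(a!)",
}

_MUTABLE_ARG = re.compile(r"\(\w+!\)")


def schema_is_mutable(schema: str) -> bool:
    """Return True if any argument in the schema is annotated as written to."""
    # Only look at the argument list, not at the return annotation.
    args = schema[schema.index("(") + 1 : schema.rindex("->")]
    return bool(_MUTABLE_ARG.search(args))


def is_impure(op_name: str) -> bool:
    """A DCE pass would keep nodes for which this returns True (assumption)."""
    return schema_is_mutable(SCHEMAS[op_name])


print({name: is_impure(name) for name in SCHEMAS})
```

Under this assumption, a mutating op such as `index_put_` would be kept by dead code elimination without being special-cased by name.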

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130395
Approved by: https://github.com/angelayi, https://github.com/jgong5
2024-07-18 16:31:40 +00:00
PyTorch MergeBot
433ef4e444 Revert "[FX][export] DCE pass, check schema for node impurity (#130395)"
This reverts commit e22b0acc76.

Reverted https://github.com/pytorch/pytorch/pull/130395 on behalf of https://github.com/yushangdi due to breaking tests, need to rebase and fix ([comment](https://github.com/pytorch/pytorch/pull/130395#issuecomment-2235192986))
2024-07-18 02:46:03 +00:00
Shangdi Yu
e22b0acc76 [FX][export] DCE pass, check schema for node impurity (#130395)
Change the default DCE pass to check node schema for impure nodes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130395
Approved by: https://github.com/angelayi, https://github.com/jgong5
2024-07-18 00:55:20 +00:00
Brian Hirsh
a4d7aa498b [Traceable FSDP2] Add auto-functionalize support for mutable list[Tensor] (copy from Brian's PR #127347); enable E2E inductor unit test for transformer model (#129502)
Copy of Brian's PR: https://github.com/pytorch/pytorch/pull/127347 with additional changes to support mutable `List[Tensor]` in Inductor. Also enable E2E inductor unit test for Traceable FSDP2 + transformer model.

Test commands:
- `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_trace_fsdp_set_`
- `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_aot_eager`
- `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_inductor`
- `pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_fullgraph_backend_aot_eager`
- `pytest -rA test/dynamo/test_misc.py::MiscTests::test_auto_functionalize_tensorlist`
- `pytest -rA test/inductor/test_torchinductor.py::GPUTests::test_fallback_mutable_op_list_cuda`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129502
Approved by: https://github.com/zou3519
2024-06-27 17:50:57 +00:00
PyTorch MergeBot
45b2931b7e Revert "[Traceable FSDP2] Don't decompose fsdp.split_with_sizes_copy (#129414)"
This reverts commit b24787b757.

Reverted https://github.com/pytorch/pytorch/pull/129414 on behalf of https://github.com/ZainRizvi due to This PR seems to be causing multiple macOS failures. Looks like it was merged before trunk jobs were started, which would have run those tests ([comment](https://github.com/pytorch/pytorch/pull/129414#issuecomment-2189479505))
2024-06-25 17:05:55 +00:00
Will Feng
b24787b757 [Traceable FSDP2] Don't decompose fsdp.split_with_sizes_copy (#129414)
This makes it easier to do pattern-matching on `fsdp.split_with_sizes_copy` in Inductor passes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129414
Approved by: https://github.com/bdhirsh
2024-06-25 03:08:56 +00:00
Brian Hirsh
b91a9dc328 [Brian's PR #128754] Use torch.ops.fsdp.set_ for FSDP2 storage resize; dont functionalize resize_, set_, split_with_sizes_copy.out (#129203)
This is a copy of Brian's PR https://github.com/pytorch/pytorch/pull/128754, with some changes in the test_distributed_patterns.py unit tests to more closely reflect FSDP2 patterns. Also disabled two tests `test_input_mutation_storage_resize_up_down` and `test_input_mutation_storage_resize_not_supported` in test_aotdispatch.py until we figure out the right behavior for them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129203
Approved by: https://github.com/bdhirsh
2024-06-23 06:07:19 +00:00
Will Feng
e165a5971f [Traceable FSDP2] Fix support for CUDA resize_storage_bytes_ (#129215)
Currently if `x` is a CUDA tensor, calling `x.untyped_storage().resize_()` seems to always go into the `built without cuda` branch of `resize_storage_bytes_()` regardless of whether PyTorch is built with CUDA. I suspect this is because `inductor_ops.cpp` is only included in `libtorch_cpu.so` thus doesn't have the `USE_CUDA` information or ability to link to CUDA-related functions.

This PR moves `resize_storage_bytes_()` related custom op functions out of `inductor_ops.cpp` into its standalone file `resize_storage_bytes.cpp` to be included in `libtorch_python.so` instead. This mimics the setup for `StorageMethods.cpp`. This way, `resize_storage_bytes_()` can have access to the CUDA-related functions, which passes the CUDA unit test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129215
Approved by: https://github.com/jansel
2024-06-22 18:38:47 +00:00
Oguz Ulgen
5b5d269d34 Speed up fx graph iteration by implementing it in C++ (#128288)
Before this change
```
python benchmarks/dynamo/microbenchmarks/fx_microbenchmarks.py
iterating over 100000000 FX nodes took 19.5s (5132266 nodes/s)
```

After this change
```
python benchmarks/dynamo/microbenchmarks/fx_microbenchmarks.py
iterating over 100000000 FX nodes took 3.4s (29114001 nodes/s)
```

5.7x improvement
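
As a quick check of the quoted figure, the speedup can be recomputed from the two benchmark runs above, both from wall time and from throughput. The variable names are illustrative only.

```python
# Figures copied from the two benchmark runs quoted above.
before_seconds, before_rate = 19.5, 5_132_266
after_seconds, after_rate = 3.4, 29_114_001

speedup_by_time = before_seconds / after_seconds
speedup_by_rate = after_rate / before_rate

print(f"by wall time:  {speedup_by_time:.2f}x")
print(f"by throughput: {speedup_by_rate:.2f}x")
```

Both ratios round to 5.7 at one decimal place, which matches the stated improvement.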

Differential Revision: [D58343997](https://our.internmc.facebook.com/intern/diff/D58343997)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128288
Approved by: https://github.com/jansel, https://github.com/albanD
2024-06-11 05:48:31 +00:00
Aaron Orenstein
7a60a75256 Add typing annotations to pattern_matcher.py (#127458)
Turn on `mypy: disallow-untyped-defs` in pattern_matcher.py and fix the fallout.

There are still a bunch of `type: ignore` annotations which should eventually be ironed out.

In the process, found a bug: #127457

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127458
Approved by: https://github.com/Skylion007
ghstack dependencies: #127457
2024-06-04 15:24:47 +00:00
Brian Hirsh
9c3b87833a AOTAutograd: keep set_() input mutations in the graph, ban other cases (#122981)
We have some (limited) support for `set_()` input mutations in `torch.compile`, but one restriction today is that we force them to run outside of the graph, in the opaque runtime epilogue.

This is a problem for ppFSDP. Why? The usage pattern of ppFSDP forward graphs looks something like this:
```
def forward_fsdp(sacrificial_param, sharded_param, inp):
    allgathered_param = allgather(sharded_param)
    sacrificial_param.set_(allgathered_param)  # hidden in an autograd.Function that we trace
    out = matmul(sacrificial_param, inp)
    sacrificial_param.untyped_storage().resize_(0)
    return out
```
When we functionalize this graph, `sacrificial_param` sees two distinct types of input mutations that we must preserve: a `set_` and a `resize_`. Importantly, at runtime the `set_()` must run **before** the `resize_()`. Why? The `set_()` updates the storage of our sacrificial param to the allgather'd data, which allows the later call to `sacrificial_param.untyped_storage().resize_(0)` to free the allgathered data. If we run the two mutations in reverse order, we will never free the allgathered data.
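
The ordering argument can be illustrated with a toy model in plain Python. `Storage`, `Param`, and `run` are hypothetical stand-ins, not PyTorch classes; the sketch only tracks which buffer a parameter points at and how many bytes that buffer holds.

```python
class Storage:
    """Toy stand-in for a tensor storage: just a name and a byte count."""

    def __init__(self, name, nbytes):
        self.name = name
        self.nbytes = nbytes

    def resize_(self, nbytes):
        self.nbytes = nbytes


class Param:
    """Toy parameter that points at a storage."""

    def __init__(self, storage):
        self.storage = storage

    def set_(self, other_storage):
        # Re-point this parameter at another storage (no copy).
        self.storage = other_storage


def run(order):
    """Apply the two mutations in the given order; return bytes still held."""
    sacrificial = Param(Storage("sacrificial", 0))
    allgathered = Storage("allgathered", 1024)
    mutations = {
        "set_": lambda: sacrificial.set_(allgathered),
        "resize_": lambda: sacrificial.storage.resize_(0),
    }
    for name in order:
        mutations[name]()
    return allgathered.nbytes


freed_in_order = run(["set_", "resize_"])
leaked_in_reverse = run(["resize_", "set_"])
print(freed_in_order, leaked_in_reverse)
```

In the reversed order, `resize_` shrinks the storage the parameter pointed at before the `set_`, so the allgathered buffer keeps its full size.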

We want to put the `resize_()` mutation op inside of the graph (see next PR, also there's a much longer description in that PR for anyone interested). However, this will require us to put `set_()` in the graph as well, in order for them to run in the correct order.

In order to do this, I had to add some extra restrictions: You are now required to run `set_()` under `no_grad()` if you use it with `torch.compile`, and if you perform any other mutations to the input, those must be under no_grad as well (otherwise, the mutations may mutate the `grad_fn` of the input, making it no longer safe to keep in the graph). These restrictions are hopefully reasonable, since `set_()` doesn't see much usage today (and the original impetus for adding set_() support a few months ago was for fsdp anyway)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122981
Approved by: https://github.com/jansel
ghstack dependencies: #122433, #123646
2024-04-11 18:21:57 +00:00
Oguz Ulgen
526a69f5ee Remove incorrect check (#123616)
Summary: This was a micro optimization that I thought would save time but it is not correct. For example, we cannot compare fake tensors.

Test Plan:
```
buck2 run 'fbcode//mode/opt' fbcode//langtech/edge/ns/tools/tests:test_ns_jit_traced_model_all_optimization_f328819347_portal_ns
```
now passes

Differential Revision: D55904083

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123616
Approved by: https://github.com/aakhundov
2024-04-09 08:45:34 +00:00
Oguz Ulgen
03b13851d9 [FX] Add side table to FX Graph for O(1) op/target query (#121565)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121565
Approved by: https://github.com/jansel
2024-04-07 18:51:05 +00:00
Jacob Szwejbka
41d24df08f [export] hack skip index_put_ in dce (#122683)
Summary: Ideally we should do what's in the TODO. Just doing this for now to unblock llama capture.

Test Plan: capturing llama and using pt2e to quantize it

Differential Revision: D55354487

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122683
Approved by: https://github.com/kimishpatel
2024-03-26 08:05:06 +00:00
Jason Ansel
18d94d7165 Make FX nodes sortable (#122071)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122071
Approved by: https://github.com/oulgen
2024-03-19 01:40:56 +00:00
Jason Ansel
75a6d6aef7 [inductor] Support storage resizing (#119749)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119749
Approved by: https://github.com/yf225
ghstack dependencies: #119647, #119671
2024-02-14 03:03:38 +00:00
Edward Z. Yang
9bce208dfb Replace follow_imports = silent with normal (#118414)
This is a lot of files changed! Don't panic! Here's how it works:

* Previously, we set `follow_imports = silent` for our mypy.ini configuration. Per https://mypy.readthedocs.io/en/stable/running_mypy.html#follow-imports, what this does is whenever we have an import to a module which is not listed as a file to be typechecked in mypy, we typecheck it as normal but suppress all errors that occurred in that file.
* When mypy is run inside lintrunner, the list of files is precisely the files covered by the glob in lintrunner.toml, but with files in excludes excluded.
* The top-level directive `# mypy: ignore-errors` instructs mypy to typecheck the file as normal, but ignore all errors.
* Therefore, it should be equivalent to set `follow_imports = normal`, if we put `# mypy: ignore-errors` on all files that were previously excluded from the file list.
* Having done this, we can remove the exclude list from .lintrunner.toml, since excluding a file from typechecking is baked into the files themselves.
* torch/_dynamo and torch/_inductor were previously in the exclude list, because they were covered by MYPYINDUCTOR. It is not OK to mark these as `# mypy: ignore-errors` as this will impede typechecking on the alternate configuration. So they are temporarily being checked twice, but I am suppressing the errors in these files as the configurations are not quite the same. I plan to unify the configurations so this is only a temporary state.
* There were some straggler type errors after these changes somehow, so I fixed them as needed. There weren't that many.

In the future, to start type checking a file, just remove the ignore-errors directive from the top of the file.

The codemod was done with this script authored by GPT-4:

```
import glob

exclude_patterns = [
    ...
]

for pattern in exclude_patterns:
    for filepath in glob.glob(pattern, recursive=True):
        if filepath.endswith('.py'):
            with open(filepath, 'r+') as f:
                content = f.read()
                f.seek(0, 0)
                f.write('# mypy: ignore-errors\n\n' + content)
```

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118414
Approved by: https://github.com/thiagocrepaldi, https://github.com/albanD
2024-01-27 02:44:11 +00:00
Zhengxu Chen
abd759d50d [fx] Add hooks to intercept node replacements. (#117825)
Summary: Adding an experimental API to the FX graph module to place "hooks" that run every time we change or replace nodes in a graph, so that we can properly update the new name in the graph signature and potentially other places.
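
A rough sketch of the idea, using a toy graph rather than the FX API. `ToyGraph`, `register_replace_hook`, and the `(old, new, user)` hook signature are assumptions made for illustration; the experimental API added by this change may differ.

```python
class ToyGraph:
    """Minimal graph: node name -> list of input node names."""

    def __init__(self):
        self.inputs = {}
        self._replace_hooks = []

    def add_node(self, name, inputs=()):
        self.inputs[name] = list(inputs)

    def register_replace_hook(self, hook):
        self._replace_hooks.append(hook)

    def replace_all_uses(self, old, new):
        for user, args in self.inputs.items():
            if old in args:
                # Notify observers before rewriting this user's inputs.
                for hook in self._replace_hooks:
                    hook(old, new, user)
                self.inputs[user] = [new if a == old else a for a in args]


graph = ToyGraph()
graph.add_node("x")
graph.add_node("relu", ["x"])
graph.add_node("output", ["relu"])
graph.add_node("relu_fused", ["x"])

# Side metadata that refers to nodes by name, like a graph signature.
signature = {"user_outputs": ["relu"]}


def keep_signature_in_sync(old, new, user):
    if user == "output":
        signature["user_outputs"] = [
            new if name == old else name for name in signature["user_outputs"]
        ]


graph.register_replace_hook(keep_signature_in_sync)
graph.replace_all_uses("relu", "relu_fused")
print(graph.inputs["output"], signature)
```

Without the hook, the signature would keep pointing at the stale name `relu` after the replacement.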

Test Plan:
buck test mode/opt  -c fbcode.enable_gpu_sections=true caffe2/test/distributed/_tensor/experimental:tp_transform

buck test mode/opt caffe2/test:test_export -- -r test_replace_hook

Differential Revision: D52896531

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117825
Approved by: https://github.com/avikchaudhuri
2024-01-23 22:28:40 +00:00
Edward Z. Yang
003c900d5e Add _assert_scalar (#117378)
Peeled off from https://github.com/pytorch/pytorch/pull/114148, because that PR is going to take a while to actually land.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117378
Approved by: https://github.com/jansel
2024-01-14 00:50:36 +00:00
PyTorch MergeBot
1174e82bde Revert "Add _assert_scalar and teach Inductor to codegen it (#114148)"
This reverts commit b6028acfa4.

Reverted https://github.com/pytorch/pytorch/pull/114148 on behalf of https://github.com/osalpekar due to Going to revert this given the broken torchrec PT2 tests internally: [D52648865](https://www.internalfb.com/diff/D52648865). Logs aren't too clear but @dstaay-fb can help debug as well ([comment](https://github.com/pytorch/pytorch/pull/114148#issuecomment-1886100368))
2024-01-11 02:30:22 +00:00
Edward Z. Yang
b6028acfa4 Add _assert_scalar and teach Inductor to codegen it (#114148)
Inductor codegen for `_assert_async` is currently disabled because we don't really understand how to codegen `scalar_to_tensor` on a Sympy expression. I initially tried to see if I could get this to work, but I got into some weird problem involving stride sorting, so I decided to fix it properly by not going through a tensor.

So we introduce an `_assert_scalar` which takes a scalar as an argument, avoiding needing to turn a SymBool into a tensor before asserting on it. I also add `_functional_assert_scalar` for good luck, although this doesn't do anything right now because https://github.com/pytorch/pytorch/pull/104203 still hasn't been landed.

I need to customize the codegen for this operator, so I decide to directly implement it in Inductor, rather than trying to treat it as a generic ExternKernel. This leads to the new AssertScalar IR node. This is written carefully so that it doesn't get DCE'd by Inductor.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114148
Approved by: https://github.com/jansel
2024-01-09 23:21:26 +00:00
Aaron Gokaslan
3fe437b24b [BE]: Update flake8 to v6.1.0 and fix lints (#116591)
Updates flake8 to v6.1.0 and fixes a few lints using sed and some ruff tooling.
- Replace `assert(0)` with `raise AssertionError()`
- Remove extraneous parentheses, i.e.
  - `assert(a == b)` -> `assert a == b`
  - `if(x > y or y < z):`->`if x > y or y < z:`
  - And `return('...')` -> `return '...'`

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116591
Approved by: https://github.com/albanD, https://github.com/malfet
2024-01-03 06:04:44 +00:00
Tugsbayasgalan Manlaibaatar
76b1d44d57 pre_dispatch aot_export (#115188)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115188
Approved by: https://github.com/bdhirsh
2023-12-25 04:51:21 +00:00
PyTorch MergeBot
0567f71ac6 Revert " pre_dispatch aot_export (#115188)"
This reverts commit a267d67350.

Reverted https://github.com/pytorch/pytorch/pull/115188 on behalf of https://github.com/jeanschmidt due to sadly, it is required to revert this commit in order to revert https://github.com/pytorch/pytorch/pull/115454 ([comment](https://github.com/pytorch/pytorch/pull/115188#issuecomment-1866310014))
2023-12-21 14:03:18 +00:00
Tugsbayasgalan Manlaibaatar
a267d67350 pre_dispatch aot_export (#115188)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115188
Approved by: https://github.com/bdhirsh
2023-12-20 21:36:25 +00:00
Aaron Gokaslan
b7b2178204 [BE]: Remove useless lambdas (#113602)
Applies PLW0108, which removes useless lambda calls in Python. The rule is in preview, so it is not ready to be enabled by default just yet. These are the autofixes from the rule.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113602
Approved by: https://github.com/albanD
2023-11-14 20:06:48 +00:00
Ken Jin
70064ac416 [Dynamo] Match closures by code ID (#109427)
Closes https://github.com/pytorch/pytorch/issues/107866

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109427
Approved by: https://github.com/ezyang, https://github.com/jansel
2023-11-12 08:20:14 +00:00
Zhengxu Chen
64d75f72d4 [fx] Add a faster method for inserting positional argument. (#111974)
Summary:
Traditionally, when a user wants to update the arguments of an FX node, the only way is to call the setter of the `.args` property on the node. This can be problematic when we insert many arguments: because of the semantics of the setter, each call has worst-case O(n) complexity.

Adding a new insert_arg provides us two benefits:
1. The operation is guaranteed to be O(1) cost.
2. Users can express the intent more directly, instead of writing code like `node.args = (arg,) + node.args`
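
The cost difference can be sketched with a toy node class. `ToyNode` and its `bookkeeping_ops` counter are hypothetical; the counter tracks only use-def updates and ignores the cost of rebuilding the argument tuple, so this is an illustration of the setter semantics rather than the FX implementation.

```python
class ToyNode:
    def __init__(self, name):
        self.name = name
        self._args = ()
        self.users = {}           # nodes that consume this node
        self.bookkeeping_ops = 0  # use-def updates, counted for illustration

    @property
    def args(self):
        return self._args

    @args.setter
    def args(self, new_args):
        # Setter semantics: drop every old use, then register every new one.
        for old in self._args:
            old.users.pop(self, None)
            self.bookkeeping_ops += 1
        for new in new_args:
            new.users[self] = None
            self.bookkeeping_ops += 1
        self._args = tuple(new_args)

    def insert_arg(self, idx, arg):
        # Only the inserted argument needs its use registered.
        self._args = self._args[:idx] + (arg,) + self._args[idx:]
        arg.users[self] = None
        self.bookkeeping_ops += 1


inputs = [ToyNode(f"in{i}") for i in range(100)]

via_setter = ToyNode("via_setter")
via_setter.args = tuple(inputs)
via_setter.bookkeeping_ops = 0
via_setter.args = (ToyNode("extra"),) + via_setter.args

via_insert = ToyNode("via_insert")
via_insert.args = tuple(inputs)
via_insert.bookkeeping_ops = 0
via_insert.insert_arg(0, ToyNode("extra"))

print(via_setter.bookkeeping_ops, via_insert.bookkeeping_ops)
```

With 100 existing arguments, the setter path touches every old and new argument, while the insert path touches only the new one.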

Test Plan: caffe2/test:fx -- -r test_insert_arg

Reviewed By: suo

Differential Revision: D50574435

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111974
Approved by: https://github.com/angelayi
2023-10-26 02:30:42 +00:00
PyTorch MergeBot
b0087b4cf7 Revert "record_function: remove legacy internal operators (#72303)"
This reverts commit 0be84bb41e.

Reverted https://github.com/pytorch/pytorch/pull/72303 on behalf of https://github.com/izaitsevfb due to Apparently _record_function_enter is still used internally at Meta in several places and in lots of internal tests. ([comment](https://github.com/pytorch/pytorch/pull/72303#issuecomment-1777942975))
2023-10-24 20:01:14 +00:00
Peter Bell
0be84bb41e record_function: remove legacy internal operators (#72303)
These operators have not been used since #76420 but were preserved for TorchScript backward compatibility

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72303
Approved by: https://github.com/albanD
ghstack dependencies: #104535
2023-10-23 22:55:05 +00:00
Jason Ansel
a1154e673b [Compiled Autograd] Turn accumulate_grad into an op (#111700)
Relands #111271

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111700
Approved by: https://github.com/voznesenskym
2023-10-21 17:31:09 +00:00
PyTorch MergeBot
3eb5cae3af Revert "[Compiled Autograd] Turn accumulate_grad into an op (#111271)"
This reverts commit 04b04c0686.

Reverted https://github.com/pytorch/pytorch/pull/111271 on behalf of https://github.com/jeanschmidt due to Breaking internal CI ([comment](https://github.com/pytorch/pytorch/pull/111271#issuecomment-1768527932))
2023-10-18 14:02:34 +00:00
Jason Ansel
04b04c0686 [Compiled Autograd] Turn accumulate_grad into an op (#111271)
Rather than baking the behavior of `AccumulateGrad` nodes into the generated graph (either as `+=`, or as a return value of the graph), this creates a new `accumulate_grad_` dispatcher op that is included in the generated graph like:
```
def forward(self, inputs, sizes, hooks):
    getitem = inputs[0]
    getitem_1 = inputs[1]
    getitem_2 = inputs[2]
    getitem_3 = inputs[3]
    getitem_4 = inputs[4]
    getitem_5 = inputs[5]
    getitem_6 = inputs[6]
    getitem_7 = inputs[7]
    getitem_8 = inputs[8]
    getitem_9 = inputs[9];  inputs = None
    expand = torch.ops.aten.expand.default(getitem, [2, 4]);  getitem = None
    threshold_backward = torch.ops.aten.threshold_backward.default(expand, getitem_1, 0);  expand = getitem_1 = None
    t = torch.ops.aten.t.default(getitem_3);  getitem_3 = None
    mm = torch.ops.aten.mm.default(threshold_backward, t);  t = None
    t_1 = torch.ops.aten.t.default(threshold_backward)
    mm_1 = torch.ops.aten.mm.default(t_1, getitem_2);  t_1 = getitem_2 = None
    t_2 = torch.ops.aten.t.default(mm_1);  mm_1 = None
    sum_1 = torch.ops.aten.sum.dim_IntList(threshold_backward, [0], True);  threshold_backward = None
    view = torch.ops.aten.view.default(sum_1, [4]);  sum_1 = None
    t_3 = torch.ops.aten.t.default(t_2);  t_2 = None
    accumulate_grad_ = torch.ops.inductor.accumulate_grad_.default(getitem_4, t_3);  getitem_4 = t_3 = None
    threshold_backward_1 = torch.ops.aten.threshold_backward.default(mm, getitem_5, 0);  mm = getitem_5 = None
    t_4 = torch.ops.aten.t.default(threshold_backward_1)
    mm_2 = torch.ops.aten.mm.default(t_4, getitem_6);  t_4 = getitem_6 = None
    t_5 = torch.ops.aten.t.default(mm_2);  mm_2 = None
    sum_2 = torch.ops.aten.sum.dim_IntList(threshold_backward_1, [0], True);  threshold_backward_1 = None
    view_1 = torch.ops.aten.view.default(sum_2, [4]);  sum_2 = None
    t_6 = torch.ops.aten.t.default(t_5);  t_5 = None
    accumulate_grad__1 = torch.ops.inductor.accumulate_grad_.default(getitem_7, t_6);  getitem_7 = t_6 = None
    accumulate_grad__2 = torch.ops.inductor.accumulate_grad_.default(getitem_8, view_1);  getitem_8 = view_1 = None
    accumulate_grad__3 = torch.ops.inductor.accumulate_grad_.default(getitem_9, view);  getitem_9 = view = None
    return []

```

The motivation here is `AccumulateGrad` nodes are causing trouble in FSDP tracing, since FSDP is in-place resizing parameters and parameter storage in hooks.  We will model this mutation in dynamo, but not during the initial compiled autograd capture.  This allows us to bypass failing shape checks in the initial capture.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111271
Approved by: https://github.com/voznesenskym
2023-10-16 21:16:17 +00:00
Michael Voznesensky
de0b18fad9 Use user directed names for variables where possible (#109092)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109092
Approved by: https://github.com/ezyang
ghstack dependencies: #108846
2023-09-13 07:44:04 +00:00
lezcano
4eac43d046 Trace through Tensor slots (#107159)
Namely
```
__delattr__
__delitem__
__getattribute__
__getitem__
__setattr__
__setitem__
__str__
```

We don't trace through `__init__`.

Fixes https://github.com/pytorch/pytorch/issues/106648

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107159
Approved by: https://github.com/Skylion007
2023-08-19 08:56:25 +00:00
Tugsbayasgalan Manlaibaatar
20c5add133 [export] Refactor constrain_as_value and constrain_as_size (#106591)
Some notable changes:
1. `constrain_as_size` allows the min value to be less than 2, as it will unconditionally assume min >= 2 for compiler purposes. Instead, we add an additional check to make sure the max value is always greater than 2.
2. Previously, we used to runtime assert on the unbacked symint's value range, which would always be [2, max]. I modified this logic to assert on [0, max] unless the user explicitly specifies the min range.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106591
Approved by: https://github.com/gmagogsfm, https://github.com/ezyang
2023-08-15 05:41:43 +00:00
PyTorch MergeBot
745d29b0cc Revert "[export] Refactor constrain_as_value and constrain_as_size (#106591)"
This reverts commit 18989890bf.

Reverted https://github.com/pytorch/pytorch/pull/106591 on behalf of https://github.com/izaitsevfb due to Breaks inductor test on trunk ([comment](https://github.com/pytorch/pytorch/pull/106591#issuecomment-1675069091))
2023-08-11 16:37:47 +00:00
Tugsbayasgalan Manlaibaatar
18989890bf [export] Refactor constrain_as_value and constrain_as_size (#106591)
Some notable changes:
1. `constrain_as_size` allows the min value to be less than 2, as it will unconditionally assume min >= 2 for compiler purposes. Instead, we add an additional check to make sure the max value is always greater than 2.
2. Previously, we used to runtime assert on the unbacked symint's value range, which would always be [2, max]. I modified this logic to assert on [0, max] unless the user explicitly specifies the min range.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106591
Approved by: https://github.com/gmagogsfm, https://github.com/ezyang
2023-08-11 05:29:22 +00:00
Aaron Gokaslan
6d43c89f37 [BE]: Update Ruff to 0.0.280 (#105724)
Removes unused loop values in Python dictionary iteration. Automated fix from Ruff master.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105724
Approved by: https://github.com/ezyang, https://github.com/janeyx99
2023-07-22 23:03:34 +00:00
Michael Suo
a475ea4542 [fx] change from #users to num_users in graph printout (#101140)
`#users` means stuff in various chat apps, which makes it annoying to copypasta graphs into them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101140
Approved by: https://github.com/ezyang
2023-06-20 21:24:32 +00:00
xuanqi
b27c3558a4 [RFC]: Create aten native op for constrain_range (#103346)
At a high level, the current implementation of the constraint functions (constrain_as_**) will raise an exception for the following code snippet:
```
def f(x):
    a = x.item()
    constrain_as_size(a, 4, 7)
    return torch.empty((a, 4))

inp = torch.tensor([5])
ep = torch._export.export(f, (inp,))
```

The reason is that the current constraint logic works as follows:
1) It is purely Python, so it won't survive AOT export (the full node is gone after AOT export, since AOT export only maintains aten-level ops).
2) It uses a side effect to add range constraints to the traced symbol's shape env ([code](9591e52880/torch/fx/experimental/symbolic_shapes.py (L370-L372))).
3) If runtime assertions are turned on (the default), [`_AddRuntimeAssertionsForConstraintsPass`](9591e52880/torch/_export/passes/add_runtime_assertions_for_constraints_pass.py (L98-L100)) will try to append assertion nodes based on the range constraints extracted from the symbol's shape env during another interpretation round.
4) However, because of 1), the range constraint logic won't run for symbols generated during the AOT export round, so no range constraint information is available for the later assertion round, which causes the issue.
5) As a result, it fails at `torch.empty((a, 4))` (there is no constraint saying that `a` must be positive).

The fix here is to implement the range constraint logic as a native aten op (with a no-op CPU implementation) so that it survives AOT export.

**NOTE:**
[Logic](2d745b95d7/torch/fx/experimental/symbolic_shapes.py (L350-L365C15)) within [`constrain_range`](2d745b95d7/torch/fx/experimental/symbolic_shapes.py (LL313C74-L313C74)) is split out as `constrain_range_int` to handle the case where a non-`SymInt` is passed in, and is reused in the new `_constrain_range`. The reason is that when a non-`SymInt` is provided:
* If it directly calls `sym_constrain_range`, the C++ version will be called, which is a no-op.
* So in this case it calls `constrain_range_int` instead, to catch issues such as the user providing an input whose tensor's shape could be out of range during export, like the following for the code example above:
```
...
inp = torch.tensor([10])
ep = torch._export.export(f, (inp,)) # immediately raise error
```

Differential Revision: [D46734204](https://our.internmc.facebook.com/intern/diff/D46734204)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103346
Approved by: https://github.com/tugsbayasgalan
2023-06-16 14:55:40 +00:00
Animesh Jain
58d2c66a70 [activation checkpointing] Higher order functional rng op wrappers (#102934)
Introduces two higher order operators
* run_and_save_rng_state - Saves the current rng state and then runs the op.
* run_with_rng_state - Runs the op with the rng state supplied as an input

Ideally, we would like to use torch.compile for these operators. But currently the plan is to introduce these operators at the partitioner level, obviating the need to support them fully through the torch.compile stack. To ensure that we have good enough debugging with minifiers, we have ensured that they work with make_fx. In the future, we can move to torch.compile.
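
The two operators described above can be mimicked with Python's standard `random` module to show why replaying a saved state reproduces a result. This is an analogy only: the function bodies below are assumptions, and restoring the ambient state after the replay is a choice made in this sketch, not something stated in the commit.

```python
import random


def run_and_save_rng_state(op, *args):
    """Capture the generator state, then run op. Returns (state, result)."""
    state = random.getstate()
    return state, op(*args)


def run_with_rng_state(state, op, *args):
    """Run op with the generator reset to a previously saved state."""
    current = random.getstate()
    random.setstate(state)
    try:
        return op(*args)
    finally:
        # Leave the ambient random stream undisturbed by the replay.
        random.setstate(current)


def noisy_scale(x):
    """Stand-in for an op that consumes randomness, such as dropout."""
    return x * random.random()


saved_state, forward_value = run_and_save_rng_state(noisy_scale, 10.0)
recomputed_value = run_with_rng_state(saved_state, noisy_scale, 10.0)
print(forward_value == recomputed_value)
```

This is the property activation checkpointing relies on: recomputing a random op in the backward pass must see the same random numbers as the forward pass.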

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102934
Approved by: https://github.com/jansel, https://github.com/zou3519
2023-06-12 22:54:17 +00:00
PyTorch MergeBot
66eef31444 Revert "[fx] change from #users to num_users in graph printout (#101140)"
This reverts commit e568c5a18d.

Reverted https://github.com/pytorch/pytorch/pull/101140 on behalf of https://github.com/jeanschmidt due to There are internal changes to this commit that are preventing landing, so I am reverting to unblock the diff train ([comment](https://github.com/pytorch/pytorch/pull/101140#issuecomment-1547989487))
2023-05-15 14:35:22 +00:00
Michael Suo
e568c5a18d [fx] change from #users to num_users in graph printout (#101140)
`#users` means stuff in various chat apps, which makes it annoying to copypasta graphs into them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101140
Approved by: https://github.com/ezyang
2023-05-12 04:34:01 +00:00
Tugsbayasgalan Manlaibaatar
d4bf76c2a4 Persist torch.assert in aten graph (#100101)
This PR introduces a new operator called aten._assert_async.msg, which allows passing a tensor value and assertion message as inputs. As part of TorchDynamo, we're replacing the use of torch._assert with this new operator so that make_fx also knows how to handle assertions. This is a subset of https://github.com/pytorch/pytorch/pull/98878; refer there for historic reviews.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100101
Approved by: https://github.com/jansel
2023-04-28 07:31:43 +00:00
Shiyan Deng
82a54513ac [fx] Add a function to allow adding more functions to the side effect function set (#97288)
Summary: There are some customized functions that we would also like to keep during the eliminate-dead-code pass. Add a function to help us do that.
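
A minimal sketch of the mechanism, assuming a registry of function names that dead code elimination must never drop. The names `register_side_effect`, `eliminate_dead_code`, and the tuple-based node format are hypothetical and are not the function added by this change.

```python
# Hypothetical default set of functions that must survive dead code elimination.
_side_effectful = {"assert_"}


def register_side_effect(fn_name):
    """Mark an additional function as having side effects."""
    _side_effectful.add(fn_name)


def eliminate_dead_code(nodes, output):
    """nodes: list of (name, fn, input_names) in topological order."""
    live = {output}
    kept = []
    # Walk users before producers so liveness propagates backwards.
    for name, fn, inputs in reversed(nodes):
        if name in live or fn in _side_effectful:
            kept.append((name, fn, inputs))
            live.update(inputs)
    return list(reversed(kept))


nodes = [
    ("x", "placeholder", []),
    ("y", "relu", ["x"]),
    ("log", "my_logger", ["y"]),  # result unused, but it has a side effect
    ("unused", "neg", ["x"]),
    ("out", "output", ["y"]),
]

before = [n[0] for n in eliminate_dead_code(nodes, "out")]
register_side_effect("my_logger")
after = [n[0] for n in eliminate_dead_code(nodes, "out")]
print(before, after)
```

Before registration the logger call is removed along with the truly dead `neg` node; after registration only the dead node is removed.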

Test Plan: Added a unit test

Differential Revision: D44273630

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97288
Approved by: https://github.com/houseroad
2023-04-22 04:42:24 +00:00
Kazuaki Ishizaki
105ef68f72 Fix typos under torch/fx directory (#97596)
This PR fixes typos in comments and messages of `.py` files under `torch/fx` directory

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97596
Approved by: https://github.com/dagitses, https://github.com/kit1980
2023-04-10 21:57:36 +00:00
Edward Z. Yang
37faa48844 DCE inference graphs too (#97275)
I added a bunch of asserts to verify that I didn't accidentally kill copy_ in the graph, hopefully this combined with our existing tests is good enough.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97275
Approved by: https://github.com/bdhirsh
2023-03-23 07:02:52 +00:00