Commit Graph

1635 Commits

Author SHA1 Message Date
Animesh Jain
08f14d5492 [refactor][dynamo][side-effects] Helper function for __new__ for user defined class (#133799)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133799
Approved by: https://github.com/jansel
ghstack dependencies: #133745, #133747, #133746
2024-08-19 17:21:48 +00:00
PyTorch MergeBot
35f36363ec Revert "[dtensor] move DTensor to public namespace (#133113)"
This reverts commit 2ee6b97464.

Reverted https://github.com/pytorch/pytorch/pull/133113 on behalf of https://github.com/wanchaol due to looks like it break some internal type imports ([comment](https://github.com/pytorch/pytorch/pull/133113#issuecomment-2295670911))
2024-08-19 05:00:19 +00:00
Animesh Jain
fed6096e73 [dynamo] Support object.__new__ call (#133746)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133746
Approved by: https://github.com/Skylion007, https://github.com/jansel
ghstack dependencies: #133745, #133747
2024-08-18 07:18:52 +00:00
Animesh Jain
d56a395971 [dynamo] Support os.fspath (#133747)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133747
Approved by: https://github.com/yanboliang, https://github.com/Skylion007, https://github.com/jansel
ghstack dependencies: #133745
2024-08-18 07:18:52 +00:00
Animesh Jain
4dc9795ebf [refactor][easy] Directly call var_getattr method for PythonModuleVariable (#133745)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133745
Approved by: https://github.com/yanboliang
2024-08-17 05:30:01 +00:00
Wanchao Liang
2ee6b97464 [dtensor] move DTensor to public namespace (#133113)
Moving DTensor to be in the public namespace, to formally add the
documentation page that includes all the public APIs. This includes:

* many path renames and path import fixes
* a dedicated doc page without too much content yet (adding in the next
  PRs)
* To preserve the BC for users still using the `torch.distributed._tensor`,
  I added a shim script to redirect old path calls to the new module

Backward compatibility is evidenced by the fact that all DTensor tests
still pass without changing the public imports, so it is safe to land the
changes.
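
The redirect idea behind such a shim can be sketched in a few lines. This is only an illustration under stated assumptions, not the shim added in this PR: the package name `newpkg`, its `tensor` and `_tensor` submodules, and the `DTensorLike` class are all hypothetical stand-ins built in memory.

```python
import importlib
import sys
import types

# Hypothetical names: "newpkg.tensor" plays the new public module and
# "newpkg._tensor" plays the legacy private path kept alive by the shim.
public = types.ModuleType("newpkg.tensor")
public.DTensorLike = type("DTensorLike", (), {})  # placeholder public class

parent = types.ModuleType("newpkg")
parent.__path__ = []  # mark the stand-in parent as a package
parent.tensor = public
parent._tensor = public  # attribute access through the old name also works

# Registering the same module object under both names means an import of
# the old path resolves to the new module instead of a separate copy.
sys.modules["newpkg"] = parent
sys.modules["newpkg.tensor"] = public
sys.modules["newpkg._tensor"] = public

old = importlib.import_module("newpkg._tensor")
new = importlib.import_module("newpkg.tensor")
```

Because both names resolve to one module object, classes imported through the old path are identical to those imported through the new one, which is what keeps `isinstance` checks working for callers that have not migrated.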

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133113
Approved by: https://github.com/XilunWu
ghstack dependencies: #133305, #133306
2024-08-17 05:09:52 +00:00
Li, Xingyuan
dcfa415e6e [Inductor UT] Reuse inductor UT for intel GPU test/inductor/test_compiled_optimizers.py (#133083)
[Inductor UT] Reuse Inductor test case for Intel GPU.
Reuse `test/inductor/test_compiled_optimizers.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133083
Approved by: https://github.com/etaf, https://github.com/jansel, https://github.com/mlazos
2024-08-17 01:15:26 +00:00
Will Feng
f57b00704e [Traceable FSDP2][Dynamo] Support reconstructing CUDA event object within Dynamo graph (#133635)
`torch.cuda.Event` objects are different from `torch.cuda.Stream` in that events are not pooled, meaning we can't look up a previously created CUDA event object by ID. This prevents a CUDA event object created outside of the Dynamo graph from being used within the graph (since Dynamo needs a way to emit a `call_function` line in the graph that retrieves the event object for downstream op use). This PR adds a simple object pool to the Dynamo utilities, to support looking up a CUDA event object by ID from within the Dynamo graph.

After this PR, if a user creates a CUDA event object outside of the graph and uses that event within the graph, the behavior will exactly match eager.
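
A lookup-by-ID object pool of the kind described above can be sketched as follows. This is a minimal illustration, not the helper added in this PR: the class name `ObjectPool` and its `store` and `lookup` methods are hypothetical, and a plain `object()` stands in for the CUDA event.

```python
import itertools


class ObjectPool:
    """Toy pool mapping integer IDs to live objects (hypothetical helper)."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count()

    def store(self, obj):
        # Hand out a fresh integer ID and keep a strong reference to the object.
        obj_id = next(self._ids)
        self._objects[obj_id] = obj
        return obj_id

    def lookup(self, obj_id):
        # Retrieve the exact object that was stored under this ID.
        return self._objects[obj_id]


pool = ObjectPool()
event = object()  # stand-in for an event created outside the graph
event_id = pool.store(event)
retrieved = pool.lookup(event_id)
```

The integer ID is the part that can be embedded in a traced graph as a constant, while the object itself stays in the pool until it is looked up.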

Test commands:
- `pytest -rA test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_created_outside_of_graph`
- `pytest -rA test/dynamo/test_ctx_manager.py::CtxManagerTests::test_cuda_event_across_graph_break`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133635
Approved by: https://github.com/yifuwang
ghstack dependencies: #133532, #133531, #133636
2024-08-16 20:40:46 +00:00
Yanbo Liang
770086fe39 [Dynamo] Support torch.cuda.device ctx manager (#133385)
Fixes #128059

I'm not sure if this is the right way, since Inductor doesn't always respect the device id set by users, so probably we should just wrap it as null context manager and print a warning. cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @jansel @anijain2305 @mlazos @williamwen42

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133385
Approved by: https://github.com/jansel
2024-08-16 17:05:55 +00:00
Animesh Jain
8a2b064236 [dynamo][user_defined][stable-diffusion] Raise ObservedAttributeError on UserDefinedObject var_getattr (#132806)
Fixes https://github.com/pytorch/pytorch/issues/132551

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132806
Approved by: https://github.com/williamwen42
2024-08-16 04:30:06 +00:00
Animesh Jain
8a5708ba3d [dynamo] Support object creation of classes with custom __new__ (#132977)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132977
Approved by: https://github.com/jansel
2024-08-16 03:09:23 +00:00
Edward Z. Yang
90d2593b3e Revert #132806, #132736, #132539, #132487 (#133570)
This reverts commit 25df063f04.
This reverts commit de00c79583.
This reverts commit 419b76c4ac.
This reverts commit bc57d5b6ff.

Differential Revision: [D61335013](https://our.internmc.facebook.com/intern/diff/D61335013)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133570
Approved by: https://github.com/albanD, https://github.com/jansel, https://github.com/anijain2305
2024-08-15 20:54:21 +00:00
Xuehai Pan
758a0a88a2 [BE][Easy] enable ruff rule PIE790: unnecessary pass statement (#133200)
This PR removes unnecessary `pass` statements. This is semantically safe because the bytecode for the Python code does not change.

Note that if there is a docstring in the function, an empty function does not need a `pass` statement as a placeholder.
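
A small sketch of the docstring case, with function names invented for illustration: both definitions are valid Python and both return `None` when called, so the `pass` in the first one is redundant.

```python
def with_pass():
    """Docstring plus a redundant pass."""
    pass


def without_pass():
    """Docstring only; the docstring is already a valid function body."""


# Both functions can be called and both return None.
results = (with_pass(), without_pass())
```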

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133200
Approved by: https://github.com/malfet, https://github.com/eqy, https://github.com/kit1980
2024-08-15 15:50:19 +00:00
Isuru Fernando
e554f71d7e Implement filter in dynamo (#131674)
Fixes https://github.com/pytorch/pytorch/issues/128944

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131674
Approved by: https://github.com/amjames, https://github.com/jansel
2024-08-14 14:54:13 +00:00
Edward Z. Yang
b5711297a0 Add support for SetVariable.discard (#133317)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133317
Approved by: https://github.com/Skylion007
2024-08-14 09:10:36 +00:00
Will Feng
1206958d89 [Dynamo] add EventVariable reconstruct (#133236)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/133236
Approved by: https://github.com/yifuwang
2024-08-14 02:56:11 +00:00
Edward Z. Yang
80ed3e9ccd s/dipatch/dispatch/g (#133192)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133192
Approved by: https://github.com/albanD
2024-08-12 20:26:58 +00:00
Huanyu He
de48d54042 [TorchRec] Add Support for FakeProcessGroup (#133039)
Summary:
# context
* use FakeProcessGroup to mimic the multi-process tests
* can use `_test_compile_fake_pg_fn` as the single-process VB compile test
```
from torchrec.distributed.tests.test_pt2_multiprocess import _test_compile_fake_pg_fn
_test_compile_fake_pg_fn(
    rank=0,
    world_size=2,
)
```

reference: D59637444

Test Plan:
# run test
* run command and results: P1519228952, [tlparse](https://interncache-all.fbcdn.net/manifold/tlparse_reports/tree/logs/.tmpwMCK1E/index.html)
```
TORCH_TRACE=/var/tmp/tt TORCH_SHOW_CPP_STACKTRACES=1 TORCH_LOGS="+all" buck2 run fbcode//mode/opt fbcode//torchrec/distributed/tests:test_pt2_multiprocess
```

Differential Revision: D56124045

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133039
Approved by: https://github.com/ezyang
2024-08-10 01:10:47 +00:00
Tom Ritchford
6beb2be2ed Fix _dynamo.variables.torch_function.global_mangled_class_name (#132744)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132744
Approved by: https://github.com/zou3519
2024-08-09 22:19:01 +00:00
Yiming Zhou
7b8ab7eb3e [dynamo] Partially support random.Random class (#133037)
This partially fixes the graph break issue when instantiating a `random.Random` class in Python.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133037
Approved by: https://github.com/anijain2305
2024-08-09 07:15:42 +00:00
xinyu-intel
5ae979ab10 [Dynamo] Support torch.autograd._is_checkpoint_valid (#132611)
Hi, we got `torch._dynamo.exc.Unsupported: torch.* op returned non-Tensor bool call_function <function _is_checkpoint_valid at 0x7f0b0d22e290>` while tracing activation [checkpointing function in deepspeed](324ee65cb0/deepspeed/runtime/activation_checkpointing/checkpointing.py (L630)). Consider adding it to the constant_folding list, similar to https://github.com/pytorch/pytorch/pull/126196

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132611
Approved by: https://github.com/anijain2305, https://github.com/williamwen42
2024-08-08 04:05:08 +00:00
Yiming Zhou
c69b2d24e3 [dynamo] Support remove method of set (#132943)
Fixes https://github.com/pytorch/pytorch/issues/132800

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132943
Approved by: https://github.com/anijain2305
2024-08-08 02:43:19 +00:00
Animesh Jain
194ec49d27 [dynamo][lists][stable diffusion] Do not add source on list slice (#132912)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132912
Approved by: https://github.com/williamwen42
ghstack dependencies: #132806, #132899
2024-08-08 02:23:07 +00:00
Animesh Jain
acad2050c1 [easy][dynamo] Add tx as an arg in getitem_const (#132899)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132899
Approved by: https://github.com/yanboliang
ghstack dependencies: #132806
2024-08-07 21:35:41 +00:00
Animesh Jain
25df063f04 [dynamo][user_defined][stable-diffusion] Raise ObservedAttributeError on UserDefinedObject var_getattr (#132806)
Fixes https://github.com/pytorch/pytorch/issues/132551

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132806
Approved by: https://github.com/williamwen42
2024-08-07 18:19:49 +00:00
Joel Schlosser
fb146fc3c6 Only store necessary tensor_dict fields in node meta (#132805)
Fixes #132290

This PR attempts a more invasive / complete solution than the one from #132338, which removes immediate tensor fields from the `tensor_dict` copy stored in node meta. The approach taken here is to store only those fields of the `tensor_dict` which are absolutely utilized somewhere else.

So far, this appears to be limited to:
* `_dynamo_static_input_type`
* `tag` (at least in the tests). Discussion at #94080 appears to indicate this is depended on for export

(CI may point out more)
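
The allowlist approach can be sketched as a plain dictionary filter. Only the two field names come from the list above; the helper name `filter_tensor_dict` and the sample values are hypothetical, and the real change may be structured differently.

```python
# Field names taken from the list above; everything else is illustrative.
KEPT_FIELDS = ("_dynamo_static_input_type", "tag")


def filter_tensor_dict(tensor_dict):
    """Return a copy holding only the fields that are used elsewhere."""
    return {key: value for key, value in tensor_dict.items() if key in KEPT_FIELDS}


example = {
    "_dynamo_static_input_type": "guarded",
    "tag": "x",
    "scratch": [1, 2, 3],  # dropped: not on the allowlist
}
filtered = filter_tensor_dict(example)
```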

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132805
Approved by: https://github.com/mlazos
2024-08-07 13:35:16 +00:00
xinyu-intel
8333ecf085 Support hasattr tracing for more PythonModuleVariable (#132731)
Fixes #132237

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132731
Approved by: https://github.com/EikanWang, https://github.com/yanboliang
2024-08-07 09:15:17 +00:00
Apurva Jain
8bc5ef563e Grouped Query Attention (#132689)
### Approach: Using the current function declaration

**Constraint:** Q_Heads % KV_Heads == 0

**Major change:**
- Added a new argument `enable_gqa: bool` to the sdpa function call
- It gives meaning to the third-to-last dimension (the number of heads).

Sample use cases this would enable:
LLama3

```
# LLama3 8b call to SDPA
query = torch.rand(batch, 32, seq_len_q, D)
key = torch.rand(batch, 8, seq_len_kv, D)
value = torch.rand(batch, 8, seq_len_kv, D)

output = scaled_dot_product_attention(query, key, value, is_causal=True, enable_gqa=True)

# Output Shape
(batch, 32, seq_len_q, D)
```

### Design Choice:

- Check if Query.size(-3) == Key.size(-3) == Value.size(-3) or, Query.size(-3) % Key.size(-3) == 0
- The function adjusts the key and value tensors to match the query tensor's head dimension by using repeat_interleave if their numbers of heads are not equal, facilitating correct and efficient computation in attention mechanisms.
- By default the enable_gqa flag is set to False, which ensures that regular sdpa functionality remains unchanged.
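
The head check and the repeat_interleave expansion can be sketched without tensors. This is a plain-Python illustration, not the sdpa implementation: `check_gqa_heads` and the list-based `repeat_interleave` are hypothetical helpers, and the head counts 32 and 8 are taken from the Llama 3 example above.

```python
def check_gqa_heads(q_heads, kv_heads):
    """Validate Q_Heads % KV_Heads == 0 and return the repeat factor."""
    if q_heads % kv_heads != 0:
        raise ValueError("number of query heads must be a multiple of KV heads")
    return q_heads // kv_heads


def repeat_interleave(items, repeats):
    """List analogue of repeat_interleave: each item is repeated consecutively."""
    return [item for item in items for _ in range(repeats)]


n_rep = check_gqa_heads(32, 8)
kv_heads = [f"kv{i}" for i in range(8)]
expanded = repeat_interleave(kv_heads, n_rep)  # one KV entry per query head
```

With this layout, query head `i` is paired with KV head `i // n_rep`, so each group of four consecutive query heads shares one key/value head.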

### Benchmarks:

- **sdpa.py: #130634**
For different batch sizes, enable_gqa=True shows a substantial improvement in the run_time of sdpa

| batch_size | q_num_heads | kv_num_heads | q_seq_len | kv_seq_len | embed_dim | forward_time when enable_gqa=True | forward_time when enable_gqa=False |
| ---------- | ----------- | ------------ | --------- | ---------- | --------- | --------------------------------- | ---------------------------------- |
| 1          | 32          | 8            | 2048      | 2048       | 2048      | 100.71                            | 119.70                             |
| 8          | 32          | 8            | 2048      | 2048       | 2048      | 539.78                            | 628.83                             |
| 16         | 32          | 8            | 2048      | 2048       | 2048      | 1056.81                           | 1225.48                            |
| 32         | 32          | 8            | 2048      | 2048       | 2048      | 2099.54                           | 2440.45                            |
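
The size of the improvement can be read off the table by dividing the two timing columns. The snippet below only redoes that arithmetic on the numbers copied from the table; the time unit is not stated in the source, and the ratio does not depend on it.

```python
# (batch_size, forward_time with enable_gqa=True, forward_time with enable_gqa=False)
rows = [
    (1, 100.71, 119.70),
    (8, 539.78, 628.83),
    (16, 1056.81, 1225.48),
    (32, 2099.54, 2440.45),
]

# Ratio above 1.0 means the enable_gqa=True run was faster.
speedups = {batch: off / on for batch, on, off in rows}

for batch, ratio in speedups.items():
    print(f"batch_size={batch}: {ratio:.2f}x")
```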

![Screenshot 2024-07-25 at 9 07 40 PM](https://github.com/user-attachments/assets/a3e5f716-c39f-4096-9e6c-82a735e57b7b)

- **TorchTitan: https://github.com/pytorch/torchtitan/pull/458**

Differential Revision: D60772086

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132689
Approved by: https://github.com/drisspg
2024-08-07 05:35:36 +00:00
Animesh Jain
de00c79583 [dynamo][inline_inbuilt_nn_modules] Mark nn module tensor static for cudagraphs (#132736)
Fixes https://github.com/pytorch/pytorch/issues/132714

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132736
Approved by: https://github.com/mlazos
ghstack dependencies: #132538
2024-08-06 20:13:28 +00:00
Brian Hirsh
e6eee04875 dynamo: use equality guards instead of id guards for Placement/DeviceMesh (#124401)
After talking to @anijain2305, we probably can't land this since it won't work for C++ guards. But we should still be able to do better than ID_MATCH
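
The difference between an identity guard and an equality guard can be sketched in plain Python. This is an illustration only, not Dynamo's guard machinery: the `Shard` dataclass is a hypothetical stand-in for a placement object, and the two `make_*_guard` helpers are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    """Hypothetical stand-in for a placement object with value semantics."""
    dim: int


def make_id_guard(obj):
    # Passes only for the exact same Python object.
    expected = id(obj)
    return lambda candidate: id(candidate) == expected


def make_equality_guard(obj):
    # Passes for any object that compares equal to the original.
    return lambda candidate: candidate == obj


original = Shard(dim=0)
rebuilt = Shard(dim=0)  # equal value, different object

id_guard = make_id_guard(original)
eq_guard = make_equality_guard(original)
```

Under the identity guard, a freshly constructed but equal object fails the check, which would force a recompile in a guard-based cache; the equality guard accepts it.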

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124401
Approved by: https://github.com/anijain2305
2024-08-06 17:14:44 +00:00
Michael Lazos
a8f0979962 Add cudagraph static inputs logging (#132726)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132726
Approved by: https://github.com/anijain2305
2024-08-06 12:01:20 +00:00
Aart Bik
a8490a0762 [traced-graph][sparse] propagate sparsity in fx graph (#131920)
This PR proceeds with implementing the feature request #117188 by generalizing more cases that already work with COO to work with the compressed sparse formats as well.

Feature request:
https://github.com/pytorch/pytorch/issues/117188

Rebranch of older PRs (for history):
https://github.com/pytorch/pytorch/pull/131474
https://github.com/pytorch/pytorch/pull/128549

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131920
Approved by: https://github.com/ezyang
2024-08-05 15:49:53 +00:00
William Wen
01cdcbf7c8 [dynamo] revert map/zip iterator related changes (#132528)
Need to revert due to internal hangs: S437700

This reverts commit b6c1490cc0.

Revert "[dynamo] implement IteratorVariable and polyfill fallbacks for enumerate (#131725)"

This reverts commit 2576dbbc35.

Revert "[dynamo] add itertools repeat/count bytecode reconstruction (#131716)"

This reverts commit 35b4de32fa.

Revert "[dynamo] add lazy IteratorVariable implementations for map and zip (#131413)"

This reverts commit 7d282d8755.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132528
Approved by: https://github.com/ZainRizvi
2024-08-04 18:46:55 +00:00
Oguz Ulgen
6e79932543 Add basic mypy annotations to dynamo (#132415)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132415
Approved by: https://github.com/XuehaiPan, https://github.com/jamesjwu
2024-08-04 18:43:36 +00:00
PyTorch MergeBot
3558a8cf4a Revert "Add basic mypy annotations to dynamo (#132415)"
This reverts commit 71e22e0959.

Reverted https://github.com/pytorch/pytorch/pull/132415 on behalf of https://github.com/ZainRizvi due to Sorry, this PR has entered a weird state in the diff train. Trying to revert it to skip it, and then we can try relanding it ([comment](https://github.com/pytorch/pytorch/pull/132415#issuecomment-2267631785))
2024-08-04 18:39:29 +00:00
PyTorch MergeBot
0a25666f92 Revert "[dynamo] revert map/zip iterator related changes (#132528)"
This reverts commit e81e74ca6c.

Reverted https://github.com/pytorch/pytorch/pull/132528 on behalf of https://github.com/ZainRizvi due to This stack entered a weird state in the diff train. Reverting and relanding to clean the state ([comment](https://github.com/pytorch/pytorch/pull/132528#issuecomment-2267628475))
2024-08-04 18:26:09 +00:00
Animesh Jain
06581c277a [dynamo][stable-diffusion] Support dict(obj) on constrained subclasses of dict and OrderedDict (#132558)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132558
Approved by: https://github.com/jansel
2024-08-03 06:31:00 +00:00
Yanbo Liang
373e9be457 [Inductor][FlexAttention] Add kwarg to top level for users to specify kernel params (#132015)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132015
Approved by: https://github.com/Chillee
2024-08-03 02:27:02 +00:00
Animesh Jain
419b76c4ac [dynamo] Reland 132308, 132314, 132318, 132334 - Make builtin nn modules attributes static (#132539)
Relanding 4 PRs ending at https://github.com/pytorch/pytorch/pull/132334

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132539
Approved by: https://github.com/Skylion007, https://github.com/yanboliang, https://github.com/mlazos
2024-08-03 02:08:22 +00:00
William Wen
f379bbd46d [dynamo] support inspect.signature.bind (#132330)
Fixes https://github.com/pytorch/pytorch/issues/93760.

This was not that small of a task...

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132330
Approved by: https://github.com/jansel
ghstack dependencies: #132329
2024-08-02 20:37:05 +00:00
William Wen
e81e74ca6c [dynamo] revert map/zip iterator related changes (#132528)
Need to revert due to internal hangs: S437700

This reverts commit b6c1490cc0.

Revert "[dynamo] implement IteratorVariable and polyfill fallbacks for enumerate (#131725)"

This reverts commit 2576dbbc35.

Revert "[dynamo] add itertools repeat/count bytecode reconstruction (#131716)"

This reverts commit 35b4de32fa.

Revert "[dynamo] add lazy IteratorVariable implementations for map and zip (#131413)"

This reverts commit 7d282d8755.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132528
Approved by: https://github.com/ZainRizvi
2024-08-02 19:40:57 +00:00
PyTorch MergeBot
bcb4f7c172 Revert "Grouped Query Attention (#128898)"
This reverts commit 6b28af1b79.

Reverted https://github.com/pytorch/pytorch/pull/128898 on behalf of https://github.com/ZainRizvi due to Sorry, this broke a bunch of tests internally. See D60638265 ([comment](https://github.com/pytorch/pytorch/pull/128898#issuecomment-2265961038))
2024-08-02 18:58:46 +00:00
PyTorch MergeBot
24d0a32f98 Revert "[dynamo] Wrap unspecialized nn module getattr with UnspecializedNNModuleSource (#132308)"
This reverts commit aa0ed2496f.

Reverted https://github.com/pytorch/pytorch/pull/132308 on behalf of https://github.com/anijain2305 due to broke internal tests ([comment](https://github.com/pytorch/pytorch/pull/132308#issuecomment-2265959993))
2024-08-02 18:55:51 +00:00
PyTorch MergeBot
e696f17467 Revert "[dynamo] Track builtin nn modules with UnspecializedBuiltinNNModuleVariable (#132314)"
This reverts commit d6a82ce39b.

Reverted https://github.com/pytorch/pytorch/pull/132314 on behalf of https://github.com/anijain2305 due to broke internal tests ([comment](https://github.com/pytorch/pytorch/pull/132314#issuecomment-2265953367))
2024-08-02 18:52:38 +00:00
PyTorch MergeBot
193a19ee91 Revert "[dynamo] Treat attr of unspecialized buiitin nn modules as static (#132318)"
This reverts commit 7b816d7d6d.

Reverted https://github.com/pytorch/pytorch/pull/132318 on behalf of https://github.com/anijain2305 due to broke internal tests ([comment](https://github.com/pytorch/pytorch/pull/132318#issuecomment-2265945433))
2024-08-02 18:43:32 +00:00
PyTorch MergeBot
b8f7019df0 Revert "[dynamo] Track params/buffers and mark them as static (#132334)"
This reverts commit babb249a89.

Reverted https://github.com/pytorch/pytorch/pull/132334 on behalf of https://github.com/anijain2305 due to broke internal tests ([comment](https://github.com/pytorch/pytorch/pull/132334#issuecomment-2265942261))
2024-08-02 18:41:19 +00:00
Animesh Jain
56f2917bef [dynamo] Bugfix for recently added str handler (#132461)
There is probably more work to do to improve support, but this is a hot fix to avoid failing on `.__func__`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132461
Approved by: https://github.com/williamwen42
ghstack dependencies: #132425
2024-08-02 13:16:39 +00:00
Michael Lazos
d2e9a8bf6d [Reland] Fix inlining module-scoped store global (#132439)
Reland https://github.com/pytorch/pytorch/pull/132224

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132439
Approved by: https://github.com/anijain2305
2024-08-02 09:13:52 +00:00
Animesh Jain
babb249a89 [dynamo] Track params/buffers and mark them as static (#132334)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132334
Approved by: https://github.com/ezyang, https://github.com/mlazos
2024-08-02 08:55:43 +00:00
Yanbo Liang
5ea0f51187 [Dynamo] Support abc.MutableMapping.get (#132363)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132363
Approved by: https://github.com/anijain2305, https://github.com/mlazos
2024-08-02 04:17:35 +00:00
Animesh Jain
6c4ce4331c [dynamo][exception] Raise Observed KeyError exception for dict __getitem__ (#132425)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132425
Approved by: https://github.com/yanboliang, https://github.com/Skylion007
2024-08-02 02:58:31 +00:00
Chen Haifeng
50ed6ce277 Support built-in id function for TensorVariable on parameters (#130100)
Fixes #130087

This patch tries to provide a built-in id function implementation for TensorVariable when the id function is called on tensors like module parameters. The id function call on intermediate tensors is not supported.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130100
Approved by: https://github.com/anijain2305
2024-08-02 01:19:25 +00:00
William Wen
625af2d27c [dynamo] fix add_push_null callsites with CALL_FUNCTION_EX (#132329)
Also fix a bug in `PyCodegen.add_push_null` where in Python <= 3.12, we may accidentally duplicate a NULL instead of the object on the stack before it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132329
Approved by: https://github.com/anijain2305
2024-08-02 00:29:21 +00:00
Oguz Ulgen
71e22e0959 Add basic mypy annotations to dynamo (#132415)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132415
Approved by: https://github.com/XuehaiPan, https://github.com/jamesjwu
2024-08-01 20:14:25 +00:00
PyTorch MergeBot
40c8f73099 Revert "Fix inlining module-scoped store global (#132224)"
This reverts commit c3a31d90e7.

Reverted https://github.com/pytorch/pytorch/pull/132224 on behalf of https://github.com/ZainRizvi due to Looks like the new import mock_store_global_crossfile_inline fails internally. Please see D60567756 for details ([comment](https://github.com/pytorch/pytorch/pull/132224#issuecomment-2263768729))
2024-08-01 19:06:36 +00:00
Animesh Jain
7b816d7d6d [dynamo] Treat attr of unspecialized buiitin nn modules as static (#132318)
This fixes the huge increase in compile time with +dynamic with inline_inbuilt_nn_modules.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132318
Approved by: https://github.com/yanboliang, https://github.com/mlazos, https://github.com/ezyang
ghstack dependencies: #132302, #132304, #132312, #132308, #132314
2024-08-01 17:11:18 +00:00
Yiming Zhou
ee09d066d3 [dynamo] Add line number to _warn_capture_scalar_outputs() (#132333)
Fixes #127667.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132333
Approved by: https://github.com/anijain2305
2024-08-01 16:11:21 +00:00
Oguz Ulgen
72d2dba992 Add None return type to init (#132335)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132335
Approved by: https://github.com/albanD
2024-08-01 15:26:45 +00:00
Animesh Jain
d6a82ce39b [dynamo] Track builtin nn modules with UnspecializedBuiltinNNModuleVariable (#132314)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132314
Approved by: https://github.com/yanboliang
ghstack dependencies: #132302, #132304, #132312, #132308
2024-08-01 06:21:05 +00:00
Animesh Jain
aa0ed2496f [dynamo] Wrap unspecialized nn module getattr with UnspecializedNNModuleSource (#132308)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132308
Approved by: https://github.com/yanboliang
ghstack dependencies: #132302, #132304, #132312
2024-08-01 06:21:05 +00:00
Animesh Jain
e772547d70 [dynamo][rename/refactor] Rename guard_source NN_MODULE to SPECIALIZED_NN_MODULE (#132302)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132302
Approved by: https://github.com/yanboliang
2024-08-01 04:35:43 +00:00
jainapurva
6b28af1b79 Grouped Query Attention (#128898)
### Approach: Using the current function declaration

**Constraint:** Q_Heads % KV_Heads == 0

**Major change:**
- Added a new argument `enable_gqa: bool` to the sdpa function call
- It gives meaning to the third-to-last dimension (the number of heads).

Sample use cases this would enable:
LLama3

```
# LLama3 8b call to SDPA
query = torch.rand(batch, 32, seq_len_q, D)
key = torch.rand(batch, 8, seq_len_kv, D)
value = torch.rand(batch, 8, seq_len_kv, D)

output = scaled_dot_product_attention(query, key, value, is_causal=True, enable_gqa=True)

# Output Shape
(batch, 32, seq_len_q, D)
```

### Design Choice:

- Check if Query.size(-3) == Key.size(-3) == Value.size(-3) or, Query.size(-3) % Key.size(-3) == 0
- The function adjusts the key and value tensors to match the query tensor's head dimension by using repeat_interleave if their numbers of heads are not equal, facilitating correct and efficient computation in attention mechanisms.
- By default the enable_gqa flag is set to False, which ensures that regular sdpa functionality remains unchanged.

### Benchmarks:

- **sdpa.py: #130634**
For different batch sizes, enable_gqa=True shows a substantial improvement in the run_time of sdpa

| batch_size | q_num_heads | kv_num_heads | q_seq_len | kv_seq_len | embed_dim | forward_time when enable_gqa=True | forward_time when enable_gqa=False |
| ---------- | ----------- | ------------ | --------- | ---------- | --------- | --------------------------------- | ---------------------------------- |
| 1          | 32          | 8            | 2048      | 2048       | 2048      | 100.71                            | 119.70                             |
| 8          | 32          | 8            | 2048      | 2048       | 2048      | 539.78                            | 628.83                             |
| 16         | 32          | 8            | 2048      | 2048       | 2048      | 1056.81                           | 1225.48                            |
| 32         | 32          | 8            | 2048      | 2048       | 2048      | 2099.54                           | 2440.45                            |

![Screenshot 2024-07-25 at 9 07 40 PM](https://github.com/user-attachments/assets/a3e5f716-c39f-4096-9e6c-82a735e57b7b)

- **TorchTitan: https://github.com/pytorch/torchtitan/pull/458**

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128898
Approved by: https://github.com/drisspg
2024-07-31 22:58:51 +00:00
Xuehai Pan
e74ba1b34a [BE][Easy][15/19] enforce style for empty lines in import segments in torch/_d*/ (#129767)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129767
Approved by: https://github.com/anijain2305
2024-07-31 21:18:11 +00:00
Michael Lazos
c3a31d90e7 Fix inlining module-scoped store global (#132224)
Fixes https://github.com/pytorch/pytorch/issues/132165

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132224
Approved by: https://github.com/anijain2305
2024-07-31 17:37:43 +00:00
datagero
bdd7a0322d [Dynamo] Fix - str handler for UserDefinedObjectVariable (#130506)
Fixes #130301

Adjusted the call_str method to handle str conversion for UserDefinedObjectVariable.
Attempt in a clean branch for unrelated test errors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130506
Approved by: https://github.com/oulgen, https://github.com/anijain2305
2024-07-31 16:39:59 +00:00
Luca Wehrstedt
f4f7aba75d Expose function to probe whether PyTorch was built with FlashAttention (#131894)
This is needed by downstream projects (e.g., xFormers) to determine whether they can count on FlashAttention in PyTorch or whether they need to build it themselves.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131894
Approved by: https://github.com/drisspg, https://github.com/eqy
2024-07-31 11:33:09 +00:00
ekamiti
9e473fd868 Make adding Buffers more like adding Parameters (#125971)
Add semantics for creating a buffer object that mirror those for creating a parameter. This is done by introducing a new Buffer class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same, as the register_buffer method has not been changed. The persistent parameter of the Buffer type indicates whether a buffer object should be persistent. Other non-test changes have to do with getting the new Buffer type recognized by inductor and dynamo. The remaining changes are test changes that make sure the Buffer type can be used as a drop-in replacement for register_buffer, since it just leads to register_buffer being called. The new functionality still allows normal tensors to be used as buffers, so these changes are intended to be backwards compatible.
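
The type-disambiguation idea can be sketched with toy classes. This is not the `torch.nn` implementation: `Buffer` and `ToyModule` below are simplified stand-ins written for illustration, with plain lists in place of tensors, and they only show how assigning a marker type can be routed to an unchanged `register_buffer`.

```python
class Buffer:
    """Toy marker type wrapping a value and a persistence flag."""

    def __init__(self, data, persistent=True):
        self.data = data
        self.persistent = persistent


class ToyModule:
    """Toy module: Buffer assignments are forwarded to register_buffer."""

    def __init__(self):
        object.__setattr__(self, "_buffers", {})
        object.__setattr__(self, "_non_persistent", set())

    def register_buffer(self, name, data, persistent=True):
        # The existing registration path, unchanged by the new marker type.
        self._buffers[name] = data
        if persistent:
            self._non_persistent.discard(name)
        else:
            self._non_persistent.add(name)

    def __setattr__(self, name, value):
        # The marker type is what tells buffers apart from other attributes.
        if isinstance(value, Buffer):
            self.register_buffer(name, value.data, persistent=value.persistent)
        else:
            object.__setattr__(self, name, value)


m = ToyModule()
m.running_mean = Buffer([0.0, 0.0])          # attribute-style registration
m.scratch = Buffer([1.0], persistent=False)  # non-persistent buffer
m.register_buffer("legacy", [2.0])           # the old call still works
```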

Fixes #35735

Co-authored-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125971
Approved by: https://github.com/albanD, https://github.com/anijain2305, https://github.com/mlazos
2024-07-31 10:32:40 +00:00
rzou
19db4f6014 [capture_triton] fix special kwargs path (#132143)
I didn't test this path when creating the orchestrator. This PR fixes
that path to work in the capture_triton path. The problem is that we are
handling a value that is an int (in the capture_triton path) and a
ConstantVariable (in the Dynamo triton path) so we abstract that out in
the orchestrator.

Test Plan:
- new tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132143
Approved by: https://github.com/oulgen
2024-07-30 20:30:40 +00:00
Guilherme Leobas
a843178529 Let dynamo inline functional_call (#128646)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128646
Approved by: https://github.com/zou3519
2024-07-30 14:22:23 +00:00
PyTorch MergeBot
499ead96ff Revert "Grouped Query Attention (#128898)"
This reverts commit d039b14207.

Reverted https://github.com/pytorch/pytorch/pull/128898 on behalf of https://github.com/albanD due to Broken test on main ([comment](https://github.com/pytorch/pytorch/pull/128898#issuecomment-2258314481))
2024-07-30 13:11:24 +00:00
Animesh Jain
03e058189e [dynamo] Support dict unpack of MutableMapping objects (#131961)
Fixes https://github.com/pytorch/pytorch/issues/128067

The basic functionality was already introduced earlier. This just ensures
that we support UserDefinedObjectVariable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131961
Approved by: https://github.com/williamwen42, https://github.com/mlazos, https://github.com/yanboliang
ghstack dependencies: #131827, #131956
2024-07-30 05:49:58 +00:00
Animesh Jain
13457d1da0 [dynamo][log] Suggest to use pytree when graph-break on optree (#131827)
Discovered while working on https://github.com/pytorch/pytorch/issues/121369
On the model above, the log looks like this

~~~
/home/anijain/local/pytorch2/torch/_dynamo/variables/functions.py:698: UserWarning: Graph break for an optree C/C++ function optree._C.PyCapsule.flatten. Consider using torch._utils.pytree - https://github.com/pytorch/pytorch/blob/main/torch/utils/_pytree.py.
  torch._dynamo.utils.warn_once(msg)
/home/anijain/local/pytorch2/torch/_dynamo/variables/functions.py:698: UserWarning: Graph break for an optree C/C++ function optree.PyCapsule.unflatten. Consider using torch._utils.pytree - https://github.com/pytorch/pytorch/blob/main/torch/utils/_pytree.py.
  torch._dynamo.utils.warn_once(msg)
  ~~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131827
Approved by: https://github.com/zou3519, https://github.com/mlazos
2024-07-30 05:49:58 +00:00
William Wen
b6c1490cc0 [dynamo] make more unpack_var_sequence calls forced (#132069)
Fixes [T197204962](https://www.internalfb.com/intern/tasks/?t=197204962) (example failure: https://www.internalfb.com/intern/testinfra/diagnostics/11540474088277914.281475138576374.1722221031/)

Added tests contain a simple repro for the observed failure (`test_map_unpack_vars`).

Also fixes https://github.com/pytorch/pytorch/issues/132044

Differential Revision: [D60420335](https://our.internmc.facebook.com/intern/diff/D60420335)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132069
Approved by: https://github.com/anijain2305
2024-07-30 02:30:08 +00:00
jainapurva
d039b14207 Grouped Query Attention (#128898)
### Approach: Using the current function declaration

**Constraint:** Q_Heads % KV_Heads == 0

**Major change:**
- Added a new argument enable_gqa: bool to sdpa function call
- It gives meaning to the third-to-last dimension (the number of heads).

Sample use cases this would enable:
LLama3

```
# LLama3 8b call to SDPA
query = torch.rand(batch, 32, seq_len_q, D)
key = torch.rand(batch, 8, seq_len_kv, D)
value = torch.rand(batch, 8, seq_len_kv, D)

output = scaled_dot_product_attention(query, key, value, is_causal=True, enable_gqa=True)

# Output Shape
(batch, 32, seq_len_q, D)
```

### Design Choice:

- Check if Query.size(-3) == Key.size(-3) == Value.size(-3) or, Query.size(-3) % Key.size(-3) == 0
- The function adjusts the key and value tensors to match the query tensor's head dimension by using repeat_interleave if their number of heads are not equal, facilitating correct and efficient computation in attention mechanisms.
- By default the enable_gqa flag is set to False, which ensures that regular sdpa functionality remains unchanged.
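
The head-matching step described above can be pictured without tensors. This is a minimal plain-Python sketch, assuming heads are represented by string labels; `expand_kv_heads` is a hypothetical helper, not a PyTorch API, and it only mimics the element order that `repeat_interleave` produces.

```python
def expand_kv_heads(kv_heads, num_q_heads):
    """Repeat each KV head so the result has num_q_heads entries.

    Hypothetical helper for illustration only: it mirrors the ordering of
    repeat_interleave (each item repeated consecutively), not its tensor API.
    """
    num_kv_heads = len(kv_heads)
    # Constraint from the PR description: Q_Heads % KV_Heads == 0.
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("Q_Heads must be divisible by KV_Heads")
    group_size = num_q_heads // num_kv_heads
    return [head for head in kv_heads for _ in range(group_size)]


# Head counts from the LLama3 8b call above: 32 query heads, 8 KV heads.
expanded = expand_kv_heads([f"kv{i}" for i in range(8)], 32)
```

Under this ordering, query head `i` attends against KV head `i // group_size`, so consecutive query heads share one KV head.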

### Benchmarks:

- **sdpa.py: #130634**
For different batch sizes, enable_gqa=True shows a substantial improvement in the run time of sdpa

| batch_size | q_num_heads | kv_num_heads | q_seq_len | kv_seq_len | embed_dim | forward_time when enable_gqa=True | forward_time when enable_gqa=False |
| ---------- | ----------- | ------------ | --------- | ---------- | --------- | --------------------------------- | ---------------------------------- |
| 1          | 32          | 8            | 2048      | 2048       | 2048      | 100.71                            | 119.70                             |
| 8          | 32          | 8            | 2048      | 2048       | 2048      | 539.78                            | 628.83                             |
| 16         | 32          | 8            | 2048      | 2048       | 2048      | 1056.81                           | 1225.48                            |
| 32         | 32          | 8            | 2048      | 2048       | 2048      | 2099.54                           | 2440.45                            |

![Screenshot 2024-07-25 at 9 07 40 PM](https://github.com/user-attachments/assets/a3e5f716-c39f-4096-9e6c-82a735e57b7b)

- **TorchTitan: https://github.com/pytorch/torchtitan/pull/458**

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128898
Approved by: https://github.com/drisspg
2024-07-29 21:49:06 +00:00
PyTorch MergeBot
f72266ecea Revert "Let dynamo inline functional_call (#128646)"
This reverts commit 5aab1acc84.

Reverted https://github.com/pytorch/pytorch/pull/128646 on behalf of https://github.com/clee2000 due to the newly added test dynamo/test_higher_order_ops.py::FuncTorchHigherOrderOpTests::test_functional_call_sequential_params_and_buffers [GH job link](https://github.com/pytorch/pytorch/actions/runs/10147452270/job/28058682000) [HUD commit link](5aab1acc84) is broken, probably a landrace since it passed on PR ([comment](https://github.com/pytorch/pytorch/pull/128646#issuecomment-2256375501))
2024-07-29 16:26:50 +00:00
Guilherme Leobas
5aab1acc84 Let dynamo inline functional_call (#128646)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128646
Approved by: https://github.com/zou3519
ghstack dependencies: #129091, #130490
2024-07-29 15:41:03 +00:00
Guilherme Leobas
1e9cdf7d91 Relax constraints for creating a GenericContextWrappingVariable (#129091)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129091
Approved by: https://github.com/yanboliang, https://github.com/zou3519
2024-07-29 15:40:59 +00:00
Chengji Yao
d47c470f47 [dynamo] implement var_getattr in UserFunctionVariable (#130413)
This PR addresses `getattr` on a UserFunctionVariable. Although this usage is uncommon, it does appear in [Megatron's code](https://github.com/NVIDIA/Megatron-LM/blob/main/megatron/core/tensor_parallel/layers.py#L635).

```
def linear_with_grad_accumulation_and_async_allreduce(...):
    ....
    if not linear_with_grad_accumulation_and_async_allreduce.warned:
        ....
    ....

linear_with_grad_accumulation_and_async_allreduce.warned = False
```
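
The pattern relies on ordinary function attributes. A self-contained sketch in plain Python follows; `scaled_add` and its warning text are hypothetical and stand in for the elided Megatron function. It shows eager Python behavior only and makes no claim about what Dynamo traces beyond the attribute read described above.

```python
import warnings


def scaled_add(x, y, scale=1.0):
    # Read an attribute stored on the function object itself; this read is
    # the getattr-on-a-function usage the PR description refers to.
    if not scaled_add.warned:
        warnings.warn("scaled_add: one-time notice")
        scaled_add.warned = True
    return x + scale * y


# The attribute is attached after the definition, as in the snippet above.
scaled_add.warned = False

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    first = scaled_add(1.0, 2.0)
    second = scaled_add(1.0, 2.0, scale=0.5)
```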

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130413
Approved by: https://github.com/yanboliang
2024-07-29 08:29:59 +00:00
Oguz Ulgen
75c8d59ea1 Remove mypy ignore from torch/_dynamo/variables/lazy.py (#131785)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131785
Approved by: https://github.com/aorenste, https://github.com/zou3519
ghstack dependencies: #131786, #131870
2024-07-28 17:13:53 +00:00
Oguz Ulgen
96c1862e0b Remove mypy ignore from torch/_dynamo/variables/__init__.py (#131784)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131784
Approved by: https://github.com/aorenste, https://github.com/zou3519, https://github.com/Skylion007
2024-07-27 05:07:33 +00:00
Vishwa Raj Singh
cd53698df0 Add hpu backend support for dynamo torchVariable _in_graph_classes() function (#129948)
Fixes #ISSUE_NUMBER

A recent change from PR# f657b2b1f8 (diff-4a52059570bb96333d8383ce6a9d01bbb114c5e34aff6028f820899ca39b5a26R80) hard-codes the flow to the CUDA stream in the in-graph function. For non-CUDA backends (hpu in our case), this breaks the graph.

This PR adds hpu backend support to the dynamo variables function _in_graph_classes().

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129948
Approved by: https://github.com/yanboliang
2024-07-26 18:38:03 +00:00
William Wen
2576dbbc35 [dynamo] implement IteratorVariable and polyfill fallbacks for enumerate (#131725)
Fixes https://github.com/pytorch/pytorch/issues/112794.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131725
Approved by: https://github.com/anijain2305
ghstack dependencies: #131413, #131716
2024-07-26 17:17:09 +00:00
William Wen
35b4de32fa [dynamo] add itertools repeat/count bytecode reconstruction (#131716)
Also fix bugs in the count iterator variable implementation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131716
Approved by: https://github.com/anijain2305
ghstack dependencies: #131413
2024-07-26 17:17:09 +00:00
Brian Hirsh
8bb9aa93a7 dynamo: mutations on .data should be invisible to autograd (#131403)
Fixes https://github.com/pytorch/pytorch/issues/121353

Our handling of `.data` in dynamo today basically just converts `y = x.data` into `y = x.detach()`. The semantics of these two ops are not quite the same, because:

(1) any future mutations on `x.data` will be fully ignored by autograd
(2) any mutations on `x.detach()` will bump x's version counter

The linked model does a `.data` mutation that is hidden from autograd in eager, but ends up erroring during AOTDispatcher tracing.

I updated dynamo's handling so that:

(1) when dynamo sees a call to `getattr(tensor, "data")` and calls `.detach()` we set a flag on the returned `TensorVariable` indicating it came from `.data`

(2) on any tensor method that we call with an input `TensorVariable` with this flag turned on, we proxy autograd's `preserve_version_counter` logic into the graph, to properly reset the VC after the op is run.

One thing to note is that I don't actually do this on every op that we pass the tensor to: I only do it for tensor methods that appear to be mutations (by checking for a trailing underscore). My thought was that:

(1) I didn't want to do this for **every** op that you pass `y` into, since that will e.g. triple the number of nodes in the graph, and could cause compile time regressions if you use .data

(2) this situation is pretty rare in general, and I'm hoping that "tensor method mutations" cover most reasonable mutation cases. If we manage to miss a case, you will get a loud error during tracing anyway, so there is not a safety issue.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131403
Approved by: https://github.com/anijain2305, https://github.com/zou3519
2024-07-26 14:22:20 +00:00
Yanbo Liang
e76e566cfb [Dynamo] Support zip_longest (#131497)
Fixes #121348

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131497
Approved by: https://github.com/mlazos, https://github.com/jansel, https://github.com/zou3519
2024-07-26 14:06:10 +00:00
William Wen
7d282d8755 [dynamo] add lazy IteratorVariable implementations for map and zip (#131413)
Fixes https://github.com/pytorch/pytorch/issues/130750.

Repro of lazy/eager `map` discrepancy without `islice`:
```python
    def fn(a, b):
        y = 1

        def f(x):
            nonlocal y
            y += 1
            return x

        l = list(zip([a, b], map(f, [1, 2, 3, 4])))
        return a + y
```

The major change is that we implement `MapVariable` and `ZipVariable` based on `IteratorVariable`. Before, `map` and `zip` were being traced by immediately unpacking the result as a `TupleVariable`, which is wrong in cases such as the example above.

`MapVariable`s are not allowed to be unpacked while `ZipVariable`s can only be unpacked if all of its iterables can also be unpacked.

We also add new `[has_]force_unpack_var_sequence` methods to `VariableTracker` for the case where it is safe to unpack the entire sequence lazily, e.g., when building a list from a map (i.e. `list(map(f, ...))`).
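
The lazy/eager discrepancy can be reproduced in plain Python, without tensors. In this sketch, `count_calls` is a hypothetical helper and the `eager` flag only models the old "unpack immediately" tracing behavior; it is not Dynamo code.

```python
def count_calls(eager):
    calls = 0

    def f(x):
        nonlocal calls
        calls += 1
        return x

    mapped = map(f, [1, 2, 3, 4])
    if eager:
        # Models the old behavior: the whole map is unpacked up front,
        # so f runs once per element regardless of how zip consumes it.
        mapped = list(mapped)
    # zip stops as soon as its first (shorter) iterable is exhausted.
    pairs = list(zip(["a", "b"], mapped))
    return calls, pairs


lazy_calls, lazy_pairs = count_calls(eager=False)
eager_calls, eager_pairs = count_calls(eager=True)
```

Both variants produce the same pairs, but the side effect differs: the lazy `map` calls `f` twice, while the eager unpack calls it four times. That is why the value of `y` in the repro depends on tracing `map` lazily.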

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131413
Approved by: https://github.com/anijain2305
2024-07-26 10:47:38 +00:00
Animesh Jain
a617919541 [dynamo] Do not guard on keys for _forward_hooks and _forward_pre_hooks (#131682)
Fixes https://github.com/pytorch/pytorch/issues/125836

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131682
Approved by: https://github.com/bdhirsh
2024-07-26 04:39:54 +00:00
Animesh Jain
2a4ca5ccc4 [dynamo] Pop the exception stack on handling the StopIteration natively (#131801)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131801
Approved by: https://github.com/yanboliang
ghstack dependencies: #131795
2024-07-25 23:33:19 +00:00
Michael Lazos
51f4f87718 [Reland] Ensure staticmethods can be allowed in graph (#131789)
Fixes https://github.com/pytorch/pytorch/issues/124735

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131789
Approved by: https://github.com/anijain2305
2024-07-25 22:54:18 +00:00
Yidi Wu
ffc6bf8149 [dynamo] lazily guard and specialize on the symint when used in f-string. (#131529)
Fixes https://github.com/pytorch/pytorch/issues/103602.

This PR implements the idea of "if someone creates a string and then ends up not using it, we would prefer to NOT have specialized." mentioned in the above issue. Specifically, we create a lazy variable tracker instead of a ConstantVariable when we're in FORMAT_VALUE. When the lazy variable tracker is realized (i.e. it's going to be used), we create a ConstantVariable, and the specialization/guarding happens at the time of realization.
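
The "specialize only on use" idea can be sketched in plain Python. `LazyConstant` and the `guards` list are hypothetical stand-ins for the lazy variable tracker and Dynamo's guard set; the real implementation is not shown here.

```python
class LazyConstant:
    """Hypothetical stand-in for a lazy variable tracker."""

    def __init__(self, value, guards):
        self._value = value
        self._guards = guards
        self._realized = False

    def realize(self):
        # Specialization (recording a guard) happens only on first use.
        if not self._realized:
            self._guards.append(f"value == {self._value!r}")
            self._realized = True
        return self._value


guards = []
unused = LazyConstant("size=3", guards)  # formatted but never consumed
used = LazyConstant("size=4", guards)    # formatted and later consumed
message = used.realize()
```

Only the string that is actually consumed contributes a guard; the unused one leaves the guard list untouched.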

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131529
Approved by: https://github.com/ezyang
2024-07-25 16:16:34 +00:00
Oguz Ulgen
7a42470bcb Annotate all InstructionTranslator (#131509)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131509
Approved by: https://github.com/zou3519
2024-07-24 23:45:53 +00:00
PyTorch MergeBot
236e06f9f9 Revert "Ensure staticmethods can be allowed in graph (#130882)"
This reverts commit 93fdd0237d.

Reverted https://github.com/pytorch/pytorch/pull/130882 on behalf of https://github.com/clee2000 due to torchrec test still broken internally D59945836 ([comment](https://github.com/pytorch/pytorch/pull/130882#issuecomment-2249003059))
2024-07-24 22:32:41 +00:00
PyTorch MergeBot
5db5865614 Revert "Annotate all InstructionTranslator (#131509)"
This reverts commit eafbd20f23.

Reverted https://github.com/pytorch/pytorch/pull/131509 on behalf of https://github.com/clee2000 due to sorry need to revert this to revert something else, I think you only need to rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/131509#issuecomment-2249000843))
2024-07-24 22:29:49 +00:00
Oguz Ulgen
b56939dae1 Annotate more InstructionTranslator (#131680)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131680
Approved by: https://github.com/zou3519
ghstack dependencies: #131676
2024-07-24 22:14:29 +00:00
Edward Z. Yang
0c6f1ca064 Introduce torch._dynamo.config.enable_compiler_collectives for syncing compilation across ranks (#130935)
This PR implements an opt-in configuration option for synchronizing compilation across all ranks at the end of Dynamo tracing (and potentially, other places in the future). There are two pieces to this PR:

1. Implementing infrastructure for compiler collectives (DistributedState/LocalState, the actual collective)
2. Using this infrastructure to synchronize automatic dynamic choices across all ranks

The infrastructure in part one can be used for other purposes, just add more (serializable) fields to LocalState.

Here is how automatic dynamic synchronization works:

1. Preflight in "torch/_dynamo/variables/builder.py": On the first Dynamo trace run, we trace without automatic dynamic at all; we assume all Tensor inputs that are not otherwise marked are static. This run is purely to collect all Tensor input sizes in the program.
2. torch/_dynamo/output_graph.py: At the end of the first Dynamo trace run, we perform a compiler collective to distribute all Tensor input sizes to all ranks. Then, we restart Dynamo
3. Apply the updates in "torch/_dynamo/variables/builder.py": Now that we have all sizes for every rank, we now update frame state with the observed sizes for all ranks, in rank order. Under the assumption that frame state is consistent on all ranks, this series of updates will preserve consistency.
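
The three steps above can be sketched as a toy model. This assumes a heavily simplified frame state in which a dimension becomes dynamic (`None`) once two ranks disagree on its size, and that every rank reports the same number of dimensions; `merge_sizes_in_rank_order` is a hypothetical name, not a function in `torch._dynamo`.

```python
def merge_sizes_in_rank_order(sizes_by_rank):
    """Return one entry per dimension: an int if all ranks agree, else None.

    Hypothetical simplification of the frame-state update; applying the
    sizes in rank order means every rank computes the same result.
    """
    ranks = sorted(sizes_by_rank)
    state = list(sizes_by_rank[ranks[0]])
    for rank in ranks[1:]:
        for dim, size in enumerate(sizes_by_rank[rank]):
            if state[dim] is not None and state[dim] != size:
                state[dim] = None  # sizes differ across ranks: mark dynamic
    return state


# Sizes collected during the preflight run, keyed by rank.
observed = {1: (16, 128), 0: (8, 128), 2: (8, 128)}
frame_state = merge_sizes_in_rank_order(observed)
```

Because every rank applies the same updates in the same order, each one ends up with the same view of which dimensions are dynamic.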

For future work, it would be safer if we force a consistent hint on all ranks; this is more involved as we have to interpose in fakification.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130935
Approved by: https://github.com/jansel
2024-07-24 11:24:11 +00:00
Oguz Ulgen
eafbd20f23 Annotate all InstructionTranslator (#131509)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131509
Approved by: https://github.com/zou3519
2024-07-24 05:31:01 +00:00
Michael Lazos
9575b1afad Ensure tensor dict is populated with compiled autograd (#131556)
The issue addressed is that compiled autograd changes the calling convention of the FX graph to only have a single placeholder which contains a list of inputs. In this case, the meta of the tensor input nodes don't contain the `tensor_dict` meta. This adds them.

The context is that `tensor_dict` is used to convey if a tensor is an input with a static address.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131556
Approved by: https://github.com/anijain2305
2024-07-24 04:00:02 +00:00
Animesh Jain
6850e42266 [dynamo][exception] Remove older specialization for StopIteration (#131512)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131512
Approved by: https://github.com/yanboliang
ghstack dependencies: #131347, #131367, #131378, #131389, #131405, #131480
2024-07-24 00:06:53 +00:00
Aaron Orenstein
5a0068cc69 [BE] mypy: disallow untyped decorators (#131428)
Untyped decorators strip the types from their decorated function so even if the underlying function is fully typed then callers to it don't get any benefit from type annotations.

Step 1 - Enable the error and override in all the offending files.

#131429

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131428
Approved by: https://github.com/justinchuby, https://github.com/oulgen
2024-07-23 21:50:55 +00:00
Michael Lazos
93fdd0237d Ensure staticmethods can be allowed in graph (#130882)
Fixes https://github.com/pytorch/pytorch/issues/124735

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130882
Approved by: https://github.com/anijain2305, https://github.com/williamwen42
2024-07-23 18:59:19 +00:00
Shangdi Yu
cfb9ccab6c [export] Filter errors by exception type, add case name (#131327)
Summary:
-  Log export errors to Scuba and mark them with "classified" and "unclassified"
- Classify errors by exception type (ALLOW_LIST) and a `case_name` attribute
- Add `case_name` for some exceptions.

Test Plan:
Running the code below logs a classified error to `torch_export_usage` table in Scuba.

```
import torch

from torch._export.db.case import SupportLevel

class TorchSymMin(torch.nn.Module):
    """
    torch.sym_min operator is not supported in export.
    """

    def forward(self, x):
        return x.sum() + torch.sym_min(x.size(0), 100)

example_args = (torch.randn(3, 2),)
tags = {"torch.operator"}
support_level = SupportLevel.NOT_SUPPORTED_YET
model = TorchSymMin()

torch.export.export(model, example_args)
```

Differential Revision: D59981459

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131327
Approved by: https://github.com/zhxchen17
2024-07-23 18:01:13 +00:00
Animesh Jain
eab1595ce2 [dynamo] Delete wrong assertion in bind_args (#131405)
Fix - https://github.com/pytorch/pytorch/issues/130537

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131405
Approved by: https://github.com/williamwen42, https://github.com/yanboliang
ghstack dependencies: #131347, #131367, #131378, #131389
2024-07-23 17:28:05 +00:00
Animesh Jain
6bbef2a06b [dynamo] Support set on KeysView (#131389)
Fixes https://github.com/pytorch/pytorch/issues/129664

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131389
Approved by: https://github.com/mlazos
ghstack dependencies: #131347, #131367, #131378
2024-07-23 14:15:26 +00:00
Animesh Jain
e7c5e06772 [dynamo] Support __contains__ on __dict__ on UserDefinedClassVariable (#131378)
Fixes https://github.com/pytorch/pytorch/issues/129665

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131378
Approved by: https://github.com/mlazos
ghstack dependencies: #131347, #131367
2024-07-23 14:15:26 +00:00
Animesh Jain
0bc5e26067 [dynamo] Support dict conversion of objects derived from MutableMapping (#131367)
Fixes - https://github.com/pytorch/pytorch/issues/129662

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131367
Approved by: https://github.com/williamwen42
ghstack dependencies: #131347
2024-07-23 14:15:20 +00:00
Animesh Jain
a944cce5b8 [dynamo] Support if callable on list (#131347)
Fixes https://github.com/pytorch/pytorch/issues/130720

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131347
Approved by: https://github.com/williamwen42, https://github.com/mlazos
2024-07-23 14:15:15 +00:00
Animesh Jain
ddde9dd25c [dynamo][automatic_dynamic] Trigger dynamism on stride changes (#130232)
Fixes https://github.com/pytorch/pytorch/issues/129798

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130232
Approved by: https://github.com/ezyang
2024-07-21 03:45:54 +00:00
Animesh Jain
e49c0acc39 [dynamo] Revert https://github.com/pytorch/pytorch/pull/130416 (#131058)
All the changes brought by the original PR have been addressed in alternative ways in the stack. Explaining why the original PR has to be reverted requires more effort, because there is some bad interaction with export.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131058
Approved by: https://github.com/williamwen42
2024-07-19 17:26:24 +00:00
Michael Lazos
1b72cf0b09 Add hasattr for tensor variable (#131008)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131008
Approved by: https://github.com/anijain2305
ghstack dependencies: #131007
2024-07-19 12:43:27 +00:00
Animesh Jain
ac76dd606f [dynamo] Alternative way to skip empty hooks guards on inbuilt nn modules (#131057)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131057
Approved by: https://github.com/williamwen42, https://github.com/jansel
ghstack dependencies: #131056
2024-07-19 04:42:38 +00:00
Michael Lazos
22388ffe03 Graph break on tostring for numpy remapping (#131007)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131007
Approved by: https://github.com/williamwen42
2024-07-18 17:23:41 +00:00
PyTorch MergeBot
9f6db5d0e2 Revert "Ensure staticmethods can be allowed in graph (#130882)"
This reverts commit b0387449db.

Reverted https://github.com/pytorch/pytorch/pull/130882 on behalf of https://github.com/atalman due to failing torchrec tests internally, please fix and reland ([comment](https://github.com/pytorch/pytorch/pull/130882#issuecomment-2236528473))
2024-07-18 13:31:30 +00:00
Animesh Jain
a085acd7d6 [dynamo] Revert back changes to UnspecializedBuiltinNNModuleVariable (#130991)
xref - https://fb.workplace.com/groups/1075192433118967/permalink/1466525440652329/

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130991
Approved by: https://github.com/williamwen42, https://github.com/mlazos
2024-07-18 05:01:46 +00:00
PyTorch MergeBot
0b134c15cd Revert "Relax constraints for creating a GenericContextWrappingVariable (#129091)"
This reverts commit 882fd91869.

Reverted https://github.com/pytorch/pytorch/pull/129091 on behalf of https://github.com/clee2000 due to test_jit started failing on main after this stack https://github.com/pytorch/pytorch/actions/runs/9980754603/job/27583474357 a8bd2933d9 ([comment](https://github.com/pytorch/pytorch/pull/129091#issuecomment-2234269541))
2024-07-17 20:59:40 +00:00
Animesh Jain
65b4163bd2 [dynamo][nn-module] Make slice getitem on nn module container sourceless (#130852)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130852
Approved by: https://github.com/mlazos
ghstack dependencies: #130773
2024-07-17 20:17:08 +00:00
Guilherme Leobas
882fd91869 Relax constraints for creating a GenericContextWrappingVariable (#129091)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129091
Approved by: https://github.com/yanboliang, https://github.com/zou3519
2024-07-17 20:07:06 +00:00
Michael Lazos
b0387449db Ensure staticmethods can be allowed in graph (#130882)
Fixes https://github.com/pytorch/pytorch/issues/124735

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130882
Approved by: https://github.com/anijain2305, https://github.com/williamwen42
2024-07-17 19:18:30 +00:00
Michael Lazos
bea6762c01 Add guards on subclass metadata (#130779)
This PR adds guards in dynamo which verify the equality of tensor subclass metadata along with tests verifying the expected recompile behavior. The next PR adds the capability to override the guard behavior to possibly perform the check in a less expensive manner.

Toward fixing https://github.com/pytorch/pytorch/issues/114405

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130779
Approved by: https://github.com/anijain2305, https://github.com/bdhirsh
2024-07-17 19:13:52 +00:00
Edward Z. Yang
408c921d96 Make hashing a SymInt raise an error again (#130548)
See https://github.com/pytorch/pytorch/issues/130547

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130548
Approved by: https://github.com/Skylion007, https://github.com/albanD, https://github.com/lezcano
2024-07-16 18:30:30 +00:00
Aaron Gokaslan
53e5b8ac5b [BE]: Update flake8-comprehensions and enable C420 (#130699)
Uses `dict.fromkeys` whenever possible, as covered by flake8-comprehensions rule C420. While the ruff rule RUF025 is still in preview, flake8-comprehensions has added a new rule which covers this. Using `dict.fromkeys` is faster when the value being added to the dictionary is the same at every iteration and is immutable; it also removes an unnecessary dict comprehension.

This rule will be enabled with our current ruleset in RUF in 0.6 as C420.
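
A short sketch of the rewrite the rule asks for follows. The key names are made up for illustration.

```python
keys = ["weight", "bias", "running_mean"]

# Comprehension form flagged by C420: the same constant value for every key.
from_comprehension = {key: None for key in keys}

# Equivalent form suggested by the rule; the value defaults to None.
from_fromkeys = dict.fromkeys(keys)

# Caveat behind the "immutable" condition: dict.fromkeys stores the same
# object under every key, so a mutable value ends up shared.
shared = dict.fromkeys(keys, [])
shared["weight"].append(1)
```

After the last line, `shared["bias"]` also contains `1`, which is why the rewrite is only safe for immutable values.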

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130699
Approved by: https://github.com/lezcano, https://github.com/ezyang
2024-07-16 13:47:49 +00:00
Animesh Jain
fedae41c57 [dynamo] Do not mark nn.module containers as BuiltinNNModuleVariable (#130773)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130773
Approved by: https://github.com/williamwen42, https://github.com/mlazos
2024-07-16 06:55:46 +00:00
Michael Lazos
0d0c09702a Update mark_static_address for inlining NN modules (#130392)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130392
Approved by: https://github.com/anijain2305
ghstack dependencies: #130391
2024-07-16 00:25:29 +00:00
Michael Lazos
d8616eb66a Mark nn_module params and buffers as static in dynamo (#130391)
This PR marks all buffers and parameters of an NNModule as static using the `mark_static_address` API. As a result, when tensors are passed to AOT, the `tensor_dict` metadata of placeholder nodes will contain the `static_address_type` key, indicating which graph argument positions are static for cudagraphs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130391
Approved by: https://github.com/anijain2305
2024-07-16 00:25:23 +00:00
Alex Dennis
7d4f50de19 dynamo add support for defaultdict(set) (#130745)
Fixes #130554

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130745
Approved by: https://github.com/Skylion007
2024-07-15 22:23:33 +00:00
William Wen
3928ca2ab6 [dynamo] update call map to allow multiple input parameters (#130748)
Fixes https://github.com/pytorch/pytorch/issues/128072.

Commandeering https://github.com/pytorch/pytorch/pull/128282 since the issue is now hi pri.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130748
Approved by: https://github.com/Skylion007, https://github.com/anijain2305
2024-07-15 22:16:49 +00:00
awayzjj
dcaa111dc8 support intersection by polyfill (#130672)
Fixes https://github.com/pytorch/pytorch/issues/130557

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130672
Approved by: https://github.com/anijain2305
2024-07-14 10:44:26 +00:00
Xuehai Pan
4d7bf72d93 [BE][Easy] fix ruff rule needless-bool (SIM103) (#130206)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130206
Approved by: https://github.com/malfet
2024-07-14 08:17:52 +00:00
chilli
f9f85bfc0b [Inductor] FlexAttention supports partial masking (#130415) (#130626)
This is the new version of https://github.com/pytorch/pytorch/pull/130415

Updated test script: https://gist.github.com/yanboliang/7c34a82df611d4ea8869cb9e041bfbfc
Updated perf numbers:
```
(pt) [ybliang@devgpu002.ash8 ~/local/debug]$ CUDA_VISIBLE_DEVICES=4 python debug7.py
fwd speedup: 0.7166695598192317
bwd speedup: 0.7142133867805904
(pt) [ybliang@devgpu002.ash8 ~/local/debug]$ CUDA_VISIBLE_DEVICES=4 python debug7.py --partial-mask
fwd speedup: 0.8428246087169973
bwd speedup: 0.8486261278030254
```
Approved by: https://github.com/Chillee

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130626
Approved by: https://github.com/drisspg, https://github.com/yanboliang
2024-07-14 00:37:26 +00:00
Yidi Wu
741c1710e8 [cond] inlining into one of the branches when pred is a python constant (#130493)
Reland https://github.com/pytorch/pytorch/pull/128709.

When the input predicate is a python constant, we specialize into one of the branches and warn users that torch.cond is not preserving the dynamism. The previous behavior is that we baked in True/False in the cond operator. This can be confusing. In this PR, we change it to be specializing into one of the branches when the inputs are constants.

We additionally change the naming of cond operator to default one without overriding its name. This allows better testing on de-serialized graph.

Test Plan:
The predicate in some existing tests is the result of a shape comparison. When no dynamic shape is involved, the predicate is a python bool. To fix them, we either change the predicate to be some data-dependent tensor or change the test to check that cond is specialized as one of the branches.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130493
Approved by: https://github.com/BoyuanFeng
2024-07-12 18:02:09 +00:00
Yidi Wu
0bf9a091ec [torchbind] add tracing_mode support (#129586)
Sometimes it can be difficult to write a fake class, e.g. when the original implementation uses third-party libraries. In other cases, users are certain that the class is safe to trace with the real object.

This PR allows users to state their intention by implementing a "safe_to_trace_with_real_obj" method on their script class.

Test Plan:
`pytest test/export/test_torchbind.py -k safe`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129586
Approved by: https://github.com/zou3519
2024-07-12 18:01:47 +00:00
Tom Ritchford
b0a597fcb4 Fix #121334: graph break on constant method call (#130158)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130158
Approved by: https://github.com/lezcano
2024-07-12 17:34:46 +00:00
PyTorch MergeBot
da030e7add Revert "[Inductor] FlexAttention supports partial masking (#130415)"
This reverts commit 207564bab1.

Reverted https://github.com/pytorch/pytorch/pull/130415 on behalf of https://github.com/janeyx99 due to Windows trunk test_proxy_tensor test failures look relevant  ([comment](https://github.com/pytorch/pytorch/pull/130415#issuecomment-2225575622))
2024-07-12 13:20:18 +00:00
Yanbo Liang
207564bab1 [Inductor] FlexAttention supports partial masking (#130415)
This is the new version of #130235

Updated test script: https://gist.github.com/yanboliang/7c34a82df611d4ea8869cb9e041bfbfc
Updated perf numbers:
```
(pt) [ybliang@devgpu002.ash8 ~/local/debug]$ CUDA_VISIBLE_DEVICES=4 python debug7.py
fwd speedup: 0.7166695598192317
bwd speedup: 0.7142133867805904
(pt) [ybliang@devgpu002.ash8 ~/local/debug]$ CUDA_VISIBLE_DEVICES=4 python debug7.py --partial-mask
fwd speedup: 0.8428246087169973
bwd speedup: 0.8486261278030254
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130415
Approved by: https://github.com/Chillee
2024-07-12 07:19:28 +00:00
Michael Lazos
c101c4517a Add python type for list iterators (#130511)
Fixes https://github.com/pytorch/pytorch/issues/117026

Also not sure why this was missing

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130511
Approved by: https://github.com/williamwen42, https://github.com/yanboliang, https://github.com/anijain2305
2024-07-12 01:14:18 +00:00
Xuehai Pan
973037be6a [BE][Easy] apply autofix for ruff rules unnecessary-collection-call (C408): list() / tuple() / dict() (#130199)
This PR changes the empty collection factory call to Python literals:

- `list()` -> `[]`
- `tuple()` -> `()`
- `dict()` -> `{}`

The Python literals are more performant and safer. For example, the bytecode for building an empty dictionary:

```bash
$ python3 -m dis - <<EOS
import collections

d1 = {}
d2 = dict()

dict = collections.OrderedDict
d3 = dict()
EOS
```

```text
  0           0 RESUME                   0

  1           2 LOAD_CONST               0 (0)
              4 LOAD_CONST               1 (None)
              6 IMPORT_NAME              0 (collections)
              8 STORE_NAME               0 (collections)

  3          10 BUILD_MAP                0
             12 STORE_NAME               1 (d1)

  4          14 PUSH_NULL
             16 LOAD_NAME                2 (dict)
             18 CALL                     0
             26 STORE_NAME               3 (d2)

  6          28 LOAD_NAME                0 (collections)
             30 LOAD_ATTR                8 (OrderedDict)
             50 STORE_NAME               2 (dict)

  7          52 PUSH_NULL
             54 LOAD_NAME                2 (dict)
             56 CALL                     0
             64 STORE_NAME               5 (d3)
             66 RETURN_CONST             1 (None)
```

The dict literal `{}` only has one bytecode `BUILD_MAP`, while the factory call `dict()` has three `PUSH_NULL + LOAD_NAME + CALL`. Also, the factory call is not safe if users override the `dict` name in `locals` or `globals` (see the example of replacing with `OrderedDict` above).
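The comparison above can be reproduced without a shell heredoc. The following is a minimal sketch using only the standard `dis` module; the helper names `opnames` and `build_with_shadowed_name` are made up for illustration, and exact opcode names vary between Python versions.

```python
import dis
from collections import OrderedDict


def opnames(src):
    """Return the opcode names the compiler emits for a source snippet."""
    code = compile(src, "<demo>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


# The literal compiles to a single BUILD_MAP (plus the store).
literal_ops = opnames("d = {}")
# The factory call needs a name lookup and a call instruction instead.
call_ops = opnames("d = dict()")


def build_with_shadowed_name():
    """Show that the literal ignores a rebinding of the name `dict`."""
    dict = OrderedDict  # shadows the builtin inside this function only
    return type({}), type(dict())


if __name__ == "__main__":
    print(literal_ops)
    print(call_ops)
    print(build_with_shadowed_name())
```

The literal always builds a builtin `dict`, while the call follows whatever the name `dict` is currently bound to.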

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130199
Approved by: https://github.com/malfet
2024-07-11 17:30:28 +00:00
Animesh Jain
a833582dbb [dynamo][tuple] Optimize guard for small tuples - helps conv2d guards (#130400)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130400
Approved by: https://github.com/yanboliang, https://github.com/jansel
ghstack dependencies: #130285, #130368, #130416
2024-07-11 14:13:24 +00:00
Animesh Jain
f7d7b94017 [dynamo][unspecialized-nn-module] Distinguish between user-defined and builtin nn module (#130416)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130416
Approved by: https://github.com/jansel
ghstack dependencies: #130285, #130368
2024-07-11 14:13:24 +00:00
PyTorch MergeBot
0beeac35fa Revert "[cond] inlining into one of the branches when pred is a python constant (#128709)"
This reverts commit fe3e6878c4.

Reverted https://github.com/pytorch/pytorch/pull/128709 on behalf of https://github.com/ydwu4 due to causing an error on trunk because of a land race: fe3e6878c4 ([comment](https://github.com/pytorch/pytorch/pull/128709#issuecomment-2221104043))
2024-07-10 17:47:19 +00:00
Yidi Wu
fe3e6878c4 [cond] inlining into one of the branches when pred is a python constant (#128709)
When the input predicate is a Python constant, we specialize into one of the branches and warn users that torch.cond is not preserving the dynamism. Previously, we baked True/False into the cond operator, which can be confusing. In this PR, we instead specialize into one of the branches when the predicate is a Python constant.

We additionally change the cond operator to use its default name rather than overriding it. This allows better testing on de-serialized graphs.

Test Plan:
The predicate in some existing tests is the result of a shape comparison. When no dynamic shape is involved, the predicate is a Python bool. To fix them, we either change the predicate to be some data-dependent tensor or change the test to check that cond is specialized into one of the branches.
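As a rough illustration of the behavior described above, here is a toy sketch in plain Python. It is not the torch.cond implementation: `SymbolicPred`, `trace_cond`, and the tuple-based "graph" are hypothetical stand-ins, used only to show a constant predicate being specialized into one branch with a warning, while a data-dependent predicate keeps both branches.

```python
import warnings
from dataclasses import dataclass


@dataclass
class SymbolicPred:
    """Hypothetical placeholder for a data-dependent (non-constant) predicate."""
    name: str


def trace_cond(pred, true_fn, false_fn):
    """Return a toy 'graph' (a list of tuples) for a cond call."""
    if isinstance(pred, bool):
        # Constant predicate: only the chosen branch is kept, so warn the user.
        warnings.warn(
            "pred is a Python constant; cond is specialized into one branch",
            stacklevel=2,
        )
        chosen = true_fn if pred else false_fn
        return [("call", chosen.__name__)]
    # Non-constant predicate: keep both branches under a cond node.
    return [("cond", pred.name, true_fn.__name__, false_fn.__name__)]


def double(x):
    return x * 2


def negate(x):
    return -x
```

For example, `trace_cond(True, double, negate)` records only a call to `double`, whereas `trace_cond(SymbolicPred("p"), double, negate)` records a single cond node that references both functions.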

Differential Revision: [D59589709](https://our.internmc.facebook.com/intern/diff/D59589709)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128709
Approved by: https://github.com/zou3519
2024-07-10 16:44:27 +00:00
chilli
ce4d95143f Add scale kwarg to FlexAttention (and some changes that get FlexAttention numerics to be as accurate as FA2) (#130250)
After this PR, our numerical error is within 3% of FA2 for forward and gradients. Prior, for `dq` our numerical error was 30% higher. I also added a `PRESCALE_QK` kernel option that increases perf by about 3-4% but incurs about 20-30% more numerical error.

![image](https://github.com/pytorch/pytorch/assets/6355099/7b5ff44e-219b-4a05-8a1b-2a0182c01ab2)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130250
Approved by: https://github.com/drisspg
ghstack dependencies: #130227
2024-07-10 16:14:45 +00:00
rzou
6ce0bd7d3b [HOP] Use user directed names for variables where possible (#130271)
Afaict the previous check was too strict. Removing it passes all the
mutation tests (mutation checks happen via the TensorVariable's mutable_local).

Test Plan:
- tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130271
Approved by: https://github.com/Chillee, https://github.com/ydwu4
2024-07-10 13:59:20 +00:00
rzou
99c68f7bea Refactor TritonKernelVariable's logic so it can be shared (#130177)
TritonKernelVariable's logic tells us how to go from a user-defined
triton kernel and a grid to a call to the triton_kernel_wrapper_mutation
HOP. We want to re-use this in a setting without Dynamo; in the next PR
up, we create a new decorator (capture_triton) that, when applied to a
triton kernel, transforms a call to the triton kernel into a call
to the triton_kernel_wrapper_mutation HOP.

Test Plan:
- existing tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130177
Approved by: https://github.com/oulgen, https://github.com/ydwu4
2024-07-10 03:09:29 +00:00
PyTorch MergeBot
44815ed67e Revert "Add scale kwarg to FlexAttention (and some changes that get FlexAttention numerics to be as accurate as FA2) (#130250)"
This reverts commit 3e48d92733.

Reverted https://github.com/pytorch/pytorch/pull/130250 on behalf of https://github.com/izaitsevfb due to depends on #130227 which needs to be reverted ([comment](https://github.com/pytorch/pytorch/pull/130250#issuecomment-2218840674))
2024-07-09 22:32:54 +00:00
PyTorch MergeBot
3be4922a9d Revert "[HOP] Use user directed names for variables where possible (#130271)"
This reverts commit adb65682af.

Reverted https://github.com/pytorch/pytorch/pull/130271 on behalf of https://github.com/clee2000 due to broke inductor/test_flex_attention https://github.com/pytorch/pytorch/actions/runs/9863205414/job/27236960046 adb65682af Test not run on PR due to bad TD ([comment](https://github.com/pytorch/pytorch/pull/130271#issuecomment-2218832643))
2024-07-09 22:24:39 +00:00
rzou
adb65682af [HOP] Use user directed names for variables where possible (#130271)
Afaict the previous check was too strict. Removing it passes all the
mutation tests (mutation checks happen via the TensorVariable's mutable_local).

Test Plan:
- tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130271
Approved by: https://github.com/Chillee, https://github.com/ydwu4
ghstack dependencies: #130255, #130268
2024-07-09 19:42:52 +00:00
chilli
3e48d92733 Add scale kwarg to FlexAttention (and some changes that get FlexAttention numerics to be as accurate as FA2) (#130250)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130250
Approved by: https://github.com/drisspg
ghstack dependencies: #130160, #130106, #130224, #130227
2024-07-09 09:24:06 +00:00
rzou
f2c9f0c0db [HOP] improve naming for subgraph inputs (#130255)
Previously, subgraph input names were whatever the input proxies were,
which were confusing. This PR changes those names to be
whatever the names of the arguments the functions being
speculate_subgraph'ed are. This is best-effort: if we can't figure it
out then we go back to the previous strategy.

Test Plan:
- existing expecttests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130255
Approved by: https://github.com/ydwu4
2024-07-09 02:46:40 +00:00
Animesh Jain
c5c9dbece1 [dynamo][user-defined] Simplify and improve scope of UserDefinedObject var_getattr (#130169)
Fixes https://github.com/pytorch/pytorch/issues/122649

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130169
Approved by: https://github.com/jansel
ghstack dependencies: #118448, #130159
2024-07-08 04:10:56 +00:00
Animesh Jain
bd0252fb98 [dynamo][user-defined] Support method descriptors (#130159)
Fixes https://github.com/pytorch/pytorch/issues/120650

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130159
Approved by: https://github.com/jansel
ghstack dependencies: #118448
2024-07-06 02:03:09 +00:00
Kurt Mohler
e590168865 Enable sharing meta tensors between processes (#129520)
Fixes #129436

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129520
Approved by: https://github.com/ezyang
2024-07-04 20:29:48 +00:00
Yanbo Liang
551f3b92b2 [Dynamo] Add assertion for tensor unpack shape mismatch (#130077)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/130077
Approved by: https://github.com/Chillee
2024-07-04 09:25:08 +00:00
Animesh Jain
fa4e489d70 [dynamo][dynamic-shapes] Graph break if out shape changes on out= variants (#130074)
Fixes https://github.com/pytorch/pytorch/issues/130068

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130074
Approved by: https://github.com/ezyang
ghstack dependencies: #129913, #129914
2024-07-04 08:36:12 +00:00
Animesh Jain
a7a7363be0 [dynamo] Skip side effect tracking for c wrappers/descriptors (#129914)
Fixes PYTORCH_TEST_WITH_DYNAMO=1 pytest -vs test/test_python_dispatch.py::TestPythonDispatch::test_deepcopy_wrapper_subclass

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129914
Approved by: https://github.com/jansel
ghstack dependencies: #129913
2024-07-04 03:14:45 +00:00
Animesh Jain
da8af685ac [dynamo] Skip ID_MATCH guard on GetSetDescriptorType (#129913)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129913
Approved by: https://github.com/jansel
2024-07-04 03:14:45 +00:00
Edward Z. Yang
29c68df600 Stop immediately specializing common constants 0/1 for plain int (#128327)
Fixes https://github.com/pytorch/pytorch/issues/128319

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128327
Approved by: https://github.com/lezcano
ghstack dependencies: #129983
2024-07-03 16:41:51 +00:00
Edward Z. Yang
d7680a564b Bug fixes for disabling 0/1 specialization on plain int (#129961)
These bug fixes will be exercised in
https://github.com/pytorch/pytorch/pull/128327 but I separate them from
the actual policy change (which is more risky)

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129961
Approved by: https://github.com/lezcano
2024-07-02 23:19:48 +00:00
Yanbo Liang
1f3e2d7877 [Inductor] Rename TemplatedAttention to FlexAttention (#129859)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129859
Approved by: https://github.com/Chillee, https://github.com/drisspg
ghstack dependencies: #129831
2024-07-02 18:48:16 +00:00
Yanbo Liang
34e94c507a [Inductor] Make FlexAttention block_mask argument as tuple (#129831)
Reorganize the `block_mask`-related arguments into a tuple to reduce the number of individual arguments. I tried using a named tuple, but AOT Autograd doesn't work well with named tuples. The only downside of a plain tuple is that its elements must be accessed by index, but since this is needed in only one place, it should be fine.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129831
Approved by: https://github.com/Chillee, https://github.com/drisspg
2024-07-02 17:18:33 +00:00
Animesh Jain
9105d54c6b [dynamo][sparse] Graph break on sparse tensors (#129883)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129883
Approved by: https://github.com/ezyang
ghstack dependencies: #129830, #129858, #129857, #129881
2024-07-02 16:51:56 +00:00
Animesh Jain
53d67165c0 [dynamo] Skip FUNCTION_MATCH guards for descriptors (#129858)
Hard to write tests. This PR makes many tests in the stack pass, such as

`PYTORCH_TEST_WITH_DYNAMO=1 pytest test/test_ao_sparsity.py::TestComposability::test_convert_without_squash_mask`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129858
Approved by: https://github.com/mlazos
ghstack dependencies: #129830
2024-07-01 20:44:59 +00:00
Animesh Jain
e62073d799 [dynamo] Skip FUNCTION_MATCH on method-wrapper objects (#129830)
Fixes https://github.com/pytorch/pytorch/issues/118563

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129830
Approved by: https://github.com/jansel
2024-06-30 20:21:18 +00:00
Yanbo Liang
ec47d4d9a8 [Inductor] FlexAttention supports block sparse mask (#129216)
Benchmark script (causal mask): https://gist.github.com/yanboliang/c2010a1fd081d4e8ca94fadec9eef286
Initial perf number:
* fwd speedup: 0.44 -> 0.72
* bwd speedup: 0.38 -> 0.71

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129216
Approved by: https://github.com/Chillee
2024-06-29 04:44:38 +00:00
PyTorch MergeBot
dfd55d1714 Revert "[cond] inlining into one of the branches when pred is a python constant (#128709)"
This reverts commit 23adf166e1.

Reverted https://github.com/pytorch/pytorch/pull/128709 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is breaking one ExecuTorch test ([comment](https://github.com/pytorch/pytorch/pull/128709#issuecomment-2197806850))
2024-06-29 01:03:55 +00:00
Joel Schlosser
6897631ceb Guard on inner tensor names for traceable wrapper subclasses (#129618)
Fixes #129601

Background: it's possible that a traceable wrapper subclass will have an optional inner tensor constituent (e.g. NJT's cached min / max sequence lengths). To specify this, the subclass's `__tensor_flatten__()` impl should leave out any unspecified optional inner tensors in the returned list of `attrs`.

This PR guards on the list of inner tensor `attrs` returned in `subclass.__tensor_flatten__()[0]`.
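A minimal sketch of the idea, under loud assumptions: `ToyNested` and `make_attrs_guard` are hypothetical, plain lists stand in for inner tensors, and the guard is a simple list comparison rather than Dynamo's real guard machinery. Only the `(attrs, ctx)` return shape of `__tensor_flatten__()` is taken from the description above.

```python
class ToyNested:
    """Hypothetical wrapper; plain lists stand in for inner tensors."""

    def __init__(self, values, offsets, min_seqlen=None, max_seqlen=None):
        self.values = values
        self.offsets = offsets
        self.min_seqlen = min_seqlen  # optional inner "tensor"
        self.max_seqlen = max_seqlen  # optional inner "tensor"

    def __tensor_flatten__(self):
        # Leave unspecified optional inner tensors out of the attrs list.
        attrs = ["values", "offsets"]
        for name in ("min_seqlen", "max_seqlen"):
            if getattr(self, name) is not None:
                attrs.append(name)
        return attrs, {}


def make_attrs_guard(example):
    """Build a check that passes only for the same inner-tensor attrs."""
    expected = list(example.__tensor_flatten__()[0])

    def guard(candidate):
        return list(candidate.__tensor_flatten__()[0]) == expected

    return guard
```

A guard built from an instance without cached sequence lengths would reject an instance that has them, which in this toy model corresponds to needing a recompile.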

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129618
Approved by: https://github.com/anijain2305
2024-06-28 16:30:25 +00:00
PyTorch MergeBot
c43923a116 Revert "[Inductor] FlexAttention supports block sparse mask (#129216)"
This reverts commit b9d3cedd64.

Reverted https://github.com/pytorch/pytorch/pull/129216 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is still failing in trunk b9d3cedd64, maybe a landrace given that TD has been turned off ([comment](https://github.com/pytorch/pytorch/pull/129216#issuecomment-2196182882))
2024-06-28 05:44:46 +00:00
Yanbo Liang
b9d3cedd64 [Inductor] FlexAttention supports block sparse mask (#129216)
Benchmark script (causal mask): https://gist.github.com/yanboliang/c2010a1fd081d4e8ca94fadec9eef286
Initial perf number:
* fwd speedup: 0.44 -> 0.72
* bwd speedup: 0.38 -> 0.71

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129216
Approved by: https://github.com/Chillee
2024-06-28 01:32:54 +00:00
Will Feng
c07a799ed5 [Traceable FSDP2] Add Dynamo support for run_with_rng_state HOP (#127247)
Test command:
`pytest -rA test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_trace_run_with_rng_state`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127247
Approved by: https://github.com/bdhirsh
ghstack dependencies: #129502
2024-06-28 01:04:49 +00:00
Yidi Wu
23adf166e1 [cond] inlining into one of the branches when pred is a python constant (#128709)
When the input predicate is a python constant, we specialize into one of the branches and warn users that torch.cond is not preserving the dynamism. The previous behavior is that we baked in True/False in the cond operator. This can be confusing. In this PR, we change it to be specializing into one of the branches when the inputs are constants.

We additionally change the naming of cond operator to default one without overriding its name. This allows better testing on de-serialized graph.

Test Plan:
The predicate in some existing tests is the result of a shape comparison. When no dynamic shape is involved, the predicate is a python bool. To fix them, we either change the predicate to be some data-dependent tensor or change the test to check cond is specialized as one of the branches,

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128709
Approved by: https://github.com/zou3519
2024-06-27 20:28:50 +00:00
Angela Yi
4dcc1ceff3 [dynamo] Fakify result of delegate (#128752)
Summary: Somehow the delegate returns a real tensor result even though we pass in fake tensors. So here we need to convert the result to fake.

Test Plan: `buck2 run @//mode/dev-nosan //on_device_ai/helios/multi_zion:multi_zion_test -- -r test_single_delegate_dsp_only`

Differential Revision: D58617091

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128752
Approved by: https://github.com/ydwu4
2024-06-27 17:59:52 +00:00
PyTorch MergeBot
5ceba6a3cb Revert "[Inductor] FlexAttention supports block sparse mask (#129216)"
This reverts commit 4082759925.

Reverted https://github.com/pytorch/pytorch/pull/129216 on behalf of https://github.com/clee2000 due to broke functorch/aot_dispatch and test_proxy_tensor on windows https://github.com/pytorch/pytorch/actions/runs/9691331440/job/26743164471 4082759925 missed on PR due to bad TD ([comment](https://github.com/pytorch/pytorch/pull/129216#issuecomment-2195087274))
2024-06-27 15:57:52 +00:00
Animesh Jain
c9798d123b [dynamo][compile-time] Manually trace torch.nn.Module.parameters (#129583)
With this PR, we are not worse than no-inlining for Dynamo-only compilation time (there is a little bit of noise, so the outlier of 0.89 is probably ok here). For most of the models, we see positive numbers because of better caching in `UserDefinedObjectVariable`.

![image](https://github.com/pytorch/pytorch/assets/13822661/719d34fd-3e7f-4886-b7e0-1dbfc7141aa5)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129583
Approved by: https://github.com/jansel
2024-06-27 06:06:04 +00:00
Yanbo Liang
4082759925 [Inductor] FlexAttention supports block sparse mask (#129216)
Benchmark script (causal mask): https://gist.github.com/yanboliang/c2010a1fd081d4e8ca94fadec9eef286
Initial perf number:
* fwd speedup: 0.44 -> 0.72
* bwd speedup: 0.38 -> 0.71

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129216
Approved by: https://github.com/Chillee
2024-06-27 05:44:27 +00:00
Tugsbayasgalan Manlaibaatar
6181e65cd8 Nested tensor subclass support (#127431)
When we have nested tensor subclasses, we need to recursively flatten/unflatten in fake tensor creation and AOTAutograd. Most of the PR is a mechanical change that makes today's single-level flatten logic recursive.

Differential Revision: [D58533224](https://our.internmc.facebook.com/intern/diff/D58533224)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127431
Approved by: https://github.com/bdhirsh
2024-06-26 04:45:22 +00:00
PyTorch MergeBot
fb40ba6fc2 Revert "[Traceable FSDP2] Add Dynamo support for run_with_rng_state HOP (#127247)"
This reverts commit aa4ee2cb9e.

Reverted https://github.com/pytorch/pytorch/pull/127247 on behalf of https://github.com/ZainRizvi due to This PR seems to be causing multiple macos failures. Looks like it was merged before trunk jobs were started, which would have run those tests ([comment](https://github.com/pytorch/pytorch/pull/129414#issuecomment-2189479505))
2024-06-25 17:05:55 +00:00
Will Feng
aa4ee2cb9e [Traceable FSDP2] Add Dynamo support for run_with_rng_state HOP (#127247)
Test command:
`pytest -rA test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_trace_run_with_rng_state`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127247
Approved by: https://github.com/bdhirsh
ghstack dependencies: #129414
2024-06-25 03:13:38 +00:00
Animesh Jain
c4dd752d97 [dynamo][compile-time][inlining-inbuilt-nn-modules] Manually implement nn.Module._call_impl (#129285)
# Compile time for eager backend
## AlbertForMaskedLM
No inlining - 3.65 seconds
Inlining on main - 7.48 seconds
Inlining + this PR - 2.86 seconds

## MobileBertForMaskedLM
No inlining - 26.90 seconds
Inlining on main - 48.21 seconds
Inlining + this PR - 24.25 seconds

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129285
Approved by: https://github.com/jansel
ghstack dependencies: #129316, #129315
2024-06-25 01:31:26 +00:00
Animesh Jain
514f9279f8 [dynamo][compile-time] Manually implement nn.Module.__getattr__ to reduce compile time (#129315)
# Compile time for eager backend
## AlbertForMaskedLM
No inlining - 3.65 seconds
Inlining on main - 7.48 seconds
Inlining + this PR - 6.70 seconds

## MobileBertForMaskedLM
No inlining - 26.90 seconds
Inlining on main - 48.21 seconds
Inlining + this PR - 43.85 seconds

*Next PR in the stack makes the total compile time better/comparable to no inlining*

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129315
Approved by: https://github.com/jansel
ghstack dependencies: #129316
2024-06-25 01:31:26 +00:00
Animesh Jain
17d1723aee [dynamo][unspecialized-nn-modules] Remove dead (also incorrect) code (#129316)
This code is unused because we just inline the `.parameters` call. The code was also wrong because side-effects only track the first level of mutations. An object might not be marked as mutated if one of its child objects (like a dict) is mutated.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129316
Approved by: https://github.com/jansel
2024-06-23 03:02:27 +00:00
William Wen
79aabaf626 [3.13, dynamo] codegen PUSH_NULL when callable is codegen'd (#129172)
Significant bytecode generation API change!

The new suggested convention for generating bytecode to call a function is to wrap the instructions that push a callable onto the stack with `add_push_null`, and then call that callable with `create_call_function` with `push_null=False` (see diff for examples).

In Python 3.13, NULL is now expected to be pushed after the callable. In <=3.12, the NULL was pushed before the callable.  This change abstracts away the exact placement of the NULL, but the developer must be aware that a NULL may be needed when codegen'ing a callable.

This abstraction also reduces the need for the `push_null=True` option in `create_call_function`, which removes the need to rotate a NULL to the right place on the stack with a sequence of `SWAP` instructions.
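The version-dependent NULL placement can be observed directly with the standard `dis` module. This is a small sketch, not Dynamo code: `null_pushed_before_callable` is a hypothetical helper, and it returns `None` on interpreters that do not emit `PUSH_NULL` for this snippet (for example, versions older than 3.11).

```python
import dis
import sys


def null_pushed_before_callable(src="f()"):
    """Report whether PUSH_NULL precedes the LOAD_NAME of the callable.

    Returns None if the running interpreter does not emit both opcodes
    for this module-level call expression.
    """
    code = compile(src, "<demo>", "exec")
    ops = [ins.opname for ins in dis.get_instructions(code)]
    if "PUSH_NULL" not in ops or "LOAD_NAME" not in ops:
        return None
    return ops.index("PUSH_NULL") < ops.index("LOAD_NAME")


if __name__ == "__main__":
    print(sys.version_info[:2], null_pushed_before_callable())
```

The snippet is compiled at module level on purpose: inside a function, the callable would be loaded with `LOAD_GLOBAL`, which can carry the NULL push itself rather than emitting a separate `PUSH_NULL`.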

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129172
Approved by: https://github.com/jansel
2024-06-22 17:25:23 +00:00
Animesh Jain
c5b9ee7408 [easy][dynamo] Remove try except from call_getattr (#129217)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129217
Approved by: https://github.com/lezcano
ghstack dependencies: #129098, #129015
2024-06-21 23:56:00 +00:00
rzou
08b616281f [custom ops] Switch out references from old landing page to new landing page (#129178)
Test Plan:
- existing tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129178
Approved by: https://github.com/albanD
ghstack dependencies: #129177
2024-06-21 13:31:40 +00:00
Animesh Jain
6b5fbc544e [dynamo] Use polyfill to trace through the attributes of torch.jit.* and lru_cache_wrapper (#128336)
Earlier we were taking the vt for `obj` and then monkeypatching that `vt.source` to be `obj._torchdynamo_inline`. If one accesses `obj.attr_a`, this would cause problems because Dynamo would then search for it in `obj._torchdynamo_inline.attr_a`. This PR makes it more functional, so that we have different vts for `obj` and `obj._torchdynamo_inline`.

Fixes https://github.com/pytorch/pytorch/issues/93698

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128336
Approved by: https://github.com/jansel, https://github.com/yanboliang
ghstack dependencies: #129117
2024-06-21 07:44:44 +00:00
Animesh Jain
e8dbb45e98 [dynamo][user-defined-object] Check that object is valid (#129117)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129117
Approved by: https://github.com/yf225
2024-06-21 04:18:54 +00:00
Brian Hirsh
880e894c39 [Brian's PR #128981] fix dynamo isinstance inlining for nn.Parameter + subclasses (#129162)
This is a copy of Brian's PR https://github.com/pytorch/pytorch/pull/128981, with very small changes to work around numpy related errors.

For discussions, please see Brian's original PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129162
Approved by: https://github.com/bdhirsh
2024-06-21 03:48:10 +00:00
Animesh Jain
f2f4dde2d3 [dynamo] Remove ID_MATCH for FSDPModuleVariable (#129015)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129015
Approved by: https://github.com/yf225
ghstack dependencies: #129098
2024-06-20 19:23:32 +00:00
Brian Hirsh
8c2542623b [Traceable FSDP2] [Dynamo] Add tracing support for out-variant custom ops that return None (#129078)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129078
Approved by: https://github.com/yanboliang
2024-06-20 17:46:13 +00:00
rzou
7178b4e987 [Dynamo x torch_function] fix incorrect source (#128980)
Fixes https://github.com/pytorch/pytorch/issues/128964

The problem was that we were installing the source for a type
incorrectly.

Test Plan:
- new tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128980
Approved by: https://github.com/mlazos
2024-06-20 14:54:00 +00:00
Animesh Jain
ea47d542ca [dynamo][guards] Remove BOOL_FALSE - not needed after C++ guards (#129098)
PyDict_Size is very fast. Earlier, with Python guards, CPython would go through layers of fluff to finally call PyDict_Size. With C++ guards, it's not needed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129098
Approved by: https://github.com/jansel
2024-06-20 14:40:27 +00:00
Will Feng
ad2593cb86 [Animesh's PR #125340] [dynamo][fsdp] Track FSDPNNModuleVariable for mutations (#129045)
This is a copy of Animesh's work in https://github.com/pytorch/pytorch/pull/125340, with very small changes to the unit test. It's needed sooner for the Traceable FSDP2 work, so I copy it here and will work through landing it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129045
Approved by: https://github.com/anijain2305
2024-06-20 04:02:36 +00:00
PyTorch MergeBot
44722c6b10 Revert "[dynamo][fsdp] Dont take unspecializedNNModuleVariable path for FSDP modules (#128453)"
This reverts commit 2b28b107db.

Reverted https://github.com/pytorch/pytorch/pull/128453 on behalf of https://github.com/anijain2305 due to luca saw bad compile time ([comment](https://github.com/pytorch/pytorch/pull/128453#issuecomment-2176877667))
2024-06-18 20:09:00 +00:00
Animesh Jain
bdffd9f0c6 [export] Graph break on nn.Parameter construction (#128935)
Fixes https://github.com/pytorch/pytorch/issues/126109

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128935
Approved by: https://github.com/angelayi
2024-06-18 18:37:44 +00:00
Will Feng
e3a39d49a0 [Traceable FSDP][Compiled Autograd] Add queue_callback() support (#126366)
Adds support for `Variable._execution_engine.queue_callback()`, which is used in FSDP2.

Important tests:
- `pytest -rA test/inductor/test_compiled_autograd.py::TestCompiledAutograd::test_callback_graph_break_throws_error`
- `pytest -rA test/inductor/test_compiled_autograd.py::TestAutogradWithCompiledAutograd::test_callback_adds_callback`
- `PYTORCH_TEST_WITH_DYNAMO=1 python test/test_autograd.py -k TestAutograd.test_callback_adds_callback`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126366
Approved by: https://github.com/xmfan
2024-06-18 06:22:14 +00:00
Animesh Jain
22f1793c0a [dynamo][easy] Use LazyVariableTracker for UserDefinedObject var_getattr (#128877)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128877
Approved by: https://github.com/mlazos
ghstack dependencies: #128315, #128748
2024-06-18 02:17:56 +00:00
Edward Z. Yang
95b5ea9cde Add mark_unbacked (#128638)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128638
Approved by: https://github.com/IvanKobzarev
2024-06-17 23:39:48 +00:00
Animesh Jain
b0282071c4 [dynamo] override torch.nn.modules.activation._is_make_fx_tracing (#128748)
Discovered while inlining `MultiHeadAttention` nn Module.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128748
Approved by: https://github.com/jansel
ghstack dependencies: #128315
2024-06-17 08:49:29 +00:00
Will Feng
979edbbe12 [Traceable FSDP2] Dynamo support FSDP2 use_training_state context manager (#127854)
Improve Dynamo to support the FSDP2 `use_training_state()` context manager.

Test command:
`
pytest -rA test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_dynamo_trace_use_training_state
`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127854
Approved by: https://github.com/yanboliang
2024-06-16 08:48:52 +00:00
Animesh Jain
108adbc726 [dynamo][side effects] Raise assertion error if the object is already tracked for mutation (#128590)
This issue was pointed out by @tombousso here - https://github.com/pytorch/pytorch/pull/128269#issuecomment-2163755792

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128590
Approved by: https://github.com/mlazos
ghstack dependencies: #128715, #128269
2024-06-15 05:07:49 +00:00
Animesh Jain
7e092a62e6 [dynamo] Support weakref objects (#128533)
Fixes https://github.com/pytorch/pytorch/issues/125720

I was earlier worried that DELETE_* or STORE_* on referent values should result in a graph break, because they could invalidate the weak ref. But then @zou3519 pointed out that weakref invalidation will happen EVENTUALLY; CPython provides no guarantees about when the weakref will be invalidated (even when the user calls del x and x is the last reference).

So any code that relies on del x to invalidate the weakref of x right away is BAD code. CPython provides no guarantees. Therefore we can (ab)use this nuance and just ignore DELETE_* or STORE_* on the referent objects.

The only corner case is when Dynamo is reconstructing the weakref object. Dynamo will have a hard time being correct here, so just SKIP_FRAME in such a case. This is rare.

CPython notes
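One concrete way to see that `del x` need not invalidate a weakref right away is an object that sits in a reference cycle. The sketch below uses only the standard `gc` and `weakref` modules; `Node` and `demo` are hypothetical names, and the behavior shown is what CPython's reference counting plus cycle collector does, not something the language guarantees.

```python
import gc
import weakref


class Node:
    """Object that refers to itself, so refcounting alone cannot free it."""

    def __init__(self):
        self.me = self


def demo():
    """Return (alive after `del x`, alive after an explicit gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cycle collector from running mid-demo
    try:
        x = Node()
        ref = weakref.ref(x)
        del x  # last named reference is gone, but the cycle remains
        alive_after_del = ref() is not None
        gc.collect()  # the cycle collector frees the object
        alive_after_collect = ref() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_after_del, alive_after_collect
```

On CPython the weakref is still live immediately after `del x` and is cleared only once the collector has run.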
1) https://docs.python.org/3/library/weakref.html
2) https://docs.python.org/3/reference/datamodel.html#index-2

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128533
Approved by: https://github.com/jansel
2024-06-15 02:16:25 +00:00
Simon Fan
4b96575a09 [dynamo][aot autograd] Silently disable default saved tensor hooks during tracing (#123196)
FIXES #113263. Same idea as in https://github.com/pytorch/pytorch/pull/113417, but we need a more intrusive C API to silently nop default saved tensor hooks, in order to support user code that uses torch.autograd.disable_saved_tensors_hooks (see test_unpack_hooks_can_be_disabled). We mock the output of get_hooks while leaving push/pop untouched.

For compiled autograd, we're firing pack hooks once and unpack hooks twice right now. I'll look into this separately from this issue.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123196
Approved by: https://github.com/soulitzer
2024-06-14 20:28:08 +00:00
Laith Sakka
4c84af0f5d Fix indexing and slicing of ranges in dynamo (#128567)
Fix https://github.com/pytorch/pytorch/issues/128520
Dynamo does not handle range()[binary subscript] or range()[trinary_subscript] correctly. Right now it calls
the get_item function, which applies the subscript operation to the list [start, end, step], which is completely unrelated to what is expected.

In Python, range()[complex subscript] is another range, e.g.:
range(1, 10, 2)[1:4:1] is range(3, 9, 2)

This diff fixes index and slice applications on range.
It mimics the implementation in CPython (https://github.com/python/cpython/blob/main/Objects/rangeobject.c).
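The expected semantics can be sketched in a few lines of plain Python. This is not the Dynamo implementation; `index_range` and `slice_range` are hypothetical helpers that follow the same arithmetic as the builtin `range`, using `slice.indices` to normalize the slice against the range's length.

```python
def index_range(r, i):
    """Return r[i] computed from start/step, with negative-index support."""
    n = len(r)
    if i < 0:
        i += n
    if not 0 <= i < n:
        raise IndexError("range object index out of range")
    return r.start + i * r.step


def slice_range(r, s):
    """Return r[s] as a new range, without materializing any elements."""
    # Normalize the slice against the number of elements in r.
    start, stop, step = s.indices(len(r))
    return range(
        r.start + start * r.step,  # element at the slice's first position
        r.start + stop * r.step,   # exclusive bound, mapped the same way
        r.step * step,             # steps compose multiplicatively
    )


if __name__ == "__main__":
    print(slice_range(range(1, 10, 2), slice(1, 4, 1)))  # range(3, 9, 2)
    print(index_range(range(1, 10, 2), -1))  # 9
```

Applying the subscript to the list `[start, end, step]` instead would index into three unrelated numbers, which is the bug described above.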

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128567
Approved by: https://github.com/anijain2305
2024-06-14 16:49:49 +00:00