Commit Graph

54 Commits

Author SHA1 Message Date
Xuehai Pan
3ce352e389 [BE][PYFMT] migrate PYFMT for torch._dynamo to ruff format (#144549)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144549
Approved by: https://github.com/jansel
2025-02-28 03:03:53 +00:00
Raymond Li
21c2565f35 Document dynamo (#146736)
Many files in dynamo are currently lacking file/module-level documentation, which makes it hard to know what they do at a glance and without digging into the code. This fixes that.

Note: documentation was AI-generated and could be incorrect, please review carefully.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146736
Approved by: https://github.com/jansel, https://github.com/StrongerXi, https://github.com/anijain2305, https://github.com/zou3519
2025-02-13 00:02:21 +00:00
Aaron Orenstein
a79100ab11 PEP585 update - torch/_dynamo (#145105)
See #145101 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145105
Approved by: https://github.com/bobrenjc93
2025-01-18 20:47:11 +00:00
bobrenjc93
1fe3af2c68 Migrate from Tuple -> tuple in torch/_dynamo (#144261)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144261
Approved by: https://github.com/aorenste, https://github.com/zou3519
2025-01-10 07:45:57 +00:00
William Wen
18261e9f39 [dynamo] implement framelocals mapping as c++ object (#140063)
Implements https://github.com/pytorch/pytorch/issues/93753 - move frame local guard accessors to C++.

Before, we used dict accessors on a Python dict representing the frame's fastlocals that we manually build. We move this accessor to C++ and additionally use the fastlocal index whenever possible.

Some implementation notes:
- `FrameLocalsMapping` is now initialized as a C++ vector of `PyObject`s. We do not just use the frame's localsplus/fastlocals buffer because we also unbox cells.
- `FrameLocalsMapping` can still be converted into a Python dict representing the frame's fastlocals, but it is done lazily.
- We update `LeafGuard`, `GuardAccessor`, and `GuardManager`'s `check_nopybind` methods to accept `FrameLocalsMapping`. By default, we convert the `FrameLocalsMapping` to a Python dict and run the original `check_nopybind` on it, but in some cases, conversion is not needed.
- We add a new guard accessor `FrameLocalsGuardAccessor`, which is similar to `DictGetItemGuardAccessor` but has special handling for `FrameLocalsMapping`. We create a separate class to emphasize different use cases, but we could probably combine these two (can do in a follow up)

dynamo_guard_eval.py microbenchmark update:
- 713.2us -> 630.0us (3.10)
- 598.8us -> 530.7us (3.12)

Other followups:
- Add `FrameLocalsMapping` version for `check_verbose_nopybind` in order to match behavior between `check_nopybind` and `check_verbose_nopybind`. This can prevent difficult debugging situations where guards fail (`check_nopybind` returns false) but no guard error message is generated (`check_verbose_nopybind` succeeds).
- Rewrite the `SHAPE_ENV` guard into C++ - it is a fairly common guard that results in `FrameLocalsMapping` needing to convert to a dict

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140063
Approved by: https://github.com/jansel
ghstack dependencies: #142117, #142430
2024-12-17 18:54:27 +00:00
Ryan Guo
3141e038f0 [dynamo] Fix VariableBuilder._wrap on frozenset and enforce invariants on ConstantVariable (#141504)
Prior to this patch, we are using `ConstantVariable.create` to create VT
for frozenset objects, and intended yet failed to predicate that on all
itmes being literals (see https://github.com/pytorch/pytorch/pull/140984#discussion_r1847393736).

The code was from https://github.com/pytorch/torchdynamo/commit/7c03434 and
the original goal was to help DBR quantization, but as the new test in
this patch shows, it could lead to silent incorrectness.

Upon a closer look, this exposes some subtleties in how Dynamo handles
`ConstantVariable` and `LOAD_CONST`, so this patch both fixes the
aforementioned issue and documents, enforces, and makes explicit the
invariants around `ConstantVariable` and `LOAD_CONST` -- only immutable
objects are supported.

Specifically, this patch:
1. refine the checks for wrapping a `frozenset` object, document why we
   can't just wrap its items directly due to lack of `Sourcec` for set
   items, and use a safe workaround (`SourcelessBuilder`) to ensure
   soundness while keeping the DBR quantization support.
2. Adds more types to `common_constant_types`, thereby making
   `ConstantVariable.is_base_literal` more lenient, and strictly checks
   this property in the constructor of `ConstantVariable`.
3. Change relevant uses of `create_instruction("LOAD_CONST", ...)` to
   `create_load_const` which checks `is_safe_constant`, and makes
   developer overrides explicit by using `create_load_const_unchecked`
   when needed.
4. In a few places, use more specific `VariableTracker`, e.g.,
   `TypingVariable` rather than `ConstantVariable`, and
   `FrozensetVariable` rather than `SetVariable`.

(2) and (3) are mainly to future-proof Dynamo against bugs like (1).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141504
Approved by: https://github.com/jansel
2024-11-27 21:58:35 +00:00
Aaron Gokaslan
12e95aa4ee [BE]: Apply PERF401 autofixes from ruff (#140980)
* Automatically applies ruff rule 401. Turns loops into equivalent list comprehensions which are faster and do not leak the scope of the loop variables.
* list comprehensions not only often have better typing, but are 50+% faster than for loops on overhead. They also preserve length information etc and are better for the interpreter to optimize.
* Manually went back and made mypy happy after the change.
* Also fixed style lints in files covered by flake8 but not by pyfmt

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-11-20 17:52:07 +00:00
Michael Lazos
e41dffbedd [Dynamo] Trace enter/exit of TorchFunctionModes (#135422) (#137114)
This PR implements tracing of with contexts with TorchFunction modes which have the default enter/exit behavior (ie pushing/popping the mode)

Typically the bytecode for a context manager looks like this during a graph break:
1. graph call
2. enter context
3. unsupported code
4. exit context
5. resume call

resume fn structure:
1. enter context
2. jump
...
3. exit context

The issue with torch function modes is that side effects will replay any mutations to the torch function stack performed during tracing. So, we do not need to enter and exit around the unsupported code in the original function (doing so would result in a duplicate torch function mode entry during execution of the unsupported code), and we don't need to enter again in the resume function (the mode that was pushed from the side effects bytecode would still be on the stack).

So for torch function modes the structure of our output code is this:

1. graph call
2. mutate tf mode stack to replay mutations
4. unsupported code
5. on exception restore stack
6. resume function

Then our resume fn looks like this:

1. no-op enter torch function mode
2. jump
3.  exit tf mode

To implement the no-op enter of the torch function mode I added torch function mode in polyfill which no-op enters, but normally exits. This is needed because we still want to trace the with context in the resume function, and exit properly (the exit instructions will still be in the function, so we need to generate instructions to set up the context).

Separately from the bytecode, dynamo also tracks contexts on the block stack, which is how the SETUP_* instructions are implemented. Naturally at a graph break, we exit these block stacks to properly reset the contexts entirely, so that we can re-enter around the unsupported code soundly. However once again, in the torch function mode case, in the event of a graph we do not want to perform any exit side effects because we want to preserve the state of the mode stack as is so that we will properly update the stack with bytecode mentioned in the first section. If we exited here, dynamo would pop the mode off of the symbolic stack, and not update the true python torch function mode stack with the suffix bytecode. All in all, for torch function modes we enter exactly once, update the global torch function mode stack with side effects bytecode, re-read this stack when compiling the resume function, and exit exactly once in the resume function. This matches the semantics of eager exactly.
Approved by: https://github.com/williamwen42
ghstack dependencies: #134732, #133137, #135443, #135444

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137114
Approved by: https://github.com/yanboliang
2024-10-09 02:29:40 +00:00
PyTorch MergeBot
d34b617bb9 Revert "[Dynamo] Trace enter/exit of TorchFunctionModes (#135422) (#137114)"
This reverts commit 51bc839b94.

Reverted https://github.com/pytorch/pytorch/pull/137114 on behalf of https://github.com/huydhn due to The top of the stack has been reverted but it leaves trunk in a broken state, so I try to revert the rest of the stack ([comment](https://github.com/pytorch/pytorch/pull/137114#issuecomment-2400765603))
2024-10-08 20:33:17 +00:00
Michael Lazos
51bc839b94 [Dynamo] Trace enter/exit of TorchFunctionModes (#135422) (#137114)
This PR implements tracing of with contexts with TorchFunction modes which have the default enter/exit behavior (ie pushing/popping the mode)

Typically the bytecode for a context manager looks like this during a graph break:
1. graph call
2. enter context
3. unsupported code
4. exit context
5. resume call

resume fn structure:
1. enter context
2. jump
...
3. exit context

The issue with torch function modes is that side effects will replay any mutations to the torch function stack performed during tracing. So, we do not need to enter and exit around the unsupported code in the original function (doing so would result in a duplicate torch function mode entry during execution of the unsupported code), and we don't need to enter again in the resume function (the mode that was pushed from the side effects bytecode would still be on the stack).

So for torch function modes the structure of our output code is this:

1. graph call
2. mutate tf mode stack to replay mutations
4. unsupported code
5. on exception restore stack
6. resume function

Then our resume fn looks like this:

1. no-op enter torch function mode
2. jump
3.  exit tf mode

To implement the no-op enter of the torch function mode I added torch function mode in polyfill which no-op enters, but normally exits. This is needed because we still want to trace the with context in the resume function, and exit properly (the exit instructions will still be in the function, so we need to generate instructions to set up the context).

Separately from the bytecode, dynamo also tracks contexts on the block stack, which is how the SETUP_* instructions are implemented. Naturally at a graph break, we exit these block stacks to properly reset the contexts entirely, so that we can re-enter around the unsupported code soundly. However once again, in the torch function mode case, in the event of a graph we do not want to perform any exit side effects because we want to preserve the state of the mode stack as is so that we will properly update the stack with bytecode mentioned in the first section. If we exited here, dynamo would pop the mode off of the symbolic stack, and not update the true python torch function mode stack with the suffix bytecode. All in all, for torch function modes we enter exactly once, update the global torch function mode stack with side effects bytecode, re-read this stack when compiling the resume function, and exit exactly once in the resume function. This matches the semantics of eager exactly.
Approved by: https://github.com/williamwen42
ghstack dependencies: #134732, #133137, #135443, #135444

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137114
Approved by: https://github.com/yanboliang
2024-10-07 18:55:26 +00:00
Animesh Jain
289df45cee Revert "[Dynamo] Trace enter/exit of TorchFunctionModes (#135422)" (#136590)
This reverts commit 7743149b2b.

Reverts
* https://github.com/pytorch/pytorch/pull/135503
* https://github.com/pytorch/pytorch/pull/135502
* https://github.com/pytorch/pytorch/pull/135422

This passes this test. Earlier, the getitem would stay like a getitem in the Fx graph. But now the fake tensor propagations fails saying that .item is called. It seems that torch function is not getting triggered while fake tensor propagation.

```
import torch
from torch.nn.attention.flex_attention import BlockMask, _mask_mod_signature, _score_mod_signature, flex_attention
from torch._inductor.lowering import make_pointwise, register_lowering
from torch._inductor.virtualized import ops
from torch.nn.attention.flex_attention import create_block_mask

torch.set_default_device('cuda')

flex_attention = torch.compile(flex_attention, dynamic=False)

prefix_lengths = torch.arange(8)
def prefix_lm(b, h, q, kv):
    return prefix_lengths[b] >= kv

mask = create_block_mask(prefix_lm, 8, None, 512, 512, _compile=True)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136590
Approved by: https://github.com/Chillee
2024-09-25 21:10:43 +00:00
William Wen
ae80bce496 [dynamo] refactor resume_execution.py to use bytecode templates (#136483)
Use bytecode from template instead of hardcoding bytecode in resume_execution.py. Gets rid of a lot of Python-version dependent bytecode generation. Also makes resume_execution.py easier to support in future Python version updates.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136483
Approved by: https://github.com/jansel, https://github.com/anijain2305
2024-09-24 22:20:26 +00:00
Aaron Gokaslan
31715be72a [BE]: Update mypy to 1.11.2 (#133816)
Updates mypy to 1.11.1 to improve type inference

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133816
Approved by: https://github.com/ezyang
2024-09-16 19:44:11 +00:00
PyTorch MergeBot
3117f2cf67 Revert "[BE]: Update mypy to 1.11.2 (#133816)"
This reverts commit 55299cfc22.

Reverted https://github.com/pytorch/pytorch/pull/133816 on behalf of https://github.com/jeanschmidt due to seems to have broken https://github.com/pytorch/pytorch/actions/runs/10865710499/job/30155699792 on main ([comment](https://github.com/pytorch/pytorch/pull/133816#issuecomment-2352377684))
2024-09-16 09:11:16 +00:00
Aaron Gokaslan
55299cfc22 [BE]: Update mypy to 1.11.2 (#133816)
Updates mypy to 1.11.1 to improve type inference

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133816
Approved by: https://github.com/ezyang
2024-09-14 21:40:36 +00:00
Michael Lazos
1b9daeb240 [Dynamo] Trace enter/exit of TorchFunctionModes (#135422)
This PR implements tracing of with contexts with TorchFunction modes which have the default enter/exit behavior (ie pushing/popping the mode)

Typically the bytecode for a context manager looks like this during a graph break:
1. graph call
2. enter context
3. unsupported code
4. exit context
5. resume call

resume fn structure:
1. enter context
2. jump
...
3. exit context

The issue with torch function modes is that side effects will replay any mutations to the torch function stack performed during tracing. So, we do not need to enter and exit around the unsupported code in the original function (doing so would result in a duplicate torch function mode entry during execution of the unsupported code), and we don't need to enter again in the resume function (the mode that was pushed from the side effects bytecode would still be on the stack).

So for torch function modes the structure of our output code is this:

1. graph call
2. mutate tf mode stack to replay mutations
4. unsupported code
5. on exception restore stack
6. resume function

Then our resume fn looks like this:

1. no-op enter torch function mode
2. jump
3.  exit tf mode

To implement the no-op enter of the torch function mode I added torch function mode in polyfill which no-op enters, but normally exits. This is needed because we still want to trace the with context in the resume function, and exit properly (the exit instructions will still be in the function, so we need to generate instructions to set up the context).

Separately from the bytecode, dynamo also tracks contexts on the block stack, which is how the SETUP_* instructions are implemented. Naturally at a graph break, we exit these block stacks to properly reset the contexts entirely, so that we can re-enter around the unsupported code soundly. However once again, in the torch function mode case, in the event of a graph we do not want to perform any exit side effects because we want to preserve the state of the mode stack as is so that we will properly update the stack with bytecode mentioned in the first section. If we exited here, dynamo would pop the mode off of the symbolic stack, and not update the true python torch function mode stack with the suffix bytecode. All in all, for torch function modes we enter exactly once, update the global torch function mode stack with side effects bytecode, re-read this stack when compiling the resume function, and exit exactly once in the resume function. This matches the semantics of eager exactly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135422
Approved by: https://github.com/williamwen42
ghstack dependencies: #134732, #133137, #135443, #135444
2024-09-14 18:52:22 +00:00
PyTorch MergeBot
f3180f0088 Revert "[Dynamo] Trace enter/exit of TorchFunctionModes (#135422)"
This reverts commit 7743149b2b.

Reverted https://github.com/pytorch/pytorch/pull/135422 on behalf of https://github.com/mlazos due to broke python test/quantization/pt2e/test_numeric_debugger.py TestNumericDebugger.test_re_export_preserve_handle modified yesterday ([comment](https://github.com/pytorch/pytorch/pull/134732#issuecomment-2350937008))
2024-09-14 10:02:55 +00:00
Michael Lazos
7743149b2b [Dynamo] Trace enter/exit of TorchFunctionModes (#135422)
This PR implements tracing of with contexts with TorchFunction modes which have the default enter/exit behavior (ie pushing/popping the mode)

Typically the bytecode for a context manager looks like this during a graph break:
1. graph call
2. enter context
3. unsupported code
4. exit context
5. resume call

resume fn structure:
1. enter context
2. jump
...
3. exit context

The issue with torch function modes is that side effects will replay any mutations to the torch function stack performed during tracing. So, we do not need to enter and exit around the unsupported code in the original function (doing so would result in a duplicate torch function mode entry during execution of the unsupported code), and we don't need to enter again in the resume function (the mode that was pushed from the side effects bytecode would still be on the stack).

So for torch function modes the structure of our output code is this:

1. graph call
2. mutate tf mode stack to replay mutations
4. unsupported code
5. on exception restore stack
6. resume function

Then our resume fn looks like this:

1. no-op enter torch function mode
2. jump
3.  exit tf mode

To implement the no-op enter of the torch function mode I added torch function mode in polyfill which no-op enters, but normally exits. This is needed because we still want to trace the with context in the resume function, and exit properly (the exit instructions will still be in the function, so we need to generate instructions to set up the context).

Separately from the bytecode, dynamo also tracks contexts on the block stack, which is how the SETUP_* instructions are implemented. Naturally at a graph break, we exit these block stacks to properly reset the contexts entirely, so that we can re-enter around the unsupported code soundly. However once again, in the torch function mode case, in the event of a graph we do not want to perform any exit side effects because we want to preserve the state of the mode stack as is so that we will properly update the stack with bytecode mentioned in the first section. If we exited here, dynamo would pop the mode off of the symbolic stack, and not update the true python torch function mode stack with the suffix bytecode. All in all, for torch function modes we enter exactly once, update the global torch function mode stack with side effects bytecode, re-read this stack when compiling the resume function, and exit exactly once in the resume function. This matches the semantics of eager exactly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135422
Approved by: https://github.com/williamwen42
ghstack dependencies: #134732, #133137, #135443, #135444
2024-09-14 02:41:08 +00:00
PyTorch MergeBot
ac169795a9 Revert "[Dynamo] Trace enter/exit of TorchFunctionModes (#135422)"
This reverts commit 2af3b8ffd8.

Reverted https://github.com/pytorch/pytorch/pull/135422 on behalf of https://github.com/albanD due to Broke tests on main ([comment](https://github.com/pytorch/pytorch/pull/134732#issuecomment-2348886378))
2024-09-13 12:52:57 +00:00
Michael Lazos
2af3b8ffd8 [Dynamo] Trace enter/exit of TorchFunctionModes (#135422)
This PR implements tracing of with contexts with TorchFunction modes which have the default enter/exit behavior (ie pushing/popping the mode)

Typically the bytecode for a context manager looks like this during a graph break:
1. graph call
2. enter context
3. unsupported code
4. exit context
5. resume call

resume fn structure:
1. enter context
2. jump
...
3. exit context

The issue with torch function modes is that side effects will replay any mutations to the torch function stack performed during tracing. So, we do not need to enter and exit around the unsupported code in the original function (doing so would result in a duplicate torch function mode entry during execution of the unsupported code), and we don't need to enter again in the resume function (the mode that was pushed from the side effects bytecode would still be on the stack).

So for torch function modes the structure of our output code is this:

1. graph call
2. mutate tf mode stack to replay mutations
4. unsupported code
5. on exception restore stack
6. resume function

Then our resume fn looks like this:

1. no-op enter torch function mode
2. jump
3.  exit tf mode

To implement the no-op enter of the torch function mode I added torch function mode in polyfill which no-op enters, but normally exits. This is needed because we still want to trace the with context in the resume function, and exit properly (the exit instructions will still be in the function, so we need to generate instructions to set up the context).

Separately from the bytecode, dynamo also tracks contexts on the block stack, which is how the SETUP_* instructions are implemented. Naturally at a graph break, we exit these block stacks to properly reset the contexts entirely, so that we can re-enter around the unsupported code soundly. However once again, in the torch function mode case, in the event of a graph we do not want to perform any exit side effects because we want to preserve the state of the mode stack as is so that we will properly update the stack with bytecode mentioned in the first section. If we exited here, dynamo would pop the mode off of the symbolic stack, and not update the true python torch function mode stack with the suffix bytecode. All in all, for torch function modes we enter exactly once, update the global torch function mode stack with side effects bytecode, re-read this stack when compiling the resume function, and exit exactly once in the resume function. This matches the semantics of eager exactly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135422
Approved by: https://github.com/williamwen42
ghstack dependencies: #134732, #133137, #135443, #135444
2024-09-13 08:41:24 +00:00
Xuehai Pan
e74ba1b34a [BE][Easy][15/19] enforce style for empty lines in import segments in torch/_d*/ (#129767)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129767
Approved by: https://github.com/anijain2305
2024-07-31 21:18:11 +00:00
Xuehai Pan
973037be6a [BE][Easy] apply autofix for ruff rules unnecessary-collection-call (C408): list() / tuple() / dict() (#130199)
This PR changes the empty collection factory call to Python literals:

- `list()` -> `[]`
- `tuple()` -> `()`
- `dict()` -> `{}`

The Python literals are more performant and safer. For example, the bytecode for building an empty dictionary:

```bash
$ python3 -m dis - <<EOS
import collections

d1 = {}
d2 = dict()

dict = collections.OrderedDict
d3 = dict()
EOS
```

```text
  0           0 RESUME                   0

  1           2 LOAD_CONST               0 (0)
              4 LOAD_CONST               1 (None)
              6 IMPORT_NAME              0 (collections)
              8 STORE_NAME               0 (collections)

  3          10 BUILD_MAP                0
             12 STORE_NAME               1 (d1)

  4          14 PUSH_NULL
             16 LOAD_NAME                2 (dict)
             18 CALL                     0
             26 STORE_NAME               3 (d2)

  6          28 LOAD_NAME                0 (collections)
             30 LOAD_ATTR                8 (OrderedDict)
             50 STORE_NAME               2 (dict)

  7          52 PUSH_NULL
             54 LOAD_NAME                2 (dict)
             56 CALL                     0
             64 STORE_NAME               5 (d3)
             66 RETURN_CONST             1 (None)
```

The dict literal `{}` only has one bytecode `BUILD_MAP`, while the factory call `dict()` has three `PUSH_NULL + LOAD_NAME + CALL`. Also, the factory call is not safe if users override the `dict` name in `locals` or `globals` (see the example of replacing with `OrderedDict` above).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130199
Approved by: https://github.com/malfet
2024-07-11 17:30:28 +00:00
William Wen
79aabaf626 [3.13, dynamo] codegen PUSH_NULL when callable is codegen'd (#129172)
Significant bytecode generation API change!

The new suggested convention to generating bytecode to call a function is now to wrap instructions that push a callable to the stack with `add_push_null`, then that callable is called with `create_call_function` with `push_null=False` (see diff for examples).

In Python 3.13, NULL is now expected to be pushed after the callable. In <=3.12, the NULL was pushed before the callable.  This change abstracts away the exact placement of the NULL, but the developer must be aware that a NULL may be needed when codegen'ing a callable.

This abstraction also reduces the need for the `push_null=True` option in `create_call_function`, which removes the need to rotate a NULL to the right place on the stack with a sequence of `SWAP` instructions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129172
Approved by: https://github.com/jansel
2024-06-22 17:25:23 +00:00
Aaron Orenstein
dcfa7702c3 Flip default value for mypy disallow_untyped_defs [1/11] (#127838)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127838
Approved by: https://github.com/oulgen
2024-06-08 18:16:33 +00:00
youkaichao
36e70572d0 [Dynamo] make bytecode of resume function resemble natural bytecode (#126630)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126630
Approved by: https://github.com/williamwen42
2024-05-23 05:06:33 +00:00
William Wen
f2ab96a57e [dynamo] fix crash when context manager is passed to a function (#125321)
Fix https://github.com/pytorch/pytorch/issues/125274. Main change was to reconstruct `ContextWrappingVariables` as objects in general, but we can replace them with the class on the caller side when generating the resume function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125321
Approved by: https://github.com/jansel
2024-05-03 23:01:30 +00:00
William Wen
0506e95433 [dynamo] support inactive context managers across graph breaks (#125203)
Fix https://github.com/pytorch/pytorch/issues/124900.

When we reconstruct `ContextWrappingVariables`s, we only reconstruct the context class, not the object. Normally, contexts are active (via `with ctx:`) and we initialize the context object in the resume function. But for the case of inactive contexts (contexts declared ahead of time before the `with` block), we do not reconstruct them properly in the optimized bytecode or resume function. So this PR adds initialization for inactive contexts in the resume function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125203
Approved by: https://github.com/jansel
2024-05-01 01:49:09 +00:00
William Wen
9309580d69 [dynamo, 3.12] handle possibility of NULL local variables during graph breaks (#124095)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124095
Approved by: https://github.com/jansel
2024-04-16 08:44:43 +00:00
willfengg
d765e223ac [dynamo][PT2D] avoid skipping dynamo_resume_* in torch/testing/_internal (#123013)
this PR ensures ``dynamo_resume_`` survives ``trace_rules.py``. As a ground truth, modules defined outside of ``pytorch/torch`` folders can survive ``trace_rules.py``

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123013
Approved by: https://github.com/jansel
2024-04-01 21:12:48 +00:00
William Wen
a9b27bbbe9 [dynamo, 3.12] update jump instructions (#122530)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122530
Approved by: https://github.com/jansel
ghstack dependencies: #122146, #122335, #122354, #122355, #122356, #122449, #122455, #122456
2024-03-27 20:39:39 +00:00
William Wen
01547960bc [dynamo, 3.12] remove LOAD_METHOD, update LOAD_ATTR (#122356)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122356
Approved by: https://github.com/jansel
ghstack dependencies: #122146, #122335, #122354, #122355
2024-03-27 20:39:39 +00:00
Edward Z. Yang
514159ddcb Add torch_dynamo to resume_in for ease of debugging (#118201)
resume_in_* code objects show up in user backtraces when failures occur
in code that has been Dynamo processed.  It is obvious to me, a PT2
developer, that these are generated by PT2, but it is NOT obvious to a
non-core dev that this is happened.  Add an extra torch_dynamo
breadcrumb to help get people to the right place.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118201
Approved by: https://github.com/albanD
2024-01-25 06:52:17 +00:00
William Wen
5b671ce486 [dynamo] fix typo in 3.11 resume_execution.py (#118108)
whoopsie

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118108
Approved by: https://github.com/angelayi, https://github.com/zou3519
2024-01-24 00:59:04 +00:00
Yanbo Liang
ac4f6beb00 [Dynamo] Make resume function name more explicit by adding lineno (#115608)
Adding lineno to resume function name for easy aggregation in Scuba table.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115608
Approved by: https://github.com/jansel, https://github.com/williamwen42
2023-12-12 21:08:41 +00:00
Jez Ng
fe41a9ce08 [dynamo] Enable typechecking for resume_execution.py (#112564)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112564
Approved by: https://github.com/williamwen42, https://github.com/eellison
ghstack dependencies: #112561, #112562, #112563
2023-11-04 19:37:06 +00:00
Aaron Gokaslan
1ad0f0b308 [BE]: remove unnecessary enumerate calls (#111690)
Remove unnecessary enumerate calls, entirely automated fixes so probably reasonably low risk.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111690
Approved by: https://github.com/malfet
2023-10-20 23:20:29 +00:00
Kaichao You
d1110a18de [Dynamo]make sure resume function have valid names (#111635)
An ongoing effort for https://github.com/pytorch/pytorch/issues/111633 .

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111635
Approved by: https://github.com/ezyang, https://github.com/jansel
2023-10-20 18:54:52 +00:00
William Wen
777fc0bb58 [dynamo] fine-grained bytecode-source attribution in python 3.11 (#104676)
Since Python 3.11 bytecode contains endline and column information, for each bytecode, we attribute the source code corresponding to the bytecode in a more accurate way. For example, we can highlight a function call in a series of nested function calls, or highlight a function call spanning multiple lines.

Sample:
```python
import torch
import torch._dynamo
from functorch.experimental.control_flow import cond

def h(x):
    return x * 5

def true_fn(x):
    return x * 2

def false_fn(x):
    return x * 3

def f(pred, x):
    x = h(
        h(h(x))
    )
    x = x[1:][:2]
    torch._dynamo.graph_break()
    x = cond(pred, true_fn, false_fn, [x])

opt_f = torch.compile(f, backend="eager")
opt_f(torch.tensor(True), torch.randn(3, 3, 3, 3))
```

Output:
```
$ TORCH_LOGS="trace_call" python playground9.py
TRACE inlined call h from f /scratch/williamwen/work/pytorch/playground9.py:16
        h(h(x))
          ~^^^
TRACE FX call mul from h /scratch/williamwen/work/pytorch/playground9.py:6 (inline depth: 1)
    return x * 5
           ~~^~~
TRACE inlined call h from f /scratch/williamwen/work/pytorch/playground9.py:16
        h(h(x))
        ~^^^^^^
TRACE FX call mul_1 from h /scratch/williamwen/work/pytorch/playground9.py:6 (inline depth: 1)
    return x * 5
           ~~^~~
TRACE inlined call h from f /scratch/williamwen/work/pytorch/playground9.py:15
    x = h(
        ~^
        h(h(x))
        ^^^^^^^
    )
    ^
TRACE FX call mul_2 from h /scratch/williamwen/work/pytorch/playground9.py:6 (inline depth: 1)
    return x * 5
           ~~^~~
TRACE FX call getitem from f /scratch/williamwen/work/pytorch/playground9.py:18
    x = x[1:][:2]
        ~^^^^
TRACE FX call getitem_1 from f /scratch/williamwen/work/pytorch/playground9.py:18
    x = x[1:][:2]
        ~~~~~^^^^
TRACE inlined call true_fn from <resume in f> /scratch/williamwen/work/pytorch/playground9.py:20
    x = cond(pred, true_fn, false_fn, [x])
        ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TRACE FX call mul from true_fn /scratch/williamwen/work/pytorch/playground9.py:9 (inline depth: 1)
    return x * 2
           ~~^~~
TRACE inlined call false_fn from <resume in f> /scratch/williamwen/work/pytorch/playground9.py:20
    x = cond(pred, true_fn, false_fn, [x])
        ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TRACE FX call mul from false_fn /scratch/williamwen/work/pytorch/playground9.py:12 (inline depth: 1)
    return x * 3
           ~~^~~
TRACE FX call cond from <resume in f> /scratch/williamwen/work/pytorch/playground9.py:20
    x = cond(pred, true_fn, false_fn, [x])
        ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104676
Approved by: https://github.com/ezyang
2023-07-20 17:18:52 +00:00
Justin Chu
8a688277a2 [BE] Enable ruff's UP rules and autoformat dynamo / functorch and refs (#105432)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105432
Approved by: https://github.com/ezyang
2023-07-19 13:48:44 +00:00
Nikita Shulga
f0832914ee [Dynamo] Fix lineinfo generation on PY3.11+ (#103525)
- Replace `for inst in instructions[0:targe.offset//2]: inst.starts_line = None`, with the one that that iterates over all instructions until `inst.offset == target.offset` condition is met, this way making it uniform across Python bytecode dialects (Python-3.11+ bytecode size is variable, while bytecode size is fixed for older Pythons)
- Speedup target_index search by replacing `[i for i in instructions if i.offset == offset][0]` with `next(i for i in instructions if i.offset == offset)`, which aborts the evaluation after condition met for the first time, according to:
  ```python
   In [1]: lst=list(range(10000))

   In [2]: %time [i for i in lst if i == 10]
   CPU times: user 144 µs, sys: 23 µs, total: 167 µs
   Wall time: 168 µs
   Out[2]: [10]

   In [3]: %time next(i for i in lst if i == 10)
   CPU times: user 6 µs, sys: 0 ns, total: 6 µs
   Wall time: 9.06 µs
   Out[3]: 10
   ```
- Fix small typo
- use `is_py311_plus` variable rather than checking `sys.version_info`

<!--
copilot:poem
-->
### <samp>🤖 Generated by Copilot at 6cd7f27</samp>

> _We fix the typos in our code of doom_
> _We remove the warnings that obscure our vision_
> _We refactor the `generate` function for the dynamo_
> _We resume the execution with precision_

Fixes https://github.com/pytorch/pytorch/issues/103355

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103525
Approved by: https://github.com/Skylion007, https://github.com/williamwen42
2023-06-14 05:41:43 +00:00
William Wen
88c8c2b71b [dynamo 3.11] implement 3.11 exceptiontable (#96511)
Summary of changes:
- Add CPython exceptiontable parsing/assembling functions in torch/_dynamo/bytecode_transformation.py, based on https://github.com/python/cpython/blob/3.11/Objects/exception_handling_notes.txt.
- Add optional `exn_tab_entry` field to dynamo `Instruction`s in torch/_dynamo/bytecode_transformation.py in order to virtualize exception table entries (start, end, target instructions).
- Add checks guarding against duplicate instructions in dynamo, so that jump/exceptiontable targets are unambiguous. See `get_indexof` in torch/_dynamo/bytecode_analysis.py. Ensure that bytecode generation throughout dynamo does not generate duplicate instructions.
- Allow dynamo bytecode generation logic to generate nested exception table entries for developer convenience. CPython expects entries to not overlap, so we flatten nested entries during assembly in torch/_dynamo/bytecode_transformation.py:compute_exception_table.
- Simulate the block stack in torch/_dynamo/symbolic_convert.py. CPython removed the block stack in 3.11, but dynamo needs it in order to keep track of active contexts. So we simulate the block stack as before by looking at exceptiontable entries in order to determine the current blocks.
- Update context codegen in torch/_dynamo/resume_execution.py. The `SETUP_FINALLY` bytecode, which conveniently had a jump target to the finally block, was removed in 3.11, so we need to keep track of the jump target of the finally block using exceptiontables. Generating resume functions is more difficult since the original exceptiontable entries pointing to old cleanup code need to be modified to point to new cleanup code.
- Fix a push_null bug in torch/_dynamo/variables/functions.py introduced by https://github.com/pytorch/pytorch/pull/98699

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96511
Approved by: https://github.com/jansel, https://github.com/yanboliang, https://github.com/albanD
2023-04-18 07:53:24 +00:00
William Wen
762a2079c7 [dynamo 3.11] make create_instruction kwarg mandatory (#98032)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98032
Approved by: https://github.com/albanD
2023-03-31 18:20:51 +00:00
William Wen
cb4bc8e0f5 [dynamo 3.11] support prefix instructions MAKE_CELL, COPY_FREE_VARS, RETURN_GENERATOR, RESUME (#96506)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96506
Approved by: https://github.com/jansel
2023-03-31 18:16:17 +00:00
William Wen
06d677f41d [dynamo 3.11] fix push null timing in resume functions (#96504)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96504
Approved by: https://github.com/jansel, https://github.com/albanD
2023-03-30 20:29:49 +00:00
William Wen
24a5d006f2 [dynamo 3.11] Refactor create_instruction (#96499)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96499
Approved by: https://github.com/jansel, https://github.com/albanD
2023-03-30 17:05:27 +00:00
jon-chuang
7a192cc51c dynamo: wrap graph break inst in try except block - with context manager setup/teardown (#94758)
Replacement to https://github.com/pytorch/pytorch/pull/94672.

Follow up to https://github.com/pytorch/pytorch/pull/94137.

We simply replace the set grad mode try except blocks with one for a more generic contextmanager (using `__enter__` and `__exit__`), storing the context manager into a `symbolic_local` for the duration of the try block.

(see https://github.com/pytorch/torchdynamo/issues/207 for the original motivation)

This allows us to handle calling inner functions with graph breaks for any arbitrarily deep nesting of live context managers subclassing `AbstractContextManager`. (see tests)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94758
Approved by: https://github.com/yanboliang
2023-03-06 14:04:17 +00:00
William Wen
1123ab8647 [dynamo 3.11] changes to with contexts (#94101)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94101
Approved by: https://github.com/albanD, https://github.com/jansel
2023-02-21 18:47:36 +00:00
William Wen
055a9e45aa [dynamo 3.11] changes to LOAD_GLOBAL and function calls (#94098)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94098
Approved by: https://github.com/albanD
2023-02-21 18:47:30 +00:00
jon-chuang
d1d5d16df3 dynamo: handle straight-line graph breaks for autocast context manager with constant args (#94137)
Fixes https://github.com/pytorch/pytorch/issues/93890

We do the following:
1. fix __init__constructor for `AutocastModeVariable` with exisiting `mode` while copying
2. `resume_execution` is made aware of constant args (`target_values`), by storing said args in `ReenterWith`. To propagate between subgraphs (in straightline code), we also store the constant args in the downstream's `code_options["co_consts"]` if not already.

---

Future work:
1. handle instantiating context manager in non-inlineable functions. Simultaneously fix nested grad mode bug.
2. generalize to general `ContextManager`s
3. generalize to variable arguments passed to context manager, with guards around the variable.

---

Actually, if we look at the repro: 74592a43d0/test/dynamo/test_repros.py (L1249), we can see that the method in this PR doesn't work for graph breaks in function calls, in particular, in function calls that don't get inlined.

Why inlining functions with graph breaks is hard:
- When we handle graph breaks, we create a new code object for the remainder of the code. It's hard to imagine doing this when you are inside a function, then we need a frame stack. And we just want to deal with the current frame as a sequence of straight line codes.

Why propagating context manager information is hard:
- If we do not inline the function, the frame does not contain any information about the parent `block_stack` or `co_consts`. So we cannot store it on local objects like the eval frame. It has to be a global object in the output_graph.

---

Anyway, I'm starting to see clearly that dynamo must indeed be optimized for torch use-case. Supporting more general cases tends to run into endless corner-cases and caveats.

One direction that I see as viable to handle function calls which have graph breaks and `has_tensor_in_frame` is stick with not inlining them, while installing a global `ContextManagerManager`, similar to the `CleanupManager` (which cleans up global variables). We can know which context managers are active at any given point, so that we can install their setup/teardown code on those functions and their fragments.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94137
Approved by: https://github.com/yanboliang
2023-02-14 14:00:37 +00:00
William Wen
d567df9f36 [dynamo 3.11] remap dup/rotate to copy/swap (#93988)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93988
Approved by: https://github.com/jansel, https://github.com/albanD, https://github.com/mlazos
2023-02-14 04:25:14 +00:00