Commit Graph

118 Commits

Author SHA1 Message Date
Animesh Jain
0c3a1c893e [dynamo] Setup the globals for guard_fn without a reference to f_locals (#118447)
UPDATE - I changed the PR because, from discussion with @jansel, it was clear that someone else was holding on to a reference to f_locals. This PR now solves that problem first. I removed the eval_frame.c part because it was failing tests that use `exec` or `eval` with weird errors like `no locals found when storing 'math'`. I will debug that in a separate PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118447
Approved by: https://github.com/Skylion007, https://github.com/jansel
ghstack dependencies: #118975, #118420
2024-02-05 05:39:39 +00:00
Taras Tsugrii
41b63b26c2 [dynamo] Fix incorrect docstring placements in _guards.py. (#119114)
The incorrect placement makes these docstrings unavailable to `help` and other tools that access them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119114
Approved by: https://github.com/kit1980
2024-02-03 06:25:54 +00:00
lezcano
eb2bdfae88 Make variables in dict LazyTrackers (not lazily guarded yet) and avoid using DICT_KEYS guard (#117625)
Make variables in dict lazy and remove DICT_KEYS guard.

We build the keys of a dict depth-first and we rely on the guards of
each element in the dict to create the correct guards. This allows us to
remove the rather buggy DICT_KEYS guard and make the guard lazy.
The guards are not completely lazy yet, as we instantiate them in
`_HashableTracker._eq_impl` but it should be possible to make them
truly lazy.

Also, adding new types to the supported types within keys should be less
error prone.

This is marginally less efficient when we graph break, but in turn we
should graph break much less. It also makes the dict code easier to maintain
(removes `is_hashable_python_var`).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117625
Approved by: https://github.com/jansel, https://github.com/peterbell10, https://github.com/anijain2305
ghstack dependencies: #117982, #118098, #117983
2024-02-02 14:38:08 +00:00
Edward Z. Yang
46712b019d Enable local_partial_types (#118467)
When using dmypy, this setting is enabled and cannot be turned off. Force it for regular mypy too.
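
As a rough illustration of what this setting changes (a sketch based on mypy's documented `local_partial_types` option; the variable and function names are hypothetical and not taken from the PR): a variable initialized to `None` at module level can no longer have its type completed by an assignment in another scope, so it needs an explicit annotation.

```python
from typing import Optional

# Without an annotation, `_cache = None` would be a "partial type" that mypy
# tries to complete from a later assignment. With local_partial_types enabled,
# that completion may not cross scopes, so the annotation is spelled out.
_cache: Optional[dict] = None  # hypothetical module-level variable


def get_cache() -> dict:
    """Create the cache on first use and return the same object afterwards."""
    global _cache
    if _cache is None:
        _cache = {}
    return _cache
```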

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118467
Approved by: https://github.com/Skylion007
ghstack dependencies: #118414, #118418, #118432
2024-01-28 13:38:22 +00:00
voznesenskym
081c5b3adc Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926) (#114526)
Summary:

The primary problem we are setting out to solve here is fake tensor freshness. Before this PR, fake tensors after dynamo represented fake tensors *at the end* of trace, so subsequent retraces like aot_autograd would start off with fake tensors in the wrong (end result) state, rather than their expected fresh state. The solution here is to start a fresh fake mode, and re-fakify the tensors. The nuance comes from ensuring that symbols are uniformly created for the symbolic sizes and strides of the tensor.

This PR is the result of *a lot* of back and forth with ezyang and eellison. Initially, the first pass at this was not super different from what we have in the PR - the broad strokes were the same:

1) We cache source->symbol in shape_env
2) We pass policy objects around, stored at dynamo fakification time, and reused for later fakification
3) We create a new fake mode for backends
(from https://github.com/pytorch/pytorch/pull/113605/files)

This is ugly, and has some layering violations. We detoured our decision making through a few other alternatives. Immutable/mutable fake tensor mode was the most interesting alternative, https://github.com/pytorch/pytorch/pull/113653, and was struck down over concerns about complexity in fake mode, combined with it not covering all edge cases. We also detoured on what to do about tensor memoization returning potentially different tensors than requested, and on whether that was an anti-pattern (it is) that we wanted to hack in with the symbol cache (we don't).

We went back to the drawing board here, but with a few concessions:
1) the cache for source->symbol must live outside of shape_env, for both lifecycle and layering reasons
2) A good amount of work needs to be done to pipe policy around fake_mode and meta_utils correctly, to cover all the cases (ezyang did this)

cc penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 aakhundov kadeng

imported-using-ghimport

Test Plan: Imported from OSS

Reviewed By: huydhn, Chillee

Differential Revision: D51566250

Pulled By: voznesenskym

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114526
Approved by: https://github.com/Chillee, https://github.com/huydhn
2023-11-26 23:40:32 +00:00
PyTorch MergeBot
2f3beb715c Revert "Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926)"
This reverts commit 2ca1119d53.

Reverted https://github.com/pytorch/pytorch/pull/113926 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/113926#issuecomment-1822713852))
2023-11-22 12:52:33 +00:00
voznesenskym
2ca1119d53 Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926)
The primary problem we are setting out to solve here is fake tensor freshness. Before this PR, fake tensors after dynamo represented fake tensors *at the end* of trace, so subsequent retraces like aot_autograd would start off with fake tensors in the wrong (end result) state, rather than their expected fresh state. The solution here is to start a fresh fake mode, and re-fakify the tensors. The nuance comes from ensuring that symbols are uniformly created for the symbolic sizes and strides of the tensor.

This PR is the result of *a lot* of back and forth with @ezyang and @eellison. Initially, the first pass at this was not super different from what we have in the PR - the broad strokes were the same:

1) We cache source->symbol in shape_env
2) We pass policy objects around, stored at dynamo fakification time, and reused for later fakification
3) We create a new fake mode for backends
(from https://github.com/pytorch/pytorch/pull/113605/files)

This is ugly, and has some layering violations. We detoured our decision making through a few other alternatives. Immutable/mutable fake tensor mode was the most interesting alternative, https://github.com/pytorch/pytorch/pull/113653, and was struck down over concerns about complexity in fake mode, combined with it not covering all edge cases. We also detoured on what to do about tensor memoization returning potentially different tensors than requested, and on whether that was an anti-pattern (it is) that we wanted to hack in with the symbol cache (we don't).

We went back to the drawing board here, but with a few concessions:
1) the cache for source->symbol must live outside of shape_env, for both lifecycle and layering reasons
2) A good amount of work needs to be done to pipe policy around fake_mode and meta_utils correctly, to cover all the cases (@ezyang did this)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113926
Approved by: https://github.com/ezyang, https://github.com/eellison
2023-11-20 23:06:37 +00:00
Jez Ng
5b95715bc0 Make {Tracing,Compile}Context.get() return non-optional type (#113535)
They are used in many contexts that don't actually check if the returned
type is `None`. I have also created `try_get()` for the cases where we
do actually want an Optional type returned.
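
The `get()` / `try_get()` split described above can be sketched roughly as follows. `ExampleContext` is a hypothetical stand-in, not the real `TracingContext` or `CompileContext`, and raising `RuntimeError` when no context is active is an assumption made for illustration.

```python
from typing import Optional


class ExampleContext:
    """Hypothetical stand-in; not the real TracingContext or CompileContext."""

    _current: Optional["ExampleContext"] = None

    @staticmethod
    def try_get() -> Optional["ExampleContext"]:
        # Callers that can handle "no active context" use this variant.
        return ExampleContext._current

    @staticmethod
    def get() -> "ExampleContext":
        # Callers that assume an active context get a non-optional type,
        # and a clear error instead of a later AttributeError on None.
        ctx = ExampleContext._current
        if ctx is None:
            raise RuntimeError("no ExampleContext is currently active")
        return ctx
```

With this shape, type checkers no longer require a `None` check at every `get()` call site, while code that genuinely tolerates a missing context opts in through `try_get()`.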

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113535
Approved by: https://github.com/ezyang
ghstack dependencies: #113412
2023-11-14 04:31:12 +00:00
Jez Ng
a8cf04fd2a [inductor] Make {output_graph,pad_mm}.py pass follow_imports typechecking (#113413)
I changed OutputGraph.nn_modules' type to `Dict[str, Any]` because it
seems that `register_attr_or_module` can populate it with essentially
any type.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113413
Approved by: https://github.com/Skylion007
2023-11-11 22:15:46 +00:00
Jez Ng
b0ede09682 [inductor] Make pattern_matcher.py pass follow_imports typechecking (#113409)
Import following reveals that a good number of hints were wrong...

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113409
Approved by: https://github.com/Skylion007
2023-11-10 19:58:08 +00:00
Jason Ansel
9664190952 [dynamo] Eagerly install guards (#111415)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111415
Approved by: https://github.com/voznesenskym
ghstack dependencies: #111306
2023-11-07 19:55:19 +00:00
Jason Ansel
4b8a5e1854 [dynamo] Remove VariableTracker.as_specialized (#112363)
My local testing can't seem to find this function actually doing anything.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112363
Approved by: https://github.com/yanboliang
2023-10-30 20:07:55 +00:00
Peter Bell
bbd5b935e4 Use pytree.tree_leaves everywhere (#112324)
This changes all the instances I could find of `tree_flatten(...)[0]` or
`x, _ = tree_flatten` to use `tree_leaves`.
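
To make the relationship between the two calls concrete, here is a toy sketch. `toy_tree_flatten` and `toy_tree_leaves` are hypothetical helpers written for this illustration; they are not the `torch.utils._pytree` implementations and handle only lists, tuples, and dicts.

```python
from typing import Any, List, Tuple


def toy_tree_flatten(tree: Any) -> Tuple[List[Any], Any]:
    """Toy flatten: returns (leaves, spec). Not the torch implementation."""
    if isinstance(tree, (list, tuple)):
        leaves, specs = [], []
        for child in tree:
            child_leaves, child_spec = toy_tree_flatten(child)
            leaves.extend(child_leaves)
            specs.append(child_spec)
        return leaves, (type(tree).__name__, specs)
    if isinstance(tree, dict):
        leaves, specs = [], []
        for key, child in tree.items():
            child_leaves, child_spec = toy_tree_flatten(child)
            leaves.extend(child_leaves)
            specs.append((key, child_spec))
        return leaves, ("dict", specs)
    return [tree], "leaf"


def toy_tree_leaves(tree: Any) -> List[Any]:
    """Toy leaves-only traversal: no spec is built or returned."""
    if isinstance(tree, (list, tuple)):
        return [leaf for child in tree for leaf in toy_tree_leaves(child)]
    if isinstance(tree, dict):
        return [leaf for child in tree.values() for leaf in toy_tree_leaves(child)]
    return [tree]


nested = {"a": [1, 2], "b": (3, {"c": 4})}
# Both spellings yield the same leaves; the second never builds a spec.
assert toy_tree_flatten(nested)[0] == toy_tree_leaves(nested)
```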

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112324
Approved by: https://github.com/lezcano
ghstack dependencies: #112327, #112323
2023-10-30 03:39:04 +00:00
lezcano
c8a5bb451e Do not import sympy within torch._prims_common (#112034)
This is the first of a few PRs that avoid importing SymPy at import time.
The pitch here is that we (almost!) do not have SymPy on our API, so
this should be feasible.

This should speed up torch imports by a good 15% as per
https://dev-discuss.pytorch.org/t/delving-into-what-happens-when-you-import-torch/1589

In this PR we just move a few global imports into local imports.
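
A minimal sketch of the global-to-local import move, assuming a hypothetical function `exact_ratio` and using the standard-library `fractions` module as a stand-in for a heavy dependency such as SymPy:

```python
def exact_ratio(value: float) -> str:
    # Local import: `fractions` stands in for a heavy dependency such as SymPy.
    # It is loaded the first time this function runs, not when the enclosing
    # module is imported.
    from fractions import Fraction

    return str(Fraction(value))
```

After the first call the module sits in `sys.modules`, so repeated calls only pay for a dictionary lookup.
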
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112034
Approved by: https://github.com/ezyang
2023-10-26 12:53:25 +00:00
voznesenskym
9455af58b5 [easy][dynamo] Cleanup guard builder selection (#111723)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111723
Approved by: https://github.com/jon-chuang, https://github.com/jansel
2023-10-21 10:48:32 +00:00
Animesh Jain
58637c4b43 [dynamo] Remove SuperSource (#110475)
The motivation for removing this is already present in the pre-PR comments. Copying it here:

~~~
# NB - SuperSource is a weird one.
# it is our only source with 2 bases, so we use the object
# as the base, rather than the type, since an invocation
# like super(Foo, foo) is represented here, the source object base is more spiritually
# aligned with the instance, rather than the type.
# This whole construction is questionable tho, and we should probably find a way to
# avoid this exception to our otherwise nice source parentage invariant.
~~~

Instead of using super(a, b), we can use `type(b).__mro__[index]`.
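
The equivalence can be sketched in plain Python. The classes and the `super_lookup` helper below are hypothetical and only illustrate how a `super(a, b)` attribute lookup corresponds to walking `type(b).__mro__` from the entry after `a`; this is not Dynamo's code.

```python
class Base:
    def greet(self) -> str:
        return "Base"


class Left(Base):
    def greet(self) -> str:
        return "Left"


class Right(Base):
    def greet(self) -> str:
        return "Right"


class Child(Left, Right):
    pass


def super_lookup(cls: type, obj: object, name: str):
    """Hypothetical helper mimicking `getattr(super(cls, obj), name)`."""
    mro = type(obj).__mro__
    start = mro.index(cls) + 1
    # super() searches the classes *after* `cls` in the instance's MRO.
    for index in range(start, len(mro)):
        if name in mro[index].__dict__:
            # Bind the found function to the instance, as super() would.
            return mro[index].__dict__[name].__get__(obj, type(obj))
    raise AttributeError(name)


child = Child()
assert super_lookup(Left, child, "greet")() == super(Left, child).greet()
```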

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110475
Approved by: https://github.com/jansel
2023-10-08 04:45:06 +00:00
chilli
005e8ddcb9 cache the hash construction on Guard (#110464)
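
One common way to cache hash construction, sketched with a hypothetical `ExampleGuard` class (the field names are illustrative, and the sketch assumes the hashed fields are not mutated after the first `hash()` call; the PR's actual change may differ):

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=True)
class ExampleGuard:
    """Hypothetical stand-in; field names are illustrative only."""

    name: str
    source: str
    # Cached hash; excluded from __init__, repr, and equality comparison.
    _hash: Optional[int] = field(default=None, init=False, repr=False, compare=False)

    def __hash__(self) -> int:
        # Compute the hash once and reuse it on later calls.
        if self._hash is None:
            self._hash = hash((self.name, self.source))
        return self._hash
```
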
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110464
Approved by: https://github.com/zou3519, https://github.com/voznesenskym
2023-10-04 04:49:18 +00:00
Edward Yang
88600e7d2e [RELAND] Force synced KJT to trace unbacked SymInt (#108960) (#109216)
Summary:

The basic concept behind this diff is to modify Dynamo's tracing behavior when it encounters a KeyedJaggedTensor that is synced (aka has `_length_per_key` and `_offset_per_key` populated). These fields are lists of integers; ordinarily, Dynamo will optimistically try to specialize on integers, however, for KJTs, we know that these integers will definitely vary from run-to-run. Furthermore, ordinarily, we would also specialize these integers if they are 0/1, but we will frequently expect features in KJTs to be 0/1.

The fix is to detect KJTs and treat these integers as *unbacked integers*. This is NOT a universally sound optimization: when treating these integers as unbacked, we never report them as equal to zero or one. In return, we always generate graphs that generalize no matter the length of values on features. This is enough to trace through APS sparse arch, torchrec_dlrm and some small split-cat examples.

The special integer behavior is triggered by a dynamically scoped `force_unspec_int_unbacked_size_like` variable on TracingContext, which we trigger when we wrap a KJT. There probably are other ways to do this, but this was simple and worked.
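
A dynamically scoped flag of this kind can be sketched with a context manager. `ExampleTracingContext`, `force_unbacked`, and `forcing_unbacked` are hypothetical names for illustration and do not reproduce the real `TracingContext` API.

```python
import contextlib
from typing import Iterator


class ExampleTracingContext:
    """Hypothetical stand-in for a tracing context object."""

    def __init__(self) -> None:
        self.force_unbacked = False  # illustrative flag name


@contextlib.contextmanager
def forcing_unbacked(ctx: ExampleTracingContext) -> Iterator[None]:
    # Set the flag for the duration of the `with` block, then restore the
    # previous value even if the body raises.
    previous = ctx.force_unbacked
    ctx.force_unbacked = True
    try:
        yield
    finally:
        ctx.force_unbacked = previous
```

Code that wraps integers can then consult the flag on the context instead of receiving an extra argument through every call in between.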

Test Plan:
```
buck2 test mode/dev-nosan //pytorch/benchmark/fb/test_gpu:run_test_gpu
```

from aakhundov

1. first build feed_lower_benchmark:
```
buck2 build --show-output mode/opt -c python.package_style=inplace -c fbcode.enable_gpu_sections=true -c fbcode.platform=platform010 -c fbcode.split-dwarf=true hpc/new/models/feed/benchmark:feed_lower_benchmark
```
2. then run the lowering of the model with it:
```
TORCHINDUCTOR_MAX_AUTOTUNE=1 TORCHINDUCTOR_UNIQUE_KERNEL_NAMES=1 TORCH_LOGS="output_code,graph_code" TORCH_COMPILE_DEBUG=1 ../buck-out/v2/gen/fbcode/79c6b019ee0f9469/hpc/new/models/feed/benchmark/__feed_lower_benchmark__/feed_lower_benchmark.par --load=manifold://ig_inference_model/tree/user/facebook/fblearner/predictor/960999465/60/gpu_lowering/input.predictor --skip-trt --skip-ait --sync-mode=0 --enable-aot-inductor --lower-presets="ig_stories" --gpu-trace
```
cf https://docs.google.com/document/d/1yD30xYrdmM8r2HTdmXnZTg0-MHVexfVrAa0294m1AUE/edit?pli=1#heading=h.qiv3fp7e6zg0

From torchrec: https://www.internalfb.com/intern/wiki/Torchrec/Development/Testing_production_models/

From ge0405
baseline (without your diff): f477293168
your diff: f477292363

```
buck2 test //caffe2/test/dynamo:test_dynamo_torchrec
buck2 run 'fbcode//mode/opt' fbcode//pytorch/benchmark/fb/test_gpu:run_test_gpu -- 'pytorch.benchmark.fb.test_gpu.test_gpu.TestBenchmarkFbGpu.test_train_blue_reels_vdd_v3_inductor_speedup'
```

Differential Revision: D49236757

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109216
Approved by: https://github.com/voznesenskym
2023-09-18 14:39:44 +00:00
PyTorch MergeBot
1d32c9c7f2 Revert "Force synced KJT to trace unbacked SymInt (#108960)"
This reverts commit f9a250c35b.

Reverted https://github.com/pytorch/pytorch/pull/108960 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/108960#issuecomment-1715850779))
2023-09-12 14:37:36 +00:00
Edward Yang
f9a250c35b Force synced KJT to trace unbacked SymInt (#108960)
Summary:
The basic concept behind this diff is to modify Dynamo's tracing behavior when it encounters a KeyedJaggedTensor that is synced (aka has `_length_per_key` and `_offset_per_key` populated). These fields are lists of integers; ordinarily, Dynamo will optimistically try to specialize on integers, however, for KJTs, we know that these integers will definitely vary from run-to-run. Furthermore, ordinarily, we would also specialize these integers if they are 0/1, but we will frequently expect features in KJTs to be 0/1.

The fix is to detect KJTs and treat these integers as *unbacked integers*. This is NOT a universally sound optimization: when treating these integers as unbacked, we never report them as equal to zero or one. In return, we always generate graphs that generalize no matter the length of values on features. This is enough to trace through APS sparse arch, torchrec_dlrm and some small split-cat examples.

The special integer behavior is triggered by a dynamically scoped `force_unspec_int_unbacked_size_like` variable on TracingContext, which we trigger when we wrap a KJT. There probably are other ways to do this, but this was simple and worked.

Test Plan:
```
buck2 test mode/dev-nosan //pytorch/benchmark/fb/test_gpu:run_test_gpu
```

from aakhundov

1. first build feed_lower_benchmark:
```
buck2 build --show-output mode/opt -c python.package_style=inplace -c fbcode.enable_gpu_sections=true -c fbcode.platform=platform010 -c fbcode.split-dwarf=true hpc/new/models/feed/benchmark:feed_lower_benchmark
```
2. then run the lowering of the model with it:
```
TORCHINDUCTOR_MAX_AUTOTUNE=1 TORCHINDUCTOR_UNIQUE_KERNEL_NAMES=1 TORCH_LOGS="output_code,graph_code" TORCH_COMPILE_DEBUG=1 ../buck-out/v2/gen/fbcode/79c6b019ee0f9469/hpc/new/models/feed/benchmark/__feed_lower_benchmark__/feed_lower_benchmark.par --load=manifold://ig_inference_model/tree/user/facebook/fblearner/predictor/960999465/60/gpu_lowering/input.predictor --skip-trt --skip-ait --sync-mode=0 --enable-aot-inductor --lower-presets="ig_stories" --gpu-trace
```
cf https://docs.google.com/document/d/1yD30xYrdmM8r2HTdmXnZTg0-MHVexfVrAa0294m1AUE/edit?pli=1#heading=h.qiv3fp7e6zg0

From torchrec: https://www.internalfb.com/intern/wiki/Torchrec/Development/Testing_production_models/

From ge0405
baseline (without your diff): f477293168
your diff: f477292363

Differential Revision: D49019987

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108960
Approved by: https://github.com/voznesenskym
2023-09-12 03:44:24 +00:00
Edward Z. Yang
66f67d9a25 Print restart attempt as part of Dynamo log context (#108864)
Now looks like:

```
[2023-09-08 06:04:48,532] [0/0] torch._dynamo.symbolic_convert: [DEBUG] TRACE STORE_ATTR foo [ConstantVariable(int), NNModuleVariable()]
[2023-09-08 06:04:48,532] [0/0] torch._dynamo.convert_frame: [INFO] Restarting analysis due to _dynamo/variables/nn_module.py:138 in convert_to_unspecialized
[2023-09-08 06:04:48,533] [0/0_1] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing f /data/users/ezyang/c/pytorch/a.py:6
[2023-09-08 06:04:48,533] [0/0_1] torch._dynamo.symbolic_convert.__trace_source: [DEBUG] TRACE starts_line f /data/users/ezyang/c/pytorch/a.py:6
```

I'm happy to bikeshed the exact formatting of the attempt number if you
want.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108864
Approved by: https://github.com/mlazos, https://github.com/voznesenskym
2023-09-08 23:00:19 +00:00
Huy Do
5a4fe05a15 Revert "Force synced KJT to trace unbacked SymInt (#107788)" (#108684)
This reverts commit 3b92ef814d.  So let's manually revert it instead.

(Not sure why the bot doesn't work on https://github.com/pytorch/pytorch/pull/107788)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108684
Approved by: https://github.com/ezyang
2023-09-06 19:15:45 +00:00
Edward Z. Yang
3b92ef814d Force synced KJT to trace unbacked SymInt (#107788)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107788
Approved by: https://github.com/voznesenskym
2023-09-06 03:18:26 +00:00
Animesh Jain
a506d0ad8f [dynamo] Store originating source in the Guard object (#107634)
Many times, I find myself wanting to know the source for the guard. This PR adds that as a field of guard itself.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107634
Approved by: https://github.com/voznesenskym
ghstack dependencies: #107622
2023-08-22 02:16:31 +00:00
Edward Z. Yang
8292b03c47 Use fast traceback for symbolic shapes (#107439)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107439
Approved by: https://github.com/voznesenskym
ghstack dependencies: #107505, #107516, #107530, #107532, #107562, #107471
2023-08-22 01:03:13 +00:00
Edward Z. Yang
8316affc45 Add frame/recompile counter to all log messages in tracing context (#107530)
All log messages that occur while running Dynamo compilation now have `[X/Y]` added to the beginning of their message. X represents the frame being compiled, while Y says which compilation of the frame. For example, if you are debugging a frame that is repeatedly recompiling, you can look for N/0, N/1, N/2, etc. for the same N.  Here is what the logs look like as you transition from one frame to another:

[Screenshot of the log output: https://github.com/pytorch/pytorch/assets/13564/4897e368-1e50-4807-b342-54e911bcf087]

To accurately get this prefix added to all messages, I had to expand the scope of the `tracing` context manager. Its scope now coincides with `log_compilation_event`. To do this, I had to populate fake mode lazily in the TracingContext, since it isn't created until later, inside the OutputGraph.

This subsumes the previous X.Y logging that was solely for dynamic shapes.
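
As a rough sketch of the idea using only the standard `logging` module (this is not how `torch._logging` implements it; `CompileIdFilter` and the logger name are hypothetical), a filter can prepend an `[X/Y]` prefix to every record:

```python
import io
import logging


class CompileIdFilter(logging.Filter):
    """Hypothetical filter that prefixes messages with [frame/recompile]."""

    def __init__(self, frame_id: int, recompile_id: int) -> None:
        super().__init__()
        self.prefix = f"[{frame_id}/{recompile_id}]"

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = f"{self.prefix} {record.msg}"
        return True  # never drop the record, only annotate it


stream = io.StringIO()
handler = logging.StreamHandler(stream)
logger = logging.getLogger("example.compile")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)
logger.addFilter(CompileIdFilter(frame_id=3, recompile_id=1))
logger.info("tracing %s", "f")
```

Filtering the captured output for `[3/` would then show every compilation attempt of frame 3 in order.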

Unfortunately I had to reindent some stuff. Review the diff with whitespace off.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107530
Approved by: https://github.com/anijain2305
ghstack dependencies: #107505, #107516
2023-08-21 13:02:12 +00:00
Edward Z. Yang
d6d485fa8c Revamp guard debug logging (#107505)
The new guard printout looks like this:

```
[DEBUG] GUARDS:
[DEBUG]   ___check_type_id(L['name'], 7605632)                          # if name == "special_attr":  # test/dynamo/test_misc.py:1155 in __getattribute__
[DEBUG]   L['name'] == '_backward_pre_hooks'                            # if name == "special_attr":  # test/dynamo/test_misc.py:1155 in __getattribute__
[DEBUG]   ___check_obj_id(L['self'], 139746432564960)                   # return super().__getattribute__(name)  # test/dynamo/test_misc.py:1157 in __getattribute__
[DEBUG]   ___check_obj_id(L['__class__'], 1451499216)                   # return super().__getattribute__(name)  # test/dynamo/test_misc.py:1157 in __getattribute__
[DEBUG]   ___is_grad_enabled()                                          # _dynamo/output_graph.py:346 in init_ambient_guards
[DEBUG]   not ___are_deterministic_algorithms_enabled()                 # _dynamo/output_graph.py:342 in init_ambient_guards
[DEBUG]   ___is_torch_function_enabled()                                # _dynamo/output_graph.py:350 in init_ambient_guards
[DEBUG]   utils_device.CURRENT_DEVICE == None                           # _dynamo/output_graph.py:348 in init_ambient_guards
```

Along with the guards, we also print what line of user code caused the guard to be added, or what line of Dynamo internal code added the guard (if there is no user stack trace, which is typically the case for ambient guards.)

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107505
Approved by: https://github.com/mlazos, https://github.com/voznesenskym, https://github.com/anijain2305
2023-08-20 06:50:27 +00:00
Edward Z. Yang
67bb3c05b0 Add verbose_guards logging artifact (#107388)
It looks like this:

```
[DEBUG] GUARD: ___check_type_id(L['z'][L["MyEnum"].BAR], 7640416) and L['z'][L["MyEnum"].BAR] == 10
[DEBUG] Stack:
[DEBUG]   File "/data/users/ezyang/b/pytorch/test/dynamo/test_misc.py", line 6657, in <module>
[DEBUG]     run_tests()
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/test_case.py", line 38, in run_tests
[DEBUG]     run_tests()
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 985, in run_tests
[DEBUG]     unittest.main(argv=argv)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/main.py", line 101, in __init__
[DEBUG]     self.runTests()
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/main.py", line 271, in runTests
[DEBUG]     self.result = testRunner.run(self.test)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/runner.py", line 184, in run
[DEBUG]     test(result)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 84, in __call__
[DEBUG]     return self.run(*args, **kwds)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 122, in run
[DEBUG]     test(result)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 84, in __call__
[DEBUG]     return self.run(*args, **kwds)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/suite.py", line 122, in run
[DEBUG]     test(result)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/case.py", line 650, in __call__
[DEBUG]     return self.run(*args, **kwds)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 2521, in run
[DEBUG]     self._run_with_retry(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 2450, in _run_with_retry
[DEBUG]     super_run(result=result)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/case.py", line 591, in run
[DEBUG]     self._callTestMethod(testMethod)
[DEBUG]   File "/home/ezyang/local/b/pytorch-env/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
[DEBUG]     method()
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/testing/_internal/common_utils.py", line 2377, in wrapper
[DEBUG]     method(*args, **kwargs)
[DEBUG]   File "/data/users/ezyang/b/pytorch/test/dynamo/test_misc.py", line 2529, in test_enum_as_dict_key_with_overloaded_str
[DEBUG]     res = opt_fn(x)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/eval_frame.py", line 333, in _fn
[DEBUG]     return fn(*args, **kwargs)
[DEBUG]   File "/data/users/ezyang/b/pytorch/test/dynamo/test_misc.py", line 2519, in fn
[DEBUG]     torch._dynamo.graph_break()
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/eval_frame.py", line 493, in catch_errors
[DEBUG]     return callback(frame, cache_size, hooks, frame_state)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 637, in _convert_frame
[DEBUG]     result = inner_convert(frame, cache_size, hooks, frame_state)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 133, in _fn
[DEBUG]     return fn(*args, **kwargs)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 371, in _convert_frame_assert
[DEBUG]     return _compile(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 567, in _compile
[DEBUG]     guarded_code = compile_inner(code, one_graph, hooks, transform)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/utils.py", line 181, in time_wrapper
[DEBUG]     r = func(*args, **kwargs)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 466, in compile_inner
[DEBUG]     out_code = transform_code_object(code, transform)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
[DEBUG]     transformations(instructions, code_options)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/convert_frame.py", line 416, in transform
[DEBUG]     tracer = InstructionTranslator(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/symbolic_convert.py", line 2018, in __init__
[DEBUG]     self.symbolic_locals = collections.OrderedDict(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/symbolic_convert.py", line 2021, in <genexpr>
[DEBUG]     VariableBuilder(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 211, in __call__
[DEBUG]     vt = self._wrap(value).clone(**self.options())
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 404, in _wrap
[DEBUG]     result = {
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 405, in <dictcomp>
[DEBUG]     k: VariableBuilder(
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 211, in __call__
[DEBUG]     vt = self._wrap(value).clone(**self.options())
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 354, in _wrap
[DEBUG]     return type_dispatch(self, value)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 837, in wrap_literal
[DEBUG]     return self.wrap_unspecialized_primitive(value)
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 1073, in wrap_unspecialized_primitive
[DEBUG]     guards=self.make_guards(GuardBuilder.CONSTANT_MATCH),
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 269, in make_guards
[DEBUG]     return {source.make_guard(guard) for guard in guards}
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_dynamo/variables/builder.py", line 269, in <setcomp>
[DEBUG]     return {source.make_guard(guard) for guard in guards}
[DEBUG]   File "/data/users/ezyang/b/pytorch/torch/_guards.py", line 641, in make_guard
[DEBUG]     return Guard(self.name(), self.guard_sou
```

One downside is I can't report *why* the guard was added. I'm not entirely sure how to do this; the problem is guards will propagate to a bunch of variables before finally getting included as part of the final set. Maybe a very very verbose version could report stack traces at every handoff point.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107388
Approved by: https://github.com/mlazos
ghstack dependencies: #107438, #107358
2023-08-18 19:05:54 +00:00
eellison
e9ae820279 Unfuse bias add before pointwise ops (#106912)
I get a 2% inference speedup in HF with this PR. I checked to see if there were any models where unfusing was slower than the cublas gelu fusion, and I did not see any, which was surprising to me. Sorry for the cublas-activation api churn 😬

Kicking off another run in cublas 12, it's possible that the results have changed since.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106912
Approved by: https://github.com/jansel
ghstack dependencies: #106911
2023-08-16 17:22:24 +00:00
Edward Z. Yang
91afefb55b Fix some fake mode confusion between inner/outer fake mode in export (#106515)
Fixes https://github.com/pytorch/pytorch/issues/106412

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106515
Approved by: https://github.com/voznesenskym, https://github.com/BowenBao, https://github.com/thiagocrepaldi
2023-08-04 15:42:23 +00:00
Edward Z. Yang
697893568d Improve error message when export encounters non-local input (#106403)
Previously, you would get an error like

```
Dynamo input and output is a strict subset of traced input/output
```

now you get

```
Cannot export model which references tensors that are neither
buffers/parameters/constants nor are direct inputs.  For each tensor, if you'd
like this tensor to be an explicit input, add it as a dummy argument
to the top-level model definition you are exporting; if you would
like its value to be embedded as an exported constant, wrap its access
in a function marked with @assume_constant_result.

G['bulbous_bouffant'], accessed at:
  File "test_export.py", line N, in f
    return bulbous_bouffant + y
```

This doesn't handle outputs; I'm going to hit that next.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106403
Approved by: https://github.com/tugsbayasgalan
2023-08-03 12:35:25 +00:00
Edward Z. Yang
76163a56c0 Refactor stack handling to always use TracingContext to populate real stack on exception (#106277)
The basic gist of the PR is simple, but it's accompanied by some careful modifications and unit tests to make sure I got it right. Check inline comments for more details.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106277
Approved by: https://github.com/albanD, https://github.com/voznesenskym
2023-08-02 00:09:16 +00:00
Edward Z. Yang
884cd53e49 Unconditionally record when FakeTensorMode is allocated and report it on inconsistency (#105927)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105927
Approved by: https://github.com/albanD
2023-07-26 03:38:42 +00:00
Edward Z. Yang
523100a2f1 Make _CURRENT_TRACING_CONTEXT thread local (#105942)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105942
Approved by: https://github.com/albanD, https://github.com/voznesenskym
2023-07-26 03:38:01 +00:00
Michael Lazos
05eea20eb9 [dynamo] Simulate torch function enablement state (#105091)
Part of https://github.com/pytorch/pytorch/issues/93723

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105091
Approved by: https://github.com/voznesenskym, https://github.com/anijain2305
2023-07-13 17:42:20 +00:00
Edward Z. Yang
979f826015 Read out real strides from compilation result, rather than real args (#105010)
This prefigures a refactor that will move the backward compilation
to entirely ahead of time, so I need to extract these strides some
other way.  Straight from the compiler's mouth will do it.

I can't easily get the information via the return result of `fw_compiler` without changing the calling convention, so instead I smuggle it via TracingContext. TracingContext may be None when we are compiling patterns for the joint graph pattern matcher.
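
A minimal sketch of the "smuggle it via the context" idea, in plain Python rather than the real torch classes. `ContextSketch`, `set_context`, `get_context` and `fw_compiler_sketch` are hypothetical names, and the contiguous-stride helper stands in for whatever strides a real compiler would report. The point is only that the compiler keeps its return type and writes the strides to a side channel when one exists.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class ContextSketch:
    """Hypothetical stand-in for a tracing context (not the torch._guards class)."""
    output_strides: Optional[List[Tuple[int, ...]]] = None


_CONTEXT: Optional[ContextSketch] = None


def set_context(ctx: Optional[ContextSketch]) -> None:
    global _CONTEXT
    _CONTEXT = ctx


def get_context() -> Optional[ContextSketch]:
    return _CONTEXT


def contiguous_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Row-major strides for a shape; a placeholder for compiler-reported strides."""
    strides = []
    acc = 1
    for dim in reversed(shape):
        strides.append(acc)
        acc *= dim
    return tuple(reversed(strides))


def fw_compiler_sketch(output_shapes: Sequence[Sequence[int]]) -> Callable:
    """Pretend compiler: still returns only a callable, as the calling convention expects."""
    ctx = get_context()
    if ctx is not None:  # the context may be absent, e.g. when compiling patterns
        ctx.output_strides = [contiguous_strides(s) for s in output_shapes]
    return lambda *args: args


ctx = ContextSketch()
set_context(ctx)
fw_compiler_sketch([(2, 3, 4)])
print(ctx.output_strides)  # [(12, 4, 1)]
```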

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105010
Approved by: https://github.com/shunting314
2023-07-12 11:33:08 +00:00
Michael Voznesensky
e5e9d563c2 Lift user defined attributes into inputs for certain cases (user defined types and tensors) (#103386)
(1) Lazy (converts to dynamo variable on access only)
(2) Uses existing side effect/reconstruct tech
(3) not tensor opinionated

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103386
Approved by: https://github.com/jansel
2023-06-20 23:45:19 +00:00
Thiago Crepaldi
6f655d4195 Add symbolic tracing support to torch._dynamo.export (fake input + weights) (#100017)
Fixes #95900
Using the following repro as guide:

```python
import torch
import torch._dynamo
from torch._subclasses import fake_tensor
from torch.fx.experimental.symbolic_shapes import ShapeEnv
from torch._dynamo.output_graph import config
class Model(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(2, 2)
        self.linear2 = torch.nn.Linear(2, 2)

    def forward(self, x):
        out = self.linear(x)
        out = self.linear2(out)
        return out

fake_mode = fake_tensor.FakeTensorMode(allow_non_fake_inputs=False,
                                       allow_fallback_kernels=True,
                                       shape_env=ShapeEnv(
                                            allow_scalar_outputs=config.capture_scalar_outputs,
                                            allow_dynamic_output_shape_ops=config.capture_dynamic_output_shape_ops,
                                            frame_id=0
                                        ),
)
# Fakefying input/model before calling torch._dynamo.export
with fake_mode:
    fake_x = torch.rand(5, 2, 2)
    model = Model()

# Calling torch._dynamo.export without active fake mode
graph_module, guards = torch._dynamo.export(
    model,
    fake_x,
    aten_graph=True,
    fake_mode=fake_mode
)
graph_module.print_readable()
graph_module.graph.print_tabular()
```

Summary of changes:

    * Plumb fake_mode through the torch._dynamo.export API. When specified, it
    replaces the creation of a new FakeTensorMode at InstructionTranslator on behalf of OutputGraph
    * Hacks FakeTensor.__new__ to prevent a
    torch.Tensor._make_subclass call for inputs that are already fakefied by
    user. This probably needs to be fixed in a nicer way. Any idea?
    * Removed a few asserts that didn't want faked tensors coming
    from user script
    * Added torch._subclasses.fake_tensor.FakeTensor to type list on a few
    asserts check to allow fake inputs

The changes above allowed symbolic tracing with both static and dynamic shapes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100017
Approved by: https://github.com/ezyang
2023-06-15 21:28:10 +00:00
Michael Voznesensky
056bf951bf Strengthen partially supported invariant of base for chained sources (#103445)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103445
Approved by: https://github.com/ezyang
2023-06-13 22:44:28 +00:00
Elias Ellison
d083d444ff Inductor Freezing (#100652)
Adds a freezing pass that will constant fold parameters in inductor, enabled via `config.freezing`. This occurs post functionalization in aot autograd to capture both dispatching and allow passes to occur post functionalization. A few notes:

- There is an option to discard parameters `config.freezing_discard_parameters` which will take the current eager modules and wrap parameters to a Tensor subclass which will error if used.
- I needed to expose flat_params in aot_autograd in order to discard old references when we constant fold away parameters, like with amp. I also exposed `fw_metadata` to avoid constant folding mutated parameters.
- Caching parameter transformations/constant folding across different inferences is not yet implemented
- Checking version_counter of constant folded params is not yet implemented

I'm not really sure what the actual naming should be. In jit there was both "freezing", which was platform agnostic, and "optimize for inference", which made device specific optimizations. We're doing the latter here but maybe freezing is a better name.

Differential Revision: [D46244033](https://our.internmc.facebook.com/intern/diff/D46244033)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100652
Approved by: https://github.com/jansel
2023-06-12 20:56:03 +00:00
Mark Saroufim
95fced4483 Pretty dataclass dynamo explain (#102869)
Also thinking out loud: maybe we only print graph break reasons? And for the rest we have a verbose print which prints everything?

TODO: some tests are failing based on what they expect a guard string to look like; easy to fix, I'll do it early next week

## After

```
(sourcetorch) ubuntu@ip-172-31-1-136:~/test$ python pretty.py
BREAK
Graph Count: 2
Graph Break Count: 1
Op Count: 2
Break Reasons:
  Break Reason 1:
    Reason: call_function BuiltinVariable(print) [ConstantVariable(str)] {}
    User Stack:
      <FrameSummary file /home/ubuntu/test/pretty.py, line 6 in fn>
Ops per Graph:
  Ops 1:
    <built-in function add>
  Ops 2:
    <built-in function add>
Out Guards:
  Guard 1:
    Name: ''
    Source: global
    Create Function: GRAD_MODE
    Guard Types: ['GRAD_MODE']
    Code List: ['___is_grad_enabled()']
    Object Weakref: None
    Guarded Class Weakref: None
  Guard 2:
    Name: ''
    Source: global
    Create Function: DEFAULT_DEVICE
    Guard Types: ['DEFAULT_DEVICE']
    Code List: ['utils_device.CURRENT_DEVICE == None']
    Object Weakref: None
    Guarded Class Weakref: None
  Guard 3:
    Name: "G['print']"
    Source: global
    Create Function: BUILTIN_MATCH
    Guard Types: None
    Code List: None
    Object Weakref: None
    Guarded Class Weakref: None
  Guard 4:
    Name: ''
    Source: global
    Create Function: DETERMINISTIC_ALGORITHMS
    Guard Types: ['DETERMINISTIC_ALGORITHMS']
    Code List: ['not ___are_deterministic_algorithms_enabled()']
    Object Weakref: None
    Guarded Class Weakref: None
  Guard 5:
    Name: "L['x']"
    Source: local
    Create Function: TENSOR_MATCH
    Guard Types: None
    Code List: None
    Object Weakref: None
    Guarded Class Weakref: None
  Guard 6:
    Name: ''
    Source: global
    Create Function: GRAD_MODE
    Guard Types: ['GRAD_MODE']
    Code List: ['___is_grad_enabled()']
    Object Weakref: None
    Guarded Class Weakref: None
  Guard 7:
    Name: ''
    Source: global
    Create Function: DEFAULT_DEVICE
    Guard Types: ['DEFAULT_DEVICE']
    Code List: ['utils_device.CURRENT_DEVICE == None']
    Object Weakref: None
    Guarded Class Weakref: None
  Guard 8:
    Name: ''
    Source: global
    Create Function: DETERMINISTIC_ALGORITHMS
    Guard Types: ['DETERMINISTIC_ALGORITHMS']
    Code List: ['not ___are_deterministic_algorithms_enabled()']
    Object Weakref: None
    Guarded Class Weakref: None
  Guard 9:
    Name: "L['x']"
    Source: local
    Create Function: TENSOR_MATCH
    Guard Types: None
    Code List: None
    Object Weakref: None
    Guarded Class Weakref: None
Compile Times: TorchDynamo compilation metrics:
Function                        Runtimes (s)
------------------------------  --------------
_compile                        0.0164, 0.0035
OutputGraph.call_user_compiler  0.0000, 0.0000
```

## Before

```
('Dynamo produced 2 graphs with 1 graph break and 2 ops', [{Guard(name='print', source=<GuardSource.GLOBAL: 1>, create_fn=<function GuardBuilder.BUILTIN_MATCH at 0x7f92ea5009d0>, is_volatile=False, guard_types=None, code_list=None, obj_weakref=None, guarded_class_weakref=None), Guard(name='x', source=<GuardSource.LOCAL: 0>, create_fn=<function GuardBuilder.TENSOR_MATCH at 0x7f92ea501000>, is_volatile=False, guard_types=['TENSOR_MATCH'], code_list=None, obj_weakref=<weakref at 0x7f9224d28f40; dead>, guarded_class_weakref=<weakref at 0x7f92d81734c0; to 'torch._C._TensorMeta' at 0x540b610 (Tensor)>)}, {Guard(name='x', source=<GuardSource.LOCAL: 0>, create_fn=<function GuardBuilder.TENSOR_MATCH at 0x7f92ea501000>, is_volatile=False, guard_types=['TENSOR_MATCH'], code_list=None, obj_weakref=<weakref at 0x7f9224d5e700; dead>, guarded_class_weakref=<weakref at 0x7f92d81734c0; to 'torch._C._TensorMeta' at 0x540b610 (Tensor)>)}], [GraphModule(), GraphModule()], [[<built-in function add>], [<built-in function add>]], [GraphCompileReason(reason='call_function BuiltinVariable(print) [ConstantVariable(str)] {}', user_stack=[<FrameSummary file <ipython-input-1-9e2ddb639697>, line 6 in fn>]), GraphCompileReason(reason='return_value', user_stack=[<FrameSummary file <ipython-input-1-9e2ddb639697>, line 8 in <graph break in fn>>])], 'Dynamo produced 2 graphs with 1 graph break and 2 ops\n Break reasons: \n\n1. call_function BuiltinVariable(print) [ConstantVariable(str)] {}\n  File "<ipython-input-1-9e2ddb639697>", line 6, in fn\n    print("BREAK")\n \n2. return_value\n  File "<ipython-input-1-9e2ddb639697>", line 8, in <graph break in fn>\n    return x\n \nTorchDynamo compilation metrics:\nFunction                        Runtimes (s)\n------------------------------  --------------\n_compile                        0.0418, 0.0084\nOutputGraph.call_user_compiler  0.0001, 0.0001')

```

## Program

```python
import torch
import torch._dynamo

def fn(x):
    x = x + 1
    print("BREAK")
    x = x + 1
    return x

out = torch._dynamo.explain(fn, torch.randn(10))
print(out)

```
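
For readers who want the shape of the change without reading the diff, here is a rough sketch of a dataclass whose `__str__` renders the first part of the "After" layout. `ExplainSketch` and `BreakReason` are hypothetical names and the field set is a small assumed subset; this is not the structure returned by `torch._dynamo.explain`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BreakReason:
    """Hypothetical record for one graph break."""
    reason: str
    user_stack: List[str] = field(default_factory=list)


@dataclass
class ExplainSketch:
    """Hypothetical result object; only a few of the fields shown above."""
    graph_count: int
    graph_break_count: int
    op_count: int
    break_reasons: List[BreakReason] = field(default_factory=list)

    def __str__(self) -> str:
        lines = [
            f"Graph Count: {self.graph_count}",
            f"Graph Break Count: {self.graph_break_count}",
            f"Op Count: {self.op_count}",
            "Break Reasons:",
        ]
        for i, br in enumerate(self.break_reasons, start=1):
            lines.append(f"  Break Reason {i}:")
            lines.append(f"    Reason: {br.reason}")
            lines.append("    User Stack:")
            lines.extend(f"      {frame}" for frame in br.user_stack)
        return "\n".join(lines)


out = ExplainSketch(
    graph_count=2,
    graph_break_count=1,
    op_count=2,
    break_reasons=[BreakReason("call_function print", ["pretty.py, line 6 in fn"])],
)
print(out)
```

Keeping the data in fields and the layout in `__str__` means tests can assert on fields instead of on a formatted string.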

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102869
Approved by: https://github.com/voznesenskym
2023-06-07 22:38:57 +00:00
Yanbo Liang
9ff1932d2b [Dynamo] Save global autocast state to restore on graph break (#102415)
Fixes #102414

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102415
Approved by: https://github.com/yf225
2023-05-30 23:03:21 +00:00
Animesh Jain
dafa009c3c [dynamo][moco] Save global torch state to restore on graph break (#101201)
This is relevant to  https://github.com/pytorch/pytorch/pull/100570 as well.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101201
Approved by: https://github.com/voznesenskym
2023-05-18 01:03:15 +00:00
Michael Voznesensky
ffcbd1c2de Move tracked nn_modules from OutputGraph to TracingContext (#100457)
Lint

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100457
Approved by: https://github.com/anijain2305
2023-05-03 02:00:11 +00:00
Edward Z. Yang
d69a1a4491 In detect_fake_mode, assert that all detected fake modes are consistent (#99392)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99392
Approved by: https://github.com/eellison
2023-04-18 15:35:05 +00:00
Edward Z. Yang
5c38c4cfa4 Improve symbolic shapes guard logging (#98941)
Billing of changes:
* Get rid of `print_guards`; instead, you control this with `TORCH_LOGS=torch.fx.experimental.symbolic_shapes`, debug logging toggles stack traces
* Don't incorrectly report the tracing context frame when we're compiling; we just don't have this info anymore! (TODO: use the saved frames instead). This is via a new TracingContext.clear_frame context manager
* Add TracingContext.extract_stack() which gives you the tracing context stack.
* Add ShapeEnvLoggingAdapter to report which ShapeEnv any given operation is from (this is helpful for debugging situations when there are too many ShapeEnvs floating around)
* Tweak create_symbol log message to also report Source
* Add a debug log whenever duck sizing occurs
* Report an excerpt of both the user and system backtrace whenever a guard is added in INFO mode. I found this is a good balance of "where did the guard come from" without full backtrace verbosity.
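
As an illustration of the logging adapter item, a sketch built on the standard library's `logging.LoggerAdapter`. `EnvLoggingAdapter`, the logger name and the `envid` key are assumptions for this example, not the names used in the PR; the only thing shown is how an adapter can prefix every message with an environment id.

```python
import logging


class EnvLoggingAdapter(logging.LoggerAdapter):
    """Hypothetical adapter that prefixes each message with an environment id."""

    def process(self, msg, kwargs):
        # self.extra is the dict passed to the adapter's constructor.
        return f"{self.extra['envid']}: {msg}", kwargs


logging.basicConfig(level=logging.INFO, format="%(name)s: [%(levelname)s] %(message)s")
log = logging.getLogger("sketch.symbolic_shapes")

# One adapter per environment, so interleaved logs stay attributable.
env_log = EnvLoggingAdapter(log, {"envid": 0})
env_log.info("create_env")
env_log.info("create_symbol %s = %s", "s0", 32)
```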

Example log output with the new output:

```
[2023-04-12 08:25:49,003] torch.fx.experimental.symbolic_shapes: [INFO] 0: create_env
[2023-04-12 08:25:49,021] torch.fx.experimental.symbolic_shapes: [INFO] 0: create_symbol s0 = 32 for L['x'].size()[0]
[2023-04-12 08:25:50,154] torch.fx.experimental.symbolic_shapes: [INFO] 0: evaluate_expr s0 < 128 [guard added] at w.py:11 in forward2 (_dynamo/variables/tensor.py:476 in evaluate_expr)
[2023-04-12 08:25:52,057] torch.fx.experimental.symbolic_shapes: [INFO] 0: evaluate_expr Eq(Mod(s0, 16), 0) [guard added] (_inductor/codegen/triton.py:77 in is_aligned)
```

from running

```python
import torch
import torch._dynamo

def f(x, y):
    return x + y

def forward(x, y):
    return forward2(x, y)

def forward2(x, y):
    if x.size(0) < 128:
        x = x * 2
    else:
        x = x * 3
    r = f(x, y)
    r = r * y
    return r

def woof():
    fn_compiled = torch.compile(forward, dynamic=True)
    x = torch.randn(32, device='cuda')
    y = torch.randn(32, device='cuda')
    print(fn_compiled(x, y))

woof()
```

(To induce the Triton guard, I synthetically reverted https://github.com/pytorch/pytorch/pull/98471)

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98941
Approved by: https://github.com/wconstab
2023-04-12 21:58:59 +00:00
Edward Z. Yang
9abae6ae32 Make all Source subclasses frozen. (#98737)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98737
Approved by: https://github.com/albanD
2023-04-10 17:51:10 +00:00
Edward Z. Yang
d01ee10b25 Add detect_fake_mode (#98321)
This replaces fake_mode_from_tensors but it preferentially looks for
fake_mode in TracingContext and also if there is an active fake mode
on the dispatch stack, before groveling in tensors to find it.

This advances PegasusForCausalLM, which was previously failing because
we generated a graph that had a parameter (non-fake) and a SymInt,
and thus previously we failed to detect the correct fake mode.
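
The lookup order described above can be sketched in plain Python as follows. Every name here (`FakeModeSketch`, `FakeValue`, `detect_mode_sketch`) is hypothetical, and the argument list is an assumption made to keep the example self-contained. The consistency assertion mirrors the later change in #99392 rather than this PR.

```python
from typing import Any, List, Optional, Sequence, Tuple


class FakeModeSketch:
    """Hypothetical stand-in for a fake tensor mode."""


class FakeValue:
    """Hypothetical stand-in for a fake tensor that remembers its mode."""

    def __init__(self, fake_mode: FakeModeSketch) -> None:
        self.fake_mode = fake_mode


def detect_mode_sketch(
    context_mode: Optional[FakeModeSketch],
    active_modes: Sequence[FakeModeSketch],
    inputs: Sequence[Any],
) -> Optional[FakeModeSketch]:
    """Prefer the tracing context, then the active mode stack, then the inputs."""
    candidates: List[Tuple[FakeModeSketch, str]] = []
    if context_mode is not None:
        candidates.append((context_mode, "tracing context"))
    for mode in reversed(active_modes):  # innermost active mode first
        candidates.append((mode, "active mode stack"))
    for i, value in enumerate(inputs):
        if isinstance(value, FakeValue):  # real parameters and ints are skipped
            candidates.append((value.fake_mode, f"input {i}"))
    if not candidates:
        return None
    chosen, chosen_from = candidates[0]
    for mode, where in candidates[1:]:
        assert mode is chosen, f"fake mode mismatch: {chosen_from} vs {where}"
    return chosen


mode = FakeModeSketch()
# A non-fake parameter plus a plain int alone gives nothing to grovel in,
# so the context is what makes detection succeed.
print(detect_mode_sketch(mode, [], ["real_param", 3]) is mode)  # True
```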

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98321
Approved by: https://github.com/voznesenskym
2023-04-05 22:15:16 +00:00
Will Constable
c1a6dde79e Make dynamo-FSDP skip guards (#97463)
Create a new GuardSource for FSDP modules, and use it
to opt out of guard installation.

Based on @awgu's work in https://github.com/pytorch/pytorch/pull/97091

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97463
Approved by: https://github.com/voznesenskym, https://github.com/jansel, https://github.com/awgu
2023-03-28 04:04:34 +00:00
Michael Voznesensky
f9ce593267 Extend aot autograd dedup guards to params, stop using positions (#96774)
The purpose of this PR is to remove reliance on argument positions in dedup guards, AND extend the functionality to params.

A version of this PR was stamped previously in https://github.com/pytorch/pytorch/pull/95831 - but was kinda gross, because it was based on an underlying PR that did way too much with source names.

This PR leaves most of that alone, in favor of just reusing the same name standardization logic that dynamo module registration does.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96774
Approved by: https://github.com/ezyang
2023-03-21 05:59:33 +00:00
Avik Chaudhuri
e4e761b277 record caller frame instead of function frame (#96882)
Previously, when starting to trace a function, we would record a frame summary recording the definition loc. This would lead to an unconventional-looking stack trace when used for debugging, e.g., shape guards.

```
  File ".../scripts/avik/pt2/example.py", line 407, in forward
    def forward(self, x):
  ...
  File ".../transformers/models/bert/modeling_bert.py", line 912, in forward
    @add_start_docstrings_to_model_forward(BERT_INPUTS_DOCSTRING.format("batch_size, sequence_length"))
  ...
  File ".../transformers/models/bert/modeling_bert.py", line 562, in forward
    def forward(
  ...
  File ".../transformers/models/bert/modeling_bert.py", line 484, in forward
    def forward(
  ...
  File ".../transformers/models/bert/modeling_bert.py", line 416, in forward
    def forward(
  ...
  File ".../transformers/models/bert/modeling_bert.py", line 275, in forward
    def forward(
  ...
  File ".../transformers/models/bert/modeling_bert.py", line 351, in forward
    attention_scores = attention_scores + attention_mask
```

As noted in https://github.com/pytorch/pytorch/pull/95848#discussion_r1134397096, we would like to change this to record function calls instead, like conventional stack traces do. This diff makes this change. The above stack now looks like the following, which is way more helpful at a glance to understand what's going on.

```
  File ".../scripts/avik/pt2/example.py", line 408, in forward
    bert_out = self.bert(**x)
  ...
  File ".../transformers/models/bert/modeling_bert.py", line 1021, in forward
    encoder_outputs = self.encoder(
  ...
  File ".../transformers/models/bert/modeling_bert.py", line 610, in forward
    layer_outputs = layer_module(
  ...
  File ".../transformers/models/bert/modeling_bert.py", line 496, in forward
    self_attention_outputs = self.attention(
  ...
  File ".../transformers/models/bert/modeling_bert.py", line 426, in forward
    self_outputs = self.self(
  ...
  File ".../transformers/models/bert/modeling_bert.py", line 351, in forward
    attention_scores = attention_scores + attention_mask
```
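
To make the difference between the two styles concrete, a small sketch using the standard `traceback.FrameSummary`. The helper names are hypothetical, and `sys._getframe` is a CPython detail used here only for illustration; Dynamo does not obtain frames this way.

```python
import sys
import traceback


def definition_summary(fn) -> traceback.FrameSummary:
    """Old style: point at the line where the traced function is defined."""
    code = fn.__code__
    return traceback.FrameSummary(
        code.co_filename, code.co_firstlineno, code.co_name, lookup_line=False
    )


def caller_summary() -> traceback.FrameSummary:
    """New style: point at the call site inside the caller of our caller."""
    frame = sys._getframe(2)  # 0 is this helper, 1 is the callee, 2 is its caller
    return traceback.FrameSummary(
        frame.f_code.co_filename, frame.f_lineno, frame.f_code.co_name, lookup_line=False
    )


def inner():
    return caller_summary()


def outer():
    return inner()  # the caller-style summary points at this line


old = definition_summary(inner)
new = outer()
print(old.name, old.lineno)  # the "def inner" line
print(new.name, new.lineno)  # the "return inner()" line inside outer
```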

Differential Revision: [D44101882](https://our.internmc.facebook.com/intern/diff/D44101882/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96882
Approved by: https://github.com/ezyang
2023-03-17 00:06:16 +00:00
Avik Chaudhuri
178d2a38e0 debug shape guards (#95848)
Adds logging when shape guards are added and when symbols are specialized to constants.

Differential Revision: [D43719743](https://our.internmc.facebook.com/intern/diff/D43719743/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95848
Approved by: https://github.com/ezyang
2023-03-14 16:05:28 +00:00
Michael Voznesensky
d7db5b05b4 Context manager to push/pop frame summaries (#96054)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96054
Approved by: https://github.com/avikchaudhuri, https://github.com/ezyang
2023-03-08 04:01:49 +00:00
Andrew Gu
cbac56e244 [BE] Simplify Source.is_nn_module; add some types (#95292)
I am still reading Dynamo source code...

This is an easy PR to simplify `Source.is_nn_module()` to reuse `GuardSource.is_nn_module()` instead of having the `in (...)` check implemented twice. While simplifying that, I thought I might as well add some type annotations for `Source` methods.
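
Roughly, the simplification has this shape. The class names and the exact enum members are assumptions for the sketch (only `LOCAL = 0` and `GLOBAL = 1` appear elsewhere in this history); the membership check lives in one place and the source delegates to it.

```python
import enum
from dataclasses import dataclass


class GuardSourceSketch(enum.Enum):
    """Hypothetical subset of guard source kinds."""
    LOCAL = 0
    GLOBAL = 1
    LOCAL_NN_MODULE = 2
    GLOBAL_NN_MODULE = 3

    def is_nn_module(self) -> bool:
        # The single place where the `in (...)` check is written.
        return self in (
            GuardSourceSketch.LOCAL_NN_MODULE,
            GuardSourceSketch.GLOBAL_NN_MODULE,
        )


@dataclass(frozen=True)
class SourceSketch:
    """Hypothetical source that knows which kind of guard source it is."""
    kind: GuardSourceSketch

    def guard_source(self) -> GuardSourceSketch:
        return self.kind

    def is_nn_module(self) -> bool:
        # Delegate instead of repeating the membership test.
        return self.guard_source().is_nn_module()


print(SourceSketch(GuardSourceSketch.LOCAL_NN_MODULE).is_nn_module())  # True
print(SourceSketch(GuardSourceSketch.GLOBAL).is_nn_module())  # False
```
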
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95292
Approved by: https://github.com/ezyang
2023-02-22 22:33:58 +00:00
Edward Z. Yang
89e16c4f18 Assume sympy is always installed (#94903)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94903
Approved by: https://github.com/Skylion007, https://github.com/malfet
2023-02-16 14:09:58 +00:00
Edward Z. Yang
f8740db410 Properly resolve source_ref when constructing shape guards (#91058)
Whenever you guard on something, you're supposed to tell GuardBuilder about it, so GuardBuilder knows that it has to actually bind it in scope when it creates the guard function. But shape env guards bypass that mechanism completely. Well, now they don't.

For the most part, this didn't matter in practice, because we usually had a `TENSOR_MATCH` guard floating around that made sure that the guard stayed live. But if we ever eliminate those guards (e.g., because we build it into the shape guard directly; something we'll probably want to do when https://github.com/pytorch/pytorch/pull/89707 goes online) then this will indeed matter.

One complication: some of the shape env guards are on globals. You have to make sure to shunt the usage to the correct guard builder in that case. Maybe it would be better if we refactored things so there is only one GuardBuilder. Not sure.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91058
Approved by: https://github.com/voznesenskym
2022-12-30 05:56:56 +00:00
Edward Z. Yang
bcf15cd93b Store source, not sname, in Symbol (#91057)
I'm going to need this in the follow up PR. Instead of storing only Source.name() in Symbol, I now store a full-on Source. Lots of replumbing occurs. In particular:

- Move Source to torch._guards to break cycles
- I have to add TensorPropertySource and NegateSource to handle x.size()[0] and -x codegen that I was doing with string manipulation previously
- I tighten up invariants so that I never pass source=None; instead I pass ConstantSource (these are constant sources right) and test for that rather than source being missing. I think this is more parsimonious
- Some mypy wobbles from new imports

I didn't move LocalSource and friends to torch._guards, but I ended up needing to access them in a few places. The main annoyance with moving these is that then I also need to move the bytecode codegen stuff, and that's not so easy to move without bringing in the kitchen sink.
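
A sketch of why structured sources beat string manipulation: each wrapper composes the name of its base. The `*Sketch` classes are hypothetical stand-ins, and the `L['x']` spelling is borrowed from log output elsewhere in this history rather than from this PR. The classes are frozen dataclasses, in the spirit of the later #98737, so they can be hashed and compared by value.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class LocalSketch:
    """Hypothetical source for a local variable."""
    local_name: str

    def name(self) -> str:
        return f"L[{self.local_name!r}]"


@dataclass(frozen=True)
class SizeSketch:
    """Hypothetical source for one dimension of a tensor's size."""
    base: Any
    idx: int

    def name(self) -> str:
        return f"{self.base.name()}.size()[{self.idx}]"


@dataclass(frozen=True)
class NegateSketch:
    """Hypothetical source for the negation of another source."""
    base: Any

    def name(self) -> str:
        return f"-{self.base.name()}"


dim0 = SizeSketch(LocalSketch("x"), 0)
print(dim0.name())                # L['x'].size()[0]
print(NegateSketch(dim0).name())  # -L['x'].size()[0]
```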

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91057
Approved by: https://github.com/albanD, https://github.com/voznesenskym, https://github.com/zou3519
2022-12-30 05:56:56 +00:00
PyTorch MergeBot
b68fd7e319 Revert "Store source, not sname, in Symbol (#91057)"
This reverts commit 88c581be87.

Reverted https://github.com/pytorch/pytorch/pull/91057 on behalf of https://github.com/atalman due to causing internal build failures
2022-12-21 22:33:15 +00:00
Edward Z. Yang
88c581be87 Store source, not sname, in Symbol (#91057)
I'm going to need this in the follow up PR. Instead of storing only Source.name() in Symbol, I now store a full-on Source. Lots of replumbing occurs. In particular:

- Move Source to torch._guards to break cycles
- I have to add TensorPropertySource and NegateSource to handle x.size()[0] and -x codegen that I was doing with string manipulation previously
- I tighten up invariants so that I never pass source=None; instead I pass ConstantSource (these are constant sources right) and test for that rather than source being missing. I think this is more parsimonious
- Some mypy wobbles from new imports

I didn't move LocalSource and friends to torch._guards, but I ended up needing to access them in a few places. The main annoyance with moving these is that then I also need to move the bytecode codegen stuff, and that's not so easy to move without bringing in the kitchen sink.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91057
Approved by: https://github.com/albanD, https://github.com/voznesenskym
2022-12-21 04:51:51 +00:00
Edward Z. Yang
57390116e0 Restructure ShapeEnv so it uses GuardBuilder.SHAPE_ENV directly (#91055)
The idea is to make ShapeEnv guards less of a one-off special snowflake, and integrate it more closely with the regular builder infrastructure. But it is not so easy: the shape env code has to live after tensor match code, because we need to know that the values in question are tensors before we start matching on them. So we introduce a new `shape_env_code` field to put the special shape env code, so we can add it to the final constructed code after tensor.

Everything else works the obvious way. There's a new ShapeEnvSource for constructing the singleton SHAPE_ENV guard that drives the shape env guard construction. I added some more docs and also made the printed code for guards include the enclosing lambda for more clarity.
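
The ordering constraint can be illustrated with a toy guard assembler. `build_guard_source`, the tiny tensor class `T` and the guard strings are all hypothetical. The sketch only shows that appending the shape code after the tensor check lets `and` short-circuit before `.size()` is called on a non-tensor.

```python
from typing import List, Optional


class T:
    """Hypothetical tensor stand-in with just a size() method."""

    def __init__(self, *shape: int) -> None:
        self.shape = shape

    def size(self):
        return self.shape


def build_guard_source(
    code_parts: List[str], tensor_check: Optional[str], shape_env_code: List[str]
) -> str:
    """Join guard expressions in order: regular code, tensor check, then shape code."""
    parts = list(code_parts)
    if tensor_check:
        parts.append(tensor_check)
    parts.extend(shape_env_code)  # must come last: it assumes the values are tensors
    body = " and ".join(parts) or "True"
    return f"lambda L: {body}"  # include the enclosing lambda in the printed code


src = build_guard_source([], "isinstance(L['x'], T)", ["L['x'].size()[0] < 128"])
print(src)
guard = eval(src, {"T": T})
print(guard({"x": T(32)}))   # True
print(guard({"x": T(256)}))  # False
print(guard({"x": 5}))       # False, and no AttributeError thanks to the ordering
```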

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91055
Approved by: https://github.com/albanD, https://github.com/voznesenskym
2022-12-21 03:50:47 +00:00
Michael Voznesensky
b72caf311d Introduce guardexpr, aot autograd guarding of duplicates into torch._guards (#90955)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90955
Approved by: https://github.com/ezyang
2022-12-18 03:05:47 +00:00
Michael Voznesensky
53e71fad8f Add shape_env guards to tracing context (#90876)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90876
Approved by: https://github.com/Chillee, https://github.com/ezyang
2022-12-16 09:05:05 +00:00
Edward Z. Yang
eef019c14a Lint rule to forbid direct use of logging.info/etc APIs (#90907)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90907
Approved by: https://github.com/jansel
2022-12-16 05:13:51 +00:00
Michael Voznesensky
6c8ef6a4c2 Add tracing context, Integrate dynamo guards into torch._guards (#90647)
As defined here: https://docs.google.com/document/d/1oniZEgAaHE1IMByPRWRKbUHeaW06E2HMfCTCQyMRLek/edit#

This PR creates a new structure, a TracingContext, whose lifecycle matches that of the traced frame. It carries on it a GuardsContext, and eventually, a FakeTensorMode. It is the source of truth of all accumulated guards.

In this PR, we create the structure, and integrate it into dynamo. We do so by mapping OutputGraph's guards structure to its guard structure.
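
A minimal sketch of a context whose lifetime is scoped to one traced frame. `TracingSketch`, `GuardsSketch`, `tracing` and `current` are hypothetical names, not the torch._guards API. Thread-local storage is used here because a later change in this history (#105942) made the real global thread local.

```python
import contextlib
import threading
from typing import Iterator, Optional, Set


class GuardsSketch:
    """Hypothetical container for accumulated guards."""

    def __init__(self) -> None:
        self.guards: Set[str] = set()


class TracingSketch:
    """Hypothetical per-frame context carrying the guards container."""

    def __init__(self) -> None:
        self.guards_context = GuardsSketch()


_tls = threading.local()


def current() -> Optional[TracingSketch]:
    return getattr(_tls, "ctx", None)


@contextlib.contextmanager
def tracing(ctx: TracingSketch) -> Iterator[TracingSketch]:
    """Install ctx for the duration of one traced frame, then restore the old one."""
    old = current()
    _tls.ctx = ctx
    try:
        yield ctx
    finally:
        _tls.ctx = old


with tracing(TracingSketch()) as ctx:
    # Anything running during the trace can add guards without being handed ctx.
    current().guards_context.guards.add("TENSOR_MATCH(x)")
print(ctx.guards_context.guards)  # {'TENSOR_MATCH(x)'}
print(current())                  # None
```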

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90647
Approved by: https://github.com/ezyang
2022-12-14 07:35:32 +00:00
Michael Voznesensky
5adc18dcbc Shape guard structure (#90679)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90679
Approved by: https://github.com/ezyang
2022-12-12 09:50:00 +00:00
Michael Voznesensky
11442accc6 Make torch._guards, shuffle structures around for migration (#90636)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90636
Approved by: https://github.com/ezyang
2022-12-11 23:16:07 +00:00
PyTorch MergeBot
15a4c60383 Revert "Make torch._guards, shuffle structures around for migration (#90636)"
This reverts commit 933b6c4eed.

Reverted https://github.com/pytorch/pytorch/pull/90636 on behalf of https://github.com/huydhn due to Breaking lint on master. Please rebase and run lintrunner -a before re-merging the PR
2022-12-11 10:15:47 +00:00
Michael Voznesensky
933b6c4eed Make torch._guards, shuffle structures around for migration (#90636)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90636
Approved by: https://github.com/ezyang
2022-12-11 06:04:17 +00:00