Commit Graph

387 Commits

Author SHA1 Message Date
Yidi Wu
836bb1941b [hop] support torch.func.functional_call in hop subgraph (#155886)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155886
Approved by: https://github.com/zou3519
2025-06-28 23:47:46 +00:00
William Wen
7b7eafe7ba [dynamo] add set_fullgraph decorator/context manager (#154289)
Implements https://github.com/pytorch/pytorch/issues/144908.

Implementation notes:
- `set_fullgraph` is implemented using `patch_config`, which changes config correctly during runtime and tracing.
- Moved setting `config.error_on_graph_break` from convert_frame.py to eval_frame.py. This is because this should only be done at the top-level decorated function. If we kept this in convert_frame.py, we would be changing `config.error_on_graph_break` on every top-level frame, which causes confusing behavior (see added test for example).
- InstructionTranslator reads from `config.error_on_graph_break` every `step()`. This is to determine the value of `config.error_on_graph_break` at the time of the graph break, because tracer cleanup will restore the value of `config.error_on_graph_break` .
- `convert_frame.py` determines whether we should abort tracing (fullgraph=True) or continue (fullgraph=False) by reading the value of the tracer's `error_on_graph_break`. If there is no tracer (failed to initialize), then default to reading `config.error_on_graph_break`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154289
Approved by: https://github.com/jansel, https://github.com/zou3519
ghstack dependencies: #154283
2025-06-26 21:40:38 +00:00
angelayi
48e7b62d3a [dynamo] Add immutable pytree to trace_rules (#156772)
Fixes https://github.com/pytorch/pytorch/issues/155426

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156772
Approved by: https://github.com/williamwen42
2025-06-25 20:08:47 +00:00
Xuehai Pan
1b2146fc6d [BE][4/16] fix typos in torch/ (torch/_dynamo/) (#156314)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156314
Approved by: https://github.com/jingsh
ghstack dependencies: #156313
2025-06-23 02:57:19 +00:00
PyTorch MergeBot
5e56db59d4 Revert "[dynamo] add set_fullgraph decorator/context manager (#154289)"
This reverts commit 2c372a0502.

Reverted https://github.com/pytorch/pytorch/pull/154289 on behalf of https://github.com/ezyang due to All of this is responsible for regression, see https://github.com/pytorch/pytorch/pull/156561 ([comment](https://github.com/pytorch/pytorch/pull/154283#issuecomment-2994242583))
2025-06-22 14:22:07 +00:00
PyTorch MergeBot
5b427c92a8 Revert "[BE][4/16] fix typos in torch/ (torch/_dynamo/) (#156314)"
This reverts commit ead741c5fb.

Reverted https://github.com/pytorch/pytorch/pull/156314 on behalf of https://github.com/atalman due to export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_aot_eager [GH job link](https://github.com/pytorch/pytorch/actions/runs/15804799771/job/44548489912) [HUD commit link](c95f7fa874) ([comment](https://github.com/pytorch/pytorch/pull/156313#issuecomment-2994171213))
2025-06-22 12:31:57 +00:00
Xuehai Pan
ead741c5fb [BE][4/16] fix typos in torch/ (torch/_dynamo/) (#156314)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156314
Approved by: https://github.com/jingsh
ghstack dependencies: #156313
2025-06-22 08:43:18 +00:00
soulitzer
554b568040 Add internal use only utility to allow externally visible side effects within HOPs (#155715)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155715
Approved by: https://github.com/zou3519
2025-06-21 03:55:28 +00:00
William Wen
2c372a0502 [dynamo] add set_fullgraph decorator/context manager (#154289)
Implements https://github.com/pytorch/pytorch/issues/144908.

Implementation notes:
- `set_fullgraph` is implemented using `patch_config`, which changes config correctly during runtime and tracing.
- Moved setting `config.error_on_graph_break` from convert_frame.py to eval_frame.py. This is because this should only be done at the top-level decorated function. If we kept this in convert_frame.py, we would be changing `config.error_on_graph_break` on every top-level frame, which causes confusing behavior (see added test for example).
- InstructionTranslator reads from `config.error_on_graph_break` every `step()`. This is to determine the value of `config.error_on_graph_break` at the time of the graph break, because tracer cleanup will restore the value of `config.error_on_graph_break` .
- `convert_frame.py` determines whether we should abort tracing (fullgraph=True) or continue (fullgraph=False) by reading the value of the tracer's `error_on_graph_break`. If there is no tracer (failed to initialize), then default to reading `config.error_on_graph_break`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154289
Approved by: https://github.com/jansel, https://github.com/zou3519
ghstack dependencies: #154283
2025-06-20 07:03:07 +00:00
Laith Sakka
3f69e3b3a0 Add view_simple as meta function for view, and avoid calling reshape_view_helper for unbacked (#154757)
address https://github.com/pytorch/pytorch/issues/153303

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154757
Approved by: https://github.com/bobrenjc93, https://github.com/leslie-fang-intel
2025-06-19 04:50:18 +00:00
PyTorch MergeBot
c5d3e7a4ff Revert "[dynamo] add set_fullgraph decorator/context manager (#154289)"
This reverts commit 920f6e681e.

Reverted https://github.com/pytorch/pytorch/pull/154289 on behalf of https://github.com/atalman due to inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_do_not_trigger_dynamic_shapes_on_empty_block_mask_cuda GH job link HUD commit link ([comment](https://github.com/pytorch/pytorch/pull/154289#issuecomment-2984774814))
2025-06-18 15:51:06 +00:00
William Wen
920f6e681e [dynamo] add set_fullgraph decorator/context manager (#154289)
Implements https://github.com/pytorch/pytorch/issues/144908.

Implementation notes:
- `set_fullgraph` is implemented using `patch_config`, which changes config correctly during runtime and tracing.
- Moved setting `config.error_on_graph_break` from convert_frame.py to eval_frame.py. This is because this should only be done at the top-level decorated function. If we kept this in convert_frame.py, we would be changing `config.error_on_graph_break` on every top-level frame, which causes confusing behavior (see added test for example).
- InstructionTranslator reads from `config.error_on_graph_break` every `step()`. This is to determine the value of `config.error_on_graph_break` at the time of the graph break, because tracer cleanup will restore the value of `config.error_on_graph_break` .
- `convert_frame.py` determines whether we should abort tracing (fullgraph=True) or continue (fullgraph=False) by reading the value of the tracer's `error_on_graph_break`. If there is no tracer (failed to initialize), then default to reading `config.error_on_graph_break`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154289
Approved by: https://github.com/jansel, https://github.com/zou3519
ghstack dependencies: #154283
2025-06-18 07:27:00 +00:00
Yu, Guangye
0935a97d95 [Dynamo] Add torch.accelerator API to trace_rules (#155884)
# Motivation
- Add binding API and non-binding API in torch.accelerator to trace rules.
- Add some function in torch.accelerator to const fold functon list for Dynamo capature.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155884
Approved by: https://github.com/jansel, https://github.com/EikanWang
ghstack dependencies: #155787, #155788
2025-06-15 17:09:57 +00:00
Yu, Guangye
b51d803785 [Dynamo] Add XPU API to trace_rules (#155788)
# Motivation
- Add binding API and non-bindling API to trace rules for XPU;
- Add some XPU API to the const fold function for Dynamo capture.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155788
Approved by: https://github.com/jansel, https://github.com/EikanWang
ghstack dependencies: #155787
2025-06-15 17:09:57 +00:00
PyTorch MergeBot
06408dae49 Revert "Add view_simple as meta function for view, and avoid calling reshape_view_helper. (#154757)"
This reverts commit 0029259bdf.

Reverted https://github.com/pytorch/pytorch/pull/154757 on behalf of https://github.com/laithsakka due to post land issue ([comment](https://github.com/pytorch/pytorch/pull/154757#issuecomment-2971385787))
2025-06-13 19:11:43 +00:00
Aleksandar Samardžić
62fa3f5aeb Support tuning of _grouped_mm (#153953)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153953
Approved by: https://github.com/ngimel
2025-06-12 15:39:35 +00:00
Laith Sakka
0029259bdf Add view_simple as meta function for view, and avoid calling reshape_view_helper. (#154757)
address https://github.com/pytorch/pytorch/issues/153303

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154757
Approved by: https://github.com/bobrenjc93, https://github.com/leslie-fang-intel
2025-06-12 09:58:15 +00:00
Oguz Ulgen
d1947a8707 Migrate from lru_cache to cache (#155613)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155613
Approved by: https://github.com/ezyang
ghstack dependencies: #155612
2025-06-11 19:44:18 +00:00
Pian Pawakapan
247f83e0a4 [dynamic shapes] guard individual terms in sym_and; user-code-friendly sym_and/sym_or (#154737)
Previously when processing `sym_and(a, b, c)`, symbolic shapes wouldn't individually process a, b, and c and store their implications. This would lead us to data-dependent error on individual checks, e.g. we stored `u0 >= 0 & u0 <= 10`, but then couldn't figure out `u0 <= 10`.

This handles that, and also makes `sym_and/or` user-code friendly, for testing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154737
Approved by: https://github.com/laithsakka
2025-06-11 18:08:06 +00:00
Bob Ren
0756ebcd48 Add docblock to torch/_dynamo/trace_rules.py (#155401)
Add comprehensive module docstring explaining the tracing rules and policies
that govern TorchDynamo's compilation decisions, including skip rules,
inlining policies, and library-specific handling.

Originally generated by claude but reviewed and edited by me.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155401
Approved by: https://github.com/williamwen42
2025-06-08 04:30:03 +00:00
bobrenjc93
53ecb8159a Introduce statically_known_false (#154291)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154291
Approved by: https://github.com/mengluy0125
2025-05-24 14:23:55 +00:00
Yidi Wu
ceb009baee [map] always turn on dynamo for map (#152041)
Summary:
X-link: https://github.com/pytorch/executorch/pull/10409

Reland D72896450

Make map consistent with other control flow ops. After the change, map is able to support accessing closures in the map fn.

Test Plan: See existing tests.

Reviewed By: zou3519

Differential Revision: D73138427

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152041
Approved by: https://github.com/zou3519
2025-05-12 02:10:08 +00:00
Michael Lazos
20e2ca3e29 [Dynamo] Allow inlining into AO quantization modules (#152934)
This adds dynamo inlining into `torch.ao.quantization.fake_quantize`.

This is needed for QAT compatbility w/ an RL training model.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152934
Approved by: https://github.com/williamwen42
2025-05-07 23:58:11 +00:00
Boyuan Feng
d969e2ec33 [CUDAGraph Trees] support memory allocation on side stream (#152472)
I tried `beginAllocateToPool` instead of `_cuda_beginAllocateCurrentStreamToPool` and the error in #151199 does not happen any more.

However, this approach is unsafe for multithreading. When multiple run_eager happens concurrently, we expect memory allocation to different mem_pool. Since beginAllocateToPool does not check stream, these memory allocation may happen on the same mem_pool.

So, I use `_cuda_beginAllocateCurrentThreadToPool` to direct all memory allocation on the same thread to a given mem_pool. In particular, `_cuda_beginAllocateCurrentThreadToPool` records the launching thread id, and during runtime checks if the current thread id matches the launching thread id.

Fixes #151199

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152472
Approved by: https://github.com/eellison, https://github.com/ngimel
2025-05-02 04:26:35 +00:00
Pian Pawakapan
2ee8de54b1 [dynamic shapes] user-code friendly statically_known_true, has_static_value (#151601)
Fixes #151480

Allows `statically_known_true` in user code, as well as introducing `has_static_value`, returning True if the input has a static bool/float/int value

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151601
Approved by: https://github.com/laithsakka, https://github.com/zou3519, https://github.com/jingsh
2025-04-24 02:53:59 +00:00
William Wen
5b9df57b50 [dynamo] context manager/decorator for dynamo config patching during tracing (#150586)
Implement traceable config patching for Dynamo: enables restricted patching of Dynamo config where user can use a context manager/decorator to change tracing behavior for parts of the code.

The new `dont_skip_tracing` decorator/context manager for ignoring most trace rules is easily implemented with this more generic traceable config patching feature.

Implementation:
- Create a new specialized context manager class representing a wrapper around torch._dynamo.config.patch
- Dynamo doesn't trace into the context manager but updates config at compile time
- Correctness is based on our correctness for handling supported context managers
- Implementation is inspired by how `GradModeVariable` is implemented.

Previous attempts: https://github.com/pytorch/pytorch/pull/148736 (decorator-only global approach) and https://github.com/pytorch/pytorch/pull/149439 (decorator-only traceback approach)

See https://docs.google.com/document/d/1vWNwKL_jpg-PLopifcaSa338wks3GqSVF4GHRguybGg/edit?tab=t.0 for more details on implementation - including previous approaches.

NOTE: this PR fixes a bug where skipped code objects were not tracked by convert_frame.py, leading to cases where code objects would be automatically skipped even after `torch._dynamo.reset()`. This exposed some latent dynamo-wrapped test failures in CI that previously passed in CI but not locally.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150586
Approved by: https://github.com/jansel, https://github.com/zou3519, https://github.com/anijain2305
2025-04-23 09:12:13 +00:00
PyTorch MergeBot
6a3a6d22dc Revert "[dynamo] context manager/decorator for dynamo config patching during tracing (#150586)"
This reverts commit 40ce4fb24a.

Reverted https://github.com/pytorch/pytorch/pull/150586 on behalf of https://github.com/clee2000 due to broke some inductor tests? inductor/test_fuzzer.py::TestConfigFuzzer::test_config_fuzzer_dynamo_bisect [GH job link](https://github.com/pytorch/pytorch/actions/runs/14486513628/job/40635178179) [HUD commit link](40ce4fb24a), bad TD ([comment](https://github.com/pytorch/pytorch/pull/150586#issuecomment-2810064322))
2025-04-16 16:13:47 +00:00
William Wen
40ce4fb24a [dynamo] context manager/decorator for dynamo config patching during tracing (#150586)
Implement traceable config patching for Dynamo: enables restricted patching of Dynamo config where user can use a context manager/decorator to change tracing behavior for parts of the code.

The new `dont_skip_tracing` decorator/context manager for ignoring most trace rules is easily implemented with this more generic traceable config patching feature.

Implementation:
- Create a new specialized context manager class representing a wrapper around torch._dynamo.config.patch
- Dynamo doesn't trace into the context manager but updates config at compile time
- Correctness is based on our correctness for handling supported context managers
- Implementation is inspired by how `GradModeVariable` is implemented.

Previous attempts: https://github.com/pytorch/pytorch/pull/148736 (decorator-only global approach) and https://github.com/pytorch/pytorch/pull/149439 (decorator-only traceback approach)

See https://docs.google.com/document/d/1vWNwKL_jpg-PLopifcaSa338wks3GqSVF4GHRguybGg/edit?tab=t.0 for more details on implementation - including previous approaches.

NOTE: this PR fixes a bug where skipped code objects were not tracked by convert_frame.py, leading to cases where code objects would be automatically skipped even after `torch._dynamo.reset()`. This exposed some latent dynamo-wrapped test failures in CI that previously passed in CI but not locally.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150586
Approved by: https://github.com/jansel, https://github.com/zou3519, https://github.com/anijain2305
2025-04-16 06:49:58 +00:00
Oguz Ulgen
8d5f7ab06c Replace all random is_fbcode imports to environment (#151283)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151283
Approved by: https://github.com/masnesral, https://github.com/Skylion007
2025-04-15 19:42:58 +00:00
Shangdi Yu
83d88d128d [reland] Make export._trace._WrapperModule work in strict mode (#146919) (#151264)
Summary:

as title

`export._trace._WrapperModule` is used to wrap functions into a Module so we can export the function.

We add `export._wrapper_utils` to `dynamo`'s `MOD_INLINELIST` so dynamo traces into `_WrapperModule`

Fixes https://github.com/pytorch/pytorch/issues/146867

Test Plan:
```
buck run fbcode//mode/dev-nosan //caffe2/test:test_export -- -r wrapper_module
```

Differential Revision: D72986826

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151264
Approved by: https://github.com/angelayi
2025-04-15 18:35:34 +00:00
PyTorch MergeBot
4a47dd9b3f Revert "[map] always turn on dynamo for map (#150962)"
This reverts commit a72d56cb6b.

Reverted https://github.com/pytorch/pytorch/pull/150962 on behalf of https://github.com/Camyll due to breaking internal builds {SHORT_REASON} ([comment](https://github.com/pytorch/pytorch/pull/150962#issuecomment-2803006282))
2025-04-14 21:09:22 +00:00
PyTorch MergeBot
2f899f07aa Revert "Make export._trace._WrapperModule work in strict mode (#146919)"
This reverts commit dad5e5e262.

Reverted https://github.com/pytorch/pytorch/pull/146919 on behalf of https://github.com/malfet due to Broke lint, see https://github.com/pytorch/pytorch/actions/runs/14415686353/job/40431799827 ([comment](https://github.com/pytorch/pytorch/pull/146919#issuecomment-2798446930))
2025-04-12 04:12:36 +00:00
Shangdi Yu
dad5e5e262 Make export._trace._WrapperModule work in strict mode (#146919)
Summary:
as title

`export._trace._WrapperModule` is used to wrap functions into a Module so we can export the function.

We add `export._wrapper_utils` to `dynamo`'s `MOD_INLINELIST` so dynamo traces into `_WrapperModule`

Fixes https://github.com/pytorch/pytorch/issues/146867

Test Plan:
```
 buck run fbcode//mode/dev-nosan //caffe2/test:test_export -- -r wrapper_module
```

Differential Revision: D69434316

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146919
Approved by: https://github.com/angelayi
2025-04-12 03:22:08 +00:00
Yidi Wu
a72d56cb6b [map] always turn on dynamo for map (#150962)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150962
Approved by: https://github.com/zou3519
2025-04-11 23:28:06 +00:00
Bert Maher
2d187bf7e6 Support tuning of _scaled_grouped_mm (#150421)
This includes the default aten implementation, as well as a Triton
implementation imported from FBGEMM
(https://github.com/pytorch/FBGEMM/blob/main/fbgemm_gpu/experimental/gemm/triton_gemm/grouped_gemm.py)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150421
Approved by: https://github.com/ngimel
2025-04-11 23:03:49 +00:00
PyTorch MergeBot
6a65f2c4fe Revert "Support tuning of _scaled_grouped_mm (#150421)"
This reverts commit 8efcf21fff.

Reverted https://github.com/pytorch/pytorch/pull/150421 on behalf of https://github.com/malfet due to Looks like it broke lint, see a0ab243c3a/1 ([comment](https://github.com/pytorch/pytorch/pull/150421#issuecomment-2795218547))
2025-04-10 21:36:41 +00:00
Bert Maher
8efcf21fff Support tuning of _scaled_grouped_mm (#150421)
This includes the default aten implementation, as well as a Triton
implementation imported from FBGEMM
(https://github.com/pytorch/FBGEMM/blob/main/fbgemm_gpu/experimental/gemm/triton_gemm/grouped_gemm.py)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150421
Approved by: https://github.com/ngimel
2025-04-10 20:34:16 +00:00
ZhiweiYan-96
52d172eafd Facilitate at::_weight_int4pack_mm_with_scale_and_zeros related registration (#147962)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147962
Approved by: https://github.com/jerryzh168, https://github.com/guangyey, https://github.com/EikanWang
ghstack dependencies: #137566

Co-authored-by: xiaolil1 <xiaoli.liu@intel.com>
2025-04-08 15:36:07 +00:00
Guilherme Leobas
f3b2fb6c66 Allow trace through unittest (#146500)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146500
Approved by: https://github.com/anijain2305
2025-04-08 14:55:17 +00:00
FFFrog
f8aa6404ac Refactor: add initialization of math.lcm into torch_c_binding_in_graph_functions (#150766)
As the title stated.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150766
Approved by: https://github.com/aorenste, https://github.com/jansel
2025-04-08 04:12:26 +00:00
Laith Sakka
f9f6c080d8 support guard or false/true in user code and add tests (#150178)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150178
Approved by: https://github.com/pianpwk
2025-04-04 01:19:14 +00:00
Ryan Guo
33535b3eee [dynamo] Support Tensor subclass that has dynamic attributes or calls Parameter.__torch_function__ (#149482)
This fixes most of https://github.com/huggingface/diffusers/issues/10795,
except for `torch.Tensor._make_subclass`, which will be fixed in a
subsequent patch.

The relevant tensor subclass from the aforementioned issue is defined
here: fbf6b856cc/src/diffusers/quantizers/gguf/utils.py (L398-L435).

There are two things to note about the tensor subclass:
1. it calls `super().__torch_function__`, which is
   `torch._C._disabled_torch_function_impl`, so this patch updates
   `SuperVariable.call_method` to handle it (we can't do a simpler
   polyfill due to some bug with `var_getattr` raising
   `NotImplementedError`, which forgot to restore symbolic context).
2. it sets and reads attributes (`quant_type`), and
   defines new methods (`as_data`), so this patch adds support for those.
3. it has a `__init__`, which Dynamo needs to trace through in
   `TensorSubclassVariable.call_function`.

Differential Revision: [D71906140](https://our.internmc.facebook.com/intern/diff/D71906140)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149482
Approved by: https://github.com/jansel, https://github.com/mlazos
2025-04-02 20:56:43 +00:00
PyTorch MergeBot
03c879d59b Revert "[dynamo] Support Tensor subclass that has dynamic attributes or calls Parameter.__torch_function__ (#149482)"
This reverts commit 98453c135a.

Reverted https://github.com/pytorch/pytorch/pull/149482 on behalf of https://github.com/malfet due to Broke trunk, see b03c42109c/1 ([comment](https://github.com/pytorch/pytorch/pull/149482#issuecomment-2773650522))
2025-04-02 20:30:33 +00:00
Ryan Guo
98453c135a [dynamo] Support Tensor subclass that has dynamic attributes or calls Parameter.__torch_function__ (#149482)
This fixes most of https://github.com/huggingface/diffusers/issues/10795,
except for `torch.Tensor._make_subclass`, which will be fixed in a
subsequent patch.

The relevant tensor subclass from the aforementioned issue is defined
here: fbf6b856cc/src/diffusers/quantizers/gguf/utils.py (L398-L435).

There are two things to note about the tensor subclass:
1. it calls `super().__torch_function__`, which is
   `torch._C._disabled_torch_function_impl`, so this patch updates
   `SuperVariable.call_method` to handle it (we can't do a simpler
   polyfill due to some bug with `var_getattr` raising
   `NotImplementedError`, which forgot to restore symbolic context).
2. it sets and reads attributes (`quant_type`), and
   defines new methods (`as_data`), so this patch adds support for those.
3. it has a `__init__`, which Dynamo needs to trace through in
   `TensorSubclassVariable.call_function`.

Differential Revision: [D71906140](https://our.internmc.facebook.com/intern/diff/D71906140)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149482
Approved by: https://github.com/jansel, https://github.com/mlazos
2025-04-02 17:05:12 +00:00
Michael Lazos
ce5adc5c05 [Dynamo] add support for torch._C._is_torch_function_all_disabled (#149490)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149490
Approved by: https://github.com/StrongerXi
ghstack dependencies: #149489
2025-03-20 22:19:55 +00:00
Shuai Yang
00a2c68f67 Fix a typo "trochrec" to "torchrec" (#149542)
Summary: As titled, the path is incorrect due to the typo

Test Plan: CI

Differential Revision: D71490709

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149542
Approved by: https://github.com/williamwen42
2025-03-20 10:14:23 +00:00
Marko Radmilac
c65ee728f0 Initial implementation of host memory stats (#147660)
This is an initial attempt to provide some statistics for the pinned host memory allocations flowing through CachingHostAllocator. Many times in the past we have had inexplicable slowdowns that would be much easier to diagnose if we had some host memory characteristics.

This change tries very hard not to disrupt the initial design of the allocator, and it uses existing locking mechanism, whenever possible, to gather statistics "for free". Only deviation from that is on the "slow path" where we incur CUDA calls anyway, so taking a short lock is not going to hurt the performance much, especially in the steady state where most allocations will come from cache.

As mentioned before, this is the first PR, to introduce the concept and to see if it fits the right paradigm. We can always add more later.

Metrics that would require more involved changes to the code base and locks, like requested memory, have been punted for now. I also tried to reuse the Stat structure used in CUDA caching allocator, in order to maintain symmetry.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147660
Approved by: https://github.com/ngimel
2025-03-05 16:13:19 +00:00
PyTorch MergeBot
a983b2b11a Revert "Initial implementation of host memory stats (#147660)"
This reverts commit 945e359fc1.

Reverted https://github.com/pytorch/pytorch/pull/147660 on behalf of https://github.com/mradmila due to There is an issue with ambiguous definition of Stat structure when different C++ tools are used. Backing out for now. ([comment](https://github.com/pytorch/pytorch/pull/147660#issuecomment-2692346379))
2025-03-01 18:05:45 +00:00
Marko Radmilac
945e359fc1 Initial implementation of host memory stats (#147660)
This is an initial attempt to provide some statistics for the pinned host memory allocations flowing through CachingHostAllocator. Many times in the past we have had inexplicable slowdowns that would be much easier to diagnose if we had some host memory characteristics.

This change tries very hard not to disrupt the initial design of the allocator, and it uses existing locking mechanism, whenever possible, to gather statistics "for free". Only deviation from that is on the "slow path" where we incur CUDA calls anyway, so taking a short lock is not going to hurt the performance much, especially in the steady state where most allocations will come from cache.

As mentioned before, this is the first PR, to introduce the concept and to see if it fits the right paradigm. We can always add more later.

Metrics that would require more involved changes to the code base and locks, like requested memory, have been punted for now. I also tried to reuse the Stat structure used in CUDA caching allocator, in order to maintain symmetry.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147660
Approved by: https://github.com/ngimel
2025-02-28 18:36:44 +00:00
Yuanhao Ji
0a948f705b [Dynamo] Fix AssertionError when dynamo traces torch.functional.xxx() functions (#148075)
Fixes #147840

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148075
Approved by: https://github.com/yanboliang
2025-02-28 15:09:11 +00:00