19 Commits

Author SHA1 Message Date
Xinya Zhang
e769026bcb [ROCm] Remove HIPBLASLT_ALLOW_TF32 from codebase (#162998)
A few UT failures are caused by `HIPBLASLT_ALLOW_TF32`

Fixes #157094
Fixes #157093
Fixes #157092
Fixes #157091
Fixes #157064
Fixes #157063
Fixes #157062
Fixes #157061
Fixes #157042
Fixes #157041
Fixes #157039
Fixes #157004

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162998
Approved by: https://github.com/jeffdaily

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
2025-09-18 13:53:48 +00:00
PyTorch MergeBot
66308fb470 Revert "[ROCm] Remove HIPBLASLT_ALLOW_TF32 from codebase (#162998)"
This reverts commit cef815dc2c.

Reverted https://github.com/pytorch/pytorch/pull/162998 on behalf of https://github.com/huydhn due to Sorry for reverting this, but it seems to break a test in trunk ([comment](https://github.com/pytorch/pytorch/pull/162998#issuecomment-3300280242))
2025-09-16 20:39:41 +00:00
Xinya Zhang
cef815dc2c [ROCm] Remove HIPBLASLT_ALLOW_TF32 from codebase (#162998)
A few UT failures are caused by `HIPBLASLT_ALLOW_TF32`

Fixes #157094, #157093, #157092, #157091, #157064, #157063, #157062, #157061, #157042, #157041, #157039, #157004

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162998
Approved by: https://github.com/jeffdaily

Co-authored-by: Jeff Daily <jeff.daily@amd.com>
2025-09-16 12:48:45 +00:00
Michael Lazos
01bba62e21 Remove unused test code (#160823)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160823
Approved by: https://github.com/Skylion007
2025-08-18 18:37:52 +00:00
Michael Lazos
8d6d324631 [Dynamo][Hierarchical-Compile] Don't allow node duplicates to be added (#160605)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160605
Approved by: https://github.com/StrongerXi
2025-08-14 20:02:10 +00:00
Michael Lazos
ecde76c764 [Hierarchical Compile] Sort all regions identically (#158814)
Previously, we topologically sorted each region individually. This works well unless some nodes have no arguments, in which case their order may differ between regions. To rectify this, we sort the first region as the reference region and use that sort order to sort the remaining regions.
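
The reference-sort idea can be sketched as follows. This is a minimal illustration, not the code from the PR: `sort_regions_identically` and `topo_index` are hypothetical names, and the sketch assumes that nodes at the same position in each region correspond to one another.

```python
def sort_regions_identically(regions, topo_index):
    """Sort every region using the order derived from the first region.

    regions:    list of regions; position i in each region is assumed to
                hold corresponding nodes (hypothetical layout).
    topo_index: maps a node name to its position in a topological order.
    """
    reference = regions[0]
    # Permutation that puts the reference region into topological order.
    order = sorted(range(len(reference)), key=lambda i: topo_index[reference[i]])
    # Apply the same permutation to every region, including the reference.
    return [[region[i] for i in order] for region in regions]


# In the second region, "b2" precedes "a2" topologically (neither has
# arguments), so sorting it on its own would give a different order.
regions = [["c1", "a1", "b1"], ["c2", "a2", "b2"]]
topo_index = {"a1": 0, "b1": 1, "c1": 2, "b2": 3, "c2": 4, "a2": 5}
sorted_regions = sort_regions_identically(regions, topo_index)
```

Because only the first region's order is consulted, all regions end up with matching node positions even when an independent sort would disagree.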

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158814
Approved by: https://github.com/williamwen42
2025-08-13 11:55:23 +00:00
Michael Lazos
4628f1b7a9 [Hierarchical-Compile] Track mutations for setitem (#155880)
This fixes a bug in tensor variable where we would not, for example, set the example value on setitem nodes. These nodes don't typically have users, so the omission usually doesn't matter.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155880
Approved by: https://github.com/anijain2305
2025-06-13 18:59:31 +00:00
Michael Lazos
d3d655ad14 [Hierarchical-Compile] Hash int args in addition to input shapes (#155655)
Fixes Swsl_resnext101_32x16d in TIMM

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155655
Approved by: https://github.com/anijain2305
2025-06-12 06:35:12 +00:00
Michael Lazos
ff039d39ec [Dynamo] Optimize dedupe region ancestor tracking (#152589)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152589
Approved by: https://github.com/anijain2305
ghstack dependencies: #152389, #152505, #152410, #152506, #152570, #152572
2025-05-13 12:17:59 +00:00
Michael Lazos
023a3dc69f [Hierarchical Compilation] Track node mutations (#152389)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152389
Approved by: https://github.com/anijain2305
2025-05-13 12:17:59 +00:00
PyTorch MergeBot
5c3fddb9cc Revert "[Hierarchical Compilation] Track node mutations (#152389)"
This reverts commit c2936ebfd5.

Reverted https://github.com/pytorch/pytorch/pull/152389 on behalf of https://github.com/jeanschmidt due to Humm, interesting, there seems to be a bug in stack PRs, as it should be part of the stack and be reverted with the other ones ([comment](https://github.com/pytorch/pytorch/pull/152389#issuecomment-2873540451))
2025-05-12 18:18:44 +00:00
PyTorch MergeBot
aa7fe6af41 Revert "[Dynamo] Optimize dedupe region ancestor tracking (#152589)"
This reverts commit b5f1345f72.

Reverted https://github.com/pytorch/pytorch/pull/152589 on behalf of https://github.com/jeanschmidt due to Breaking internal signal citadel-fbcode-test-mode-opt-for-pt2_stack_for_internal-linux-0 please see diff [D74531503](https://www.internalfb.com/diff/D74531503) for more details ([comment](https://github.com/pytorch/pytorch/pull/152410#issuecomment-2871168679))
2025-05-12 07:15:09 +00:00
Michael Lazos
b5f1345f72 [Dynamo] Optimize dedupe region ancestor tracking (#152589)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152589
Approved by: https://github.com/anijain2305
ghstack dependencies: #152389, #152505, #152410, #152506, #152570, #152572
2025-05-10 08:27:56 +00:00
Michael Lazos
c2936ebfd5 [Hierarchical Compilation] Track node mutations (#152389)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152389
Approved by: https://github.com/anijain2305
2025-05-10 08:27:01 +00:00
Xiaodong Wang
3d3a07963f [reland][attempt2][AMD] Turn on TF32 for aten::mm (#144145)
Summary:
https://github.com/pytorch/pytorch/pull/143549 was reverted due to an internal/OSS tooling issue. Relanding.

hipBLASLt supports TF32, so this adds the support.
Original PR https://github.com/pytorch/pytorch/pull/139869

Test Plan: CI

Differential Revision: D67785496

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144145
Approved by: https://github.com/jianyuh
2025-01-06 00:37:01 +00:00
Michael Lazos
5c3996cab2 [Dynamo] topologically sort duplicated graph regions (#143523)
Ensure regions are topologically sorted

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143523
Approved by: https://github.com/williamwen42
2024-12-19 00:43:48 +00:00
Tom Ritchford
d25e6e623f Fix unused Python variables in test/[a-d]* (#134665)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134665
Approved by: https://github.com/albanD
2024-12-13 22:13:12 +00:00
Michael Lazos
49e4307686 [Dynamo] add debug logging for graph region expansion (#141382)
This PR adds debug logging for the region expansion algorithm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141382
Approved by: https://github.com/williamwen42
ghstack dependencies: #141381
2024-12-11 02:22:21 +00:00
Michael Lazos
96c36a6947 [Dynamo] Implement graph region tracking for deduplication (#141381)
This PR implements graph region tracking for later extraction into common subgraphs. The algorithm is as follows:

`GraphRegionTracker` tracks each node added to the output graph and generates a key based on the source location, instruction pointer, input shapes, and global state at the time the node is inserted into the graph. Nodes with the same key are grouped together in a list of identical nodes.
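
A minimal sketch of the key-based grouping described above. It does not use the real `GraphRegionTracker` API: `ToyNode`, `node_key`, and `group_identical_nodes` are hypothetical names, and dropping single-node groups is an assumption made for the illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyNode:
    # Hypothetical stand-in for a graph node and its capture-time context.
    name: str
    source_location: str
    instruction_pointer: int
    input_shapes: tuple
    global_state: tuple


def node_key(node):
    # The key combines the four properties named in the commit message.
    return (
        node.source_location,
        node.instruction_pointer,
        node.input_shapes,
        node.global_state,
    )


def group_identical_nodes(nodes):
    groups = defaultdict(list)
    for node in nodes:
        groups[node_key(node)].append(node)
    # Assumption: only keys shared by two or more nodes are interesting.
    return [group for group in groups.values() if len(group) > 1]


nodes = [
    ToyNode("linear_0", "model.py:10", 4, ((2, 3),), ("grad_enabled",)),
    ToyNode("linear_1", "model.py:10", 4, ((2, 3),), ("grad_enabled",)),
    ToyNode("relu_0", "model.py:11", 8, ((2, 3),), ("grad_enabled",)),
]
identical = group_identical_nodes(nodes)
```

Here the two `linear_*` nodes share a key and land in one group, while `relu_0` has no duplicate and is left out.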

Once graph capture is complete, these nodes are organized into region groups. A region group looks like this:
[[IdenticalNode1], [IdenticalNode2], [IdenticalNode3]]
and each sublist is called a region. For each region group (starting at the topologically latest region group), the inner regions are gradually expanded one node at a time from the args and kwargs of the node in each region, provided that for all regions in the group, the nodes being added are also identical (i.e. have the same key computed above). The `get_identical_regions` function is the main entry point, which will be used by the graph replacement algorithm in #141383.
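
The lockstep expansion can be sketched roughly as below. This is a simplified illustration rather than the PR's implementation: it handles positional args only, and `expand_region_group`, `args_of`, and `key_of` are hypothetical names.

```python
from collections import deque


def expand_region_group(region_group, args_of, key_of):
    """Grow every region in lockstep, one argument position at a time.

    region_group: list of regions, each starting with one seed node name.
    args_of:      maps a node name to the list of its argument node names.
    key_of:       maps a node name to its identity key.
    """
    regions = [list(region) for region in region_group]
    seen = {name for region in regions for name in region}
    queue = deque([tuple(region[0] for region in regions)])
    while queue:
        current = queue.popleft()
        arg_lists = [args_of[name] for name in current]
        if len({len(args) for args in arg_lists}) != 1:
            continue  # argument counts differ; cannot expand in lockstep
        for candidates in zip(*arg_lists):
            if len({key_of[c] for c in candidates}) != 1:
                continue  # candidates are not identical across regions
            if len(set(candidates)) != len(candidates):
                continue  # the same node would be shared by two regions
            if any(c in seen for c in candidates):
                continue  # node already belongs to a region
            for region, candidate in zip(regions, candidates):
                region.append(candidate)
                seen.add(candidate)
            queue.append(candidates)
    return regions


# Two stacked linear -> relu blocks fed from one input "x".
args_of = {"x": [], "lin1": ["x"], "relu1": ["lin1"],
           "lin2": ["relu1"], "relu2": ["lin2"]}
key_of = {"x": "input", "lin1": "lin", "lin2": "lin",
          "relu1": "relu", "relu2": "relu"}
expanded = expand_region_group([["relu1"], ["relu2"]], args_of, key_of)
```

Expansion stops at `lin1` and `lin2` because their arguments (`x` and `relu1`) have different keys, so each region ends up holding one relu node and one linear node.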

Edge cases to add more testing for in future PRs (in progress):
* ~~multiple nodes on the same line~~ (implemented)
* ~~dynamic shapes checking (need to verify symbolic inputs are the same across subgraphs)~~ (implemented)
* ensure we don't expand regions where it will create a cycle during subgraph replacement
* ensure outputs are always tensors (or tuples of tensors iirc)
* ~~out of order kwargs, unevenly nested kwargs~~ (implemented)
* input aliasing - TBD, we may add support for this in `invoke_subgraph` or reuse the aliasing analysis here to not form regions with these properties
* ~~all global state~~ (implemented)

Other followups:
* consolidate global state checking across all caching infra

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141381
Approved by: https://github.com/zou3519
2024-12-11 02:22:21 +00:00