Commit Graph

24 Commits

Author SHA1 Message Date
Xuehai Pan
fc0376e8b1 [BE][2/6] fix typos in test/ (test/test_*.py) (#157636)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157636
Approved by: https://github.com/yewentao256, https://github.com/mlazos
ghstack dependencies: #156311, #156609
2025-07-09 11:02:23 +00:00
Tom Ritchford
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
Zhou, Lingzhi
35532fc477 [Partitioner] Reuse partition to check whether nodes exist (#135317)
Checking whether a node is in a NodeList takes O(n) time. Reuse the partition instead, since partition.nodes is a hash table with the same elements.
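The idea can be sketched in plain Python (the `Partition` class below is a simplified stand-in for the FX `Partition`, not the actual implementation):

```python
# Sketch of the optimization: membership tests against a list are O(n),
# while a set (hash table) answers the same question in O(1) on average.
# `Partition` is an illustrative stand-in whose `nodes` attribute holds
# the same elements as the NodeList.

class Partition:
    def __init__(self, nodes):
        self.nodes = set(nodes)  # hash table: O(1) average membership

def node_in_node_list(node, node_list):
    return node in node_list        # linear scan: O(n)

def node_in_partition(node, partition):
    return node in partition.nodes  # hash lookup: O(1) average

node_list = [f"node_{i}" for i in range(1000)]
partition = Partition(node_list)
slow = node_in_node_list("node_999", node_list)
fast = node_in_partition("node_999", partition)
```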

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135317
Approved by: https://github.com/ezyang
2024-09-21 23:52:02 +00:00
PyTorch MergeBot
c025f7becc Revert "[Partitioner] Reuse partition to check whether nodes exist (#135317)"
This reverts commit e004d539da.

Reverted https://github.com/pytorch/pytorch/pull/135317 on behalf of https://github.com/izaitsevfb due to BC-breaking, breaks executorch and internal meta builds ([comment](https://github.com/pytorch/pytorch/pull/135317#issuecomment-2344730294))
2024-09-11 21:27:53 +00:00
Zhou, Lingzhi
e004d539da [Partitioner] Reuse partition to check whether nodes exist (#135317)
Checking whether a node is in a NodeList takes O(n) time. Reuse the partition instead, since partition.nodes is a hash table with the same elements.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135317
Approved by: https://github.com/ezyang
2024-09-10 17:45:29 +00:00
Oguz Ulgen
221350e3a4 Add None return type to init -- tests (#132352)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132352
Approved by: https://github.com/ezyang
ghstack dependencies: #132335, #132351
2024-08-01 15:44:51 +00:00
Ma-Jian1
da2f4bbc33 remove empty partition (#124920)
In some rare scenarios, the partitioner will produce an empty partition. It's a waste of time to compile an empty graph.
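A minimal sketch of the fix, with partitions modeled as plain lists of node names (illustrative, not the real partitioner data structures):

```python
# Drop empty partitions before compilation: compiling an empty graph is
# wasted work. `partitions` mimics the partitioner's proposed output.

def remove_empty_partitions(partitions):
    return [p for p in partitions if len(p) > 0]

partitions = [["add", "mul"], [], ["relu"]]
non_empty = remove_empty_partitions(partitions)
```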

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124920
Approved by: https://github.com/ezyang
2024-05-09 07:39:47 +00:00
PyTorch MergeBot
b2f521f376 Revert "remove empty partition (#124920)"
This reverts commit 98835fff9f.

Reverted https://github.com/pytorch/pytorch/pull/124920 on behalf of https://github.com/clee2000 due to I think Dr CI is wrong, the xla failure looks real 98835fff9f https://github.com/pytorch/pytorch/actions/runs/8840540357/job/24278180954 ([comment](https://github.com/pytorch/pytorch/pull/124920#issuecomment-2078495051))
2024-04-26 02:03:01 +00:00
Ma-Jian1
98835fff9f remove empty partition (#124920)
In some rare scenarios, the partitioner will produce an empty partition. It's a waste of time to compile an empty graph.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124920
Approved by: https://github.com/ezyang
2024-04-25 23:07:41 +00:00
Sergii Dymchenko
bd9db6a9c7 Update to TorchFix 0.4.0 (#119424)
`torch.library.Library` updated to `torch.library._scoped_library` in files with many tests where the change seemed obvious; otherwise `noqa: TOR901` was added - see https://github.com/pytorch/pytorch/pull/118318 for more context.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119424
Approved by: https://github.com/zou3519
2024-02-12 23:30:12 +00:00
seanlatias
aad017183d Introduce aggressive merge to CapabilityPartitioner (#100195)
With the old partitioner, assuming `add` is supported, the following code
```python
def fn(a, b, c, d):
    x = a + b # add
    y = c + d # add_1
    return (x, y)

traced = symbolic_trace(fn)
partitioner = CapabilityBasedPartitioner(traced, supported_ops, allows_single_node_partition=True)
partitions = partitioner.propose_partitions()
```
results in the partitions `[[add], [add_1]]`. However, since these two partitions do not depend on each other, they can be aggressively merged into a single partition `[[add, add_1]]` without causing any issues. This PR introduces a new feature that allows such aggressive merging by introducing an option `aggressive_merge` to the Partitioner class.
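A hedged sketch of the merge criterion (dependencies are modeled as explicit edge lists here; the real partitioner walks the FX graph):

```python
# Two partitions may be merged "aggressively" when neither depends on the
# other, i.e. no data flows between them. Node names follow the example
# above; the edge-list encoding is illustrative only.

def depends_on(p1, p2, edges):
    # True if some node in p1 consumes the output of a node in p2
    return any(src in p2 and dst in p1 for src, dst in edges)

def aggressive_merge(p1, p2, edges):
    if not depends_on(p1, p2, edges) and not depends_on(p2, p1, edges):
        return p1 | p2  # independent partitions: safe to merge
    return None         # dependent: not handled by this aggressive path

# x = a + b and y = c + d share no data flow, so [add] and [add_1] merge:
edges = [("a", "add"), ("b", "add"), ("c", "add_1"), ("d", "add_1")]
merged = aggressive_merge({"add"}, {"add_1"}, edges)

# If add_1 consumed add's output, this path would refuse the merge:
not_merged = aggressive_merge({"add"}, {"add_1"}, [("add", "add_1")])
```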

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100195
Approved by: https://github.com/SherlockNoMad
2023-05-05 23:20:17 +00:00
jjsjann123
192a11d49c refactor the dfs cyclic search from recursion to iterative approach (#91042)
Follow up on PR #86511

Python's default recursion-depth limit of 1000 makes it impractical to run the cyclic check on larger graphs. This refactor avoids that issue.
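The refactor's core move can be sketched as a reachability check with an explicit stack (a simplified, standalone version; the real pass operates on FX partitions):

```python
# A recursive DFS overflows Python's default recursion limit (~1000
# frames) on deep graphs, so the search keeps an explicit stack instead.
# Reachability is the building block of the cyclic search.

def reaches(graph, start, target):
    """Iterative DFS: is `target` reachable from `start`?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False

# A chain far deeper than the recursion limit poses no problem:
deep_chain = {i: [i + 1] for i in range(10_000)}
found = reaches(deep_chain, 0, 10_000)
```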
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91042
Approved by: https://github.com/kit1980
2022-12-20 23:15:30 +00:00
jjsjann123
af09270e10 nvprims bookend non compute (#88457)
Cherry-picking: https://github.com/csarofeen/pytorch/pull/2099

1. enabling bookend non-compute-ops pass on nvfuser
2. fixing bookend op check on intermediate tensor as partition inputs
3. python tests added for `getitem` special handling and bookend_non_compute removal
4. patching dfs by excluding dfs within a partition to avoid exceeding the recursion limit
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88457
Approved by: https://github.com/SherlockNoMad
2022-11-08 12:06:35 +00:00
Ivan Yashchuk
72f446b9bc Remove getitem special handling in the partitioner (#87073)
This special handling of getitem unnecessarily splits fusions at functions with tuple outputs.

Example script:
```py
import torch
from torch.fx.passes.infra.partitioner import CapabilityBasedPartitioner
from torch._prims.nvfuser_executor import NvfuserPrimOperatorSupport
from torch.fx.experimental.proxy_tensor import make_fx

def func(x):
    xx = torch.ops.nvprims.add(x, 1)
    var, mean = torch.ops.nvprims.var_mean(x, correction=0)
    var_cos = torch.ops.nvprims.cos(var)
    mean_sin = torch.ops.nvprims.sin(mean)
    return torch.ops.nvprims.add(var_cos, mean_sin)

a = torch.randn(5, 3, 3, device="cuda")
gm = make_fx(func)(a)
gm.graph.print_tabular()

supported_ops = NvfuserPrimOperatorSupport()
partitioner = CapabilityBasedPartitioner(
    gm, supported_ops, allows_single_node_partition=False
)
partitions = partitioner.propose_partitions()
print(partitions)
partitioned_graph = partitioner.fuse_partitions(partitions)
partitioned_graph.graph.print_tabular()
```
Output on master:
```py
opcode         name       target                       args              kwargs
-------------  ---------  ---------------------------  ----------------  -----------------
placeholder    x_1        x_1                          ()                {}
call_function  add        nvprims.add.default          (x_1, 1)          {}
call_function  var_mean   nvprims.var_mean.main        (x_1, [0, 1, 2])  {'correction': 0}
call_function  getitem    <built-in function getitem>  (var_mean, 0)     {}
call_function  getitem_1  <built-in function getitem>  (var_mean, 1)     {}
call_function  cos        nvprims.cos.default          (getitem,)        {}
call_function  sin        nvprims.sin.default          (getitem_1,)      {}
call_function  add_1      nvprims.add.default          (cos, sin)        {}
output         output     output                       (add_1,)          {}
[{cos, sin, add_1}, {var_mean, add, getitem, getitem_1}]
opcode         name       target                       args                    kwargs
-------------  ---------  ---------------------------  ----------------------  --------
placeholder    x_1        x_1                          ()                      {}
call_module    fused_1    fused_1                      (x_1,)                  {}
call_function  getitem_2  <built-in function getitem>  (fused_1, 0)            {}
call_function  getitem_3  <built-in function getitem>  (fused_1, 1)            {}
call_module    fused_0    fused_0                      (getitem_2, getitem_3)  {}
output         output     output                       (fused_0,)              {}
```
Output with this PR:
```
[{var_mean, add_1, cos, sin, add, getitem_1, getitem}]
opcode       name     target    args        kwargs
-----------  -------  --------  ----------  --------
placeholder  x_1      x_1       ()          {}
call_module  fused_0  fused_0   (x_1,)      {}
output       output   output    (fused_0,)  {}
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87073
Approved by: https://github.com/jjsjann123, https://github.com/SherlockNoMad
2022-10-26 14:18:46 +00:00
jjsjann123
f903f1ab34 Patching getitem in partitioner (#86713)
1. rejecting the getitem operator in the backend fusion query; getitem is merged in a special post-partition pass, so backends that take getitem shouldn't affect the logic
2. added test for failing cases

Fixes #86698

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86713
Approved by: https://github.com/SherlockNoMad
2022-10-12 07:50:46 +00:00
jjsjann123
2cb330ab15 Acyclic partition patch (#86511)
Fixes #86159 and #86108

Refactored graph partition to check for cyclic dependency on each partition merge, instead of relying on a pre-baked dependency map.

The previous implementation failed to update dependencies on existing partitions. When a fusion happens, the updated dependency map needs to be propagated to all nodes in the graph, so that each node in a partition shares an identical dependency set. This is what allowed the cyclic dependency in issue #86159 to go undetected.

Updated implementation does a cyclic check on partitioned graph before attempting a merge of two partitions.

- [x] python repro added with cyclic dependency after partition `TestFXGraphPasses.forward12`
- [x] fix dependency map with updated implementation using cyclic check
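The cyclic check can be sketched as follows (the adjacency-dict graph encoding and node names are simplifications, not the FX internals):

```python
# Before merging partitions A and B, verify that no path leaves their
# union and re-enters it: such a path would make the fused node
# cyclically dependent on itself.

def merge_creates_cycle(graph, a, b):
    union = a | b
    # Start from every successor that lies outside the union...
    stack = [dst for src in union for dst in graph.get(src, ())
             if dst not in union]
    seen = set()
    while stack:
        node = stack.pop()
        if node in union:
            return True  # ...and fail if we can get back into it
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False

# a -> x -> b: merging {a} and {b} would trap x in a partition cycle
cyclic = merge_creates_cycle({"a": ["x"], "x": ["b"]}, {"a"}, {"b"})
# a -> b directly: merging is fine
safe = merge_creates_cycle({"a": ["b"]}, {"a"}, {"b"})
```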

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86511
Approved by: https://github.com/SherlockNoMad
2022-10-10 23:48:52 +00:00
Sherlock Huang
2fec853c87 Fix SubgraphMatcher for case of no anchor found (#86421)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86421
Approved by: https://github.com/jerryzh168
2022-10-07 02:05:42 +00:00
Sherlock Huang
a8add2b92f Support matching Args for SubgraphMatcher (#85456)
Subgraph matcher now handles the matching of non-Node arguments.

Here are the 4 cases
- pn is a Node, gn is a Node: this goes through the regular _match_node() function
- pn is a Node, gn is not a Node: this is a match only if pn is a placeholder op
- pn is not a Node, gn is a Node: this is a no-match case
- pn is not a Node, gn is not a Node: this goes through the argument comparison.

With this change
```
def target(x):
    return foo(x, 3)

def pattern(x, y):
    return foo(x, y)
```

is a match
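The four cases above can be sketched as a dispatch function (`Node` is a hypothetical stand-in for torch.fx.Node, and the real _match_node() recursion is stubbed with a sentinel string):

```python
class Node:
    """Hypothetical stand-in for torch.fx.Node (op and target only)."""
    def __init__(self, op, target=None):
        self.op = op
        self.target = target

def match_arg(pn, gn):
    # pn: pattern-side argument, gn: graph-side argument
    if isinstance(pn, Node) and isinstance(gn, Node):
        return "match_node"            # case 1: regular _match_node() path
    if isinstance(pn, Node):
        return pn.op == "placeholder"  # case 2: only a placeholder matches
    if isinstance(gn, Node):
        return False                   # case 3: literal vs node, no match
    return pn == gn                    # case 4: compare argument values

# In pattern foo(x, y) matching target foo(x, 3), y is a placeholder,
# so the literal 3 is accepted (case 2).
case1 = match_arg(Node("call_function"), Node("call_function"))
case2 = match_arg(Node("placeholder"), 3)
case3 = match_arg(3, Node("call_function"))
case4 = match_arg(3, 3)
```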

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85456
Approved by: https://github.com/jerryzh168
2022-09-24 20:06:48 +00:00
Sherlock Huang
34296e2f4c SubgraphMatcher remove invalid matches (#85444)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85444
Approved by: https://github.com/rkindi
2022-09-22 02:59:11 +00:00
Shirong Wu
fc470cf980 Back out "Support regex-style matching for Any and Oneof (#82853)" (#83922)
Reviewed By: hl475

Differential Revision: D38945806

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83922
Approved by: https://github.com/hl475
2022-08-24 00:17:46 +00:00
Sherlock Huang
39e6238788 Support regex-style matching for Any and Oneof (#82853)
pseudo.any is a wildcard node that can be matched with any fx node with an arbitrary number of inputs and outputs.
For example, to match relu followed by one fx node:
```
    def pattern(a):
        y = a.relu()
        z = torch.ops.pseudo.any(y)
        return z
```

pseudo.oneof is a special node that can be matched with an fx node whose target is in the permissible list.
`targets` must be a list of qualified names for operators, e.g. ["operator.add", "torch.sigmoid",
"torch.ops.aten.foo", "torch.ops.prims.bar"]

For example, using following pattern with pseudo.oneof
```
    def pattern(a):
        y = a.relu()
        z = torch.ops.pseudo.oneof(y, targets=["relu", "torch.sigmoid", "operator.add"])
        return z
```

It will have 3 matches in the following function
```
    def forward(y):
        z = y.relu()
        x = z.relu()    # first match

        x = x.relu()
        x = torch.sigmoid(x)    # second match

        x = x.relu()
        return x + 1    # third match
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82853
Approved by: https://github.com/ezyang
2022-08-12 18:43:13 +00:00
Sherlock Huang
2ca721cda5 An improved version of subgraph matcher (#82090)
This new version of subgraph matcher further supports
- optionally match with pattern's placeholder and output nodes
- patterns with multiple outputs
- filtering out non-containing matches
- filtering out overlapping matches

TODOs:
- [x] Update replace_pattern() to use this matcher
- [x] Fix cases with identical anchor
- [x] Introduce wildcard matching, such as Any, OneOf
- [ ] Improve node comparer to match args and kwargs values
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82090
Approved by: https://github.com/ezyang
2022-08-12 03:32:09 +00:00
Edward Z. Yang
5b88a2078b Follow GitHub relabeling of oncall: fx for test owners (#81821)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81821
Approved by: https://github.com/janeyx99
2022-07-21 01:50:06 +00:00
Sherlock Huang
752c06e0e1 FX graph partitioner and fuser (#79439)
This PR introduces two components.

CapabilityBasedPartitioner for FX graphs: given a list of supported operators, this partitioner tries to form the largest subgraphs that contain only the supported ops.

Fuser utility: given a list of nodes in FX graph, it lifts them as a sub-GraphModule in the original graph.
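The grouping idea can be sketched without FX as a greedy pass over a topologically ordered op list (the real partitioner also respects data dependencies and acyclicity; names below are illustrative):

```python
def propose_partitions(ops, supported):
    """Group maximal runs of supported ops into partitions (sketch)."""
    partitions, current = [], []
    for op in ops:
        if op in supported:
            current.append(op)
        elif current:
            partitions.append(current)  # unsupported op ends the run
            current = []
    if current:
        partitions.append(current)
    return partitions

# "custom_op" is unsupported, so it splits the graph into two partitions:
ops = ["add", "mul", "custom_op", "relu", "sigmoid"]
parts = propose_partitions(ops, {"add", "mul", "relu", "sigmoid"})
```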

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79439
Approved by: https://github.com/jjsjann123, https://github.com/davidberard98
2022-06-24 18:49:37 +00:00