XiaobingSuper
4ca2fc485c
inductor(CPU): add Conv+binary+unary fusion filter ( #90259 )
...
For Conv+binary+unary fusion, we currently support only conv+add+relu; this PR adds such a check to fix failing TIMM models.
TODO: enable more Conv+binary+unary fusions to improve TIMM models' performance.
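For reference, a minimal sketch of the one pattern the filter currently accepts (conv -> add -> relu); the module and tensor shapes here are illustrative, not taken from the PR:
```
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvAddRelu(nn.Module):
    """conv -> add -> relu: the only binary+unary pattern the filter keeps."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x, other):
        # conv output added to a side input, then relu: eligible for fusion
        return F.relu(self.conv(x) + other)

m = ConvAddRelu().eval()
x, other = torch.randn(1, 3, 16, 16), torch.randn(1, 8, 16, 16)
with torch.no_grad():
    out = m(x, other)
```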
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90259
Approved by: https://github.com/EikanWang , https://github.com/jgong5 , https://github.com/jansel
2022-12-12 06:04:55 +00:00
Michael Voznesensky
41c3b41b92
Use dynamo fake tensor mode in aot_autograd, move aot_autograd compilation to lowering time [Merger of 89672 and 89773] ( #90039 )
...
After all of the preparatory commits, this is a subset of the
changes in https://github.com/pytorch/pytorch/pull/89392 that actually
switch us over to propagating fake tensors to backends.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This is the merger of Ed's PR #89672, which is a rewrite of an older PR of mine (#89392), with CI fixes on top of it (#89773)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90039
Approved by: https://github.com/ezyang
2022-12-05 01:56:50 +00:00
PyTorch MergeBot
4648baa911
Revert "Use dynamo fake tensor mode in aot_autograd, move aot_autograd compilation to lowering time [Merger of 89672 and 89773] ( #90039 )"
...
This reverts commit ef0c7ec958.
Reverted https://github.com/pytorch/pytorch/pull/90039 on behalf of https://github.com/clee2000 due to broken xla tests ef0c7ec958 https://github.com/pytorch/pytorch/actions/runs/3606308473/jobs/6077646142
2022-12-04 21:57:30 +00:00
Michael Voznesensky
ef0c7ec958
Use dynamo fake tensor mode in aot_autograd, move aot_autograd compilation to lowering time [Merger of 89672 and 89773] ( #90039 )
...
After all of the preparatory commits, this is a subset of the
changes in https://github.com/pytorch/pytorch/pull/89392 that actually
switch us over to propagating fake tensors to backends.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
This is the merger of Ed's PR #89672, which is a rewrite of an older PR of mine (#89392), with CI fixes on top of it (#89773)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90039
Approved by: https://github.com/ezyang
2022-12-03 01:19:55 +00:00
XiaobingSuper
b058a02786
TorchDynamo: enable convolution bn folding for functional bn ( #89746 )
...
Motivation: Timm models often use a custom-defined BN that calls F.batch_norm: https://github.com/rwightman/pytorch-image-models/blob/main/timm/models/layers/norm_act.py#L26 , and the fx graph looks like:
```
opcode         name                     target                                   args                                                                                                       kwargs
-------------  -----------------------  ---------------------------------------  ---------------------------------------------------------------------------------------------------------  --------
placeholder    x                        x                                        ()                                                                                                         {}
call_module    self_conv                self_conv                                (x,)                                                                                                       {}
get_attr       self_bn_running_mean_1   self_bn_running_mean                     ()                                                                                                         {}
get_attr       self_bn_running_var      self_bn_running_var                      ()                                                                                                         {}
get_attr       self_bn_weight           self_bn_weight                           ()                                                                                                         {}
get_attr       self_bn_bias             self_bn_bias                             ()                                                                                                         {}
call_function  batch_norm               <function batch_norm at 0x7f07196cdf70>  (self_conv, self_bn_running_mean_1, self_bn_running_var, self_bn_weight, self_bn_bias, False, 0.1, 1e-05)  {}
call_module    self_bn_drop             self_bn_drop                             (batch_norm,)                                                                                              {}
```
The original conv+bn folding path doesn't work for **F.batch_norm**, but in the **F.batch_norm** case, if its parameters are constant (attributes of the module that will not be updated), we can still apply the constant-folding optimization. This PR enables it, which will improve the Timm models' performance.
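A minimal sketch of the kind of custom BN the PR targets, modeled loosely on timm's norm_act layer (the module name and defaults here are illustrative):
```
import torch
import torch.nn as nn
import torch.nn.functional as F

class CustomBatchNorm2d(nn.Module):
    """Calls F.batch_norm directly, so it bypasses nn.BatchNorm2d-based
    conv+bn folding; with frozen (constant) stats it can still be folded."""
    def __init__(self, num_features, eps=1e-5, momentum=0.1):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))
        self.eps, self.momentum = eps, momentum

    def forward(self, x):
        # training=False => running stats are constants, so the whole
        # batch_norm call can be folded into the preceding convolution
        return F.batch_norm(x, self.running_mean, self.running_var,
                            self.weight, self.bias, False,
                            self.momentum, self.eps)
```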
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89746
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-12-02 04:13:34 +00:00
XiaobingSuper
3b3ebcd031
TorchDynamo: weight prepack for single conv ( #89209 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89209
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-25 01:23:11 +00:00
XiaobingSuper
0c4f3db7bf
TorchDynamo: weight prepack for mkl linear ( #89109 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89109
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-25 01:20:19 +00:00
XiaobingSuper
07151a6bd6
TorchDynamo: weight prepack for onednn convolution external call ( #88988 )
...
This PR enables weight prepacking using the MKLDNN tensor:
1. enable fake tensor mode for MKLDNN tensor input.
2. make the convolution fusion kernel support MKLDNN tensor input.
3. do the weight prepack at the FX fusion step.
For better performance, we always use channels_last for the CPU convolution path: our tests show that channels_last gets better performance than the blocked-input path and also avoids the activation's layout conversions (plain to block, block to plain); currently only a plain-to-plain format conversion is needed.
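A hedged sketch of that layout choice (the model and shapes below are stand-ins, not from the PR):
```
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU()).eval()
x = torch.randn(1, 3, 32, 32)

# Keep weights and activations in channels_last so the CPU convolution path
# avoids plain<->blocked layout conversions around the fused kernel.
model = model.to(memory_format=torch.channels_last)
x = x.contiguous(memory_format=torch.channels_last)
with torch.no_grad():
    out = model(x)
```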
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88988
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-25 01:16:11 +00:00
Will Constable
b50699f247
Fix inductor fallback_random for dropout/rand_like ( #89515 )
...
- Avoid the fx graph rewrite that replaces certain ops with ones using Triton random
- Keep track of which replacement ops use Triton random, so it is possible to avoid disabling all replacements when using fallback_random (see the sketch below)
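A minimal usage sketch, assuming the flag lives at torch._inductor.config.fallback_random as in this era's code (the function below is illustrative):
```
import torch
import torch._dynamo
import torch._inductor.config as inductor_config

# With fallback_random=True, inductor keeps the eager (ATen) RNG instead of
# the Triton-based random replacements, so dropout/rand_like match eager.
inductor_config.fallback_random = True

def f(x):
    return torch.rand_like(x) + torch.nn.functional.dropout(x, p=0.5)

compiled = torch._dynamo.optimize("inductor")(f)
```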
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89515
Approved by: https://github.com/ngimel
2022-11-22 23:53:47 +00:00
XiaobingSuper
31708a7310
TorchDynamo: enable conv+silu fusion ( #89278 )
...
This PR improves tf_efficientnet_b0's performance by fusing conv+silu.
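The fused pattern, sketched on a hypothetical block (conv followed by SiLU; shapes are illustrative):
```
import torch
import torch.nn as nn

# Conv followed by SiLU: the unary pattern this PR teaches the fuser to match.
block = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.SiLU()).eval()
with torch.no_grad():
    y = block(torch.randn(1, 3, 224, 224))
```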
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89278
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-21 09:35:53 +00:00
XiaobingSuper
79770d3636
TorchDynamo: enable conv+relu6 fusion ( #89265 )
...
This PR enables conv+relu6 fusion, which improves MobileNet's performance.
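A minimal stand-in for the MobileNet-style block this targets:
```
import torch
import torch.nn as nn

# Conv followed by ReLU6 (as in MobileNet blocks): now eligible for fusion.
block = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU6()).eval()
with torch.no_grad():
    y = block(torch.randn(1, 3, 64, 64))
```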
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89265
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-21 08:01:07 +00:00
Jiawen Liu
5270122773
[Inductor] Build FX Linear + Permute Vertical Fusion in Inductor ( #89118 )
...
Summary:
Build fx-based linear/matmul/bmm + permute/transpose vertical fusion in Inductor
For an internal Ads model: **1.15x -> 1.36x speedup**
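A hypothetical module showing the vertical pattern the FX pass targets (a linear followed by a permute on its output; shapes are illustrative):
```
import torch
import torch.nn as nn

class LinearPermute(nn.Module):
    """linear -> permute: the vertical pattern the fusion pass matches."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(64, 64)

    def forward(self, x):  # x: (batch, seq, 64)
        return self.proj(x).permute(0, 2, 1)

m = LinearPermute().eval()
out = m(torch.randn(2, 16, 64))
```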
Test Plan: CI
Reviewed By: bertmaher, jansel, jianyuh
Differential Revision: D41071665
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89118
Approved by: https://github.com/jianyuh
2022-11-16 10:37:30 +00:00
PyTorch MergeBot
9f0b2c73f3
Revert "[Inductor] Build FX Linear + Permute Vertical Fusion in Inductor ( #88859 )"
...
This reverts commit d60abe4b95.
Reverted https://github.com/pytorch/pytorch/pull/88859 on behalf of https://github.com/kit1980 due to broken Mac OS tests, which were clearly shown in CI
2022-11-16 01:13:00 +00:00
Jiawen Liu
d60abe4b95
[Inductor] Build FX Linear + Permute Vertical Fusion in Inductor ( #88859 )
...
Summary:
Build fx-based linear/matmul/bmm + permute/transpose vertical fusion in Inductor
For an internal Ads model: **1.15x -> 1.36x speedup**
Differential Revision: D41071665
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88859
Approved by: https://github.com/jianyuh , https://github.com/jansel
2022-11-15 19:34:38 +00:00
XiaobingSuper
9943d46aab
TorchDynamo: skip convolution fusion when convolution's padding is string ( #88794 )
...
Currently, convolution fusion doesn't support the case where padding is a string; we will support it in a next step.
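For example, a convolution constructed with string padding, which the fusion pass now skips rather than mishandles (the module below is illustrative):
```
import torch
import torch.nn as nn

# padding="same" stores padding as a string; such convolutions are excluded
# from fusion until string padding is supported.
conv = nn.Conv2d(3, 8, kernel_size=3, padding="same")
relu = nn.ReLU()
y = relu(conv(torch.randn(1, 3, 16, 16)))
```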
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88794
Approved by: https://github.com/jansel , https://github.com/jgong5
2022-11-14 12:39:47 +00:00
XiaobingSuper
072920c281
TorchDynamo: Add convolution binary+unary fusion for cpu in inference mode ( #88412 )
...
This PR enables **conv+binary+relu** fusion, which will improve vision models' performance.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88412
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-14 10:35:41 +00:00
XiaobingSuper
4ad7b17fab
TorchDynamo: Add convolution binary(inplace) fusion for cpu in inference mode ( #88403 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88403
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-14 08:42:40 +00:00
Yanbo Liang
cc04cf50bf
[Inductor] Fix lowmem_dropout() missing 1 required positional argument: 'p' ( #88716 )
...
Fixes error from 7k github models: https://github.com/jansel/pytorch-jit-paritybench/blob/master/generated/test_GuYuc_WS_DAN_PyTorch.py
Error:
```
TypeError: lowmem_dropout() missing 1 required positional argument: 'p'
While executing %lowmem_dropout : [#users=1] = call_function[target=torch._inductor.overrides.lowmem_dropout](args = (%avg_pool2d_9,), kwargs = {training: False})
```
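A minimal reproduction sketch, assuming the failure is triggered by calling dropout without an explicit p, so the override's positional p goes missing (the function below is illustrative):
```
import torch
import torch.nn.functional as F

def f(x):
    # p is left at its default, so after the overrides rewrite, lowmem_dropout
    # used to be called without its required positional argument `p`.
    return F.dropout(x, training=False)

y = f(torch.randn(4, 4))
```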
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88716
Approved by: https://github.com/ngimel , https://github.com/jansel , https://github.com/desertfire
2022-11-10 23:37:29 +00:00
PyTorch MergeBot
29550e2c1d
Revert "[Inductor] Build FX Linear + Permute Vertical Fusion in Inductor ( #88566 )"
...
This reverts commit 48b58930cb.
Reverted https://github.com/pytorch/pytorch/pull/88566 on behalf of https://github.com/huydhn due to This change breaks trunk 48b58930cb
2022-11-10 20:56:30 +00:00
Jiawen Liu
48b58930cb
[Inductor] Build FX Linear + Permute Vertical Fusion in Inductor ( #88566 )
...
Summary:
Build fx-based linear/matmul/bmm + permute/transpose vertical fusion in Inductor
For an internal Ads model: 1.15x -> 1.36x speedup
Differential Revision: D41071665
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88566
Approved by: https://github.com/jansel , https://github.com/jianyuh
2022-11-10 18:32:25 +00:00
XiaobingSuper
3e43ff2794
torchdynamo: add convolution add(relu) inplace fusion kernel ( #88048 )
...
This PR adds a convolution add(relu) in-place fusion kernel that handles the **other.add_(conv)** pattern.
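A sketch of the targeted pattern, where the side tensor is updated in place with the convolution output (shapes are hypothetical):
```
import torch
import torch.nn as nn

conv = nn.Conv2d(8, 8, kernel_size=3, padding=1).eval()
x = torch.randn(1, 8, 16, 16)
other = torch.randn(1, 8, 16, 16)

# other.add_(conv(x)) followed by an in-place relu: the pattern the new
# in-place fusion kernel covers.
with torch.no_grad():
    other.add_(conv(x))
    out = other.relu_()
```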
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88048
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-10 13:54:37 +00:00
XiaobingSuper
b3206268ac
TorchDynamo: enable convolution and batchnorm folding for inference path ( #87435 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87435
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-04 05:24:57 +00:00
XiaobingSuper
71f793d312
TorchDynamo: Add linear binary fusion for cpu in BF16 inference mode ( #87066 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87066
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-04 02:40:29 +00:00
XiaobingSuper
e4efea4f14
TorchDynamo: Add linear unary fusion for cpu in BF16 inference mode ( #87065 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87065
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-04 01:26:08 +00:00
XiaobingSuper
52173188ef
TorchDynamo: Add convolution binary fusion for cpu in inference mode ( #87064 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87064
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-11-04 01:10:05 +00:00
XiaobingSuper
c36db82e12
TorchDynamo: Add convolution unary fusion for cpu in inference mode ( #87063 )
...
cc @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87063
Approved by: https://github.com/jgong5 , https://github.com/jansel
2022-10-27 06:55:32 +00:00
Jason Ansel
c7c09722ad
Move TorchDynamo into PyTorch core ( #86461 )
...
Context:
https://github.com/pytorch/torchdynamo/issues/1588
This PR moves [TorchDynamo](https://github.com/pytorch/torchdynamo) and TorchInductor into PyTorch core.
- `torchdynamo` becomes `torch._dynamo`
- `torchinductor` becomes `torch._inductor`
This PR was generated by running `copy_to_core.sh` in https://github.com/pytorch/torchdynamo/pull/1538
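In practice the rename shows up in imports and entry points; a minimal before/after sketch, using the optimize entry point as it existed at the time:
```
# Before this PR (standalone packages):
#   import torchdynamo
#   fn = torchdynamo.optimize("inductor")(fn)

# After this PR (in core):
import torch
import torch._dynamo

@torch._dynamo.optimize("inductor")
def fn(x):
    return x * 2

out = fn(torch.randn(8))
```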
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86461
Approved by: https://github.com/voznesenskym
2022-10-13 23:18:06 +00:00