pytorch/test/cpp
Wang, Eikan 45be74cc63 Optimize `to` if the datatype of the source tensor is the same as the dest datatype (#85140)
AMP automatically inserts `_autocast_to_reduced_precision` and `_autocast_to_full_precision`. The aten implementation provides a fast path that bypasses the conversion if the tensor is already in the reduced/full precision data type. NNC, however, always performs the conversion, which could cause a >5% E2E performance regression.

This PR addresses the performance issue the same way aten does. We no longer pull `_autocast_to_reduced_precision` and `_autocast_to_full_precision` into the NNC fusion group when the tensor is already in the reduced/full precision data type; instead we fall back to aten to trigger its fast path.
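
The decision described above can be sketched as a small standalone model. This is only an illustration, not the code from the PR: `DType`, `isNoOpCast`, and `shouldPullIntoFusionGroup` are hypothetical names, the real change operates on TorchScript graph nodes and c10 scalar types, and treating `Float` as the full-precision type is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for a tensor data type; not c10::ScalarType.
enum class DType { Float, Half, BFloat16 };

// A cast is a no-op when the tensor already has the destination type.
// This models the fast path the commit message attributes to aten.
constexpr bool isNoOpCast(DType src, DType dst) {
  return src == dst;
}

// Hypothetical fusion policy: only pull the autocast op into the fusion
// group when a real conversion is needed. Otherwise leave the op outside
// the group so the fallback can skip the conversion entirely.
constexpr bool shouldPullIntoFusionGroup(DType src, DType dst) {
  return !isNoOpCast(src, dst);
}

// Small self-check exercising both branches of the policy.
inline void checkFusionPolicy() {
  // float32 -> bfloat16 needs a real conversion, so fusing is worthwhile.
  assert(shouldPullIntoFusionGroup(DType::Float, DType::BFloat16));
  // bfloat16 -> bfloat16 is a no-op, so the op stays out of the group.
  assert(!shouldPullIntoFusionGroup(DType::BFloat16, DType::BFloat16));
  // Assumed full-precision target: float32 -> float32 is also a no-op.
  assert(!shouldPullIntoFusionGroup(DType::Float, DType::Float));
}
```

Keeping the check as a pure predicate on the source and destination types means the fusion pass can decide before any kernel is generated, which is where the always-convert cost would otherwise be introduced.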

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85140
Approved by: https://github.com/frank-wei
2022-09-27 04:40:42 +00:00
..
api Handle implicit real->complex casting for backward of stack (#84993) 2022-09-19 21:20:34 +00:00
c10d [2/N] [Dispatchable Collectives] Extract ProcessGroup::Work into a separate class and update references (#83680) 2022-09-14 13:05:58 +00:00
common
dist_autograd [lint] autoformat test/cpp and torch/csrc 2022-06-11 21:11:16 +00:00
jit [perf][1/5] Replace IValue::toString()->string() with IValue::toStringRef() (#85437) 2022-09-23 23:36:57 +00:00
lazy empty strided symint (#84830) 2022-09-15 04:09:43 +00:00
lite_interpreter_runtime Back out "Back out "[profiling] Adding targets file for test_mobile_profiler"" (#82243) 2022-07-28 23:08:52 +00:00
monitor torch/monitor: merge Interval and FixedCount stats (#72009) 2022-01-30 23:21:59 +00:00
profiler Add SOFT_ASSERT to gracefully recover from invariant violations (#82689) 2022-08-10 00:58:07 +00:00
rpc [lint] autoformat test/cpp and torch/csrc 2022-06-11 21:11:16 +00:00
tensorexpr Optimize `to` if the datatype of the source tensor is the same as the dest datatype (#85140) 2022-09-27 04:40:42 +00:00
__init__.py