pytorch/torch/testing/_internal
Sun, Jiayi c173a9d9b3 add Half support for layer_norm on CPU (#99590)
### Testing
Single socket (icx, 32 cores):

| shape | fp32 forward (ms) | fp16 forward (ms) | mixed fp32 fp16 forward (ms) | fp32 backward (ms) | fp16 backward (ms) | mixed fp32 fp16 backward (ms) |
| -- | -- | -- | -- | -- | -- | -- |
| (1, 8, 16) | 0.012 | 0.011 | 0.011 | 0.051 | 0.051 | 0.050 |
| (8, 8, 16) | 0.013 | 0.013 | 0.013 | 0.054 | 0.053 | 0.051 |
| (32, 8, 16) | 0.015 | 0.014 | 0.014 | 0.059 | 0.054 | 0.052 |
| (64, 128, 56, 56) | 1.875 | 0.790 | 1.016 | 12.845 | 7.151 | 6.985 |
| (64, 128, 256, 256) | 50.226 | 25.462 | 35.736 | 328.957 | 179.615 | 175.618 |

Single core (icx):

| shape | fp32 forward (ms) | fp16 forward (ms) | mixed fp32 fp16 forward (ms) | fp32 backward (ms) | fp16 backward (ms) | mixed fp32 fp16 backward (ms) |
| -- | -- | -- | -- | -- | -- | -- |
| (1, 8, 16) | 0.012 | 0.011 | 0.011 | 0.040 | 0.041 | 0.041 |
| (8, 8, 16) | 0.012 | 0.012 | 0.012 | 0.042 | 0.042 | 0.042 |
| (32, 8, 16) | 0.027 | 0.014 | 0.014 | 0.048 | 0.048 | 0.046 |
| (64, 128, 56, 56) | 58.054 | 11.034 | 17.928 | 108.603 | 48.816 | 50.244 |
| (64, 128, 256, 256) | 1327.758 | 352.394 | 496.994 | 2846.182 | 1224.247 | 1218.422 |
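
To make the tables easier to read, the fp32-to-fp16 speedups for the largest shape can be derived directly from the timings. The following is a minimal sketch, not part of the PR: the helper name `speedup` and the dictionary layout are assumptions chosen for illustration, and the numbers are copied from the `(64, 128, 256, 256)` rows above.

```python
# Timings in milliseconds, copied from the (64, 128, 256, 256) rows of the tables above.
# The dictionary layout and the helper below are illustrative, not taken from the PR.
TIMINGS_MS = {
    "single socket": {
        "forward": {"fp32": 50.226, "fp16": 25.462, "mixed": 35.736},
        "backward": {"fp32": 328.957, "fp16": 179.615, "mixed": 175.618},
    },
    "single core": {
        "forward": {"fp32": 1327.758, "fp16": 352.394, "mixed": 496.994},
        "backward": {"fp32": 2846.182, "fp16": 1224.247, "mixed": 1218.422},
    },
}


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    for config, passes in TIMINGS_MS.items():
        for pass_name, t in passes.items():
            fp16_gain = speedup(t["fp32"], t["fp16"])
            mixed_gain = speedup(t["fp32"], t["mixed"])
            print(
                f"{config:13s} {pass_name:8s} "
                f"fp16: {fp16_gain:.2f}x  mixed: {mixed_gain:.2f}x"
            )
```

For the small shapes the three dtypes are within noise of each other, so the ratio is only informative for the two large shapes.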

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99590
Approved by: https://github.com/mingfeima, https://github.com/jgong5, https://github.com/cpuhrsch
2023-12-20 01:11:15 +00:00
..
codegen remove nvfuser test in upstream pytorch (#109918) 2023-09-24 13:49:37 +00:00
data
distributed disable test_ddp_profiling_autograd_profiler in distributed_test.py (#115704) 2023-12-14 01:41:37 +00:00
generated
opinfo Use parent class attribute supports_out for foreach_zero opinfo (#112778) 2023-11-22 18:00:44 +00:00
optests Stop using excess memory in generate_opcheck_tests, re-enable fbgemm TBE tests (#114641) 2023-11-29 02:21:13 +00:00
test_module
__init__.py
autocast_test_lists.py Add Half support for CPU autocast on eager mode (#112484) 2023-11-21 20:08:28 +00:00
autograd_function_db.py Setup_context does not contain default values of forward() (#108561) 2023-09-19 16:23:52 +00:00
check_kernel_launches.py [BE] Enable ruff's UP rules and autoformat testing/ (#105425) 2023-07-18 21:04:39 +00:00
common_cuda.py Revert "Initial Flash Attention support on ROCM (#114309)" (#115975) 2023-12-16 03:40:14 +00:00
common_device_type.py [CI][Inductor] Skip CPU tests when running on GPU (#115430) 2023-12-10 15:21:24 +00:00
common_dist_composable.py
common_distributed.py Switch env variable use in test harnesses to the non-deprecated names to fix warnings (#114880) 2023-12-01 20:08:23 +00:00
common_dtype.py expose mem-eff to autograd (#110495) 2023-11-13 17:47:40 +00:00
common_fsdp.py [FSDP] Passed TORCH_NCCL_DESYNC_DEBUG instead of NCCL_DESYNC_DEBUG (#114432) 2023-11-23 04:53:12 +00:00
common_jit.py
common_methods_invocations.py add Half support for layer_norm on CPU (#99590) 2023-12-20 01:11:15 +00:00
common_modules.py Migrated loss functions to ModuleInfos (#115584) 2023-12-14 16:21:05 +00:00
common_nn.py Migrated loss functions to ModuleInfos (#115584) 2023-12-14 16:21:05 +00:00
common_optimizers.py add markDynamoStrictTest to TestOptimRenewed, removing flakiness (#115947) 2023-12-16 01:33:32 +00:00
common_pruning.py [BE] Enable ruff's UP rules and autoformat testing/ (#105425) 2023-07-18 21:04:39 +00:00
common_quantization.py [quant][be] Add a test for per channel quant for groupwise conv (#115224) 2023-12-07 04:46:20 +00:00
common_quantized.py
common_subclass.py [BE] add parentheses to kwargs unpacking func(*args, **(kwargs or {})) (#115026) 2023-12-03 20:03:26 +00:00
common_utils.py Add Dynamo test expected failure mechanism (#115845) 2023-12-15 01:22:17 +00:00
composite_compliance.py [BE] add parentheses to kwargs unpacking func(*args, **(kwargs or {})) (#115026) 2023-12-03 20:03:26 +00:00
control_flow_opinfo_db.py
custom_op_db.py optests improvements based on torchvision usage on nms (#108929) 2023-09-13 13:26:15 +00:00
dist_utils.py [BE]: Apply RUF015 to torch folder (#113025) 2023-11-07 00:48:15 +00:00
dynamo_test_failures.py Revert "More markDynamoStrictTest (#115870)" 2023-12-19 15:40:57 +00:00
hypothesis_utils.py [BE]: Update ruff to 0.285 (#107519) 2023-08-22 23:16:38 +00:00
inductor_utils.py Revert "[inductor] Move things into torch/testing/_internal/inductor_utils.py (#113275)" 2023-11-15 01:44:26 +00:00
jit_metaprogramming_utils.py Add python and C++ support for LPPool3d (#114199) 2023-12-08 18:18:44 +00:00
jit_utils.py [BE]: Apply RUF015 to torch folder (#113025) 2023-11-07 00:48:15 +00:00
logging_tensor.py
logging_utils.py make sure log tests are run in non-verbose mode (#106496) 2023-08-03 02:45:35 +00:00
quantization_torch_package_models.py
triton_utils.py [Inductor] Deduplicate grid wrapper statements for user defined triton kernels (#115849) 2023-12-20 00:25:32 +00:00
two_tensor.py Expand dynamic dims support for traceable subclasses (#114311) 2023-12-05 21:09:25 +00:00