pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Sun, Jiayi c173a9d9b3 add Half support for layer_norm on CPU (#99590)	2023-12-20 01:11:15 +00:00

### Testing

Single socket (icx, 32 cores):

| shape | fp32 forward (ms) | fp16 forward (ms) | mixed fp32 fp16 forward (ms) | fp32 backward (ms) | fp16 backward (ms) | mixed fp32 fp16 backward (ms) |
| -- | -- | -- | -- | -- | -- | -- |
| (1, 8, 16) | 0.012 | 0.011 | 0.011 | 0.051 | 0.051 | 0.050 |
| (8, 8, 16) | 0.013 | 0.013 | 0.013 | 0.054 | 0.053 | 0.051 |
| (32, 8, 16) | 0.015 | 0.014 | 0.014 | 0.059 | 0.054 | 0.052 |
| (64, 128, 56, 56) | 1.875 | 0.790 | 1.016 | 12.845 | 7.151 | 6.985 |
| (64, 128, 256, 256) | 50.226 | 25.462 | 35.736 | 328.957 | 179.615 | 175.618 |

Single core (icx):

| shape | fp32 forward (ms) | fp16 forward (ms) | mixed fp32 fp16 forward (ms) | fp32 backward (ms) | fp16 backward (ms) | mixed fp32 fp16 backward (ms) |
| -- | -- | -- | -- | -- | -- | -- |
| (1, 8, 16) | 0.012 | 0.011 | 0.011 | 0.040 | 0.041 | 0.041 |
| (8, 8, 16) | 0.012 | 0.012 | 0.012 | 0.042 | 0.042 | 0.042 |
| (32, 8, 16) | 0.027 | 0.014 | 0.014 | 0.048 | 0.048 | 0.046 |
| (64, 128, 56, 56) | 58.054 | 11.034 | 17.928 | 108.603 | 48.816 | 50.244 |
| (64, 128, 256, 256) | 1327.758 | 352.394 | 496.994 | 2846.182 | 1224.247 | 1218.422 |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99590
Approved by: https://github.com/mingfeima, https://github.com/jgong5, https://github.com/cpuhrsch
codegen	remove nvfuser test in upstream pytorch (#109918)	2023-09-24 13:49:37 +00:00
data
distributed	disable test_ddp_profiling_autograd_profiler in distributed_test.py (#115704)	2023-12-14 01:41:37 +00:00
generated
opinfo	Use parent class attribute supports_out for foreach_zero opinfo (#112778)	2023-11-22 18:00:44 +00:00
optests	Stop using excess memory in generate_opcheck_tests, re-enable fbgemm TBE tests (#114641)	2023-11-29 02:21:13 +00:00
test_module
__init__.py
autocast_test_lists.py	Add Half support for CPU autocast on eager mode (#112484)	2023-11-21 20:08:28 +00:00
autograd_function_db.py	Setup_context does not contain default values of forward() (#108561)	2023-09-19 16:23:52 +00:00
check_kernel_launches.py	[BE] Enable ruff's UP rules and autoformat testing/ (#105425)	2023-07-18 21:04:39 +00:00
common_cuda.py	Revert "Initial Flash Attention support on ROCM (#114309)" (#115975)	2023-12-16 03:40:14 +00:00
common_device_type.py	[CI][Inductor] Skip CPU tests when running on GPU (#115430)	2023-12-10 15:21:24 +00:00
common_dist_composable.py
common_distributed.py	Switch env variable use in test harnesses to the non-deprecated names to fix warnings (#114880)	2023-12-01 20:08:23 +00:00
common_dtype.py	expose mem-eff to autograd (#110495)	2023-11-13 17:47:40 +00:00
common_fsdp.py	[FSDP] Passed `TORCH_NCCL_DESYNC_DEBUG` instead of `NCCL_DESYNC_DEBUG` (#114432)	2023-11-23 04:53:12 +00:00
common_jit.py
common_methods_invocations.py	add Half support for layer_norm on CPU (#99590)	2023-12-20 01:11:15 +00:00
common_modules.py	Migrated loss functions to ModuleInfos (#115584)	2023-12-14 16:21:05 +00:00
common_nn.py	Migrated loss functions to ModuleInfos (#115584)	2023-12-14 16:21:05 +00:00
common_optimizers.py	add markDynamoStrictTest to TestOptimRenewed, removing flakiness (#115947)	2023-12-16 01:33:32 +00:00
common_pruning.py	[BE] Enable ruff's UP rules and autoformat testing/ (#105425)	2023-07-18 21:04:39 +00:00
common_quantization.py	[quant][be] Add a test for per channel quant for groupwise conv (#115224)	2023-12-07 04:46:20 +00:00
common_quantized.py
common_subclass.py	[BE] add parentheses to kwargs unpacking `func(*args, **(kwargs or {}))` (#115026)	2023-12-03 20:03:26 +00:00
common_utils.py	Add Dynamo test expected failure mechanism (#115845)	2023-12-15 01:22:17 +00:00
composite_compliance.py	[BE] add parentheses to kwargs unpacking `func(*args, **(kwargs or {}))` (#115026)	2023-12-03 20:03:26 +00:00
control_flow_opinfo_db.py
custom_op_db.py	optests improvements based on torchvision usage on nms (#108929)	2023-09-13 13:26:15 +00:00
dist_utils.py	[BE]: Apply RUF015 to torch folder (#113025)	2023-11-07 00:48:15 +00:00
dynamo_test_failures.py	Revert "More markDynamoStrictTest (#115870)"	2023-12-19 15:40:57 +00:00
hypothesis_utils.py	[BE]: Update ruff to 0.285 (#107519)	2023-08-22 23:16:38 +00:00
inductor_utils.py	Revert "[inductor] Move things into torch/testing/_internal/inductor_utils.py (#113275)"	2023-11-15 01:44:26 +00:00
jit_metaprogramming_utils.py	Add python and C++ support for LPPool3d (#114199)	2023-12-08 18:18:44 +00:00
jit_utils.py	[BE]: Apply RUF015 to torch folder (#113025)	2023-11-07 00:48:15 +00:00
logging_tensor.py
logging_utils.py	make sure log tests are run in non-verbose mode (#106496)	2023-08-03 02:45:35 +00:00
quantization_torch_package_models.py
triton_utils.py	[Inductor] Deduplicate grid wrapper statements for user defined triton kernels (#115849)	2023-12-20 00:25:32 +00:00
two_tensor.py	Expand dynamic dims support for traceable subclasses (#114311)	2023-12-05 21:09:25 +00:00