Summary: https://github.com/pytorch/pytorch/issues/67578 disabled reduced-precision reductions for FP16 GEMMs. After benchmarking, we've found that this has substantial performance impacts for common GEMM shapes (e.g., those found in popular instantiations of multi-headed attention) on architectures such as Volta. As these performance regressions may come as a surprise to current users, this PR adds a toggle, `torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction`, to disable reduced-precision reductions rather than making that the default behavior.

CC ngimel ptrblck stas00

Note that the behavior after the previous PR can be replicated with `torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67946

Reviewed By: zou3519

Differential Revision: D32289896

Pulled By: ngimel

fbshipit-source-id: a1ea2918b77e27a7d9b391e030417802a0174abe
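A minimal sketch of the toggle described above, assuming a CUDA build of PyTorch that includes this PR:

```python
import torch

# Setting the flag to False disables reduced-precision (FP16) accumulation
# in FP16 GEMM reductions, replicating the behavior after #67578
# (full FP32 accumulation: more accurate, but slower on e.g. Volta).
torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False

# Leaving it at its default (True) keeps the faster reduced-precision
# reductions that this PR preserves as the default behavior.
torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = True
```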
| File |
|---|
| amp_examples.rst |
| autograd.rst |
| broadcasting.rst |
| cpu_threading_runtimes.svg |
| cpu_threading_torchscript_inference.rst |
| cpu_threading_torchscript_inference.svg |
| cuda.rst |
| ddp.rst |
| extending.rst |
| faq.rst |
| gradcheck.rst |
| hip.rst |
| large_scale_deployments.rst |
| modules.rst |
| multiprocessing.rst |
| numerical_accuracy.rst |
| randomness.rst |
| serialization.rst |
| windows.rst |