From 1451d84766ea26d6e789e11fadf2bc565624d4a0 Mon Sep 17 00:00:00 2001
From: pbialecki
Date: Tue, 22 Dec 2020 13:44:41 -0800
Subject: [PATCH] Minor doc fix: change truncating to rounding in TF32 docs
 (#49625)

Summary:
Minor doc fix clarifying that the input data is rounded, not truncated.

CC zasdfgbnm ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/49625

Reviewed By: mruberry

Differential Revision: D25668244

Pulled By: ngimel

fbshipit-source-id: ac97e41e0ca296276544f9e9f85b2cf1790d9985
---
 docs/source/notes/cuda.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/notes/cuda.rst b/docs/source/notes/cuda.rst
index 6deea675f26..34ee143a77d 100644
--- a/docs/source/notes/cuda.rst
+++ b/docs/source/notes/cuda.rst
@@ -65,7 +65,7 @@
 available on new NVIDIA GPUs since Ampere, internally to compute matmul (matrix
 and batched matrix multiplies) and convolutions.
 TF32 tensor cores are designed to achieve better performance on matmul and convolutions on
-`torch.float32` tensors by truncating input data to have 10 bits of mantissa, and accumulating
+`torch.float32` tensors by rounding input data to have 10 bits of mantissa, and accumulating
 results with FP32 precision, maintaining FP32 dynamic range.
 matmuls and convolutions are controlled separately, and their corresponding
 flags can be accessed at:
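For context on the terminology the patch corrects: TF32 keeps the FP32 exponent (preserving dynamic range) but reduces the mantissa from 23 bits to 10, and the conversion rounds rather than truncates. The distinction can be sketched in plain Python; the helper name `tf32_round` is hypothetical, the rounding mode is assumed to be round-to-nearest-even, and the real conversion happens in hardware inside the tensor cores:

```python
import struct

def tf32_round(x: float) -> float:
    """Illustrative sketch: round a float32 value to 10 mantissa bits,
    as TF32 does, using round-to-nearest-even on the dropped bits."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    drop = 23 - 10          # FP32 has 23 mantissa bits; TF32 keeps 10.
    half = 1 << (drop - 1)  # Weight of the first discarded bit.
    # Round-to-nearest-even: add (half - 1) plus the kept LSB, then
    # clear the dropped bits. A carry correctly propagates into the
    # exponent, so the dynamic range of FP32 is unchanged.
    bits = (bits + half - 1 + ((bits >> drop) & 1)) & ~((1 << drop) - 1)
    bits &= 0xFFFFFFFF
    return struct.unpack('<f', struct.pack('<I', bits))[0]
```

Truncation would always drop the low 13 mantissa bits toward zero; rounding instead picks the nearest 10-bit-mantissa value, e.g. `1.0 + 2**-12` rounds down to `1.0` while values past the halfway point round up, which is the behavior the corrected sentence describes.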