haozhe.zhu 4c11b26158 refine fp32 precision api (#125888)
Based on the [conversation](https://github.com/pytorch/pytorch/issues/121791), we plan to drop "highest, high, medium" as the way to represent fp32 internal computation data types. Instead, we will use the algorithm name directly.

### Design Choice: Directly use algorithm names like "TF32", "BF16".
#### Pros
 - The names are more informative: "tf32" tells the user more than a generic "high".
 - Easier to extend to new algorithms such as `tf32x3`.
#### Cons
 - "HIGHEST, HIGH, MEDIUM" indicated the relative precision between different algorithms. However, we can have more documents to discuss them.

### We provide a layered structure for backends/operators.
('f32' is short for 'fp32_precision')
![image](https://github.com/user-attachments/assets/f89143e5-d6a1-4865-9351-9a50439f5067)

### We provide 3 fp32 compute precisions that can be set, plus "none" (see the sketch after this list):
 - **"ieee"**: Do not allow the use of any other internal computation data type.
 - **"tf32"**: Allow tf32 to be used as the internal computation data type.
 - **"bf16"**: Allow bf16 to be used as the internal computation data type.
 - **"none"**: Precision is not set; the setting can be overridden by the parent node.

### Overriding Precision Settings
A child node is overridden by its parent node if the child is left at the default ("none").
The current default settings are (see the sketch after this tree for how they map to the Python API):
```
backend = generic, op = all, precision setting = none
    backend = cuda, op = all, precision setting = none
        backend = cuda, op = conv, precision setting = tf32
        backend = cuda, op = rnn, precision setting = tf32
        backend = cuda, op = matmul, precision setting = none
    backend = mkldnn, op = all, precision setting = none
        backend = mkldnn, op = conv, precision setting = none
        backend = mkldnn, op = rnn, precision setting = none
        backend = mkldnn, op = matmul, precision setting = none
```
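
A rough illustration of how this tree maps onto the Python API (the mapping of the "cuda" entries to the `torch.backends.cudnn.*` attributes is assumed from the examples elsewhere in this description):

```python
import torch

# Generic and backend-level settings default to "none" (unset).
print(torch.backends.fp32_precision)             # expected: "none"
print(torch.backends.mkldnn.fp32_precision)      # expected: "none"

# Per-op cuDNN settings default to "tf32" for conv and rnn.
print(torch.backends.cudnn.conv.fp32_precision)  # expected: "tf32"
print(torch.backends.cudnn.rnn.fp32_precision)   # expected: "tf32"
```
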
 - If the user sets `torch.backends.mkldnn.fp32_precision="bf16"`, its child nodes `torch.backends.mkldnn.matmul.fp32_precision` / `torch.backends.mkldnn.conv.fp32_precision` / `torch.backends.mkldnn.rnn.fp32_precision` will also be overridden to "bf16" (see the sketch below).
 - If the user sets `torch.backends.fp32_precision="bf16"`, `torch.backends.mkldnn.fp32_precision` and its child nodes will also be overridden to "bf16".
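
A minimal sketch of this override behavior, using the mkldnn attributes named above (the values printed for unset children follow the rule described in this section):

```python
import torch

# An explicitly set child keeps its own value and is not overridden.
torch.backends.mkldnn.rnn.fp32_precision = "ieee"

# Setting the parent overrides every child that is still at "none".
torch.backends.mkldnn.fp32_precision = "bf16"

print(torch.backends.mkldnn.matmul.fp32_precision)  # expected: "bf16"
print(torch.backends.mkldnn.conv.fp32_precision)    # expected: "bf16"
print(torch.backends.mkldnn.rnn.fp32_precision)     # expected: "ieee"
```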

### Backward Compatibility
Since the new API allows more fine-grained control, there can be conflicts with the old API. For example, the previous `torch.backends.cudnn.allow_tf32` flag cannot represent a state where `torch.backends.cudnn.rnn.fp32_precision="ieee"` and `torch.backends.cudnn.conv.fp32_precision="tf32"`. Therefore, our goals for backward compatibility are:
 - If the user only uses the previous APIs, they work as before.
 - If the user uses the **new** API to move into a state that is **not representable** by the old API and then reads that state through the **old** API, we raise a RuntimeError and point the user to the documentation (see the sketch below).
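
A hedged sketch of the second case, using the attribute names above; the exact error message is not specified here:

```python
import torch

# Put cuDNN into a state the old boolean flag cannot describe:
# conv uses tf32, but rnn is strict ieee.
torch.backends.cudnn.conv.fp32_precision = "tf32"
torch.backends.cudnn.rnn.fp32_precision = "ieee"

# Reading the legacy flag cannot return a single truthful bool,
# so it raises a RuntimeError pointing to the documentation.
try:
    print(torch.backends.cudnn.allow_tf32)
except RuntimeError as e:
    print("legacy API cannot represent this state:", e)
```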

### Test Plan
```
python test/test_cuda.py -k test_fp32_precision_with_tf32
python test/test_cuda.py -k test_fp32_precision_with_float32_matmul_precision
python test/test_cuda.py -k test_invalid_status_for_legacy_api
python test/test_mkldnn.py -k test_mlkdnn_get_set
python test/test_mkldnn.py -k test_generic_precision
python test/test_mkldnn.py -k test_invalid
python test/test_mkldnn.py -k test_default_use_parent
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125888
Approved by: https://github.com/jgong5, https://github.com/albanD

Co-authored-by: Jiang, Yanbing <yanbing.jiang@intel.com>