[PT2] deprecate force_same_precision, guarded by JK (#156789)

Summary:
cuBLAS used to have strict alignment requirements for TF32 usage, even when TF32 was enabled by the user. This caused a numeric SEV in the past: Triton would use TF32 even when cuBLAS could not, because the inputs failed cuBLAS's alignment checks.
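
For context, the old alignment rule (also described in the code comment in the diff below) required m, n, k to be multiples of 16, 16, 8 for cuBLAS to use TF32. A minimal sketch of that check, using a hypothetical helper name rather than the actual Inductor code:

```python
# Illustrative sketch only; `cublas_tf32_alignment_ok` is a hypothetical helper.
# It mirrors the old cuBLAS TF32 alignment rule: TF32 was only used when
# m, n, k were multiples of 16, 16, 8.
def cublas_tf32_alignment_ok(m: int, n: int, k: int) -> bool:
    return m % 16 == 0 and n % 16 == 0 and k % 8 == 0

# Under the old behavior, force_same_precision=True meant Triton would avoid
# TF32 whenever this check failed, so its precision matched cuBLAS's.
```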

Based on testing in D77265581, we believe that cuBLAS no longer has alignment requirements for TF32 usage, so we'd like to deprecate `force_same_precision` since it no longer functions as intended.

This change flips the default to False in fbcode, guarded by a JustKnob (JK) so that we can quickly revert to the original behavior if needed.
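
For illustration, the intended resolution order is roughly: an explicit environment variable wins, otherwise the JK supplies the fbcode value, otherwise the default of False. The sketch below assumes that ordering and stubs out the JK lookup, which in reality lives behind internal infrastructure:

```python
import os

def justknob_enabled(name: str, default: bool = False) -> bool:
    # Hypothetical stand-in for the internal JustKnobs query; the real knob
    # here is "pytorch/compiler:force_same_precision".
    return default

def resolve_force_same_precision() -> bool:
    # env_name_force: an explicit environment variable overrides everything.
    env = os.environ.get("TORCHINDUCTOR_FORCE_SAME_PRECISION")
    if env is not None:
        return env == "1"
    # Otherwise consult the JK, which now defaults the flag to False.
    return justknob_enabled("pytorch/compiler:force_same_precision", default=False)
```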

Test Plan:
CI

Rollback Plan:

Differential Revision: D77265930

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156789
Approved by: https://github.com/jhadidjojo, https://github.com/masnesral
Author: Nicolas Macchioni, 2025-06-27 00:43:06 +00:00, committed by PyTorch MergeBot
Parent: 6215e90b7b
Commit: 3bdd5ae334


@@ -428,8 +428,11 @@ graph_partition = False
 # when m, n, k are multiples of 16, 16, 8, whereas triton supports TF32 for matmul operations
 # for any combinations of m, n, k, regardless of their alignment. setting this flag will ensure
 # that triton does not use TF32 wherever cublas would not use TF32
-force_same_precision = (
-    True if is_fbcode() else os.environ.get("TORCHINDUCTOR_FORCE_SAME_PRECISION") == "1"
-)
+# DEPRECATED. cuBLAS no longer has the above alignment requirements. will remove in the future.
+force_same_precision: bool = Config(
+    justknob="pytorch/compiler:force_same_precision",
+    env_name_force="TORCHINDUCTOR_FORCE_SAME_PRECISION",
+    default=False,
+)
 # Specify candidate backends for gemm autotune.
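
During the deprecation window, the old behavior can still be restored explicitly. A minimal usage sketch; setting the config attribute directly is the most portable route, and the env var path assumes it is read before the Inductor config module is imported:

```python
import os

# Option 1 (assumed to require being set before torch is imported):
os.environ["TORCHINDUCTOR_FORCE_SAME_PRECISION"] = "1"

# Option 2: set the Inductor config flag directly.
import torch._inductor.config as inductor_config
inductor_config.force_same_precision = True
```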