pytorch/test/nn
Benjamin Glass 5aa5a5763e [inductor triton] Disable incorrect TF32 usage on CUDA capability < 8 (#145684)
Triton 2.2 and greater have a bug where allowing TF32 generation for a GPU that does not support TF32 will cause code generation errors. Patch around this problem by:

1. Adding a function to `torch.cuda` that determines whether CUDA hardware is capable of using the TF32 format.
2. Using that function to explicitly disable TF32 generation when calling Triton, where needed.
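
A minimal sketch of the two steps above, for illustration only and not the code from this PR. The helper names `capability_supports_tf32` and `allow_tf32_for_codegen` are hypothetical. The sketch assumes that TF32 support corresponds to CUDA compute capability 8.0 (Ampere) or newer, which matches the "capability < 8" condition in the title.

```python
def capability_supports_tf32(capability: tuple[int, int]) -> bool:
    """Return True if a CUDA compute capability (major, minor) can use TF32.

    Assumption: TF32 is available from compute capability 8.0 onward.
    """
    major, _minor = capability
    return major >= 8


def allow_tf32_for_codegen(user_allows_tf32: bool, capability: tuple[int, int]) -> bool:
    """Hypothetical gate: only request TF32 when the user allows it AND the
    hardware supports it, so TF32 is never requested on an unsupported GPU."""
    return user_allows_tf32 and capability_supports_tf32(capability)


# Example: a compute capability 7.5 device never gets TF32, even if requested.
print(allow_tf32_for_codegen(True, (7, 5)))
print(allow_tf32_for_codegen(True, (8, 0)))
```

With PyTorch installed, the capability tuple can be obtained from `torch.cuda.get_device_capability()` and the user setting from `torch.backends.cuda.matmul.allow_tf32`.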

To see the failure this fix addresses, run `test/inductor/test_max_autotune.py` without this fix on a GPU with CUDA compute capability < 8 (e.g. a pre-Ampere NVIDIA GPU).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145684
Approved by: https://github.com/eqy
2025-01-28 22:01:08 +00:00
test_convolution.py [inductor triton] Disable incorrect TF32 usage on CUDA capability < 8 (#145684) 2025-01-28 22:01:08 +00:00
test_dropout.py Enable additional tests for MPS CI runs (#134356) 2024-10-04 21:52:38 +00:00
test_embedding.py Added Differentiable per_sample_weights Check to EmbeddingBag.cpp (#142338) 2024-12-11 03:42:17 +00:00
test_init.py Fix unused Python variables in test/nn (#143396) 2024-12-18 03:30:54 +00:00
test_lazy_modules.py Add None return type to init -- tests (#132352) 2024-08-01 15:44:51 +00:00
test_load_state_dict.py Fix unused Python variables in test/[e-z]* (#136964) 2024-12-18 23:02:30 +00:00
test_module_hooks.py [4/N] Apply py39 ruff and pyupgrade fixes (#143257) 2025-01-04 10:47:51 +00:00
test_multihead_attention.py [ROCm] Update to AOTriton 0.7b (#134498) 2024-09-11 20:34:01 +00:00
test_packed_sequence.py [4/N] Apply py39 ruff and pyupgrade fixes (#143257) 2025-01-04 10:47:51 +00:00
test_parametrization.py Fix unused Python variables in test/nn (#143396) 2024-12-18 03:30:54 +00:00
test_pooling.py [4/N] Apply py39 ruff and pyupgrade fixes (#143257) 2025-01-04 10:47:51 +00:00
test_pruning.py Fix unused Python variables in test/nn (#143396) 2024-12-18 03:30:54 +00:00