Since CUDA 11.x (the docs need updating for this; the current PR says 12.2, which is incorrect) we have been allocating cuBLAS workspaces explicitly per handle/stream combination (https://github.com/pytorch/pytorch/pull/85447). According to the cuBLAS documentation (https://docs.nvidia.com/cuda/cublas/#results-reproducibility), this appears to be sufficient for determinism without any explicit workspace configuration such as `CUBLAS_WORKSPACE_CONFIG=:4096:8` or `:16:8`, as was previously stated in the PyTorch docs. Planning to add an explicit determinism test as well.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161749
Approved by: https://github.com/ngimel
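To illustrate the kind of determinism check the message refers to, here is a minimal sketch that runs the same cuBLAS-backed matmul twice on one device and verifies bitwise-identical outputs. The helper name, sizes, and structure are illustrative assumptions, not the actual test added by the PR.

```python
# Minimal sketch of a run-to-run reproducibility check for a cuBLAS-backed
# matmul. Illustrative only; not the actual test added in the PR.
import torch

def matmul_is_reproducible(n: int = 512) -> bool:  # hypothetical helper
    assert torch.cuda.is_available(), "requires a CUDA device"
    a = torch.randn(n, n, device="cuda")
    b = torch.randn(n, n, device="cuda")
    out1 = a @ b
    out2 = a @ b
    # Determinism means bitwise-identical results, so compare with
    # torch.equal rather than a tolerance-based check like torch.allclose.
    return torch.equal(out1, out2)

if __name__ == "__main__":
    print("reproducible:", matmul_is_reproducible())
```

Note that this only exercises within-run reproducibility on a single device and stream, which is the case the per-handle/per-stream workspace allocation is meant to cover.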
Files in this directory:

- amp_examples.rst
- autograd.rst
- broadcasting.rst
- cpu_threading_torchscript_inference.rst
- cuda.rst
- custom_operators.rst
- ddp.rst
- extending.func.rst
- extending.rst
- faq.rst
- get_start_xpu.rst
- gradcheck.rst
- hip.rst
- large_scale_deployments.rst
- libtorch_stable_abi.md
- mkldnn.rst
- modules.rst
- mps.rst
- multiprocessing.rst
- numerical_accuracy.rst
- out.rst
- randomness.rst
- serialization.rst
- windows.rst