Summary:
Per title.
This PR introduces a global flag that lets PyTorch prefer one of several backend implementations when calling linear algebra functions on the GPU.
Usage:
```python
torch.backends.cuda.preferred_linalg_library('cusolver')
```
Available options (str): `'default'`, `'cusolver'`, `'magma'`.
Issue https://github.com/pytorch/pytorch/issues/63992 inspired me to write this PR. No heuristic is perfect across all devices, library versions, matrix shapes, and workloads, so being able to switch linear algebra backends conveniently at runtime can yield better performance.
Performance of linear algebra operators after this PR should be no worse than before. The flag is set to **`'default'`** by default, which preserves the behavior from before this PR.
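For illustration, a fuller session might look like the sketch below (assuming a CUDA build; calling the function with no argument returns the current setting, and `torch.linalg.inv` stands in for any affected linear algebra op):

```python
import torch

# With no argument, the call returns the currently preferred library.
print(torch.backends.cuda.preferred_linalg_library())

# Prefer cuSOLVER for subsequent linear algebra calls on CUDA tensors.
torch.backends.cuda.preferred_linalg_library('cusolver')
x = torch.randn(64, 64, device='cuda')
torch.linalg.inv(x)  # routed to cuSOLVER where supported

# Restore the default heuristic.
torch.backends.cuda.preferred_linalg_library('default')
```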
The implementation of this PR largely follows that of https://github.com/pytorch/pytorch/pull/67790.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67980
Reviewed By: mruberry
Differential Revision: D32849457
Pulled By: ngimel
fbshipit-source-id: 679fee7744a03af057995aef06316306073010a6
.. role:: hidden
    :class: hidden-section

torch.backends
==============

`torch.backends` controls the behavior of various backends that PyTorch supports.

These backends include:

- ``torch.backends.cuda``
- ``torch.backends.cudnn``
- ``torch.backends.mkl``
- ``torch.backends.mkldnn``
- ``torch.backends.openmp``


torch.backends.cuda
^^^^^^^^^^^^^^^^^^^

.. autofunction:: torch.backends.cuda.is_built
.. attribute:: torch.backends.cuda.matmul.allow_tf32

    A :class:`bool` that controls whether TensorFloat-32 tensor cores may be used in matrix
    multiplications on Ampere or newer GPUs. See :ref:`tf32_on_ampere`.

.. attribute:: torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction

    A :class:`bool` that controls whether reduced-precision reductions (e.g., with an fp16
    accumulation type) are allowed with fp16 GEMMs.

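Both flags above are plain module-level booleans that can be set at runtime. A
minimal sketch (the effects require a CUDA build on suitable hardware)::

    import torch

    # Allow TF32 tensor cores in matmuls (faster, slightly reduced precision).
    torch.backends.cuda.matmul.allow_tf32 = True

    # Disallow reduced-precision (fp16) accumulation in fp16 GEMMs.
    torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False
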
.. attribute:: torch.backends.cuda.cufft_plan_cache

    ``cufft_plan_cache`` caches the cuFFT plans.

    .. attribute:: size

        A read-only :class:`int` that shows the number of plans currently in the cuFFT plan cache.

    .. attribute:: max_size

        An :class:`int` that controls the capacity of the cuFFT plan cache.

    .. method:: clear()

        Clears the cuFFT plan cache.

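A short sketch of working with the plan cache (assuming a CUDA build;
``torch.fft.fft`` serves only as an example plan-creating call)::

    import torch

    cache = torch.backends.cuda.cufft_plan_cache
    cache.max_size = 32  # bound how many plans may be cached

    torch.fft.fft(torch.randn(1024, device='cuda'))  # creates and caches a plan
    print(cache.size)    # number of plans currently in the cache
    cache.clear()        # evict all cached plans
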
.. autofunction:: torch.backends.cuda.preferred_linalg_library


torch.backends.cudnn
^^^^^^^^^^^^^^^^^^^^

.. autofunction:: torch.backends.cudnn.version
.. autofunction:: torch.backends.cudnn.is_available
.. attribute:: torch.backends.cudnn.enabled

    A :class:`bool` that controls whether cuDNN is enabled.

.. attribute:: torch.backends.cudnn.allow_tf32

    A :class:`bool` that controls whether TensorFloat-32 tensor cores may be used in cuDNN
    convolutions on Ampere or newer GPUs. See :ref:`tf32_on_ampere`.

.. attribute:: torch.backends.cudnn.deterministic

    A :class:`bool` that, if True, causes cuDNN to only use deterministic convolution algorithms.
    See also :func:`torch.are_deterministic_algorithms_enabled` and
    :func:`torch.use_deterministic_algorithms`.

.. attribute:: torch.backends.cudnn.benchmark

    A :class:`bool` that, if True, causes cuDNN to benchmark multiple convolution algorithms
    and select the fastest.

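A common reproducibility-versus-speed trade-off is expressed through the two
flags above. A sketch (actual algorithm choices depend on the cuDNN version
and hardware)::

    import torch

    # Reproducible convolutions: deterministic algorithms, no autotuning.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

    # Or favor throughput: let cuDNN benchmark candidate algorithms per
    # input shape and cache the fastest.
    # torch.backends.cudnn.benchmark = True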


torch.backends.mkl
^^^^^^^^^^^^^^^^^^

.. autofunction:: torch.backends.mkl.is_available


torch.backends.mkldnn
^^^^^^^^^^^^^^^^^^^^^

.. autofunction:: torch.backends.mkldnn.is_available


torch.backends.openmp
^^^^^^^^^^^^^^^^^^^^^

.. autofunction:: torch.backends.openmp.is_available
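
As a usage note, the availability checks above can be called directly to see
what the running build supports, e.g.::

    import torch

    print(torch.backends.cuda.is_built())
    print(torch.backends.cudnn.is_available())
    print(torch.backends.mkl.is_available())
    print(torch.backends.mkldnn.is_available())
    print(torch.backends.openmp.is_available())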