mirror of https://github.com/zebrajr/pytorch.git (synced 2025-12-06 12:20:52 +01:00)
Document limitations of weights_only in SECURITY.md and torch.load doc (#165645)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165645
Approved by: https://github.com/albanD
parent 3f69b4d9b4
commit 6ecd6b23b6
@@ -31,9 +31,9 @@ Be careful when running untrusted models. This classification includes models cr
 
 **Prefer to execute untrusted models within a secure, isolated environment such as a sandbox** (e.g., containers, virtual machines). This helps protect your system from potentially malicious code. You can find further details and instructions in [this page](https://developers.google.com/code-sandboxing).
 
-**Be mindful of risky model formats**. Give preference to share and load weights with the appropriate format for your use case. [safetensors](https://huggingface.co/docs/safetensors/en/index) gives the most safety but is the most restricted in what it supports. [`torch.load`](https://pytorch.org/docs/stable/generated/torch.load.html#torch.load) with `weights_only=True` is also secure to our knowledge even though it offers significantly larger surface of attack. Loading un-trusted checkpoint with `weights_only=False` MUST never be done.
+**Be mindful of risky model formats**. Give preference to share and load weights with the appropriate format for your use case. [safetensors](https://huggingface.co/docs/safetensors/en/index) gives the most safety but is the most restricted in what it supports. [`torch.load`](https://pytorch.org/docs/stable/generated/torch.load.html#torch.load) has a significantly larger surface of attack but is more flexible in what it can serialize. See the documentation for more details.
 
 
+Even for more secure serialization formats, unexpected inputs to the downstream system can cause diverse security threats (e.g. denial of service, out of bound reads/writes) and thus we recommend extensive validation of any untrusted inputs.
 
 Important Note: The trustworthiness of a model is not binary. You must always determine the proper level of caution depending on the specific model and how it matches your use case and risk tolerance.
 
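The added SECURITY.md sentence recommends validating untrusted inputs before downstream code consumes them. A minimal sketch of that idea follows, using plain Python lists instead of tensors so it runs without PyTorch. The helper names `validate_sparse_indices` and `densify` are hypothetical and are not part of PyTorch or of this commit.

```python
def validate_sparse_indices(indices, values, size):
    """Raise ValueError unless indices and values describe a valid sparse vector."""
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    for i in indices:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(i, bool) or not isinstance(i, int):
            raise ValueError(f"index {i!r} is not an integer")
        # Negative indices would silently wrap around in Python,
        # so they are rejected here as well as too-large ones.
        if not 0 <= i < size:
            raise ValueError(f"index {i} is out of range for size {size}")


def densify(indices, values, size):
    """Build a dense list from untrusted indices/values, validating first."""
    validate_sparse_indices(indices, values, size)
    dense = [0.0] * size
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


dense = densify([0, 2], [1.0, 3.0], 4)
```

In native code the same unchecked index could read or write out of bounds, which is why the check sits in front of the consumer and not inside the loader.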
@@ -263,12 +263,31 @@ offers a comprehensive example of using these features to manipulate a checkpoin
 Starting in version 2.6, ``torch.load`` will use ``weights_only=True`` if the ``pickle_module``
 argument is not passed.
 
+.. _weights-only-security:
+
+weights_only security
+^^^^^^^^^^^^^^^^^^^^^
+
 As discussed in the documentation for :func:`torch.load`, ``weights_only=True`` restricts
 the unpickler used in ``torch.load`` to only executing functions/building classes required for
 ``state_dicts`` of plain ``torch.Tensors`` as well as some other primitive types. Further,
 unlike the default ``Unpickler`` provided by the ``pickle`` module, the ``weights_only`` Unpickler
 is not allowed to dynamically import anything during unpickling.
 
+``weights_only=True`` narrows the surface of remote code execution attacks but has the following limitations:
+
+1. ``weights_only=True`` does not guard against denial of service attacks.
+2. We try to prevent memory corruptions during ``torch.load(weights_only=True)`` but they might still be possible.
+
+Note that even if memory corruption does not occur during ``torch.load`` itself, loading CAN create
+unexpected objects for the downstream code that can also lead to memory corruption (e.g. a Tensor of
+indices and values made to a sparse Tensor in user code might write/read out of bounds).
+
+.. _weights-only-allowlist:
+
+weights_only allowlist
+^^^^^^^^^^^^^^^^^^^^^^
+
 As mentioned above, saving a module's ``state_dict`` is a best practice when using ``torch.save``. If loading an old
 checkpoint that contains an ``nn.Module``, we recommend ``weights_only=False``. When loading a checkpoint that contains
 tensor subclasses, there will likely be functions/classes that need to be allowlisted, see below for further details.
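The general idea of an unpickler that only resolves allowlisted globals can be illustrated with the standard library alone. The sketch below is not PyTorch's `weights_only` implementation. It follows the "restricting globals" pattern from the Python `pickle` documentation, and the names `ALLOWED_GLOBALS`, `AllowlistUnpickler`, `restricted_loads` and `CallsOnLoad` are hypothetical.

```python
import io
import operator
import pickle
from collections import OrderedDict

# Hypothetical allowlist: the only (module, name) pairs the unpickler may resolve.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global object.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowlisted")


def restricted_loads(data):
    """Unpickle bytes while refusing any global outside the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class CallsOnLoad:
    # __reduce__ lets a pickle ask the loader to call an arbitrary callable.
    # operator.add is a harmless stand-in for a malicious function.
    def __reduce__(self):
        return (operator.add, (1, 2))


safe_bytes = pickle.dumps(OrderedDict(weight=[1.0, 2.0]))
unsafe_bytes = pickle.dumps(CallsOnLoad())

loaded = restricted_loads(safe_bytes)       # allowlisted container loads normally
called = pickle.loads(unsafe_bytes)         # the default unpickler runs the callable

try:
    restricted_loads(unsafe_bytes)
    blocked = False
except pickle.UnpicklingError:
    blocked = True                          # the restricted unpickler refuses it
```

As the limitations listed in the hunk point out, an allowlist of this kind narrows code execution but does nothing about oversized or malformed payloads, so it does not replace sandboxing or input validation.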
@@ -1304,6 +1304,11 @@ def load(
 
 Loads an object saved with :func:`torch.save` from a file.
 
+.. warning::
+:func:`torch.load()` uses an unpickler under the hood. **Never load data from an untrusted source.**
+
+See :ref:`weights-only-security` for more details.
+
 :func:`torch.load` uses Python's unpickling facilities but treats storages,
 which underlie tensors, specially. They are first deserialized on the
 CPU and are then moved to the device they were saved from. If this fails
@@ -1356,13 +1361,6 @@ def load(
 :func:`pickle_module.load` and :func:`pickle_module.Unpickler`, e.g.,
 :attr:`errors=...`.
 
-.. warning::
-:func:`torch.load()` unless `weights_only` parameter is set to `True`,
-uses ``pickle`` module implicitly, which is known to be insecure.
-It is possible to construct malicious pickle data which will execute arbitrary code
-during unpickling. Never load data that could have come from an untrusted
-source in an unsafe mode, or that could have been tampered with. **Only load data you trust**.
-
 .. note::
 When you call :func:`torch.load()` on a file which contains GPU tensors, those tensors
 will be loaded to GPU by default. You can call ``torch.load(.., map_location='cpu')``
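For reference, a short sketch of the call pattern these docstring hunks describe: saving a plain `state_dict` and loading it back with `weights_only=True` and `map_location='cpu'`. It assumes PyTorch is installed. The in-memory buffer and the tensor contents are illustrative only, and the sketch says nothing about whether a given checkpoint is trustworthy.

```python
import io

import torch

# A plain state_dict-like mapping of tensors, the case weights_only=True targets.
state_dict = {"weight": torch.arange(4, dtype=torch.float32)}

# Save to an in-memory buffer so the sketch does not touch the filesystem.
buffer = io.BytesIO()
torch.save(state_dict, buffer)
buffer.seek(0)

# weights_only=True restricts the unpickler; map_location="cpu" keeps every
# tensor on the CPU regardless of the device it was saved from.
loaded = torch.load(buffer, map_location="cpu", weights_only=True)
```

Passing `weights_only=True` explicitly keeps the behaviour the same on releases older than 2.6, where it was not yet the default.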