[docs] Fix backticks in docs (#60474)
Summary:
There is a very common error when writing docs: one forgets to write a matching `` ` ``, and something like ``:attr:`x`` ends up rendered verbatim in the docs. This PR fixes most (all?) of these errors (and a few others).

I found these by running ``grep -r ">[^#<][^<]*\`"`` on the `docs/build/html/generated` folder. The regex matches an HTML tag whose text content does not start with `#` (Python comments in example code may legitimately contain backticks) but does contain a backtick, i.e. a backtick that leaked into the rendered HTML. This regex has not produced any false positives in the current codebase, so I am inclined to suggest that we add this check to the CI. Would this be possible / reasonable / easy to do malfet?

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60474

Reviewed By: mrshenli

Differential Revision: D29309633

Pulled By: albanD

fbshipit-source-id: 9621e0e9f87590cea060dd084fa367442b6bd046
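If we did add such a check to CI, a minimal sketch of what it could look like is below. The script is hypothetical and not part of this PR; it assumes the docs have already been built into `docs/build/html/generated` and reuses the same regex as the grep above.

```python
# Hypothetical CI helper (not part of this PR): scan the built HTML docs for
# stray backticks that indicate an unclosed `` ` `` in the reST sources.
import re
import sys
from pathlib import Path

# Same pattern as the grep above: tag content that does not start with '#'
# (Python comments in example code may contain backticks) but still contains
# a backtick in the rendered HTML.
STRAY_BACKTICK = re.compile(r">[^#<][^<]*`")

def find_stray_backticks(root: str = "docs/build/html/generated") -> int:
    hits = 0
    for html_file in sorted(Path(root).rglob("*.html")):
        text = html_file.read_text(encoding="utf-8", errors="ignore")
        for match in STRAY_BACKTICK.finditer(text):
            print(f"{html_file}: {match.group(0)!r}")
            hits += 1
    return hits

if __name__ == "__main__":
    # Fail the CI job if any stray backtick is found in the rendered docs.
    sys.exit(1 if find_stray_backticks() else 0)
```

Running this (or the grep one-liner) after the docs build and failing the job on a non-zero count would catch these mismatched backticks before they land.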
parent bb9e1150ea
commit 4e347f1242
@@ -1751,7 +1751,7 @@ add_docstr_all('index_add_',
 r"""
 index_add_(dim, index, tensor, *, alpha=1) -> Tensor
 
-Accumulate the elements of attr:`alpha` times :attr:`tensor` into the :attr:`self`
+Accumulate the elements of :attr:`alpha` times :attr:`tensor` into the :attr:`self`
 tensor by adding to the indices in the order given in :attr:`index`. For example,
 if ``dim == 0``, ``index[i] == j``, and ``alpha=-1``, then the ``i``\ th row of
 :attr:`tensor` is subtracted from the ``j``\ th row of :attr:`self`.

@@ -4642,7 +4642,7 @@ Multiplies :attr:`input` by 2**:attr:`other`.
 
 Typically this function is used to construct floating point numbers by multiplying
 mantissas in :attr:`input` with integral powers of two created from the exponents
-in :attr:'other'.
+in :attr:`other`.
 
 Args:
 {input}

@@ -5242,7 +5242,7 @@ remaining :math:`m - n` rows of that column.
 last `m - n` columns in the case `m > n`. In :func:`torch.linalg.lstsq`, the residuals
 are in the field 'residuals' of the returned named tuple.
 
-Unpacking the solution as``X = torch.lstsq(B, A).solution[:A.size(1)]`` should be replaced with
+Unpacking the solution as ``X = torch.lstsq(B, A).solution[:A.size(1)]`` should be replaced with
 
 .. code:: python
 

@@ -5671,10 +5671,7 @@ dimension(s) :attr:`dim`.
 while ``max(dim)``/``min(dim)`` propagates gradient only to a single
 index in the source tensor.
 
-If :attr:`keepdim is ``True``, the output tensors are of the same size
-as :attr:`input` except in the dimension(s) :attr:`dim` where they are of size 1.
-Otherwise, :attr:`dim`s are squeezed (see :func:`torch.squeeze`), resulting
-in the output tensors having fewer dimension than :attr:`input`.
+{keepdim_details}
 
 Args:
 {input}

@@ -6132,10 +6129,7 @@ dimension(s) :attr:`dim`.
 while ``max(dim)``/``min(dim)`` propagates gradient only to a single
 index in the source tensor.
 
-If :attr:`keepdim` is ``True``, the output tensors are of the same size as
-:attr:`input` except in the dimension(s) :attr:`dim` where they are of size 1.
-Otherwise, :attr:`dim`s are squeezed (see :func:`torch.squeeze`), resulting in
-the output tensors having fewer dimensions than :attr:`input`.
+{keepdim_details}
 
 Args:
 {input}

@@ -6677,7 +6671,7 @@ add_docstr(torch.narrow,
 narrow(input, dim, start, length) -> Tensor
 
 Returns a new tensor that is a narrowed version of :attr:`input` tensor. The
-dimension :attr:`dim` is input from :attr:`start` to :attr:`start + length`. The
+dimension :attr:`dim` is input from :attr:`start` to ``start + length``. The
 returned tensor and :attr:`input` tensor share the same underlying storage.
 
 Args:

@@ -6704,7 +6698,7 @@ nan_to_num(input, nan=0.0, posinf=None, neginf=None, *, out=None) -> Tensor
 
 Replaces :literal:`NaN`, positive infinity, and negative infinity values in :attr:`input`
 with the values specified by :attr:`nan`, :attr:`posinf`, and :attr:`neginf`, respectively.
-By default, :literal:`NaN`s are replaced with zero, positive infinity is replaced with the
+By default, :literal:`NaN`\ s are replaced with zero, positive infinity is replaced with the
 greatest finite value representable by :attr:`input`'s dtype, and negative infinity
 is replaced with the least finite value representable by :attr:`input`'s dtype.
 

@@ -6837,7 +6831,7 @@ nonzero(input, *, out=None, as_tuple=False) -> LongTensor or tuple of LongTensor
 When :attr:`input` is on CUDA, :func:`torch.nonzero() <torch.nonzero>` causes
 host-device synchronization.
 
-**When** :attr:`as_tuple` **is ``False`` (default)**:
+**When** :attr:`as_tuple` **is** ``False`` **(default)**:
 
 Returns a tensor containing the indices of all non-zero elements of
 :attr:`input`. Each row in the result contains the indices of a non-zero

@@ -6848,7 +6842,7 @@ If :attr:`input` has :math:`n` dimensions, then the resulting indices tensor
 :attr:`out` is of size :math:`(z \times n)`, where :math:`z` is the total number of
 non-zero elements in the :attr:`input` tensor.
 
-**When** :attr:`as_tuple` **is ``True``**:
+**When** :attr:`as_tuple` **is** ``True``:
 
 Returns a tuple of 1-D tensors, one for each dimension in :attr:`input`,
 each containing the indices (in that dimension) of all non-zero elements of

@@ -10205,9 +10199,9 @@ The operation is defined as:
 
 Arguments:
 condition (BoolTensor): When True (nonzero), yield x, otherwise yield y
-x (Tensor or Scalar): value (if :attr:x is a scalar) or values selected at indices
+x (Tensor or Scalar): value (if :attr:`x` is a scalar) or values selected at indices
 where :attr:`condition` is ``True``
-y (Tensor or Scalar): value (if :attr:x is a scalar) or values selected at indices
+y (Tensor or Scalar): value (if :attr:`y` is a scalar) or values selected at indices
 where :attr:`condition` is ``False``
 
 Returns:

@@ -23,11 +23,11 @@ class Categorical(Distribution):
 relative probability vectors.
 
 .. note:: The `probs` argument must be non-negative, finite and have a non-zero sum,
-and it will be normalized to sum to 1 along the last dimension. attr:`probs`
+and it will be normalized to sum to 1 along the last dimension. :attr:`probs`
 will return this normalized value.
 The `logits` argument will be interpreted as unnormalized log probabilities
 and can therefore be any real number. It will likewise be normalized so that
-the resulting probabilities sum to 1 along the last dimension. attr:`logits`
+the resulting probabilities sum to 1 along the last dimension. :attr:`logits`
 will return this normalized value.
 
 See also: :func:`torch.multinomial`

@@ -16,11 +16,11 @@ class Multinomial(Distribution):
 called (see example below)
 
 .. note:: The `probs` argument must be non-negative, finite and have a non-zero sum,
-and it will be normalized to sum to 1 along the last dimension. attr:`probs`
+and it will be normalized to sum to 1 along the last dimension. :attr:`probs`
 will return this normalized value.
 The `logits` argument will be interpreted as unnormalized log probabilities
 and can therefore be any real number. It will likewise be normalized so that
-the resulting probabilities sum to 1 along the last dimension. attr:`logits`
+the resulting probabilities sum to 1 along the last dimension. :attr:`logits`
 will return this normalized value.
 
 - :meth:`sample` requires a single shared `total_count` for all

@@ -12,11 +12,11 @@ class OneHotCategorical(Distribution):
 Samples are one-hot coded vectors of size ``probs.size(-1)``.
 
 .. note:: The `probs` argument must be non-negative, finite and have a non-zero sum,
-and it will be normalized to sum to 1 along the last dimension. attr:`probs`
+and it will be normalized to sum to 1 along the last dimension. :attr:`probs`
 will return this normalized value.
 The `logits` argument will be interpreted as unnormalized log probabilities
 and can therefore be any real number. It will likewise be normalized so that
-the resulting probabilities sum to 1 along the last dimension. attr:`logits`
+the resulting probabilities sum to 1 along the last dimension. :attr:`logits`
 will return this normalized value.
 
 See also: :func:`torch.distributions.Categorical` for specifications of

@@ -214,7 +214,7 @@ def einsum(*args):
 As of PyTorch 1.10 :func:`torch.einsum` also supports the sublist format (see examples below). In this format,
 subscripts for each operand are specified by sublists, list of integers in the range [0, 52). These sublists
 follow their operands, and an extra sublist can appear at the end of the input to specify the output's
-subscripts., e.g.`torch.einsum(op1, sublist1, op2, sublist2, ..., [subslist_out])`. Python's `Ellipsis` object
+subscripts., e.g. `torch.einsum(op1, sublist1, op2, sublist2, ..., [subslist_out])`. Python's `Ellipsis` object
 may be provided in a sublist to enable broadcasting as described in the Equation section above.
 
 Args:

@@ -1286,7 +1286,7 @@ def norm(input, p="fro", dim=None, keepdim=False, out=None, dtype=None): # noqa
 :attr:`dim` = ``None`` and :attr:`out` = ``None``.
 dtype (:class:`torch.dtype`, optional): the desired data type of
 returned tensor. If specified, the input tensor is casted to
-:attr:'dtype' while performing the operation. Default: None.
+:attr:`dtype` while performing the operation. Default: None.
 
 .. note::
 Even though ``p='fro'`` supports any number of dimensions, the true

@@ -82,7 +82,8 @@ def annotate(the_type, the_value):
 
 Though TorchScript can infer correct type for most Python expressions, there are some cases where
 type inference can be wrong, including:
-- Empty containers like `[]` and `{}`, which TorchScript assumes to be container of `Tensor`s
+
+- Empty containers like `[]` and `{}`, which TorchScript assumes to be container of `Tensor`
 - Optional types like `Optional[T]` but assigned a valid value of type `T`, TorchScript would assume
 it is type `T` rather than `Optional[T]`
 

@@ -27,10 +27,12 @@ def fork(func, *args, **kwargs):
 Asynchronous execution will only occur when run in TorchScript. If run in pure python,
 `fork` will not execute in parallel. `fork` will also not execute in parallel when invoked
 while tracing, however the `fork` and `wait` calls will be captured in the exported IR Graph.
-Warning:
+
+.. warning::
 `fork` tasks will execute non-deterministically. We recommend only spawning
 parallel fork tasks for pure functions that do not modify their inputs,
 module attributes, or global state.
+
 Args:
 func (callable or torch.nn.Module): A Python function or `torch.nn.Module`
 that will be invoked. If executed in TorchScript, it will execute asynchronously,

@@ -73,7 +73,8 @@ Attribute.__doc__ = """
 
 Though TorchScript can infer correct type for most Python expressions, there are some cases where
 type inference can be wrong, including:
-- Empty containers like `[]` and `{}`, which TorchScript assumes to be container of `Tensor`s
+
+- Empty containers like `[]` and `{}`, which TorchScript assumes to be container of `Tensor`
 - Optional types like `Optional[T]` but assigned a valid value of type `T`, TorchScript would assume
 it is type `T` rather than `Optional[T]`
 

@@ -1212,7 +1212,7 @@ If :attr:`A` is complex valued, it computes the norm of :attr:`A`\ `.abs()`
 
 Supports input of float, double, cfloat and cdouble dtypes.
 
-This function does not necessarily treat multidimensonal attr:`A` as a batch of
+This function does not necessarily treat multidimensonal :attr:`A` as a batch of
 vectors, instead:
 
 - If :attr:`dim`\ `= None`, :attr:`A` will be flattened before the norm is computed.

@@ -1223,15 +1223,15 @@ This behavior is for consistency with :func:`torch.linalg.norm`.
 
 :attr:`ord` defines the vector norm that is computed. The following norms are supported:
 
-====================== ========================================================
+====================== ===============================
 :attr:`ord`            vector norm
-====================== ========================================================
+====================== ===============================
 `2` (default)          `2`-norm (see below)
 `inf`                  `max(abs(x))`
 `-inf`                 `min(abs(x))`
 `0`                    `sum(x != 0)`
 other `int` or `float` `sum(abs(x)^{ord})^{(1 / ord)}`
-====================== ========================================================
+====================== ===============================
 
 where `inf` refers to `float('inf')`, NumPy's `inf` object, or any equivalent object.
 

@@ -2556,7 +2556,7 @@ def poisson_nll_loss(
 is set to ``False``, the losses are instead summed for each minibatch. Ignored
 when reduce is ``False``. Default: ``True``
 eps (float, optional): Small value to avoid evaluation of :math:`\log(0)` when
-:attr:`log_input`=``False``. Default: 1e-8
+:attr:`log_input`\ =\ ``False``. Default: 1e-8
 reduce (bool, optional): Deprecated (see :attr:`reduction`). By default, the
 losses are averaged or summed over observations for each minibatch depending
 on :attr:`size_average`. When :attr:`reduce` is ``False``, returns a loss per

@@ -763,7 +763,7 @@ class SyncBatchNorm(_BatchNorm):
 :class:`torch.nn.SyncBatchNorm` layers.
 
 Args:
-module (nn.Module): module containing one or more attr:`BatchNorm*D` layers
+module (nn.Module): module containing one or more :attr:`BatchNorm*D` layers
 process_group (optional): process group to scope synchronization,
 default is the whole world
 

@@ -768,7 +768,7 @@ class Module:
 .. function:: to(memory_format=torch.channels_last)
 
 Its signature is similar to :meth:`torch.Tensor.to`, but only accepts
-floating point or complex :attr:`dtype`s. In addition, this method will
+floating point or complex :attr:`dtype`\ s. In addition, this method will
 only cast the floating point or complex parameters and buffers to :attr:`dtype`
 (if given). The integral parameters and buffers will be moved
 :attr:`device`, if that is given, but with dtypes unchanged. When

@@ -668,7 +668,7 @@ class RandomStructured(BasePruningMethod):
 
 class LnStructured(BasePruningMethod):
 r"""Prune entire (currently unpruned) channels in a tensor based on their
-Ln-norm.
+L\ ``n``-norm.
 
 Args:
 amount (int or float): quantity of channels to prune.

@@ -695,7 +695,7 @@ class LnStructured(BasePruningMethod):
 Starting from a base ``default_mask`` (which should be a mask of ones
 if the tensor has not been pruned yet), generate a mask to apply on
 top of the ``default_mask`` by zeroing out the channels along the
-specified dim with the lowest Ln-norm.
+specified dim with the lowest L\ ``n``-norm.
 
 Args:
 t (torch.Tensor): tensor representing the parameter to prune

@@ -824,6 +824,7 @@ def identity(module, name):
 parameter called ``name`` in ``module`` without actually pruning any
 units. Modifies module in place (and also return the modified module)
 by:
+
 1) adding a named buffer called ``name+'_mask'`` corresponding to the
 binary mask applied to the parameter ``name`` by the pruning method.
 2) replacing the parameter ``name`` by its pruned version, while the

@@ -855,8 +856,9 @@ def random_unstructured(module, name, amount):
 by removing the specified ``amount`` of (currently unpruned) units
 selected at random.
 Modifies module in place (and also return the modified module) by:
+
 1) adding a named buffer called ``name+'_mask'`` corresponding to the
-binary mask applied to the parameter `name` by the pruning method.
+binary mask applied to the parameter ``name`` by the pruning method.
 2) replacing the parameter ``name`` by its pruned version, while the
 original (unpruned) parameter is stored in a new parameter named
 ``name+'_orig'``.

@@ -889,6 +891,7 @@ def l1_unstructured(module, name, amount, importance_scores=None):
 lowest L1-norm.
 Modifies module in place (and also return the modified module)
 by:
+
 1) adding a named buffer called ``name+'_mask'`` corresponding to the
 binary mask applied to the parameter ``name`` by the pruning method.
 2) replacing the parameter ``name`` by its pruned version, while the

@@ -929,6 +932,7 @@ def random_structured(module, name, amount, dim):
 along the specified ``dim`` selected at random.
 Modifies module in place (and also return the modified module)
 by:
+
 1) adding a named buffer called ``name+'_mask'`` corresponding to the
 binary mask applied to the parameter ``name`` by the pruning method.
 2) replacing the parameter ``name`` by its pruned version, while the

@@ -963,9 +967,10 @@ def random_structured(module, name, amount, dim):
 def ln_structured(module, name, amount, n, dim, importance_scores=None):
 r"""Prunes tensor corresponding to parameter called ``name`` in ``module``
 by removing the specified ``amount`` of (currently unpruned) channels
-along the specified ``dim`` with the lowest L``n``-norm.
+along the specified ``dim`` with the lowest L\ ``n``-norm.
 Modifies module in place (and also return the modified module)
 by:
+
 1) adding a named buffer called ``name+'_mask'`` corresponding to the
 binary mask applied to the parameter ``name`` by the pruning method.
 2) replacing the parameter ``name`` by its pruned version, while the

@@ -1008,6 +1013,7 @@ def global_unstructured(parameters, pruning_method, importance_scores=None, **kw
 Globally prunes tensors corresponding to all parameters in ``parameters``
 by applying the specified ``pruning_method``.
 Modifies modules in place by:
+
 1) adding a named buffer called ``name+'_mask'`` corresponding to the
 binary mask applied to the parameter ``name`` by the pruning method.
 2) replacing the parameter ``name`` by its pruned version, while the

@@ -1127,6 +1133,7 @@ def custom_from_mask(module, name, mask):
 by applying the pre-computed mask in ``mask``.
 Modifies module in place (and also return the modified module)
 by:
+
 1) adding a named buffer called ``name+'_mask'`` corresponding to the
 binary mask applied to the parameter ``name`` by the pruning method.
 2) replacing the parameter ``name`` by its pruned version, while the

@@ -11,20 +11,20 @@ class _LearnableFakeQuantize(torch.quantization.FakeQuantizeBase):
 In addition to the attributes in the original FakeQuantize module, the _LearnableFakeQuantize
 module also includes the following attributes to support quantization parameter learning.
 
-* :attr: `channel_len` defines the length of the channel when initializing scale and zero point
+* :attr:`channel_len` defines the length of the channel when initializing scale and zero point
 for the per channel case.
 
-* :attr: `use_grad_scaling` defines the flag for whether the gradients for scale and zero point are
+* :attr:`use_grad_scaling` defines the flag for whether the gradients for scale and zero point are
 normalized by the constant, which is proportional to the square root of the number of
 elements in the tensor. The related literature justifying the use of this particular constant
 can be found here: https://openreview.net/pdf?id=rkgO66VKDS.
 
-* :attr: `fake_quant_enabled` defines the flag for enabling fake quantization on the output.
+* :attr:`fake_quant_enabled` defines the flag for enabling fake quantization on the output.
 
-* :attr: `static_enabled` defines the flag for using observer's static estimation for
+* :attr:`static_enabled` defines the flag for using observer's static estimation for
 scale and zero point.
 
-* attr: `learning_enabled` defines the flag for enabling backpropagation for scale and zero point.
+* :attr:`learning_enabled` defines the flag for enabling backpropagation for scale and zero point.
 """
 def __init__(self, observer, quant_min=0, quant_max=255, scale=1., zero_point=0., channel_len=-1,
 use_grad_scaling=False, **observer_kwargs):