Mirror of https://github.com/zebrajr/pytorch.git (synced 2025-12-07 12:21:27 +01:00)
Summary: The title says it all.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48577

Reviewed By: ejguan

Differential Revision: D25224315

Pulled By: mrshenli

fbshipit-source-id: 8e34e9ec29b28768834972bfcdb443efd184f9ca
This commit is contained in:
parent
eba96b91cc
commit
b84d9b48d8
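Why the one-token change below matters (an explanatory note, not part of the commit): Sphinx's ``:math:`` role hands its argument to a LaTeX/MathJax math renderer, where a bare identifier such as ``num_heads`` is typeset letter by letter as italic symbols and the underscore is read as a subscript operator. Wrapping the identifier in ``\text{}`` makes it render upright and verbatim. A rough before/after sketch (the underscore is escaped here for plain LaTeX; the rendered docs' math engine accepts it unescaped, which is why the patch itself writes ``\text{num_heads}``):

% before: each letter typeset as a separate italic variable,
% "_" parsed as a subscript
$(N * num\_heads, L, S)$
% after: the identifier is rendered as literal upright text
$(N * \text{num\_heads}, L, S)$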
@@ -950,7 +950,7 @@ class MultiheadAttention(Module):
         with the zero positions will be unchanged. If a BoolTensor is provided, the positions with the
         value of ``True`` will be ignored while the position with the value of ``False`` will be unchanged.
       - attn_mask: 2D mask :math:`(L, S)` where L is the target sequence length, S is the source sequence length.
-        3D mask :math:`(N*num_heads, L, S)` where N is the batch size, L is the target sequence length,
+        3D mask :math:`(N*\text{num_heads}, L, S)` where N is the batch size, L is the target sequence length,
         S is the source sequence length. attn_mask ensure that position i is allowed to attend the unmasked
         positions. If a ByteTensor is provided, the non-zero positions are not allowed to attend
         while the zero positions will be unchanged. If a BoolTensor is provided, positions with ``True``
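The docstring being patched describes the mask shapes abstractly. As a minimal illustration (sizes and tensor names are invented for the example, not taken from the commit), a 2D BoolTensor attn_mask of shape (L, S) can be passed to nn.MultiheadAttention like this, with ``True`` marking positions that are not allowed to attend:

import torch
import torch.nn as nn

# Illustrative sizes: batch, target length, source length (arbitrary).
N, L, S = 2, 4, 6
embed_dim, num_heads = 8, 2

# Default nn.MultiheadAttention expects (L, N, E) / (S, N, E) inputs.
mha = nn.MultiheadAttention(embed_dim, num_heads)
query = torch.randn(L, N, embed_dim)
key = torch.randn(S, N, embed_dim)
value = torch.randn(S, N, embed_dim)

# 2D BoolTensor mask of shape (L, S): True marks positions that are
# not allowed to attend, matching the docstring above.
attn_mask = torch.zeros(L, S, dtype=torch.bool)
attn_mask[:, -2:] = True  # block every query from the last two source positions

# A per-head mask would instead have shape (N * num_heads, L, S).
out, weights = mha(query, key, value, attn_mask=attn_mask)
print(out.shape)      # torch.Size([4, 2, 8])  -> (L, N, E)
print(weights.shape)  # torch.Size([2, 4, 6])  -> (N, L, S), averaged over heads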