Proposal of two float8 variants - e5m2 and e4m3 - based on https://arxiv.org/pdf/2209.05433.pdf
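To make the difference between the two variants concrete, here is a minimal pure-Python sketch of how one byte decodes in each format. It assumes the layout described in the linked paper: e5m2 has exponent bias 15 and IEEE-style infinities and NaNs, while e4m3 has exponent bias 7, no infinities, and NaN only when all exponent and mantissa bits are set. The helper names are hypothetical, and this is not the C++ implementation added by this PR.

```python
import math


def decode_float8(byte, exp_bits, man_bits, bias, ieee_specials):
    """Decode one float8 byte (sign | exponent | mantissa) into a Python float."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    exp_all_ones = (1 << exp_bits) - 1
    man_all_ones = (1 << man_bits) - 1

    if exp == exp_all_ones:
        if ieee_specials:
            # e5m2: all-ones exponent encodes inf (mantissa 0) or NaN.
            return sign * math.inf if man == 0 else math.nan
        if man == man_all_ones:
            # e4m3 (assumed): only the all-ones pattern is NaN, no infinities.
            return math.nan

    if exp == 0:
        # Subnormal: no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def decode_e5m2(byte):
    return decode_float8(byte, exp_bits=5, man_bits=2, bias=15, ieee_specials=True)


def decode_e4m3(byte):
    return decode_float8(byte, exp_bits=4, man_bits=3, bias=7, ieee_specials=False)


print(decode_e5m2(0x7B))  # largest finite e5m2 value: 57344.0
print(decode_e4m3(0x7E))  # largest finite e4m3 value: 448.0
```

The trade-off is visible in the maxima: e5m2 spends bits on range, e4m3 on precision.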
Hide all Float8 operator implementations behind `#if !defined(C10_MOBILE)` guard to keep Android build size almost unchanged
TODO:
- Refactor duplicated code
- Cleanup unbalanced pragma pop in dtype utils
- Add native implementation on the CUDA side
Co-authored-by: Nikita Shulga <nshulga@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104242
Approved by: https://github.com/albanD
- Add get_printoptions and printoptions context manager
- Improve edgeitems handling when it is zero
- Add render_call which can be used to conveniently print command
line arguments of a function call, while suppressing actual
tensor data
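The save-and-restore pattern behind a `printoptions` context manager, and one plausible reason `edgeitems == 0` needs special handling, can be sketched in plain Python. Everything below (the `PrintOptions` dataclass, `_OPTS`, `summarize`) is a hypothetical stand-in written for illustration, not the code in `torch/_tensor_str.py`.

```python
import contextlib
import dataclasses


@dataclasses.dataclass
class PrintOptions:
    precision: int = 4
    threshold: int = 1000
    edgeitems: int = 3
    linewidth: int = 80


_OPTS = PrintOptions()  # hypothetical module-level state


def set_printoptions(**kwargs):
    for name, value in kwargs.items():
        if not hasattr(_OPTS, name):
            raise TypeError(f"unknown print option: {name}")
        setattr(_OPTS, name, value)


def get_printoptions():
    """Return a snapshot of the current options as a plain dict."""
    return dataclasses.asdict(_OPTS)


@contextlib.contextmanager
def printoptions(**kwargs):
    """Temporarily override options; restore the old values on exit."""
    saved = get_printoptions()
    set_printoptions(**kwargs)
    try:
        yield
    finally:
        set_printoptions(**saved)


def summarize(seq, edgeitems):
    """Keep `edgeitems` elements from each end of `seq`."""
    if len(seq) <= 2 * edgeitems:
        return list(seq)
    if edgeitems == 0:
        # seq[-0:] is seq[0:], i.e. the whole sequence, so zero must be
        # handled explicitly rather than by slicing.
        return []
    return list(seq[:edgeitems]) + list(seq[-edgeitems:])


with printoptions(precision=2, edgeitems=0):
    print(get_printoptions()["precision"])               # 2
    print(summarize(list(range(10)), _OPTS.edgeitems))   # []
print(get_printoptions()["precision"])                   # 4
```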
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102623
Approved by: https://github.com/albanD
I got too confused by the FakeTensor printing, so this PR fixes it to
print normally.
Before:
```
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

with FakeTensorMode():
    x = torch.empty(2, 2, device="cpu")
    print(x)
# FakeTensor(FakeTensor(..., device='meta', shape=(2, 2)), cpu)
```
After (Tensor printing doesn't print the default device):
```
FakeTensor(..., shape=(2, 2))
```
Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99205
Approved by: https://github.com/eellison
Fixes https://github.com/pytorch/functorch/issues/1026
We need to disable functorch's stack-based dispatching mechanism inside
the tensor printing. Otherwise, all operations that clean up the data of
the Tensor for printing dispatch through the entire functorch stack and
cause problems.
Disabling stack-based dispatching and printing a functorch wrapped
tensor is not a problem; we're still able to get the attributes on the
wrapped tensor that we want.
Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85556
Approved by: https://github.com/samdow
By upstreaming functorch's tensor printing logic into PyTorch. There's
no way of creating a custom print function for a TensorImpl subclass (as
opposed to a torch_dispatch or torch_function tensor subclass, which can
just override repr()) right now, so we need to directly interpose inside
regular Tensor printing in PyTorch.
Monkey patching is bad; users do not expect `import blah` to change
something about another library.
Fixes https://github.com/pytorch/functorch/issues/900
Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85430
Approved by: https://github.com/ezyang
Add support for sparse fake tensors.
- The testing strategy is to run a fake tensor cross ref test on `test_sparse.py`. This is necessary because OpInfo sparse coverage is completely nonexistent. We could have tried to turn on cross ref testing globally for all files, but that would be very time consuming and the tests I'm interested in are mostly in this file. There are some exclusions in testing for things that don't work.
- I make the fake tensor converter raise an UnsupportedFakeTensorException if the meta converter fails to do a conversion (which can happen in a relatively large number of situations).
- I relax fake tensor invariants so that you can make a fake tensor from a meta tensor. This is useful because in the cross ref test sometimes we operate on meta tensors.
- Fake tensor wrapping is improved to handle the case when a function doesn't return any tensors
- Meta converter is taught how to convert sparse tensors to meta
There's still a little more cleanup that needs to be done, but this is good for review.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82172
Approved by: https://github.com/eellison
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74000
Now that we're in-core, we can just customize this.
ghstack-source-id: 151540966
Test Plan: Existing test_nestedtensor seems to pass
Reviewed By: ezyang
Differential Revision: D34665270
fbshipit-source-id: 5097944a4dc4fe80cea2b8576f0123466dbeab43
(cherry picked from commit d0315f46f9906c904639f43f218e439407f5b2a7)
Summary:
Fixes multiple compilations when printing an XLA tensor. Please check the conversation here: https://github.com/pytorch/xla/pull/3253
This is done to avoid compilations during tensor printing. Torch performs some tensor operations, such as slicing, to make the tensor readable, and these operations result in compilations. To avoid them, the tensor is copied to CPU before printing.
Example:
```
import torch
import torch_xla.core.xla_model as xm

dev = xm.xla_device()

def test_linear(input_shape=(8, 1024)):
    import pdb
    pdb.set_trace()  # PDB prints the return value when stepping out of the function
    linear = torch.nn.Linear(in_features=1024, out_features=4096, bias=True).to(dev)
    inp = torch.randn(*input_shape).to(dev)
    output = linear(inp)
    xm.mark_step()
    return output
```
Returning from this function would previously have resulted in 63 compilations, since PDB prints the return value, which in this case is an XLA tensor.
With this change, there is no compilation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71147
Reviewed By: shunting314
Differential Revision: D33795177
Pulled By: wconstab
fbshipit-source-id: 74b53d9a1cb7ef67f9d8b0a32064f3896be449b5
(cherry picked from commit a9e0687fc5)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69327
Original commit changeset: d44096d88265
Original Phabricator Diff: D32144240 (668574af4a)
Test Plan:
CI
original diff failed 175 builds in CI
Reviewed By: airboyang, anjali411
Differential Revision: D32809407
fbshipit-source-id: c7c8e69bcee0274992e2d5da901f035332e60071
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56058
User facing changes:
1. Adds a negative bit and corresponding new API (`is_neg()`,`resolve_neg()`)
2. `tensor.conj().imag` now returns a floating point tensor with neg bit set to 1 instead of a tensor with no notion of negative bit. Note that imag is still a view and all the view properties still hold for imag.
Non user facing changes:
1. Added a new Negative dispatch key and a backend fallback to handle it
2. Updated copy kernel to handle negative bit
3. Merged conjugate and negative bit fallback kernel
4. Fixed https://github.com/pytorch/pytorch/issues/60478 (caused by https://github.com/pytorch/pytorch/pull/54987)
Testing:
1. Added a new OpInfo-based test `test_neg_view`. It verifies that out-of-place and in-place operations work correctly for all operations when the input is a neg view tensor, by checking the result against an actually negated tensor. It also verifies that autograd returns the same output for both neg view and actually negated tensors, and that it works when grad_out is a neg view.
2. Added a new test class containing `test_conj_view`, `test_neg_view`.
Test Plan: Imported from OSS
Reviewed By: soulitzer
Differential Revision: D29636403
fbshipit-source-id: 12214c9dc4806c51850f4a72a109db9527c0ca63
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54987
Based on the prototypes by ezyang (https://github.com/pytorch/pytorch/pull/44799) and bdhirsh (https://github.com/pytorch/pytorch/pull/43702).
Here's a summary of the changes in this PR:
This PR adds a new dispatch key called Conjugate. This enables us to make the conjugate operation a view and to leverage specialized library functions that have a fast path for the Hermitian operation (conj + transpose).
1. Conjugate operation will now return a view with conj bit (1) for complex tensors and returns self for non-complex tensors as before. This also means `torch.view_as_real` will no longer be a view on conjugated complex tensors and is hence disabled. To fill the gap, we have added `torch.view_as_real_physical` which would return the real tensor agnostic of the conjugate bit on the input complex tensor. The information about conjugation on the old tensor can be obtained by calling `.is_conj()` on the new tensor.
2. NEW API:
a) `.conj()` -- now returning a view.
b) `.conj_physical()` -- does the physical conjugate operation. If the conj bit for input was set, you'd get `self.clone()`, else you'll get a new tensor with conjugated value in its memory.
c) `.conj_physical_()`, and `out=` variant
d) `.resolve_conj()` -- materializes the conjugation. returns self if the conj bit is unset, else returns a new tensor with conjugated values and conj bit set to 0.
e) `.resolve_conj_()` in-place version of (d)
f) `view_as_real_physical` -- as described in (1), it's functionally same as `view_as_real`, just that it doesn't error out on conjugated tensors.
g) `view_as_real` -- existing function, but now errors out on conjugated tensors.
3. Conjugate Fallback
a) Vast majority of PyTorch functions would currently use this fallback when they are called on a conjugated tensor.
b) This fallback is well equipped to handle the following cases:
- functional operation e.g., `torch.sin(input)`
- Mutable inputs and in-place operations e.g., `tensor.add_(2)`
- out-of-place operation e.g., `torch.sin(input, out=out)`
- Tensorlist input args
- NOTE: Meta tensors don't work with conjugate fallback.
4. Autograd
a) `resolve_conj()` is an identity function w.r.t. autograd
b) Everything else works as expected.
5. Testing:
a) All method_tests run with conjugate view tensors.
b) OpInfo tests that run with conjugate views
- test_variant_consistency_eager/jit
- gradcheck, gradgradcheck
- test_conj_views (that only run for `torch.cfloat` dtype)
NOTE: functions like `empty_like`, `zeros_like`, `randn_like`, `clone` don't propagate the conjugate bit.
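The idea of conjugation as a view with a conj bit can be illustrated with a toy model in plain Python. The `LazyConj` class below is hypothetical and exists only to show the semantics described above (`conj()` flips a bit and shares storage, `resolve_conj()` materializes); it says nothing about how the dispatch key or the fallback kernel is implemented.

```python
class LazyConj:
    """Toy stand-in for a complex tensor carrying a conjugate bit."""

    def __init__(self, storage, conj_bit=False):
        self.storage = storage    # list of Python complex numbers, possibly shared
        self.conj_bit = conj_bit

    def is_conj(self):
        return self.conj_bit

    def conj(self):
        # O(1) view: same storage, flipped bit, no data is touched.
        return LazyConj(self.storage, not self.conj_bit)

    def resolve_conj(self):
        # Returns self if the bit is unset, otherwise a new object with
        # conjugated values in fresh storage and the bit cleared.
        if not self.conj_bit:
            return self
        return LazyConj([z.conjugate() for z in self.storage], False)

    def tolist(self):
        # Logical values as seen by the user.
        if self.conj_bit:
            return [z.conjugate() for z in self.storage]
        return list(self.storage)


x = LazyConj([1 + 2j, 3 - 4j])
y = x.conj()
print(y.is_conj(), y.storage is x.storage)  # True True
print(y.tolist())                           # [(1-2j), (3+4j)]

x.storage[0] = 5 + 6j                       # writes through the shared storage
print(y.tolist()[0])                        # (5-6j)
```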
Follow up work:
1. conjugate view RFC
2. Add neg bit to re-enable view operation on conjugated tensors
3. Update linalg functions to call into specialized functions that fast path with the hermitian operation.
Test Plan: Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision: D28227315
Pulled By: anjali411
fbshipit-source-id: acab9402b9d6a970c6d512809b627a290c8def5f
Summary:
This is a follow up PR of https://github.com/pytorch/pytorch/issues/48463
> Rather than requiring that users write import numbers and then use numbers.Float etc., this PEP proposes a straightforward shortcut that is almost as effective: when an argument is annotated as having type float, an argument of type int is acceptable; similar, for an argument annotated as having type complex, arguments of type float or int are acceptable.
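A small illustration of the quoted rule, using made-up functions: a static type checker following PEP 484 accepts the calls below, even though `int` is not a runtime subclass of `float`.

```python
def scale(x: float, factor: float = 2.0) -> float:
    """Annotated with float; int arguments are acceptable to a type checker."""
    return x * factor


def as_complex(z: complex) -> complex:
    """Annotated with complex; float and int arguments are acceptable too."""
    return complex(z)


print(scale(3))              # 6.0
print(as_complex(2))         # (2+0j)
print(isinstance(3, float))  # False: the shortcut is a typing rule, not a runtime one
```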
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48584
Reviewed By: zhangguanheng66
Differential Revision: D25411080
Pulled By: malfet
fbshipit-source-id: e00dc1e9e6e46a8cfae77da4f2cf159c0c2b9bcc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42612
Add a new Quantizer that supports an input zero point (bias) that can be float.
The quantization equation in this case is
Xq = (Xf - bias) * inv_scale, where bias is float zero_point value
We start with per-row implementation and can extend to per-tensor in the future, if necessary
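As a rough illustration of the equation above, here is a pure-Python per-row sketch. How `scale` and `bias` are chosen (row minimum as bias, range divided by the largest quantized value as scale) is an assumption made for the example, as are the function names; the Quantizer in this PR may pick them differently.

```python
def quantize_row(row, bits=8):
    """Quantize one row with a float zero point: Xq = (Xf - bias) * inv_scale."""
    qmax = (1 << bits) - 1
    lo, hi = min(row), max(row)
    bias = lo                                         # float zero point (assumed: row minimum)
    inv_scale = qmax / (hi - lo) if hi > lo else 1.0  # guard against constant rows
    q = [min(qmax, max(0, round((x - bias) * inv_scale))) for x in row]
    return q, 1.0 / inv_scale, bias


def dequantize_row(q, scale, bias):
    """Inverse mapping: Xf ~= Xq * scale + bias."""
    return [v * scale + bias for v in q]


q, scale, bias = quantize_row([1.0, 2.0, 3.0, 4.0], bits=4)
print(q)     # [0, 5, 10, 15]
print(bias)  # 1.0
```

Because the bias stays in floating point, the row minimum maps exactly to quantized value 0 without rounding the zero point to an integer first.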
Test Plan:
python test/test_quantization.py TestQuantizedTensor
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D22960142
fbshipit-source-id: ca9ab6c5b45115d3dcb1c4358897093594313706
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40513
This PR makes the following changes:
1. Complex printing now uses print formatting for its real and imaginary values, and they are joined at the end.
2. Adding 1. naturally fixes the printing of complex tensors in sci_mode=True
```
>>> torch.tensor(float('inf')+float('inf')*1j)
tensor(nan+infj)
>>> torch.randn(2000, dtype=torch.cfloat)
tensor([ 0.3015-0.2502j, -1.1102+1.2218j, -0.6324+0.0640j, ...,
-1.0200-0.2302j, 0.6511-0.1889j, -0.1069+0.1702j])
>>> torch.tensor([1e-3, 3+4j, 1e-5j, 1e-2+3j, 5+1e-6j])
tensor([1.0000e-03+0.0000e+00j, 3.0000e+00+4.0000e+00j, 0.0000e+00+1.0000e-05j,
1.0000e-02+3.0000e+00j, 5.0000e+00+1.0000e-06j])
>>> torch.randn(3, dtype=torch.cfloat)
tensor([ 1.0992-0.4459j, 1.1073+0.1202j, -0.2177-0.6342j])
>>> x = torch.tensor([1e2, 1e-2])
>>> torch.set_printoptions(sci_mode=False)
>>> x
tensor([ 100.0000, 0.0100])
>>> x = torch.tensor([1e2, 1e-2j])
>>> x
tensor([100.+0.0000j, 0.+0.0100j])
```
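The approach in (1), formatting the real and imaginary parts separately and then joining them, can be sketched in a few lines of plain Python. `format_complex` is a hypothetical helper for illustration; it ignores the width alignment and integer-mode handling that the real formatter performs.

```python
def format_complex(z, precision=4, sci_mode=False):
    """Format real and imaginary parts independently, then join them."""
    spec = f".{precision}{'e' if sci_mode else 'f'}"
    real = format(z.real, spec)
    imag = format(z.imag, "+" + spec)  # '+' forces an explicit sign on the imaginary part
    return f"{real}{imag}j"


print(format_complex(0.3015 - 0.2502j))                        # 0.3015-0.2502j
print(format_complex(1e-3 + 0j, sci_mode=True))                # 1.0000e-03+0.0000e+00j
print(format_complex(complex(float("nan"), float("inf"))))     # nan+infj
```

Since each part goes through the same float formatting path, scientific notation works for complex values without any complex-specific logic.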
Test Plan: Imported from OSS
Differential Revision: D22309294
Pulled By: anjali411
fbshipit-source-id: 20edf9e28063725aeff39f3a246a2d7f348ff1e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38490
A meta tensor is a tensor that is a lot like a normal tensor,
except it doesn't actually have any data associated with it.
You can use them to carry out shape/dtype computations without
actually having to run the actual code; for example, this could
be used to do shape inference in a JIT analysis pass.
Check out the description in DispatchKey.h for more information.
Meta tensors are part of a larger project to rationalize how we
write kernels so that we don't have to duplicate shape logic
in CPU kernel, CUDA kernel and meta kernel (this PR makes the
duplication problem worse!). However, that infrastructure can
be built on top of this proof of concept, which just shows how
you can start writing meta kernels today even without this
infrastructure.
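To illustrate what "shape/dtype computation without data" means, here is a toy sketch in plain Python. `MetaTensor`, `broadcast_shapes` and `meta_add` are hypothetical names, not the meta kernels in this PR, and like the example formula mentioned below, the sketch deliberately skips type promotion and memory layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaTensor:
    """Carries only metadata; there is no storage at all."""
    shape: tuple
    dtype: str


def broadcast_shapes(a, b):
    """Right-aligned broadcasting of two shapes."""
    out = []
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        out.append(db if da == 1 else da)
    return tuple(reversed(out))


def meta_add(x, y):
    """Shape inference for an elementwise add; assumes equal dtypes."""
    if x.dtype != y.dtype:
        raise TypeError("this sketch does not implement type promotion")
    return MetaTensor(broadcast_shapes(x.shape, y.shape), x.dtype)


out = meta_add(MetaTensor((2, 1, 3), "float32"), MetaTensor((4, 3), "float32"))
print(out.shape)  # (2, 4, 3)
```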
There are a lot of things that don't work:
- I special cased printing for dense tensors only; if you try to
allocate a meta sparse / quantized tensor things aren't going
to work.
- The printing formula implies that torch.tensor() can take an
ellipsis, but I didn't add this.
- I wrote an example formula for binary operators, but it isn't
even right! (It doesn't do type promotion or memory layout
correctly). The most future proof way to do it right is to
factor out the relevant computation out of TensorIterator,
as it is quite involved.
- Nothing besides torch.add works right now
- Meta functions are ALWAYS included in mobile builds (selective
build doesn't work on them). This isn't a big deal for now
but will become more pressing as more meta functions are added.
One reason I'm putting up this PR now is to check with Yinghai Lu
if we can unblock shape inference for accelerators, while we are
still working on a long term plan for how to unify all shape
computation across our kernels.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D21935609
Pulled By: ezyang
fbshipit-source-id: f7d8636eeb8516b6bc296db99a16e56029972eee
Summary:
Minor speed up when printing.
Also allows you to print Tensors that you cannot perform autograd ops on.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39420
Differential Revision: D21889390
Pulled By: albanD
fbshipit-source-id: 4e229994eb89484795282e6eac37359ce46b5ebc
Summary:
I added the following to the docs:
1. `torch.save`.
    1. Added doc for `_use_new_zipfile_serialization` argument.
    2. Added a note telling that extension does not matter while saving.
    3. Added an example showing the use of above argument along with `pickle_protocol=5`.
2. `torch.split`
    1. Added an example showing the use of the function.
3. `torch.squeeze`
    1. Added a warning for batch_size=1 case.
4. `torch.set_printoptions`
    1. Changed the docs of `sci_mode` argument from
        ```
        sci_mode: Enable (True) or disable (False) scientific notation. If
        None (default) is specified, the value is defined by `_Formatter`
        ```
        to
        ```
        sci_mode: Enable (True) or disable (False) scientific notation. If
        None (default=False) is specified, the value is defined by
        `torch._tensor_str._Formatter`.
        ```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39303
Differential Revision: D21904504
Pulled By: zou3519
fbshipit-source-id: 92a324257d09d6bcfa0b410d4578859782b94488
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39483
I fixed all of the new errors that occurred because of the upgrade.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D21884575
Pulled By: ezyang
fbshipit-source-id: 45c8e1f1ecb410c8d7c46dd3922ad70e982a0685
Summary:
This PR addresses Issue https://github.com/pytorch/pytorch/issues/36279.
Previously, printing of complex tensors would sometimes yield extra spaces before the elements as shown below:
```
print(torch.tensor([[1 + 1.340j, 3 + 4j], [1.2 + 1.340j, 6.5 + 7j]], dtype=torch.complex64))
```
would yield
```
tensor([[(1.0000 + 1.3400j),
(3.0000 + 4.0000j)],
[(1.2000 + 1.3400j),
(6.5000 + 7.0000j)]], dtype=torch.complex64)
```
This occurs primarily because when the max width for the element is being assigned, the formatter's max_width is calculated prior to truncating the float values. As a result, ```self.max_width``` would end up being much longer than the final length of the element string to be printed.
I address this by adding a boolean variable that checks if a complex tensor contains only ints and change the control flow for calculating ```self.max_width``` accordingly.
Here are some sample outputs of both float and complex tensors:
```
tensor([[0., 0.],
[0., 0.]], dtype=torch.float64)
tensor([[(0.+0.j), (0.+0.j)],
[(0.+0.j), (0.+0.j)]], dtype=torch.complex64)
tensor([1.2000, 1.3400], dtype=torch.float64)
tensor([(1.2000+1.3400j)], dtype=torch.complex64)
tensor([[(1.0000+1.3400j), (3.0000+4.0000j)],
[(1.2000+1.3400j), (6.5000+7.0000j)]], dtype=torch.complex64)
tensor([1.0000, 2.0000, 3.0000, 4.5000])
tensor([(1.+2.j)], dtype=torch.complex64)
```
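The width problem described above can be reproduced in miniature with plain Python floats. `pad_column` is a hypothetical helper, not the `_Formatter` code: the `buggy=True` branch measures the width on the untruncated representation, while the default branch measures it on the strings that are actually printed.

```python
def pad_column(values, precision=4, buggy=False):
    """Right-align formatted values to a common width."""
    formatted = [f"{v:.{precision}f}" for v in values]
    if buggy:
        # Width taken before truncation: far wider than the printed text.
        width = max(len(repr(v)) for v in values)
    else:
        # Width taken from the final strings.
        width = max(len(s) for s in formatted)
    return [s.rjust(width) for s in formatted]


print(pad_column([1 / 3, 6.5]))              # ['0.3333', '6.5000']
print(pad_column([1 / 3, 6.5], buggy=True))  # same text, each padded to 18 characters
```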
cc ezyang anjali411 dylanbespalko
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36331
Differential Revision: D20955663
Pulled By: anjali411
fbshipit-source-id: c26a651eb5c9db6fcc315ad8d5c1bd9f4b4708f7
Summary:
See issue [https://github.com/pytorch/pytorch/issues/33494 Complex number printing inconsistent with float](https://github.com/pytorch/pytorch/issues/33494).
This change introduces an optional argument in Formatter's ```format``` function to discern whether a tensor is a float tensor or not. This way, complex tensors print in the same manner as float tensors:
- Only a decimal point and no zeros for integer values.
- Trailing zeros only if the value is truly a float.
- White space introduced to fill the gap so that +/- symbols and commas align.
Here are some example outputs.
```
print(torch.zeros((2,2), dtype=torch.float64))
```
yields
```
tensor([[0., 0.],
[0., 0.]], dtype=torch.float64)
```
```
print(torch.zeros((2,2), dtype=torch.complex64))
```
previously yielded
```
tensor([[(0.0000 + 0.0000j), (0.0000 + 0.0000j)],
[(0.0000 + 0.0000j), (0.0000 + 0.0000j)]], dtype=torch.complex64)
```
and now yields
```
tensor([[(0 + 0.j), (0 + 0.j)],
[(0 + 0.j), (0 + 0.j)]], dtype=torch.complex64)
```
This new print version is more consistent with float tensor's pretty print.
The following example mixes integers and decimals:
```
print(torch.tensor([[1 + 1.340j, 3 + 4j], [1.2 + 1.340j, 6.5 + 7j]], dtype=torch.complex64))
```
This yields:
```
tensor([[ (1.0000 + 1.3400j),
(3.0000 + 4.0000j)],
[ (1.2000 + 1.3400j),
(6.5000 + 7.0000j)]], dtype=torch.complex64)
```
The following example
```
torch.tensor([1,2,3,4.5])
```
yields
```
tensor([1.0000, 2.0000, 3.0000, 4.5000])
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35841
Differential Revision: D20893848
Pulled By: anjali411
fbshipit-source-id: f84c533b8957a1563602439c07e60efbc79691bc