It seems that some legacy default stream logic (e.g., present in a8ff647e42/torch/utils/dlpack.py (L114) ) is not handled on the potential receiving end in `torch/_tensor.py`.
Open to suggestions on how to make the test case less clunky, as this was the combination we arrived at after discovering flakiness in alternate versions.
Thanks to Olga Andreeva for surfacing this issue and providing a repro.
CC @Aidyn-A @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101318
Approved by: https://github.com/ngimel
Fixes#72428 according to decision reached in comments.
I've left other instances of `w.r.t.` in tact (e.g. in parameter/return descriptions, in comments, etc) because there were many, and I didn't' want to go out-of-scope. That being said, I'm happy to change those as well if we'd prefer the consistency!
I've also fixed a typo that I came across while grepping for instances.
Will update with screenshots once docs are built.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100028
Approved by: https://github.com/albanD
# Motivation
The DLPack device type kDLOneAPI stands for the Unified Shared Memory allocated on a oneAPI device. The corresponding Pytorch backend type is XPU.
Support to export/import the Pytorch XPU tensor as a DLPack tensor of kDLOneAPI device.
# Solution
1. Update the DLPack protocol to v0.7.
2. Add the XPU hooks to map the Aten device and DLPack device with the address value and device information.
# Additional Context
Reopen (#82867)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94968
Approved by: https://github.com/kit1980
Currently it falls through to a call to `storage()`, which the IPU doesn't support.
I've made the minimal change here for ease of merging (this'd help us if it was in for 1.13.1), however...
**QUESTION**: Is there any reason why `not torch._C._has_storage(self)` needs to *also* be guarded on `self.device.type == privateuseone`? in other words, could the condition for using `clone` not be this?
```python
self.is_sparse
or self.device.type
in ["lazy", "xla", "mps", "ort", "meta", "hpu", "ipu"]
or not torch._C._has_storage(self)
or (type(self) is not Tensor and self.data_ptr() == 0)
```
If the condition fails, the very next thing is a call to `self._typed_storage()` which will fail, so it feels to me like *any* case without storage shouldn't fall through to the `storage()` call.
The original PR for adding the 'no storage and device is `PrivateUse1`' condition ([86557](https://github.com/pytorch/pytorch/pull/86557)) doesn't discuss whether this could be broadened.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89129
Approved by: https://github.com/albanD
Fixes#81690
TODO:
* [x] C++ Unpickler Fix (locally tested pickled in Python and unpickled in C++)
* [x] C++ Pickler Fix (locally tested pickled in C++ and unpickled in Python)
* [x] Do quant_tensor, sparse_tensor, etc require similar changes? (Sparse and Quant don't need this)
* [x] Add Comments
* [x] How to make sure C++ and Python are in sync? (Functions in `pickler.h` help in getting and setting Tensor Metadata (math-bits for now) on a tensor. They are the only place which should handle this.)
Notes:
Quant Tensor don't support complex dtypes and for float they segfault with `_neg_view` : https://github.com/pytorch/pytorch/issues/88484
Sparse Tensor:
```python
>>> a = torch.tensor([[0, 2.], [3j, 0]]).to_sparse()
>>> a.conj().is_conj()
False
>>> a._neg_view()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NotImplementedError: Cannot access storage of SparseTensorImpl
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88182
Approved by: https://github.com/ezyang, https://github.com/anjali411
This can be critical when processing a large number of tensors
```bash
python -m timeit --setup 'import torch; t = torch.empty(1000, device="cuda")' 't.__dlpack_device__()'
```
based on 1.12.1:
before:
100000 loops, best of 5: 2.32 usec per loop
after:
500000 loops, best of 5: 844 nsec per loop
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86665
Approved by: https://github.com/SunDoge, https://github.com/soulitzer
`Tensor.split` calls `TensorBase.split` whose `handle_torch_function` statement passes `func` as `Tensor.split` which is usually correct, but not here because of the use of `super()`. Instead this calls `torch._VF.split` which correctly differentiates from `torch.split`. This is currently okay since we never hit `TensorBase.split` for types with `__torch_function__` however, once we allow skipping only one hop of `__torch_function__` this will expose the error.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83866
Approved by: https://github.com/albanD
This is a new version of #15648 based on the latest master branch.
Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR.
In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.)
Fixes https://github.com/pytorch/pytorch/issues/71105
@ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797
Approved by: https://github.com/ezyang
## Motivation
The DLPack device type kDLOneAPI stands for the Unified Shared Memory allocated on a oneAPI device. The corresponding Pytorch backend type is XPU.
Support to export/import the Pytorch XPU tensor as a DLPack tensor of kDLOneAPI device.
## Solution
1. Update the DLPack protocol to v0.7.
2. Add the XPU hooks to map the Aten device and DLPack device with the address value and device information.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82867
Approved by: https://github.com/kit1980
## Motivation
The DLPack device type kDLOneAPI stands for the Unified Shared Memory allocated on a oneAPI device. The corresponding Pytorch backend type is XPU.
Support to export/import the Pytorch XPU tensor as a DLPack tensor of kDLOneAPI device.
## Solution
1. Update the DLPack protocol to v0.7.
2. Add the XPU hooks to map the Aten device and DLPack device with the address value and device information.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81021
Approved by: https://github.com/ezyang
### Description
Since the major changes for `_TypedStorage` and `_UntypedStorage` are now complete, they can be renamed to be public.
`TypedStorage._untyped()` is renamed to `TypedStorage.untyped()`.
Documentation for storages is improved as well.
### Issue
Fixes#82436
### Testing
N/A
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82438
Approved by: https://github.com/ezyang
unflatten now has a free function version in torch.flatten in addition to
the method in torch.Tensor.flatten.
Updated docs to reflect this and polished them a little.
For consistency, changed the signature of the int version of unflatten in
native_functions.yaml.
Some override tests were failing because unflatten has unusual
characteristics in terms of the .int and .Dimname versions having
different number of arguments so this required some changes
to test/test_override.py
Removed support for using mix of integer and string arguments
when specifying dimensions in unflatten.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81399
Approved by: https://github.com/Lezcano, https://github.com/ngimel