Commit Graph

87 Commits

Author SHA1 Message Date
FFFrog
a28e6ae38f [OpenReg][2/N] Migrate cpp_extensions_open_device_registration to OpenReg (#156401)
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156401
Approved by: https://github.com/albanD
ghstack dependencies: #156400
2025-06-22 18:40:38 +00:00
FFFrog
1d522325b4 [OpenReg][1/N] Migrate cpp_extensions_open_device_registration to OpenReg (#156400)
As the title stated.

**Changes:**

- add resize_ for OpenReg
- migrate related tests into test_openreg.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156400
Approved by: https://github.com/albanD
2025-06-22 18:40:38 +00:00
FFFrog
e4fd0bf771 [OpenReg][4/N] Migrate cpp_extensions_open_device_registration to OpenReg (#155101)
As the title stated.

**Involved testcases**:
- test_open_device_storage_pin_memory
- test_open_device_serialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155101
Approved by: https://github.com/albanD
ghstack dependencies: #153947, #154018, #154019, #154106, #154181
2025-06-14 03:44:32 +00:00
FFFrog
1e7989cad5 [OpenReg][3/N] Migrate cpp_extensions_open_device_registration to OpenReg (#154181)
As the title stated.

**Involved testcases**:
- test_open_device_quantized
- test_open_device_random
- test_open_device_tensor
- test_open_device_packed_sequence
- test_open_device_storage
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154181
Approved by: https://github.com/albanD
ghstack dependencies: #153947, #154018, #154019, #154106
2025-06-14 03:44:32 +00:00
FFFrog
7e5f29b2de [OpenReg][2/N] Migrate cpp_extensions_open_device_registration to OpenReg (#154106)
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154106
Approved by: https://github.com/nareshrajkumar866, https://github.com/albanD
ghstack dependencies: #153947, #154018, #154019
2025-06-14 03:44:32 +00:00
FFFrog
676abded4b [OpenReg][1/N] Migrate cpp_extensions_open_device_registration to OpenReg (#154019)
As the title stated.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154019
Approved by: https://github.com/albanD
ghstack dependencies: #153947, #154018
2025-06-14 03:44:32 +00:00
Nikita Shulga
bcad962550 [BE][Testing] Delete some unused code (#155760)
- Fix typo in class name `OpenRgistration`->`OpenRegistration`
- Use existing `common` alias of `torch.testing._internal.common_utils`, i.e. `s/torch.testing._internal.common_utils.markDynamoStrictTest/common.markDynamoStrictTest/`
- Remove unused `TEST_CUDA` and `TEST_ROCM` are unused in that file

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155760
Approved by: https://github.com/albanD
2025-06-12 18:41:53 +00:00
PyTorch MergeBot
2a3b41cbd0 Revert "[CI] Use setup-python from for Mac tests (#155698)"
This reverts commit 2b9d638e33.

Reverted https://github.com/pytorch/pytorch/pull/155698 on behalf of https://github.com/malfet due to It causes weird flaky failures in MPS and do not upload usage logs anymore ([comment](https://github.com/pytorch/pytorch/pull/155698#issuecomment-2967120676))
2025-06-12 14:42:32 +00:00
Nikita Shulga
2b9d638e33 [CI] Use setup-python from for Mac tests (#155698)
Instead of `setup-miniconda`
- Remove `CONDA_RUN` macro...
- Hack the search path in `macos-test.sh` to put both python and python3 aliases first in the path (not sure what other action are messing with path environment variable)
- Skip `TestMultiprocessing.test_fs_sharing` as even though it completes, it hangs on the shutdown both in CI and in all local setups I have
- Skip `TestCppExtensionOpenRgistration.test_base_device_registration` as it hangs on the shutdown as well
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155698
Approved by: https://github.com/atalman
ghstack dependencies: #155476, #155493, #155601, #155515, #155697
2025-06-12 04:58:00 +00:00
PyTorch MergeBot
8347268edc Revert "Make open device registration tests standalone (#153855)"
This reverts commit 8823138e47.

Reverted https://github.com/pytorch/pytorch/pull/153855 on behalf of https://github.com/clee2000 due to causing some linux aarch64 tests to fail [GH job link](https://github.com/pytorch/pytorch/actions/runs/15566289293/job/43832373302) [HUD commit link](8823138e47), should be easy fix, rename in places where its mentioned, there might be more than just aarch64 though ([comment](https://github.com/pytorch/pytorch/pull/153855#issuecomment-2960191503))
2025-06-10 18:11:24 +00:00
Joel Schlosser
8823138e47 Make open device registration tests standalone (#153855)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153855
Approved by: https://github.com/janeyx99
2025-06-10 17:33:26 +00:00
Murray Steele
0a092c7de6 Enable CPP Extension Open Registration tests on Arm (#144774)
Enables most tests under CPP Extension Open Registration as they pass on Arm now.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144774
Approved by: https://github.com/aditew01, https://github.com/fadara01, https://github.com/malfet
2025-06-05 22:32:28 +00:00
PyTorch MergeBot
d4ab8e74f3 Revert "Fix the Problems About Defining Static Variable in Inline Function (#147095)"
This reverts commit c6fc11af76.

Reverted https://github.com/pytorch/pytorch/pull/147095 on behalf of https://github.com/izaitsevfb due to still fails to link internally at meta ([comment](https://github.com/pytorch/pytorch/pull/147095#issuecomment-2917221575))
2025-05-28 18:22:39 +00:00
FFFrog
c6fc11af76 Fix the Problems About Defining Static Variable in Inline Function (#147095)
Refer to https://github.com/pytorch/pytorch/issues/125465 for more informations

- Remove unused header files
- Move the inline function that defines the static variable to .cc

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147095
Approved by: https://github.com/cyyever, https://github.com/albanD
2025-05-28 02:47:16 +00:00
FFFrog
c181403063 [Openreg][PrivateUse1] Improve openreg module capabilities (#151000)
----

- Add more functionalities for openreg in openreg module
- Remove related functionalities from test_cpp_extensions_open_device_registration.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151000
Approved by: https://github.com/albanD
2025-04-12 17:21:35 +00:00
PyTorch MergeBot
4926bd6004 Revert "Fix the Problems About Defining Static Variable in Inline Function (#147095)"
This reverts commit 3da14d38bd.

Reverted https://github.com/pytorch/pytorch/pull/147095 on behalf of https://github.com/atalman due to breaks internally ([comment](https://github.com/pytorch/pytorch/pull/147095#issuecomment-2787129770))
2025-04-08 17:10:36 +00:00
FFFrog
3da14d38bd Fix the Problems About Defining Static Variable in Inline Function (#147095)
Refer to https://github.com/pytorch/pytorch/issues/125465 for more informations

- Remove unused header files
- Move the inline function that defines the static variable to .cc

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147095
Approved by: https://github.com/cyyever, https://github.com/albanD
2025-04-08 10:23:02 +00:00
wizzniu
c07dc64017 Update pin memory related APIs to not pass 'device' argument (#131858)
Based on https://github.com/pytorch/pytorch/pull/126376, this PR tries to update all PT callers (e.g., `Tensor.is_pinned()`, `Tensor.pin_memory()`) to not pass `device` argument.
As for `storage/untyped_storage.is_pinned()/pin_memory()`, we keep the `device` argument but passing `device` is discouraged. And if not given, the default `device` is still 'cuda' for BC.
Additionally, based on device-agnostic pin_memory, `pin_memory_device` argument of `torch.utils.data.DataLoader` is discouraged  now. For BC, explictly passing this argument is still effective. If not given, the default `device` will be the current accelerator.

Fixes #124908
Relates https://github.com/pytorch/pytorch/pull/126376

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131858
Approved by: https://github.com/albanD

Co-authored-by: albanD <desmaison.alban@gmail.com>
2025-01-15 17:23:35 +00:00
Zhenbin Lin
cbb1ed2966 [1/N] OpenReg: Replace open_registration_extension.cpp with openreg (#141815)
As described in OpenReg [next-steps](https://github.com/pytorch/pytorch/blob/main/test/cpp_extensions/open_registration_extension/README.md#next-steps), here we replace the current `open_registration_extension.cpp` test in PyTorch CI with openreg.

The current `open_registration_extension.cpp` contains two parts:
1. Implentations to support `PrivateUse1` backend.
2. Helper functions used for UTs in `test_cpp_extensions_open_device_registration.py` and `test_transformers.py`.

For the first part, we'll replace it with openreg. For the second part, we'll migrate them to ut files step by step.

@albanD

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141815
Approved by: https://github.com/albanD
2025-01-14 15:59:00 +00:00
Tom Ritchford
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
FFFrog
f47aac6bc2 Make Context to be Device-agnostic Step by Step (3/N) (#137578)
Detailed Descriptions:
- Using unified Device-agnostic API to create new generator for accelerator.
- Add deprecated info for GeneratorForPrivateuseone

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137578
Approved by: https://github.com/cyyever, https://github.com/ezyang
2024-12-18 15:12:19 +00:00
Mikayla Gawarecki
b7b56576d8 Allow user to manually pass module.name associated with global in {add}_safe_global (#142153)
Fixes #142144

A global x is saved in checkpoint as `GLOBAL x.__module__ x.__name__`. So , after allowlisting a GLOBAL it is expected to match any GLOBAL instruction of the form `GLOBAL x.__module__ x.__name__`  but there are edge cases when for the same API from the same module, what `__module__` gives changes between versions which prevents users from allowlisting the global.

In this case, in numpy < 2.1

```
torch.save("bla", np_array)
# checkpoint has GLOBAL "np.core.multiarray" "_reconstruct"
```
In np version 2.1

```
with safe_globals([np.core.multiarray._reconstruct]):
    torch.load("bla")
```
np.core.multiarray._reconstruct.__module__ gives "np._core.multiarray" (note the extra _ before core) and see what was done [here](https://github.com/numpy/numpy/blob/main/numpy/core/multiarray.py)

Since the dictionary to access safe globals is keyed on "{foo.__module__}.{foo.__name__}", __module__, __name__ will no longer match that in the checkpoint so "np.core.multiarray._reconstruct" can no longer be properly allowlisted (instead np._core.multiarray._reconstruct is a key in the dict).

We allow `add_safe_globals/safe_globals` to optionally take tuples of (global, str of module.name) to workaround such (odd/edge case) situations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142153
Approved by: https://github.com/albanD
2024-12-06 18:56:39 +00:00
William Wen
c946d82077 [ci, 3.13] disable another failing cpp_extension test in 3.13 (#141673)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141673
Approved by: https://github.com/StrongerXi, https://github.com/atalman
ghstack dependencies: #141409, #142003, #141572, #141577, #141605, #141621, #141623
2024-12-05 00:24:42 +00:00
William Wen
cd56cd30f2 [ci, 3.13] disable failing cpp_extension test due to weights_only error in numpy 2.1 (#141623)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141623
Approved by: https://github.com/mikaylagawarecki, https://github.com/atalman
ghstack dependencies: #141409, #142003, #141572, #141577, #141605, #141621
2024-12-05 00:24:35 +00:00
hipudding
34743d8a16 Support dlpack for privateuse1 (#135331)
Fixes #129652
Pull Request resolved: https://github.com/pytorch/pytorch/pull/135331
Approved by: https://github.com/shink, https://github.com/FFFrog, https://github.com/ezyang

Co-authored-by: Jiawei Li <ljw1101.vip@gmail.com>
2024-11-13 13:13:14 +00:00
zeshengzong
cb71bcc542 Replace clone.detach with detach.clone (#140264)
Fixes #64532

As state in issue, replace `clone.detach` by `detach.clone`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140264
Approved by: https://github.com/soulitzer
2024-11-13 07:01:02 +00:00
Mikayla Gawarecki
70288c3c2d Remove dependency on numpy for serialization for XLA/open registration devices without numpy (#137444)
Related: https://github.com/pytorch/xla/issues/7799#issuecomment-2375818263

Follow ups: Do the same for maia and mtia

## Motivation

With the move to `weights_only` by default, we are making an explicit decision not to allowlist GLOBALs required to deserialize `numpy` tensors  by default. The implication is that backends relying on numpy for serialization will fail loudly when `torch.load` flips `weights_only`.

However, we make the observation that this dependency on numpy was legacy and is not actually needed anymore. So we can remove it, which aligns with our weights_only strategy.

## Why is this ok?

The following comment on why numpy is necessary for serialization is legacy

c87c9f0a01/torch/_tensor.py (L303-L312)

We no longer do the following, though it was the case 5 years ago in the PR that added this
> CPU storage is reconstructed with randomly initialized data, moved onto backend device, and then storage is updated to the serialized content

**Instead what now happens is that CPU storage is constructed with data from the file **and then** moved onto backend device.**

Old behavior (`legacy_load`): 67adda891a/torch/serialization.py (L620)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137444
Approved by: https://github.com/albanD
2024-10-09 19:35:55 +00:00
Simon Fan
b86269fab5 Unify cpp_extension build directory removal (#136059)
Keeps existing default directory clearing logic, even though it fails when TORCH_EXTENSIONS_DIR is set. To properly clear, we'd need to track all the folders we compiled the extensions to.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136059
Approved by: https://github.com/ezyang, https://github.com/albanD
2024-10-03 06:22:11 +00:00
Mikayla Gawarecki
a096f2899d Add torch.serialization.skip_data context manager (#134504)
## Semantic

The semantic is
(1) By default `torch.serialization.skip_data(materialize_fake_tensors=False)` will make `torch.save` skip writing storages (but reserve space for them in the checkpoint).

```python
import torch
import torch.nn as nn

sd = nn.Linear(3, 5).state_dict()
with torch.serialization.skip_data():
    torch.save(sd, 'foo.pt')
print(torch.load('foo.pt', weights_only=True))
```

(2)  With `torch.serialization.skip_data(materialize_fake_tensors=True)`If FakeTensor is passed to `torch.save` the pickler will treat these FakeTensors as being "materialized" space will be reserved in the checkpoint for the associated storage bytes, and when loading the type will be Tensor instead of FakeTensor)

```python
import torch
import torch.nn as nn
from torch._subclasses.fake_tensor import FakeTensorMode

with FakeTensorMode():
    m = nn.Linear(3, 5, dtype=torch.float16, device='cuda')

sd = m.state_dict()
with torch.serialization.skip_data(materialize_fake_tensors=True):
    torch.save(sd, 'bla.pt')
print(torch.load('bla.pt', weights_only=True))
# OrderedDict([('weight', tensor([[0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.]], device='cuda:0', dtype=torch.float16)), ('bias', tensor([0., 0., 0., 0., 0.], device='cuda:0', dtype=torch.float16))])

```

## Follow Ups

- [ ] `torch.load` semantic for skip_data context manager
- [ ] Mechanism for getting offsets of storages saved via this method (for writing in a separate pass)

Differential Revision: [D62238610](https://our.internmc.facebook.com/intern/diff/D62238610)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134504
Approved by: https://github.com/albanD
2024-09-05 16:53:39 +00:00
PyTorch MergeBot
2fd36086bc Revert "Add torch.serialization.skip_data context manager (#134504)"
This reverts commit 94db935749.

Reverted https://github.com/pytorch/pytorch/pull/134504 on behalf of https://github.com/kit1980 due to See D62082697 ([comment](https://github.com/pytorch/pytorch/pull/134504#issuecomment-2327542276))
2024-09-03 22:21:27 +00:00
Mikayla Gawarecki
94db935749 Add torch.serialization.skip_data context manager (#134504)
## Semantic

The semantic is
(1) By default `torch.serialization.skip_data(materialize_fake_tensors=False)` will make `torch.save` skip writing storages (but reserve space for them in the checkpoint).

```python
import torch
import torch.nn as nn

sd = nn.Linear(3, 5).state_dict()
with torch.serialization.skip_data():
    torch.save(sd, 'foo.pt')
print(torch.load('foo.pt', weights_only=True))
```

(2)  With `torch.serialization.skip_data(materialize_fake_tensors=True)`If FakeTensor is passed to `torch.save` the pickler will treat these FakeTensors as being "materialized" space will be reserved in the checkpoint for the associated storage bytes, and when loading the type will be Tensor instead of FakeTensor)

```python
import torch
import torch.nn as nn
from torch._subclasses.fake_tensor import FakeTensorMode

with FakeTensorMode():
    m = nn.Linear(3, 5, dtype=torch.float16, device='cuda')

sd = m.state_dict()
with torch.serialization.skip_data(materialize_fake_tensors=True):
    torch.save(sd, 'bla.pt')
print(torch.load('bla.pt', weights_only=True))
# OrderedDict([('weight', tensor([[0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.]], device='cuda:0', dtype=torch.float16)), ('bias', tensor([0., 0., 0., 0., 0.], device='cuda:0', dtype=torch.float16))])

```

## Follow Ups

- [ ] `torch.load` semantic for skip_data context manager
- [ ] Mechanism for getting offsets of storages saved via this method (for writing in a separate pass)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134504
Approved by: https://github.com/albanD
2024-08-29 04:52:52 +00:00
PyTorch MergeBot
1285443994 Revert "Add torch.serialization.skip_data context manager (#134504)"
This reverts commit 202600bc23.

Reverted https://github.com/pytorch/pytorch/pull/134504 on behalf of https://github.com/mikaylagawarecki due to This is breaking Windows docs tests due to NamedTemporaryFile on Windows not working well ([comment](https://github.com/pytorch/pytorch/pull/134504#issuecomment-2316543901))
2024-08-29 01:30:49 +00:00
Mikayla Gawarecki
202600bc23 Add torch.serialization.skip_data context manager (#134504)
## Semantic

The semantic is
(1) By default `torch.serialization.skip_data(materialize_fake_tensors=False)` will make `torch.save` skip writing storages (but reserve space for them in the checkpoint).

```python
import torch
import torch.nn as nn

sd = nn.Linear(3, 5).state_dict()
with torch.serialization.skip_data():
    torch.save(sd, 'foo.pt')
print(torch.load('foo.pt', weights_only=True))
```

(2)  With `torch.serialization.skip_data(materialize_fake_tensors=True)`If FakeTensor is passed to `torch.save` the pickler will treat these FakeTensors as being "materialized" space will be reserved in the checkpoint for the associated storage bytes, and when loading the type will be Tensor instead of FakeTensor)

```python
import torch
import torch.nn as nn
from torch._subclasses.fake_tensor import FakeTensorMode

with FakeTensorMode():
    m = nn.Linear(3, 5, dtype=torch.float16, device='cuda')

sd = m.state_dict()
with torch.serialization.skip_data(materialize_fake_tensors=True):
    torch.save(sd, 'bla.pt')
print(torch.load('bla.pt', weights_only=True))
# OrderedDict([('weight', tensor([[0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.],
#        [0., 0., 0.]], device='cuda:0', dtype=torch.float16)), ('bias', tensor([0., 0., 0., 0., 0.], device='cuda:0', dtype=torch.float16))])

```

## Follow Ups

- [ ] `torch.load` semantic for skip_data context manager
- [ ] Mechanism for getting offsets of storages saved via this method (for writing in a separate pass)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134504
Approved by: https://github.com/albanD
2024-08-28 23:53:17 +00:00
Mikayla Gawarecki
d9576c9440 Fix failures when default is flipped for weights_only (#127627)
Tests on XLA shard not fixed yet but there is an issue here https://github.com/pytorch/xla/issues/7799

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127627
Approved by: https://github.com/albanD
ghstack dependencies: #132349
2024-08-16 00:22:43 +00:00
Yan Zhiwei
2a02b5cd22 [Intel GPU] Dispatch Stub support (#130019)
# Motivation
Structured codegen is beneficial for easier decoupling tensor meta setting and kernel implementation. At present, XPU operators need to handle tensor metas in hand-written way.

We plan to leverage the codegen system for auto generate structured operators. This PR facilitate the `DispatchStub` support for  Intel GPUs. Based on that, XPU operators would have possibility to register kernel functor to operator stubs.

This is a prerequisite of PR #130082, where we will modify the codegen system to generate XPU needed source files and headers.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130019
Approved by: https://github.com/EikanWang, https://github.com/gujinghui, https://github.com/albanD
2024-07-29 02:18:52 +00:00
wizzniu
8963623494 Re-implement pin_memory to be device-agnostic by leveraging the Accelerator concept (#126376)
This PR re-implements pin memory aiming to get rid of the optional `device` argument and makes all related APIs to be device-agnostic. We add two new abstract APIs in [AcceleratorHooksInterface](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/detail/AcceleratorHooksInterface.h#L12) and redefine pin memory as: "Pin memory is always pinned for the current accelerator device". In detail, it uses [getAcceleratorHooksInterface](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/Context.h#L61) in pin_memory/is_pinned to get an appropriate device and invoke the corresponding overridden interfaces, instead of using BackendSelect and then dispatching to CUDA or other specific backends' implement methods.

Note: For new backends who want to implement and use pin memory, just inherit AcceleratorHooksInterface and overwrite the `isPinnedPtr` and `getPinnedMemoryAllocator` methods.

Additional context: To avoid BC-breaking, this PR just preserves the `device` arg of related APIs and would throw a deprecation warning if `device` arg is passed. Another PR will be submitted to update all PT callers (`Tensor.is_pinned()`, `Tensor.pin_memory()`...) not to pass this arg based on this PR. In future, `device` arg will be actually removed.

Relates #124908
Relates #14560
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126376
Approved by: https://github.com/albanD
2024-07-23 01:44:15 +00:00
Shan19900305
d57af32e63 Fix undefined tensor error in _copy_from_and_resize when fallback to cpu. (#130237)
1) Add skip undefined tensor in cpu fallback when call _copy_from_and_resize;
2) Modify to_cpu function support optional tensor;
3) Add copy back to origin optional tensor when alias_info isWrite is true.

@ezyang @bdhirsh

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130237
Approved by: https://github.com/ezyang
2024-07-20 23:12:17 +00:00
PyTorch MergeBot
726b9268d2 Revert "Re-implement pin_memory to be device-agnostic by leveraging the Accelerator concept (#126376)"
This reverts commit c986aeea2d.

Reverted https://github.com/pytorch/pytorch/pull/126376 on behalf of https://github.com/atalman due to Failing internal builds ([comment](https://github.com/pytorch/pytorch/pull/126376#issuecomment-2237496633))
2024-07-18 20:25:20 +00:00
wizzniu
c986aeea2d Re-implement pin_memory to be device-agnostic by leveraging the Accelerator concept (#126376)
This PR re-implements pin memory aiming to get rid of the optional `device` argument and makes all related APIs to be device-agnostic. We add two new abstract APIs in [AcceleratorHooksInterface](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/detail/AcceleratorHooksInterface.h#L12) and redefine pin memory as: "Pin memory is always pinned for the current accelerator device". In detail, it uses [getAcceleratorHooksInterface](https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/Context.h#L61) in pin_memory/is_pinned to get an appropriate device and invoke the corresponding overridden interfaces, instead of using BackendSelect and then dispatching to CUDA or other specific backends' implement methods.

Note: For new backends who want to implement and use pin memory, just inherit AcceleratorHooksInterface and overwrite the `isPinnedPtr` and `getPinnedMemoryAllocator` methods.

Additional context: To avoid BC-breaking, this PR just preserves the `device` arg of related APIs and would throw a deprecation warning if `device` arg is passed. Another PR will be submitted to update all PT callers (`Tensor.is_pinned()`, `Tensor.pin_memory()`...) not to pass this arg based on this PR. In future, `device` arg will be actually removed.

Relates #124908
Relates #14560
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126376
Approved by: https://github.com/albanD
2024-07-18 11:54:14 +00:00
Xuehai Pan
ba48cf6535 [BE][Easy][6/19] enforce style for empty lines in import segments in test/ (#129757)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129757
Approved by: https://github.com/ezyang
2024-07-17 06:42:37 +00:00
Mikayla Gawarecki
87f79af24d Fix map_location for wrapper subclass and device tensors that go through numpy (#126728)
Fixes https://github.com/pytorch/pytorch/issues/124418

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126728
Approved by: https://github.com/albanD
2024-05-24 16:39:30 +00:00
FFFrog
5dee46266a Fix & optimize open device registration test. (#125572)
1. Fix the wrong tests about lazy init for PrivateUse1 named foo
2. Refactor the tests and make it more flexible
3. Disable the two tests temporarily
     - test_open_device_faketensor
     - test_open_device_scalar_type_fallback
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125572
Approved by: https://github.com/albanD
2024-05-07 08:30:01 +00:00
PyTorch MergeBot
ea347fa6ce Revert "Fix & optimze open device registration test. (#124712)"
This reverts commit f03cf9d4dc.

Reverted https://github.com/pytorch/pytorch/pull/124712 on behalf of https://github.com/kit1980 due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/124712#issuecomment-2086971499))
2024-04-30 20:00:37 +00:00
FFFrog
f03cf9d4dc Fix & optimze open device registration test. (#124712)
Fixes #100152

1. Fix the wrong tests about lazy init for PrivateUse1 named foo
2. Fix wrong backend meta registry mechanism when compiling with clang++( compiling with g++ work well)(introduced by static variable in inline function)
3. Refactor the tests and make it more flexible
4. Disable the two tests temporarily
    - test_open_device_storage_pin_memory
    - test_compile_autograd_function_aliasing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124712
Approved by: https://github.com/albanD, https://github.com/malfet
2024-04-29 18:55:38 +00:00
Shan19900305
8d12ba9acf add methods for open device in PackedSequence module. (#124923)
1) add is_{custom_device_name}() and {custom_device_name}() for open device register;
2) fix open device failed testcases.

@ezyang  @bdhirsh
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124923
Approved by: https://github.com/ezyang
2024-04-26 15:26:20 +00:00
Yuanhao Ji
b3504af56e Enable UFMT on test/scripts and some files (#124137)
Part of: #123062

Ran lintrunner on:

- `test/scripts`
- `test/simulate_nccl_errors.py`
- `test/test_ao_sparsity.py`
- `test/test_autocast.py`
- `test/test_binary_ufuncs.py`
- `test/test_bundled_images.py`
- `test/test_bundled_inputs.py`
- `test/test_comparison_utils.py`
- `test/test_compile_benchmark_util.py`
- `test/test_complex.py`
- `test/test_cpp_api_parity.py`
- `test/test_cpp_extensions_aot.py`
- `test/test_cpp_extensions_jit.py`
- `test/test_cpp_extensions_open_device_registration.py`

Detail:

```bash
$ lintrunner -a --take UFMT --all-files
ok No lint issues.
Successfully applied all patches.
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124137
Approved by: https://github.com/soulitzer
2024-04-19 22:01:27 +00:00
chentianyi16
83ad8e01b1 fix the problem that cpu_fallback for aten::triu_indices on custom device crashed (#121306)
Fixes #121289

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121306
Approved by: https://github.com/ezyang
2024-03-26 01:29:45 +00:00
Shan19900305
6662627c89 Add APIs for custom device using TensorIteratorBase. (#120792)
1) add operand and get_dim_names API;
2) set will_resize to true when output tensor is undefined;
3) add abs_stub for dummy device and calculate on cpu device;
4) support dummy device copy with stride;
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120792
Approved by: https://github.com/ezyang
2024-03-20 03:51:09 +00:00
Chen_Liqing
291ce86a6c Modify StorageImplCreateHelper (#118459)
I want to use tensor.untyped_storage()[a:b] for ``PrivateUse1`` backend but fail. The code will go into ``THPStorage_get``:
bb6eba189f/torch/csrc/Storage.cpp (L525-L540)

Here ``torch`` will create a new ``c10::StorageImpl`` but not consider about ``PrivateUse1`` backend.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118459
Approved by: https://github.com/albanD
2024-03-07 06:26:55 +00:00
Shan19900305
6c3600d008 Enable optional tensorList fallback to cpu. (#119273)
add optional tensorList fallback to cpu.
Add testcases and old pr is: https://github.com/pytorch/pytorch/pull/106449

@bdhirsh
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119273
Approved by: https://github.com/bdhirsh
2024-02-07 03:54:13 +00:00