pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Pearu Peterson	c177abd217	Disable pinning check when loading sparse tensors (#154638 ) Disables pinning check as unnecessary and to fix https://github.com/pytorch/pytorch/issues/153143 when loading sparse tensor from external storage with sparse tensor invariants check enabled. Fixes https://github.com/pytorch/pytorch/issues/153143 . For FC, to be landed two weeks after https://github.com/pytorch/pytorch/pull/154617, see https://github.com/pytorch/pytorch/pull/154617#issuecomment-2919643612. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154638 Approved by: https://github.com/amjames, https://github.com/ngimel	2025-06-18 14:33:36 +00:00
Pearu Peterson	3f3c1f419f	User-controlled sparse tensor validation when loading data from external storage (#154610 ) This PR lets users to control sparse tensor invariants validation (that can be expensive, especially, for sparse tensors with many indices) when loading data from external sources. By default, the validation of sparse tensor invariants is disabled. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154610 Approved by: https://github.com/amjames, https://github.com/ngimel	2025-06-02 10:17:07 +00:00
Pearu Peterson	6a781619bf	Temporarily disable sparse tensor validation when loading from external storage. (#154758 ) As in the title per https://github.com/pytorch/pytorch/issues/153143#issuecomment-2917793067 . The plan is to workout a solution that will allow (1) disabling pinned memory check to fix the original issue and (2) switching off the sparse tensor validation for maximal performance in loading sparse tensors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154758 Approved by: https://github.com/amjames, https://github.com/ngimel	2025-05-31 19:45:44 +00:00
Yuanhao Ji	0a7eef140b	Add `torch.Tensor._make_wrapper_subclass` to `torch/_C/__init__.pyi` (#154022 ) Fixes #153790 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154022 Approved by: https://github.com/Skylion007	2025-05-27 14:10:00 +00:00
Mikayla Gawarecki	bdcc1b579b	Allow torch.load under FakeTensorMode to load FakeTensors with correct devices (for plain Tensors) (#147786 ) This only fixes _rebuild_tensor_v2 and _rebuild_tensor_v3 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147786 Approved by: https://github.com/albanD	2025-03-06 12:04:32 +00:00
Xuehai Pan	c73a92fbf5	[BE][CI] bump `ruff` to 0.9.2: multiline `assert` statements (#144546 ) Reference: https://docs.astral.sh/ruff/formatter/black/#assert-statements > Unlike Black, Ruff prefers breaking the message over breaking the assertion, similar to how both Ruff and Black prefer breaking the assignment value over breaking the assignment target: > > ```python > # Input > assert ( > len(policy_types) >= priority + num_duplicates > ), f"This tests needs at least {priority+num_duplicates} many types." > > > # Black > assert ( > len(policy_types) >= priority + num_duplicates > ), f"This tests needs at least {priority+num_duplicates} many types." > > # Ruff > assert len(policy_types) >= priority + num_duplicates, ( > f"This tests needs at least {priority + num_duplicates} many types." > ) > ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/144546 Approved by: https://github.com/malfet	2025-02-27 20:46:16 +00:00
Aaron Gokaslan	7f65a20884	[BE]: Enable ruff SLOT checks (#146276 ) This enables a check that which a class which only inherits from immutable classes like str, tuple, and NamedTuple, also defined `__slots__` so they don't allocate memory unnecessarily. This also ensure contributors think about how they define their classes with subclass NamedTuples and str, of which we have many in our codebase Pull Request resolved: https://github.com/pytorch/pytorch/pull/146276 Approved by: https://github.com/aorenste	2025-02-04 19:18:23 +00:00
Marc Horowitz	f2ad2cdf1c	[utils] add try_import method for importing optional modules (#145528 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145528 Approved by: https://github.com/albanD	2025-01-25 00:14:07 +00:00
Aaron Orenstein	f2cfe8b59f	PEP585 update - mostly toplevels (#145178 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145178 Approved by: https://github.com/bobrenjc93	2025-01-22 02:21:14 +00:00
Eric Hanson	33dee721ae	Reraise worker errors as runtime errors in more cases when the original exception can't be constructed (#140911 ) related to https://github.com/pytorch/pytorch/issues/34130 when pytorch attempts to re-raise an exception from a worker process (e.g. multiprocessing dataloader), if it can't reconstruct the original exception message due to a type error, it instead raises it as a runtime error. However, if it can't reconstruct the exception for some other reason, it throws an error with a stacktrace pointing to the `ExceptionWrapper` code rather than the original underlying issue. One case in which I run into this is with boto3's [HTTPClientError](`66dc1f8d52/botocore/exceptions.py (L94)`)s. They must be constructed with a keyword argument `error`, but if `error` isn't passed, a `KeyError` is thrown instead of a `TypeError`, due to the particular way it is implemented: * [HTTPClientError](`66dc1f8d52/botocore/exceptions.py (L94)`)'s constructor excepts variable keyword arguments it passes to `super` (BotoCoreError) * [it also defines a field `fmt` with `error`](`66dc1f8d52/botocore/exceptions.py (L95)`) * BotoCoreError [expects to be able to format that string with the kwargs](`66dc1f8d52/botocore/exceptions.py (L41)`) So in this case, if a HTTPClientError occurs on a worker process, you simply get a `KeyError: error` with a stacktrace pointing to [this line](`3e2f276a14/torch/_utils.py (L710)`) which is unhelpful. Instead, I propose to reraise the error as a `RuntimeError` unconditionally. Pull Request resolved: https://github.com/pytorch/pytorch/pull/140911 Approved by: https://github.com/vmoens	2024-12-14 03:11:36 +00:00
Tom Ritchford	dc23f1944a	Remove unused Python variables in torch/[_-a]* (#133492 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492 Approved by: https://github.com/albanD	2024-12-12 17:39:14 +00:00
PyTorch MergeBot	5c97ac9721	Revert "Remove unused Python variables in torch/[_-a]* (#133492 )" This reverts commit `fda975a7b3`. Reverted https://github.com/pytorch/pytorch/pull/133492 on behalf of https://github.com/clee2000 due to Sorry, I need to revert this in order to revert something else. The only thing you need to do is rebase and remerge ([comment](https://github.com/pytorch/pytorch/pull/133492#issuecomment-2536635516))	2024-12-11 17:29:12 +00:00
Tom Ritchford	fda975a7b3	Remove unused Python variables in torch/[_-a]* (#133492 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/133492 Approved by: https://github.com/albanD	2024-12-10 21:48:44 +00:00
Nikita Shulga	c30dd35877	[Device] Add "mps" to `torch._utils._get_device_attr` (#142447 ) Follow up after https://github.com/pytorch/pytorch/pull/141098 Pull Request resolved: https://github.com/pytorch/pytorch/pull/142447 Approved by: https://github.com/kit1980	2024-12-10 20:57:17 +00:00
Xuehai Pan	e1196dfe51	Deprecate `torch._utils.is_compiling()` (#127690 ) This PR is split from PR #126898. - #126898 ------ Pull Request resolved: https://github.com/pytorch/pytorch/pull/127690 Approved by: https://github.com/Skylion007, https://github.com/malfet	2024-12-08 22:55:36 +00:00
Yuanhao Ji	2ee2dcb736	[Device] Add mps as device type in `torch._utils._get_available_device_type()` (#141098 ) As the title states Pull Request resolved: https://github.com/pytorch/pytorch/pull/141098 Approved by: https://github.com/malfet	2024-11-20 20:45:59 +00:00
PyTorch MergeBot	1d28b8b6d5	Revert "Deprecate `torch._utils.is_compiling()` and `torch._dynamo.external_utils.is_compiling()` (#127690 )" This reverts commit `e84d1121ad`. Reverted https://github.com/pytorch/pytorch/pull/127690 on behalf of https://github.com/ZainRizvi due to Sorry but this is breaking internally. More details in D65483292 ([comment](https://github.com/pytorch/pytorch/pull/127690#issuecomment-2458381056))	2024-11-05 23:10:38 +00:00
Xuehai Pan	e84d1121ad	Deprecate `torch._utils.is_compiling()` and `torch._dynamo.external_utils.is_compiling()` (#127690 ) This PR is split from PR #126898. - #126898 ------ Pull Request resolved: https://github.com/pytorch/pytorch/pull/127690 Approved by: https://github.com/Skylion007, https://github.com/malfet	2024-11-05 10:44:56 +00:00
Mikayla Gawarecki	37149d032c	Fix .to(cpu) for Storage (#138011 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/138011 Approved by: https://github.com/albanD	2024-10-23 01:31:48 +00:00
Mikayla Gawarecki	70288c3c2d	Remove dependency on numpy for serialization for XLA/open registration devices without numpy (#137444 ) Related: https://github.com/pytorch/xla/issues/7799#issuecomment-2375818263 Follow ups: Do the same for maia and mtia ## Motivation With the move to `weights_only` by default, we are making an explicit decision not to allowlist GLOBALs required to deserialize `numpy` tensors by default. The implication is that backends relying on numpy for serialization will fail loudly when `torch.load` flips `weights_only`. However, we make the observation that this dependency on numpy was legacy and is not actually needed anymore. So we can remove it, which aligns with our weights_only strategy. ## Why is this ok? The following comment on why numpy is necessary for serialization is legacy `c87c9f0a01/torch/_tensor.py (L303-L312)` We no longer do the following, though it was the case 5 years ago in the PR that added this > CPU storage is reconstructed with randomly initialized data, moved onto backend device, and then storage is updated to the serialized content Instead what now happens is that CPU storage is constructed with data from the file and then moved onto backend device. Old behavior (`legacy_load`): `67adda891a/torch/serialization.py (L620)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/137444 Approved by: https://github.com/albanD	2024-10-09 19:35:55 +00:00
Mikayla Gawarecki	a096f2899d	Add torch.serialization.skip_data context manager (#134504 ) ## Semantic The semantic is (1) By default `torch.serialization.skip_data(materialize_fake_tensors=False)` will make `torch.save` skip writing storages (but reserve space for them in the checkpoint). ```python import torch import torch.nn as nn sd = nn.Linear(3, 5).state_dict() with torch.serialization.skip_data(): torch.save(sd, 'foo.pt') print(torch.load('foo.pt', weights_only=True)) ``` (2) With `torch.serialization.skip_data(materialize_fake_tensors=True)`If FakeTensor is passed to `torch.save` the pickler will treat these FakeTensors as being "materialized" space will be reserved in the checkpoint for the associated storage bytes, and when loading the type will be Tensor instead of FakeTensor) ```python import torch import torch.nn as nn from torch._subclasses.fake_tensor import FakeTensorMode with FakeTensorMode(): m = nn.Linear(3, 5, dtype=torch.float16, device='cuda') sd = m.state_dict() with torch.serialization.skip_data(materialize_fake_tensors=True): torch.save(sd, 'bla.pt') print(torch.load('bla.pt', weights_only=True)) # OrderedDict([('weight', tensor([[0., 0., 0.], # [0., 0., 0.], # [0., 0., 0.], # [0., 0., 0.], # [0., 0., 0.]], device='cuda:0', dtype=torch.float16)), ('bias', tensor([0., 0., 0., 0., 0.], device='cuda:0', dtype=torch.float16))]) ``` ## Follow Ups - [ ] `torch.load` semantic for skip_data context manager - [ ] Mechanism for getting offsets of storages saved via this method (for writing in a separate pass) Differential Revision: [D62238610](https://our.internmc.facebook.com/intern/diff/D62238610) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134504 Approved by: https://github.com/albanD	2024-09-05 16:53:39 +00:00
PyTorch MergeBot	2fd36086bc	Revert "Add torch.serialization.skip_data context manager (#134504 )" This reverts commit `94db935749`. Reverted https://github.com/pytorch/pytorch/pull/134504 on behalf of https://github.com/kit1980 due to See D62082697 ([comment](https://github.com/pytorch/pytorch/pull/134504#issuecomment-2327542276))	2024-09-03 22:21:27 +00:00
Mikayla Gawarecki	94db935749	Add torch.serialization.skip_data context manager (#134504 ) ## Semantic The semantic is (1) By default `torch.serialization.skip_data(materialize_fake_tensors=False)` will make `torch.save` skip writing storages (but reserve space for them in the checkpoint). ```python import torch import torch.nn as nn sd = nn.Linear(3, 5).state_dict() with torch.serialization.skip_data(): torch.save(sd, 'foo.pt') print(torch.load('foo.pt', weights_only=True)) ``` (2) With `torch.serialization.skip_data(materialize_fake_tensors=True)`If FakeTensor is passed to `torch.save` the pickler will treat these FakeTensors as being "materialized" space will be reserved in the checkpoint for the associated storage bytes, and when loading the type will be Tensor instead of FakeTensor) ```python import torch import torch.nn as nn from torch._subclasses.fake_tensor import FakeTensorMode with FakeTensorMode(): m = nn.Linear(3, 5, dtype=torch.float16, device='cuda') sd = m.state_dict() with torch.serialization.skip_data(materialize_fake_tensors=True): torch.save(sd, 'bla.pt') print(torch.load('bla.pt', weights_only=True)) # OrderedDict([('weight', tensor([[0., 0., 0.], # [0., 0., 0.], # [0., 0., 0.], # [0., 0., 0.], # [0., 0., 0.]], device='cuda:0', dtype=torch.float16)), ('bias', tensor([0., 0., 0., 0., 0.], device='cuda:0', dtype=torch.float16))]) ``` ## Follow Ups - [ ] `torch.load` semantic for skip_data context manager - [ ] Mechanism for getting offsets of storages saved via this method (for writing in a separate pass) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134504 Approved by: https://github.com/albanD	2024-08-29 04:52:52 +00:00
PyTorch MergeBot	1285443994	Revert "Add torch.serialization.skip_data context manager (#134504 )" This reverts commit `202600bc23`. Reverted https://github.com/pytorch/pytorch/pull/134504 on behalf of https://github.com/mikaylagawarecki due to This is breaking Windows docs tests due to NamedTemporaryFile on Windows not working well ([comment](https://github.com/pytorch/pytorch/pull/134504#issuecomment-2316543901))	2024-08-29 01:30:49 +00:00
Mikayla Gawarecki	202600bc23	Add torch.serialization.skip_data context manager (#134504 ) ## Semantic The semantic is (1) By default `torch.serialization.skip_data(materialize_fake_tensors=False)` will make `torch.save` skip writing storages (but reserve space for them in the checkpoint). ```python import torch import torch.nn as nn sd = nn.Linear(3, 5).state_dict() with torch.serialization.skip_data(): torch.save(sd, 'foo.pt') print(torch.load('foo.pt', weights_only=True)) ``` (2) With `torch.serialization.skip_data(materialize_fake_tensors=True)`If FakeTensor is passed to `torch.save` the pickler will treat these FakeTensors as being "materialized" space will be reserved in the checkpoint for the associated storage bytes, and when loading the type will be Tensor instead of FakeTensor) ```python import torch import torch.nn as nn from torch._subclasses.fake_tensor import FakeTensorMode with FakeTensorMode(): m = nn.Linear(3, 5, dtype=torch.float16, device='cuda') sd = m.state_dict() with torch.serialization.skip_data(materialize_fake_tensors=True): torch.save(sd, 'bla.pt') print(torch.load('bla.pt', weights_only=True)) # OrderedDict([('weight', tensor([[0., 0., 0.], # [0., 0., 0.], # [0., 0., 0.], # [0., 0., 0.], # [0., 0., 0.]], device='cuda:0', dtype=torch.float16)), ('bias', tensor([0., 0., 0., 0., 0.], device='cuda:0', dtype=torch.float16))]) ``` ## Follow Ups - [ ] `torch.load` semantic for skip_data context manager - [ ] Mechanism for getting offsets of storages saved via this method (for writing in a separate pass) Pull Request resolved: https://github.com/pytorch/pytorch/pull/134504 Approved by: https://github.com/albanD	2024-08-28 23:53:17 +00:00
PyTorch MergeBot	cbee9c1fd2	Revert "Deprecate `torch._utils.is_compiling()` and `torch._dynamo.external_utils.is_compiling()` (#127690 )" This reverts commit `0e7e61f7ce`. Reverted https://github.com/pytorch/pytorch/pull/127690 on behalf of https://github.com/kit1980 due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/127690#issuecomment-2272370386))	2024-08-07 00:05:20 +00:00
Xuehai Pan	0e7e61f7ce	Deprecate `torch._utils.is_compiling()` and `torch._dynamo.external_utils.is_compiling()` (#127690 ) This PR is split from PR #126898. - #126898 ------ Pull Request resolved: https://github.com/pytorch/pytorch/pull/127690 Approved by: https://github.com/Skylion007, https://github.com/malfet	2024-08-03 09:43:38 +00:00
Mikayla Gawarecki	25cec43678	Remove dependency on private _compat_pickle in CPython (#129509 ) Use the IMPORT_MAPPING and NAME_MAPPING from here https://github.com/python/cpython/blob/main/Lib/_compat_pickle.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/129509 Approved by: https://github.com/malfet ghstack dependencies: #129239, #129396	2024-06-26 14:20:27 +00:00
Xuehai Pan	f85d1e845a	[BE] enable UFMT for `torch/nn/*.py` (#128593 ) Part of #123062 - #123062 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128593 Approved by: https://github.com/mikaylagawarecki	2024-06-23 16:05:13 +00:00
PyTorch MergeBot	cc8193c707	Revert "[BE] enable UFMT for `torch/nn/functional.py` (#128592 )" This reverts commit `f6e6e55fa7`. Reverted https://github.com/pytorch/pytorch/pull/128592 on behalf of https://github.com/fbgheith due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/128592#issuecomment-2181783936))	2024-06-21 00:44:16 +00:00
Xuehai Pan	f6e6e55fa7	[BE] enable UFMT for `torch/nn/functional.py` (#128592 ) Part of #123062 - #123062 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128592 Approved by: https://github.com/mikaylagawarecki ghstack dependencies: #128596, #128594	2024-06-17 16:29:29 +00:00
PyTorch MergeBot	90bb510ece	Revert "Deprecate `torch._utils.is_compiling()` and `torch._dynamo.external_utils.is_compiling()` (#127690 )" This reverts commit `348b181a97`. Reverted https://github.com/pytorch/pytorch/pull/127690 on behalf of https://github.com/clee2000 due to sorry I think https://github.com/pytorch/pytorch/pull/126898#issuecomment-2142884456 is still relevant, I will reach out to them to see what needs to be done in internal to get this remerged ([comment](https://github.com/pytorch/pytorch/pull/127690#issuecomment-2159248859))	2024-06-10 20:44:42 +00:00
Aaron Orenstein	afe15d2d2f	Flip default value for mypy disallow_untyped_defs [3/11] (#127840 ) See #127836 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127840 Approved by: https://github.com/oulgen	2024-06-08 18:28:01 +00:00
Xuehai Pan	348b181a97	Deprecate `torch._utils.is_compiling()` and `torch._dynamo.external_utils.is_compiling()` (#127690 ) This PR is split from PR #126898. - #126898 ------ Pull Request resolved: https://github.com/pytorch/pytorch/pull/127690 Approved by: https://github.com/Skylion007	2024-06-08 15:25:03 +00:00
PyTorch MergeBot	033e733021	Revert "[BE] wrap deprecated function/class with `typing_extensions.deprecated` (#126898 )" This reverts commit `749a132fb0`. Reverted https://github.com/pytorch/pytorch/pull/126898 on behalf of https://github.com/fbgheith due to switching typing-extensions=4.3.0 to 4.9.0 causes internal failure ([comment](https://github.com/pytorch/pytorch/pull/126898#issuecomment-2142884456))	2024-05-31 19:47:24 +00:00
Xuehai Pan	749a132fb0	[BE] wrap deprecated function/class with `typing_extensions.deprecated` (#126898 ) Use `typing_extensions.deprecated` for deprecation annotation if possible. Otherwise, add `category=FutureWarning` to `warnings.warn("message")` if the category is missing. Note that only warnings that their messages contain `[Dd]eprecat(ed\|ion)` are updated in this PR. UPDATE: Use `FutureWarning` instead of `DeprecationWarning`. Resolves #126888 - #126888 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126898 Approved by: https://github.com/albanD	2024-05-29 12:09:27 +00:00
Xuehai Pan	ba3b05fdf3	[1/N][Easy] fix typo for `usort` config in `pyproject.toml` (`kown` -> `known`): sort stdlib (#127122 ) The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/127122 Approved by: https://github.com/kit1980	2024-05-25 08:25:50 +00:00
Mikayla Gawarecki	87f79af24d	Fix map_location for wrapper subclass and device tensors that go through numpy (#126728 ) Fixes https://github.com/pytorch/pytorch/issues/124418 Pull Request resolved: https://github.com/pytorch/pytorch/pull/126728 Approved by: https://github.com/albanD	2024-05-24 16:39:30 +00:00
Dmitry Rogozhkin	8f0c207e18	xpu: implement xpu serialization (#125530 ) Fixes: #125529 BC-breaking note: The deprecated "async" argument to the Storage.cuda and Storage.hpu has been removed. Use non_blocking instead. CC: @jbschlosser, @frank-wei @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @albanD Pull Request resolved: https://github.com/pytorch/pytorch/pull/125530 Approved by: https://github.com/guangyey, https://github.com/albanD	2024-05-16 20:22:17 +00:00
Joel Schlosser	8573d9551a	Fix to preserve tensor wrapper subclass dtype through multiprocessing serialization (#125615 ) Fixes #125583 Pull Request resolved: https://github.com/pytorch/pytorch/pull/125615 Approved by: https://github.com/albanD	2024-05-07 14:35:48 +00:00
egienvalue	73744a2c00	torch.mtia module for MTIA device backend (#123612 ) MTIA device has its own Module in PyTorch now. torch.mtia has following APIs similar to other backends. The lazy_init is also supported. ``` __all__ = [ "init", "is_available", "synchronize", "device_count", "current_device", "current_stream", "default_stream", "set_stream", "stream", "device", ] ``` ------------ For device management. We expand AccleratorHooksInterface to support generic device management and it can be used in both C++ and PyThon. ``` def _accelerator_hooks_device_count() -> _int: ... def _accelerator_hooks_set_current_device(device_index: _int) -> None: ... def _accelerator_hooks_get_current_device() -> _int : ... def _accelerator_hooks_exchange_device(device_index: _int) -> _int : ... def _accelerator_hooks_maybe_exchange_device(device_index: _int) -> _int : ... ``` --------- Adding get_device_module API to retrieve device modules for different device types. ``` def get_device_module(device: Optional[Union[torch.device, str]] = None) ``` --------- Pull Request resolved: https://github.com/pytorch/pytorch/pull/123612 Approved by: https://github.com/albanD ghstack dependencies: #123611	2024-04-26 16:17:54 +00:00
PyTorch MergeBot	e04c7b19f4	Revert "torch.mtia module for MTIA device backend (#123612 )" This reverts commit `381653de63`. Reverted https://github.com/pytorch/pytorch/pull/123612 on behalf of https://github.com/jeffdaily due to this PR broke ROCm with message RuntimeError: Cannot have MTIA with other devices ([comment](https://github.com/pytorch/pytorch/pull/123612#issuecomment-2077649762))	2024-04-25 16:06:46 +00:00
egienvalue	381653de63	torch.mtia module for MTIA device backend (#123612 ) MTIA device has its own Module in PyTorch now. torch.mtia has following APIs similar to other backends. The lazy_init is also supported. ``` __all__ = [ "init", "is_available", "synchronize", "device_count", "current_device", "current_stream", "default_stream", "set_stream", "stream", "device", ] ``` ------------ For device management. We expand AccleratorHooksInterface to support generic device management and it can be used in both C++ and PyThon. ``` def _accelerator_hooks_device_count() -> _int: ... def _accelerator_hooks_set_current_device(device_index: _int) -> None: ... def _accelerator_hooks_get_current_device() -> _int : ... def _accelerator_hooks_exchange_device(device_index: _int) -> _int : ... def _accelerator_hooks_maybe_exchange_device(device_index: _int) -> _int : ... ``` --------- Adding get_device_module API to retrieve device modules for different device types. ``` def get_device_module(device: Optional[Union[torch.device, str]] = None) ``` --------- Differential Revision: [D56443356](https://our.internmc.facebook.com/intern/diff/D56443356) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123612 Approved by: https://github.com/albanD ghstack dependencies: #123611	2024-04-24 20:51:20 +00:00
PyTorch MergeBot	929242a15c	Revert "torch.mtia module for MTIA device backend (#123612 )" This reverts commit `d7e1bf9ff9`. Reverted https://github.com/pytorch/pytorch/pull/123612 on behalf of https://github.com/jeffdaily due to This broke ROCm. see test_overrides.py ([comment](https://github.com/pytorch/pytorch/pull/123611#issuecomment-2067363780))	2024-04-19 22:44:26 +00:00
egienvalue	d7e1bf9ff9	torch.mtia module for MTIA device backend (#123612 ) MTIA device has its own Module in PyTorch now. torch.mtia has following APIs similar to other backends. The lazy_init is also supported. ``` __all__ = [ "init", "is_available", "synchronize", "device_count", "current_device", "current_stream", "default_stream", "set_stream", "stream", "device", ] ``` ------------ For device management. We expand AccleratorHooksInterface to support generic device management and it can be used in both C++ and PyThon. ``` def _accelerator_hooks_device_count() -> _int: ... def _accelerator_hooks_set_current_device(device_index: _int) -> None: ... def _accelerator_hooks_get_current_device() -> _int : ... def _accelerator_hooks_exchange_device(device_index: _int) -> _int : ... def _accelerator_hooks_maybe_exchange_device(device_index: _int) -> _int : ... ``` --------- Adding get_device_module API to retrieve device modules for different device types. ``` def get_device_module(device: Optional[Union[torch.device, str]] = None) ``` --------- @exported-using-ghexport Differential Revision: [D52923602](https://our.internmc.facebook.com/intern/diff/D52923602/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/123612 Approved by: https://github.com/albanD ghstack dependencies: #123611	2024-04-18 17:38:06 +00:00
Yu, Guangye	eb7adc3ae0	Refactor gpu trace to be device-agnostic (#121794 ) # Motivation Refactor gpu trace to be device-agnostic. gpu trace is usually used in runtime components, including Device, Stream, Event, Guard, and Allocator. It should be device-agnostic and can be shared among each device backend. # Solution move `_cuda_trace.py` to `_gpu_trace.py`, which makes each device backend owns their callback, respectively. Pull Request resolved: https://github.com/pytorch/pytorch/pull/121794 Approved by: https://github.com/jgong5, https://github.com/albanD, https://github.com/EikanWang, https://github.com/gujinghui	2024-03-30 13:04:38 +00:00
PyTorch MergeBot	968c4c4154	Revert "Refactor gpu trace to be device-agnostic (#121794 )" This reverts commit `74deacbf31`. Reverted https://github.com/pytorch/pytorch/pull/121794 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it breaks ROCm jobs in trunk `74deacbf31`, please help take a look and reland the change ([comment](https://github.com/pytorch/pytorch/pull/121794#issuecomment-2013674083))	2024-03-21 20:33:17 +00:00
Yu, Guangye	74deacbf31	Refactor gpu trace to be device-agnostic (#121794 ) # Motivation Refactor gpu trace to be device-agnostic. gpu trace is usually used in runtime components, including Device, Stream, Event, Guard, and Allocator. It should be device-agnostic and can be shared among each device backend. # Solution move `_cuda_trace.py` to `_gpu_trace.py`, which makes each device backend owns their callback, respectively. Pull Request resolved: https://github.com/pytorch/pytorch/pull/121794 Approved by: https://github.com/jgong5, https://github.com/albanD, https://github.com/EikanWang, https://github.com/gujinghui	2024-03-21 01:52:58 +00:00
PyTorch MergeBot	f9ed1c432d	Revert "Refactor gpu trace to be device-agnostic (#121794 )" This reverts commit `0ff1109e26`. Reverted https://github.com/pytorch/pytorch/pull/121794 on behalf of https://github.com/jeanschmidt due to Reverting to see if rocm trunk errors are related ([comment](https://github.com/pytorch/pytorch/pull/121794#issuecomment-2007519408))	2024-03-19 15:40:26 +00:00
Yu, Guangye	0ff1109e26	Refactor gpu trace to be device-agnostic (#121794 ) # Motivation Refactor gpu trace to be device-agnostic. gpu trace is usually used in runtime components, including Device, Stream, Event, Guard, and Allocator. It should be device-agnostic and can be shared among each device backend. # Solution move `_cuda_trace.py` to `_gpu_trace.py`, which makes each device backend owns their callback, respectively. Pull Request resolved: https://github.com/pytorch/pytorch/pull/121794 Approved by: https://github.com/jgong5, https://github.com/albanD, https://github.com/EikanWang, https://github.com/gujinghui	2024-03-19 06:02:28 +00:00

1 2 3 4

172 Commits