Yu, Guangye | d7114f05b1
Add DeviceAllocator as the base device allocator (#138222)
# Motivation
In line with the RFC [A device-agnostic Python device memory related API design for stream-based accelerators](https://github.com/pytorch/pytorch/issues/134978), several memory-related APIs are widely used in popular repositories; HuggingFace Accelerate, for example, carries [many if-else conditional code paths](https://github.com/search?q=repo%3Ahuggingface%2Faccelerate%20torch.cuda.empty_cache&type=code) around `torch.cuda.empty_cache`. We would like to introduce a generic API set under the `torch.accelerator` namespace to generalize these use cases.
| Device-specific memory APIs (`torch.xxx.foo`) | Device-agnostic memory APIs (`torch.accelerator.foo`) |
| --- | --- |
| `torch.xxx.empty_cache` | `torch.accelerator.empty_cache` |
| `torch.xxx.reset_peak_memory_stats` | `torch.accelerator.reset_peak_memory_stats` |
| `torch.xxx.reset_accumulated_memory_stats` | `torch.accelerator.reset_accumulated_memory_stats` |
| `torch.xxx.memory_stats` | `torch.accelerator.memory_stats` |
| `torch.xxx.memory_allocated` | `torch.accelerator.memory_allocated` |
| `torch.xxx.max_memory_allocated` | `torch.accelerator.max_memory_allocated` |
| `torch.xxx.memory_reserved` | `torch.accelerator.memory_reserved` |
| `torch.xxx.max_memory_reserved` | `torch.accelerator.max_memory_reserved` |
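For illustration, here is how a typical per-backend branch would collapse into one device-agnostic call. This is a minimal sketch assuming a PyTorch build that carries the `torch.accelerator` memory APIs proposed here:

```python
import torch

# Before: per-backend branching, the pattern seen throughout
# HuggingFace Accelerate today.
if torch.cuda.is_available():
    torch.cuda.empty_cache()
elif torch.xpu.is_available():
    torch.xpu.empty_cache()

# After: a single device-agnostic call (available once this PR lands).
torch.accelerator.empty_cache()
```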
# Solution
This design follows the same pattern as `HostAllocator`: we introduce a base class `DeviceAllocator`, from which `CUDAAllocator` and `XPUAllocator` inherit. This gives us a unified call path: `torch.accelerator.empty_cache()` -> `GetDeviceAllocator(allocator)->empty_cache()`.
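A minimal Python analogy of this dispatch pattern (the real `DeviceAllocator`, `CUDAAllocator`, and `XPUAllocator` are C++ classes; the registry, names, and method bodies below are illustrative assumptions, not the actual implementation):

```python
from abc import ABC, abstractmethod

class DeviceAllocator(ABC):
    """Base allocator interface; concrete backends override each hook."""

    @abstractmethod
    def empty_cache(self) -> None:
        ...

class CUDAAllocator(DeviceAllocator):
    def empty_cache(self) -> None:
        print("releasing cached CUDA blocks back to the driver")

class XPUAllocator(DeviceAllocator):
    def empty_cache(self) -> None:
        print("releasing cached XPU blocks back to the runtime")

# Registry keyed by device type, standing in for GetDeviceAllocator.
_allocators: dict[str, DeviceAllocator] = {
    "cuda": CUDAAllocator(),
    "xpu": XPUAllocator(),
}

def empty_cache(device_type: str) -> None:
    # Unified call path: the device-agnostic entry point looks up the
    # registered allocator and forwards to its virtual method.
    _allocators[device_type].empty_cache()

empty_cache("cuda")  # -> releasing cached CUDA blocks back to the driver
```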
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138222
Approved by: https://github.com/albanD, https://github.com/Camyll
2025-08-08 17:41:10 +00:00

PyTorch MergeBot | f3a4d742ec
Revert "Add DeviceAllocator as the base device allocator (#138222)"
This reverts commit f7a66da5f9.
Reverted https://github.com/pytorch/pytorch/pull/138222 on behalf of https://github.com/jithunnair-amd because it broke ROCm periodic runs on MI300, e.g. https://github.com/pytorch/pytorch/actions/runs/16764977800/job/47470050573 ([comment](https://github.com/pytorch/pytorch/pull/138222#issuecomment-3164941815))
2025-08-07 16:34:36 +00:00

Yu, Guangye | f7a66da5f9
Add DeviceAllocator as the base device allocator (#138222)
(Same description as the 2025-08-08 entry above.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138222
Approved by: https://github.com/albanD, https://github.com/Camyll
2025-08-06 00:40:29 +00:00

PyTorch MergeBot | 95b658427d
Revert "Add DeviceAllocator as the base device allocator (#138222)"
This reverts commit 1179e33323.
Reverted https://github.com/pytorch/pytorch/pull/138222 on behalf of https://github.com/ZainRizvi due to Very sorry but this is still breaking internally. @albanD would you be able to help get this past the finish line? D78496124 has more details on the failure and the workaround might be to do something like what's in D78684669. To validate the fixes internally, you can follow the instructions here to ghimport the changes: https://fburl.com/fixing-ghfirst-reverts ([comment](https://github.com/pytorch/pytorch/pull/138222#issuecomment-3100195370))
2025-07-22 01:01:41 +00:00

Yu, Guangye | 1179e33323
Add DeviceAllocator as the base device allocator (#138222)
(Same description as the 2025-08-08 entry above.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138222
Approved by: https://github.com/albanD, https://github.com/Camyll
2025-07-17 01:56:01 +00:00

PyTorch MergeBot | 3dd872e6d5
Revert "Add DeviceAllocator as the base device allocator (#138222)"
This reverts commit 92409b6c89.
Reverted https://github.com/pytorch/pytorch/pull/138222 on behalf of https://github.com/Camyll due to internal build failures ([comment](https://github.com/pytorch/pytorch/pull/138222#issuecomment-3002206756))
2025-06-25 00:11:35 +00:00

Yu, Guangye | 92409b6c89
Add DeviceAllocator as the base device allocator (#138222)
(Same description as the 2025-08-08 entry above.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138222
Approved by: https://github.com/albanD
2025-06-23 08:49:30 +00:00