mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-06 12:20:52 +01:00
Added op: `tile_reduce(Tensor input, Tensor(a!) out, int root, str group_name)`
For now supports only:
- NVSHMEM backed symmetric tensor;
- 2D tensor and tile;
- torch.float.
Testing on the bottom-right quadrant:
```
rank 0:
tensor([[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 1., 1., 1.],
[0., 0., 0., 0., 1., 1., 1., 1.],
[0., 0., 0., 0., 1., 1., 1., 1.],
[0., 0., 0., 0., 1., 1., 1., 1.]], device='cuda:0')
PASSED
```
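The output above can be reproduced with a small pure-Python model of a tile reduction. This is only a sketch and does not call the real op or use NVSHMEM. It assumes a sum reduction, two ranks, and that each rank fills its input with its own rank number; none of these details are stated in the commit message. `make_matrix` and `tile_reduce_model` are hypothetical helper names. The real op takes the input and output tiles as tensors, while the model takes full matrices plus index ranges.

```python
# Pure-Python model of a tile reduction; no PyTorch or NVSHMEM involved.

def make_matrix(size, fill):
    """Return a size x size matrix (list of rows) filled with `fill`."""
    return [[float(fill)] * size for _ in range(size)]


def tile_reduce_model(inputs, out, rows, cols):
    """Sum the (rows, cols) tile of every rank's input into the same tile of `out`.

    Elements of `out` outside the tile are left untouched.
    """
    for r in rows:
        for c in cols:
            out[r][c] = sum(inp[r][c] for inp in inputs)
    return out


WORLD_SIZE = 2  # assumption: two ranks
SIZE = 8        # full matrix is 8 x 8, as in the printed output
TILE = 4        # each quadrant is 4 x 4

# Assumption: rank i fills its input with the value i.
inputs = [make_matrix(SIZE, rank) for rank in range(WORLD_SIZE)]

# Output buffer on the root rank, initially all zeros.
root_out = make_matrix(SIZE, 0)

# Reduce only the bottom-right quadrant: rows 4..7, columns 4..7.
quadrant = range(TILE, SIZE)
tile_reduce_model(inputs, root_out, quadrant, quadrant)
```

Under these assumptions the bottom-right quadrant of `root_out` holds 0 + 1 = 1 in every element and the rest of the matrix stays zero, which matches the rank 0 tensor printed above.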
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162243
Approved by: https://github.com/ngimel
| Name |
|---|
| ddp |
| bench_nvshmem_tile_reduce.py |