mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-06 12:20:52 +01:00
By reusing `c10/metal/atomic.h` This also fixes `GPUTests.test_index_put_fallback[12]_mps` that is unrolled by inductor, so no need for dedicated atomic_add support TODOs: - Get rid of indexing kernel and compute it directly when kernel is run - Simulate atomic_add for int64 types as series of int32 atomic-add-and-fetch - Setup tolerances correctly to pass float16/bfloat16 tests (as CPU always takes sequential strategy) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151869 Approved by: https://github.com/Skylion007, https://github.com/dcci |
||
|---|---|---|
| .. | ||
| _internal | ||
| __init__.py | ||
| _comparison.py | ||
| _creation.py | ||
| _utils.py | ||