Mirror of https://github.com/zebrajr/pytorch.git, synced 2025-12-07 12:21:27 +01:00
By reusing `c10/metal/atomic.h`. This also fixes `GPUTests.test_index_put_fallback[12]_mps`, which is unrolled by inductor, so no dedicated atomic_add support is needed.

TODOs:
- Get rid of the indexing kernel and compute it directly when the kernel is run
- Simulate atomic_add for int64 types as a series of int32 atomic-add-and-fetch operations
- Set up tolerances correctly to pass float16/bfloat16 tests (as CPU always takes the sequential strategy)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151869
Approved by: https://github.com/Skylion007, https://github.com/dcci
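The second TODO above (simulating a 64-bit atomic add with 32-bit atomic-add-and-fetch) can be illustrated with a minimal host-side C++ sketch. This is an assumption about the intended approach, not the actual Metal implementation: the 64-bit addend is split into two 32-bit halves, the low half is added with `fetch_add`, and an unsigned-overflow carry is folded into the high half. The `SplitAtomic64` type and its methods are hypothetical names for illustration.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch: emulate atomic int64 addition using two 32-bit
// atomic fetch-adds with carry propagation. Intermediate readers may
// observe a torn (lo/hi inconsistent) value, but the final accumulated
// sum is correct once all writers finish -- which is sufficient for an
// atomic_add accumulation pattern.
struct SplitAtomic64 {
    std::atomic<uint32_t> lo{0};
    std::atomic<uint32_t> hi{0};

    void add(int64_t value) {
        uint64_t v = static_cast<uint64_t>(value);   // two's complement view
        uint32_t vlo = static_cast<uint32_t>(v);     // low 32 bits
        uint32_t vhi = static_cast<uint32_t>(v >> 32); // high 32 bits
        // Add the low half; the returned old value lets us detect the
        // unsigned wraparound (i.e. the carry into the high half).
        uint32_t old_lo = lo.fetch_add(vlo, std::memory_order_relaxed);
        uint32_t carry = (old_lo + vlo < old_lo) ? 1u : 0u;
        // Fold the high half plus the carry into the upper word.
        hi.fetch_add(vhi + carry, std::memory_order_relaxed);
    }

    int64_t load() const {
        // Not atomic as a pair; only meaningful after all writers are done.
        uint64_t v = (static_cast<uint64_t>(hi.load()) << 32) | lo.load();
        return static_cast<int64_t>(v);
    }
};
```

Negative addends work transparently because two's-complement addition is identical to unsigned addition modulo 2^64, so the lo/hi split plus carry reproduces the full 64-bit sum.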
| Name |
|---|
| codegen |
| data |
| distributed |
| generated |
| opinfo |
| optests |
| test_module |
| __init__.py |
| autocast_test_lists.py |
| autograd_function_db.py |
| check_kernel_launches.py |
| common_cuda.py |
| common_device_type.py |
| common_dist_composable.py |
| common_distributed.py |
| common_dtype.py |
| common_fsdp.py |
| common_jit.py |
| common_methods_invocations.py |
| common_mkldnn.py |
| common_modules.py |
| common_mps.py |
| common_nn.py |
| common_optimizers.py |
| common_pruning.py |
| common_quantization.py |
| common_quantized.py |
| common_subclass.py |
| common_utils.py |
| composite_compliance.py |
| custom_op_db.py |
| custom_tensor.py |
| dist_utils.py |
| dynamo_test_failures.py |
| fake_config_module.py |
| fake_config_module2.py |
| fake_config_module3.py |
| hop_db.py |
| hypothesis_utils.py |
| inductor_utils.py |
| jit_metaprogramming_utils.py |
| jit_utils.py |
| logging_tensor.py |
| logging_utils.py |
| quantization_torch_package_models.py |
| static_module.py |
| subclasses.py |
| torchbind_impls.py |
| triton_utils.py |
| two_tensor.py |