pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Nikita Shulga 3aecf2dc52 [MPS] Extend index_put to half precision floats (#151869)	2025-04-22 22:00:08 +00:00

By reusing `c10/metal/atomic.h`
This also fixes `GPUTests.test_index_put_fallback[12]_mps` that is unrolled by inductor, so no need for dedicated atomic_add support

TODOs:
- Get rid of indexing kernel and compute it directly when kernel is run
- Simulate atomic_add for int64 types as series of int32 atomic-add-and-fetch
- Setup tolerances correctly to pass float16/bfloat16 tests (as CPU always takes sequential strategy)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151869
Approved by: https://github.com/Skylion007, https://github.com/dcci
codegen
data
distributed	Fix DTensorTestBase to barrier with device ids (#150896)	2025-04-22 20:22:55 +00:00
generated
opinfo	Avoid overflow in vector_norm for scalar input (#144073)	2025-04-07 17:10:10 +00:00
optests
test_module
__init__.py
autocast_test_lists.py
autograd_function_db.py
check_kernel_launches.py
common_cuda.py	[ROCm] opportunistic fastatomics for ReduceAdd operations for MI300 GPUs (#146264)	2025-04-22 21:55:40 +00:00
common_device_type.py	Fix setUpClass() / tearDownClass() for device-specific tests (#151129)	2025-04-16 02:18:42 +00:00
common_dist_composable.py
common_distributed.py	Reapply "ProcessGroupGloo: support lazy_init (#150801)" (#151031)	2025-04-11 01:58:35 +00:00
common_dtype.py
common_fsdp.py
common_jit.py
common_methods_invocations.py	Make torch._chunk_cat support non-contiguous inputs (#151263)	2025-04-16 04:18:46 +00:00
common_mkldnn.py	[BE]: Apply ruff PERF403 to use dict comprehensions more often (#149257)	2025-03-18 00:46:07 +00:00
common_modules.py
common_mps.py	[MPS] Extend index_put to half precision floats (#151869)	2025-04-22 22:00:08 +00:00
common_nn.py
common_optimizers.py
common_pruning.py
common_quantization.py	[Codemod][AddExplicitStrictExportForTrainingInferenceArg] caffe2/ (#149595)	2025-04-03 23:50:13 +00:00
common_quantized.py	wire torch._scaled_mm with fp4 operands to the cublas nvfp4 kernel (#148792)	2025-03-27 17:32:20 +00:00
common_subclass.py
common_utils.py	Code Clean: Using the new builtin function provides by python 3.8 later (#150839)	2025-04-10 01:17:39 +00:00
composite_compliance.py	[pytree] add APIs to determine a class is a namedtuple or PyStructSequence (#113257)	2025-04-01 10:40:43 +00:00
custom_op_db.py
custom_tensor.py
dist_utils.py
dynamo_test_failures.py
fake_config_module.py
fake_config_module2.py
fake_config_module3.py
hop_db.py
hypothesis_utils.py
inductor_utils.py	cpp_wrapper: Fix even more tests (#147225)	2025-04-07 14:20:06 +00:00
jit_metaprogramming_utils.py
jit_utils.py
logging_tensor.py
logging_utils.py
quantization_torch_package_models.py
static_module.py
subclasses.py
torchbind_impls.py	Fakify torchbind objects in compile_fx and add tests for SigridTransformsInstanceTorchBind (#149529)	2025-03-21 18:58:28 +00:00
triton_utils.py	[AOTInductor] Fix autotuning code's codegen (#150522)	2025-04-03 00:08:19 +00:00
two_tensor.py	Support subclass constructor capturing in export (#147014)	2025-03-16 18:19:19 +00:00